Good results with reader interface, light pedestal and other tasks, week ending Jan 19, 2014


I had documented the correlation of output board signals incorrectly, but have corrected that in both documentation and logic. However, until I get my transactional protocol working properly, I am not moving forward. It is time to get more basic in debugging. I hauled out the Tektronix 1220 logic analyzer with my probes and cables, in order to record the state of key signals as I run through the protocol.

No luck with the 1220, but I switched to my FPGA-based logic analyzer and completed the tests. I began with a snapshot of the I2C link itself with a sequence representing the first few initializations.

Right away I saw that the values being output were associated with a later step in the initialization, not the first transaction - a timing problem with the MCP23017 chip still coming out of reset but my logic racing ahead assuming all is ready on the remote end. Furthermore, I was advancing through the states without getting confirmation that the link is operational, as it simply 'errored' out and moved to the next command.

I believe I should impose a fixed delay that is long enough for my logic to be pumping out clocks and for the slave chips to be fully operational. I can make this change to the i2c_master code I am borrowing, or to my own logic. The advantage of the centralized change is that it will support every I2C serial link I run. The only disadvantage is that I will need to set up a worst case wait time which is imposed on all links whether or not their specific hardware implementation would be ready sooner.

I chose 80 milliseconds as a reasonable pause after coming out of reset, to allow external chips to come out of their hardware reset and be ready to receive our transactions. It is implemented as an additional state in the i2c_master FSM, entered at reset, where a counter counts down by one on each I2C clock cycle until it reaches zero, at which point the FSM moves to the normal start state.

At reset, the counter is loaded with 40,000 (for a 400KHz clock rate) and the FSM is pushed into the new reset state. The module reports it is in startup mode now, so that my other logic can wait until it is ready to accept transactions before the FSMs for the serial link will advance state.
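
As a rough check on the numbers above, here is a small Python model of the countdown state. It is only a sketch: the names (`StartupDelay`, `cycles_for_delay`, `in_startup`) are made up for illustration and are not the signals in the i2c_master module, and it assumes the counter really decrements once per 400 kHz I2C clock. Under that assumption a reload value of 40,000 works out to 100 ms, a little longer than the 80 ms target, while 80 ms would be 32,000 cycles.

```python
I2C_CLOCK_HZ = 400_000  # clock rate quoted in the post


def cycles_for_delay(delay_s, clock_hz=I2C_CLOCK_HZ):
    """Number of clock cycles needed to wait delay_s seconds."""
    return round(delay_s * clock_hz)


def delay_for_cycles(cycles, clock_hz=I2C_CLOCK_HZ):
    """Seconds spent counting the given number of clock cycles."""
    return cycles / clock_hz


class StartupDelay:
    """Hypothetical model of the extra startup state in the FSM."""

    def __init__(self, reload=40_000):
        self.reload = reload
        self.reset()

    def reset(self):
        # At reset the counter is loaded and the FSM is forced into startup.
        self.count = self.reload
        self.state = "STARTUP"

    @property
    def in_startup(self):
        # Other logic should hold off its transactions while this is True.
        return self.state == "STARTUP"

    def tick(self):
        # One call per I2C clock cycle.
        if self.state == "STARTUP":
            self.count -= 1
            if self.count == 0:
                self.state = "START"


print(cycles_for_delay(0.080))   # cycles for an 80 ms pause
print(delay_for_cycles(40_000))  # seconds for a 40,000 cycle count
```

The `in_startup` flag plays the role of the "startup mode" report: the transaction logic waits for it to drop before advancing.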

My debugging is proceeding up the communications stack, first solving physical layer issues like termination, signal degradation and voltage margins. Next, I am solving I2C protocol issues and later I will focus on the messages themselves and their effect on the slave devices.

With this set up and everything synchronized, I ran the logic analyzer to see what I am sending to the chips and how they are responding. I still see the first string going out as {start x4e x12 x12 x00 start x4e x05 x05 00 .. . } which interestingly and pathologically has no stop bits. I suspect this is a consequence of how I turn the bidirectional open collector data line around to listen for NAK or ACK from the slaves. I must not be gating the data logic output to transmit a stop.

A stop is generated by pulling data low while the clock is low, letting the clock go high and then, while the clock is high, letting data jump up to the high state. I am not seeing that sequence where it should be happening. The fault is in a modification I made recently to that module. I corrected the error in the module and pressed on.
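
The start and stop rules can be checked mechanically against a logic analyzer capture. The sketch below assumes the capture has been exported as a list of (SCL, SDA) samples; the function name `bus_conditions` and the sample format are my own invention for illustration. It flags a START when data falls while the clock stays high and a STOP when data rises while the clock stays high.

```python
def bus_conditions(samples):
    """Find I2C START/STOP conditions in a list of (scl, sda) samples.

    Returns a list of (sample_index, "START" | "STOP") tuples.
    """
    events = []
    for i in range(1, len(samples)):
        scl_prev, sda_prev = samples[i - 1]
        scl, sda = samples[i]
        # Both conditions require the clock to be high before and after.
        if scl_prev == 1 and scl == 1:
            if sda_prev == 1 and sda == 0:
                events.append((i, "START"))  # data falls while clock high
            elif sda_prev == 0 and sda == 1:
                events.append((i, "STOP"))   # data rises while clock high
    return events


# Idle bus, START, clock low, then the stop sequence described above:
# data low while clock low, clock released high, data released high.
trace = [(1, 1), (1, 0), (0, 0), (0, 0), (1, 0), (1, 1)]
print(bus_conditions(trace))
```

A message with no STOP in the returned list is the pathology seen in the first trace.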

Now I am recording start and stop bits and well formed messages. Next up, however, is the issue that my i2ctransaction module is not working correctly, because it is repeating the first field and not setting up the output patterns correctly.

Made adjustments, but failed to keep the transaction trigger flag high, causing each word to be sent as a unique message. Thus where I might send x09 (register address) followed by x02 (value to put in register), what is sent over the bus is {start x4e (chip address) x09 stop}, then {start x4e x02 stop}. I have made a fix to that issue.
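
To make the difference concrete, here is a toy Python model of the framing. The function names and the `hold_trigger` argument are hypothetical stand-ins for the transaction trigger flag; the byte values are the ones from the example above.

```python
def frame(chip_addr, payload):
    """One complete I2C message: start, chip address, payload bytes, stop."""
    return ["START", chip_addr, *payload, "STOP"]


def send_words(chip_addr, words, hold_trigger):
    """Model of what reaches the bus.

    hold_trigger=True : the flag stays high, so all words share one message.
    hold_trigger=False: the flag drops after each word, so every word is
                        wrapped in its own start/address/stop.
    """
    if hold_trigger:
        return [frame(chip_addr, words)]
    return [frame(chip_addr, [w]) for w in words]


# Intended: register address x09 followed by value x02 in a single message.
print(send_words(0x4E, [0x09, 0x02], hold_trigger=True))
# The bug: two separate messages, so the value never lands in the register.
print(send_words(0x4E, [0x09, 0x02], hold_trigger=False))
```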

One other problem is still occurring - the first initialization transaction, writing xff to registers x02 and x03, has somehow been sent before the link is operational, such that I begin with the second transaction, writing x00 to registers x0c and x0d. I put more interlocking into the code, keeping the FSMs from moving until I have first received the flag that a startup pause is over in the i2c_master module, then waiting for that module to drop the busy status when it is 'open for business'.

I saw a clean sequence of all the initialization transactions in the first sample, although the buffer wasn't long enough to get through the main read/write loop. My logic is flagging quite a few of the transactions as having sustained an error, but I don't see it from the protocol trace - no NAK, always ACK responses. I did another run at a slower logic analyzer clock rate in order to fit in the entire sequence including the full main loop, after which I should know a bit more about what is happening. I may have false detection of errors, or I may have errors I didn't spot.

I suspect I may have transient errors, not a hard error, so I will route the error signal out to the logic analyzer where I can trigger a sample based on this flag. I will capture a few error situations and see what is occurring in those cases.

I seem to receive these errors when addressing the chip, where it sends a NAK because it doesn't recognize its address. Looking at the logic analyzer, it seems as if when the data line is released and is accelerated by the buffer chip to the high value, the clock is read as two pulses, a very short one when the data line goes high and then the real one. That changes the address as it is read, from x4E, for example, to x67 or x6F, which no chip will match. It seems to be on almost every read or write based on the logic analyzer, yet I made it through the initialization sequence without issue.
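
A simple bit-level model shows how one spurious clock pulse can corrupt the address in this way. This is a sketch under my own assumptions: the receiver samples data on every rising clock edge, the glitch adds one extra edge at a chosen bit position, and the first eight samples are taken as the address byte. The function name and the `glitch_bits` argument are invented for illustration.

```python
def received_byte(sent, glitch_bits):
    """Byte seen by a receiver when extra clock pulses occur.

    sent        : the byte transmitted, most significant bit first
    glitch_bits : set of bit positions (0 = first bit sent) at which a
                  short spurious clock pulse precedes the real one, so the
                  same data bit is sampled twice
    """
    bits = [(sent >> (7 - i)) & 1 for i in range(8)]
    sampled = []
    for i, b in enumerate(bits):
        if i in glitch_bits:
            sampled.append(b)  # spurious pulse samples the bit early
        sampled.append(b)      # the real clock pulse
    value = 0
    for b in sampled[:8]:      # receiver stops after eight clocks
        value = (value << 1) | b
    return value


print(hex(received_byte(0x4E, set())))  # clean transfer
print(hex(received_byte(0x4E, {1})))    # double clock as data first goes high
```

In x4E the data line first rises at the second bit. A double clock there turns the byte into x67, matching one of the observed values. This simple model does not by itself produce x6F, so something more must be going on in that case.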

This may be power supply issues - I have a decent sized capacitor at the buffer boards but the power and ground lines may not be robust enough to provide all the power that the buffer board demands. Time to look into power quality with the scope.

In any case, hammered back into the *#*$&@ analog domain again. Yes, I know that I2C is a protocol that wasn't designed for 4 meter cable runs, but with all the buffer chips and other considerations I added, I hoped I could get a reasonable level of robustness. I had tweaked the pullup resistors on the long cable side of the buffer boards to tidy up the signal, although it looked very good already.

After I bolstered the ground connections, the link was rock solid, no errors while waiting two minutes for the analyzer to trigger, no visual sign of the error LED flickering on. Now I have very high confidence in the link quality. Back to upper layers of the stack.

I can see that I am initializing the chips as I intended. Time to flip the polarity on the three buttons on the 2501 panel, so they are 0 normally and show up as a 1 when depressed. I cleaned up the logic, made outputs registered and more of the decisions are clock synchronous now.

The link is now bulletproof, reliably delivering the data to and from the chips. I have to tweak the connections to the 2501 panel to get the lamps lighting again, just to verify the final aspects of the interface. Once the boxes themselves are ready, they will get closed up and moved back over to connect to the documation reader itself.

Everything seems excellent with the documation interface boxes, so they were carefully closed up and moved back over to be attached to the card reader itself. I verified everything worked well and that the status makes sense as displayed from the fpga and the 2501 panel.

Time to move up the stack to the next layer of debugging. First, I worked on the electrical and other aspects of the physical I2C link, until it was working very solidly at full speed. Next, I worked on the protocol of the MCP23017 chip commands and registers, until I was configuring the chips, reading and writing as expected. Now, the third stage is upon me. I will ensure that the Documation signals and controls are working through the link - that I am getting the 12 rows of data for each of the 80 columns, plus the timing and status signals.

I did a first run, triggering the physical reader remotely and seeing that the data values did flash as the cards went by. Of course, way too fast to visually tell if it is reporting every column as it reads or if the data matches the cards themselves. I have set up some diagnostic port lines with key signals that I can trap on the logic analyzer, to resolve this in more detail.

My first runs allowed me to check out quite a few of the key signals from the reader. Most of the control signals are working properly - busy, hopper check, read check, my outbound command to feed a card, the start, stop and NPRO buttons - plus I see some data values flashing as the cards stream through. What appears wrong is that the ready signal never rises and the index marker line appears frozen in the on state.

Using the logic analyzer during reads, I clearly see data rows being read. Prior to debugging from this end, I will first begin by watching key signals directly on the Documation reader, using my scope and logic analyzer test gear.

After just a minute looking at signals coming from the Documation, it was clear that the signals are all inverted - the index marker signal is actually not index marker, the data values are Not Row 12, and so forth. The documentation states that signals are TTL level, positive if true, and they are described as IM and D12, without the bar over the top that is used to denote an inverted signal.

For almost every signal, this is not a significant issue because the MCP23017 can be set up to invert the values of each signal pin, thus I can convert a not D12 to a D12 simply by changing the initialization pattern I send to the chip at startup. The one signal where polarity matters is IM - Index Marker - because it is used to control the latch board.
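
As a sketch of what that initialization change looks like, the Python below builds the write messages for the input polarity registers and models their effect on a read. The register numbers x02 and x03 are the MCP23017 IPOLA and IPOLB registers in the default register mapping, which lines up with the xff writes to x02 and x03 mentioned earlier. The helper names are hypothetical, and which pins actually need inverting is an assumption.

```python
CHIP_WRITE_ADDR = 0x4E    # address byte seen in the bus traces
IPOLA, IPOLB = 0x02, 0x03  # input polarity registers, default mapping


def ipol_write(reg, mask, chip=CHIP_WRITE_ADDR):
    """Bytes of one write message: chip address, register, value."""
    return [chip, reg, mask & 0xFF]


def as_read_by_master(pin_levels, ipol_mask):
    """Value the master reads back: pins with their mask bit set are inverted."""
    return (pin_levels ^ ipol_mask) & 0xFF


# Invert every pin on both ports.
print(ipol_write(IPOLA, 0xFF))
print(ipol_write(IPOLB, 0xFF))
# A pin held low by an inverted (active low) signal then reads back as 1.
print(bin(as_read_by_master(0b11111110, 0xFF)))
```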

The 7475 D Latch chip will pass the input signals through to the output while its enable pin is high, then it locks in the last value when the enable pin goes low. I used the 7475 chips to latch all 12 row values, with the 6 us IM signal acting as enable. That locks in the values that existed while IM is high, but they persist up until the next IM (next card column). This means I don't have to read the signals precisely during the IM, I just need to know that a new IM locked in values for the next card column.

With IM inverted, the latches are passing the instantaneous value of the 12 rows at all times, only latching them for the mere 6 us of the index marker. The solution is to invert the IM signal before it goes to the latch board. I have space on the perfboard that holds the power transistor driving the pick signal that triggers a card feed in the Documation.
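
The effect of the wrong enable polarity is easy to see in a small model of a transparent latch. This is only a behavioral sketch with made-up names and sample values. `im_wire` is the level actually arriving from the reader, which is low during the index marker pulse, and the 12 row bits are treated as one number.

```python
def latch_outputs(samples, inverter_fitted):
    """Model a transparent latch driven by the reader's IM line.

    samples         : list of (im_wire, rows) pairs over time, where im_wire
                      is low (0) during the index marker pulse
    inverter_fitted : True if IM is inverted before reaching the enable pin
    Returns the latch output after each sample.
    """
    q = 0
    outputs = []
    for im_wire, rows in samples:
        enable = (1 - im_wire) if inverter_fitted else im_wire
        if enable:
            q = rows       # transparent: output follows input
        outputs.append(q)  # otherwise the last value is held
    return outputs


samples = [
    (1, 0xAAA),  # between columns, rows in transition
    (0, 0x123),  # index marker pulse: rows valid for this column
    (1, 0x456),  # rows already moving toward the next column
    (1, 0x789),
]
print([hex(v) for v in latch_outputs(samples, inverter_fitted=False)])
print([hex(v) for v in latch_outputs(samples, inverter_fitted=True)])
```

Without the inverter the output tracks the rows continuously and freezes only during the pulse. With it, the column value captured during the pulse is held until the next one.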

Having wired up the inverter on the perfboard and resolved one problem with signals going through chip 2 - the ground pin of the voltage level adjusting chip was not in constant contact with the trace - I am back to debugging through my serial link interface. I flipped the polarity of the signals from the reader, such that a one bit from my link would represent the intended true state of a signal, eliminating the inversion at the Documation connection.

Nothing was coming through - no data bits nor the index marker. A quick check on the inverter pins showed that the index marker is arriving quite nicely at the inverter but not coming out. I pulled the chip and noticed that when I dug into my supply cabinet in the inverter bin, I failed to notice that I picked a 7406 chip not a 7404. The 7406 is open collector, yet I had no pullup resistor.

With the correct chip in place, the signals are latching well and the interrupt is arriving to flag the valid data. My scan round, reading 16 bits from chip 1, 8 bits from chip 2 and then writing 8 bits to chip 3, consumes about 300 microseconds of time. The data values from the 12 rows of a card are guaranteed to remain valid for 600 microseconds after the index marker goes back to zero. That should allow me to guarantee I capture the value for each column.

However, I would like a bit more margin between my sampling rate and the guaranteed data validity time. As I thought about it, the output signals on chip 3 are pretty glacial compared to the change rate of the inputs. There are five lights to control, the pick signal to ask for a card to be read, and the signals to remotely stop or reset the Documation. I decided to change the logic so it only writes when the data for the output board is different from the prior value we wrote there. This reduces the writes to an infrequent event, compared to handling 10 cards each second having a total of 800 columns of data to capture.
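
The write-on-change idea amounts to keeping a shadow copy of the last value sent. A minimal Python sketch follows, with a hypothetical `OutputShadow` class standing in for the FPGA logic and a plain list standing in for the I2C write to chip 3.

```python
class OutputShadow:
    """Skip the I2C write when the output value has not changed."""

    def __init__(self, write_fn):
        self._write = write_fn  # called only when a write is really needed
        self._last = None       # nothing written yet, so the first update writes

    def update(self, value):
        """Return True if a write was issued, False if it was skipped."""
        if value == self._last:
            return False
        self._write(value)
        self._last = value
        return True


bus_writes = []
outputs = OutputShadow(bus_writes.append)
for lamps_and_controls in [0x01, 0x01, 0x01, 0x03, 0x03, 0x01]:
    outputs.update(lamps_and_controls)
print(bus_writes)  # only the changes reach the bus
```

Each skipped write shortens that scan round to just the two reads, which is where the extra sampling margin comes from.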

I will test the new faster sampling logic Sunday and do a more thorough test for data validity over all columns on a card. There is one anomaly I am chasing down - the ready signal from the reader is always true, even when it is stopped with an empty hopper. The signal should be down in that case. I should also validate the remote reset and remote stop control signals from the output box are working correctly.

All the card rows are reading well, although I discovered that row 4 and row 5 are swapped in position. The input board has six of the eight signals for side A of the MCP23017 running vertically, but the first two are side by side beneath the others. The order of 0 and 1 signals is optimized for the traces to be routed on a single sided PCB. I must have mentally swapped the two when wiring the board. Easily accommodated by changing assignments and annotating the documentation. The data signals are signed off.

Control signals are still not nailed down, particularly ready, IM and busy. The busy signal was working but is not appearing now; it stays low when it should go high while cards are being read. The ready signal is constantly on, when it should be off if the reader has any stop condition such as hopper empty. The IM pulses are occasionally missing, so that I lose 4-6 columns of an 80 column card.

I have a good ready signal up to the level shifter chip input on the input board, so I just need to trace this onward to discover the fault. Probably an intermittent or open connection between trace and pin, but could be a broken trace. I found an open trace for the ready signal, bridged it, and will confirm it now works in the next test. Good news - after the first power up, it was clear that ready is now functional.

The IM signal is caught by the interrupt mechanism of the MCP23017 chip, which sends an 'interrupt occurred' signal that is what I actually use to latch in a column of data. Some timing condition occurs that does not give me the interrupt bit high, although I do see that the data values changed, which means my latch card is seeing the IM pulse. I can rule out the card reader as the source of the error.

I needed to dig into the workings of the MCP23017 interrupt mechanism, in order to formulate a hypothesis. The interrupt condition is cleared by reading the port of the chip, at the time the last bit is sent out on the I2C line. I have tied the interrupt signal to one of the pins of the port, pin 1, which is how I see that the interrupt occurred since my last read, since that read reset the interrupt flag.

I can imagine an error mode where the IM line goes high after we have clocked out bit 1, thus we already looked at the interrupt status, but as the interrupt goes high we finish clocking out bit 0 (IM) and shut off the interrupt. In that race condition, the interrupt is lost although I should be able to see the IM signal as high unless the timing is so tight that pin 0 is going high as we finish the clocking of its prior state and shut off the interrupt.

I believe I could fix this speculated problem by moving the IM signal from pin 0, the last bit to be clocked out, to the unused pin 2. That would not allow interrupt to go high unless I have also seen the IM itself.

To test this, I need to monitor both IM and the interrupt, look at all the cases where the interrupt is missing, and see if IM is on in those cases. If it is never on, the issue may be elsewhere.

The other scenario that could occur is a transient error on the I2C bus, while we are reading the port and it has the interrupt on. My logic does not update the input signals if the read has an error, in order to avoid contaminating the system with junk status. If the port is read when an interrupt occurred, the read is resetting the interrupt but if the read was ignored due to an error, I have lost my interrupt.

To test this, I will pass out the successful status signal from the I2C transactions, allowing me to look for any instances where this line went low signalling that a link error occurred. The test showed zero occurrences of transaction errors, even over a long deck of cards. This is eliminated as the source of the error.

I have to conclude that the interrupt mechanism is not dependable, due to some race condition that is poorly understood, with the precise timing undocumented. Since I am doing something unusual, routing the interrupt line as a signal into the chip rather than out to interrupt a real CPU, it does not appear to be a situation for which the design is suitable.

An alternative to the interrupt mechanism is to use the live IM signal to fire off an LM555 timer in monostable mode, to produce a 300 microsecond signal which is long enough to ensure I see it. I can use the rising edge of that monostable signal as the latching point for my card columns, since it occurs during the guaranteed valid period and is good even if I miss one round of signals due to a transient error.
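
The reasoning behind stretching the pulse can be checked with a little arithmetic. The sketch below assumes the scan samples the pin at a fixed period and asks whether any sample falls inside a pulse of a given width. All names are hypothetical and times are whole microseconds.

```python
def pulse_seen(pulse_start_us, pulse_width_us, scan_period_us):
    """True if a periodic scan samples the pin while the pulse is high.

    Samples are taken at t = 0, period, 2*period, ... The pulse is high
    from pulse_start_us up to (not including) pulse_start_us + width.
    """
    # First sample at or after the start of the pulse (integer ceiling).
    first_sample = -(-pulse_start_us // scan_period_us) * scan_period_us
    return first_sample < pulse_start_us + pulse_width_us


def fraction_seen(pulse_width_us, scan_period_us):
    """Fraction of pulse start times (1 us steps) that the scan catches."""
    hits = sum(
        pulse_seen(start, pulse_width_us, scan_period_us)
        for start in range(scan_period_us)
    )
    return hits / scan_period_us


print(fraction_seen(6, 300))    # raw 6 us index marker, 300 us scan
print(fraction_seen(300, 300))  # stretched to 300 us
```

The raw 6 us pulse is caught only by luck, while a pulse as long as the scan period is always caught. The guarantee only holds while the scan period does not exceed the pulse width, so a shorter pulse needs a correspondingly faster scan.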

If there are errors for more than one round of reads, I should detect this and flag a read error to ensure the correctness of the data I am reading. I will need to think through this mechanism. This should be extraordinarily rare, so not a high priority.

I designed, prototyped and then built the one-shot emitter, which will make pin 2 of the second input chip go high for about 280 microseconds when the IM signal from the reader goes negative. I am altering the logic in the fpga side to decode and use this. I need to test that this comes through and that the timing appears appropriate.
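
For reference, the pulse width of a 555 in monostable mode is set by the usual approximation t = 1.1 x R x C. The component values in this sketch are hypothetical, picked only because they land near 280 microseconds; the post does not say which parts were actually used.

```python
def monostable_pulse_s(r_ohms, c_farads):
    """Approximate 555 monostable output pulse width, t = 1.1 * R * C."""
    return 1.1 * r_ohms * c_farads


def resistor_for_pulse(pulse_s, c_farads):
    """Resistance needed for a target pulse width with a given capacitor."""
    return pulse_s / (1.1 * c_farads)


# Hypothetical parts: 25.5 kilohm with 10 nF.
print(monostable_pulse_s(25_500, 10e-9))   # seconds
print(resistor_for_pulse(280e-6, 10e-9))   # ohms
```

The 555 trigger input is active low, which fits firing the one-shot when the IM signal from the reader goes negative.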

My pulses come through, reliably strobing for each of the eighty card columns, even in those cases where the interrupt method doesn't deliver a signal. I was able to validate that all the rows are correctly read. I can get the data in from the cards at full speed of 600 cards per minute.

The wiring of the connections from the output box to control the remote stop and reset functions was wrong. I took some time to clean up and validate the documentation, then brought everything into order. Now I have a bit to do inside the card reader to wire in the stacker full switch signal, the remote stop, and the remote reset controls. I will do this next week.

The only remaining problem before I can sign off on the entire Documation layer is the absence of the busy signal. I will need to find the root cause and repair it, then I will be able to move another step up the stack, working on the adapter logic where I control the Documation to deliver data to the 2501 emulator logic that appears as a virtual 2501 card reader to the 1130's card reader adapter logic.


I drilled all the holes in the middle board, then marked the desired outlines of all three drilled and prepped boards. I will make another try at the left board tonight, now that I have identified the problem causing the poor board results of the past week. If that works out, I will mark it up, but in either eventuality I will cut the three boards on my table saw. It will take a bit of relocating of parts and assemblies to clear the saw, but the straight, clean cut is worth the effort.

My try to make the left board ended in shambles - 20 minutes in the developer to get a faint, faint image which had so much resist still on the board that it was not etchable and no signs of it progressing any further. Same mixing, temperatures, exposure times. I then gave it a try with the right board, to eliminate the ugly wire bridges on the current board. Also the same mixing, temps, exposure, but this time it developed faster. Still, however, the edges were left with some green resist even though the central part of the image was very clean and bright.

The new right board has a bit of left over copper impinging on the very top and bottom traces of one side, but that can be cleaned up with only a bit of work tomorrow. I am happy with the nice thick copper traces that were produced. When I clean it up, drill the holes and mark off the edges, it will be ready for when I cut all three boards down to size.

I am not finding anybody else mentioning such widely varying experiences from run to run. I am at a loss to understand the problem with the PCB manufacturing process. I have the boards I need, even if two of them are a bit ugly.

The boards were cut to size and then hand trimmed to fit in the close confines of the light panel assembly. The right panel was still a bit long, extending off to the right into the space where the attachment/support post is attached. I trimmed it down, knowing I will need to improvise a bit on the wire attachments as a result.

I soldered all the cabling on the boards - these are now ready to have the LEDs soldered in place. I expect this to be a more resilient and reliable assembly than the prior version. The LEDs are scheduled to come in the afternoon tomorrow, which will allow me to start soldering when I get back from my Wednesday visit to the Computer History Museum.

The LEDs are going on, slowly but steadily. Almost 150 to install, getting about 15-20 done in each session before my back begins to cramp up a little from hunching over the workbench. Fortunately, plenty of other tasks to take on while I relax the back muscles. As of Wednesday night, after receiving the parts, I had 30 installed and tested.

The next day I soldered all the remaining LEDs to the boards and verified they all lit properly. A final test of the panel hooked to the LED driver board and the link to the 1130 fpga showed all lamps working properly, displaying the correct data and doing so reliably.

The lamps may be a bit too bright right now, causing some shine through of the black portions of the front plexiglass panel, but that could have been because the light panel wasn't fully against the plexiglass, allowing light spillage in unintended areas. If it still shows through too much, I can dim all the lights with a simple command issued as I initialize the LED driver chip - the dimmer light should limit the shine-through effect.

I have to secure the PCBs to the black acrylic masks that isolate the light from each LED, then work on an alternative mounting for this assembly, the light panel. I am also evaluating the wisdom of pulling the LED PCBs back a bit from the black acrylic mask, which will narrow the angle of the beam striking the plexiglass panel and reduce scattering of light to surrounding areas. This may be sufficient to address the shine-through I noticed.

Having decided to pull the LEDs back in the acrylic mask, and to also move the mask as close as reasonable to the face of the plexiglass panel, I have to work out some standoffs and mounts for the circuit boards. I can screw the black acrylic masks into place independently of installing the light panel circuit boards into position.


Permanent cables in the post somewhere between Greece and here. I haven't made all the temporaries yet to test the readout board, since the scope upgrade is not a priority. I did knock out segments of cable with peltola one end and sma on the other. I have some long cable segments with molex pin connectors; those will get sma on each end and then I can use female-female adapters to link this stuff to any length I need.

I do need a few more SMA plugs and adapters to finish the full complement of cables. More good news - the replacement cables from Greece cleared customs two days ago and are now moving through the USPS across the country to me. Shouldn't be many more days.

The substitute cables I made are hooked up and it is time to fire up the machine and test it. First up is a test of its normal functioning, then a readout capable plugin should be inserted and the 'identify' button pushed. The results are mixed. The oscilloscope works just fine in its own mode, but it appears that the readout board is not working properly. I see some dots stream across the trace line when I think the board is writing characters, but they are not moving the beam to the proper y point to paint the pixels. It seems the Z axis is doing something but with the dots moving on the trace line, the X axis is likely also not right.

These issues could be the result of my handmade extended cables, but I should first triple check all the connections and continuity of my cables. If that doesn't turn up a problem, I can try a few of the debugging tests from the manual but won't spend much time until I have 'real' cables in place.

Turned out that one of my cables had a short - I had thought I checked all the segments as I made them, but it was clearly shorted to ground. I had another way to make the connection, put it in, and had success. My oscilloscope now supports readout - a nice clean "identify" and the values of the switch settings on the vertical amplifier are shown on the display.

I hauled out my Tek 1220 Logic Analyzer, found that the 50 conductor ribbon cable on one of the two probe pods (6442 type) was torn, and the other didn't seem to register any signals when I hooked it up. I will do some diagnostic work later to see whether this is a probe pod issue or a problem in the analyzer acquisition and triggering circuitry.

I have repaired the probe pod that had a torn cable, by cutting the cable just past the tear, removing the bad fragment from the pod ribbon cable connector and connecting the surviving cable part. I still haven't gotten a good capture on the logic analyzer but I still am not sure where the problem may lie. It might even be improper configuration, since I am new to this device.


The curve tracer works well, but is tied to the power transformer of a bench power supply, which is currently opened and disabled to donate its 30V center tap windings to the tracer. I want to install a DPDT switch to throw the windings to either the bench supply or the curve tracer regulator, plus install some connector for the power that will allow me to close the supply up neatly. If I can find a suitable enclosure to hold the curve tracer, that would be a nice touch as well. No priority on this, but it takes up more space while open like it is than it will when I put everything neatly in boxes.

I finished the changes to my bench power supply. It now has a small switch that flips power between the regulated outputs of the unit and a new set of binding posts that offer regulated +15, 0 and -15V. The regulator for the curve tracer is now mounted on the inside of the top of the bench power supply. The curve tracer itself remains a naked circuit board with its three wires to hook to the power supply and the two cables that run to the oscilloscope. If I put that board in an enclosure, I would need to add the test terminals for transistors, two small switches to handle transistor polarity and current range, connectors for the cables to the scope, and connectors to the bench power supply. Not worth the effort right now.
