Away from project dealing with family matters

I have had to spend a few weeks with my in-law after a bad fall, helping him to recover and sorting out modifications to his home. That also caused a work backlog, so it may be a week or two more before I can get back to the replica and testing.

Documenting the recreation of an IBM 1130 computer

NEWEST POSTS AT END - this blog is sequenced with first post at top. To achieve this, I had to date the posts with false dates, this one at the end of 2013 and the ones below on earlier days. Thus the archive list on the bottom right of the blog will refer to Dec, Nov, Oct and Sep 2013 for the posts. To see more posts, go to bottom of page and click on "Older Posts" but for convenience, a list of the most recent is on the top right of this page.

I have very fond memories of the IBM 1130 computer, as it was the first computer that I could spend hours on in the wee hours of the morning, after all the official work it had to process was completed. This helped me really grok the way it worked, stepping through programs and watching how it worked by way of the operator console, lights and switches primarily.

The system was produced in the same era as the IBM 360 series of mainframes, back in the 1960s, primarily using punched cards to submit programs and a high speed printer for the output of your work, but also providing a typewriter and keyboard that was used infrequently for certain programs or tasks.

It would be a blast to own one and toy with it for nostalgia, but impractical in several ways. Very few of these still exist, perhaps fewer than ten worldwide, and only two or three are even in operating condition. They are power hungry and space wasting - particularly when you add the card reader, keypunch to create the punched cards, line printer and other devices you need for a functional system. In some ways, the devices attached to the computer are as rare in their own right and as challenging to acquire. My garage does not have room to host a working 1130 system, nor is it a high priority to spend the substantial amounts likely required to buy and restore the system.

Fortunately, modern technology has advanced incredibly far from the capabilities available decades ago. In a physical IBM 1130, the logic required hundreds of printed circuit boards each with dozens of components, while a single logic chip today can encompass all of the 1130 circuits and much more. Further, the speed of today's technology is so much faster that timing is not a challenge - no matter how complex a set of logical steps might be necessary to recreate the behavior of any circuit in the 1130, those steps can be stuck in the infinitesimal gaps between the glacially slow logic signals of the 1130, given the enormity of the speed advantage available today.

This blog will document my journey from desire to idea to recreated IBM 1130. It will not be a recreation in the sense that a museum would attempt, as that would seek to build a machine that looked almost identical to the real 1130 and used as close to identical parts and materials as is possible. Instead, my aim is to maximize the 1130 "experience", where I would be able to replicate the looks, sounds and behaviors from my youthful 1130 interactions, within some practical bounds.

The irony of that adjective - practical - will be apparent as you follow this journey to the recreation. As time went on, my threshold of practicality moved and moved again, involving more and more detailed recreations of the actual appearance and behavior. Not to spoil the reading of the blog too much, but at some point I decided I could recreate the console typewriter, the light panel, switches, buttons and keyboard. This involves modifying IBM Selectric typewriters, creating sheetmetal, formica enclosures, installing frosted lighted buttons and other details that drove up the project complexity and cost, but also increased the experiential fidelity and thus my ultimate satisfaction.

I had essentially zero hardware design experience or training at the outset of this project. I have poked and fooled with electrical and electronic items all my life, but with very limited understanding. I could look at simple circuits and understand them from a basic DC standpoint - trace the wiring and the switches. I had a rudimentary understanding of an RC constant but otherwise didn't grasp AC circuits at all. I understood basic logic gate types and ways to simplify or understand their connections, but that was nearly all I knew. Ohm's law was the limit of my analytical toolkit for electronics. I could put kits together (thank you Heathkit and others) and could make crude modifications as long as I didn't need to calculate specific values or do too much engineering.

My entire working life I was exposed to hardware and would have loved to be capable of designing, modifying and deeply understanding it, but didn't have the skills. This project was a means of diving in and acquiring those skills, learning by doing, and having a driving goal that would keep me engaged until my understanding rose to levels where I could accomplish all the electronics tasks I had always wished I could accomplish.

The resulting designs and devices produced during this project are going to reflect this learning process. Some of the work will not be up to the standards of a working professional electronics engineer. As I built up skill in digital hardware design and VHDL coding, there were times when I used poor practices out of ignorance, or because I hadn't reached the skill level to properly see and implement the correct approach. I have tried to go back through the project periodically and improve sections more in line with current best practices.

This blog documents the 1130 effort, not the learning journey I undertook as a necessary part, but there will be places where I allude to my personal learning, perhaps covering some detail that is blindingly obvious to those readers who are experienced digital designers, but when it wasn't blindingly obvious to me as a tyro, I suspect it might not be obvious to the readers who are not engineering professionals.

From dream to idea and project - the trigger

I am subscribed to the IBM1130 group on Google Groups and read with interest a post at the beginning of 2012 where a member mentioned a past hardware implementation of an 1130 using an FPGA (field programmable gate array, a chip that can be structured to create almost any kind of hardware logic through programming).

Another member, Richard Stofer, replied that he was the creator and that his design was 'still available'. He had described it at an "IBM 1130 party" that was hosted once a year (on Nov 30th - 11/30) by the owners of one of the few remaining 1130s - Brian Knittel and Norm Aleks - http://ibm1130.org/. The annual parties no longer take place, sadly, but video clips of the last party in 2008 are available, including a presentation by Richard Stofer about his project. In it, he described not only the design and the hardware he built, but the tools and products needed to do it.

Brian Knittel has built a software simulator of the 1130, available for download on their web site, that I was using to assuage my nostalgia. However, the simulator was not fully accurate, in that it didn't attempt to recreate the hardware console modes that I had used to single step my way through instructions to learn the 1130.

It was accurate from the view of software executing on the machine, whether it was user programs or the operating system - DMS2 - and other IBM provided software. It was not, however, accurate to a person who would have been working with the 1130, thus many buttons and switch settings did not work and some of the peripheral implementations were incomplete. For example, it is possible to use formats for punched cards that the 1130 simulator did not accommodate, since the software Brian wrote took input files that were either ASCII 80 column images or 1130 binary format, but it was not dealing with the Hollerith code (cards had punches in rows 12, 11, 0, 1, 2 ... 9 which in various patterns represented characters or data; the letter C was a punch in the 12 and 3 rows of a column, for example). Thus, while it was a great pleasure to see the 'printed output' files created in ASCII as I ran the 1130 and ran programs, it didn't satisfy the itch.
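To make the Hollerith idea concrete, here is a small illustrative sketch in VHDL (the language used later in this project). It is not code from the simulator or from the replica; the package name `hollerith_sketch`, the type `column_t` and the function `encode_letter` are invented for this example, as is the choice to store a column as twelve bits ordered 12, 11, 0, 1 ... 9. It covers only the capital letters, which use one zone punch (12, 11 or 0) plus one digit punch.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper package, for illustration only.
package hollerith_sketch is
  -- One card column: index 0 = row 12, 1 = row 11, 2 = row 0,
  -- indices 3 to 11 = rows 1 to 9. A '1' means a hole is punched.
  subtype column_t is std_logic_vector(0 to 11);
  function encode_letter(ch : character) return column_t;
end package hollerith_sketch;

package body hollerith_sketch is
  function encode_letter(ch : character) return column_t is
    variable col : column_t := (others => '0');
    variable n   : natural;  -- digit row to punch, 1 to 9
  begin
    if ch >= 'A' and ch <= 'I' then
      n := character'pos(ch) - character'pos('A') + 1;  -- rows 1..9
      col(0) := '1';                                    -- zone row 12
      col(n + 2) := '1';
    elsif ch >= 'J' and ch <= 'R' then
      n := character'pos(ch) - character'pos('J') + 1;  -- rows 1..9
      col(1) := '1';                                    -- zone row 11
      col(n + 2) := '1';
    elsif ch >= 'S' and ch <= 'Z' then
      n := character'pos(ch) - character'pos('S') + 2;  -- rows 2..9
      col(2) := '1';                                    -- zone row 0
      col(n + 2) := '1';
    end if;
    -- Any other character is returned as a blank (unpunched) column.
    return col;
  end function encode_letter;
end package body hollerith_sketch;
```

With this layout, `encode_letter('C')` sets index 0 (row 12) and index 5 (row 3), matching the 12 and 3 punches described above.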

I contacted Richard who kindly agreed to share his designs and other information. I bought the hardware and acquired all the tools he described and created my own copy of his 1130 in fpga.

For those who care about the technical details, this paragraph will detail what was involved, otherwise skip to the following paragraph to continue. The design was implemented on an FPGA development board from Digilent (www.digilentinc.com), the Nexys2 with 1.2 million gate equivalents, written in VHDL which is one of the two main languages used to program hardware. Nexys2 is based on the Xilinx Spartan 3E FPGA and I had to install their ISE web edition development tools. In addition, I needed to program an MBED microcontroller to implement the emulation of the 1130's plotter (IBM 1627) which Richard accomplished by translating the plotter commands to the HPGL language from HP, used to drive their plotters (and the many HP normal page printers that support HPGL).

I built my version of Richard's fpga 1130 and learned a lot about logic design, VHDL and working with hardware. It was a bit more faithful to the 1130, but Richard did not have the same mania for an experiential recreation that drove my interests; he was focused on the fidelity to software, just as Brian was with his simulator, and did not implement the single cycle, single step and other modes I had hoped to recreate. His machine was an FPGA implementation of the Functional Characteristics, the manual that defined the way the machine would behave and is the main document that is used to provide the software level fidelity that was the aim of both Brian and Richard. In some ways, it was the simulator but built in hardware rather than software, although I want to be clear that this was an original effort by Richard that built the 1130 from the ground up from IBM documentation; it did not derive at all from Brian's simulator.

I worked with Richard on several improvements, both those I triggered and quite a few improvements that he created. We still swap ideas on ways that the fpga 1130 could be enhanced. The itch remained unsatisfied, however.

I discovered copies of some IBM manuals on a site called Bitsavers (www.bitsavers.org) that is preserving key historical documents and software from the earlier days of computing. Al Kossow drives that effort as well as serving as software curator at the Computer History Museum near me in Mountain View, CA. The FE Theory of Operation manual described enough about the way that the 1130 worked that I believed it might be possible to build a more faithful hardware version, one that would produce the same results on the console lights and support all the switch settings and buttons of the actual 1130. That trigger began my quest to build my hardware 1130.

Complications related to IBM technology used in 1130 - 1

In order to recreate the 1130, I began researching the logic design and technology employed to build the 1130. This is the same technology IBM created to design and build their 360 series of mainframes - Solid Logic Technology (SLT). With enhancements as MST, it was used for their 370 generation as well.

A small bit of technology history is necessary to understand the reasons that IBM created SLT. These lead to the peculiarities that made my project more daunting than I had assumed when I began this quest a year ago. At the time that IBM was developing SLT for the upcoming "bet the company" S/360 launch, the industry was just beginning the transition from discrete transistors to integrated circuits, following the invention of the IC in 1958 and 1959 by Kilby at TI and by Noyce at Fairchild. For the first few years of the 1960s, ICs were very expensive and primarily used in aerospace and other defense areas where weight or other characteristics of the IC were all important. The industry was still debating the best way to build logic gates, with adherents promoting schemes such as DTL (diode oriented with transistors as active switches), TTL (transistor-transistor), and RTL (resistor-transistor). Traces of all these approaches are visible within SLT, which tends to the use of diodes and transistors in many of its building blocks. These were legitimate debates because the tradeoffs of speed, cost, size and power requirements were important choices when a large computer would fill a room just by itself. The eventual economics of ICs weren't obvious, as the industry was delivering small volumes at very high prices to demanding military specs that yielded many rejected chips during manufacture.

However, the cost advantages were obvious from reducing the number of discrete components on a board and the amount of wiring that had to occur between these parts. In addition, the circuits could operate faster because of shorter distances between the devices and other characteristics that reduced delays. IBM had to make some important choices then gamble on these by ramping up an entire supply chain from silicon to delivered computers that were based on the approach they would choose.

Had the choices been made just two or three years later, the core of SLT would undoubtedly have been different and based on integrated circuits. What IBM chose to base their entire next generation on was continued use of discrete transistors and diode oriented gates, but with the transistors and diodes placed on a ceramic substrate in little square modules.
SLT ceramic module built up with transistors, diodes and resistors
A circuit board made with SLT consisted of a number of these ceramic squares, each containing some small number of transistor, resistor and diode devices wired together, and a fair amount of discrete traditional electronic components like resistors, crystals, transformers and capacitors to complete the desired function of the card.
Typical SLT card

The SLT modules were portions of a logic gate or function, completed with external components or requiring more than one of the modules to comprise a single logic function such as a flip-flop. The basic module used most often in SLT has an AND gate, an OR gate and an inverter (NOT gate) inside in an arrangement they called AOI. Three diodes were wired together to form the AND function, which was joined to another diode that provides the OR function, and those four wired together diodes connect to a single transistor that is the inverter.
IBM's AOI module
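Viewed purely as a logic function, the AOI arrangement just described can be sketched in VHDL as below. This is an illustration only: it models the Boolean result, not the diode and transistor circuit, and the entity name `aoi_sketch` and its port names are invented for the example. The three-input AND with a single OR leg simply follows the description above; real modules came in several variations.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative AND-OR-Invert function; all names are hypothetical.
entity aoi_sketch is
  port (
    a, b, c : in  std_logic;  -- the three inputs joined by the diode AND
    d       : in  std_logic;  -- the extra leg ORed with the AND term
    y       : out std_logic   -- result after the inverting transistor
  );
end entity aoi_sketch;

architecture behavioral of aoi_sketch is
begin
  -- AND the first three inputs, OR in the fourth, then invert.
  y <= not ((a and b and c) or d);
end architecture behavioral;
```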

The flip flop used in many places in an 1130 or 360 computer required two AI (AND-Inverter) modules, a resistor pack (several resistors molded into one long component), and an RC pack (a mix of resistors and capacitors molded into one component).

Reading the IBM logic diagrams (called Automated Logic Diagrams or ALDs) requires you to understand how this technology works and how it is combined in modules on cards to make up logical functions that a modern digital designer would recognize.

This becomes much more difficult because IBM terminology at that time does not match the way that modern digital engineers refer to technology. IBM will refer to DC set or DC reset of a flip flop, or say that it is AC triggered (showing the connection on the ALD with a capacitor symbol just in front of the input), or refer to 'binary' triggering.

The IBM flip flop is not any of the well understood flip flop types a digital designer would recognize - not D, SR, or JK or other variants - and it has behaviors that are decidedly odd. If a flip flop is already in the 'on' state and a signal is applied to switch it on, the flip flop will produce a short pulse or signal from the opposite (off) output. This would wreak havoc with most designs if the engineer didn't expect it, as this would be a defect that would not be accepted from a modern logic component.

IBM Description of Flip Flop operation

I had to do quite a bit of research and really understand what was going on, down inside the analog components in the SLT modules as well as in the digital designs shown in the ALDs. In order to do that, I would need to understand electronics and electrical engineering to a much deeper level than I had ever attained as a hobbyist/hacker/bludgeoner of electrical things. This leads to the next post - my pursuit of enough EE knowledge to move forward with the 1130 project.

Complications of the IBM technology in the 1130 - 2

At just the right moment in time, when I realized that my skills in electrical circuits and electronics were woefully inadequate to the task ahead, Anant Agarwal at MIT was just launching his MITx educational initiative, beginning with a pilot course based upon the MIT 6.002 course he teaches as the foundational EE course for MIT undergrads - Circuits and Electronics.
I joined that inaugural course of what has become edX, expanding to a joint effort of MIT and Harvard under Dr. Agarwal's leadership. It was a great experience, often straining my very rusty math skills and forcing me to learn new skills, for while I had done basic calculus years ago, I had to rapidly learn enough about matrix math, differential equation solving and other needed techniques to keep up with the class.


It was exactly what I needed, developing enough skill in both analog and digital circuit design that I was able to dive into this project and take on whatever need be done. I only wish I had learned this material when I was young - I would have had so much more success with all the projects and hobby activities I undertook over the years armed only with Ohm's law and some rudimentary awareness of electronics.

I was now able to understand what was occurring inside the SLT modules and in the 1130 machine. This wasn't the last hurdle, however, by a long shot. I will have to dive into a couple of technical subjects to properly communicate the basis of the new difficulties and complications. 

Modern digital design strongly favors, almost insists upon designs that are clock synchronous - that a single master clock signal is used to control each change of state in the machine. Often, logical conditions (signals) must be used to determine whether a given state is turned on or what value should be output by a circuit. Synchronous designs need only ensure that the necessary signals have arrived sufficiently ahead of the 'tick' of the clock and will remain stable for a safety margin after that tick, then implement the state change or new data value exactly at the tick of the clock. 
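A minimal VHDL sketch of the clock synchronous style may help readers who have not seen it. It is a generic textbook pattern rather than anything taken from the 1130 design, and the names `sync_state_sketch`, `condition` and `state` are invented for the illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Generic synchronous pattern; all names are hypothetical.
entity sync_state_sketch is
  port (
    clk       : in  std_logic;  -- the single master clock
    condition : in  std_logic;  -- must be stable just before and after the tick
    state     : out std_logic
  );
end entity sync_state_sketch;

architecture rtl of sync_state_sketch is
  signal state_reg : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- The state can only change here, exactly at the clock tick.
      -- Whatever the condition does between ticks is ignored.
      state_reg <= condition;
    end if;
  end process;

  state <= state_reg;
end architecture rtl;
```

The point of the pattern is that the sensitivity list contains only the clock, so a late or glitching `condition` cannot change the state in the middle of a cycle.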

Alternative asynchronous design approaches, where the various signals are combined and states change as soon as conditions align, can suffer from many problems. If one signal arrives early or late, the wrong value or new state might be implemented before all the intended conditions are in place. The output might waver between correct and incorrect values, short term signals might trigger changes that are undesired (glitches), and a successful async design needs careful attention to knowing and controlling the timing and duration of every involved signal.

The 1130 and 360 systems may have a system clock, but it  is not used for synchronous logic. Rather, signals trigger a change of a flip flop as soon as they arrive. If conditions must be combined to determine what change to make, they must arrive at the proper time to avoid all the timing issues I discussed above. 

At many places in the theory of operations manual, this is alluded to by comments such as "a slight overlap of I-cycle FFs may occur" or "the gates remain active, because of circuit delays, beyond the end of . . .". With clock synchronous designs, all the signals needed to determine the next cycle must be in place before the clock tick, but in the 360 era designs they can trigger the change whenever they arrive even partway into a cycle. 

This would make it quite difficult to convert the design of the 1130 to a clock synchronous one, as some signals may not exist when they are needed. Possibly they could be created by a different set of logic so that they occur early enough to be used, but this is not always possible.

The next difficulty pertains to the choice of technology to build my 1130 - an FPGA. FPGAs are particularly unsuited to asynchronous designs and anyone designing logic for FPGAs learns that clock synchronous logic is almost essential for proper machine operation. The heart of an FPGA is the look up table, where all the input signals are used to address a set of values that become the output signals. The design does not use AND, OR and other gates, it instead codes the values for outputs that would be produced from some set of logic gates and uses the look up table to implement it. The design tools render a logic design into values to be loaded into the lookup tables (thus making it field programmable) and the implementation of a given design may change each time the design is touched because the design tools make different choices and assign different locations for the lookup tables inside the FPGA chip.
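The lookup table idea can be modeled in a few lines of VHDL. This is a behavioral illustration, not the vendor's primitive: the entity name `lut4_sketch`, the generic `INIT` and the port names are assumptions made for the example. Four inputs are used because the Spartan 3E builds its logic from four-input tables.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Behavioral model of a four-input lookup table; names are hypothetical.
entity lut4_sketch is
  generic (
    -- Bit i is the output when the inputs, read as a number, equal i.
    -- x"8000" has only bit 15 set, so this default behaves as a 4-input AND.
    INIT : std_logic_vector(15 downto 0) := x"8000"
  );
  port (
    addr : in  std_logic_vector(3 downto 0);  -- the four logic inputs
    y    : out std_logic
  );
end entity lut4_sketch;

architecture behavioral of lut4_sketch is
begin
  -- No gates at all: the inputs simply select one stored bit.
  y <= INIT(to_integer(unsigned(addr)));
end architecture behavioral;
```

Changing only the stored constant turns the same structure into an OR (`x"FFFE"`), an XOR of the four inputs, or any other four-input function, which is why the tools are free to reshuffle logic between tables from one build to the next.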

What this means is that the timing delay for a signal is not well controllable by the designer. Further, techniques that are used to introduce delays using traditional gates are barely feasible with an FPGA.
If you wanted to build a 360 or 1130, the last technology you would consider would be an FPGA. 

Between the strangely behaving logic technology, async design practices and unsuitability of FPGA for async and timing dependent purposes, this project was going to be much more work than it appeared when I began.

False starts and research

I began building the 1130 design several times over the past months, building up from the heart, the basic clocking, registers and arithmetic unit. Each time I reached a point where it became clear that the approach I was taking would not work, I tossed out all the code and hunted for a new idea that might work. I believe there were four fairly substantial starts that I discarded before I came on the idea for the current approach.

The four failed tries all attempted to create a clock synchronous design appropriate for FPGA instantiation, largely because I believed it was the only way the machine would work. I had copies of the IBM ALDs, the logic diagrams of the 1130, thanks to bitsavers.org, which up until then I was using as a reference source to understand what behavior I had to create, but not as a model for logic I would design. 

Typical ALD page
That meant quite a bit of research and analysis, digging through all the manuals and thinking about what was needed to make the machine function properly. What conditions would require data to move between a register pair, when to suppress normal activities, when to recognize error conditions . . . the deep detail that is needed to complete the design of a working computer. Once I had access to ALDs, I had another way to conduct research and was getting a fairly complete picture of the conditions, signals and what had to occur in which cycle for any instruction or condition. The list of what I didn't understand or what didn't yet make sense kept shrinking. For example, there is a ballet of tricks and special logic needed to create the Program Load behavior that the machine uses to bootstrap from a single boot card to end with the disk based monitor system running on the machine. How to force the store of each column from the card reader or paper tape reader, when the machine was not executing instructions? How to start the machine executing the first instruction once the load of that first boot card was complete? You have to develop a very complete understanding of the interplay of signals and the normal operation of large parts of the machine before you can really understand what has to occur to do a Program Load.

This all changed a few months ago, triggered by a conversation I had with a collector who owns an 1130 and an 1800 - neither operational - and who is an engineer and creative hobbyist. I bought a paper tape reader on ebay and decided to pick it up locally because the seller was nearby. Bob Rosenbloom was that seller, and offered to show me his extensive computer and technology collection when I picked up the reader. Somewhere in the conversation, when he was showing me the start of his all-relay computer project and I was talking about the 1130 project, I mentioned the core challenge of building an async design in FPGA. Bob didn't see this as an impossible thing at all and chatted about a similar issue he overcame several years back.

When I was back staring at the 1130 project notes, Bob's confidence that an async computer could be built successfully in an FPGA took hold. That was the first trigger. The second was my decision to implement the machine according to the original IBM design, attempting to build it as exactly to the ALD as possible. The third was the idea that I should build FPGA synchronous code to model any behaviors of the SLT logic that were not consistent with current logic, incompatible with FPGA or necessary. By encapsulating those behaviors in a virtual logic gate that I could combine with basic gates, I could build up logic that looked just like the ALDs and hopefully would behave the same way as well. 

That method is working beautifully, now that I have all of the 1130 and most of its peripheral adapters implemented, have built emulators for most of the peripherals and am well along building up the other machinery to complete the working system. I have been debugging the machine steadily, making use of a logic analyzer as well as extensive use of the Xilinx logic simulator running on my laptop. It seems to be executing all instructions as well as IO interrupts perfectly, but I need to finish my peripheral emulation to get to where I can boot DMS and run card decks before I can finish debugging. There is a limit to how much I can hand assemble and load into the machine - I put in the core of the extended diagnostics, using the listings in the Maintenance Diagnostic Manuals (MDMs) that are also on Bitsavers. 

I do need to relate a funny story about how well this is working and the dangers of using a single set of ALDs for a single 1130 machine.

The only complete set of ALDs on bitsavers is labeled 1130C. There is all but one volume of 1130B, but the missing volume is the crucial volume that contains most of the processor core. I was testing my machine, carefully implemented in my ALD-faithful approach, when I began chasing a defect. The 1130 steps through eight steps or cycles as part of one 'storage cycle' - the T clocks - stepping through T0 up to T7 in one storage cycle. The machine is designed to spend extra T7 cycles if doing variable length activities, such as addition or subtraction, which due to the 1130 approach will take more cycles for certain values of data than for others. My design was not moving directly from T7 of one cycle to T0 of the next, and I was struggling to figure out what I had done wrong to cause the behavior. Gradually it dawned on me that my machine was working exactly like the 1130C in real life would have operated, because that machine in real life was a special model, the slowed down model 4 that IBM sold as an entry price point. To justify the lower price point, the machine had some extra logic to spin its wheels a few cycles between T7 and the next T0, thus slowing down the performance. I had perfectly implemented that cycle wasting logic, because my ALD faithful, timing faithful machine was recreating an unusual slow model. Once I figured out what a normal machine would look like, such as the 1130B which was missing the pages that illustrated a full speed machine, I was able to get normal behavior for the machine I am recreating, a 2.2 microsecond storage cycle time 1130 with 32K of core.
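The T clock stepping described above can be pictured with a small conceptual model. This is emphatically not how the ALDs or the ALD-faithful replica generate the clocks; it is a simplified synchronous sketch, and the entity name `tclock_sketch`, the `hold_t7` input and the one-hot output `t` are all invented for the illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Conceptual model of T0..T7 stepping; all names are hypothetical.
entity tclock_sketch is
  port (
    clk     : in  std_logic;                    -- one tick per T interval in this model
    hold_t7 : in  std_logic;                    -- '1' asks for another T7 interval
    t       : out std_logic_vector(7 downto 0)  -- one-hot: t(0) is T0 ... t(7) is T7
  );
end entity tclock_sketch;

architecture rtl of tclock_sketch is
  signal ring : std_logic_vector(7 downto 0) := "00000001";  -- start in T0
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if ring(7) = '1' and hold_t7 = '1' then
        ring <= ring;                        -- stay in T7 for an extra interval
      else
        ring <= ring(6 downto 0) & ring(7);  -- advance; T7 wraps around to T0
      end if;
    end if;
  end process;

  t <= ring;
end architecture rtl;
```

In this picture, both the variable length arithmetic and the deliberate slowdown of the entry model amount to keeping `hold_t7` asserted for some number of extra intervals before the next T0.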

The machine I have built is based on FPGA hardware, malleable hardware that is configured to produce a given hardware design with gates and wiring, the configuration stored on a read only memory that is loaded when power is applied. Make a change to the design, load a changed file to the read only memory, and the new hardware comes into being with a reset or power-on. The configuration is described in a language that circuit designers use to define and create products, with the two major choices being Verilog and VHDL. I chose VHDL and learned to build hardware using the language and the ISE Webpack tools from Xilinx, which compile the VHDL, load it onto the board and provide simulation facilities to test out designs.

FPGA chip - not the version used in my 1130
The board on which the 1130 is implemented uses a Xilinx Spartan 3E FPGA that provides 1.2 million gate equivalents, far more than the total gates used in an 1130 and enough to support all the peripherals emulation and other additional logic I am adding to make the recreation a usable toy.

I will spend the next few posts talking about the unique behaviors and my little modules that stand in for logic gates in every place in the ALDs where the oddball behavior is needed. The only other changes I needed were to adjust a few spots where the machine is doing what IBM refers to as a "DC Reset" of a register. It is apparent that the response time of the SLT flip flop is s-l-o-w to such DC reset signals, thus the designers would be gating the movement of the current contents of a register to another spot while simultaneously doing a DC reset. Once I introduced a very minor delay (in terms of the FPGA clock which is 20ns compared to the IBM T cycles which are 280ns long), the current contents were copied successfully before the reset took effect. I expected many such timing issues that would need to be tracked down laboriously, but it was only a handful of adjustments that seemed necessary; everything else is operating very reliably.

In the spots where I introduced the delays, I see that IBM placed pairs of NOT gates, as they saw the need to introduce signal delays. Unfortunately, the FPGA design tools intelligently remove that, recognizing that in digital logic, two wrongs make a right - the final outcome is identical except for timing. Thus, the intended delay was removed. Even though the FPGA uses lookup tables instead of gates, if it did look up a NOT function twice it would have added some delay and might have done the trick, but my change does the delay explicitly in an encapsulated logic element I call a "delay".
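One common way to build an explicit delay that the tools will not optimize away is a short shift register clocked by the fast FPGA clock. The sketch below shows that general technique under stated assumptions; it is not the actual "delay" element from this project, and the entity name `delay_sketch`, the generic `STAGES` and the port names are invented for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Shift register delay; all names are hypothetical.
entity delay_sketch is
  generic (
    STAGES : positive := 2  -- delay in FPGA clock periods (20 ns each here)
  );
  port (
    clk : in  std_logic;  -- fast FPGA clock
    d   : in  std_logic;  -- signal to be delayed
    q   : out std_logic   -- same signal, STAGES clock edges later
  );
end entity delay_sketch;

architecture rtl of delay_sketch is
  signal pipe : std_logic_vector(STAGES - 1 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      pipe(0) <= d;
      -- Each stage takes the previous stage's old value, forming a chain.
      for i in 1 to STAGES - 1 loop
        pipe(i) <= pipe(i - 1);
      end loop;
    end if;
  end process;

  q <= pipe(STAGES - 1);
end architecture rtl;
```

Because every stage is a real register, the synthesis tools cannot cancel it out the way they cancel a pair of inverters, and two or three 20ns stages remain small against a 280ns T cycle.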

The IBM Multi-input Flip Flop and my implementation - part 1

The basic flip flop in IBM's Solid Logic Technology has multiple modes in which it operates as well as characteristic behaviors that are relied upon heavily in the design of S/360 and 1130 computers. No modern logic gate provides these behaviors sufficiently to put together a set of gates from an IBM ALD (automated logic diagram - the documentation of the circuitry of these computers). I constructed several VHDL modules that I could instantiate and use as a direct substitute, thus producing the same results from a circuit I built with my modules and the circuits in the SLT generation equipment.

Most uses of the flip flop are purely asynchronous, with no clock applied that determines when the flip flop changes to its next state. This most common mode was level sensitive - what determined the output state of the flip flop was the static values of the inputs, much like a combinatorial gate such as an AND function is level sensitive and unclocked. Whenever the inputs A and B are both 1, the AND gate is at 1, this state occurs essentially immediately when the inputs become 1 and 1, and it persists outputting a steady 1 as long as the inputs remain at 1 and 1. For the IBM MI flip flop operated in this "DC" mode, if the set input is 1 then the flip flop virtually immediately flips on, its output now 1. If the reset input is 1, it will rapidly flip off, the output then being 0. The behavior is not well defined if both the set and reset are simultaneously 1. In a different mode ("AC"), the flip flop will toggle between on and off when the pair of input signals are both 1, but not so in the steady or DC mode that is used most widely.

This flip flop is used widely so that flip flops change state asynchronously whenever the input signals appear, in no particular relationship to the overall clock. Such flip flops might be turned on somewhere in the midst of a cycle, and might slop over before turning off well into a cycle after the one in which logical conditions dictate it should be reset.

To implement this DC mode (async, level sensitive) flip flop, very approximately like an SR latch in today's parlance, I had to build logic that would handle the bad situation (both set and reset simultaneously asserted).  This was essentially a pair of NOR gates, cross coupled so that the output of one gate was an input to the other. This provided both a normal and an inverted output (Q and Qnot).

If both inputs are asserted, the flip flop will act as if only the reset were asserted, as this was the safest condition for the flip flop to take if both S and R are 1.

The SLT system typically used inverted inputs for the flip flop, such that the flip flop is set if the set input is 0, while a 1 on the set input is ignored. Because the heart of SLT was diode logic coupled to an inverting transistor, all the gates had to be configurable to deal with both normal signal levels (1 is on) and inverted levels (0 is on). The outputs also had to be configured to be either inverted or normal type.

The 'flipflop' module offers generics that allow the inputs and independently the outputs to be inverted or normal, with the default condition inverted inputs and normal outputs if the generics were not overridden.
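To make the description above concrete, here is a minimal sketch of how such a DC mode flip flop could be written. It is not my actual module: the entity name FlipFlopSketch and the generic invout are made up for illustration (only invinp, J, K, Q and Qnot are names used in this blog), and the ClockMaster port that my real module carries is left out. It assumes nothing more than level sensitive set and reset with reset winning when both are asserted.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Illustrative sketch only, not the FlipFlop module used in the project.
entity FlipFlopSketch is
    Generic ( invinp : integer := 1;    -- inverted inputs by default
              invout : integer := 0);   -- hypothetical generic name
    Port ( J    : in  STD_LOGIC;        -- set input
           K    : in  STD_LOGIC;        -- reset input
           Q    : out STD_LOGIC;
           Qnot : out STD_LOGIC);
end FlipFlopSketch;

architecture Behavioral of FlipFlopSketch is
    signal SetIn, ResetIn : std_logic;
    signal State : std_logic := '0';
begin

    -- inputs are used as-is, or inverted so that a 0 is the active level
    gi1 : if invinp = 0 generate
        SetIn   <= J;
        ResetIn <= K;
    end generate;
    gi2 : if invinp /= 0 generate
        SetIn   <= not J;
        ResetIn <= not K;
    end generate;

    -- level sensitive and unclocked; reset is tested first, so it wins
    -- when set and reset are asserted together
    process (SetIn, ResetIn)
    begin
        if ResetIn = '1' then
            State <= '0';
        elsif SetIn = '1' then
            State <= '1';
        end if;   -- neither asserted: hold the state (a latch is inferred)
    end process;

    -- outputs normal or swapped
    go1 : if invout = 0 generate
        Q    <= State;
        Qnot <= not State;
    end generate;
    go2 : if invout /= 0 generate
        Q    <= not State;
        Qnot <= State;
    end generate;

end Behavioral;
```

Unlike a literal pair of cross coupled NOR gates, this form always keeps Q and Qnot complementary, even while both inputs are asserted.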

If a particular logic circuit in an ALD had some signals combined in a NAND gate to drive the set input of a flipflop and other signals combined, say, with a NOR gate to drive the reset, it would be represented in VHDL as:

setsignal <= not (inputA and inputB and inputC);
resetsignal <= not (inputD or inputE);
setBBFF <= setsignal;
resetBBFF <= resetsignal;

. . . and later in the code I instantiated one of my flipflops to implement the DC mode behavior. It specifies inverted inputs and normal outputs, using signal SetBBFF as the 'set' input and ResetBBFF as the 'reset'. Its outputs are the normal output, BB, and the inverted output, BBnot.


BBFF: FlipFlop
GENERIC MAP (
invinp => 1
)
PORT MAP(
J => SetBBFF,
K => ResetBBFF,
Q => BB,
Qnot => BBnot,
. . .
Working from ALDs, which named gates with two character codes such as AD, AE, AF, . . . I could directly code the combinatorial gates (A, OR, N, . . .)  and insert my flipflop modules in place of the flipflop types (FF). The reader could take the VHDL and match it very directly to the ALD logic, achieving one of the design principles which was to seek as exact a copy of the original 1130 logic as possible.


In this scheme, the normal combinatorial gates are directly implemented as modern logic functions ('and' and 'or'), since their behavior was consistent with SLT versions of those gates,  but the flip flop has to make use of my flipflop component.

My flipflop component behaved as the SLT FF did in DC mode, well enough that every function in the 1130 and peripherals I have implemented behaves just as intended, as described in the theory of operations manuals, and producing signal timings that match the documented 1130 results extremely well.

This was the simplest of the SLT gates to implement - next post will cover the "AC" mode flipflop, then pulse generators, delays and single shot triggers will be covered in later blog posts.

IBM Multi-Input Flip Flop and my implementation, part 2

The second major mode in which the SLT flip flop is used is what I will call edge-sensitive mode. In addition to the diodes that are tied together to form the "DC" logic gates - e.g. AND of three input signals - the flip flop has an input with a capacitor in series with a diode going into the same junction as the DC mode diodes. Further, it is set up so that the internal side of the capacitor is biased by another input line. This additional input line I call the 'gate' - while the input going through the capacitor I call the 'shift'.

Capacitors 'pass' current flow as the plates of the capacitor charge or discharge, based on changes in voltage across the device, but once the voltage stops changing, the current drops to zero. This makes a capacitor a differentiator, where the current passing through the device is the change in voltage rather than the absolute voltage level.

Because of the diodes in line with the capacitor and its placement relative to the gate electrode of the transistor inverter, it will change the state of the transistor for one specific direction of change - for convenience matching the widespread use in SLT circuits, we will talk about a flipflop whose 'set' input will only turn on if the voltage on the "AC mode" input shifts from 1 to 0. It becomes a falling edge detector, a current flowing to make a change in the flip flop only on the falling edge of that AC mode input. Further, if the bias on the internal side of the capacitor is set to a 0 level, then the falling edge won't generate the proper size and direction of current to trigger the flip flop, but if it was biased to 1, the shift down on the AC input to 0 produces the 'delta' or change current that activates the flip flop.

This is why I call the biasing input a 'gate' - if it is on, the flip flop AC input is going to activate whenever its signal falls from 1 to 0, right at the edge of the logic change. The AC input I call a 'shift' because it is the shift in the logic level from 1 to 0 that is detected. Together, we have a flip flop whose set and reset inputs could be configured with the AC mode inputs so that if some set of combinatorial conditions are true, the gate input is 1 and then it will activate at the precise timing defined by the logic change on the shift input.

In modern logic devices, one could imagine that if an input signal, not a clock, were to be hooked to the clock of a flipflop and if some other combinatorial logic (static level) signal were hooked to the input of the flip flop, then when the 'clock' edge was detected (the shift input) it would take on the state of the traditional input (our gate input) and produce an output that is true only when the gate is 1 and the shift provides the proper direction edge. This is not exactly the same, however, because in the IBM FF the set signal to the flip flop exists for only a short period of time then goes to zero, regardless of its prior state or the state of the gate or other inputs.

Another way to imagine this part of the circuit is as an edge detector with an enable gate - if the gate input enables the detector, then when it detects an edge it emits a short duration pulse. With discrete circuits one could set up a time constant that produces the single shot positive going output pulse right at the time the edge is detected and only for a fixed time before it returns to zero where it will remain until some future edge, enabled by a suitable gate input, will trigger it again.

Producing signals of a fixed duration asynchronously with FPGAs is not reasonable. They are used to produce a fixed duration by counting cycles or stepping through state machines in a number of clock cycles with nice tidy synchronous designs.

My solution to this conundrum was to produce a hybrid - making use of clock synchronous logic to time out the output of the edge detector, but have the edge detected asynchronously. I had to mix async logic because the shift signal coming in might have its edge at any point, not just aligned with a clock tick of the FPGA. If the signal shift were happening very very close to the proper clock edge, we might violate setup or hold requirements of the hardware or be in an indeterminate state that causes glitchy behavior. Clock synchronous circuits need the non-clock signals to be stable surrounding the time of the clock edge, something that is part of the design thinking when creating such circuits. When you can't guarantee the alignment, you have to convert the inputs into a stable condition to make use of them.

My secret weapon is the huge advantage in speed I have with the FPGA compared to 1960s SLT technology. The cycle time of the FPGA board I am using for this project is 20 nanoseconds, while the T clock cycle time of the 1130 is 280 nanoseconds. This gives me the opportunity to use the hybrid logic in my edge-trigger flipflop to reliably detect the edge, ensure that the gate signal is stable at the time of the shift, then produce a one cycle output pulse as a sync operation. The delay at maximum is about 20 ns from when I complete deciding that we have the right conditions for the triggered condition, to align it with the next FPGA edge. Since the most basic of SLT gates has a delay time on the order of 30 nanoseconds, I ran a relatively small risk of delaying signals that go through edge triggers to make them too late for their next use. This risk is also made lower because the 1130 is not a clock sync design, so that if a signal triggers some state change 20 or 30 ns later than the 'clock' toggled, it still works since nothing is activated by a master clock tick.
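The arithmetic behind that margin can be written down as a handful of VHDL constants. This is only a sketch for the reader: the package and constant names are invented here, and the numbers are simply the figures quoted in the paragraph above.

```vhdl
-- Illustrative constants only; names are hypothetical.
package TimingSketch is
    -- figures quoted in the text, in nanoseconds
    constant FPGA_CLOCK_NS : natural := 20;   -- FPGA board cycle time
    constant T_CLOCK_NS    : natural := 280;  -- 1130 T clock cycle time
    constant SLT_GATE_NS   : natural := 30;   -- typical basic SLT gate delay

    -- FPGA clock periods that fit in one T clock period: 280 / 20 = 14
    constant FPGA_CYCLES_PER_T : natural := T_CLOCK_NS / FPGA_CLOCK_NS;

    -- worst case wait to align a detected edge with the next FPGA clock
    -- edge is one FPGA period, which is less than one SLT gate delay
    constant MAX_ALIGN_DELAY_NS : natural := FPGA_CLOCK_NS;
end package TimingSketch;
```

So each T clock period spans 14 FPGA clock periods, and the extra alignment delay is at most 1/14 of a T cycle.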

Part of the design uses a flip flop with the shift signal used as the clock - this detects the edge, then sets a signal that our clock sync flip flop companion will use to step itself through the sync output generation. The gate signal might be changing right as the shift is detected or in the gap between when the async flipflop sees the edge and a bit later when the clock sync flipflop acts on it to produce a pulse. To protect against this, I route the gate signal through a chain of paired not gates (actually LUTs in the FPGA) and exclusive-or it with the live signal. This guarantees that it has not changed over the span of a handful of nanoseconds and is thus not an in-transition, changing gate that should not be acted upon.

There are a number of versions of this logic, as I have the need in some circuits to produce a pulse that is more than one cycle long to ensure I catch other signals safely in combinatorial logic further along in the 1130. In most cases, an AND or OR gate is edge-triggered, so that its combinatorial output is a pulse only active for the short duration after the shift input has its edge, rather than being a full flip flop. Thus, the edge triggered behavior was generalized and the output pulse of those is passed into the set or reset input of our other flipflop module to activate the change in flip flop state.

In IBM ALDs, the logic gates that are edge triggered are indicated only by a letter on one of the inputs - a P or an N - which declares that it is an AC mode, edge triggered input. In some circuit drawings, a capacitor symbol is placed on the line drawn for the input signal, but in other situations the only clue is the N or P on the logic gate box.

These edge triggers can have positive or inverted gate inputs (trigger only if the gate is 1 or if it is 0), they can detect the rising or falling edge (P or N trigger) of the shift, and they can emit a pulse that is positive going or one that is negative going. Thus, my modules that implement these gates are configurable with generic parameters to specify those variations, along with the duration of the pulse (in fpga cycles).

When writing VHDL to implement a page of the ALD that contains such edge triggered functions, I write the direct logic for normal combinatorial gates but instantiate my edge triggered module for each edge-triggered gate. It would look something like this in the code:



SHgate <= NotReadBit12;
SHshift <= NotCRReadRegLoadSP;

SG <= NotResetReadReg or NotDCReset or SB;

SetSKFF <= SH;
ResetSKFF <= SG;
            . . . . . . . .

  -- shift version of SH gate
SHG: EdgeTriggerGate
GENERIC MAP (
invgate => 1,
invshift => 1,
invout => 1
)
PORT MAP(
Gate => SHgate,
Shift => SHshift,
ClockMaster => ClockMaster,
Output => SH
);
-- SK flip flop
SKFF: FlipFlop
GENERIC MAP (
invinp => 1
)
PORT MAP(
J => SetSKFF,
K => ResetSKFF,
Q => SK,
Qnot => SKnot,
ClockMaster => ClockMaster
);

I set up the gate and shift inputs to SH, which is configured so that the gate and the shift are both inverted, as is the output pulse. That inverted output pulse of SH is what is used to set the flipflop SK, since its input signals are inverted too and a short term 0 inverted pulse from SH will turn on SK. The reset of the SK flipflop is a static combinatorial signal that is written out and implemented by the FPGA in normal modern gates (OR gates or their LUT equivalent). However, the edge triggered gate and the flip flop will produce the SLT behavior by using my modules in the place of those gates from the ALD.

The IBM Flip Flop is even stranger than I am modeling, but fortunately those characteristics are not used in the 1130 design. The gate input to an edge sensitive gate will retain its asserted state for 90 ns after the input is removed, thus still allowing the gate to be triggered if the shift occurs during those 90 ns. That is three times the average combinatorial logic gate delay of SLT, long enough that a designer could make use of a signal that is no longer active, as long as it was asserted less than 90ns prior to the shift. Just to add to this, the gate must be held positive for a minimum of 170ns to guarantee that the shift is detected, which is almost half a T cycle long. In modern logic elements, if the delay of the gate is 1, the setup time is a small fraction of 1, not a multiple, and if a signal was deasserted its effect on the gate disappears in much less than 1 unit, compared to the delay time. 

Generalizing an edge sensitive async gate

I generalized the "AC mode" described in the last post concerning the multi-input flip flop used in IBM SLT circuitry. This would be used anywhere that the designer wanted to fire off a control pulse that would activate some sequence of actions, with the timing for the emission of the pulse set by the 'shift' input to the gate and the enabling or disabling of the emission at that time controlled by the 'gate' input.

This allowed the designer to make use of both long lasting signals reflecting logical conditions and of pulses to be driven at specific times. A signal such as "XIO instruction" might be on during most of the cycles when an XIO (execute input-output) instruction is active, and that could be one of the conditions that are combined to activate a specific circuit at some chosen time. If an XIO instruction of a certain type, such as an initiate read, is being executed, then we might want to cause the value read from memory to be set into a specific machine register. If that data is read during the E2 cycle while processing the instruction and the data is valid when the T3 cycle of E2 begins, we might have combinatorial logic mixing the 'XIO instruction' signal with signals for 'E2' and signals that identify this as an initiate-read type of XIO. Those relatively long lasting signals are used as the 'gate' input while the signal for being in a T3 cycle is used as the 'shift'. Our gate will fire a pulse if all of 'XIO instruction', 'E2' and 'initiate read' are true, at the time that the T3 signal shifts from 0 to 1 (we begin the T3 cycle within the longer E2 cycle of the even longer single instruction XIO). This pulse from our gate might trigger the 'set' of a given register that will cause it to be loaded with the value just read from memory.

This is how an async design can have vaguely clock-like cycles and cause some actions to happen at chosen times. Note that the pulse causing the loading of the register, in our example above, might occur at a time that has no specific relationship to other timings in the machine - it could be ahead or behind other pulses generated at the start of T3, as each portion of the machine operates solely based on pulses and signal levels it sees, unconstrained by any global synchronizing clock.
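The XIO example can be sketched in VHDL as follows. The wrapper entity and all of its signal names (XIOInstruction, E2Cycle, InitiateRead, T3Cycle, LoadRegister) are invented for illustration and do not come from the ALDs; the component declaration mirrors the EdgeTriggerGate entity listed later in this post.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Illustrative wrapper; entity and signal names are hypothetical.
entity XioLoadPulseSketch is
    Port ( XIOInstruction : in  STD_LOGIC;
           E2Cycle        : in  STD_LOGIC;
           InitiateRead   : in  STD_LOGIC;
           T3Cycle        : in  STD_LOGIC;
           ClockMaster    : in  STD_LOGIC;
           LoadRegister   : out STD_LOGIC);
end XioLoadPulseSketch;

architecture Structural of XioLoadPulseSketch is

    component EdgeTriggerGate
        Generic ( invgate  : integer := 0;
                  invshift : integer := 0;
                  invout   : integer := 0);
        Port ( Gate        : in  STD_LOGIC;
               Shift       : in  STD_LOGIC;
               ClockMaster : in  STD_LOGIC;
               Output      : out STD_LOGIC);
    end component;

    signal LoadGate : std_logic;

begin

    -- long lasting conditions are combined into the 'gate' input
    LoadGate <= XIOInstruction and E2Cycle and InitiateRead;

    -- the 0->1 edge of T3 is the 'shift'; a positive pulse is emitted
    -- only if the gate conditions are true when that edge arrives
    LG : EdgeTriggerGate
        generic map ( invgate => 0, invshift => 0, invout => 0 )
        port map ( Gate        => LoadGate,
                   Shift       => T3Cycle,
                   ClockMaster => ClockMaster,
                   Output      => LoadRegister );

end Structural;
```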

I called this function EdgeGate (the entity is named EdgeTriggerGate in the code) and instantiate a copy of this module for any gate in the 1130 design that is edge triggered. It receives a gate and a shift input and is secretly clocked by the FPGA master clock running at 50MHz, so it is able to produce a pulse of a chosen duration as its output. It is fully configurable for the polarity of the inputs, the polarity of the output pulse and the duration in FPGA cycles of the pulse itself.

The entire VHDL of this function is posted for those readers who wish to see how it was implemented in more detail:


library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity EdgeTriggerGate is
    Generic ( invgate  : integer := 0;
              invshift : integer := 0;
              invout   : integer := 0);
    Port ( Gate        : in  STD_LOGIC;
           Shift       : in  STD_LOGIC;
           ClockMaster : in  STD_LOGIC;
           Output      : out STD_LOGIC);
end EdgeTriggerGate;

architecture Behavioral of EdgeTriggerGate is

    signal Edge : std_logic := '0';
    signal Pulse : std_logic := '0';
    signal Gated : std_logic := '0';
    signal Trigger : std_logic := '1';
    signal DelayedGate : std_logic := '1';
    constant empty : integer := 0;

begin

    -- this just delays a copy of the gate signal, so that we can
    -- verify that it is stable and not changing or glitching
    DelayedGate <= not (not (not (not Gated))) after 2 ns;

    -- output is normal or inverse if wedged (invout /= 0)
    go1 : if invout = 0 generate
        Output <= Pulse;
    end generate;
    go2 : if invout /= 0 generate
        Output <= not Pulse;
    end generate;

    -- Level sensitive input is 0->1 edge (P) or inverse (N) if invshift /= 0
    gs1 : if invshift = 0 generate
        Trigger <= Shift;
    end generate;
    gs2 : if invshift /= 0 generate
        Trigger <= not Shift;
    end generate;

    -- Gating signals normal or inverse (wedged) if invgate /= 0
    gg1 : if invgate = 0 generate
        Gated <= Gate;
    end generate;
    gg2 : if invgate /= 0 generate
        Gated <= not Gate;
    end generate;

    process (Trigger, Pulse, Gated, DelayedGate)
    begin
        if Pulse = '1' then
            Edge <= '0';
        elsif Trigger'event and Trigger = '1' and
              (Gated and DelayedGate) = '1' then
            Edge <= '1';
        end if;
    end process;

    process (ClockMaster, Edge)
    begin
        if ClockMaster'event and ClockMaster = '1' then
            Pulse <= Edge;
        end if;
    end process;

end Behavioral;

Other SLT gates I implemented that don't match standard fpga functions

SLT logic features some pulse producing circuit elements which are used in many places inside an 1130 or 360 computer. As these were built and controlled by selecting capacitors and RC networks on the SLT card, something not possible within an FPGA, they also needed to be modeled in a way that would produce the intended behavior. These modules I wrote to accomplish these functions are also instantiated in any code as a one to one substitute for the SLT elements in an ALD diagram. An ALD page could be represented by standard VHDL logic plus the substitute modules for FF, edge sensitive gates and these other element types, thus remaining very true to the circuit design.

Single Shots (SS) are an SLT element that fires off a pulse of some duration chosen by the designer thru the specification of capacitor and resistor values to be connected to the element. For example, a given circuit might want a pulse of 250 microseconds duration to occur. The SS would fire a pulse whenever its input went on - it was edge sensitive. If the input remained on for a long time, it didn't matter, the pulse from the SS element would be 250 microseconds or whatever timing the designer chose. My module SingleShot is a variant written to provide a pure SS drop in substitute.

Sample Pulse Driver (SPD) elements will take a logic signal and fire off a short pulse when the logic signal switches on. It is essentially an edge sensitive gate like my EdgeGate which has a 'gate' input that is always true, thus it fires the pulse any and every time the 'shift' input produces the edge we are looking for.

SS elements are behaviorally like SPD elements except the duration of the pulse is extended for some target interval in the SS, which could even be hundreds of milliseconds long; the SPD produces a short pulse about the speed of one gate delay in SLT terms.
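As a rough illustration of that relationship, an SPD can be sketched as the EdgeTriggerGate from the previous post with its gate held permanently true. The wrapper entity and the names Input and AlwaysOn are invented for this sketch; it is not the SPD code used in the project.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Illustrative wrapper; entity and signal names are hypothetical.
entity SamplePulseDriverSketch is
    Port ( Input       : in  STD_LOGIC;
           ClockMaster : in  STD_LOGIC;
           Output      : out STD_LOGIC);
end SamplePulseDriverSketch;

architecture Structural of SamplePulseDriverSketch is

    component EdgeTriggerGate
        Generic ( invgate  : integer := 0;
                  invshift : integer := 0;
                  invout   : integer := 0);
        Port ( Gate        : in  STD_LOGIC;
               Shift       : in  STD_LOGIC;
               ClockMaster : in  STD_LOGIC;
               Output      : out STD_LOGIC);
    end component;

    signal AlwaysOn : std_logic;

begin

    -- the 'gate' never blocks the pulse
    AlwaysOn <= '1';

    -- every 0->1 edge on Input produces one short positive pulse
    SPD : EdgeTriggerGate
        generic map ( invgate => 0, invshift => 0, invout => 0 )
        port map ( Gate        => AlwaysOn,
                   Shift       => Input,
                   ClockMaster => ClockMaster,
                   Output      => Output );

end Structural;
```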

One final circuit need is for a delay, an element which will cause the signal to be delayed in time but otherwise follow the same pattern as the input signal. I wrote the Delay module as the means to cleanly insert a chosen delay into a circuit.

The SingleShot and Delay modules are hybrids of async and clock sync techniques, much like the EdgeGate from which they are derived, since the triggers could arrive at any point unrelated to the master clock of the FPGA, even one as fast as the 20ns clock in our implementation board. The clock sync portion is used to produce given timings for pulse durations or delays of signals.

Delay is used less than ten times in the entire 1130 implementation, since the 1130 designer implemented any desired delays by passing signals through several gates to introduce minor delays, not by placing a specific 'delay' type gate in the ALD. These were used only in the cases where signals had to be delayed for correct operation, something found to be very rare in the operation of the 1130 logic inside my recreation.
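I am not posting the Delay module itself, but the idea can be sketched as a short shift register clocked by the FPGA master clock. Everything in this sketch (the entity name DelaySketch, the generic cycles, the port names) is hypothetical and simplified. It samples the input directly on the master clock, so it leaves out the async handling of the real module.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Illustrative sketch only, not the Delay module used in the project.
entity DelaySketch is
    Generic ( cycles : positive := 2 );   -- delay in FPGA clock edges
    Port ( Input       : in  STD_LOGIC;
           ClockMaster : in  STD_LOGIC;
           Output      : out STD_LOGIC);
end DelaySketch;

architecture Behavioral of DelaySketch is
    -- one spare stage keeps the slice below legal even when cycles = 1
    signal Taps : std_logic_vector(cycles downto 0) := (others => '0');
begin

    -- shift the sampled input one stage further on every rising clock edge
    process (ClockMaster)
    begin
        if ClockMaster'event and ClockMaster = '1' then
            Taps <= Taps(cycles-1 downto 0) & Input;
        end if;
    end process;

    -- a change on Input shows up here on the cycles-th rising edge after it
    Output <= Taps(cycles-1);

end Behavioral;
```

With the 20ns board clock, cycles => 2 would delay a signal by somewhere between 20 and 40ns, depending on where the input change falls relative to the clock.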

Here is the SingleShot module for the reader interested in the details:


library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- This function will fire off a pulse of a preconfigured duration
-- when the trigger edge arrives. It can be configured to trigger
-- on the 0->1 or 1->0 edge, it can be configured to provide a positive
-- or inverted pulse, and the duration is defined for the instance

entity SingleShot is
    Generic ( invtrigger : integer := 0;
              invreset   : integer := 0;
              ctrload    : integer := 1;
              ctrsize    : natural := 2;
              invpulse   : integer := 0);
    Port ( Trigger     : in  STD_LOGIC;
           ClockMaster : in  STD_LOGIC;
           DCReset     : in  STD_LOGIC;
           Output      : out STD_LOGIC);
end SingleShot;

architecture Behavioral of SingleShot is

    signal Pulse : std_logic := '0';
    signal Shift : std_logic;
    signal Edge : std_logic;
    signal Reset : std_logic;
    signal Ctr : std_logic_vector(ctrsize-1 downto 0);
    constant empty : integer := 0;

begin

    -- output is positive going or inverse if wedged (invpulse /= 0)
    go1 : if invpulse = 0 generate
        Output <= Pulse;
    end generate;
    go2 : if invpulse /= 0 generate
        Output <= not Pulse;
    end generate;

    -- trigger is 0->1 edge (P) or inverse (N) if invtrigger /= 0
    gs1 : if invtrigger = 0 generate
        Shift <= Trigger;
    end generate;
    gs2 : if invtrigger /= 0 generate
        Shift <= not Trigger;
    end generate;

    -- set up for either pos or inverted reset as requested by caller
    gr1 : if invreset = 0 generate
        Reset <= DCReset;
    end generate;
    gr2 : if invreset /= 0 generate
        Reset <= not DCReset;
    end generate;

    process (Shift, Pulse, Reset)
    begin
        -- make sure we don't do anything bad at startup like emit or trigger spuriously
        if Reset = '1' then
            Edge <= '0';
        -- if we started emitting an output pulse, shut down detection signal
        elsif Pulse = '1' then
            Edge <= '0';
        -- if our shift occurred and we are not already emitting, signal we should start
        elsif Shift'event and Shift = '1'
              and Ctr = std_logic_vector(to_unsigned(empty,ctrsize)) then
            Edge <= '1';
        end if;
    end process;

    process (ClockMaster, Ctr, Edge, Reset)
    begin
        -- at startup, make sure we don't trigger or emit anything spurious
        if Reset = '1' then
            Ctr <= (others => '0');
            Pulse <= '0';
        -- we detected an edge, load our counter and indicate we are emitting
        elsif Edge = '1' then
            Ctr <= std_logic_vector(to_unsigned(ctrload,ctrsize));
            Pulse <= '1';
        -- do our clock synchronous thing with the counter
        elsif ClockMaster'event and ClockMaster = '1' then
            -- if we have a positive count, decrement it and keep emitting
            if to_integer( unsigned(Ctr) ) /= 0 then
                Pulse <= '1';
                Ctr <= std_logic_vector(
                    to_unsigned((to_integer( unsigned(Ctr) ) - 1 ),ctrsize));
            -- otherwise we are done, shut off pulse
            else
                Pulse <= '0';
            end if;
        end if;
    end process;

end Behavioral;
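As a usage sketch for the module above, here is how the 250 microsecond example mentioned earlier might be configured. With the 20ns master clock that is 250,000 / 20 = 12,500 counts, which needs a 14 bit counter (2^14 = 16,384). The wrapper entity and its signal names are invented for illustration. By my reading of the listing, the pulse lasts between ctrload and ctrload + 1 clock periods, so treat the duration as approximate.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Illustrative wrapper; entity and signal names are hypothetical.
entity SingleShot250usSketch is
    Port ( StartSignal : in  STD_LOGIC;
           ClockMaster : in  STD_LOGIC;
           DCReset     : in  STD_LOGIC;
           TimedPulse  : out STD_LOGIC);
end SingleShot250usSketch;

architecture Structural of SingleShot250usSketch is

    component SingleShot
        Generic ( invtrigger : integer := 0;
                  invreset   : integer := 0;
                  ctrload    : integer := 1;
                  ctrsize    : natural := 2;
                  invpulse   : integer := 0);
        Port ( Trigger     : in  STD_LOGIC;
               ClockMaster : in  STD_LOGIC;
               DCReset     : in  STD_LOGIC;
               Output      : out STD_LOGIC);
    end component;

    constant CLOCK_NS     : natural := 20;       -- FPGA master clock period
    constant PULSE_NS     : natural := 250_000;  -- 250 microseconds
    constant PULSE_CYCLES : natural := PULSE_NS / CLOCK_NS;  -- 12500
    constant COUNTER_BITS : natural := 14;       -- 2**14 = 16384 > 12500

begin

    -- positive trigger edge, positive reset, positive output pulse
    SS1 : SingleShot
        generic map ( invtrigger => 0,
                      invreset   => 0,
                      ctrload    => PULSE_CYCLES,
                      ctrsize    => COUNTER_BITS,
                      invpulse   => 0 )
        port map ( Trigger     => StartSignal,
                   ClockMaster => ClockMaster,
                   DCReset     => DCReset,
                   Output      => TimedPulse );

end Structural;
```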

Hardware involved in the project as of Dec 2012

The 1130 recreation is centered on a Digilent Nexys2 development board which contains a 1.2M gate equivalent FPGA - Xilinx Spartan 3E - plus support circuits and components, many of which I am leveraging.

Nexys2 board bottom left, logic analyzer above and breakout board attached to right

It has a clock running at 50MHz and is field programmable by USB from the Digilent-provided Adept software on a PC, but could be programmed by the standard JTAG method too.

It hosts a flash ROM which contains the programming, loaded whenever the board is powered on or reset.

The board has features which are used for debugging convenience, such as eight switches, four push buttons, eight LEDs and four 7-segment LED digits, but which will not be active in the final machine. There is support for a VGA interface which is not used. The board also provides more than 50 I/O pins from the FPGA, in eight Digilent-style "PMOD" connectors and in the 50 pin expansion connector. I use these for logic analyzer debugging but also for connections to other hardware, as will be outlined below.

16MB of flash memory is on board, plus 16MB of fast RAM, both of which were planned to be used in the 1130 project. The flash could hold seven virtual disk cartridges, any one of which could have been mounted in the emulated disk drive on the project. However, timing issues and complexity led to use of separate flash rather than the onboard flash, because of the sharing of the IO pins between RAM and flash that make it impossible to fully overlap their operation. The RAM is used to implement the 32K of 16 bit words in the largest 1130 configuration (plus the 2 parity bits per word).

A PS2 port is provided on the board and used to connect a PC keyboard to stand in place of the 1130's console keyboard. Later in the project I hope to put in a more accurate keyboard replica that provides just the signals which are emitted by the 1130 hardware itself, not PS2 keyboard scan codes. All the scan codes are converted by emulation logic to appear to be the micro-switches on the real keyboard of an 1130, thus the adapter logic from the 1130 is unmodified.

The USB link provided on the board provides a high speed and low speed transfer mode embodied in the Cypress chip on the board - up to 480 Mb/s transfer rate as a stream connection as well as emulating a PC extended parallel port (EPP) for low speed transfers. This is used to connect to software running on a PC, written in Python, to link the peripheral emulators on the board with external files and simulated lights and buttons of the I/O device. The current emulation planned includes 1132 printer, 1442 card reader/punch, 2501 card reader, 1627 plotter, console printer (typewriter) output, and a means to preload core memory (board RAM) and virtual disk cartridges (board flash).

External hardware I have built provides the console lights, switches, and buttons. The 1130 has more than 100 incandescent lights in a panel that indicates the value of system registers, conditions, the current T and I cycle and other status of the machine. It has several lighted buttons and button switches that sit alongside the console keyboard keys. There is a row of 16 toggle switches that sit on the front of the console printer. Six toggle switches are normally hidden and used by the maintenance engineer (the CE switches). The panel of lights sits on a pedestal above the console printer, which is a modified Selectric typewriter. The panel is built using LEDs rather than incandescent lights, built to appear as similar as possible to the real machine. It has a rotary dial switch on the right of the panel lights which is fully implemented and has an emergency pull switch on the left which I plan to build as a nonfunctional knob. The LEDs are controlled by boards I built using a chain of MAX7219 controller chips which are connected over a high speed serial link to the Nexys2 board.

Another high speed serial link, using the SPI protocol, links to a MCP23S17 chip that multiplexes the 16 toggle switches on the front of the console printer over the four wires of an SPI link. A third high speed serial link, using the I2C protocol, connects to two MCP23017 chips used to multiplex all the other buttons, switches and the virtual disk cartridge number, using only two wires.

All buttons and switches are debounced with MC14490 debounce chips before multiplexing and conveyance onto the Nexys2 board, just because it seemed simpler to do this at the time.

The Nexys2 operates at 3.3V but some of the chips, for example the MAX7219, require 5V signals (TTL levels), thus level shifters are placed at appropriate points on my boards - the MCP23x17 chips operate at 3.3V while the others run at 5V. 74HC4050 and 74HCT00 chips are used for the level shifting.

The console printer is essentially the 1053 console of the S/360 line, a Selectric modified by the addition of solenoids and microswitches to convert the purely mechanical Selectric typewriter to a device that can print data transferred from a computer. The Selectric is a very complex bit of machinery, requiring many solenoids and very, very precise timing of activation to avoid jamming the mechanism. Microswitches communicate the rotary position of various mechanisms as they operate to print, shift, tab, return the carriage, space and other tasks of the typewriter, controlling the timing of solenoids and used to interlock some operations that cannot overlap others. The adapter logic in the 1130 contains everything needed to interact with the switches and solenoids, if one could only find a 1053 or the IO Selectric or 2741 devices that are very similar.

It is one of the goals of this project to implement a Selectric as the console output printer. I have acquired a Selectric and a pile of solenoids, but also recently bought an IBM Electronic 50 which was a hybrid machine combining the Selectric mechanism with some electronics for limited editing and composition. It has the solenoids and switches already installed, thus should be fairly easy to interface as a console printer. I am waiting to receive scanned diagrams and documentation on the E50 from someone I came to meet on the GolfBallTypewriter group that has members interested in Selectrics and related machines - Dave Handyman (not sure if the surname is a nom de plume or his real world name) is a former CE who has quite a few documents and tools that he shares with the group.

I acquired a paper tape reader device which I intend to interface to the Nexys2 board to perform just like the 1130 paper tape reader. I have not found a suitable PT punch but did buy the punch block assembly for a Teletype, the physical part that punches the holes themselves, which I can connect to solenoids and build along with a sprocket drive to produce a paper tape punch that interfaces right to the 1130. No emulation will be necessary on either of these peripherals, just as no emulation is needed for the console printer.

Line printer support will be emulated, with code in the FPGA producing the signals and timing that would be seen if a real 1132 and/or 1403 were hooked to the 1130 adapter circuits. I have modeled the rotation of the print disc wheel and the timing and operation of the carriage and brushes, thus it should produce lines of output at the same speed as an actual 1132 without modification to the 1130 adapter logic. The 1403 is intended to be modeled, but I do not have full documentation of the printer adapter for that device yet, thus it is conditional.

Card reader and punch support is also emulated, with code in the FPGA modeling the physical mechanism, timing and behavior of both the 1442 and the 2501 devices. It is quite accurate to the timing of the real devices, with virtual card images transferred from the PC to a FIFO in the FPGA carrying the state of all 12 hole positions for all 80 columns. This means the cards coming in over the USB link in the high speed streaming mode can be in Hollerith mode, boot card mode, binary mode, or any other mode that exists on the 1130. They are only converted to and from ASCII in the PC; the adapter logic remains unmodified.
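To show why a raw card image is mode independent, here is one way the 80 column by 12 row image could be declared in VHDL. The package, type and constant names are invented for this sketch and the mapping of bits to punch rows is left unspecified; it is not the FIFO format used in the project.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Illustrative declarations only; all names are hypothetical.
package CardImageSketch is
    constant CARD_COLUMNS : natural := 80;
    constant CARD_ROWS    : natural := 12;

    -- one bit per hole position in a column, '1' meaning punched
    subtype CardColumn is std_logic_vector(CARD_ROWS-1 downto 0);

    -- a whole card is simply 80 such columns, with no character encoding
    type CardImage is array (0 to CARD_COLUMNS-1) of CardColumn;

    -- total bits carried per card: 80 * 12 = 960
    constant CARD_BITS : natural := CARD_COLUMNS * CARD_ROWS;

    -- a card with no holes punched
    constant BLANK_CARD : CardImage := (others => (others => '0'));
end package CardImageSketch;
```

Because every hole position is carried as its own bit, the same 960 bit image serves for Hollerith, binary or boot cards without the FPGA needing to know which it is.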

The disk drive emulation is faithful to the timing and behavior of the single drive installed in most 1130 systems. I model the rotation of the drive, generate index and other pulses, generate and check the four parity bits, model the zeroes and sync word and all the other signals that are seen by the adapter, at the actual timing experienced on a real disk drive. This means that the programs will experience the same times to seek, read, write and even down to the same rotational delay waiting for a sector from one IO to the next.

If a real card reader can be acquired at a reasonable price, a future stage of this project would interface that reader to appear as a 1442 or 2501 to allow use of real cards. I have a Wright manual keypunch device and a few thousand blank cards, just in case.

I may, in later stages, acquire a DEC RK-05 drive or similar cartridge disk drive, as several firms licensed and produced virtually the same drive as the one used in the 1130. I own two cartridges for the RK-05, for use if a drive can be found at a reasonable price and modified to interface fully with the unmodified 1130 adapter.

Physically, I intend to construct the recreation 1130 as a tabletop unit, with the slanted formica top of the real 1130, the Selectric mechanism, keyboard, operator light panel and other gear at full size and as accurate as possible, but not build the remainder of the cabinet down to the floor as that would be almost totally empty wasted space in this implementation.


While not a standard peripheral used with the 1130, magnetic tape is a contemporaneous technology which I might interface to the machine, as I have a tape drive from an IBM p-series system on hand. I would need to invent the adapter, simplifying aspects of tape operation to bring it within the capabilities of the 1130 IO approach, as well as writing a device driver for DMS2, but that would be an interesting project after the system is fully operating.

My most recent acquisition (through Craigslist) is an HP Designjet 750C plotter, a monster that produces 36" wide printed output of both plots and pictures. I will use that as the plotter output for the 1130, generating plots from programs I write or copy from others. Since it also accepts sheet media, I have some 13 x 16 paper that I can use for smaller output. The Designjet accepts HP/GL commands, which are produced by the interface unit built by Richard Stofer. That consists of an mbed microcontroller device that links to the FPGA, converting the simple plotter operations from the 1130 into HP/GL and sending that over Ethernet to a printer. He used it with a laser printer that includes HP/GL support, but I will point the unit to the Designjet 750C for added realism. Unfortunately, this monster unit will not look like the small plotter, which IBM OEMed from CalComp and sold as the 1627 peripheral for 1130 systems.
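
To give a feel for what such a conversion involves, here is a rough Python sketch that turns a stream of pen and step operations into HP/GL text. It is not Richard Stofer's code. The operation names, the step size of 0.01 inch and the choice of absolute moves are all assumptions for illustration. The commands IN, SP, PU, PD and PA and the 1016 plotter units per inch are standard HP/GL.

```python
# Sketch only: hypothetical step operations -> HP/GL command text.
UNITS_PER_INCH = 1016   # HP/GL plotter unit is 0.025 mm
STEPS_PER_INCH = 100    # assumption: one plotter step is 0.01 inch

# Hypothetical operation names mapped to (dx, dy) in steps.
MOVES = {"+X": (1, 0), "-X": (-1, 0), "+Y": (0, 1), "-Y": (0, -1)}


def to_units(steps):
    """Convert an accumulated step count to the nearest plotter unit."""
    return round(steps * UNITS_PER_INCH / STEPS_PER_INCH)


def to_hpgl(ops):
    """Translate 'UP', 'DOWN' and step operations into one HP/GL string."""
    x = y = 0
    out = ["IN;", "SP1;"]           # initialize, select pen 1
    for op in ops:
        if op == "UP":
            out.append("PU;")
        elif op == "DOWN":
            out.append("PD;")
        else:
            dx, dy = MOVES[op]
            x += dx
            y += dy
            # Absolute moves from accumulated steps avoid rounding drift.
            out.append("PA%d,%d;" % (to_units(x), to_units(y)))
    return "".join(out)
```

Tracking position in whole steps and converting only at output time matters here because a 0.01 inch step is 10.16 plotter units, so emitting rounded relative moves would slowly wander off position.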


Another aspiration for a later stage is to record the sounds of a real 1130, separate out the sounds of the basic fans/hum, the disk drive seek sound, line printer sound, card reader and card punch sounds, etc, so these can be mixed and output on a PC speaker at appropriate times to further bolster the experiential fidelity of my recreated 1130.

Watch out for those special slowed down versions of machines

At least twice I have been bitten by studying documentation to understand the logic or the timing of a machine, when I didn't notice that the instance I was studying had been artificially slowed to produce a low end, low price point model.

The first time was when the T clock logic of my machine was first ready for debugging, after I had implemented the machine according to the full set of ALDs online at bitsavers.org, from the machine named 1130C. The T clock steps through 280 nanosecond cycles from T0 to T7, which encompasses one storage access cycle of the 2.2 microsecond core memory (these timings are for the fastest of the 1130 models). Various conditions block the T clock or cause it to advance, among them "extra" T7 cycles which are required in the 1130 when certain operations can't be completed by the end of the first T7 cycle.

Addition in an 1130 occurs iteratively, with a bitwise binary addition in a T cycle producing the carry bits from each bit as a new addend in the D register to be applied for another round of addition. Only when the D register is all zeroes, meaning that all carry values have rippled through the result, is the arithmetic operation complete. This is marked by dropping the signal "arithctl" when D is all zero. If we are at T7 and arithctl is still on, then instead of advancing to T0 as normally would happen, the clock stays at T7 for at least another 280ns cycle.
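
The iterative scheme can be sketched in a few lines of Python. This is only a model of the idea described above (half-add, then feed the carries back as the next addend until none remain), not a transcription of the ALD logic. It ignores overflow and carry indicators, and the function name is my own.

```python
MASK = 0xFFFF  # 16-bit registers


def iterative_add(acc, d):
    """Add d into acc by repeated half-adds; return (sum, rounds needed)."""
    acc &= MASK
    d &= MASK
    rounds = 0
    while d != 0:                      # D all zero means no carries remain
        carries = ((acc & d) << 1) & MASK
        acc ^= d                       # bitwise sum without carries
        d = carries                    # carries become the next addend
        rounds += 1
    return acc, rounds
```

In this model a value like 0xFFFF plus 1 needs sixteen rounds because the carry has to ripple through every bit, which shows why a long carry chain can leave the arithmetic unfinished when T7 arrives.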

The logic analyzer showed the T clock advancing cleanly each 280 ns from T0 up to T7, but it was always taking several extended T7 cycles. I assumed that the timing of some signals was off, not arriving in time to allow the T clock to advance to T0, thus began carefully reviewing the timing of all relevant signals. I couldn't see any issue, yet the T clock was delayed in advancing to T0 for every storage access cycle.

Finally it dawned on me, while walking through the ALD pages related to the T clock, that this behavior I was seeing on the recreated 1130 was the intended behavior of the circuits. The ALD had some FFs forming a counter that would count away some 280 ns cycles before it triggered the advance of T clock to T0.

With the philosophy of this recreation stressing as near exact a reproduction of the logic circuits, gate by gate, as possible, I had coded page after page of ALD diagrams into VHDL but hadn't analyzed the purpose and intent of every gate or signal. I found that I had implemented this delay counter because it was there on page KM212, labeled "T7 Extend". To my surprise, it turns out that machine 1130C is a model 4, a special model that runs slower than the others. While its brothers with the 3.6 microsecond core storage run with T0 following directly after T7, the model 4 runs at a speed of 4.5 microseconds because of the extra T7 cycles that are tossed in to waste 900 nanoseconds at the end of every storage cycle. This was the reason for the delay counter and the reason that my recreation was experiencing extra T7 cycles. Had I realized this up front, I would not have coded in the delay counter nor had any wasted T7 cycles to debug. I modified the VHDL to turn this into a full speed machine, with no intentional wasted T7 cycles.

The second incident where an artificially slowed model caused me to spend hours of unnecessary time was with the emulation of the 1132 printer. I had to understand the mechanism inside if I were going to provide accurate timing simulation of the printer and reproduce its behavior faithfully. I had to emit various pulses and signals from the emulated printer to the device adapter logic of the 1130, and these had to be at the right time if the adapter logic were to work as intended and any printing would run at the same lines per minute as a real 1132.

This printer was built for the 1130 by taking the printing engine of the pre-computer punched card accounting machine, model 407, and wrapping a minimum of electronics around it for use by the 1130. This kept costs down by leveraging and perhaps recycling mechanisms from the base of 407 machines that were being replaced by electronic computers.

At its heart is a cylinder of type rotating in front of the paper, with hammers pushing the paper onto the wheel when the intended letter has rotated into position. To decide when to strike a hammer, the machine had a 'print disc' on the end of the cylinder that was read by photocells to emit timing pulses and the seven bit value of the letter that was rotating into print position next. I had several timing diagrams of the printer which were guiding my emulation - I would put in delay counters to wait for timings based on the documentation, or emit pulses of durations given by the documentation at appropriate points. The emulator hardware sets a print disc rotating at the speed of the 1132, bringing up each of the 48 characters that can be printed in the actual order they are arranged around the cylinder of a physical 1132 printer. Based on the rotation speed, the wheel moves from one character position to the next every 11+ milliseconds.

The way this was driven by programs was pretty byzantine. The printer would interrupt the 1130 once every 11+ milliseconds, which the program would respond to with an XIO Read to get the seven bit value of the upcoming character. The program then looked at the line it was printing, setting a 1 bit for every hammer position of the 120 columns that matched this one character. The bits were set in a fixed location, the scan buffer, which the 1132 printer would fetch with cycle stealing when it was time to actually print the character, firing the hammers to strike those columns where this letter was wanted. The program would then wait for the next interrupt, print the positions that contained the next character, and do this up to 48 times until all the character values in the print line had been printed.
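
The scan loop described above can be modeled in Python as follows. This is a sketch of the idea only. The wheel order below is a made-up placeholder rather than the real arrangement of type on the 1132, and the function simply stands in for the interrupt handler plus the hammer firing.

```python
# Placeholder wheel: 48 characters in an invented order, NOT the real 1132 order.
WHEEL = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ+-*/=.,()'$&"
COLUMNS = 120


def print_line(line, wheel=WHEEL):
    """Return (printed text, number of character interrupts serviced)."""
    line = line.ljust(COLUMNS)[:COLUMNS]
    remaining = sum(1 for ch in line if ch != " ")
    paper = [" "] * COLUMNS
    interrupts = 0
    for ch in wheel:                       # one interrupt per wheel position
        interrupts += 1
        # Scan buffer: a 1 bit for every column that wants this character.
        scan = [1 if want == ch else 0 for want in line]
        for col, fire in enumerate(scan):  # hammers strike the marked columns
            if fire:
                paper[col] = ch
                remaining -= 1
        if remaining == 0:                 # whole line printed, stop early
            break
    return "".join(paper), interrupts
```

A line is finished as soon as every non-blank column has been struck, so a line that uses only characters near the start of the wheel completes in fewer than 48 interrupts.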

One of the diagrams showed the interval between the pulses that caused the interrupt and the actual printing to be 22+ milliseconds. That implied that it would take two full rotations of the cylinder to print a line if it had all 48 character values in it. However, the rated speed of the printer, both for numeric only and general print lines, could never be attained in this case. Even all-numeric lines would involve more than one turn of the cylinder, because the type for the digits was adjacent on the cylinder. If the cylinder rotates to the next digit in 11 ms but it takes 22 ms to read and react, extra rotation was inevitable.

I was quite concerned about this because of the dichotomy between rated speed and the timing diagrams. I spent hours trying to imagine schemes that would still allow a printed line to complete with only one rotation of the cylinder.

In a chance conversation with a docent at the Computer History Museum, in front of one of their 1130 machines on display, I was relaying the problem I hit with the model 4 and its slow-down by wasted cycles. The docent, who had quite a bit of 1130 experience in his earlier days, mentioned several other places where IBM created slowed down, entry level priced models through delays like this. He happened to mention that slowed down models of the 1132 were offered - the light bulb went on! I had timing diagrams from a slowed down 1132; once again these were from the 1130C machine ALDs. I looked at the timing diagrams for the 1130B machine on bitsavers, whose ALDs were incomplete but did contain the timing diagrams, which gave me a correct timing diagram. Only 11+ ms from the pulse causing the interrupt until printing, not 22. The longer time was a delay built into the slowed printer model. That model would print at half the lines per minute of its normal brethren. Mystery solved, and emulation design was easy from that point forward.

However, throughout the construction of this replica, I had to carefully check for missing logic or changes based on such machine specific details. Not solely slowdowns for entry level models, but also address lines and register bits eliminated if a machine had less core than the largest configurations - a 32KW machine needed 15 address lines but an 8KW model would have two of those lines and all the related flip flops eliminated to reduce costs. I had to think through every ALD page that touched on memory addresses to be sure that I had all the logic needed for full 32KW implementation.
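
The address line count follows directly from the memory size in words. A small Python check of the figures above (the function name is mine):

```python
def address_lines(words):
    """Number of address bits needed to select any one of `words` words."""
    return (words - 1).bit_length()


for kw in (4, 8, 16, 32):
    print(f"{kw}KW needs {address_lines(kw * 1024)} address lines")
```

Each halving of the core size removes one line, so the 8KW machine sits two lines below the 32KW one.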

As well, if a machine did not have a card reader installed, for example, then cost would be saved by deleting all the signals and circuits related to that device. Interrupt and cycle stealing logic in particular varied quite a bit based on such configuration issues. By comparing the 1130 B and C ALDs from bitsavers and portions of the ALD from the 1130 being restored at the National Museum of Computing, I identified and included logic related to several such options. Sometimes the timing of other signals needs to be delayed or generated earlier to suit a device, for example the Storage Access Channel (SAC) and the attached multiplexor channel (the 360 mux channel was leveraged by the 1130 to attach the 1403 printer, 2260 graphics stations, 2310 and 2311 disk drives and other such peripherals). I will need a full set of ALD pages for the SAC in order to properly support it - the bitsavers machines do not contain SAC support.

Abandoning plan to use onboard flash as a disk drive

The original plan for this system was to make use of the onboard flash on the Nexys2 board, whose 128MB capacity seemed to offer room to place seven virtual disk cartridges and access them from the disk emulation hardware as needed. The speed of the 1130 disk is slow, with 28 microseconds between each word as the data is streaming in or out of the heads, which would appear at first glance to easily allow for use of flash to read or write the word as needed. The flash has an effective write time of 13.6 microseconds per word, but that is at best case which would require building and writing 32B buffers not individual words, with some kind of FIFO buffer between disk emulation and the flash access module.

A complication comes from the implementation on the Nexys2 board, where the address, data and most control lines are shared between RAM and flash. Only one is accessed at any given time, with RAM the clear priority to ensure the 1130 operates correctly and meets its realistic timing objective. Thus, access to flash would have to fit 'in between' the 1130 CPU and cycle steal accesses. The initial concept was to only allow flash access at phase A of a T7/X7 cycle, finishing up any flash access use of the signal lines before the 1130 itself begins RAM access.

The complexity of overlapping all this is considerable, with operations such as a write of a 32 byte block taking well over 200 microseconds to complete and the FIFO then needing to be emptied asynchronously on that long timescale. On top of that, the challenges of error recovery are thorny, since the disk emulation module and the 1130 program itself will have moved well past the IO by the time the last part of a sector is written. Any problem is detected long after we could have presented some meaningful status to the software running in the 1130.
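
The timing budget behind those figures can be worked out with a little arithmetic. The Python sketch below uses only the numbers quoted in this post (28 microseconds per disk word, 13.6 microseconds per word best case flash write, 32 byte buffers) and assumes two bytes per 1130 word. It ignores the extra delay from sharing the bus with RAM, so the real picture would be worse.

```python
import math

WORD_INTERVAL_US = 28.0          # time between words streaming from the disk
FLASH_WRITE_US_PER_WORD = 13.6   # best case, when writing whole buffers
BUFFER_BYTES = 32
BYTES_PER_WORD = 2               # 16-bit 1130 words


def flash_budget():
    """Return the key timing figures for one buffered flash write."""
    words = BUFFER_BYTES // BYTES_PER_WORD
    write_us = words * FLASH_WRITE_US_PER_WORD
    fill_us = words * WORD_INTERVAL_US
    # Disk words that arrive while one buffer write is still in progress.
    arriving = math.ceil(write_us / WORD_INTERVAL_US)
    return {"words": words, "write_us": write_us,
            "fill_us": fill_us, "words_arriving_during_write": arriving}


print(flash_budget())
```

On these numbers a 16 word buffer takes about 218 microseconds to write while the disk delivers another 8 words, so the FIFO has to hold at least that many beyond the buffer being written, and the write only completes long after the words it contains were transferred.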

At this point, I suspect that any flash or disk drive used to emulate the disk drive will be hung off a fast serial link of some sort, eliminating the shared signals that ultimately forced me to drop use of the onboard flash. More later as I redesign and implement a replacement mechanism.

It is also possible that I will figure out a method that safely exploits the onboard flash, which would have been the most natural approach and the most efficient in space and energy use.

Building the console hardware prototypes

Here is the wiring for half of the console lights panel for the 1130 replica - these are the lights that display the contents of IAR, SAR, SBR, ACC, EXT and AFR registers, bits 0 to 15.

Wiring of 96 LEDs for the left side of console light display

These are placed at 1:1 scale to an IBM 1130 display. The board you see is 4" by 10"; the left side of the 1130 panel lights encompasses a 5" by 10" space, and adding in the right side produces the 20" x 5" black display that lights to indicate the contents or status of key parts of the machine. This board will sit inside the display panel box, with 1/2" clearance to the top and bottom, fronted by a smoked plexiglas plate that will recreate the lettering and cutout numerals that are lit by the LEDs in each position.

Left side of light panel - contents of six registers
The six registers, from top to bottom, are the Instruction Address, Storage Address, Storage Buffer, Arithmetic Factor, Accumulator, and Accumulator Extension Registers, each 16 bits wide.
IBM 1130 at National Museum of Computing, Milton Keynes, England, UK
Our light panel represents the left half of the black rectangle you can see to the right of the red "emergency pull" switch.

LED driver boards
 Prototype boards implementing a chain of three MAX7219 chips that will drive 192 LEDs based on data sent over a three wire serial protocol link from the recreated 1130 FPGA board. This board operates at 5V and has a level shifting 74HCT00 chip to permit it to operate with signals from the FPGA which are based on 3.3V logic levels. 
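
For readers unfamiliar with the MAX7219, here is a Python sketch of how the serial data for a chain of three chips could be assembled. It follows the chip's 16-bit word format (register address in bits 11 to 8, data in bits 7 to 0) and the daisy chain rule that the first word shifted in ends up in the chip farthest from the source. The function names are my own, and this is not the VHDL used in the replica.

```python
CHIPS = 3
LEDS_PER_CHIP = 8 * 8   # eight digit registers of eight segment bits each


def max7219_frame(register, data):
    """One 16-bit MAX7219 word: address in bits 11-8, data in bits 7-0."""
    return ((register & 0x0F) << 8) | (data & 0xFF)


def chain_bits(register, data_per_chip):
    """Bits to shift out, MSB first, to load one register on every chip.

    data_per_chip[0] is for the chip nearest the FPGA, so its word is
    shifted last; the load line is pulsed once after all bits are sent.
    """
    bits = []
    for data in reversed(data_per_chip):
        frame = max7219_frame(register, data)
        bits.extend((frame >> i) & 1 for i in range(15, -1, -1))
    return bits


print(CHIPS * LEDS_PER_CHIP)   # total LEDs the chain can drive
```

Updating all 192 LEDs therefore takes eight such 48-bit transfers, one per digit register (addresses 1 through 8).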



Switch/Button input multiplexor board
The board shown above will take 32 button or switch contacts and multiplex them over a two wire I2C serial bus to the FPGA 1130 machine. The board operates at 3.3V to interface with the FPGA board, but has level shifting chips to allow 5V logic levels for inputs. It is used with the debouncer board below. The buttons, toggle switches and rotary switches of the 1130 console, except for the 16 toggle switches that are mounted on the console printer faceplate, are routed through this concentrator and into the hardware on the FPGA board. It uses an MCP23017 to multiplex signals and several 74HC4050 level shifter buffers, which tolerate 5V inputs and produce legal 3.3V outputs.

Input debouncer board
This debouncer will remove any bouncing of the state of the buttons as they are operated. When a physical switch is pressed or released, its contacts produce a short term blizzard of on and off conditions, but the intent is to record just the final condition - 1 or 0 - which the switch will settle down to deliver steadily a few milliseconds after it is operated. The debouncer eliminates those glitchy short term effects and passes through only the intended change, on to off or off to on. This board as built implements debouncers for 24 switches/buttons; more chips will be added as more buttons are required. It uses MC14490 debouncer chips which operate on 5V, producing the signals that are shifted down to 3.3V signals by the buffers on the concentrator board above.
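
The principle can be shown with a small Python model. This is a generic counting debouncer, not a description of the MC14490's internal circuit, and the requirement of four consecutive agreeing samples is an arbitrary choice for the example.

```python
class Debouncer:
    """Change the output only after the input has differed from it
    for `stable_samples` consecutive samples."""

    def __init__(self, stable_samples=4, initial=0):
        self.stable_samples = stable_samples
        self.output = initial
        self._count = 0

    def sample(self, raw):
        """Feed one raw 0/1 sample; return the debounced output."""
        if raw == self.output:
            self._count = 0            # a bounce back resets the count
        else:
            self._count += 1
            if self._count >= self.stable_samples:
                self.output = raw      # input has settled at the new level
                self._count = 0
        return self.output


d = Debouncer()
bouncy_press = [1, 0, 1, 1, 0, 1, 1, 1, 1, 1]
print([d.sample(s) for s in bouncy_press])
```

The output stays at 0 through the bouncing at the start of the press and switches to 1 only once four samples in a row agree.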

The 16 toggle switches, called the Console Entry Switches, are separately multiplexed by an MCP23S17 chip which is very similar to the MCP23017 but uses the SPI protocol, a four wire serial link, rather than the I2C protocol of the other chip. Debouncing is currently done in the FPGA hardware but will be handled by MC14490 chips in the next version of the board (not pictured here).

Together, these boards allow 240 input and output devices (LEDs, buttons, and switches) to be connected using only 9 wires into the FPGA.