Nonblocking minimal spanning switch
||This article includes a list of references, related reading or external links, but its sources remain unclear because it lacks inline citations. (January 2013)|
A nonblocking minimal spanning switch is a device that can connect N inputs to N outputs in any combination. The most familiar use of switches of this type is in a telephone exchange. The term "non-blocking" means that if it is not defective, it can always make the connection. The term "minimal" means that it has the fewest possible components, and therefore the minimal expense.
Historically, in telephone switches, connections between callers were arranged with large, expensive banks of electromechanical relays, Strowger switches. The basic mathematical property of Strowger switches is that for each input to the switch, there is exactly one output. Much of the mathematical switching circuit theory attempts to use this property to reduce the total number of switches needed to connect a combination of inputs to a combination of outputs.
In the 1940s and 1950s, engineers in Bell Laboratories began an extended series of mathematical investigations into methods for reducing the size and expense of the "switched fabric" needed to implement a telephone exchange. One early, successful mathematical analysis was performed by Charles Clos (French pronunciation: [ʃaʁl Cləʊ]), and a switched fabric constructed of smaller switches is called a clos network.
Background: switching topologies
The crossbar switch
The crossbar switch has the property of being able to connect N inputs to N outputs in any one-to-one combination, so it can connect any caller to any non-busy receiver, a property given the technical term "nonblocking". Being nonblocking it could always complete a call (to a non-busy receiver), which would maximize service availability.
However, the crossbar switch does so at the expense of using N2 (N squared) simple SPST switches. For large N (and the practical requirements of a phone switch are considered large) this growth was too expensive. Further, large crossbar switches had physical problems. Not only did the switch require too much space, but the metal bars containing the switch contacts would become so long that they would sag and become unreliable. Engineers also noticed that at any time, each bar of a crossbar switch was only making a single connection. The other contacts on the two bars were unused. This seemed to imply that most of the switching fabric of a crossbar switch was wasted.
The obvious way to emulate a crossbar switch was to find some way to build it from smaller crossbar switches. If a crossbar switch could be emulated by some arrangement of smaller crossbar switches, then these smaller crossbar switches could also, in turn be emulated by even smaller crossbar switches. The switching fabric could become very efficient, and possibly even be created from standardized parts. This is called a Clos network.
Completely connected 3-layer switches
The next approach was to break apart the crossbar switch into three layers of smaller crossbar switches. There would be an "input layer", a "middle layer" and an "output layer." The smaller switches are less massive, more reliable, and generally easier to build, and therefore less expensive.
A telephone system only has to make a one-to-one connection. Intuitively this seems to mean that the number of inputs and the number of outputs can always be equal in each subswitch, but intuition does not prove this can be done nor does it tell us how to do so. Suppose we want to synthesize a 16 by 16 crossbar switch. The design could have 4 subswitches on the input side, each with 4 inputs, for 16 total inputs. Further, on the output side, we could also have 4 output subswitches, each with 4 outputs, for a total of 16 outputs. It is desirable that the design use as few wires as possible, because wires cost real money. The least possible number of wires that can connect two subswitches is a single wire. So, each input subswitch will have a single wire to each middle subswitch. Also, each middle subswitch will have a single wire to each output subswitch.
The question is how many middle subswitches are needed, and therefore how many total wires should connect the input layer to the middle layer. Since telephone switches are symmetric (callers and callees are interchangeable), the same logic will apply to the output layer, and the middle subswitches will be "square", having the same number of inputs as outputs.
The number of middle subswitches depends on the algorithm used to allocate connection to them. The basic algorithm for managing a three-layer switch is to search the middle subswitches for a middle subswitch that has unused wires to the needed input and output switches. Once a connectible middle subswitch is found, connecting to the correct inputs and outputs in the input and output switches is trivial.
Theoretically, in the example, only four central switches are needed, each with exactly one connection to each input switch and one connection to each output switch. This is called a "minimal spanning switch," and managing it was the holy grail of the Bell Labs' investigations.
However, a bit of work with a pencil and paper will show that it is easy to get such a minimal switch into conditions in which no single middle switch has a connection to both the needed input switch and the needed output switch. In fact, if there is one call per input switch, and this results in one call per output switch, then the first sixteen calls of this hypothetical switch actually block up to fifteen additional calls that would need similar connections.[clarification needed (After 16 calls, there are no lines left to block)]
For this reason, a "simply connected nonblocking switch" 16x16 switch with four input subswitches and four output switches was thought to require 7 middle switches. For this reason, sometimes this switch arrangement is called a "2n-1 switch", where n is the number of input ports of the input subswitches.
The example is intentionally small, and in such a small example, the reorganization does not save many switches. A 16x16 crossbar has 256 contacts, while a 16x16 minimal spanning switch has 4x4x4x3 = 192 contacts. Real telephone exchanges have hundreds of thousands of inputs, and tens of millions of switch contacts.
Managing a minimal spanning switch
The crucial discovery was a way to reorganize connections in the middle switches to "trade wires" so that a new connection could be completed.
The first step in the nonblocking minimal spanning switch algorithm is just the naive earlier scheme (above): Search for a middle-layer subswitch that contains the needed in and out connections. If a middle-layer subswitch is found that connects to both the needed input and output subswitches, then it is allocated, and the connection goes through.
However, since the number of middle subswitches is smaller in a minimal spanning switch than in a "2n-1" switch, sometimes this search fails. If a subswitch with the needed pair of connections can't be found, a pair of subswitches must be found. One subswitch must have a connection to the needed input switch; the other must have a connection to the needed output subswitch. These subswitches have to exist, because for each input in a minimal spanning switch, there is a wire from the input subswitches to the middle subswitches. Since the whole switch is for a telephone system (caller and callees are interchangeable), it is also symmetric, so there is also a free wire from one of the middle layer subswitches to the needed output subswitch.
Now, conceptually, the algorithm has to reorganize the connections in these two middle subswitches (call them A and B). The idea is to keep all of the existing connections that are already passing through A and B, to prevent dropping calls, and yet bring together, into either A or B two wires leading to the needed input and output subswitches.
In the standard explanatory drawing, A and Bs' diagrammed connections are actually laid one atop another. The inputs of A must lie on the corresponding inputs of B. The outputs of A must likewise lie upon the corresponding outputs of B.
The connections passing through A and B are placed in a list that also includes the desired new connection.
First, each connection that has only a single input or single output is traced on the superposition of A and B. In a pencil-and-paper drawing, one actually moves the pencil along the drawn connection.
Starting from some input or output, one traces a connection to an output, then traces the other connection at that output to an input, and so forth, back and forth until one comes to an end with no other connections.
Each time one traces from input to output, the connection is allocated to subswitch A. When tracing in the other direction, that connection is allocated to B.
After all the connections with single inputs or single outputs are gone, all that is left are cyclic graphs, or "loops" of connections. Again, one traces each graph completely, assigning connections to subswitches, and removing each connection from the list of connections.
There is less bookkeeping if all the non-loops (acyclic graphs) are traced and removed before the loops (cyclic graphs) are traced and removed. In that way, one never has to check any input or output more than twice, and there's no need to keep a list of which inputs and outputs have been examined.
It doesn't matter whether A or B gets a certain direction of trace, because A and B have the same number of connections: one wire to every input and output subswitch.
After the connections are allocated in arrays in the software, then the electronics of the switch can actually be reprogrammed, physically moving the connections. The electronic switches are designed intentionally so that the new data can all be written into the electronics, and then take effect with a single logic pulse. The result is that the connection moves instantaneously, with an imperceptible interruption to the conversation. In older electromechanical switches, one occasionally heard a clank of "switching noise."
This algorithm is a form of topological sort, and is the guts of the algorithm that controls a minimal spanning switch.
Practical implementations of switches
As soon as the algorithm was discovered, Bell system engineers and managers began discussing it. After several years, Bell engineers began designing electromechanical switches that could be controlled by it. At the time, computers used tubes and were not reliable enough to control a phone system (phone system switches are safety-critical, and they are designed to have an unplanned failure about once per thirty years). Relay-based computers were too slow to implement the algorithm. However, the entire system could be designed so that when computers were reliable enough, they could be retrofitted to existing switching systems.
It's not difficult to make composite switches fault-tolerant. When a subswitch fails, the callers simply redial. So, on each new connection, the software tries the next free connection in each subswitch rather than reusing the most recently released one. The new connection is more likely to work because it uses different circuitry.
Therefore, in a busy switch, when a particular PCB lacks any connections, it is an excellent candidate for testing.
To test or remove a particular printed circuit card from service, there is a well-known algorithm. As fewer connections pass through the card's subswitch, the software routes more test signals through the subswitch to a measurement device, and then reads the measurement. This does not interrupt old calls, which remain working.
If a test fails, the software isolates the exact circuit board by reading the failure from several external switches. It then marks the free circuits in the failing circuitry as busy. As calls using the faulty circuitry are ended, those circuits are also marked busy. Some time later, when no calls pass through the faulty circuitry, the computer lights a light on the circuit board that needs replacement, and a technician can replace the circuit board. Shortly after replacement, the next test succeeds, the connections to the repaired subswitch are marked "not busy," and the switch returns to full operation.
The diagnostics on Bell's early electronic switches would actually light a green light on each good printed circuit board, and light a red light on each failed printed circuit board. The printed circuits were designed so that they could be removed and replaced without turning off the whole switch.
The eventual result was the Bell 1ESS switch (electronic switching system 1). This was controlled by a 3B20 computer, a lock-step dual computer using reliable diode transistor logic. In the 1ESS's 3B20, two computers performed each step, checking each other. When they disagreed, they would diagnose themselves, and the correctly running computer would take up switch operation while the other would disqualify itself and request repair. The 1ESS switch was still in limited use as of 2012, and had a verified reliability of less than one unscheduled hour of failure in each thirty years of operation, validating its design.
Initially it was installed on long distance trunks in major cities, the most heavily used parts of each telephone exchange. On the first Mother's Day that major cities operated with it, the Bell system set a record for total network capacity, both in calls completed, and total calls per second per switch. This resulted in a record for total revenue per trunk.
A practical implementation of a switch can be created from an odd number of layers of smaller subswitches. Conceptually, the crossbar switches of the three-stage switch can each be further decomposed into smaller crossbar switches. Although each subswitch has limited multiplexing capability, working together they synthesize the effect of a larger N×N crossbar switch.
In a modern telephone switch, application of two different multiplexer approaches in alternate layers further reduces the cost of the switching fabric:
- space-division multiplexers are something like the crossbar switches already described, or some arrangement of crossover switches or banyan switches. Any single output can select from any input. In digital switches, this is usually an arrangement of and gates. 8000 times per second, the connection is reprogrammed to connect particular wires for the duration of a time slot. Design advantage: In space-division systems the number of space-division connections is divided by the number of time slots in the time-division multiplexing system. This dramatically reduces the size and expense of the switching fabric. It also increases the reliability, because there are far fewer physical connections to fail.
- time division multiplexers each have a memory which is read in a fixed order and written in a programmable order (or vice versa). This type of switch permutes time-slots in a time-division multiplexed signal that goes to the space-division multiplexers in its adjacent layers. Design advantage: Time-division switches have only one input and output wire. Since they have far fewer electrical connections to fail, they are far more reliable than space-division switches, and are therefore the preferred switches for the outer (input and output) layers of modern telephone switches.
The scarce resources in a telephone switch are the connections between layers of subswitches. These connections can be either time slots or wires, depending on the type of multiplexing. The control logic has to allocate these connections, and the basic method is the algorithm already discussed. The subswitches are logically arranged so that they synthesize larger subswitches. Each subswitch, and synthesized subswitch is controlled (recursively) by the above algorithm.
If the recursion is taken to the limit, breaking down the crossbar to the minimum possible number of switching elements, the resulting device is sometimes called a crossover switch or a banyan switch depending on its topology.