Virtual output queueing
Virtual output queueing (VOQ) is the technique used in input-queued switches where rather than keeping all traffic in a single queue, separate queues are maintained for each possible output location. It addresses a common problem known as head-of-line blocking.[1]
Description
In VOQ each input port maintains a separate queue for each output port. It has been shown that VOQ can achieve 100% throughput performance with an effective scheduling algorithm. This scheduling algorithm should be able to provide a high speed mapping of packets from inputs to outputs on a cycle-to-cycle basis. The VOQ mechanism provides throughput at a much higher rate than the crossbar switches without it.
There are many algorithms for design and implementation of fast VOQ. Nick McKeown and a group at Stanford University for example published a design in 1997.[2]
Quality of service and priority are extensions found in literature of the same time.[3]
VOQ scheduling is often referred to as "arbitration" (resolving the concurrent access wishes), whereas the ordering of packets ("packet scheduling") is an additional task[4] following the VOQ arbitration.
Example
For example, consider a 2×2 crossbar switch.
-------- a--->|-\--/-|--->--0 | \/ | | /\ | b--->| /--\ |---->-1 --------
Let's say that data "0" arriving at port A or B will go to output port 0. Similarly data "1" arriving at port A or B will go to output port 1. So the number of combinations that can happen at the input port A, B are: 00, 01, 10, and 11.
If data at the input is "00", this means both the input data at time t are contending for the same output port 0 and the output port 0 can't route both the data at the same time as it can handle only one unit data per time slot. So in this case the efficiency of the 2×2 switch (without VOQ) is 0.5.
Same is the case for data "11" in which the efficiency is 0.5. Similarly for data "01" and "10" the efficiency is 1 as there is no contention as both the data go to both output ports 0 and 1.
Since it is a 2×2 switch, the probability that any one of out of these four combinations of data will occur = 0.25. So the efficiency of this 2×2 switch is:
(0.25 * 0.5) --> for data 00 + (0.25 * 0.5) --> for data 11 + (0.25 * 1.0) --> for data 01 + (0.25 * 1.0) --> for data 10 --------------- = 0.75 (75%)
So we see that the 2×2 crossbar switch is working at 75% efficiency to give out data from input to output. As n increases, for n×n switches this causes further degradation in efficiency. VOQ overcomes this problem by introducing extra buffers (queues) per port.
Let's revisit the same scenario with 2×2 switch with VOQ support.
-------- a--->|-\--/-|-OO-->--0 | \/ | | /\ | b--->| /--\ |-OO--->-1 --------
Here each output port has n buffers per port to accommodate a given maximum number of simultaneous data packets that each port can receive at a time. This buffering mechanism removes the bottleneck per port on peak time and distributes it over a period of time increasing the switch performance.
References
- ^ Goudreau, Mark W.; Kolliopoulos, Stavros G.; Rao, Satish B. (2000). "Scheduling algorithms for input-queued switches: Randomized techniques and experimental evaluation". Proceedings of IEEE INFOCOM. doi:10.1109/INFCOM.2000.832562. ISBN 0-7803-5880-5. CiteSeerx: 10.1.1.42.5126.
- ^ McKeown, Nick; Izzard, Martin; Mekkittikul, Adisak; Ellersick, Bill; Horowitz, Mark (1997). "Tiny Tera: a packet switch core" (PDF). IEEE Micro. 17: 26–33. doi:10.1109/40.566194.
- ^ Schoenen, Rainer; Post, Guido; Sander, Gerald (1999). "Prioritized arbitration for input-queued switches with 100% throughput". Proceedings of ATM Workshop. doi:10.1109/ATM.1999.786865. ISBN 4-88552-164-5.
- ^ Schoenen, Rainer; Hying, Roman (1999). "Distributed cell scheduling algorithms for virtual-output-queued switches". Proceedings of IEEE Globacom. doi:10.1109/GLOCOM.1999.829963. ISBN 0-7803-5796-5.