User:Sachingf/sandbox
Butterfly Network is a multistage interconnection network topology, which can be used to connect different nodes in a multiprocessor system. The interconnect network for a shared memory multiprocessor systems requires very low latency and high bandwidth compared to other network systems like Local Area Networks (LANs) or internet because[1]:-
- Messages are relatively short as most messages are coherence protocol requests and responses without data.
- Messages are generated frequently because each read or write miss generates messages to every node in the system to ensure coherence.
- Since messages are generated frequently, it is difficult for processors to hide the communication delay.
The major components of an interconnect network are[2]:-
- Processor Nodes - They consist of one or more processors along with their caches, memories and communication assist.
- Switching Nodes (Router) - They connect communication assist of different processor nodes and other switching nodes.
- Link - Physical wires between two switching nodes (routers). They can be unidirectional or bidirectional.
These multi stage networks have lower cost than a cross bar, but still obtain lower contention than a bus. Also, the ratio of switch nodes to processor nodes is greater than one in a butterfly network. Therefore it is an indirect topology[3]. The network derived its name from connections between nodes in two adjacent ranks (as shown in figure 1), which resembles a butterfly. When top and bottom ranks are merged into a single rank, it is called a Wrapped Butterfly Network[3]. In figure 1, if rank 3 nodes are connected back to respective rank 0 nodes, then it becomes a wrapped butterfly network.
BBN Butterfly, a massive parallel computer built by Bolt, Beranek and Newman in the 1980s, used butterfly interconnect network[4].
For a butterfly network with 'p' processor nodes, you need to have p(log2 p + 1) switching nodes. Figure 1 shows a network with 8 processor nodes, which means we have 32 switching nodes. It also represents each node as N(rank, column number). For example, node at column 6 in rank 1 is represented as (1,6) and node at column 2 in rank 0 is represented as (0,2).
For any 'i' greater than zero, a switching node N(i,j) gets connected to N(i-1, j) and N(i-1, m), where 'm' is obtained by flipping the ith most significant bit of j. For example, consider the node N(1,6). Here, i equals 1 and j equals 6. Therefore, m is obtained by flipping the first most significant bit of 6.
Variable | Binary representation | Decimal Representation |
j | 110 | 6 |
m | 010 | 2 |
As a result, the nodes connected to N(1,6) are :-
N(i,j) | N(i-1,j) | N(i-1,m) |
(1,6) | (0,6) | (0,2) |
Interestingly, N(0,6), N(1,6), N(0,2), N(1,2) form a butterfly pattern. We can find several such butterfly patterns in the figure and therefore, this network is called Butterfly Network.
Consider that in a wrapped butterfly network (which means rank 0 gets merged with rank 3. In figure 2, it is shown by replicating the processor nodes below rank 3) , we need to send a message from processor 5 to processor 2. The packet transmitted over the link is of the form :-
Header | Payload | Trailer |
The header contains the destination of the message, which is processor 2 (010 in binary). The payload is the message 'M' and trailer contains checksum. Therefore, the actual message transmitted from processor 5 is :-
010 | M | checksum |
Upon reaching a switching node, one of the two output links is selected, based on a single bit of the destination address. If that bit is zero, the left link is selected. If that bit is one, the right link is selected.
- The above packet reaches N(0,5). From the header of the packet, it removes the leftmost bit, to decide the direction. Since it is a zero, left link of N(0,5) (which connects to N(1,1)) gets selected. The new header is '10'.
- The new packet reaches N(1,1). From the header of the packet, it removes the leftmost bit, to decide the direction. Since it is a one, right link of N(1,1) (which connects to N(2,3)) gets selected. The new header is '0'.
- The new packet reaches N(2,3). From the header of the packet, it removes the leftmost bit, to decide the direction. Since it is a zero, left link of N(2,3) (which connects to N(3,2)) gets selected. The header field is empty.
- Processor 2 receives the packet, which now contains only the payload 'M' and the checksum.
There are several parameters to evaluate a network topology, but the prominent ones relevant in designing large scale multi-processor systems are discussed here. These parameters are summarized below and we will see how they are calculated for a butterfly network with 8 processor nodes as shown in figure 1.
- Bisection Bandwidth: A representative measure of the bandwidth bottleneck which restricts overall communication. It is the maximum bandwidth required to sustain communication between all nodes in the network. We can interpret this as the minimum number of links that need to be severed to split the system into two equal portions. For example, the 8 node butterfly network can be split into two by cutting 4 links that crisscross across the middle. Thus bisection bandwidth of this particular system is 4.
- Diameter: This is the worst case latency (between two nodes) possible in the system. So, in the 8 node butterfly network, it appears that N(0,0) and N(3,7) are farthest away. But upon inspecting carefully, we can see that due to the symmetric nature of the network, traversing from any rank 0 node to any rank 3 node requires only 3 hops. Therefore the diameter of this system is 3.
- Links: Total number of links required to construct the entire network structure. This is an indicator for overall cost and complexity of implementation. The example network requires a total of 48 links (16 links each between rank 0 and 1, rank 1 and 2, rank 2 and 3).
- Degree: Represents the complexity of each router in the Network. This is equal to the number of in/out links connected to each switching node. The butterfly network switching nodes have 2 input links and 2 output links, hence it is a 4-degree network.
Let us compare the butterfly network with some other network topologies like 2-D mesh and hypercube. All their relevant parameters are compiled in the below table[6] (‘p’ represents the number of processor nodes).
Topology | Diameter | Bisection Bandwidth | Links | Degree |
2-D mesh | 2((√p) - 1) | √p | 2√p((√p) - 1) | 4 |
Hypercube | log2(p) | p/2 | log2(p) × (p/2) | log2(p) |
Butterfly | log2(p) | p/2 | log2(p) × 2p | 4 |
It can be clearly seen that hypercube and butterfly networks have better diameter and bisection bandwidth when compared to a 2-D mesh with same number of processor nodes. But butterfly network implementation is more complex and costlier than 2-D mesh due to the higher number of required links.
The difference between hypercube and butterfly lies within their implementation. Butterfly network has a symmetric structure where all processor nodes between two ranks are equidistant to each other, whereas hypercube is more suitable for a multi-processor system which demands unequal distances between its nodes. By simply looking at the number of links required, it may appear that hypercube is cheaper and simpler compared to a butterfly network. But as the number of processor nodes go beyond 16, the router cost and complexity (represented by Degree) of butterfly network is lower than hupercube because its router structure is independent of the number of nodes.
The conclusion is that there is no single network topology which is best for all scenarios. The decision has to be made based on factors like number of processor nodes in the system, bandwidth-latency requirements, cost and scalability.
See Also
[edit]References
[edit]- ^ Solihin, Yan (2009). Fundamentals of Parallel Computer Architecture. Solihin Books. pp. 361–362. ISBN 978-0-9841630-0-7.
- ^ Solihin, Yan (2009). Fundamentals of Parallel Computer Architecture. Solihin Books. pp. 363–364. ISBN 978-0-9841630-0-7.
- ^ a b c d Leighton, F.Thomson (1992). Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes. Morgan Kaufmann Publishers. ISBN 1-55860-117-1.
- ^ T., LeBlanc,; M., Scott,; C., Brown, (1988-01-01). "Large-Scale Parallel Programming: Experience with the BBN Butterfly Parallel Processor".
{{cite journal}}
: Cite journal requires|journal=
(help)CS1 maint: extra punctuation (link) CS1 maint: multiple names: authors list (link) - ^ Solihin, Yan (2009). Fundamentals of Parallel Computer Architecture. Solihin Books. pp. 367–369. ISBN 978-0-9841630-0-7.
- ^ a b Solihin, Yan (2009). Fundamentals of Parallel Computer Architecture. Solihin Books. pp. 370–371. ISBN 978-0-9841630-0-7.