The theoretical upper limit of a network's capacity is its liquid throughput (assuming there is no network coding). The liquid throughput corresponds to the flow of a liquid in an equivalent network of pipes. The aggregate throughput of an arbitrarily scheduled collective communication may be several times lower than the maximal potential throughput of the network due to congestions between simultaneous transfers sharing a common communication resource.
This method relies on the knowledge of the underlying network topology and ensures an optimal utilization of all bottleneck links of the network. It properly schedules the transmissions within a time as short as the utilization time of a bottleneck link. This guarantees that the liquid throughput is attained.
Liquid scheduling can be applied in high-performance computing (HPC) networks. It can be also applied in optical networks, for example in optical burst switching (OBS) where the edge IP routers perform liquid scheduling in order to ensure an efficient utilization of the capacities of the interconnecting optical cloud. High-performance computing relies on networks with very low latencies. In such networks large messages are copied from one processor to another across the network. The intermediate switches are directing the content of the message without storing and forwarding the messages at each intermediate hop. Network overhead decreases when the size of the messages increases. From another side, simultaneous transmissions of large indivisible messages across the network may result in congestion when the transmission paths intersect. When the number of parallel transmissions increases, the rate of congestions increases rapidly. The throughput gain achieved by the data aggregation can be undermined by the high rate of congestions.
Optical networks are another example of coarse-grained circuit switching networks. Lightpaths sharing a common wavelength on a common link cannot be established in overlapping periods of time. Increasing the number of parallel transmissions may yield many blocked lightpaths and affect the throughput.
To construct a liquid schedule, one must choose time frames utilizing all bottleneck links and perform as many transfers as possible within each timeframe. The traffic is therefore partitioned into time frames comprising mutually non-congesting transfers keeping all bottleneck links busy during all time frames. The saturated subsets of non-congesting transfers using all bottleneck links are called full teams. An efficient construction of liquid schedules relies on the fast retrieval of full teams. Significant speed up in the construction algorithm can be obtained by carrying out optimizations both in the retrieval of full teams and in their assembly into a schedule.
Measurements on the traffic patterns carried out on real network have shown that it is possible to increase the overall communication throughput by a factor between 1.5 and 2.
The liquid schedules can be found in fractions of a second for traffic patterns consisting of several thousand transfers across networks of up to hundred nodes.
The chart shows the overall network throughput of collective communication patterns when carried out according to liquid scheduling algorithm. 362 different communication patterns are considered, involving up to 32 nodes and comprising up to 1024 parallel transmissions. The gray area shows the theoretical upper limit of the network capacity for each communication pattern. The gray dots show the performance of the network when the communication is carried out according to a topology unaware schedule (round robin scheduling or random topology-unaware scheduling exhibit similar performances). The black dots, representing the communication performance carried out according liquid schedules, are very close to the theoretical limit. The measurements are carried out on the Swiss-Tx cluster supercomputer, consisting of 64 processors and installed in Swiss Federal Institute of Technology, Lausanne (EPFL).
- Emin Gabrielyan, Roger D. Hersch (2004). "Liquid Schedule Searching Strategies for the Optimization of Collective Network Communication". In Hamid R. Arabnia, Laurence T. Yang, Chi-Hsiang Yeh. International Conference on Wireless Networks. pp. 834–850. ISBN 1-932415-39-4.