In telecommunications and computer engineering, queuing delay (also spelled queueing delay) is the time a job waits in a queue until it can be executed. It is a key component of network delay. In a switched network, queuing delay is the time between the completion of signaling by the call originator and the arrival of a ringing signal at the call receiver. Queuing may occur at the originating switch, at intermediate switches, or at the switch serving the call receiver. In a data network, queuing delay is the sum of the delays between the request for service and the establishment of a circuit to the called data terminal equipment (DTE). In a packet-switched network, queuing delay is the sum of the delays encountered by a packet between the time of insertion into the network and the time of delivery to the addressee.
The term is most often used in reference to routers. When packets arrive at a router, they have to be processed and transmitted. A router can only process one packet at a time. If packets arrive faster than the router can process them (as in a burst transmission), the router places them in a queue (also called a buffer) until it can transmit them. Because the delay varies from packet to packet, queuing delay is usually measured and evaluated using averages and other statistics.
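The packet-to-packet variation described above can be illustrated with a minimal sketch of a single first-in, first-out output link. The arrival trace, the fixed 10 ms service time, and the function name `fifo_queuing_delays` are assumptions chosen for illustration, not measurements from a real router.

```python
import statistics

def fifo_queuing_delays(arrival_times_ms, service_time_ms):
    """Return the time each packet waits before its transmission starts."""
    delays = []
    link_free_at = 0  # time at which the single output link becomes idle
    for arrival in arrival_times_ms:
        start = max(arrival, link_free_at)  # wait if the link is still busy
        delays.append(start - arrival)
        link_free_at = start + service_time_ms
    return delays

# Hypothetical trace: a burst of four packets, then one isolated packet.
arrivals_ms = [0, 1, 2, 3, 50]
delays_ms = fifo_queuing_delays(arrivals_ms, service_time_ms=10)

print("per-packet delay (ms):", delays_ms)
print("mean:", statistics.mean(delays_ms), "max:", max(delays_ms))
```

Packets inside the burst wait progressively longer, while the isolated packet does not wait at all, which is why a single figure rarely describes queuing delay adequately.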
As a queue fills because traffic arrives faster than it can be processed, the delay a packet experiences in passing through the queue increases. The rate at which the contents of a queue can be processed is a function of the transmission rate of the facility. This leads to the classic delay curve. In an M/M/1 queue, the average delay a packet is likely to experience is given by the formula 1/(μ-λ), where μ is the number of packets per second the facility can sustain and λ is the average rate at which packets arrive to be serviced. The formula applies only when λ is less than μ and no packets are dropped from the queue; as λ approaches μ, the delay grows without bound.
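As a rough illustration of the delay curve, the sketch below evaluates 1/(μ-λ) at several loads. The service rate of 1000 packets per second and the function name `mm1_mean_delay` are assumptions made for the example.

```python
def mm1_mean_delay(service_rate, arrival_rate):
    """Mean delay 1 / (mu - lambda), in seconds, valid only for lambda < mu."""
    if arrival_rate >= service_rate:
        raise ValueError("arrival rate must be below the service rate")
    return 1.0 / (service_rate - arrival_rate)

mu = 1000.0  # assumed capacity in packets per second
for lam in (100.0, 500.0, 900.0, 990.0):
    delay_ms = mm1_mean_delay(mu, lam) * 1000.0
    print(f"load {lam / mu:.2f}: mean delay {delay_ms:.2f} ms")
```

The delay changes little at low load and rises steeply as the arrival rate approaches the service rate.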
The maximum queuing delay is proportional to buffer size. The longer the line of packets waiting to be transmitted, the longer the average waiting time is. The router queue of packets waiting to be sent also introduces a potential cause of packet loss. Since the router has a finite amount of buffer memory to hold the queue, a router that receives packets at too high a rate may experience a full queue. In this case, the router has no option but to discard the excess packets.
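Both points can be sketched in a few lines: the worst-case delay of a full buffer is its size divided by the link rate, and a full queue discards new arrivals. The buffer size, link rate, and the names `max_queuing_delay` and `TailDropQueue` are illustrative assumptions; real routers may use other drop policies.

```python
from collections import deque

def max_queuing_delay(buffer_bytes, link_rate_bps):
    """Time in seconds needed to drain a completely full buffer."""
    return buffer_bytes * 8 / link_rate_bps

class TailDropQueue:
    """Finite FIFO queue that discards arrivals when it is full."""

    def __init__(self, capacity_packets):
        self.capacity = capacity_packets
        self.packets = deque()
        self.dropped = 0

    def enqueue(self, packet):
        if len(self.packets) >= self.capacity:
            self.dropped += 1  # no room left: the packet is lost
            return False
        self.packets.append(packet)
        return True

    def dequeue(self):
        return self.packets.popleft() if self.packets else None

# Assumed figures: a 125 kB buffer in front of a 10 Mbit/s link.
print(max_queuing_delay(125_000, 10_000_000), "seconds")

queue = TailDropQueue(capacity_packets=3)
for packet_id in range(5):
    queue.enqueue(packet_id)
print("queued:", len(queue.packets), "dropped:", queue.dropped)
```

Doubling the buffer in this sketch doubles the worst-case delay, while a smaller buffer trades delay for more drops.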
When the transmission protocol uses dropped packets, the symptom of full buffers, to regulate its transmit rate, as the Internet's TCP does, bandwidth is shared fairly at close to theoretical capacity with minimal congestion delay. Without this feedback mechanism, delays become unpredictable and rise sharply, a symptom also seen as freeways approach capacity (metered on-ramps are an effective remedy there, much as TCP's self-regulation is when the traffic is packets instead of cars). This result is hard to model mathematically and is counterintuitive to those without experience of real networks. Failing to drop packets, and instead buffering an ever-increasing number of them, produces bufferbloat.
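The feedback loop can be sketched with additive increase and multiplicative decrease (AIMD), the rule used in TCP congestion avoidance. This is a deliberately simplified model, not an implementation of TCP: it omits slow start, timeouts, and round-trip timing, and the path capacity, buffer size, and function names are assumptions.

```python
def aimd_step(window, loss_detected, increase=1.0, decrease=0.5, min_window=1.0):
    """Apply one AIMD update to a send window measured in packets."""
    if loss_detected:
        return max(min_window, window * decrease)  # back off after a drop
    return window + increase  # otherwise probe for more bandwidth

def simulate(rounds, pipe_packets, buffer_packets):
    """Track the window of one sender; a drop occurs once pipe and buffer overflow."""
    window = 1.0
    history = []
    for _ in range(rounds):
        history.append(window)
        loss = window > pipe_packets + buffer_packets
        window = aimd_step(window, loss)
    return history

# Assumed path: 10 packets in flight fill the link, 5 more fit in the buffer.
history = simulate(rounds=40, pipe_packets=10, buffer_packets=5)
print("largest window:", max(history))
print("last ten rounds:", history[-10:])
```

In this sketch the window oscillates just below and above the point where the buffer overflows, so the link stays busy while the queue is repeatedly drained rather than growing without limit.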
In Kendall's notation, the M/M/1/K queuing model, where K is the size of the buffer, may be used to analyze the queuing delay in a specific system. This model should be used to calculate the queuing delay when packets are dropped from the queue. The M/M/1/K model is one of the most basic queuing models used in network analysis.
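A minimal sketch of the standard M/M/1/K results follows. It assumes that K counts every packet in the system, including the one being transmitted, and it obtains the mean delay of accepted packets from Little's law; the function name `mm1k_metrics` and the example rates are illustrative.

```python
def mm1k_metrics(arrival_rate, service_rate, k):
    """Return (drop probability, mean delay of accepted packets) for M/M/1/K."""
    rho = arrival_rate / service_rate
    if abs(rho - 1.0) < 1e-12:
        # At rho == 1 all K + 1 states are equally likely.
        probs = [1.0 / (k + 1)] * (k + 1)
    else:
        norm = (1.0 - rho) / (1.0 - rho ** (k + 1))
        probs = [norm * rho ** n for n in range(k + 1)]
    p_drop = probs[k]  # an arrival is dropped when the system is full
    mean_in_system = sum(n * p for n, p in enumerate(probs))
    accepted_rate = arrival_rate * (1.0 - p_drop)
    mean_delay = mean_in_system / accepted_rate  # Little's law
    return p_drop, mean_delay

# Assumed rates: 900 packets/s offered to a link serving 1000 packets/s.
for k in (5, 20, 100):
    p_drop, delay = mm1k_metrics(900.0, 1000.0, k)
    print(f"K={k}: drop probability {p_drop:.4f}, mean delay {delay * 1000:.2f} ms")
```

As K grows, the drop probability falls and the mean delay approaches the 1/(μ-λ) value of the unbounded queue, which shows the trade-off between loss and delay that buffer sizing controls.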
- Broadcast delay
- Delay encoding
- End-to-end delay
- Little's law – queueing formula
- Network delay
- Packet loss
- Processing delay
- Queueing theory
- Transmission delay
- Rappaport, Theodore S. Wireless Communications.