Byte serving is the process of sending only a portion of an HTTP/1.1 message from a server to a client. Byte serving begins when an HTTP server advertises its willingness to serve partial requests using the Accept-Ranges response header. A client then requests a specific part of a file from the server using the Range request header. If the range is valid, the server sends it to the client with a 206 Partial Content status code and a Content-Range header listing the range sent. If the range is invalid, the server responds with a 416 Requested Range Not Satisfiable status code.
Clients which request byte-serving might do so in cases in which a large file has been only partially delivered and a limited portion of the file is needed in a particular range. Byte Serving is therefore a method of bandwidth optimization. In the HTTP/1.0 standard, clients were only able to request an entire document. By allowing byte-serving, clients may choose to request any portion of the resource. One advantage of this capability is when a large media file is being requested, and that media file is properly formatted, the client may be able to request just the portions of the file known to be of interest. This is essential for serving video files; if a server lacks this feature, videos hosted on that server may not be playable until the entire file has been downloaded by the client, and seeking within the file may be disabled. Byte serving has also been reported to work for some PDF files and clients in which a client may request a certain page, rather than the entire file. Other names for byte serving:
- RFC 7233 says the client makes Range Requests when it makes a partial content request
- Clients make range requests 
- Byte Range Serving
- Page on demand
Byte serving can also be used by multihomed clients to simultaneously download a resource over multiple network interfaces. To achieve this type of application-layer link aggregation, multiple HTTP sessions are established and logical file segments are collaboratively downloaded from the server and reassembled at the client. This allows full utilization of several end-to-end paths and therefore leads to an increased download speed.
The use of the Chunked Transfer-Encoding is not byte-serving, but is instead a method in which an HTTP/1.1 server sends the entire resource, but in several separate portions (or chunks) of data. It is often used when a server does not know exactly how much data there will be in the total response, allowing the server to start sending data to the client straight away without having to buffer the response and determine the exact length before it begins sending it to the client. This improves latency and reduces memory requirements while preserving the ability to reuse the connection after the response is completed. Byte serving and chunking are compatible and can be used with or without the other.
- Key Differences between HTTP/1.0 and HTTP/1.1 "A typical example is a server's sending an entire (large) resource when the client only needs a small part of it. There was no way in HTTP/1.0 to request partial objects... HTTP/1.1 range requests allow a client to request portions of a resource."
- Apache Week. HTTP/1.1
- Key Differences between HTTP/1.0 and HTTP/1.1
- byte serving definition of byte serving in the Free Online Encyclopedia
- Enhancing Video-on-Demand Playout over Multiple Heterogeneous Access Networks by D. Kaspar, K. R. Evensen, P. E. Engelstad, A. F. Hansen, P. Halvorsen, and C. Griwodz. In: IEEE Consumer Communications and Networking Conference (CCNC), ISBN 978-1-4244-5176-0, 2010
- HTTP Chunking
- RFC 7233: Hypertext Transfer Protocol (HTTP/1.1): Range Requests