H.261 is an ITU-T video compression standard, first ratified in November 1988. It was the first member of the H.26x family of video coding standards, which fall within the domain of the ITU-T Video Coding Experts Group (VCEG), and the first video coding standard to be useful in practical terms.
H.261 was originally designed for transmission over ISDN lines, on which data rates are multiples of 64 kbit/s. The coding algorithm was designed to operate at video bit rates between 40 kbit/s and 2 Mbit/s. The standard supports two video frame sizes, both using a 4:2:0 sampling scheme: CIF (352×288 luma with 176×144 chroma) and QCIF (176×144 luma with 88×72 chroma). A revision in 1993 added a backward-compatible method for sending still images with 704×576 luma resolution and 352×288 chroma resolution.
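The relationship between the luma and chroma dimensions quoted above follows directly from 4:2:0 sampling, which halves the chroma resolution in both directions. The following is a minimal sketch of that arithmetic; the function and variable names are illustrative only and are not taken from the standard.

```python
# Luma dimensions as quoted in the text; names here are illustrative only.
FORMATS = {"CIF": (352, 288), "QCIF": (176, 144)}


def chroma_size(luma_width, luma_height):
    """Return (width, height) of each chroma plane under 4:2:0 sampling."""
    return luma_width // 2, luma_height // 2


def samples_per_picture(luma_width, luma_height):
    """Total samples in one picture: one luma plane plus two chroma planes."""
    chroma_width, chroma_height = chroma_size(luma_width, luma_height)
    return luma_width * luma_height + 2 * chroma_width * chroma_height


def isdn_rate_kbits(p):
    """Channel rate in kbit/s for p ISDN channels of 64 kbit/s each."""
    return p * 64


for name, (width, height) in FORMATS.items():
    print(name, chroma_size(width, height), samples_per_picture(width, height))
```

Under these definitions a CIF picture carries 152,064 samples and a QCIF picture 38,016, which gives a sense of how much compression is needed to fit video into a few 64 kbit/s channels.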
H.261 was preceded as a digital video coding standard by H.120, which was approved in 1984 and underwent a revision of some historic importance in 1988. However, H.261 was the first truly practical digital video coding standard, in the sense of being supported by products in significant quantities. All subsequent international video coding standards (MPEG-1 Part 2, H.262/MPEG-2 Part 2, H.263, MPEG-4 Part 2, H.264/MPEG-4 Part 10, and HEVC) have been based closely on the H.261 design. Additionally, the methods used by the H.261 development committee to collaboratively develop the standard have remained the basic operating process for subsequent standardization work in the field. H.261 was developed by the CCITT Study Group XV Specialists Group on Coding for Visual Telephony (which later became part of ITU-T SG16), chaired by Sakae Okubo of NTT.
Although H.261 was first approved as a standard in 1988, the first version was missing some significant elements necessary to make it a complete interoperability specification. Various parts of it were marked as "Under Study". It was later revised in 1990 to add the remaining necessary aspects, and was then revised again in 1993. The 1993 revision added an Annex D entitled "Still image transmission", which provided a backward-compatible way to send still images with 704×576 luma resolution and 352×288 chroma resolution by using a staggered 2:1 subsampling horizontally and vertically to separate the picture into four sub-pictures that were sent sequentially.
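The decomposition described for Annex D can be illustrated as a split of the picture into its four 2:1 sampling phases. The sketch below assumes a picture stored as a list of rows; the function names are hypothetical, and the order in which the sub-pictures are transmitted is not stated in the text and is not modelled here.

```python
def split_into_subpictures(picture):
    """Split a 2-D list of samples into four 2:1-subsampled sub-pictures.

    Sub-picture (dy, dx) keeps the samples at rows dy, dy+2, ... and
    columns dx, dx+2, ...; together the four phases cover every sample once.
    """
    return {
        (dy, dx): [row[dx::2] for row in picture[dy::2]]
        for dy in (0, 1)
        for dx in (0, 1)
    }


def merge_subpictures(subs):
    """Reassemble the full-resolution picture from its four sub-pictures."""
    height = 2 * len(subs[(0, 0)])
    width = 2 * len(subs[(0, 0)][0])
    picture = [[0] * width for _ in range(height)]
    for (dy, dx), sub in subs.items():
        for y, row in enumerate(sub):
            # Write this sub-picture row back into every second column.
            picture[dy + 2 * y][dx::2] = row
    return picture


# A synthetic 704x576 luma plane: each sub-picture comes out as 352x288,
# which matches the CIF luma size that an existing decoder can handle.
still = [[(y * 704 + x) % 256 for x in range(704)] for y in range(576)]
subs = split_into_subpictures(still)
print({phase: (len(sub[0]), len(sub)) for phase, sub in subs.items()})
```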
The basic processing unit of the design is called a macroblock, and H.261 was the first standard in which the macroblock concept appeared. Each macroblock consists of a 16×16 array of luma samples and two corresponding 8×8 arrays of chroma samples, using 4:2:0 sampling and a YCbCr color space. The coding algorithm uses a hybrid of motion-compensated inter-picture prediction and spatial transform coding with scalar quantization, zig-zag scanning and entropy encoding.
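As a rough illustration of the macroblock structure, the sketch below counts the macroblocks in each frame size and the 8×8 blocks within a macroblock, taking 8×8 as the transform block size used by the codec. The constant and function names are illustrative assumptions, not identifiers from the standard.

```python
MB_LUMA = 16  # macroblock luma width and height, in samples
BLOCK = 8     # transform block width and height, in samples


def macroblock_grid(luma_width, luma_height):
    """Return (columns, rows) of macroblocks covering the luma plane."""
    return luma_width // MB_LUMA, luma_height // MB_LUMA


def blocks_per_macroblock():
    """Count the 8x8 blocks in one macroblock: luma blocks plus Cb and Cr."""
    luma_blocks = (MB_LUMA // BLOCK) ** 2  # a 16x16 array holds four 8x8 blocks
    chroma_blocks = 2                      # one 8x8 block each for Cb and Cr
    return luma_blocks + chroma_blocks


for name, (width, height) in {"CIF": (352, 288), "QCIF": (176, 144)}.items():
    cols, rows = macroblock_grid(width, height)
    print(name, cols, rows, cols * rows)
```

With these sizes a CIF picture divides into 22×18 = 396 macroblocks and a QCIF picture into 11×9 = 99, each containing six 8×8 blocks.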
Inter-picture prediction reduces temporal redundancy, with motion vectors used to compensate for motion. Only integer-valued motion vectors are supported in H.261, but a blurring filter can be applied to the prediction signal, which partially mitigates the lack of fractional-sample motion vector precision. Transform coding using an 8×8 discrete cosine transform (DCT) reduces spatial redundancy. The DCT was introduced by N. Ahmed, T. Natarajan and K. R. Rao in 1974. Scalar quantization then rounds the transform coefficients to a precision determined by a step size control parameter, and the quantized transform coefficients are zig-zag scanned and entropy-coded (using a "run-level" variable-length code) to remove statistical redundancy.
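The transform, quantization, scanning and run-level steps can be sketched for a single 8×8 block as follows. This is a simplified illustration under stated assumptions, not the normative H.261 process: it uses a floating-point orthonormal DCT, a plain uniform quantizer with a single step size, and stops at (run, level) pairs rather than reproducing the standard's variable-length code tables. All function names are hypothetical.

```python
import math

N = 8  # transform block size


def dct_2d(block):
    """Orthonormal 8x8 DCT-II, computed directly from its definition."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Uniform scalar quantization (an assumption): nearest integer level."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


def zigzag_order(n=N):
    """(row, col) coordinates in zig-zag order, low to high frequency."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def run_level_pairs(levels):
    """Convert a scanned list into (zero_run, nonzero_level) pairs.

    Trailing zeros are dropped, standing in for an end-of-block marker.
    """
    pairs, run = [], 0
    for level in levels:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return pairs


def encode_block(block, step):
    """Chain the four steps for one block of samples or prediction residuals."""
    levels = quantize(dct_2d(block), step)
    scanned = [levels[r][c] for r, c in zigzag_order()]
    return run_level_pairs(scanned)


flat = [[16] * N for _ in range(N)]
print(encode_block(flat, step=8))
```

A flat block concentrates all of its energy in the first (DC) coefficient, so the whole block reduces to a single pair. Smooth image regions behave similarly, which is why the zig-zag scan followed by run-level coding is effective: the nonzero levels cluster at the start of the scan and long runs of zeros follow.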
The H.261 standard specifies only how to decode the video. Encoder designers are free to design their own encoding algorithms (such as their own motion estimation algorithms), as long as the output is constrained so that it can be decoded by any decoder made according to the standard. Encoders are also free to perform any pre-processing on their input video, and decoders are allowed to perform any post-processing on their decoded video prior to display. One effective post-processing technique that became a key element of the best H.261-based systems is deblocking filtering. This reduces the appearance of block-shaped artifacts caused by the block-based motion compensation and spatial transform parts of the design. Such blocking artifacts are a familiar phenomenon to most people who have watched digital video. Deblocking filtering has since become an integral part of the more recent standards H.264 and HEVC (although even with these newer standards, additional post-processing is still allowed and can enhance visual quality if performed well).
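Because post-processing lies outside the standard, there is no single H.261 deblocking filter. The sketch below shows one simple, hypothetical approach on a single row of decoded samples: where the step across a block boundary is small (and therefore more likely a coding artifact than a real edge), the two samples on either side are pulled toward each other. The block size of 8, the threshold and the function name are all assumptions made for illustration.

```python
def deblock_row(row, block_size=8, threshold=8):
    """Soften small steps at block boundaries in one row of samples.

    A step smaller than `threshold` is treated as a likely blocking
    artifact and reduced by half; larger steps are left alone on the
    assumption that they are genuine image edges.
    """
    out = list(row)
    for b in range(block_size, len(row), block_size):
        step = row[b] - row[b - 1]
        if abs(step) < threshold:
            out[b - 1] = row[b - 1] + step / 4
            out[b] = row[b] - step / 4
    return out


artifact = [100] * 8 + [104] * 8   # small step at the block boundary
real_edge = [0] * 8 + [200] * 8    # large step, probably real content

print(deblock_row(artifact)[6:10])
print(deblock_row(real_edge)[6:10])
```

Practical deblocking filters are considerably more elaborate, typically adapting their strength to the quantization step size and filtering in both directions, but the same idea of smoothing only across block boundaries applies.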
Design refinements introduced in later standardization efforts have significantly improved compression capability relative to the H.261 design. As a result, H.261 has become essentially obsolete, although it is still used as a backward-compatibility mode in some video-conferencing systems (such as H.323 systems) and for some types of internet video. Nevertheless, H.261 remains a major historical milestone in the development of video coding.
- "(Nokia position paper) Web Architecture and Codec Considerations for Audio-Visual Services" (PDF). "H.261, which (in its first version) was ratified in November 1988."
- ITU-T (1988). "H.261 : Video codec for audiovisual services at p x 384 kbit/s - Recommendation H.261 (11/88)". Retrieved 2010-10-21.
- S. Okubo, "Reference model methodology – A tool for the collaborative creation of video coding standards", Proceedings of the IEEE, vol. 83, no. 2, Feb. 1995, pp. 139–150
- ITU-T (1990). "H.261 : Video codec for audiovisual services at p x 64 kbit/s - Recommendation H.261 (12/90)". Retrieved 2015-12-10.
- ITU-T (1993). "H.261 : Video codec for audiovisual services at p x 64 kbit/s - Recommendation H.261 (03/93)". Retrieved 2015-12-10.
- N. Ahmed, T. Natarajan and K. R. Rao, "Discrete Cosine Transform", IEEE Transactions on Computers, Jan. 1974, pp. 90–93