= RTP payload formats =

The Real-time Transport Protocol (RTP) specifies a general-purpose data format and network protocol for transmitting digital media streams on Internet Protocol (IP) networks. The details of media encoding, such as signal sampling rate, frame size and timing, are specified in an RTP payload format. The format parameters of the RTP payload are typically communicated between transmission endpoints with the Session Description Protocol (SDP), but other protocols, such as the Extensible Messaging and Presence Protocol (XMPP), may be used.
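
As an illustration of how SDP carries these parameters, the sketch below parses `a=rtpmap` attribute lines, which bind a payload type number to an encoding name, a clock rate and optional encoding parameters. The SDP fragment, the `RtpMap` type and the `parse_rtpmap` helper are hypothetical names made up for this example; real SDP handling involves many more attributes than shown here.

```python
import re
from typing import NamedTuple, Optional


class RtpMap(NamedTuple):
    payload_type: int
    encoding_name: str
    clock_rate: int
    encoding_params: Optional[str]  # for audio, usually the channel count


# General shape of the attribute:
# a=rtpmap:<payload type> <encoding name>/<clock rate>[/<encoding parameters>]
_RTPMAP = re.compile(r"^a=rtpmap:(\d+) ([^/\s]+)/(\d+)(?:/(\S+))?$")


def parse_rtpmap(line: str) -> Optional[RtpMap]:
    """Return the parsed mapping, or None if the line is not an rtpmap attribute."""
    match = _RTPMAP.match(line.strip())
    if match is None:
        return None
    pt, name, rate, params = match.groups()
    return RtpMap(int(pt), name, int(rate), params)


# Hypothetical SDP fragment offering PCMU (static type 0) and Opus (dynamic type 97).
sdp = """m=audio 49170 RTP/AVP 0 97
a=rtpmap:0 PCMU/8000
a=rtpmap:97 opus/48000/2
"""

for sdp_line in sdp.splitlines():
    entry = parse_rtpmap(sdp_line)
    if entry is not None:
        print(entry)
```

The `m=` line lists the payload type numbers on offer, and each `a=rtpmap` line tells the receiver how to interpret one of them.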

==Payload types and formats==
The technical parameters of payload formats for audio and video streams are standardised in the RTP profile for audio and video conferences (RFC 3551).
That standard also describes the process of registering new payload types with IANA.

==Text messaging payload types==
Payload formats and types for text messaging are defined in the following specifications:

==MIDI payload types==
Payload formats and types for MIDI are defined in the following specifications:

==Audio and video payload types==
Payload formats and types for audio and video are defined in the following specifications:

Payload identifiers 96–127 are used for payloads defined dynamically during a session. It is recommended to dynamically assign port numbers, although port numbers 5004 and 5005 have been registered for use with the RTP audio/video profile (RTP/AVP) when a dynamically assigned port is not required.
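
The numeric ranges can be expressed as a small classifier. This is a minimal sketch based only on the ranges given in this article (72–76 reserved, 77–95 unassigned, 96–127 dynamic); the function name `classify_payload_type` and the returned labels are illustrative, not part of any standard API.

```python
def classify_payload_type(pt: int) -> str:
    """Classify a 7-bit RTP payload type number by the range it falls in."""
    if not 0 <= pt <= 127:
        raise ValueError("payload type must fit in 7 bits (0-127)")
    if pt >= 96:
        return "dynamic"  # meaning is negotiated per session, e.g. via SDP
    if 77 <= pt <= 95:
        return "unassigned"
    if 72 <= pt <= 76:
        return "reserved"  # would collide with RTCP packet types 200-204
    # 0-71: the static range; the table lists which numbers are actually assigned.
    return "static range"


for pt in (0, 34, 72, 95, 96, 127):
    print(pt, classify_payload_type(pt))
```

A receiver would typically consult its negotiated session parameters for any value classified as dynamic and fall back to the static table otherwise.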

Applications should always support PCMU (payload type 0). Previously, DVI4 (payload type 5) was also recommended, but this was removed in 2013.
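
The clock rate, default packet interval and bit rate columns in the table below are related by simple arithmetic. The following sketch works this out for PCMU and for stereo L16, assuming a constant-bit-rate codec whose RTP clock equals its sampling rate; the helper names are hypothetical and the calculation ignores RTP, UDP and IP header overhead.

```python
def timestamp_increment(clock_rate_hz: int, packet_interval_ms: float) -> int:
    """RTP timestamp units that elapse during one packet."""
    return round(clock_rate_hz * packet_interval_ms / 1000)


def payload_bytes(bitrate_kbit_s: float, packet_interval_ms: float) -> int:
    """Payload size of one packet for a constant-bit-rate codec.

    kbit/s multiplied by ms gives bits; dividing by 8 gives bytes.
    """
    return round(bitrate_kbit_s * packet_interval_ms / 8)


# PCMU (payload type 0): 8000 Hz clock, 64 kbit/s, 20 ms packets.
print(timestamp_increment(8000, 20), payload_bytes(64, 20))

# L16 stereo (payload type 10): 44100 Hz clock, 1411.2 kbit/s, 20 ms packets.
print(timestamp_increment(44100, 20), payload_bytes(1411.2, 20))
```

For PCMU this gives a timestamp step of 160 and a 160-byte payload per packet, since each sample occupies one byte.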

| Payload type (PT) | Name | Type | No. of channels | Clock rate (Hz) | Frame size (ms) | Default packet interval (ms) | Description | References |
| 0 | PCMU | audio | 1 | 8000 | any | 20 | ITU-T G.711 PCM μ-Law audio 64 kbit/s | |
| 1 | reserved (previously FS-1016 CELP) | audio | 1 | 8000 | | | reserved, previously FS-1016 CELP audio 4.8 kbit/s | |
| 2 | reserved (previously G721 or G726-32) | audio | 1 | 8000 | | | reserved, previously ITU-T G.721 ADPCM audio 32 kbit/s or ITU-T G.726 audio 32 kbit/s | |
| 3 | GSM | audio | 1 | 8000 | 20 | 20 | European GSM Full Rate audio 13 kbit/s (GSM 06.10) | |
| 4 | G723 | audio | 1 | 8000 | 30 | 30 | ITU-T G.723.1 audio | |
| 5 | DVI4 | audio | 1 | 8000 | any | 20 | IMA ADPCM audio 32 kbit/s | |
| 6 | DVI4 | audio | 1 | 16000 | any | 20 | IMA ADPCM audio 64 kbit/s | |
| 7 | LPC | audio | 1 | 8000 | any | 20 | Experimental Linear Predictive Coding audio 5.6 kbit/s | |
| 8 | PCMA | audio | 1 | 8000 | any | 20 | ITU-T G.711 PCM A-Law audio 64 kbit/s | |
| 9 | G722 | audio | 1 | 8000 | any | 20 | ITU-T G.722 audio 64 kbit/s | |
| 10 | L16 | audio | 2 | 44100 | any | 20 | Linear PCM 16-bit Stereo audio 1411.2 kbit/s, uncompressed | |
| 11 | L16 | audio | 1 | 44100 | any | 20 | Linear PCM 16-bit audio 705.6 kbit/s, uncompressed | |
| 12 | QCELP | audio | 1 | 8000 | 20 | 20 | Qualcomm Code Excited Linear Prediction | |
| 13 | CN | audio | 1 | 8000 | | | Comfort noise. Payload type used with audio codecs that do not support comfort noise as part of the codec itself, such as G.711, G.722.1, G.722, G.726, G.727, G.728, GSM 06.10, Siren, and RTAudio. | |
| 14 | MPA | audio | 1, 2 | 90000 | 8–72 | | MPEG-1 or MPEG-2 audio only | |
| 15 | G728 | audio | 1 | 8000 | 2.5 | 20 | ITU-T G.728 audio 16 kbit/s | |
| 16 | DVI4 | audio | 1 | 11025 | any | 20 | IMA ADPCM audio 44.1 kbit/s | |
| 17 | DVI4 | audio | 1 | 22050 | any | 20 | IMA ADPCM audio 88.2 kbit/s | |
| 18 | G729 | audio | 1 | 8000 | 10 | 20 | ITU-T G.729 and G.729a audio 8 kbit/s; Annex B is implied unless the annexb=no parameter is used | |
| 19 | reserved (previously CN) | audio | | | | | reserved, previously comfort noise | |
| 25 | CELLB | video | | 90000 | | | Sun CellB video | |
| 26 | JPEG | video | | 90000 | | | JPEG video | |
| 28 | nv | video | | 90000 | | | Xerox PARC's Network Video (nv) | |
| 31 | H261 | video | | 90000 | | | ITU-T H.261 video | |
| 32 | MPV | video | | 90000 | | | MPEG-1 and MPEG-2 video | |
| 33 | MP2T | audio/video | | 90000 | | | MPEG-2 transport stream | |
| 34 | H263 | video | | 90000 | | | H.263 video, first version (1996) | |
| 72–76 | reserved | | | | | | reserved because RTCP packet types 200–204 would otherwise be indistinguishable from RTP payload types 72–76 with the marker bit set | |
| 77–95 | unassigned | | | | | | note that RTCP packet type 207 (XR, Extended Reports) would be indistinguishable from RTP payload type 79 with the marker bit set | |
| dynamic | H263-1998 | video | | 90000 | | | H.263 video, second version (1998) | |
| dynamic | H263-2000 | video | | 90000 | | | H.263 video, third version (2000) | |
| dynamic (or profile) | H264 AVC | video | | 90000 | | | H.264 video (MPEG-4 Part 10) | |
| dynamic (or profile) | H264 SVC | video | | 90000 | | | H.264 video | |
| dynamic (or profile) | H265 | video | | 90000 | | | H.265 video (HEVC) | |
| dynamic (or profile) | theora | video | | 90000 | | | Theora video | draft-barbato-avt-rtp-theora |
| dynamic | iLBC | audio | 1 | 8000 | 20, 30 | 20, 30 | Internet low Bitrate Codec 13.33 or 15.2 kbit/s | |
| dynamic | PCMA-WB | audio | 1 | 16000 | 5 | | ITU-T G.711.1 A-law | |
| dynamic | PCMU-WB | audio | 1 | 16000 | 5 | | ITU-T G.711.1 μ-law | |
| dynamic | G718 | audio | | 32000 (placeholder) | 20 | | ITU-T G.718 | draft-ietf-payload-rtp-g718 |
| dynamic | G719 | audio | (various) | 48000 | 20 | | ITU-T G.719 | |
| dynamic | G7221 | audio | | 16000, 32000 | 20 | | ITU-T G.722.1 and G.722.1 Annex C | |
| dynamic | G726-16 | audio | 1 | 8000 | any | 20 | ITU-T G.726 audio 16 kbit/s | |
| dynamic | G726-24 | audio | 1 | 8000 | any | 20 | ITU-T G.726 audio 24 kbit/s | |
| dynamic | G726-32 | audio | 1 | 8000 | any | 20 | ITU-T G.726 audio 32 kbit/s | |
| dynamic | G726-40 | audio | 1 | 8000 | any | 20 | ITU-T G.726 audio 40 kbit/s | |
| dynamic | G729D | audio | 1 | 8000 | 10 | 20 | ITU-T G.729 Annex D | |
| dynamic | G729E | audio | 1 | 8000 | 10 | 20 | ITU-T G.729 Annex E | |
| dynamic | G7291 | audio | | 16000 | 20 | | ITU-T G.729.1 | |
| dynamic | GSM-EFR | audio | 1 | 8000 | 20 | 20 | ETSI GSM-EFR (GSM 06.60) | |
| dynamic | GSM-HR-08 | audio | 1 | 8000 | 20 | | ETSI GSM-HR (GSM 06.20) | |
| dynamic (or profile) | AMR | audio | (various) | 8000 | 20 | | Adaptive Multi-Rate audio | |
| dynamic (or profile) | AMR-WB | audio | (various) | 16000 | 20 | | Adaptive Multi-Rate Wideband audio (ITU-T G.722.2) | |
| dynamic (or profile) | AMR-WB+ | audio | 1, 2 or omit | 72000 | 13.3–40 | | Extended Adaptive Multi Rate – WideBand audio | |
| dynamic (or profile) | vorbis | audio | (various) | (various) | | | Vorbis audio | |
| dynamic (or profile) | opus | audio | 1, 2 | 48000 | 2.5–60 | 20 | Opus audio | |
| dynamic (or profile) | speex | audio | 1 | 8000, 16000, 32000 | 20 | | Speex audio | |
| dynamic | mpa-robust | audio | 1, 2 | 90000 | 24–72 | | Loss-Tolerant MP3 audio | |
| dynamic (or profile) | MP4A-LATM | audio | | 90000 or others | | | MPEG-4 Audio (includes AAC) | |
| dynamic (or profile) | MP4V-ES | video | | 90000 or others | | | MPEG-4 Visual | |
| dynamic (or profile) | mpeg4-generic | audio/video | | 90000 or other | | | MPEG-4 Elementary Streams | |
| dynamic | VP8 | video | | 90000 | | | VP8 video | |
| dynamic | VP9 | video | | 90000 | | | VP9 video | |
| dynamic | AV1 | video | | 90000 | | | AV1 video | av1-rtp-spec |
| dynamic | L8 | audio | (various) | (various) | any | 20 | Linear PCM 8-bit audio with 128 offset | |
| dynamic | DAT12 | audio | (various) | (various) | any | 20 (by analogy with L16) | IEC 61119 12-bit nonlinear audio | |
| dynamic | L16 | audio | (various) | (various) | any | 20 | Linear PCM 16-bit audio | |
| dynamic | L20 | audio | (various) | (various) | any | 20 (by analogy with L16) | Linear PCM 20-bit audio | |
| dynamic | L24 | audio | (various) | (various) | any | 20 (by analogy with L16) | Linear PCM 24-bit audio | |
| dynamic | raw | video | | 90000 | | | Uncompressed Video | |
| dynamic | ac3 | audio | (various) | 32000, 44100, 48000 | | | Dolby AC-3 audio | |
| dynamic | eac3 | audio | (various) | 32000, 44100, 48000 | | | Enhanced AC-3 audio | |
| dynamic | t140 | text | | 1000 | | | Text over IP | |
| dynamic | EVRC EVRC0 EVRC1 | audio | | 8000 | | | EVRC audio | |
| dynamic | EVRCB EVRCB0 EVRCB1 | audio | | 8000 | | | EVRC-B audio | |
| dynamic | EVRCWB EVRCWB0 EVRCWB1 | audio | | 16000 | | | EVRC-WB audio | |
| dynamic | jpeg2000 | video | | 90000 | | | JPEG 2000 video | |
| dynamic | UEMCLIP | audio | | 8000, 16000 | | | UEMCLIP audio | |
| dynamic | ATRAC3 | audio | | 44100 | | | ATRAC3 audio | |
| dynamic | ATRAC-X | audio | | 44100, 48000 | | | ATRAC3+ audio | |
| dynamic | ATRAC-ADVANCED-LOSSLESS | audio | | (various) | | | ATRAC Advanced Lossless audio | |
| dynamic | DV | video | | 90000 | | | DV video | |
| dynamic | BT656 | video | | | | | ITU-R BT.656 video | |
| dynamic | BMPEG | video | | | | | Bundled MPEG-2 video | |
| dynamic | SMPTE292M | video | | | | | SMPTE 292M video | |
| dynamic | RED | audio | | | | | Redundant Audio Data | |
| dynamic | VDVI | audio | | | | | Variable-rate DVI4 audio | |
| dynamic | MP1S | video | | | | | MPEG-1 Systems Streams video | |
| dynamic | MP2P | video | | | | | MPEG-2 Program Streams video | |
| dynamic | tone | audio | | 8000 (default) | | | tone | |
| dynamic | telephone-event | audio | | 8000 (default) | | | DTMF tone | |
| dynamic | aptx | audio | 2 - 6 | (equal to sampling rate) | 4000 ÷ sample rate | 4 | aptX audio | |
| dynamic | jxsv | video | | 90000 | | | JPEG XS video | |
| dynamic | scip | audio/video | | 8000 or 90000 | | | SCIP | |
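
The reservation of payload types 72–76 follows from the header layout: the second byte of an RTP packet holds the marker bit in its most significant bit and the 7-bit payload type in the rest, while the second byte of an RTCP packet holds an 8-bit packet type. The sketch below reinterprets the RTCP packet types named in the table as RTP fields to show the collision. The function names are hypothetical, and `looks_like_rtcp` is a deliberately simplified heuristic covering only the packet types mentioned in this article, not a complete demultiplexing rule.

```python
from typing import Tuple

# RTCP packet types mentioned in the table: 200-204 and 207 (XR).
RTCP_TYPES_IN_TABLE = frozenset(range(200, 205)) | {207}


def as_rtp_fields(second_byte: int) -> Tuple[bool, int]:
    """Split a header's second byte into (marker bit, 7-bit payload type)."""
    marker = bool(second_byte & 0x80)
    payload_type = second_byte & 0x7F
    return marker, payload_type


def looks_like_rtcp(packet: bytes) -> bool:
    """Simplified check: does the second byte match a listed RTCP packet type?"""
    return len(packet) >= 2 and packet[1] in RTCP_TYPES_IN_TABLE


for rtcp_type in sorted(RTCP_TYPES_IN_TABLE):
    marker, pt = as_rtp_fields(rtcp_type)
    print(f"RTCP type {rtcp_type} reads as RTP marker={marker}, PT={pt}")
```

Each listed RTCP type maps to its value minus 128 with the marker bit set, which is why 200–204 correspond to payload types 72–76 and 207 corresponds to 79.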

==See also==
- Session Initiation Protocol
- H.323
- Comparison of audio coding formats
