The Real-time Transport Protocol, or RTP for short, is a data transfer protocol designed to carry time-sensitive audio and video data over IP-based networks. RTP is widely used in Voice over IP (VoIP) telephony, so IP-capable telephone systems support it. The protocol was first standardized in 1996 in RFC 1889. Since 2003 it has been specified in RFC 3550, which replaces the earlier document.

RTP carries encoded multimedia streams such as audio or video: the data is divided into packets and transmitted over an IP network. At the transport level, RTP typically runs over connectionless UDP (User Datagram Protocol). It supports both unicast and multicast communication. RTP itself does not guarantee Quality of Service (QoS). Instead, it works together with the RTP Control Protocol (RTCP), which reports on reception quality so that the QoS parameters of a transfer can be monitored.
RTP is the established standard for transporting audio and video streams in IP telephony based on the signaling protocols SIP and H.323.
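
The packetization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a complete RTP stack: the function names (build_rtp_packet, packetize, send_packets) are hypothetical, the audio is assumed to be 8 kHz with one byte per sample (for example G.711 u-law, payload type 0), and sequence numbers and timestamps start at 0 here, whereas real senders normally pick random starting values.

```python
import socket
import struct

RTP_VERSION = 2  # version number used by RFC 3550


def build_rtp_packet(payload, seq, timestamp, ssrc, payload_type=0, marker=False):
    """Prepend a 12-byte RTP fixed header to an already encoded payload."""
    # First byte: version (2 bits), padding, extension, CSRC count (all 0 here).
    byte0 = RTP_VERSION << 6
    # Second byte: marker bit followed by the 7-bit payload type.
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII",
        byte0,
        byte1,
        seq & 0xFFFF,            # 16-bit sequence number
        timestamp & 0xFFFFFFFF,  # 32-bit media timestamp
        ssrc & 0xFFFFFFFF,       # 32-bit synchronization source identifier
    )
    return header + payload


def packetize(audio, ssrc, chunk_size=160, first_seq=0, first_ts=0):
    """Split an encoded audio buffer into RTP packets.

    Assumption: one byte per sample at 8 kHz, so 160 bytes are 20 ms of
    audio and the timestamp advances by the number of bytes sent so far.
    """
    packets = []
    for index, offset in enumerate(range(0, len(audio), chunk_size)):
        chunk = audio[offset:offset + chunk_size]
        packets.append(
            build_rtp_packet(chunk, first_seq + index, first_ts + offset, ssrc)
        )
    return packets


def send_packets(packets, host, port):
    """Send each RTP packet as one UDP datagram (no pacing, illustration only)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for packet in packets:
            sock.sendto(packet, (host, port))
```

A caller would first encode the audio, then call packetize and hand the result to send_packets with the receiver's address. A real sender would also pace the datagrams at the media rate instead of sending them back to back.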

Elements of the RTP header

Every RTP packet begins with a header that describes the contents of the packet and its transfer. The fixed header is 12 bytes long and includes, among other fields, the protocol version, a sequence number, a timestamp, a 32-bit identifier of the sender (the synchronization source identifier, SSRC) and a payload type that indicates the data format. The remainder of the packet is the payload.
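
To make the header fields concrete, the following Python sketch unpacks the 12-byte fixed header as laid out in RFC 3550. The names RtpHeader and parse_rtp_header are hypothetical and chosen for illustration. The sketch skips the optional list of contributing sources but does not interpret a header extension or padding, which a complete parser would have to handle.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet):
    """Split a raw RTP packet into its fixed header fields and the payload."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Network byte order: two single bytes, a 16-bit and two 32-bit integers.
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    header = RtpHeader(
        version=byte0 >> 6,
        padding=bool(byte0 & 0x20),
        extension=bool(byte0 & 0x10),
        csrc_count=byte0 & 0x0F,
        marker=bool(byte1 & 0x80),
        payload_type=byte1 & 0x7F,
        sequence_number=seq,
        timestamp=timestamp,
        ssrc=ssrc,
    )
    # Skip the CSRC list (4 bytes per entry). Extension headers and padding
    # are not removed in this sketch.
    payload_offset = 12 + 4 * header.csrc_count
    return header, packet[payload_offset:]


# Hand-built example packet: version 2, payload type 8, sequence number 42.
example = bytes([0x80, 0x08, 0x00, 0x2A,
                 0x00, 0x00, 0x03, 0xE8,
                 0xDE, 0xAD, 0xBE, 0xEF]) + b"abc"
header, payload = parse_rtp_header(example)
print(header.version, header.payload_type, header.sequence_number)
```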

Important components of the Real-time Transport Protocol architecture

The RTP architecture defines several components for transmitting media streams. The key components are the synchronization source, the translator and the mixer. The synchronization source is the actual source of a data stream and is tagged with a 32-bit identifier. The translator forwards RTP packets and, if necessary, changes the encoding of the data being transmitted. Lastly, the mixer merges the data streams of multiple sources and forwards them as a new data stream.
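
The role of the mixer can be illustrated with a small Python sketch. It assumes uncompressed 16-bit audio samples that are simply added and clipped, and it uses payload type 11 (16-bit linear mono audio in the static RTP audio/video profile) purely as an example. The function names mix_pcm16 and build_mixed_packet are hypothetical. In RFC 3550 a mixer sends the merged stream under its own synchronization source identifier and lists the original senders as contributing sources (CSRC) in the header, which is what the sketch reproduces.

```python
import struct


def mix_pcm16(frames):
    """Add equal-length lists of signed 16-bit samples, clipping the result."""
    mixed = []
    for samples in zip(*frames):
        total = sum(samples)
        mixed.append(max(-32768, min(32767, total)))
    return mixed


def build_mixed_packet(sources, mixer_ssrc, seq, timestamp, payload_type=11):
    """Merge several decoded streams into one RTP packet.

    sources maps the SSRC of each original sender to its list of samples.
    """
    # The CSRC count field is 4 bits wide, so at most 15 sources can be listed.
    csrcs = list(sources)[:15]
    samples = mix_pcm16([sources[ssrc] for ssrc in csrcs])
    byte0 = (2 << 6) | len(csrcs)  # version 2, CSRC count in the low 4 bits
    header = struct.pack(
        "!BBHII",
        byte0,
        payload_type & 0x7F,
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        mixer_ssrc & 0xFFFFFFFF,   # the mixer is the new synchronization source
    )
    csrc_list = struct.pack(f"!{len(csrcs)}I", *csrcs)
    payload = struct.pack(f"!{len(samples)}h", *samples)  # big-endian samples
    return header + csrc_list + payload
```

A translator, by contrast, would forward packets with the original SSRC left intact, so receivers can still tell the individual senders apart.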