Audio synchronizer

From Wikipedia, the free encyclopedia

An audio synchronizer is a variable audio delay used to correct or maintain audio-video sync, also called audio-video timing.[1] A timing mismatch is commonly known as lip sync error. See, for example, the specification for audio-to-video timing given in ATSC Document IS-191.[2] Modern television systems use large amounts of video signal processing, such as MPEG preprocessing, encoding and decoding, video synchronization, and resolution conversion in pixelated displays. This processing can delay the video signal by anywhere from a few microseconds to tens of seconds. If the program is displayed with this video delay and the audio is not delayed to match, the picture reaches the viewer after the corresponding sound is heard. This effect is commonly referred to as A/V sync error or lip sync error, and it can seriously reduce the viewer's enjoyment of the program.

Error correction

To correct audio-video sync problems, the video processing circuitry outputs a DDO (digital delay output) signal, which carries information about the amount of delay the video signal experiences due to the video processing. The DDO may, for example, be provided by equipment that adheres to the SMPTE ST 2064 Audio to Video Synchronization Standard. The audio synchronizer receives the DDO signal and in response delays the audio by an equivalent amount, thereby maintaining proper audio-video sync. Modern audio synchronizers operate by digitizing the audio signal and writing it into a ring memory, most commonly a RAM-based memory with independent read and write ability. Each audio sample (or group of samples) is written into the memory and then read back out once the appropriate delay time, as conveyed by the DDO, has elapsed. The storage and reading of audio samples take place continuously in response to the respective memory write and read addresses, each of which increments by one count for every write or read operation and wraps around to the start when it reaches the end of the memory. For example, an audio sample is written at address 5 and a different sample is read from (previously written) address 1; another sample is written at address 6 and yet another is read from address 2; then write at 7, read from 3, and so on. The delay between writing and reading a particular sample is 4 addresses which, when multiplied by the time it takes to step from one address to the next, gives the total audio delay.
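The ring memory described above can be sketched in a few lines of Python. This is only an illustration, not the design of any particular product: the class name `RingDelay`, the helper `delay_seconds`, and the 48 kHz sample rate used in the usage lines are assumptions made for the example.

```python
class RingDelay:
    """Illustrative ring memory with independent write and read addresses."""

    def __init__(self, size, delay):
        if not 0 < delay < size:
            raise ValueError("delay must be between 1 and size - 1 addresses")
        self.memory = [0.0] * size
        self.size = size
        self.write_addr = delay  # the write address leads the read address
        self.read_addr = 0

    def process(self, sample):
        """Write one sample, read one delayed sample, advance both addresses."""
        self.memory[self.write_addr] = sample
        out = self.memory[self.read_addr]
        # Both addresses increment by one count and wrap at the end of memory.
        self.write_addr = (self.write_addr + 1) % self.size
        self.read_addr = (self.read_addr + 1) % self.size
        return out


def delay_seconds(delay_addresses, sample_rate_hz):
    """Total delay: address difference times the time per address step."""
    return delay_addresses / sample_rate_hz


# Usage: a delay of 4 addresses in an 8-address memory.
ring = RingDelay(size=8, delay=4)
outputs = [ring.process(float(n)) for n in range(1, 11)]
# The first 4 outputs are the initial silence; input 1.0 emerges as the 5th.
audio_delay = delay_seconds(4, 48_000)  # assumed 48 kHz sample rate
```

A real synchronizer would size the memory for the longest video delay it must track, and would set the address difference from the DDO value rather than from a constant.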

Tracking changes

Unfortunately, video delays frequently make quick and large changes; a jump in delay time from 2 seconds to 6 seconds, for example, is possible. To maintain proper audio-video sync, the audio delay must track these video delay changes. Changing the audio delay requires changing the difference between the write address and the read address. This can be accomplished by causing either the write or the read address to jump forward or backward. However, such a jump causes some audio samples to be repeated or lost, resulting in an unwanted and annoying pop, click, gap, distortion or noise in the audio signal. Some audio synchronizers operate by making repeated, very small jumps, which cause unwanted (but less annoying) distortion and noise rather than pops, gaps and clicks. Other audio synchronizers change the delay by changing the speed at which audio is read from the ring memory. If audio samples are read out of the memory more slowly than they are written, the delay increases. If they are read out faster than they are written, the delay decreases. Variable speed reading prevents pops, clicks, gaps, distortion and noise from being introduced into the audio, but it does create unwanted and annoying pitch errors: reading faster than writing raises the audio pitch, and reading slower than writing lowers it.
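The two ways of changing the delay can be compared with a small simulation. The sketch below is an assumption-laden illustration rather than a description of any actual device: the function names are hypothetical, time is counted in whole samples, and the interpolation a real variable-speed reader would need is left out.

```python
def read_indices_with_jump(times, old_delay, new_delay, jump_at):
    """Index of the input sample heard at each output time when the
    read address jumps at output time `jump_at` (all values in samples)."""
    indices = []
    for t in times:
        delay = old_delay if t < jump_at else new_delay
        indices.append(t - delay)
    return indices


def delay_with_variable_speed(start_delay, read_speed, n_outputs):
    """Delay (in samples) after each output sample when the read address
    advances by `read_speed` for every 1 the write address advances."""
    delay = float(start_delay)
    history = []
    for _ in range(n_outputs):
        delay += 1.0 - read_speed  # reading slower than writing grows the delay
        history.append(delay)
    return history


# Jumping from a delay of 2 to 4 samples replays input samples 9 and 10.
longer = read_indices_with_jump(range(10, 16), 2, 4, jump_at=13)
# Jumping from a delay of 4 to 2 samples skips input samples 9 and 10.
shorter = read_indices_with_jump(range(10, 16), 4, 2, jump_at=13)
# Reading at half speed adds half a sample of delay per output sample.
growing = delay_with_variable_speed(100, 0.5, 4)
```

The jump produces a discontinuity in the sequence of samples heard, which is the source of the pop or click, whereas the variable-speed reader changes the delay smoothly at the cost of altering the playback rate and therefore the pitch.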

Variable speed reading

Audio synchronizers that use variable speed reading are generally preferred in professional applications, because control of the audio delay is generally more accurate and more easily accomplished. Lower-performance devices leave the resulting pitch errors uncompensated and instead keep them below the level the average viewer generally perceives by limiting how far the reading speed may change. The limit is typically on the order of 0.2%. Unfortunately, this also limits the rate of delay change. When a large video delay change occurs, the slow tracking rate of these uncompensated synchronizers can leave the audio-video sync off for several seconds or minutes until the audio delay catches up with the video delay. Additionally, listeners with excellent pitch perception may notice and be annoyed by even these small pitch errors.
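The effect of the speed limit on tracking time can be estimated with simple arithmetic. The following sketch assumes an idealized synchronizer that reads continuously at its maximum permitted speed offset until the new delay is reached; the function name is hypothetical. Under that assumption, each second of real time changes the delay by the speed limit expressed as a fraction of a second.

```python
def tracking_time_seconds(delay_change_s, speed_limit):
    """Time needed to absorb a delay change when the read speed may differ
    from the write speed by at most `speed_limit` (a fraction, e.g. 0.002)."""
    if speed_limit <= 0:
        raise ValueError("speed_limit must be positive")
    return abs(delay_change_s) / speed_limit


# A 0.1 second delay change at a 0.2% limit takes about 50 seconds.
small_change = tracking_time_seconds(0.1, 0.002)
# A jump from 2 to 6 seconds (a 4 second change) takes about 2000 seconds.
large_change = tracking_time_seconds(6.0 - 2.0, 0.002)
```

Under these assumptions, a delay jump of several seconds keeps an uncompensated synchronizer out of sync for many minutes.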

Pitch correction circuit

In higher-performance audio synchronizers, the reading speed is allowed to change by much more, generally on the order of 25%, and the resulting pitch error is corrected with a pitch correction circuit. The pitch correction circuitry is frequently a proprietary design, owing to the difficulty of performing the correction so that the errors are imperceptible to critical listeners. These higher-performance audio synchronizers allow the audio delay to track even large and quick video delay changes without generating artifacts that are perceptible, even to critical listeners, for most audio program material.
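The size of the pitch error that must be corrected can be expressed in cents (hundredths of a semitone) using the standard relation of 1200 times the base-2 logarithm of the frequency ratio. The sketch below assumes that the percentage figures refer to the ratio of reading speed to writing speed; the function name is hypothetical.

```python
import math


def pitch_shift_cents(speed_change):
    """Pitch shift in cents when audio is read at (1 + speed_change) times
    the speed at which it was written."""
    if speed_change <= -1.0:
        raise ValueError("speed_change must be greater than -1")
    return 1200.0 * math.log2(1.0 + speed_change)


# A 0.2% speed change shifts the pitch by roughly 3.5 cents.
small_shift = pitch_shift_cents(0.002)
# A 25% speed change shifts the pitch by roughly 386 cents, nearly four semitones.
large_shift = pitch_shift_cents(0.25)
```

A shift of a few cents is small enough to pass unnoticed by most listeners, whereas a shift of several semitones is plainly audible, which is why the faster-tracking designs depend on pitch correction.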

Recent developments

Recent developments in video processing devices permit those devices to sense in advance when a large video delay change will be needed and to communicate that information to the audio synchronizer. This advance notice allows the audio synchronizer to anticipate the change and to take advantage of particular audio material (e.g., periods of relative silence or periods without music) to make the corresponding large audio delay change without risking noticeable audio artifacts. Further developments permit handshaking between the video processing device and the audio synchronizer to control when the video delay change is made. This optimizes the timing of the tracking audio delay change, further reducing the risk of noticeable audio artifacts while also reducing the risk of missynchronization due to rapid video delay changes.
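One way a synchronizer might use advance notice is to look through the audio it already holds and schedule the delay change during the quietest stretch. The sketch below is an assumption made purely for illustration: the source does not say how relative silence is detected, and the function name `quietest_window` and the sum-of-squares energy measure are hypothetical choices.

```python
def quietest_window(samples, window):
    """Return the start index of the `window`-sample span with the lowest
    energy (sum of squared sample values) in the buffered audio."""
    if window <= 0 or window > len(samples):
        raise ValueError("window must be between 1 and len(samples)")

    def energy(start):
        return sum(s * s for s in samples[start:start + window])

    # min() returns the earliest start index among equally quiet spans.
    return min(range(len(samples) - window + 1), key=energy)


# Usage: the three silent samples starting at index 3 form the quietest span.
buffered = [0.5, -0.4, 0.3, 0.0, 0.0, 0.0, 0.6, -0.7]
change_at = quietest_window(buffered, window=3)
```

Samples repeated or dropped inside a silent span are far less likely to be heard, which is the advantage the advance notice provides.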


References

  1. Aldo Cugnini (2007-09-01). "Managing lip sync". Broadcast Engineering. Archived from the original on 2011-07-27. Retrieved 2011-07-27.
  2. IS-191: Relative Timing of Sound and Vision for Broadcast Operations. ATSC. 2003-06-26. Archived from the original on 2011-07-27.

External links

  • For examples of modern audio synchronizers, search "audio synchronizer" or "audio video sync" on the United States Patent Office web site.