PSOLA

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 134.96.2.50 (talk) at 15:13, 17 July 2009 (Added duration modification and simultaneous duration/pitch modification with PSOLA). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In digital signal processing, PSOLA stands for Pitch Synchronous Overlap and Add. It is a technique used in speech processing, and in particular in speech synthesis.

PSOLA

PSOLA is often used in speech processing to change the pitch of a speech signal without affecting its duration. A much simpler way to modify the pitch is to change the duration of the speech signal, for example by playing it back slower or faster: lengthening the signal lowers the pitch and shortening it raises the pitch. With that approach, however, pitch and duration cannot be changed independently of each other.
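The simple approach can be illustrated with a minimal Python sketch that reads a signal back at a different speed using linear interpolation. The function names, the 8000 Hz sample rate and the 100 Hz test tone are assumptions chosen for illustration, and linear interpolation is only one of several possible resampling methods.

```python
import math

def resample_linear(signal, factor):
    """Read `signal` at `factor` times the original speed.

    factor > 1 shortens the signal and raises the pitch,
    factor < 1 lengthens it and lowers the pitch.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_out = int(len(signal) / factor)
    out = []
    for i in range(n_out):
        pos = i * factor                      # fractional read position in the input
        left = int(pos)
        right = min(left + 1, len(signal) - 1)
        frac = pos - left
        out.append((1.0 - frac) * signal[left] + frac * signal[right])
    return out

def estimate_frequency(signal, sample_rate):
    """Rough frequency estimate: upward zero crossings per second."""
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if a < 0.0 <= b)
    return crossings * sample_rate / len(signal)

sample_rate = 8000                            # assumed sample rate in Hz
tone = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(sample_rate)]

higher = resample_linear(tone, 2.0)           # half as long, about one octave up
lower = resample_linear(tone, 0.5)            # twice as long, about one octave down

print(len(tone), len(higher), len(lower))
print(estimate_frequency(tone, sample_rate),
      estimate_frequency(higher, sample_rate),
      estimate_frequency(lower, sample_rate))
```

In this sketch the pitch and the duration always change together, which is the limitation that PSOLA is meant to avoid.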

In PSOLA, by contrast, the speech waveform is first divided into several small overlapping segments. The segments are then moved closer together to increase the pitch, or further apart to decrease it. Finally, the segments are interpolated and added using the overlap-add technique, so that the duration of the resulting speech waveform is the same as that of the original waveform.
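The steps above can be sketched in Python as follows. This is a simplified illustration, not a reference implementation. It assumes that the pitch marks (one sample index per pitch period) are already known, that each segment is a Hann window two pitch periods long, and that the output is normalised by the accumulated window gain; the description above does not specify any of these choices. The duration is kept by reusing or skipping segments so that the synthesis marks cover the same time span as the input. The test signal is a synthetic pulse train standing in for voiced speech, and all function names are hypothetical.

```python
import math

def psola_pitch_shift(signal, marks, factor):
    """Scale the pitch by `factor` (>1 raises, <1 lowers) and keep the duration.

    `marks` are ascending pitch-mark sample indices, one per pitch period.
    They are assumed to be known; finding them is a separate problem.
    """
    if factor <= 0 or len(marks) < 2:
        raise ValueError("need factor > 0 and at least two pitch marks")
    out = [0.0] * len(signal)
    weight = [0.0] * len(signal)      # accumulated window gain, for normalisation
    t = float(marks[0])               # current synthesis mark
    k = 0                             # index of the nearest analysis mark
    while t <= marks[-1]:
        while k + 1 < len(marks) and abs(marks[k + 1] - t) <= abs(marks[k] - t):
            k += 1
        if k + 1 < len(marks):
            period = marks[k + 1] - marks[k]
        else:
            period = marks[k] - marks[k - 1]
        centre = int(round(t))
        for n in range(2 * period):
            w = 0.5 - 0.5 * math.cos(math.pi * n / period)   # Hann, two periods long
            src = marks[k] - period + n       # sample of the segment around mark k
            dst = centre - period + n         # where it lands in the output
            if 0 <= src < len(signal) and 0 <= dst < len(out):
                out[dst] += w * signal[src]
                weight[dst] += w
        t += period / factor          # closer marks raise the pitch, wider ones lower it
    return [o / w if w > 1e-3 else o for o, w in zip(out, weight)]

def pulse_train(length, period, decay=10.0):
    """Synthetic stand-in for voiced speech: one decaying pulse per pitch period."""
    out, value = [], 0.0
    for n in range(length):
        value = value * math.exp(-1.0 / decay) + (1.0 if n % period == 0 else 0.0)
        out.append(value)
    return out

def peak_positions(signal, ratio=0.5):
    """Indices of local maxima higher than `ratio` times the largest sample."""
    limit = ratio * max(signal)
    return [i for i in range(1, len(signal) - 1)
            if signal[i] > limit and signal[i - 1] < signal[i] >= signal[i + 1]]

pulses = pulse_train(4000, 80)                # pitch period of 80 samples
marks = list(range(80, 3920, 80))             # pitch marks placed on the pulse onsets
raised = psola_pitch_shift(pulses, marks, 2.0)

peaks = peak_positions(raised[1020:3020])
print(len(pulses), len(raised))
print(sorted(set(b - a for a, b in zip(peaks, peaks[1:]))))   # spacing between pulses
```

A rich, pulse-like test signal is used on purpose: PSOLA re-spaces whole pitch periods, so a pure sine wave is not a meaningful input for it. With a factor of 1.0 the sketch should return the input essentially unchanged between the first and last pitch marks.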

PSOLA can also be applied to modify the duration of a speech signal while preserving its original pitch characteristics. In this case, the small overlapping segments, which are synchronized with the local pitch period, are repeated to increase the duration or eliminated to decrease it. Applying pitch and duration modification simultaneously makes it possible to modify the prosody of a speech signal. PSOLA techniques are known to produce highly natural output provided that the amount of modification is small, that the pitch period does not change rapidly, and that the pitch period can be measured correctly.
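Duration modification can be sketched in the same style. In the hypothetical function below, the synthesis marks keep the original pitch-period spacing while each one is mapped back to the nearest analysis mark on a compressed or stretched time axis, so segments end up being repeated or skipped. As before, known pitch marks, a two-period Hann window, gain normalisation and the synthetic pulse train are assumptions made for illustration.

```python
import math

def psola_time_stretch(signal, marks, stretch):
    """Change the duration by `stretch` (>1 longer, <1 shorter), keeping the pitch.

    Returns the new signal and, for each synthesis mark, the index of the
    analysis segment that was used. `marks` are assumed to be known.
    """
    if stretch <= 0 or len(marks) < 2:
        raise ValueError("need stretch > 0 and at least two pitch marks")
    n_out = int(round(len(signal) * stretch))
    out = [0.0] * n_out
    weight = [0.0] * n_out
    used = []
    t = marks[0] * stretch            # synthesis mark on the output time axis
    k = 0
    while t <= marks[-1] * stretch:
        source_time = t / stretch     # matching instant in the input
        while (k + 1 < len(marks)
               and abs(marks[k + 1] - source_time) <= abs(marks[k] - source_time)):
            k += 1
        if k + 1 < len(marks):
            period = marks[k + 1] - marks[k]
        else:
            period = marks[k] - marks[k - 1]
        centre = int(round(t))
        for n in range(2 * period):
            w = 0.5 - 0.5 * math.cos(math.pi * n / period)   # Hann, two periods long
            src = marks[k] - period + n
            dst = centre - period + n
            if 0 <= src < len(signal) and 0 <= dst < n_out:
                out[dst] += w * signal[src]
                weight[dst] += w
        used.append(k)
        t += period                   # unchanged spacing keeps the pitch period
    return [o / w if w > 1e-3 else o for o, w in zip(out, weight)], used

def pulse_train(length, period, decay=10.0):
    """Synthetic stand-in for voiced speech: one decaying pulse per pitch period."""
    out, value = [], 0.0
    for n in range(length):
        value = value * math.exp(-1.0 / decay) + (1.0 if n % period == 0 else 0.0)
        out.append(value)
    return out

def peak_positions(signal, ratio=0.5):
    """Indices of local maxima higher than `ratio` times the largest sample."""
    limit = ratio * max(signal)
    return [i for i in range(1, len(signal) - 1)
            if signal[i] > limit and signal[i - 1] < signal[i] >= signal[i + 1]]

pulses = pulse_train(4000, 80)                # pitch period of 80 samples
marks = list(range(80, 3920, 80))             # pitch marks placed on the pulse onsets

longer, repeated = psola_time_stretch(pulses, marks, 1.5)
shorter, dropped = psola_time_stretch(pulses, marks, 0.5)

print(len(pulses), len(longer), len(shorter))
print(repeated[:8])                           # some segment indices occur twice
print(dropped[:8])                            # some segment indices are skipped
```

Pitch and duration could be modified together in such a sketch by additionally dividing the spacing between synthesis marks by a pitch factor, as in pitch modification.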

See also

Audio timescale-pitch modification.