PSOLA

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Headbomb at 05:10, 15 November 2019. It may differ significantly from the current revision.

Oscillograms, spectrograms and intonograms of the Polish expressions (a) "jajem" [egg], (b) "ja jem" [I'm eating], (c) "nawóz" [fertiliser] and (d) "na wóz" [on a cart].[1]

PSOLA (Pitch Synchronous Overlap and Add) is a digital signal processing technique used in speech processing and, more specifically, in speech synthesis. It can be used to modify the pitch and duration of a speech signal. It was invented around 1986.[2]

PSOLA works by dividing the speech waveform into small overlapping segments. To change the pitch of the signal, the segments are moved further apart (to decrease the pitch) or closer together (to increase the pitch). To change the duration of the signal, some segments are repeated (to increase the duration) or omitted (to decrease the duration). The repositioned segments are then combined using the overlap-add technique.
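The steps above can be illustrated with a minimal time-domain sketch in Python. It is not taken from the cited sources and makes several assumptions: the pitch marks (sample indices, one per pitch period) are already known, each segment spans two local pitch periods and is weighted with a Hann window, and the output is divided by the summed window weights to keep the amplitude stable. The names `psola`, `local_period`, `pitch_factor` and `duration_factor` are hypothetical and chosen only for this example.

```python
import math
from typing import List, Sequence


def local_period(marks: Sequence[int], k: int) -> int:
    """Estimate the pitch period (in samples) around pitch mark k."""
    if k == 0:
        return marks[1] - marks[0]
    if k == len(marks) - 1:
        return marks[-1] - marks[-2]
    return (marks[k + 1] - marks[k - 1]) // 2


def psola(signal: Sequence[float], marks: Sequence[int],
          pitch_factor: float = 1.0,
          duration_factor: float = 1.0) -> List[float]:
    """Illustrative overlap-add resynthesis (assumed design, see lead-in).

    marks must be strictly increasing sample indices of pitch marks.
    pitch_factor > 1 places segments closer together (higher pitch);
    duration_factor > 1 repeats segments (longer output).
    """
    if len(marks) < 2:
        raise ValueError("need at least two pitch marks")
    if pitch_factor <= 0 or duration_factor <= 0:
        raise ValueError("factors must be positive")

    out_len = int(round(len(signal) * duration_factor))
    out = [0.0] * out_len
    weight = [0.0] * out_len

    t = marks[0] * duration_factor  # position of the next output segment
    while t < out_len:
        # Map the output position back to the input time axis and pick
        # the nearest input segment; this repeats or skips segments.
        src_time = t / duration_factor
        k = min(range(len(marks)), key=lambda i: abs(marks[i] - src_time))
        period = local_period(marks, k)
        center_in, center_out = marks[k], int(round(t))

        # Overlap-add a Hann-weighted segment of two pitch periods.
        for j in range(-period, period + 1):
            src, dst = center_in + j, center_out + j
            if 0 <= src < len(signal) and 0 <= dst < out_len:
                w = 0.5 + 0.5 * math.cos(math.pi * j / period)
                out[dst] += w * signal[src]
                weight[dst] += w

        # Segment spacing sets the output pitch period.
        t += period / pitch_factor

    return [o / w if w > 1e-9 else 0.0 for o, w in zip(out, weight)]
```

Calling `psola(signal, marks, pitch_factor=2.0)` places the segments at half their original spacing, which raises the pitch by about an octave while keeping the length, and `duration_factor=1.5` lengthens the signal by repeating segments. With both factors left at 1.0 the segments are put back where they came from. Finding the pitch marks is a separate problem that this sketch does not address.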

PSOLA can be used to change the prosody of a speech signal.

References

  1. ^ Grazyna Demenko (1999). Analiza cech suprasegmentalnych jezyka polskiego na potrzeby technologii mowy (PDF) (Ph.D. thesis). Seria Jezykoznawstwo Stosowane. Vol. 17. Uniwersytet Im. Adama Mickiewicza W Poznaniu. Fig.7.1, p.63.
  2. ^ Charpentier, F.; Stella, M. (1986). "Diphone synthesis using an overlap-add technique for speech waveforms concatenation". ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing. Vol. 11. pp. 2015–2018. doi:10.1109/ICASSP.1986.1168657.