Spectral modeling synthesis

From Wikipedia, the free encyclopedia
Spectral modeling synthesis (based on Roads 1996, p. 153)

Spectral modeling synthesis (SMS) is an acoustic modeling approach for speech and other signals. SMS treats a sound as the sum of harmonic content and noise content. Harmonic components are identified from peaks in the frequency spectrum of the signal, normally computed with the short-time Fourier transform. The signal that remains after the harmonic components are removed, sometimes called the residual, is then modeled as white noise passed through a time-varying filter. The outputs of the model are therefore the frequencies and levels of the detected harmonic components and the coefficients of the time-varying filter.
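The analysis stage described above can be illustrated for a single frame with the following minimal Python sketch. It is not the implementation from the cited sources: the function names (`hann_dft_magnitudes`, `analyze_frame`), the peak threshold, the main-lobe width and the band size are all assumptions chosen for illustration. The residual is approximated by blanking the spectral bins around each detected peak, and the "filter coefficients" are reduced to a coarse band-averaged magnitude envelope. A full system would repeat this per frame and track peaks over time.

```python
import cmath
import math
import random


def hann_dft_magnitudes(frame):
    """Magnitudes of DFT bins 0..N/2 for a Hann-windowed frame (naive O(N^2) DFT)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2 + 1)]


def analyze_frame(frame, sample_rate, threshold_ratio=0.1, lobe=2, band=16):
    """Split one frame into sinusoidal peaks and a coarse residual envelope.

    Returns (partials, envelope): partials is a list of (frequency_hz, amplitude);
    envelope is the mean residual magnitude in consecutive groups of `band` bins.
    All default parameter values are illustrative assumptions.
    """
    n = len(frame)
    mags = hann_dft_magnitudes(frame)
    floor = threshold_ratio * max(mags)
    # A peak is a local maximum of the magnitude spectrum above the threshold.
    peaks = [k for k in range(1, len(mags) - 1)
             if mags[k] > floor and mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]
    # A bin-centred sinusoid of amplitude A gives a Hann-windowed peak of A * n / 4,
    # so this amplitude estimate is only approximate for other frequencies.
    partials = [(k * sample_rate / n, 4 * mags[k] / n) for k in peaks]
    # Crude residual: blank the Hann main lobe (peak bin +/- `lobe`) around each peak.
    residual = list(mags)
    for k in peaks:
        for j in range(max(0, k - lobe), min(len(mags), k + lobe + 1)):
            residual[j] = 0.0
    envelope = [sum(residual[b:b + band]) / len(residual[b:b + band])
                for b in range(0, len(residual), band)]
    return partials, envelope


# Demo frame: two partials (500 Hz and 1000 Hz) plus low-level white noise.
random.seed(0)
SAMPLE_RATE, N = 8000, 256
frame = [math.sin(2 * math.pi * 500 * i / SAMPLE_RATE)
         + 0.5 * math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE)
         + random.gauss(0.0, 0.01)
         for i in range(N)]
partials, envelope = analyze_frame(frame, SAMPLE_RATE)
```

In this demo the two test frequencies fall exactly on DFT bins, which keeps the peak picking simple. Practical analyzers usually interpolate between bins to refine frequency and amplitude estimates.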

Intuitively, the model can be applied to many types of audio signals. Speech signals, for example, include slowly changing harmonic sounds caused by vibration of the vocal cords plus wideband, noise-like sounds caused by the lips and mouth. Musical instruments also produce sounds containing both harmonic components and percussive, noise-like sounds when the notes are struck or changed.
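The harmonic-plus-noise view can also be read in the synthesis direction. The sketch below, which is only an illustration under stated assumptions, builds a signal from a few fixed sinusoids and adds white noise shaped by a time-varying filter. The function name `synthesize`, the choice of a one-pole low-pass filter with a linearly moving cutoff, and all parameter values are hypothetical. The sources do not prescribe a particular filter structure.

```python
import math
import random


def synthesize(partials, noise_gain, cutoff_start, cutoff_end,
               duration, sample_rate=8000, seed=0):
    """Sum of fixed sinusoids plus white noise through a time-varying one-pole low-pass.

    partials: list of (frequency_hz, amplitude)
    cutoff_start / cutoff_end: low-pass cutoff in Hz, interpolated linearly over time
    Returns (harmonic, noise, mix) as lists of floats.
    """
    rng = random.Random(seed)
    n = int(duration * sample_rate)
    harmonic, noise = [], []
    state = 0.0
    for i in range(n):
        t = i / sample_rate
        # Harmonic part: plain additive synthesis of the given partials.
        harmonic.append(sum(a * math.sin(2 * math.pi * f * t) for f, a in partials))
        # Time-varying filter: the one-pole coefficient follows the moving cutoff.
        cutoff = cutoff_start + (cutoff_end - cutoff_start) * i / max(1, n - 1)
        alpha = 1.0 - math.exp(-2 * math.pi * cutoff / sample_rate)
        state += alpha * (rng.gauss(0.0, 1.0) - state)
        noise.append(noise_gain * state)
    mix = [h + z for h, z in zip(harmonic, noise)]
    return harmonic, noise, mix


# Demo: three harmonics of 220 Hz with noise whose bandwidth narrows over time.
harmonic, noise, mix = synthesize(
    partials=[(220, 1.0), (440, 0.5), (660, 0.25)],
    noise_gain=0.1, cutoff_start=3000, cutoff_end=300, duration=0.25)
```

Keeping the harmonic and noise parts as separate lists reflects the main appeal of the model: each component can be modified independently before the two are mixed.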

SMS analysis & synthesis block diagrams (based on Bonada et al. 2001, Fig.1 & Fig.2)

References

  • Serra, Xavier (2003). "Spectral Modeling Synthesis: Past and Present" (PDF). p. 20. Retrieved May 11, 2010.
  • Serra, Xavier. "Spectral Modeling Synthesis Tools". Retrieved May 11, 2010.
  • Smith III, Julius O. (December 28, 2005). "Spectral Modeling". Retrieved April 19, 2008.
  • Roads, Curtis (1996). "Figure 4.23: Overview of spectrum modeling synthesis. ...". The Computer Music Tutorial. MIT Press. p. 153. ISBN 978-0-262-68082-0.
  • Bonada, J.; Loscos, A.; Cano, P.; Serra, X.; Kenmochi, H. (2001). "Spectral Approach to the Modeling of the Singing Voice". In Proc. of the 111th AES Convention. CiteSeerX