Concatenative synthesis


Concatenative synthesis is a technique for synthesising sounds by concatenating short samples of recorded sound (called units). The duration of the units is not strictly defined and may vary according to the implementation, roughly in the range of 10 milliseconds up to 1 second. It is used in speech synthesis and music sound synthesis to generate user-specified sequences of sound from a database built from recordings of other sequences.

In contrast to granular synthesis, concatenative synthesis is driven by an analysis of the source sound, in order to identify the units that best match the specified criterion.[1]
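The analysis-driven selection described above can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the method of any cited system: the "recordings" are synthetic sine tones, the units have a fixed length of 50 ms, the descriptors (RMS level and zero-crossing rate) and the squared-distance criterion are chosen only for simplicity, and all function names are hypothetical. Only the distance to the target is considered here; how well neighbouring units join is ignored apart from a short crossfade.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed for this toy example


def sine(freq_hz, n_samples, amplitude=0.5):
    """Generate a sine tone as a list of floats (stand-in for a recording)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]


def split_into_units(signal, unit_len):
    """Cut a signal into consecutive, non-overlapping units of unit_len samples."""
    return [signal[i:i + unit_len]
            for i in range(0, len(signal) - unit_len + 1, unit_len)]


def describe(unit):
    """Analysis step: return (RMS level, zero-crossing rate) for one unit."""
    rms = math.sqrt(sum(x * x for x in unit) / len(unit))
    crossings = sum(1 for a, b in zip(unit, unit[1:]) if (a < 0) != (b < 0))
    return (rms, crossings / (len(unit) - 1))


def select_unit(target, database):
    """Pick the database unit whose descriptor is closest to the target."""
    def distance(entry):
        descriptor, _ = entry
        # Unweighted squared distance; a real system would scale descriptors.
        return sum((d - t) ** 2 for d, t in zip(descriptor, target))
    return min(database, key=distance)[1]


def concatenate(units, fade_len):
    """Join units, linearly crossfading fade_len samples at each boundary."""
    out = list(units[0])
    for unit in units[1:]:
        for i in range(fade_len):
            w = (i + 1) / (fade_len + 1)
            out[-fade_len + i] = (1 - w) * out[-fade_len + i] + w * unit[i]
        out.extend(unit[fade_len:])
    return out


# Build a toy database: three "recordings" analysed into 50 ms units.
corpus = sine(200, 1600) + sine(400, 1600) + sine(800, 1600)
database = [(describe(u), u) for u in split_into_units(corpus, 400)]

# The target is specified as one (RMS, zero-crossing rate) pair per output unit.
targets = [(0.35, 0.20), (0.35, 0.05), (0.35, 0.10)]
chosen = [select_unit(t, database) for t in targets]
output = concatenate(chosen, fade_len=40)
```

With these targets the sketch picks a unit from the 800 Hz tone, then the 200 Hz tone, then the 400 Hz tone, because the zero-crossing rate of a sine is roughly twice its frequency divided by the sample rate. A plain granular approach would instead take grains from the source without this descriptor matching.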

In speech

In music

Concatenative synthesis for music began to develop in the 2000s, in particular through the work of Schwarz[2] and Pachet[3] (so-called musaicing). The basic techniques are similar to those for speech, but differ where speech and music differ: for example, the signal is segmented not into phonetic units but often into subunits of musical notes or events.[1][2][4]
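As a rough illustration of segmenting a recording into event-like units rather than phonetic ones, the following sketch marks a region as an event wherever the frame-wise RMS level stays above a threshold. This is one of the simplest possible segmenters and is an assumption made for illustration, not the procedure used in the cited works; the frame length, the threshold, the synthetic test signal and the function names are all hypothetical.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed for this toy example


def frame_rms(signal, frame_len):
    """RMS level of each consecutive, non-overlapping frame of frame_len samples."""
    return [math.sqrt(sum(x * x for x in signal[i:i + frame_len]) / frame_len)
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def segment_events(signal, frame_len=100, threshold=0.05):
    """Return (start, end) sample indices of regions whose frame RMS >= threshold."""
    events = []
    start = None
    for k, level in enumerate(frame_rms(signal, frame_len)):
        if level >= threshold and start is None:
            start = k * frame_len                    # an event begins
        elif level < threshold and start is not None:
            events.append((start, k * frame_len))    # the event ends
            start = None
    if start is not None:
        # The signal ended while an event was still active.
        events.append((start, (len(signal) // frame_len) * frame_len))
    return events


def tone(freq_hz, n_samples, amplitude=0.5):
    """Synthetic note: a sine tone as a list of floats."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]


# Two "notes" separated by silence.
signal = ([0.0] * 400 + tone(440, 800) + [0.0] * 400
          + tone(440, 600) + [0.0] * 200)
events = segment_events(signal)
print(events)  # [(400, 1200), (1600, 2200)]
```

Each returned pair delimits one candidate unit that could then be analysed and stored in the database. Real recordings rarely have clean silences between notes, so a threshold of this kind should be read only as a starting point.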

See also


References

  1. Schwarz, D. (2005), "Current research in Concatenative Sound Synthesis" (PDF), Proceedings of the International Computer Music Conference (ICMC)
  2. Schwarz, Diemo (2004-01-23), Data-Driven Concatenative Sound Synthesis, retrieved 2010-01-15
  3. Zils, A.; Pachet, F. (2001), "Musical Mosaicing" (PDF), Proceedings of the COST G-6 Conference on Digital Audio Effects (DaFx-01), University of Limerick, pp. 39–44, archived from the original (PDF) on 2011-09-27, retrieved 2011-04-27
  4. Maestre, E.; Ramírez, R.; Kersten, S.; Serra, X. (2009), "Expressive Concatenative Synthesis by Reusing Samples from Real Performance Recordings", Computer Music Journal, 33 (4), pp. 23–42, CiteSeerX, doi:10.1162/comj.2009.33.4.23, S2CID 1078610