Sound synthesis

In music technology, sound synthesis is the process of generating sound with analog and digital electronic equipment, often for musical, artistic or entertainment purposes. In particular, it refers to the process of generating, combining or mixing sounds from a set of fundamental building blocks or routines in order to create sounds of greater complexity and richness. Sound synthesis can be used to mimic acoustic sound sources or to generate sounds that would be impossible to realize naturally. Since its development in the first half of the 20th century, it has found applications in music, computer science, film, acoustics and even biology.

Introduction

When any mechanical collision occurs, sound is produced. The energy from the collision is transferred through the air as sound waves, which are perceived by the human auditory system. Sound waves are the aggregate of one or many periodic vibrations, described mathematically by sine waves. The characteristics of a sound, known as pitch and timbre, are defined by the frequency and amplitude of each individual sine wave, collectively known as the partials or harmonics. Generally, a sound that does not change over time will include a fundamental partial or harmonic, and any number of additional partials. Traditionally, the aim of sound synthesis has been to mimic the frequency and amplitude of the partials of an acoustic sound source, effectively creating a mathematical model of the sound.
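
As a rough illustration of this idea, the following sketch sums a handful of sine-wave partials at integer multiples of a fundamental frequency; the frequencies and amplitude values are arbitrary examples, not measurements of any particular instrument.

    import numpy as np

    SAMPLE_RATE = 44100  # samples per second

    def additive_tone(fundamental_hz, partial_amplitudes, duration_s):
        """Sum sine-wave partials at integer multiples of the fundamental."""
        t = np.arange(int(SAMPLE_RATE * duration_s)) / SAMPLE_RATE
        tone = np.zeros_like(t)
        for n, amplitude in enumerate(partial_amplitudes, start=1):
            tone += amplitude * np.sin(2 * np.pi * n * fundamental_hz * t)
        # Normalize so the summed partials do not exceed full scale.
        return tone / np.max(np.abs(tone))

    # Example: a 220 Hz tone whose partials fall off as 1/n (roughly sawtooth-like).
    tone = additive_tone(220.0, [1.0 / n for n in range(1, 9)], duration_s=2.0)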

Synthesis

When natural tonal instruments' sounds are analyzed in the frequency domain (as on a spectrum analyzer), the spectra of their sounds will exhibit amplitude spikes at each of the fundamental tone's harmonics. Some harmonics may have higher amplitudes than others. The specific set of harmonic-vs-amplitude pairs is known as a sound's harmonic content.

When analyzed in the time domain, a sound does not necessarily have the same harmonic content throughout the duration of the sound. Typically, high-frequency harmonics will die out more quickly than the lower harmonics. For a synthesized sound to "sound" right, it requires accurate reproduction of the original sound in both the frequency domain and the time domain.
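
As an illustrative sketch (the frame size and hop length are arbitrary choices), harmonic content and its evolution over time can be inspected by taking windowed FFTs of successive frames of a signal:

    import numpy as np

    def framewise_spectra(sig, sample_rate, frame_size=4096, hop=2048):
        """Return (times, freqs, magnitudes) for successive windowed FFT frames."""
        window = np.hanning(frame_size)
        freqs = np.fft.rfftfreq(frame_size, d=1.0 / sample_rate)
        times, mags = [], []
        for start in range(0, len(sig) - frame_size, hop):
            frame = sig[start:start + frame_size] * window
            mags.append(np.abs(np.fft.rfft(frame)))
            times.append(start / sample_rate)
        return np.array(times), freqs, np.array(mags)

    # Peaks in each row of the magnitude array sit near the harmonics of the tone;
    # comparing successive rows shows how quickly the upper harmonics die away.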

Percussion instruments and rasps have very low harmonic content, and exhibit spectra that consist mainly of noise shaped by the resonant frequencies of the structures that produce the sounds. However, the resonant properties of the instruments (the spectral peaks of which are also referred to as formants) also shape an instrument's spectrum (especially in string, wind, voice and other natural instruments).

In most conventional synthesizers, for purposes of re-synthesis, the recorded sound of a real instrument is decomposed into several components.

These component sounds represent the acoustic responses of different parts of the instrument, the sounds produced by the instrument during different parts of a performance, or the behavior of the instrument under different playing conditions (pitch, intensity of playing, fingering, etc.). The distinctive timbre, intonation and attack of a real instrument can therefore be created by mixing these components in such a way as to resemble the natural behavior of the real instrument. Nomenclature varies by synthesizer methodology and manufacturer, but the components are often referred to as oscillators or partials. A higher-fidelity reproduction of a natural instrument can typically be achieved using more oscillators, but this requires more computational power and human programming effort, and most synthesizers use between one and four oscillators by default.

Amplitude Envelope

[Figure: schematic of an ADSR envelope]

One of the major characteristics of a sound is how its overall amplitude varies over time. Sound synthesis techniques often employ a transfer function called an amplitude envelope, which describes the sound's amplitude at any point in its duration. Most often, this amplitude profile is realized with an "ADSR" (Attack, Decay, Sustain, Release) envelope model, which is applied to an overall amplitude control. Apart from Sustain, each of these stages is modeled by a change in volume (typically exponential). Although the oscillations in real instruments also change frequency, most instruments can be modeled well without this refinement.

Attack time is the time taken for the initial run-up of the sound level from nil to its peak amplitude. Decay time is the time taken for the subsequent run-down from the attack level to the designated sustain level. Sustain level is the amplitude of the sound during the main sequence of its duration. Release time is the time taken for the sound to decay from the sustain level to zero.
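
A minimal sketch of such an envelope, using straight-line (rather than exponential) segments and arbitrary parameter values, might look like this:

    import numpy as np

    def adsr_envelope(n_samples, sample_rate, attack_s, decay_s, sustain_level, release_s):
        """Piecewise-linear ADSR envelope returned as an array of gain values."""
        a = int(attack_s * sample_rate)
        d = int(decay_s * sample_rate)
        r = int(release_s * sample_rate)
        s = max(n_samples - (a + d + r), 0)  # sustain fills the remaining time
        env = np.concatenate([
            np.linspace(0.0, 1.0, a, endpoint=False),           # attack: nil -> peak
            np.linspace(1.0, sustain_level, d, endpoint=False),  # decay: peak -> sustain
            np.full(s, sustain_level),                           # sustain
            np.linspace(sustain_level, 0.0, r),                  # release: sustain -> zero
        ])
        if len(env) < n_samples:
            env = np.pad(env, (0, n_samples - len(env)))
        return env[:n_samples]

    # Shaping a two-second tone sampled at 44.1 kHz:
    # shaped = tone * adsr_envelope(len(tone), 44100, 0.02, 0.1, 0.7, 0.4)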

Subtractive synthesizers use a simple acoustic model that assumes an instrument can be approximated by a simple signal generator (producing sawtooth waves, square waves, etc.) followed by a filter which represents the frequency-dependent losses and resonances in the instrument body. For reasons of simplicity and economy, these filters are typically low-order lowpass filters. The combination of simple modulation routings (such as pulse-width modulation and oscillator sync) with physically unrealistic lowpass filters is responsible for the "classic synthesizer" sound commonly associated with "analog synthesis", a term often mistakenly applied to software synthesizers that use subtractive synthesis. Although physical modeling synthesis, in which the sound is generated according to the physics of the instrument, has superseded subtractive synthesis for accurately reproducing natural instrument timbres, the subtractive paradigm is still ubiquitous in synthesizers, with most modern designs still offering low-order lowpass or bandpass filters following the oscillator stage.
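
A rough sketch of this signal chain, using a naive (non-band-limited) sawtooth oscillator and a second-order Butterworth lowpass filter standing in for an analog filter, is shown below; the frequencies are arbitrary.

    import numpy as np
    from scipy import signal

    SAMPLE_RATE = 44100

    def subtractive_voice(freq_hz, cutoff_hz, duration_s):
        """Harmonically rich sawtooth oscillator followed by a low-order lowpass filter."""
        t = np.arange(int(SAMPLE_RATE * duration_s)) / SAMPLE_RATE
        saw = signal.sawtooth(2 * np.pi * freq_hz * t)                # raw source
        b, a = signal.butter(2, cutoff_hz / (SAMPLE_RATE / 2), btype='low')
        return signal.lfilter(b, a, saw)                              # remove upper harmonics

    # A 110 Hz sawtooth "darkened" by a 1 kHz lowpass filter:
    voice = subtractive_voice(110.0, 1000.0, duration_s=2.0)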

One of the newest approaches to evolve within music synthesis is physical modeling. This involves building models of the components of musical instruments and creating systems which define the action, filters, envelopes and other parameters over time. The definition of such instruments is virtually limitless, as one can combine any of the available models with any number of modulation sources for pitch, frequency and contour: for example, the model of a violin with the characteristics of a pedal steel guitar and perhaps the action of a piano hammer. Physical modeling on computers becomes more accurate and faster as processing power increases.
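
One simple and widely known example of this approach is the Karplus-Strong plucked-string algorithm, sketched here; the damping value is an arbitrary choice.

    import numpy as np

    def karplus_strong(freq_hz, duration_s, sample_rate=44100, damping=0.996):
        """Karplus-Strong plucked string: a noise burst circulating in a delay line
        with a simple averaging (lowpass) filter in the feedback loop."""
        n_samples = int(sample_rate * duration_s)
        delay = int(sample_rate / freq_hz)          # delay length sets the pitch
        buf = np.random.uniform(-1.0, 1.0, delay)   # initial "pluck" is a noise burst
        out = np.empty(n_samples)
        for i in range(n_samples):
            out[i] = buf[i % delay]
            # Averaging adjacent samples makes the upper harmonics decay faster,
            # much as they do in a real string.
            buf[i % delay] = damping * 0.5 * (buf[i % delay] + buf[(i + 1) % delay])
        return out

    pluck = karplus_strong(220.0, duration_s=2.0)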

One of the simplest synthesis techniques is to record a real instrument as a digitized waveform and then play the recording back at different speeds to produce different tones. This is the technique used in "sampling". Most samplers designate a part of the sample for each component of the ADSR envelope, and then repeat that section while changing the volume for that segment of the envelope. This lets the sampler produce a convincingly different envelope using the same note.
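
A minimal sketch of the repitching step, using simple linear interpolation (real samplers use higher-quality interpolation and looping), might look like this:

    import numpy as np

    def repitch(sample, ratio):
        """Play a recorded sample back at a different speed via linear interpolation.
        ratio > 1 raises the pitch (and shortens the sound); ratio < 1 lowers it."""
        original_positions = np.arange(len(sample))
        new_positions = np.arange(0.0, len(sample) - 1, ratio)
        return np.interp(new_positions, original_positions, sample)

    # One octave up: repitch(sample, 2.0); a perfect fifth up: repitch(sample, 1.5)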

Synthesizer basics

There are three major kinds of synthesizers: analog, digital and software. In addition, there are synthesizers that rely upon combinations of those three kinds, known as hybrid synthesizers.

There are also many different synthesis methods, each applicable to both analog and digital synthesizers. These techniques tend to be mathematically related; frequency modulation and phase modulation, for example, are especially closely linked.
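
The relationship can be seen in a simple two-operator sketch, in which "frequency modulation" is implemented by adding a modulating sine wave to the phase of a carrier; the carrier and modulator frequencies and the modulation index are arbitrary.

    import numpy as np

    SAMPLE_RATE = 44100

    def fm_tone(carrier_hz, modulator_hz, mod_index, duration_s):
        """Two-operator FM: the modulator output is added to the carrier's phase,
        which is why frequency and phase modulation are so closely related."""
        t = np.arange(int(SAMPLE_RATE * duration_s)) / SAMPLE_RATE
        modulator = np.sin(2 * np.pi * modulator_hz * t)
        return np.sin(2 * np.pi * carrier_hz * t + mod_index * modulator)

    # A bell-like inharmonic ratio: fm_tone(200.0, 280.0, mod_index=5.0, duration_s=2.0)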