Spectral flatness
Spectral flatness or tonality coefficient,[1][2] also known as Wiener entropy,[3] is a measure used in digital signal processing to characterize an audio spectrum. Spectral flatness is typically measured in decibels, and provides a way to quantify how tone-like a sound is, as opposed to being noise-like.[2] "Tonal" in this context refers to the number of peaks or the amount of resonant structure in a power spectrum, as opposed to the flat spectrum of white noise. A high spectral flatness indicates that the spectrum has a similar amount of power in all spectral bands; this would sound similar to white noise, and the graph of the spectrum would appear relatively flat and smooth. A low spectral flatness indicates that the spectral power is concentrated in a relatively small number of bands; this would typically sound like a mixture of sine waves, and the spectrum would appear "spiky".[4]
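The spectral flatness is calculated by dividing the geometric mean of the power spectrum by the arithmetic mean of the power spectrum, i.e.:

$$\mathrm{Flatness} = \frac{\sqrt[N]{\prod_{n=0}^{N-1} x(n)}}{\frac{1}{N}\sum_{n=0}^{N-1} x(n)} = \frac{\exp\left(\frac{1}{N}\sum_{n=0}^{N-1} \ln x(n)\right)}{\frac{1}{N}\sum_{n=0}^{N-1} x(n)}$$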
where x(n) represents the magnitude of bin number n of the power spectrum and N is the number of bins. Note that one or more empty bins (x(n) = 0) yield a flatness of 0, because the geometric mean is then zero, so this measure is most useful when bins are generally not empty.
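The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it assumes NumPy is available, the function name `spectral_flatness` is hypothetical, and the geometric mean is evaluated in the log domain (the exponential of the mean of ln x(n)) so that the product over many bins does not overflow or underflow.

```python
import numpy as np

def spectral_flatness(power):
    """Ratio of geometric mean to arithmetic mean of a power spectrum.

    `power` holds the non-negative values x(n) of the power spectrum.
    """
    power = np.asarray(power, dtype=float)
    if power.size == 0:
        raise ValueError("power spectrum must contain at least one bin")
    if np.any(power < 0):
        raise ValueError("power values must be non-negative")
    # Any empty bin makes the geometric mean zero, so the flatness is 0.
    # (Convention assumed here: an all-zero spectrum is also reported as 0.)
    if np.any(power == 0):
        return 0.0
    geometric_mean = np.exp(np.mean(np.log(power)))
    arithmetic_mean = np.mean(power)
    return float(geometric_mean / arithmetic_mean)

# A perfectly flat spectrum gives a flatness close to 1.
flat = spectral_flatness(np.full(8, 0.5))

# Power concentrated in one bin gives a flatness close to 0.
spiky = spectral_flatness([1000.0, 1.0, 1.0, 1.0])

print(flat, spiky)
```

Returning 0 as soon as any bin is empty mirrors the behaviour noted above; applications that want to avoid this sometimes add a small floor to every bin, which is a design choice outside the definition itself.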
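The ratio produced by this calculation lies between 0 and 1, and is often converted to a decibel scale for reporting, with a maximum of 0 dB (a perfectly flat spectrum) and a minimum of −∞ dB (at least one empty bin).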
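As a small illustration of the decibel conversion, the sketch below assumes the usual 10·log10 convention for ratios of power quantities; the function name `flatness_to_db` is hypothetical.

```python
import math

def flatness_to_db(flatness):
    """Convert a flatness ratio to decibels (assumes 10*log10 scaling)."""
    if flatness < 0:
        raise ValueError("flatness ratio must be non-negative")
    if flatness == 0:
        # log10(0) is undefined; a zero ratio is reported as minus infinity.
        return -math.inf
    return 10.0 * math.log10(flatness)

for ratio in (1.0, 0.1, 0.0):
    print(ratio, flatness_to_db(ratio))
```

Under this convention a ratio of 1 maps to 0 dB, and each factor of ten in the ratio corresponds to 10 dB.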
The spectral flatness can also be measured within a specified subband, rather than across the whole band.
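A subband measurement can be sketched by selecting only the bins whose centre frequencies fall inside the band of interest and applying the same ratio to them. The example assumes NumPy; the function name `subband_flatness`, its parameters, and the toy spectrum are all hypothetical and chosen only for illustration.

```python
import numpy as np

def subband_flatness(power, freqs, f_lo, f_hi):
    """Flatness of the bins with f_lo <= frequency < f_hi.

    `power` is the power spectrum (assumed non-negative) and `freqs`
    gives the centre frequency of each bin in the same order.
    """
    power = np.asarray(power, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    band = power[(freqs >= f_lo) & (freqs < f_hi)]
    if band.size == 0:
        raise ValueError("no bins fall inside the requested band")
    if np.any(band == 0):
        return 0.0  # an empty bin forces the geometric mean to zero
    return float(np.exp(np.mean(np.log(band))) / np.mean(band))

# Toy spectrum: flat below 4 kHz, one dominant peak above it.
freqs = np.arange(8) * 1000.0  # bin centres at 0, 1000, ..., 7000 Hz
power = np.array([1.0, 1.0, 1.0, 1.0, 100.0, 0.01, 0.01, 0.01])

low = subband_flatness(power, freqs, 0.0, 4000.0)
high = subband_flatness(power, freqs, 4000.0, 8000.0)
print(low, high)
```

In this toy case the lower band is noise-like (flatness near 1) while the upper band is dominated by a single peak (flatness near 0), a distinction that a single whole-band value would blur.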
Applications
This measurement is one of the many audio descriptors used in the MPEG-7 standard, in which it is labelled "AudioSpectralFlatness".
In birdsong research, it has been used as one of the features measured on birdsong audio, when testing similarity between two excerpts.[5]
References
1. J. D. Johnston (1988). "Transform coding of audio signals using perceptual noise criteria". IEEE Journal on Selected Areas in Communications 6 (2): 314–332. doi:10.1109/49.608.
2. Shlomo Dubnov (2004). "Generalization of Spectral Flatness Measure for Non-Gaussian Linear Processes". Signal Processing Letters 11 (8): 698–701. doi:10.1109/LSP.2004.831663. ISSN 1070-9908.
3. http://luscinia.sourceforge.net/page19/page8/page33/page33.html
4. "A Large Set of Audio Features for Sound Description". Technical report, IRCAM (2003). Section 9.1.
5. Tchernichovski, O.; Nottebohm, F.; Ho, C. E.; Pesaran, B.; Mitra, P. P. (2000). "A procedure for an automated measurement of song similarity". Animal Behaviour 59 (6): 1167–1176. doi:10.1006/anbe.1999.1416.