Zero-crossing rate

The zero-crossing rate (ZCR) is the rate at which a signal changes sign, i.e., the rate at which it passes from positive through zero to negative or from negative through zero to positive.[1] The feature is widely used in both speech recognition and music information retrieval, where it is a key feature for classifying percussive sounds.[2]

ZCR is defined formally as

${\displaystyle zcr={\frac {1}{T-1}}\sum _{t=1}^{T-1}\mathbb {1} _{\mathbb {R} _{<0}}(s_{t}s_{t-1})}$

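where ${\displaystyle s}$ is a signal of length ${\displaystyle T}$, with samples ${\displaystyle s_{0},\ldots ,s_{T-1}}$, and ${\displaystyle \mathbb {1} _{\mathbb {R} _{<0}}}$ is the indicator function of the negative reals, equal to 1 when its argument is negative and 0 otherwise.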
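
As a minimal sketch of the definition above, the sum can be computed directly in Python. The function name `zero_crossing_rate` is illustrative and not taken from any particular library.

```python
def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose product is strictly negative."""
    if len(signal) < 2:
        raise ValueError("need at least two samples")
    # The indicator is 1 exactly when s[t] * s[t-1] < 0.
    crossings = sum(1 for prev, cur in zip(signal, signal[1:]) if prev * cur < 0)
    # Normalise by the number of adjacent pairs, T - 1.
    return crossings / (len(signal) - 1)
```

Because the indicator tests for a strictly negative product, a pair containing an exact zero sample contributes nothing in this sketch, so `[1, 0, -1]` yields a rate of 0.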

In some cases only the "positive-going" or "negative-going" crossings are counted rather than all crossings, since between any two adjacent positive-going zero-crossings there must be exactly one negative-going zero-crossing.
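
The directional counts can be sketched as follows. The function name is hypothetical, and the sketch assumes that no sample is exactly zero, so that every crossing shows up as a sign change between two adjacent samples.

```python
def directional_crossings(signal):
    """Return (positive_going, negative_going) crossing counts."""
    positive_going = 0
    negative_going = 0
    for prev, cur in zip(signal, signal[1:]):
        if prev < 0 < cur:
            # Negative sample followed by a positive one.
            positive_going += 1
        elif prev > 0 > cur:
            # Positive sample followed by a negative one.
            negative_going += 1
    return positive_going, negative_going
```

Under that assumption the two directions alternate, so either count alone determines the total number of crossings to within one.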

For monophonic tonal signals, the zero-crossing rate can be used as a primitive pitch detection algorithm.
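
A rough illustration of this idea, assuming a clean waveform that crosses zero exactly twice per period: the fundamental frequency is then about half the number of crossings per second. The function name, the 8000 Hz sampling rate, and the 440 Hz test tone are assumptions chosen for the example.

```python
import math


def estimate_pitch_hz(signal, sample_rate):
    """Estimate pitch as half the number of zero crossings per second."""
    if len(signal) < 2:
        raise ValueError("need at least two samples")
    crossings = sum(1 for prev, cur in zip(signal, signal[1:]) if prev * cur < 0)
    zcr = crossings / (len(signal) - 1)  # crossings per sample
    return zcr * sample_rate / 2.0       # two crossings per period


# Synthetic 440 Hz sine, one second long; the phase offset keeps
# samples from landing exactly on zero.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate + 0.3)
        for n in range(sample_rate)]
estimate = estimate_pitch_hz(tone, sample_rate)
```

On this synthetic tone the estimate lands close to 440 Hz. Harmonics or noise that introduce extra crossings would push the estimate upward, which is why the approach is only a primitive one.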

Applications

Zero-crossing rates are used for voice activity detection (VAD), i.e., determining whether human speech is present in an audio segment.
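
The following is only an illustrative heuristic, not an established VAD algorithm: it computes the ZCR frame by frame and flags a frame as a speech candidate when its energy is high enough and its ZCR is low enough. The function name, the frame layout, and both thresholds are hypothetical values chosen for the example; the energy test is an added assumption so that silent frames, which also have a ZCR of zero, are not flagged.

```python
import math


def simple_vad(signal, frame_len, hop, min_energy=0.01, max_zcr=0.25):
    """Return one boolean per frame: True for a speech-candidate frame."""
    if frame_len < 2:
        raise ValueError("frame_len must be at least 2")
    flags = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Mean-square energy of the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Frame-level zero-crossing rate, as in the definition above.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
        zcr = crossings / (frame_len - 1)
        flags.append(energy >= min_energy and zcr <= max_zcr)
    return flags


# Three synthetic 100-sample frames: silence, a slow sine, a rapid alternation.
silence = [0.0] * 100
slow_sine = [math.sin(2 * math.pi * n / 20 + 0.3) for n in range(100)]
alternating = [1.0 if n % 2 == 0 else -1.0 for n in range(100)]
flags = simple_vad(silence + slow_sine + alternating, frame_len=100, hop=100)
```

Here only the middle frame is flagged: the first fails the energy test and the last changes sign at every sample. Real detectors would need thresholds tuned to the recording conditions and typically combine several features.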