Itakura–Saito distance

From Wikipedia, the free encyclopedia

The Itakura–Saito distance (or Itakura–Saito divergence) is a measure of the difference between an original spectrum and an approximation of that spectrum. Although it is not a perceptual measure, it is intended to reflect perceptual (dis)similarity. It was proposed by Fumitada Itakura and Shuzo Saito in the 1960s while they were with NTT.[1]

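The distance between a spectrum P(ω) and its approximation \hat{P}(ω) is defined as:[2]

$$D_{IS}\big(P(\omega), \hat{P}(\omega)\big) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left[ \frac{P(\omega)}{\hat{P}(\omega)} - \log \frac{P(\omega)}{\hat{P}(\omega)} - 1 \right] d\omega$$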

The Itakura–Saito distance is a Bregman divergence, but it is not a true metric: it is not symmetric[3] and it does not satisfy the triangle inequality.

In non-negative matrix factorization, the Itakura–Saito divergence can be used as a measure of the quality of the factorization: this choice implies a meaningful statistical model of the components, and the resulting optimization problem can be solved with an iterative method.[4]
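
As an illustration of such an iterative method, the sketch below applies multiplicative updates, one commonly used scheme for this cost, to factor a strictly positive matrix V into non-negative factors W and H. It is a hedged example under stated assumptions rather than the procedure of any particular paper or library: the helper names (`matmul`, `is_cost`, `is_nmf`), the random initialisation, and the sample matrix are all hypothetical, and plain Python lists are used only to keep the example self-contained.

```python
import math
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def is_cost(v, v_hat):
    """Sum of element-wise Itakura-Saito divergences from v to v_hat."""
    return sum(x / y - math.log(x / y) - 1.0
               for row_v, row_h in zip(v, v_hat)
               for x, y in zip(row_v, row_h))


def mu_terms(v, wh):
    """Return the element-wise matrices V / (WH)^2 and 1 / (WH)."""
    a = [[x / (y * y) for x, y in zip(rv, ry)] for rv, ry in zip(v, wh)]
    b = [[1.0 / y for y in ry] for ry in wh]
    return a, b


def rescale(m, num, den):
    """Element-wise m * num / den."""
    return [[x * n / d for x, n, d in zip(rm, rn, rd)]
            for rm, rn, rd in zip(m, num, den)]


def is_nmf(v, rank, n_iter=50, seed=0):
    """Approximate v by w @ h using multiplicative updates for the IS cost."""
    rng = random.Random(seed)
    # Strictly positive random initialisation (an arbitrary choice).
    w = [[rng.uniform(0.5, 1.5) for _ in range(rank)] for _ in v]
    h = [[rng.uniform(0.5, 1.5) for _ in v[0]] for _ in range(rank)]
    costs = [is_cost(v, matmul(w, h))]
    for _ in range(n_iter):
        # H <- H * (W^T (V / (WH)^2)) / (W^T (1 / (WH)))
        a, b = mu_terms(v, matmul(w, h))
        wt = transpose(w)
        h = rescale(h, matmul(wt, a), matmul(wt, b))
        # W <- W * ((V / (WH)^2) H^T) / ((1 / (WH)) H^T)
        a, b = mu_terms(v, matmul(w, h))
        ht = transpose(h)
        w = rescale(w, matmul(a, ht), matmul(b, ht))
        costs.append(is_cost(v, matmul(w, h)))
    return w, h, costs


# Made-up 3x4 strictly positive matrix, factored with rank 2.
v = [[1.0, 2.0, 3.0, 4.0],
     [2.0, 4.0, 6.0, 8.5],
     [3.0, 1.0, 2.0, 1.0]]
w, h, costs = is_nmf(v, rank=2, n_iter=50)
```

The list `costs` records the divergence before the first update and after each iteration, so it can be inspected to see how the fit improves. Because every update multiplies by a ratio of positive quantities, the factors stay positive throughout.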

References

  1. Itakura, F.; Saito, S. (1968). "Analysis synthesis telephony based on the maximum likelihood method". Proc. 6th International Congress on Acoustics. Los Alamitos, CA: IEEE. pp. C–17–C–20.
  2. Alan H. S. Chan; Sio-Iong Ao (2008). Advances in industrial engineering and operations research. Springer. p. 51. ISBN 978-0-387-74903-7.
  3. A. Banerjee; et al. (2004). "Clustering with Bregman Divergences". In Michael W. Berry; Umeshwar Dayal; Chandrika Kamath; David Skillicorn. Proceedings of the Fourth SIAM International Conference on Data Mining. SIAM. pp. 234–245. ISBN 978-0-89871-568-2.
  4. Cédric Févotte; Nancy Bertin; Jean-Louis Durrieu (2009). "Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis". Neural Computation. 21 (3): 793–830. PMID 18785855. doi:10.1162/neco.2008.04-08-771.