Itakura–Saito distance

The Itakura–Saito distance (or Itakura–Saito divergence) is a measure of the difference between an original spectrum and an approximation of that spectrum. Although it is not a perceptual measure, it is intended to reflect perceptual (dis)similarity. It was proposed by Fumitada Itakura and Shuzo Saito in the 1960s while they were with NTT.[1]

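The distance between a power spectrum P(ω) and its approximation P̂(ω) is defined as:[2]

$$D_{IS}(P, \hat{P}) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left[ \frac{P(\omega)}{\hat{P}(\omega)} - \log \frac{P(\omega)}{\hat{P}(\omega)} - 1 \right] d\omega$$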
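
As a minimal sketch, the distance can be approximated numerically by averaging the quantity ratio − log(ratio) − 1 over frequency bins, where ratio is the original power spectrum divided by its approximation. The function name and the assumption that both spectra are strictly positive and sampled on the same uniform grid over one full period are illustrative choices, not part of the cited definition.

```python
import math


def itakura_saito_distance(p, p_hat):
    """Discrete approximation of the Itakura-Saito distance.

    Assumes p and p_hat are strictly positive power spectra sampled on the
    same uniform frequency grid covering one full period.
    """
    if len(p) != len(p_hat) or len(p) == 0:
        raise ValueError("spectra must be non-empty and of equal length")
    total = 0.0
    for a, b in zip(p, p_hat):
        if a <= 0 or b <= 0:
            raise ValueError("power spectra must be strictly positive")
        ratio = a / b
        total += ratio - math.log(ratio) - 1.0
    # With bin width 2*pi/N, the factor 1/(2*pi) turns the sum into a mean.
    return total / len(p)
```

Under these assumptions, identical spectra give a distance of exactly zero, and any mismatch gives a strictly positive value.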

The Itakura–Saito distance is a Bregman divergence generated by the negative logarithm, φ(x) = −log x. It is not a true metric, since it is not symmetric[3] and does not satisfy the triangle inequality.
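
These properties can be checked numerically for scalar arguments. The sketch below builds the Bregman divergence φ(p) − φ(q) − φ′(q)(p − q) from the generator φ(x) = −log x and compares it with the pointwise Itakura–Saito form p/q − log(p/q) − 1. The helper names and the sample points 1, 2 and 4 are illustrative assumptions.

```python
import math


def bregman_divergence(phi, dphi, p, q):
    """Bregman divergence of p from q for a scalar convex generator phi."""
    return phi(p) - phi(q) - dphi(q) * (p - q)


def neg_log(x):
    return -math.log(x)


def neg_log_derivative(x):
    return -1.0 / x


def is_divergence(p, q):
    """Pointwise Itakura-Saito divergence for positive scalars."""
    return p / q - math.log(p / q) - 1.0


def d(p, q):
    """Bregman divergence generated by the negative logarithm."""
    return bregman_divergence(neg_log, neg_log_derivative, p, q)


# The two expressions agree up to floating-point rounding.
print(abs(d(1.0, 2.0) - is_divergence(1.0, 2.0)) < 1e-12)

# Asymmetry: d(1, 2) is about 0.193, while d(2, 1) is about 0.307.
print(d(1.0, 2.0), d(2.0, 1.0))

# Triangle inequality fails: d(1, 4) is about 0.636, which exceeds
# d(1, 2) + d(2, 4), about 0.386.
print(d(1.0, 4.0), d(1.0, 2.0) + d(2.0, 4.0))
```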

In non-negative matrix factorization, the Itakura–Saito divergence can be used as the cost function that measures the quality of the factorization. This choice corresponds to a meaningful statistical model of the components, and the resulting optimization problem can be solved with an iterative method.[4]
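
One common iterative scheme for this problem is multiplicative updates. The NumPy sketch below is an illustration under stated assumptions rather than the exact procedure of the cited paper: the function names, the random initialization, the fixed iteration count and the small constant `eps` that guards against division by zero are all hypothetical choices, and the input matrix `V` is assumed to be strictly positive.

```python
import numpy as np


def is_divergence(V, V_hat):
    """Sum of pointwise Itakura-Saito divergences between two positive matrices."""
    ratio = V / V_hat
    return float(np.sum(ratio - np.log(ratio) - 1.0))


def is_nmf(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Approximate V by W @ H using multiplicative updates for the IS divergence."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    # Random positive starting point (an arbitrary choice for this sketch).
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    for _ in range(n_iter):
        # Update H with W held fixed.
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat) + eps)
        # Update W with the new H held fixed.
        V_hat = W @ H + eps
        W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T + eps)
    return W, H
```

Because each update multiplies the current factor by a ratio of non-negative quantities, `W` and `H` stay non-negative without an explicit projection step. The fit can be monitored by evaluating `is_divergence(V, W @ H)` after the loop.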

The Itakura–Saito distance is also the Bregman divergence associated with the Gamma exponential family: the information divergence of one distribution in the family from another is given by the Itakura–Saito divergence of the mean of the first distribution from the mean of the second.
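
As a worked illustration, consider the simplest member of the Gamma family, the exponential distribution (shape parameter 1), and take the information divergence to be the Kullback–Leibler divergence. The symbols p_1, p_2, μ_1 and μ_2 below are notation introduced only for this example.

```latex
% Exponential densities with means \mu_1 and \mu_2
p_i(x) = \frac{1}{\mu_i} e^{-x/\mu_i}, \qquad x \ge 0, \quad i = 1, 2

% Kullback-Leibler divergence of p_1 from p_2, using E_{p_1}[x] = \mu_1
\begin{aligned}
D_{KL}(p_1 \,\|\, p_2)
  &= \mathbb{E}_{p_1}\!\left[ \log p_1(x) - \log p_2(x) \right] \\
  &= \mathbb{E}_{p_1}\!\left[ -\log \mu_1 - \frac{x}{\mu_1} + \log \mu_2 + \frac{x}{\mu_2} \right] \\
  &= \log \frac{\mu_2}{\mu_1} - 1 + \frac{\mu_1}{\mu_2} \\
  &= \frac{\mu_1}{\mu_2} - \log \frac{\mu_1}{\mu_2} - 1
\end{aligned}
```

The last line is the pointwise Itakura–Saito divergence of μ_1 from μ_2. For Gamma distributions sharing a fixed shape parameter k, the same calculation yields this expression multiplied by k.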

References

  1. Itakura, F.; Saito, S. (1968). "Analysis synthesis telephony based on the maximum likelihood method". Proceedings of the 6th International Congress on Acoustics. Los Alamitos, CA: IEEE. pp. C-17–C-20.
  2. Alan H. S. Chan; Sio-Iong Ao (2008). Advances in industrial engineering and operations research. Springer. p. 51. ISBN 978-0-387-74903-7.
  3. A. Banerjee; et al. (2004). "Clustering with Bregman Divergences". In Michael W. Berry; Umeshwar Dayal; Chandrika Kamath; David Skillicorn (eds.). Proceedings of the Fourth SIAM International Conference on Data Mining. SIAM. pp. 234–245. ISBN 978-0-89871-568-2.
  4. Cédric Févotte; Nancy Bertin; Jean-Louis Durrieu (2009). "Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis". Neural Computation. 21 (3): 793–830. doi:10.1162/neco.2008.04-08-771. PMID 18785855.