Bayes error rate

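In statistical classification, the Bayes error rate is the lowest possible error rate that any classifier can achieve on a random outcome (for example, assignment to one of two categories), and is analogous to the irreducible error.[1][2]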

Several approaches to estimating the Bayes error rate exist. One seeks analytical bounds, which depend inherently on distribution parameters and are therefore difficult to estimate. Another focuses on class densities, and a third combines and compares various classifiers.[2]
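As an illustration of the first approach, the sketch below evaluates one well-known analytical bound, the Bhattacharyya bound, for two one-dimensional Gaussian classes. The source does not name a specific bound, so this choice, the function name `bhattacharyya_bound_1d`, and the example parameters are all assumptions made for illustration. The bound needs the true means, variances, and priors, which is exactly why such bounds are hard to use when those parameters are unknown.

```python
import math

def bhattacharyya_bound_1d(prior1, mean1, std1, prior2, mean2, std2):
    """Upper bound on the two-class Bayes error for 1-D Gaussian classes.

    Hypothetical helper: bound = sqrt(P1 * P2) * exp(-B), where B is the
    Bhattacharyya distance between the two Gaussians.
    """
    var_sum = std1 ** 2 + std2 ** 2
    distance = ((mean1 - mean2) ** 2 / (4.0 * var_sum)
                + 0.5 * math.log(var_sum / (2.0 * std1 * std2)))
    return math.sqrt(prior1 * prior2) * math.exp(-distance)

# Assumed example: equal priors, unit variances, means 0 and 2.
upper_bound = bhattacharyya_bound_1d(0.5, 0.0, 1.0, 0.5, 2.0, 1.0)

# For equal priors and equal variances the exact Bayes error is
# Phi(-gap / (2 * std)); here gap = 2 and std = 1, so Phi(-1).
true_error = 0.5 * (1.0 + math.erf(-1.0 / math.sqrt(2.0)))

print(f"Bhattacharyya upper bound: {upper_bound:.4f}")
print(f"exact Bayes error:         {true_error:.4f}")
```

In this example the bound is noticeably looser than the exact error, which is typical: it is cheap to compute from the parameters but only brackets the true value.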

The Bayes error rate finds important use in the study of pattern recognition and machine learning techniques.[citation needed]

Error determination

In machine learning and pattern classification, the elements of a data set are divided into two or more discrete classes. Each element of the data set is called an instance, and the class it belongs to is called its label. The Bayes error rate is the probability that the optimal classifier, which assigns each instance to its most probable class, classifies an instance incorrectly. For a multiclass classifier, the Bayes error rate may be calculated as follows:[citation needed]

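p = \sum_{i} \int\limits_{x \in H_{i}} \sum_{C_{j} \neq C_{i}} P(x|C_{j})\, p(C_{j})\, dx = 1 - \sum_{i} \int\limits_{x \in H_{i}} P(x|C_{i})\, p(C_{i})\, dx,

where x is an instance, C_i is a class into which an instance is classified, P(x|C_i) is the class-conditional density of x, p(C_i) is the prior probability of C_i, and H_i is the region that the classifier function h assigns to C_i. For the Bayes-optimal classifier, H_i is the region in which C_i is the most probable class C_max, so the error collects the probability mass of every class other than C_max over each region.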
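The integral can be checked numerically on a small case. The following is a minimal sketch, assuming one-dimensional Gaussian class-conditional densities with known parameters; the function names, the integration range, and the example values are illustrative assumptions, not part of the source. At each point x it takes the joint density of every class except the most probable one, which is the error mass contributed inside that class's decision region.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def bayes_error_1d(priors, means, stds, lo=-20.0, hi=20.0, steps=200_000):
    """Midpoint-rule estimate of the Bayes error for 1-D Gaussian classes."""
    dx = (hi - lo) / steps
    error = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        # Joint density P(x|C_i) * p(C_i) for every class at this point.
        joint = [p * gaussian_pdf(x, m, s)
                 for p, m, s in zip(priors, means, stds)]
        # x belongs to the region of the class with the largest joint
        # density; all other classes contribute error mass here.
        error += (sum(joint) - max(joint)) * dx
    return error

def closed_form_two_class(mean_gap, std):
    """Exact error for two classes with equal priors and equal variances."""
    return 0.5 * (1.0 + math.erf(-mean_gap / (2.0 * std) / math.sqrt(2.0)))

# Assumed example: equal priors, unit variances, means 0 and 2.
numeric = bayes_error_1d([0.5, 0.5], [0.0, 2.0], [1.0, 1.0])
exact = closed_form_two_class(2.0, 1.0)
print(f"numeric: {numeric:.6f}  closed form: {exact:.6f}")
```

The same function accepts more than two classes, since only the largest joint density is treated as correct at each point. In practice the densities are unknown and must themselves be estimated, so this sketch only illustrates the definition.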

The Bayes error is non-zero if the class-conditional distributions of the instances overlap, i.e. if a given instance x can occur with more than one label.[citation needed]
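A small discrete case makes the overlap condition concrete. This is a minimal sketch with made-up joint probabilities; the tables and the helper name `discrete_bayes_error` are assumptions for illustration only. Each table maps an instance value to the joint probability of seeing it with each label.

```python
def discrete_bayes_error(joint):
    """Bayes error for a finite table joint[x][label] = P(x, label).

    For each x the best possible guess is its most probable label, so the
    unavoidable error at x is the remaining probability mass.
    """
    return sum(sum(row.values()) - max(row.values())
               for row in joint.values())

# Instance "b" occurs with both labels, so the classes overlap.
overlapping = {
    "a": {"A": 0.3, "B": 0.0},
    "b": {"A": 0.1, "B": 0.2},
    "c": {"A": 0.0, "B": 0.4},
}

# Every instance occurs with exactly one label: no overlap.
separated = {
    "a": {"A": 0.4, "B": 0.0},
    "c": {"A": 0.0, "B": 0.6},
}

print(discrete_bayes_error(overlapping))  # about 0.1, from instance "b"
print(discrete_bayes_error(separated))    # zero: a perfect classifier exists
```

In the overlapping table, even the best classifier must guess label B for instance "b" and is wrong whenever the true label is A, which accounts for the entire error.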

References

  1. Fukunaga, Keinosuke (1990). Introduction to Statistical Pattern Recognition. ISBN 0122698517. pp. 3 and 97.
  2. Tumer, K.; Ghosh, J. (1996). "Estimating the Bayes error rate through classifier combining". Proceedings of the 13th International Conference on Pattern Recognition, Volume 2, pp. 695–699.