Decision stump

An example of a decision stump that discriminates between two of three classes of Iris flower data set: Iris versicolor and Iris virginica. The petal width is in centimetres. This particular stump achieves 94% accuracy on the Iris dataset for these two classes.

A decision stump is a machine learning model consisting of a one-level decision tree.[1] That is, it is a decision tree with one internal node (the root) that is immediately connected to the terminal nodes (its leaves). A decision stump makes a prediction based on the value of just a single input feature. Decision stumps are sometimes also called 1-rules.[2]
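
For instance, the stump in the figure above can be written as a single comparison on petal width. A minimal sketch in Python follows; the 1.75 cm threshold is only illustrative and would in practice be chosen from training data by a learning algorithm:

    # A decision stump for the two Iris classes in the figure: one test on one feature.
    # The 1.75 cm threshold is illustrative, not necessarily the value a learner would pick.
    def iris_stump(petal_width_cm: float) -> str:
        return "Iris versicolor" if petal_width_cm < 1.75 else "Iris virginica"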

Depending on the type of the input feature, several variations are possible. For nominal features, one may build a stump that contains a leaf for each possible feature value[3][4] or a stump with two leaves, one of which corresponds to some chosen category and the other to all remaining categories.[5] For binary features these two schemes are identical. A missing value may be treated as yet another category.[5]
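
The leaf-per-value scheme, with a missing value treated as its own category, can be sketched as follows; the function name fit_nominal_stump and the "?" marker are illustrative rather than taken from any particular library:

    from collections import Counter, defaultdict

    def fit_nominal_stump(values, labels, missing_marker="?"):
        # One leaf per observed category; each leaf predicts the majority class
        # of the training examples that fall into it. A missing value (None)
        # is mapped to its own category.
        groups = defaultdict(list)
        for value, label in zip(values, labels):
            groups[missing_marker if value is None else value].append(label)
        return {v: Counter(ls).most_common(1)[0][0] for v, ls in groups.items()}

    # fit_nominal_stump(["red", "red", "blue", None], ["A", "A", "B", "B"])
    # returns {"red": "A", "blue": "B", "?": "B"}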

For continuous features, a threshold value is usually selected, and the stump contains two leaves, one for values below the threshold and one for values above it. More rarely, multiple thresholds may be chosen, in which case the stump contains three or more leaves.
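
A common way to fit such a two-leaf stump, sketched below for binary 0/1 labels under the assumption that training error is the criterion being minimised, is to test the midpoints between consecutive distinct feature values and keep the best one:

    def fit_threshold_stump(xs, ys):
        # Binary 0/1 labels; assumes at least two distinct feature values.
        # Returns (threshold, label_below, label_above) with the fewest training errors.
        distinct = sorted(set(xs))
        best = None
        for t in ((a + b) / 2 for a, b in zip(distinct, distinct[1:])):
            for below, above in ((0, 1), (1, 0)):
                errors = sum(y != (below if x < t else above) for x, y in zip(xs, ys))
                if best is None or errors < best[0]:
                    best = (errors, t, below, above)
        return best[1:]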

Decision stumps are often[6] used as components (called "weak learners" or "base learners") in machine learning ensemble techniques such as bagging and boosting. For example, the Viola–Jones face detection algorithm employs AdaBoost with decision stumps as weak learners.[7]
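
As a sketch, such an ensemble can be assembled in scikit-learn by handing a depth-one tree to AdaBoost (in releases before version 1.2 the keyword is base_estimator rather than estimator):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    # A depth-1 decision tree is a decision stump; it is also AdaBoost's default base learner.
    stump = DecisionTreeClassifier(max_depth=1)
    model = AdaBoostClassifier(estimator=stump, n_estimators=50).fit(X, y)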

The term "decision stump" was coined in a 1992 ICML paper by Wayne Iba and Pat Langley.[1][8]

References

  1. Iba, Wayne; Langley, Pat (1992). "Induction of One-Level Decision Trees". ML92: Proceedings of the Ninth International Conference on Machine Learning, Aberdeen, Scotland, 1–3 July 1992. Morgan Kaufmann. pp. 233–240. doi:10.1016/B978-1-55860-247-2.50035-8. ISBN 978-1-55860-247-2.
  2. Holte, Robert C. (1993). "Very simple classification rules perform well on most commonly used datasets". Machine Learning. 11 (1): 63–90. doi:10.1023/A:1022631118932. S2CID 6596.
  3. Loper, Edward L.; Bird, Steven; Klein, Ewan (2009). Natural language processing with Python. Sebastopol, CA: O'Reilly. ISBN 978-0-596-51649-9. Archived from the original on 2010-06-18. Retrieved 2010-06-10.
  4. This classifier is implemented in Weka under the name OneR (for "1-rule").
  5. This is what has been implemented in Weka's DecisionStump classifier.
  6. Reyzin, Lev; Schapire, Robert E. (2006). "How Boosting the Margin Can Also Boost Classifier Complexity". ICML′06: Proceedings of the 23rd International Conference on Machine Learning. pp. 753–760. doi:10.1145/1143844.1143939. ISBN 978-1-59593-383-6. S2CID 2483269.
  7. Viola, Paul; Jones, Michael J. (2004). "Robust Real-Time Face Detection". International Journal of Computer Vision. 57 (2): 137–154. doi:10.1023/B:VISI.0000013087.49260.fb. S2CID 2796017.
  8. Oliver, Jonathan J.; Hand, David (1994). "Averaging Over Decision Stumps". Machine Learning: ECML-94, European Conference on Machine Learning, Catania, Italy, April 6–8, 1994, Proceedings. Lecture Notes in Computer Science. Vol. 784. Springer. pp. 231–241. doi:10.1007/3-540-57868-4_61. ISBN 3-540-57868-4. "These simple rules are in effect severely pruned decision trees and have been termed decision stumps" (Iba & Langley 1992).