Learning curve (machine learning)

In machine learning, a learning curve (or training curve) plots an estimator's training score and validation score against the number of training samples. It is a tool for judging how much a model benefits from additional training data and whether the estimator suffers more from variance error or from bias error. If the validation score and the training score both converge to a value that is too low as the training set grows, the estimator will not benefit much from more training data.^[1]
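The behaviour described above can be illustrated with a minimal sketch in plain Python. The data set (a noisy quadratic), the deliberately too-simple model (a straight line), the use of R^2 as the score, and the helper names `fit_line`, `r2_score` and `learning_curve_by_size` are all assumptions made for illustration; they do not come from the cited sources. Libraries such as scikit-learn offer a ready-made `learning_curve` helper for the same purpose.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b on 1-D data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx > 0 else 0.0
    return a, mean_y - a * mean_x

def r2_score(model, xs, ys):
    """Coefficient of determination R^2 of the fitted line on (xs, ys)."""
    a, b = model
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def learning_curve_by_size(train, valid, sizes):
    """Return (size, training score, validation score) per training-set size."""
    valid_xs, valid_ys = zip(*valid)
    curve = []
    for n in sizes:
        xs, ys = zip(*train[:n])          # use only the first n training samples
        model = fit_line(xs, ys)
        curve.append((n, r2_score(model, xs, ys),
                      r2_score(model, valid_xs, valid_ys)))
    return curve

# Hypothetical synthetic data: y = x^2 plus Gaussian noise, x in [-1, 2].
rng = random.Random(0)

def sample(n):
    return [(x, x * x + rng.gauss(0, 0.1))
            for x in (rng.uniform(-1, 2) for _ in range(n))]

train, valid = sample(400), sample(200)
curve = learning_curve_by_size(train, valid, sizes=[5, 10, 25, 50, 100, 200, 400])
for n, train_score, valid_score in curve:
    print(f"n={n:4d}  train R^2={train_score:.3f}  validation R^2={valid_score:.3f}")
```

Because a straight line cannot represent the quadratic relationship, the two scores should approach each other well below 1 as the training set grows, which is the bias-dominated case in which more data helps little.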

Learning curves are useful for many purposes, including comparing different algorithms,^[2] choosing model parameters during design,^[3] adjusting optimization to improve convergence, and determining the amount of data used for training.^[4]

In machine learning, the term has two senses that differ in the x-axis of the curve: the model's experience is plotted either as the number of training examples used for learning or as the number of iterations used in training the model.^[5]
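The second sense, with training iterations on the x-axis, can be sketched in plain Python as follows. The model (a line fitted by batch gradient descent), the mean squared error loss, the learning rate, the synthetic data and the helper names `mse` and `learning_curve_by_iteration` are assumptions chosen for illustration only. Note that this sketch records a loss (lower is better) rather than a score.

```python
import random

def mse(w, b, data):
    """Mean squared error of the line y = w*x + b on a list of (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def learning_curve_by_iteration(train, valid, iterations, lr=0.1):
    """Run batch gradient descent and record
    (iteration, training loss, validation loss) after every step."""
    w, b = 0.0, 0.0
    n = len(train)
    curve = [(0, mse(w, b, train), mse(w, b, valid))]
    for it in range(1, iterations + 1):
        # Gradient of the training MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in train) / n
        w -= lr * grad_w
        b -= lr * grad_b
        curve.append((it, mse(w, b, train), mse(w, b, valid)))
    return curve

# Hypothetical synthetic data: y = 3x + 1 plus Gaussian noise, x in [-1, 1].
rng = random.Random(0)

def sample(n):
    return [(x, 3.0 * x + 1.0 + rng.gauss(0, 0.2))
            for x in (rng.uniform(-1, 1) for _ in range(n))]

train, valid = sample(200), sample(100)
iteration_curve = learning_curve_by_iteration(train, valid, iterations=100)
for it, train_loss, valid_loss in iteration_curve[::20]:
    print(f"iteration={it:3d}  train loss={train_loss:.4f}  validation loss={valid_loss:.4f}")
```

Here the full training set is used at every step and only the amount of optimization varies, whereas a curve of the first kind refits the model on training sets of increasing size.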

References

[1] scikit-learn developers. "Validation curves: plotting scores to evaluate models — scikit-learn 0.20.2 documentation". Retrieved February 15, 2019.
[2] Madhavan, P.G. (1997). "A New Recurrent Neural Network Learning Algorithm for Time Series Prediction" (PDF). Journal of Intelligent Systems. p. 113 Fig. 3.
[3] "Machine Learning 102: Practical Advice". Tutorial: Machine Learning for Astronomy with Scikit-learn.
[4] Meek, Christopher; Thiesson, Bo; Heckerman, David (Summer 2002). "The Learning-Curve Sampling Method Applied to Model-Based Clustering". Journal of Machine Learning Research. 2 (3): 397.
[5] Sammut, Claude; Webb, Geoffrey I. (Eds.) (28 March 2011). Encyclopedia of Machine Learning (1st ed.). Springer. p. 578. ISBN 978-0-387-30768-8.

