# PU learning

In machine learning, PU learning is a collection of semisupervised techniques for training binary classifiers on positive and unlabeled examples only.[1]

In PU learning, two sets of samples are assumed to be available for training: the positive set $P$ and a mixed set $U$, which is assumed to contain both positive and negative samples, but without these being labeled as such. This contrasts with other forms of semisupervised learning, where it is assumed that a labeled set containing examples of both classes is available. A variety of techniques exist to adapt supervised classifiers to the PU learning setting. PU learning has been successfully applied to text classification[2][3][4] and bioinformatics tasks.[5]
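To make the setting concrete, the following is a minimal sketch of one way a supervised classifier might be adapted to PU data, using a two-step scheme: first treat all of $U$ as negative, then keep only the unlabeled samples that look least positive as "reliable negatives" and retrain on those. It is an illustration under stated assumptions rather than the method of any cited work: the nearest-centroid classifier, the function names, the `keep_fraction` parameter, and the synthetic two-cluster data are all chosen here for the example.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def sample(center, n):
    # Hypothetical synthetic data: 2-D Gaussian cluster around (center, center).
    return [(rng.gauss(center, 1.0), rng.gauss(center, 1.0)) for _ in range(n)]

def centroid(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def score(x, pos_c, neg_c):
    # Greater than zero when x is closer to the positive centroid.
    return sq_dist(x, neg_c) - sq_dist(x, pos_c)

def two_step_pu(P, U, keep_fraction=0.3):
    # Step 1: naive classifier that treats every unlabeled sample as negative.
    pos_c = centroid(P)
    neg_c = centroid(U)
    # Rank unlabeled samples from least to most positive-looking and keep
    # the lowest-scoring fraction as assumed "reliable negatives".
    ranked = sorted(U, key=lambda x: score(x, pos_c, neg_c))
    n_keep = max(1, int(keep_fraction * len(U)))
    reliable_neg = ranked[:n_keep]
    # Step 2: retrain the same classifier on P versus the reliable negatives.
    return pos_c, centroid(reliable_neg)

def predict(x, model):
    pos_c, neg_c = model
    return 1 if score(x, pos_c, neg_c) > 0 else 0

# P holds labeled positives; U mixes unlabeled positives and negatives.
P = sample(2.0, 50)
U = sample(2.0, 100) + sample(-2.0, 100)
model = two_step_pu(P, U)
```

Calling `predict(x, model)` returns 1 for points classified as positive and 0 otherwise. The value of `keep_fraction` trades off purity of the reliable negative set against its size, and a suitable value depends on the unknown share of positives in $U$.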

## References

1. ^ Bing Liu (2007). Web Data Mining. Springer. pp. 165–178.
2. ^ Bing Liu, Wee Sun Lee, Philip S. Yu and Xiao-Li Li (2002). "Partially supervised classification of text documents". ICML. pp. 8–12.
3. ^ Hwanjo Yu, Jiawei Han, Kevin Chen-Chuan Chang (2002). "PEBL: positive example based learning for web page classification using SVM". ACM SIGKDD.
4. ^ Xiao-Li Li and Bing Liu (2003). "Learning to classify text using positive and unlabeled data". IJCAI.
5. ^ Peng Yang, Xiao-Li Li, Jian-Ping Mei, Chee-Keong Kwoh and See-Kiong Ng (2012). "Positive-Unlabeled Learning for Disease Gene Identification". Bioinformatics, Vol 28(20).