Biomedical text mining
Biomedical text mining (also known as BioNLP) is text mining applied to the texts and literature of the biomedical and molecular biology domains. It is a relatively recent research field at the intersection of natural language processing, bioinformatics, medical informatics and computational linguistics.
Interest in text mining and information extraction strategies for the biomedical and molecular biology literature is growing, driven by the rising number of electronically available publications stored in databases such as PubMed.
The main developments in this area concern the identification of biological entities (named entity recognition), such as protein and gene names, chemical compounds and drugs, in free text; the association of gene clusters obtained from microarray experiments with the biological context provided by the corresponding literature; the automatic extraction of protein interactions; and the association of proteins with functional concepts (e.g. Gene Ontology terms). The extraction of kinetic parameters from text and of the subcellular location of proteins has also been addressed by information extraction and text mining technology, and such methods have been explored to extract information about biological processes and diseases.
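As a rough illustration of two of the tasks named above, the following Python sketch tags entity mentions by dictionary lookup and then treats two gene or protein names sharing a sentence as a candidate interaction. It is a minimal sketch under stated assumptions, not a description of any particular BioNLP system: the lexicon, the label names, the example text and the function names `tag_entities` and `cooccurring_pairs` are all hypothetical, and sentence-level co-occurrence is only a crude stand-in for real interaction extraction.

```python
import re
from itertools import combinations

# Hypothetical toy lexicon (an assumption for illustration only);
# real systems draw on curated resources and statistical models.
LEXICON = {
    "BRCA1": "GENE_OR_PROTEIN",
    "TP53": "GENE_OR_PROTEIN",
    "MDM2": "GENE_OR_PROTEIN",
    "aspirin": "CHEMICAL",
}

# One alternation over all names, longest first so that a longer name
# is preferred over any shorter name it starts with. Matching is
# case-sensitive and limited to whole words.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(name) for name in sorted(LEXICON, key=len, reverse=True))
    + r")\b"
)


def tag_entities(text):
    """Return (start, end, surface form, label) for each lexicon match."""
    return [
        (m.start(), m.end(), m.group(1), LEXICON[m.group(1)])
        for m in _PATTERN.finditer(text)
    ]


def cooccurring_pairs(text):
    """Return sorted pairs of gene/protein names that share a sentence."""
    pairs = set()
    # Naive sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        names = sorted(
            {e[2] for e in tag_entities(sentence) if e[3] == "GENE_OR_PROTEIN"}
        )
        pairs.update(combinations(names, 2))
    return sorted(pairs)


# Made-up example text, not taken from any publication.
abstract = (
    "MDM2 binds TP53 and promotes its degradation. "
    "Treatment with aspirin did not alter BRCA1 expression."
)

for start, end, surface, label in tag_entities(abstract):
    print(start, end, surface, label)
print(cooccurring_pairs(abstract))
```

Exact dictionary matching misses spelling variants, synonyms and ambiguous names, which is part of why entity recognition in biomedical text is treated as a research problem rather than a lookup step.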
Conferences at which BioNLP research is presented
BioNLP research is presented at a variety of meetings:
- Pacific Symposium on Biocomputing: in plenary session
- Intelligent Systems for Molecular Biology: in plenary session and also in the BioLINK and Bio-ontologies workshops
- Association for Computational Linguistics and North American Chapter of the Association for Computational Linguistics annual meetings and associated workshops: in plenary session and as part of the BioNLP workshop (see below)
- BioNLP 2010
- American Medical Informatics Association annual meeting: in plenary session
- PACBB - Practical Applications of Computational Biology & Bioinformatics
External links
- Bio-NLP resources, systems and application database collection
- The BioNLP mailing list archives
- Corpora for biomedical text mining
- The BioCreative evaluations of biomedical text mining technologies
- Directory of people involved in BioNLP
- National Centre for Text Mining (NaCTeM)
- Biomedical Literature Mining Publications (BLIMP): A comprehensive and regularly updated index of publications on (bio)medical text mining