BabelNet is a multilingual lexicalized semantic network and ontology. It was created automatically by linking Wikipedia, the largest multilingual Web encyclopedia, to WordNet, the most popular computational lexicon of the English language. The two resources are integrated by means of an automatic mapping, and lexical gaps in resource-poor languages are filled with the aid of statistical machine translation. The result is an "encyclopedic dictionary" that provides concepts and named entities lexicalized in many languages and connected by a large number of semantic relations. Like WordNet, BabelNet groups words into sets of synonyms, but its sets, called Babel synsets, span different languages. For each Babel synset, BabelNet provides short definitions (called glosses) in many languages, harvested from both WordNet and Wikipedia.
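The notion of a Babel synset described above can be pictured as a small data structure. The following is only a minimal sketch: the class `BabelSynsetSketch`, its fields, the identifier `example:car`, and the sample lemmas and gloss are all hypothetical illustrations and are not taken from the BabelNet API or its data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BabelSynsetSketch:
    """Hypothetical stand-in for a Babel synset; NOT the BabelNet API."""

    synset_id: str
    # Language code -> lemmas that express the concept in that language.
    senses: Dict[str, List[str]] = field(default_factory=dict)
    # Language code -> short definitions (glosses) in that language.
    glosses: Dict[str, List[str]] = field(default_factory=dict)

    def add_sense(self, lang: str, lemma: str) -> None:
        """Record one word sense (a lemma in a given language)."""
        self.senses.setdefault(lang, []).append(lemma)

    def lemmas(self, lang: str) -> List[str]:
        """Return the synonyms available for one language (may be empty)."""
        return list(self.senses.get(lang, []))

    def sense_count(self) -> int:
        """Count word senses across all languages."""
        return sum(len(lemmas) for lemmas in self.senses.values())


# Illustrative data only: one concept lexicalized in three languages.
car = BabelSynsetSketch("example:car")
car.add_sense("EN", "car")
car.add_sense("EN", "automobile")
car.add_sense("IT", "automobile")
car.add_sense("FR", "voiture")
car.glosses["EN"] = ["A motor vehicle with four wheels."]

print(car.lemmas("EN"), car.sense_count())
```

Keying both senses and glosses by language code reflects the idea that a single synset gathers the lexicalizations and definitions of one concept across languages.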
Statistics of BabelNet
As of October 2013, BabelNet (version 2.0) covers 50 languages, including all European languages, most Asian languages, and even Latin. BabelNet 2.0 contains more than 9 million synsets and about 50 million word senses (regardless of their language). On average, each Babel synset contains 5.5 synonyms (that is, word senses), counted across all languages. The semantic network includes all the lexico-semantic relations from WordNet (hypernymy and hyponymy, meronymy and holonymy, antonymy and synonymy, etc., totaling around 364,000 relation edges) as well as an underspecified relatedness relation from Wikipedia (totaling around 262 million relation edges). Version 2.0 also associates 7.7 million images with Babel synsets and provides a Lemon RDF encoding of the resource.
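The figures above can be cross-checked with simple arithmetic. The sketch below uses the rounded values quoted in the text, so its results are approximate; the constant and function names are chosen here for illustration only.

```python
# Rounded figures quoted in the text for BabelNet 2.0 (assumed values).
SYNSETS = 9_000_000           # "more than 9 million", so a lower bound
WORD_SENSES = 50_000_000      # "about 50 million"
WORDNET_EDGES = 364_000       # lexico-semantic relations from WordNet
WIKIPEDIA_EDGES = 262_000_000  # underspecified relatedness from Wikipedia


def average_senses_per_synset(word_senses: int, synsets: int) -> float:
    """Mean number of word senses per synset."""
    return word_senses / synsets


def share(part: int, total: int) -> float:
    """Fraction of `total` accounted for by `part`."""
    return part / total


avg = average_senses_per_synset(WORD_SENSES, SYNSETS)
wordnet_share = share(WORDNET_EDGES, WORDNET_EDGES + WIKIPEDIA_EDGES)

print(f"average senses per synset: {avg:.2f}")
print(f"WordNet share of relation edges: {wordnet_share:.4%}")
```

Because the synset count is a lower bound, the computed average is an upper estimate, which is consistent with the stated 5.5. The second figure shows that the typed WordNet relations make up well under one percent of all edges, with the rest coming from the Wikipedia relatedness relation.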
BabelNet has been shown to enable multilingual natural language processing applications. In particular, the lexicalized knowledge it provides has been used to obtain state-of-the-art results in semantic relatedness and multilingual word sense disambiguation.
See also
- Semantic network
- Semantic relatedness
- Word sense disambiguation
- Word sense induction
References
- R. Navigli and S. P. Ponzetto. BabelNet: Building a Very Large Multilingual Semantic Network. Proc. of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010), Uppsala, Sweden, July 11–16, 2010, pp. 216–225.
- R. Navigli and S. P. Ponzetto. BabelRelate! A Joint Multilingual Approach to Computing Semantic Relatedness. Proc. of the 26th AAAI Conference on Artificial Intelligence (AAAI 2012), Toronto, Canada, 2012, pp. 108–114.
- R. Navigli and S. P. Ponzetto. Joining Forces Pays Off: Multilingual Joint Word Sense Disambiguation. Proc. of the 2012 Conference on Empirical Methods in Natural Language Processing (EMNLP 2012), Jeju, Korea, July 12–14, 2012, pp. 1399–1410.