Calais is a service by Thomson Reuters that automatically extracts semantic information from web pages and returns it in a format that can be used on the Semantic Web. Calais was launched in January 2008 and is free to use.
The Calais Web Service reads unstructured text and returns results formatted in the Resource Description Framework (RDF), identifying entities, facts and events within the text. The service appears to be based on technology acquired when Reuters purchased ClearForest in 2007.
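To illustrate what consuming RDF-formatted results of this kind might look like, the following is a minimal Python sketch that reads entities out of an RDF/XML document using only the standard library. The sample payload, the `http://example.org/entity#` vocabulary, and the `extract_entities` helper are all hypothetical and chosen for illustration; they are not the actual Calais response schema or API.

```python
import xml.etree.ElementTree as ET

# Standard RDF namespace.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical vocabulary for this sketch; NOT the real Calais schema.
EX_NS = "http://example.org/entity#"

# Made-up RDF/XML payload standing in for a service response.
SAMPLE_RDF = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/entity#">
  <rdf:Description rdf:about="http://example.org/id/1">
    <ex:type>Company</ex:type>
    <ex:name>Reuters</ex:name>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/id/2">
    <ex:type>Company</ex:type>
    <ex:name>ClearForest</ex:name>
  </rdf:Description>
</rdf:RDF>
"""


def extract_entities(rdf_xml):
    """Return (uri, type, name) tuples for each rdf:Description element."""
    root = ET.fromstring(rdf_xml.strip())
    entities = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        uri = desc.get(f"{{{RDF_NS}}}about")
        entity_type = desc.findtext(f"{{{EX_NS}}}type")
        name = desc.findtext(f"{{{EX_NS}}}name")
        entities.append((uri, entity_type, name))
    return entities


if __name__ == "__main__":
    for uri, entity_type, name in extract_entities(SAMPLE_RDF):
        print(f"{entity_type}: {name} <{uri}>")
```

A real integration would replace the sample payload with the service's actual response and map its own vocabulary; a dedicated RDF library would also be a more robust choice than plain XML parsing, since RDF/XML allows several equivalent serializations of the same graph.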
Recent uses of the technology have included the automatic tagging of blog articles and the organization of museum collections.
Calais uses natural language processing technologies delivered via a web service interface.
- Case study of Open Calais in E. Curry, A. Freitas, and S. O’Riáin, “The Role of Community-Driven Data Curation for Enterprises,” in Linking Enterprise Data, D. Wood, Ed. Boston, MA: Springer US, 2010, pp. 25–47.