LOLITA is a natural language processing system developed by Durham University between 1986 and 2000. The name is an acronym for "Large-scale, Object-based, Linguistic Interactor, Translator and Analyzer".
LOLITA was developed by Roberto Garigliano and colleagues between 1986 and 2000. It was designed as a general-purpose tool for processing unrestricted text that could be the basis of a wide variety of applications. At its core was a semantic network containing some 90,000 interlinked concepts. Text could be parsed and analysed then incorporated into the semantic net, where it could be reasoned about (Long and Garigliano, 1993). Fragments of the semantic net could also be rendered back to English or Spanish.
Several applications were built using the system, including financial information analysers and information extraction tools for Darpa’s “Message Understanding Conference Competitions” (MUC-6 and MUC-7). The latter involved processing original Wall Street Journal articles, to perform tasks such as identifying key job changes in businesses and summarising articles. LOLITA was one of a small number of systems worldwide to compete in all sections of the tasks. A system description and an analysis of the MUC-6 results were written by Callaghan (Callaghan, 1998).
LOLITA was an early example of a substantial application written in a functional language: it consisted of around 50,000 lines of Haskell, with around 6000 lines of C. It is also a complex and demanding application, in which many aspects of Haskell were invaluable in development.
LOLITA was designed to handle unrestricted text, so that ambiguity at various levels was unavoidable and significant. Laziness was essential in handling the explosion of syntactic ambiguity resulting from a large grammar, and it was much used with semantic ambiguity too. The system used multiple "domain specific embedded languages" for semantic and pragmatic processing and for generation of natural language text from the semantic net. Also important was the ability to work with complex abstractions and to prototype new analysis algorithms quickly.
Later systems based on the same design include Concepts and SenseGraph.
- A History of Haskell: Being Lazy With Class section 11.5