Link grammar
Link grammar (LG) is a theory of syntax by Davy Temperley and Daniel Sleator which builds relations between pairs of words, rather than constructing constituents in a tree-like hierarchy. There are two basic parameters: directionality and distance. Link grammar is similar to dependency grammar, but dependency grammar includes a head-dependent relationship, as well as lacking directionality in the relations between words. Colored Multiplanar Link Grammar (CMLG) is an extension of LG allowing crossing relations between pairs of words[1]
For example, in a subject–verb–object language like English, the verb would look left to form a subject link, and right to form an object link. Nouns would look right to complete the subject link, or left to complete the object link.
In a subject–object–verb language like Persian, the verb would look left to form an object link, and a more distant left to form a subject link. Nouns would look to the right for both subject and object links.
Contents |
[edit] Syntax
Rightward links are represented as a +, and leftward links with a -. Optional links are contained in curly brackets {...}. Undesirable links are contained in any number of square brackets [...]. Multiple links are joined either by a conjunction & or a disjunction or. Each rule ends with a semicolon ;.
[edit] Examples
[edit] Example 1
A basic rule file for an SVO language might look like:
- <determiner>: D+;
- <noun-subject>: {D-} & S+;
- <noun-object>: {D-} & O-;
- <verb>: S- & {O+};
Thus the English sentence, “The boy painted a picture” would appear as:
+-----O-----+
+-D-+--S--+ +--D--+
| | | | |
The boy painted a picture
[edit] Example 2
While a rule file for a null subject SOV language might consist of the following links:
- <noun-subject>: S+;
- <noun-object>: O+;
- <verb>: {O-} & {S-};
And a simple Persian sentence, man nAn xordam (من نان خوردم) 'I ate bread' would look like:
+-----S-----+ | +--O--+ | | | man nAn xordam
[edit] Implementations
The Link grammar syntax parser is a library for natural language processing written in C. It is available under the BSD license, which is compatible with the GNU General Public License. The parser is an ongoing project, located here. Recent versions include improved sentence coverage, various bug and security fixes, and Java language bindings.
There are also Perl, Python, Ruby, Java, OCaml and .NET bindings available.[2]
The link-grammar program along with rules and word lists for English may be found in standard Linux distributions, e.g., as a Debian package.[3]
[edit] Applications
AbiWord, a free word processor, uses Link Grammar for on-the-fly grammar checking.[1] Words that cannot be linked anywhere are underlined in green.
The RelEx semantic relationship extractor, layered on top of the Link Grammar library, generates a dependency grammar output by making explicit the semantic relationships between words in a sentence. Its output can be classified as being at a level between that of SSyntR and DSyntR of Meaning-Text Theory. It also provides framing/grounding, anaphora resolution, head-word identification, lexical chunking, part-of-speech identification, and tagging, including entity, date, money, gender, etc. tagging. It includes a compatibility mode to generate dependency output compatible with the Stanford parser, and Penn TreeBank-compatible POS tagging.
Link Grammar has also been employed for information extraction of biomedical texts[4][5] and events described in news articles,[6] as well as experimental machine translation systems from English to German and Turkish.
The Link Grammar link dictionary is used to generate and verify the syntactic correctness of two different natural language generation systems: NLGen[7] and NLGen2.[8] It is also used as a part of the NLP pipeline in the OpenCog AI project.
[edit] Notes
- ^ .Anssi Yli-Jyrä and Matti Nykänen (2004). "A Hierarchy of Mildly Context-Sensitive Dependency Grammars". In G. P. Gerhard Jäger, Paola Monachesi and S. Wintner. Proceedings of the 9th conference on Formal Grammar 2004 "FGNancy". Pre-Proceedings.. pp. 151–165. http://www.ling.helsinki.fi/~aylijyra/dissertation/5.pdf.
- ^ (Perl) (Python) (Ruby) (OCaml) (.NET)
- ^ Debian - Package Search Results - link-grammar
- ^ Jing Ding, Daniel Berleant, Jun Xu, Andy W. Fulmer (November 2003). "Extracting biochemical interactions from MEDLINE using a link grammar parser". Tools with Artificial Intelligence, 2003. Proceedings. 15th IEEE International Conference on. pp. 467–471. ISBN 0-7695-2038-3. http://ifsc.ualr.edu/jdberleant/papers/LGPmanuscript8-8-03a.pdf.
- ^ Sampo Pyysalo, Tapio Salakoski, Sophie Aubin and Adeline Nazarenko, "Lexical Adaptation of Link Grammar to the Biomedical Sublanguage: a Comparative Evaluation of Three Approaches", BMC Bioinformatics 7(Suppl 3):S2 (2006).
- ^ Harsha V. Madhyastha, N. Balakrishnan, K. R. Ramakrishnan (2003). "Event Information Extraction Using Link Grammar". 13th International WorkShop on Research Issues in Data Engineering: Multi-lingual Information Management (RIDE'03). pp. 16. doi:10.1109/RIDE.2003.1249841.
- ^ Ruiting Lian, et al, "Sentence generation for artificial brains: a glocal similarity matching approach", Neurocomputing (Elsevier) (2009, submitted for publication).
- ^ Blake Lemoine, NLGen2: A Linguistically Plausible, General Purpose Natural Language Generation System (2009)
[edit] Further reading
- Schneider, Gerold (1998). A Linguistic Comparison Constituency, Dependency, and Link Grammar. Masters Thesis, University of Zurich. http://www.ifi.unizh.ch/cl/study/lizarbeiten/lizgerold.pdf. Retrieved 2007-12-26.
- Daniel Sleator & Davy Temperly (1993). "Parsing English with a Link Grammar". Third International Workshop on Parsing Technologies. http://www.cs.cmu.edu/afs/cs.cmu.edu/project/link/pub/www/papers/ps/LG-IWPT93.pdf.
- "A robust parsing algorithm for link grammars". Proceedings of the Fourth International Workshop on Parsing Technologies (Prague). September 1995. http://www.cs.cmu.edu/afs/cs.cmu.edu/project/link/pub/www/papers/ps/tr95-125.pdf.
[edit] External links
- The original Link Grammar homepage (which has been replaced by the current project.)
- Online English demonstration (for an older, out-of-date version; many bugs have been fixed since this version.)
- LinkGrammar-WN, lexicon expansion for the Link Grammar Parser (out of date, superseded by recent work that has been incorporated into the link-grammar parser.)
- BioLG, a modification of the Link Grammar Parser adapted for the biomedical domain (many, but not all, BioLG enhancements have been folded back into the main link-grammar distribution).