Rhetorical structure theory
Rhetorical structure theory (RST) was originally formulated by William Mann and Sandra Thompson of the University of Southern California's Information Sciences Institute (ISI) in 1988. The theory was developed as part of studies of computer-based text generation. Natural language researchers later began using RST in text summarization and other applications. RST addresses text organization by means of relations that hold between parts of a text. It explains coherence by postulating a hierarchical, connected structure of texts.
In 2000, Daniel Marcu, also of ISI, demonstrated that practical discourse parsing and text summarization could also be achieved using RST. Marcu was named a fellow of the Association for Computational Linguistics in 2014 for his "significant contributions to discourse parsing, summarization, and machine translation and to kickstarting the statistical machine translation industry".
Rhetorical relations, also called coherence relations or discourse relations, are paratactic (coordinate) or hypotactic (subordinate) relations that hold across two or more text spans. It is widely accepted that the coherence of a text arises from relations of this kind. By using rhetorical relations, RST provides a systematic way for an analyst to analyse a text. An analysis is usually built by reading the text and constructing a tree using the relations. The following example is a title and summary, appearing at the top of an article in Scientific American magazine (Ramachandran and Anstis, 1986). The original text, broken into numbered units, is:
1. [Title:] The Perception of Apparent Motion
2. [Abstract:] When the motion of an intermittently seen object is ambiguous,
3. the visual system resolves confusion
4. by applying some tricks that reflect a built-in knowledge of properties of the physical world.
In the analysis of this text, the numbers 1, 2, 3 and 4 refer to the units listed above. Units 3 and 4 are linked by the relation "Means". Unit 3 is the essential part of this relation, so it is called the nucleus of the relation; unit 4, which states the method, is called the satellite. Similarly, unit 2 stands in a "Condition" relation to the span formed by units 3 and 4. Every unit is also a span, and a span may be composed of more than one unit.
Nuclearity in discourse
RST distinguishes two types of units. Nuclei are considered the most important parts of a text, whereas satellites contribute to the nuclei and are secondary. The nucleus contains basic information, and the satellite contains additional information about the nucleus. The satellite is often incomprehensible without the nucleus, whereas a text from which the satellites have been deleted can still be understood to a certain extent.
Hierarchy in the analysis
RST relations are applied recursively in a text until all units in that text are constituents of an RST relation. As a result, RST structures are typically represented as trees, with one top-level relation that encompasses other relations at lower levels.
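The recursive structure described above can be sketched in Python. This is a minimal illustration under stated assumptions, not an established RST toolkit: the class names `Unit` and `Relation` and the helpers `covered_units` and `nuclear_units` are hypothetical, only relations with one nucleus and one satellite are modelled, and the title (unit 1) of the Scientific American example is left out. The nuclearity of "Means" follows its usual definition, in which the satellite presents the method.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Unit:
    """A minimal text unit (a leaf of the tree)."""
    number: int
    text: str


@dataclass
class Relation:
    """A relation joining a nucleus span and a satellite span (hypothetical model)."""
    name: str
    nucleus: "Node"
    satellite: "Node"


Node = Union[Unit, Relation]


def covered_units(node: Node) -> List[Unit]:
    """Return every unit under this node, in text order."""
    if isinstance(node, Unit):
        return [node]
    found = covered_units(node.nucleus) + covered_units(node.satellite)
    return sorted(found, key=lambda u: u.number)


def nuclear_units(node: Node) -> List[Unit]:
    """Drop all satellites and keep only the units reached through nuclei."""
    if isinstance(node, Unit):
        return [node]
    return nuclear_units(node.nucleus)


u2 = Unit(2, "When the motion of an intermittently seen object is ambiguous,")
u3 = Unit(3, "the visual system resolves confusion")
u4 = Unit(4, "by applying some tricks that reflect a built-in knowledge "
             "of properties of the physical world.")

# Assumed nuclearity: unit 3 is the nucleus of Means, unit 4 its satellite.
means = Relation("Means", nucleus=u3, satellite=u4)
# The span formed by units 3 and 4 is the nucleus of Condition.
condition = Relation("Condition", nucleus=means, satellite=u2)

print([u.number for u in covered_units(condition)])
print(" ".join(u.text for u in nuclear_units(condition)))
```

Here the top-level relation covers units 2, 3 and 4, and following only the nuclei leaves unit 3, which remains understandable on its own.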
RST is of interest for several reasons:
- From a linguistic point of view, RST proposes a different view of text organization than most linguistic theories.
- RST points to a tight connection between relations and coherence in text.
- From a computational point of view, it provides a characterization of text relations that has been implemented in different systems and for applications such as text generation and summarization.
In design rationale
Computer scientists Ana Cristina Bicharra Garcia and Clarisse Sieckenius de Souza have used RST as the basis of a design rationale system called ADD+. In ADD+, RST is used as the basis for the rhetorical organization of a knowledge base, in a way comparable to other knowledge representation systems such as the issue-based information system (IBIS). Similarly, RST has been used in representation schemes for argumentation.
References
- Mann, William C.; Thompson, Sandra A. (1988). "Rhetorical structure theory: toward a functional theory of text organization" (PDF). Text: Interdisciplinary Journal for the Study of Discourse. 8 (3): 243–281. doi:10.1515/text.1.1988.8.3.243. Retrieved 1 November 2017.
- Matthiessen, Christian M. I. M. (June 2005). "Remembering Bill Mann". Computational Linguistics. 31 (2): 161–171. doi:10.1162/0891201054224002. Retrieved 1 November 2017.
- Taboada, Maite; Mann, William C. (June 2006). "Rhetorical structure theory: looking back and moving ahead" (PDF). Discourse Studies. 8 (3): 423–459. CiteSeerX 10.1.1.216.381. doi:10.1177/1461445606061881.
- Marcu, Daniel (2000). The theory and practice of discourse parsing and summarization. Cambridge, Mass.: MIT Press. ISBN 978-0262133722. OCLC 43811223.
- Carlson, Lynn; Marcu, Daniel; Okurowski, Mary Ellen (2003). "Building a discourse-tagged corpus in the framework of rhetorical structure theory" (PDF). In Kuppevelt, Jan van; Smith, Ronnie W. Current and new directions in discourse and dialogue. Text, speech, and language technology. 22. Dordrecht; Boston: Kluwer Academic Publishers. pp. 85–112. doi:10.1007/978-94-010-0019-2_5. ISBN 978-1402016141. OCLC 53097055.
- "Timeline". isi.edu. Information Sciences Institute. Retrieved 1 November 2017.
- "ACL Fellows". aclweb.org. Retrieved 1 November 2017.
- Taboada, Maite (2009). "Implicit and explicit coherence relations" (PDF). In Renkema, Jan. Discourse, of course: an overview of research in discourse studies. Amsterdam; Philadelphia: John Benjamins Publishing Company. pp. 127–140. doi:10.1075/z.148.13tab. ISBN 9789027232588. OCLC 276996573.
- "RST and text generation". ccl.pku.edu.cn. Retrieved 1 November 2017.
- Uzêda, Vinícius Rodrigues; Pardo, Thiago Alexandre Salgueiro; Nunes, Maria das Graças Volpe (November 2008). "Evaluation of automatic text summarization methods based on rhetorical structure theory" (PDF). Eighth International Conference on Intelligent Systems Design and Applications: Kaohsiung, Taiwan, 26–28 November 2008. ISDA'08. 2. Piscataway, NJ: IEEE. pp. 389–394. doi:10.1109/ISDA.2008.289. ISBN 978-0-7695-3382-7. Retrieved 1 November 2017.
- Garcia, Ana Cristina Bicharra; Souza, Clarisse Sieckenius de (April 1997). "ADD+: including rhetorical structures in active documents" (PDF). AI EDAM: Artificial Intelligence for Engineering Design, Analysis and Manufacturing. 11 (2): 109–124. doi:10.1017/S0890060400001906.
- Regli, William C.; Hu, Xiaochun; Atwood, Michael; Sun, Wei (December 2000). "A survey of design rationale systems: approaches, representation, capture and retrieval" (PDF). Engineering with Computers. 16 (3–4): 209–235. doi:10.1007/PL00013715.
- Green, Nancy L. (August 2009). "Representation of argumentation in text with rhetorical structure theory". Argumentation. 24 (2): 181–196. doi:10.1007/s10503-009-9169-4.
- Green, Nancy L. (November 2015). "Annotating evidence-based argumentation in biomedical text". 2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Washington, DC, USA, 9–12 November 2015. Piscataway, NJ: IEEE. pp. 922–929. doi:10.1109/BIBM.2015.7359807. ISBN 978-1-4673-6799-8. OCLC 972619754.
- Mitrović, Jelena; O'Reilly, Cliff; Mladenović, Miljana; Handschuh, Siegfried (January 2017). "Ontological representations of rhetorical figures for argument mining". Argument & Computation. 8 (3): 267–287. doi:10.3233/AAC-170027.