Signalling of Coherence Relations in Discourse

Date created: 
Coherence relations
Signals of relations
Discourse markers
Corpus annotation
RST Discourse Treebank

In this dissertation, we examine how coherence relations (relations between propositions, such as Concession and Purpose) are signalled in discourse, and what signals are used to indicate them. We also explore the relationship between coherence relations and their signals, and examine whether every relation is signalled. Traditionally, coherence relations are considered to be signalled only by discourse markers, or DMs (but, however, therefore, etc.), and relations are classified as explicit or implicit according to whether a DM is present or absent (Renkema, 2004). Two observations, that relations without DMs are omnipresent in discourse (Taboada, 2009) and that relations are interpreted even in the absence of DMs (Sanders & Noordman, 2000), lead us to hypothesize that the signalling of relations is achieved not only by DMs, but also by several other textual signals. We also hypothesize that every relation in a discourse is signalled, on the assumption that a signal is necessary for correct interpretation. We conduct a corpus study using the RST Discourse Treebank (Carlson et al., 2002), a collection of 385 Wall Street Journal articles annotated for rhetorical (coherence) relations. We examine each relation in the corpus (20,123 relations in total), identify the relevant signals for each relation, and add a new layer of annotation that records this signalling information. Results from our corpus study show that the majority of relations (over 90%) in a discourse are signalled (sometimes by multiple signals), and that the majority of signalled relations (over 80%) are indicated by signals other than DMs, such as lexical, semantic, syntactic and graphical features. The results also show that signalling varies quantitatively and qualitatively across individual relations. These findings reinforce the psychological claim that there exist signals for every interpretable relation. They also suggest that the category of explicit relations needs to be expanded to include relations that are signalled by any textual feature. The annotated corpus with signalling information can be used in psycholinguistic studies to determine how readers or hearers use signals to identify relations. It can also be used to develop discourse parsing systems that automatically categorize coherence relations.

Document type: 
Copyright remains with the author. The author granted permission for the file to be printed and for the text to be copied and pasted.
Dr. Paul McFetridge
Dr. Maite Taboada
Arts & Social Sciences: Department of Linguistics
Thesis type: 
(Thesis) Ph.D.