An Analysis of Causal Language Constructions in Diverse Discourse Data

Angela Cao

Abstract

Creating datasets of manually annotated texts for relationships such as causality has been of interest to computational linguists. This thesis introduces the annotated Constructions of CAUSE, ENABLE, and PREVENT (CCEP) corpus, which contributes to the field by systematizing the nuanced CAUSE, ENABLE, and PREVENT roles and enabling annotation of a wide variety of causal construction types. The corpus uses constructions as the basic unit of causal language, following the linguistic paradigm of Construction Grammar (CxG) as realized in the surface construction labeling (SCL) approach. In this project, I adapt a pre-identified bank of causal connectives (the Constructicon) from Dunietz (2018), which are used as triggers for annotation instances. Based on the high inter-annotator agreement demonstrated in the corpus of 150 doubly-annotated documents annotated under the CCEP guidelines, I (1) support the psychological reality of the causal aspectualization of Wolff et al. (2005), since annotators reliably distinguish its categories, (2) build upon previous annotation work that aims to embed this model of causation, and (3) provide a high-quality dataset for understanding textual causality.

Department
Linguistics
Term
Spring 2022
Degree
BA
Honors
Highest Honor
Committee
Jinho D. Choi, Computer Science, Emory University (Chair)
Marjorie Pak, Linguistics, Emory University
Yun Kim, Linguistics, Emory University
David Zureick-Brown, Mathematics, Emory University