2024
Adapting Abstract Meaning Representation Parsing to the Clinical Narrative – the SPRING THYME parser
Jon Cai | Kristin Wright-Bettner | Martha Palmer | Guergana Savova | James Martin
Proceedings of the 6th Clinical Natural Language Processing Workshop
This paper is dedicated to the design and evaluation of the first AMR parser tailored for clinical notes. Our objective was to facilitate the precise transformation of the clinical notes into structured AMR expressions, thereby enhancing the interpretability and usability of clinical text data at scale. Leveraging the colon cancer dataset from the Temporal Histories of Your Medical Events (THYME) corpus, we adapted a state-of-the-art AMR parser utilizing continuous training. Our approach incorporates data augmentation techniques to enhance the accuracy of AMR structure predictions. Notably, through this learning strategy, our parser achieved an impressive F1 score of 88% on the THYME corpus’s colon cancer dataset. Moreover, our research delved into the efficacy of data required for domain adaptation within the realm of clinical notes, presenting domain adaptation data requirements for AMR parsing. This exploration not only underscores the parser’s robust performance but also highlights its potential in facilitating a deeper understanding of clinical narratives through structured semantic representations.
Linear Cross-document Event Coreference Resolution with X-AMR
Shafiuddin Rehan Ahmed | George Arthur Baker | Evi Judge | Michael Reagan | Kristin Wright-Bettner | Martha Palmer | James H. Martin
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Event Coreference Resolution (ECR) as a pairwise mention classification task is expensive both for automated systems and manual annotations. The task’s quadratic difficulty is exacerbated when using Large Language Models (LLMs), making prompt engineering for ECR prohibitively costly. In this work, we propose a graphical representation of events, X-AMR, anchored around individual mentions using a cross-document version of Abstract Meaning Representation. We then linearize the ECR with a novel multi-hop coreference algorithm over the event graphs. The event graphs simplify ECR, making it a) LLM cost-effective, b) compositional and interpretable, and c) easily annotated. For a fair assessment, we first enrich an existing ECR benchmark dataset with these event graphs using an annotator-friendly tool we introduce. Then, we employ GPT-4, the newest LLM by OpenAI, for these annotations. Finally, using the ECR algorithm, we assess GPT-4 against humans and analyze its limitations. Through this research, we aim to advance the state-of-the-art for efficient ECR and shed light on the potential shortcomings of current LLMs at this task. Code and annotations:
https://github.com/ahmeshaf/gpt_coref2023
2023
UMR-Writer 2.0: Incorporating a New Keyboard Interface and Workflow into UMR-Writer
Sijia Ge | Jin Zhao | Kristin Wright-Bettner | Skatje Myers | Nianwen Xue | Martha Palmer
Proceedings of the 17th Linguistic Annotation Workshop (LAW-XVII)
UMR-Writer is a web-based tool for annotating semantic graphs with the Uniform Meaning Representation (UMR) scheme. UMR is a graph-based semantic representation that can be applied cross-linguistically for deep semantic analysis of texts. In this work, we implemented a new keyboard interface in UMR-Writer 2.0, which is a powerful addition to the original mouse interface, supporting faster annotation for more experienced annotators. The new interface also addresses issues with the original mouse interface. Additionally, we demonstrate an efficient workflow for annotation project management in UMR-Writer 2.0, which has been applied to many projects.
Textual Entailment for Temporal Dependency Graph Parsing
Jiarui Yao | Steven Bethard | Kristin Wright-Bettner | Eli Goldner | David Harris | Guergana Savova
Proceedings of the 5th Clinical Natural Language Processing Workshop
We explore temporal dependency graph (TDG) parsing in the clinical domain. We leverage existing annotations on the THYME dataset to semi-automatically construct a TDG corpus. Then we propose a new natural language inference (NLI) approach to TDG parsing, and evaluate it both on general domain TDGs from wikinews and the newly constructed clinical TDG corpus. We achieve competitive performance on general domain TDGs with a much simpler model than prior work. On the clinical TDGs, our method establishes the first result of TDG parsing on clinical data with 0.79/0.88 micro/macro F1.
CAMRA: Copilot for AMR Annotation
Jon Cai | Shafiuddin Rehan Ahmed | Julia Bonn | Kristin Wright-Bettner | Martha Palmer | James H. Martin
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
In this paper, we introduce CAMRA (Copilot for AMR Annotations), a cutting-edge web-based tool designed for constructing Abstract Meaning Representation (AMR) from natural language text. CAMRA offers a novel approach to deep lexical semantics annotation such as AMR, treating AMR annotation akin to coding in programming languages. Leveraging the familiarity of programming paradigms, CAMRA encompasses all essential features of existing AMR editors, including example lookup, while going a step further by integrating PropBank roleset lookup as an autocomplete feature within the tool. Notably, CAMRA incorporates AMR parser models as coding co-pilots, greatly enhancing the efficiency and accuracy of AMR annotators.
2022
PropBank Comes of Age—Larger, Smarter, and more Diverse
Sameer Pradhan | Julia Bonn | Skatje Myers | Kathryn Conger | Tim O’Gorman | James Gung | Kristin Wright-Bettner | Martha Palmer
Proceedings of the 11th Joint Conference on Lexical and Computational Semantics
This paper describes the evolution of the PropBank approach to semantic role labeling over the last two decades. During this time the PropBank frame files have been expanded to include non-verbal predicates such as adjectives, prepositions and multi-word expressions. The number of domains, genres and languages that have been PropBanked has also expanded greatly, creating an opportunity for much more challenging and robust testing of the generalization capabilities of PropBank semantic role labeling systems. We also describe the substantial effort that has gone into ensuring the consistency and reliability of the various annotated datasets and resources, to better support the training and evaluation of such systems.
2020
Defining and Learning Refined Temporal Relations in the Clinical Narrative
Kristin Wright-Bettner | Chen Lin | Timothy Miller | Steven Bethard | Dmitriy Dligach | Martha Palmer | James H. Martin | Guergana Savova
Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis
We present refinements over existing temporal relation annotations in the Electronic Medical Record clinical narrative. We refined the THYME corpus annotations to more faithfully represent nuanced temporality and nuanced temporal-coreferential relations. The main contributions are in re-defining CONTAINS and OVERLAP relations into CONTAINS, CONTAINS-SUBEVENT, OVERLAP and NOTED-ON. We demonstrate that these refinements lead to substantial gains in learnability for state-of-the-art transformer models as compared to previously reported results on the original THYME corpus. We thus establish a baseline for the automatic extraction of these refined temporal relations. Although our study is done on clinical narrative, we believe it addresses far-reaching challenges that are corpus- and domain-agnostic.
Spatial AMR: Expanded Spatial Annotation in the Context of a Grounded Minecraft Corpus
Julia Bonn | Martha Palmer | Zheng Cai | Kristin Wright-Bettner
Proceedings of the Twelfth Language Resources and Evaluation Conference
This paper presents an expansion to the Abstract Meaning Representation (AMR) annotation schema that captures fine-grained semantically and pragmatically derived spatial information in grounded corpora. We describe a new lexical category conceptualization and set of spatial annotation tools built in the context of a multimodal corpus consisting of 170 3D structure-building dialogues between a human architect and human builder in Minecraft. Minecraft provides a particularly beneficial spatial relation-elicitation environment because it automatically tracks locations and orientations of objects and avatars in the space according to an absolute Cartesian coordinate system. Through a two-step process of sentence-level and document-level annotation designed to capture implicit information, we leverage these coordinates and bearings in the AMRs in combination with spatial framework annotation to ground the spatial language in the dialogues to absolute space.
2019
Cross-document coreference: An approach to capturing coreference without context
Kristin Wright-Bettner | Martha Palmer | Guergana Savova | Piet de Groen | Timothy Miller
Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019)
This paper discusses a cross-document coreference annotation schema that was developed to further automatic extraction of timelines in the clinical domain. Lexical senses and coreference choices are determined largely by context, but cross-document work requires reasoning across contexts that are not necessarily coherent. We found that an annotation approach that relies less on context-guided annotator intuitions and more on schematic rules was most effective in creating meaningful and consistent cross-document relations.
2016
Richer Event Description: Integrating event coreference with temporal, causal and bridging annotation
Tim O’Gorman | Kristin Wright-Bettner | Martha Palmer
Proceedings of the 2nd Workshop on Computing News Storylines (CNS 2016)