Silvia Pareti


2024

Bias Bluff Busters at FIGNEWS 2024 Shared Task: Developing Guidelines to Make Bias Conscious
Jasmin Heierli | Silvia Pareti | Serena Pareti | Tatiana Lando
Proceedings of The Second Arabic Natural Language Processing Conference

This paper details our participation in the FIGNEWS-2024 shared task on bias and propaganda annotation in Gaza conflict news. Our objectives were to develop robust guidelines and annotate a substantial dataset to enhance bias detection. We iteratively refined our guidelines and used examples for clarity. Key findings include the challenges in achieving high inter-annotator agreement and the importance of annotator awareness of their own biases. We also explored the integration of ChatGPT as an annotator to support consistency. This paper contributes to the field by providing detailed annotation guidelines and offering insights into the subjectivity of bias annotation.

2023

Resolving Indirect Referring Expressions for Entity Selection
Mohammad Javad Hosseini | Filip Radlinski | Silvia Pareti | Annie Louis
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Recent advances in language modeling have enabled new conversational systems. In particular, it is often desirable for people to make choices among specified options when using such systems. We address the problem of reference resolution, when people use natural expressions to choose between real world entities. For example, given the choice ‘Should we make a Simnel cake or a Pandan cake?’, a natural response from a non-expert may be indirect: ‘let’s make the green one’. Reference resolution has been little studied with natural expressions, thus robustly understanding such language has large potential for improving naturalness in dialog, recommendation, and search systems. We create AltEntities (Alternative Entities), a new public dataset of entity pairs and utterances, and develop models for the disambiguation problem. Consisting of 42K indirect referring expressions across three domains, it enables for the first time the study of how large language models can be adapted to this task. We find they achieve 82%-87% accuracy in realistic settings, which while reasonable also invites further advances.

2018

Dialog Intent Structure: A Hierarchical Schema of Linked Dialog Acts
Silvia Pareti | Tatiana Lando
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

Annotating Topic Development in Information Seeking Queries
Marta Andersson | Adnan Öztürel | Silvia Pareti
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper contributes to the limited body of empirical research in the domain of discourse structure of information seeking queries. We describe the development of an annotation schema for coding topic development in information seeking queries and the initial observations from a pilot sample of query sessions. The main idea that we explore is the relationship between constant and variable discourse entities and their role in tracking changes in the topic progression. We argue that the topicalized entities remain stable across the development of the discourse and can be identified by a simple mechanism where anaphora resolution is a precursor. We also claim that a corpus annotated in this framework can be used as training data for dialogue management and computational semantics systems.

PARC 3.0: A Corpus of Attribution Relations
Silvia Pareti
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Quotation and opinion extraction, discourse and factuality have all partly addressed the annotation and identification of Attribution Relations. However, disjoint efforts have provided a partial and partly inaccurate picture of attribution and generated small or incomplete resources, thus limiting the applicability of machine learning approaches. This paper presents PARC 3.0, a large corpus fully annotated with Attribution Relations (ARs). The annotation scheme was tested with an inter-annotator agreement study showing satisfactory results for the identification of ARs and high agreement on the selection of the text spans corresponding to its constitutive elements: source, cue and content. The corpus, which comprises around 20k ARs, was used to investigate the range of structures that can express attribution. The results show a complex and varied relation of which the literature has addressed only a portion. PARC 3.0 is available for research use and can be used in a range of different studies to analyse attribution and validate assumptions as well as to develop supervised attribution extraction models.

2015

Annotating Attribution Relations across Languages and Genres
Silvia Pareti
Proceedings of the 11th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-11)

2013

Automatically Detecting and Attributing Indirect Quotations
Silvia Pareti | Tim O’Keefe | Ioannis Konstas | James R. Curran | Irena Koprinska
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2012

A Sequence Labelling Approach to Quote Attribution
Timothy O’Keefe | Silvia Pareti | James R. Curran | Irena Koprinska | Matthew Honnibal
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

A Database of Attribution Relations
Silvia Pareti
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The importance of attribution is becoming evident due to its relevance, in particular for Opinion Analysis and Information Extraction applications. Attribution makes it possible to identify different perspectives on a given topic or retrieve the statements of a specific source of interest, but also to select more relevant and reliable information. However, because the resources available to date for attribution studies are scarce and partial, only a portion of attribution structures has been identified and addressed. This paper presents the collection and further annotation of a database of over 9800 attribution relations from the Penn Discourse TreeBank (PDTB). The aim is to build a large and complete resource that fills a key gap in the field and enables the training and testing of robust attribution extraction systems.

2010

Annotating Attribution Relations: Towards an Italian Discourse Treebank
Silvia Pareti | Irina Prodanof
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In this paper we describe the development of a schema for the annotation of attribution relations and present the first findings and some relevant issues concerning this phenomenon. Following the D-LTAG approach to discourse, we have developed a lexically anchored description of attribution, considering this relation, contrary to the approach in the PDTB, independently from other discourse relations. This approach has allowed us to deal with the phenomenon in a broader perspective than previous studies, thereby reaching a more accurate description of it and making it possible to raise some still unaddressed issues. Following this analysis, we propose an annotation schema and discuss the first results concerning its applicability. The schema has been applied to a pilot portion of the ISST corpus of Italian and represents the initial phase of a project aiming at the creation of an Italian Discourse Treebank. We believe this work will raise some awareness concerning the fundamental importance of attribution relations. The identification of the source has in fact strong implications for the attributed material. Moreover, it will make overt the complexity of a long-underestimated phenomenon.