Meladel Mistica


2024

To Aggregate or Not to Aggregate. That is the Question: A Case Study on Annotation Subjectivity in Span Prediction
Kemal Kurniawan | Meladel Mistica | Timothy Baldwin | Jey Han Lau
Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

This paper explores the task of automatic prediction of text spans in a legal problem description that support a legal area label. We use a corpus of problem descriptions written by laypeople in English that is annotated by practising lawyers. Inherent subjectivity exists in our task because legal area categorisation is a complex task, and lawyers often have different views on a problem. Experiments show that training on majority-voted spans outperforms training on disaggregated ones.

2021

Evaluating Document Coherence Modeling
Aili Shen | Meladel Mistica | Bahar Salehi | Hang Li | Timothy Baldwin | Jianzhong Qi
Transactions of the Association for Computational Linguistics, Volume 9

While pretrained language models (LMs) have driven impressive gains over morpho-syntactic and semantic tasks, their ability to model discourse and pragmatic phenomena is less clear. As a step towards a better understanding of their discourse modeling capabilities, we propose a sentence intrusion detection task. We examine the performance of a broad range of pretrained LMs on this detection task for English. Lacking a dataset for the task, we introduce INSteD, a novel intruder sentence detection dataset, containing 170,000+ documents constructed from English Wikipedia and CNN news articles. Our experiments show that pretrained LMs perform impressively in in-domain evaluation, but experience a substantial drop in the cross-domain setting, indicating limited generalization capacity. Further results over a novel linguistic probe dataset show that there is substantial room for improvement, especially in the cross-domain setting.

Semi-automatic Triage of Requests for Free Legal Assistance
Meladel Mistica | Jey Han Lau | Brayden Merrifield | Kate Fazio | Timothy Baldwin
Proceedings of the Natural Legal Language Processing Workshop 2021

Free legal assistance is critically under-resourced, and many of those who seek legal help have their needs unmet. A major bottleneck in the provision of free legal assistance to those most in need is the determination of the precise nature of the legal problem. This paper describes a collaboration with a major provider of free legal assistance, and the deployment of natural language processing models to assign area-of-law categories to real-world requests for legal assistance. In particular, we focus on an investigation of models to generate efficiencies in the triage process, but also the risks associated with naive use of model predictions, including fairness across different user demographics.

Automatic Resolution of Domain Name Disputes
Wayan Oger Vihikan | Meladel Mistica | Inbar Levy | Andrew Christie | Timothy Baldwin
Proceedings of the Natural Legal Language Processing Workshop 2021

We introduce the new task of domain name dispute resolution (DNDR), that predicts the outcome of a process for resolving disputes about legal entitlement to a domain name. The ICANN UDRP establishes a mandatory arbitration process for a dispute between a trade-mark owner and a domain name registrant pertaining to a generic Top-Level Domain (gTLD) name (one ending in .COM, .ORG, .NET, etc). The nature of the problem leads to a very skewed data set, which stems from being able to register a domain name with extreme ease, very little expense, and no need to prove an entitlement to it. In this paper, we describe the task and associated data set. We also present benchmarking results based on a range of models, which show that simple baselines are in general difficult to beat due to the skewed data distribution, but in the specific case of the respondent having submitted a response, a fine-tuned BERT model offers considerable improvements over a majority-class model.

2020

Proceedings of the 18th Annual Workshop of the Australasian Language Technology Association
Maria Kim | Daniel Beck | Meladel Mistica
Proceedings of the 18th Annual Workshop of the Australasian Language Technology Association

Information Extraction from Legal Documents: A Study in the Context of Common Law Court Judgements
Meladel Mistica | Geordie Z. Zhang | Hui Chia | Kabir Manandhar Shrestha | Rohit Kumar Gupta | Saket Khandelwal | Jeannie Paterson | Timothy Baldwin | Daniel Beck
Proceedings of the 18th Annual Workshop of the Australasian Language Technology Association

‘Common Law’ judicial systems follow the doctrine of precedent, which means the legal principles articulated in court judgements are binding in subsequent cases in lower courts. For this reason, lawyers must search prior judgements for the legal principles that are relevant to their case. The difficulty for those within the legal profession is that the information that they are looking for may be contained within a few paragraphs or sentences, but those few paragraphs may be buried within a hundred-page document. In this study, we create a schema based on the relevant information that legal professionals seek within judgements and perform text classification based on it, with the aim of not only assisting lawyers in researching cases, but eventually enabling large-scale analysis of legal judgements to find trends in court outcomes over time.

2019

Proceedings of the 17th Annual Workshop of the Australasian Language Technology Association
Meladel Mistica | Massimo Piccardi | Andrew MacKinlay
Proceedings of the 17th Annual Workshop of the Australasian Language Technology Association

2013

Unsupervised Word Class Induction for Under-resourced Languages: A Case Study on Indonesian
Meladel Mistica | Jey Han Lau | Timothy Baldwin
Proceedings of the Sixth International Joint Conference on Natural Language Processing

ParGramBank: The ParGram Parallel Treebank
Sebastian Sulger | Miriam Butt | Tracy Holloway King | Paul Meurer | Tibor Laczkó | György Rákosi | Cheikh Bamba Dione | Helge Dyvik | Victoria Rosén | Koenraad De Smedt | Agnieszka Patejuk | Özlem Çetinoğlu | I Wayan Arka | Meladel Mistica
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2011

Word classes in Indonesian: A linguistic reality or a convenient fallacy in natural language processing?
Meladel Mistica | Timothy Baldwin | I Wayan Arka
Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation

2009

Recognising the Predicate-argument Structure of Tagalog
Meladel Mistica | Timothy Baldwin
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

Double Double, Morphology and Trouble: Looking into Reduplication in Indonesian
Meladel Mistica | I Wayan Arka | Timothy Baldwin | Avery Andrews
Proceedings of the Australasian Language Technology Association Workshop 2009

2008

Applying Discourse Analysis and Data Mining Methods to Spoken OSCE Assessments
Meladel Mistica | Timothy Baldwin | Marisa Cordella | Simon Musgrave
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

2007

Extending Sense Collocations in Interpreting Noun Compounds
Su Nam Kim | Meladel Mistica | Timothy Baldwin
Proceedings of the Australasian Language Technology Workshop 2007