Miriam Kaeshammer


2015

Hierarchical Machine Translation With Discontinuous Phrases
Miriam Kaeshammer
Proceedings of the Tenth Workshop on Statistical Machine Translation

2014

Discosuite - A parser test suite for German discontinuous structures
Wolfgang Maier | Miriam Kaeshammer | Peter Baumann | Sandra Kübler
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Parser evaluation traditionally relies on evaluation metrics, such as PARSEVAL, which deliver a single aggregate score over all sentences in the parser output. However, for the evaluation of parser performance concerning a particular phenomenon, a test suite of sentences is needed in which this phenomenon has been identified. In recent years, the parsing of discontinuous structures has received increasing interest. Therefore, in this paper, we present a test suite for testing the performance of dependency and constituency parsers on non-projective dependencies and discontinuous constituents for German. The test suite is based on the newly released TIGER treebank version 2.2. It provides a unique opportunity to benchmark parsers on non-local syntactic relationships in German, for constituents and dependencies. We include a linguistic analysis of the phenomena that cause discontinuity in the TIGER annotation, thereby closing gaps in previous literature. The linguistic phenomena we investigate include extraposition, a placeholder/repeated element construction, topicalization, scrambling, local movement, parentheticals, and fronting of pronouns.

On Complex Word Alignment Configurations
Miriam Kaeshammer | Anika Westburg
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Resources of manual word alignments contain configurations that are beyond the alignment capacity of current translation models, hence the term complex alignment configuration. They have been a matter of some debate in the machine translation community, as they call for more powerful translation models that come with further complications. In this work, we investigate instances of complex alignment configurations in data sets of four different language pairs to shed more light on the nature and cause of those configurations. For the English-German alignments from Padó and Lapata (2006), for instance, we find that only a small fraction of the complex configurations are due to real annotation errors. While a third of the complex configurations in this data set could be simplified when annotating according to a different style guide, the remaining ones are phenomena that one would like to be able to generate during translation. Those instances are mainly caused by the different word order of English and German. Our findings thus motivate further research in the area of translation beyond phrase-based and context-free translation modeling.

2013

Synchronous Linear Context-Free Rewriting Systems for Machine Translation
Miriam Kaeshammer
Proceedings of the Seventh Workshop on Syntax, Semantics and Structure in Statistical Translation

2012

PLCFRS Parsing Revisited: Restricting the Fan-Out to Two
Wolfgang Maier | Miriam Kaeshammer | Laura Kallmeyer
Proceedings of the 11th International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+11)

German and English Treebanks and Lexica for Tree-Adjoining Grammars
Miriam Kaeshammer | Vera Demberg
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present a treebank and lexicon for German and English, which have been developed for PLTAG parsing. PLTAG is a psycholinguistically motivated, incremental version of tree-adjoining grammar (TAG). The resources are, however, also applicable to parsing with other variants of TAG. The German PLTAG resources are based on the TIGER corpus and, to the best of our knowledge, constitute the first scalable German TAG grammar. The English PLTAG resources go beyond existing resources in that they include the NP annotation by Vadas and Curran (2007) and the prediction lexicon necessary for PLTAG.

2011

Enriching Phrase-Based Statistical Machine Translation with POS Information
Miriam Kaeshammer | Dominikus Wetzel
Proceedings of the Second Student Research Workshop associated with RANLP 2011