Krasimir Angelov


2020

Abstract Syntax as Interlingua: Scaling Up the Grammatical Framework from Controlled Languages to Robust Pipelines
Aarne Ranta | Krasimir Angelov | Normunds Gruzitis | Prasanth Kolachina
Computational Linguistics, Volume 46, Issue 2 - June 2020

Abstract syntax is an interlingual representation used in compilers. Grammatical Framework (GF) applies the abstract syntax idea to natural languages. The development of GF started in 1998, first as a tool for controlled language implementations, where it has gained an established position in both academic and commercial projects. GF provides grammar resources for over 40 languages, enabling accurate generation and translation, as well as grammar engineering tools and components for mobile and Web applications. On the research side, the focus in the last ten years has been on scaling up GF to wide-coverage language processing. The concept of abstract syntax offers a unified view on many other approaches: Universal Dependencies, WordNets, FrameNets, Construction Grammars, and Abstract Meaning Representations. This makes it possible for GF to utilize data from the other approaches and to build robust pipelines. In return, GF can contribute to data-driven approaches by methods to transfer resources from one language to others, to augment data by rule-based generation, to check the consistency of hand-annotated corpora, and to pipe analyses into high-precision semantic back ends. This article gives an overview of the use of abstract syntax as interlingua through both established and emerging NLP applications involving GF.

A Parallel WordNet for English, Swedish and Bulgarian
Krasimir Angelov
Proceedings of the Twelfth Language Resources and Evaluation Conference

We present the parallel creation of a WordNet resource for Swedish and Bulgarian which is tightly aligned with the Princeton WordNet. The alignment is not only at the synset level but also at the word level, by matching words with their closest translations in each language. We argue that the tighter alignment is essential in machine translation and natural language generation. About one-fifth of the lexical entries are also linked to the corresponding Wikipedia articles. In addition to the traditional semantic relations in WordNet, we also integrate morphological and morpho-syntactic information. The resource comes with a corpus where examples from Princeton WordNet are translated to Swedish and Bulgarian. The examples are aligned at the word and phrase level. The new resource is open-source, and in its development we used only existing open-source resources.

2016

Predicting Translation Equivalents in Linked WordNets
Krasimir Angelov | Gleb Lobanov
Proceedings of the Sixth Workshop on Hybrid Approaches to Translation (HyTra6)

We present an algorithm for predicting translation equivalents between two languages, based on the corresponding WordNets. The assumption is that all synsets of one of the languages are linked to the corresponding synsets in the other language. In theory, given the exact sense of a word in a context, it must be possible to translate it as any of the words in the linked synset. In practice, however, this does not work well, since automatic and accurate sense disambiguation is difficult. Instead, it is possible to define a more robust translation relation between the lexemes of the two languages. As far as we know, the Finnish WordNet is the only one that includes that relation. Our algorithm can be used to predict the relation for other languages as well. This is useful, for instance, in hybrid machine translation systems, which are usually more dependent on high-quality translation dictionaries.

2015

Orthography Engineering in Grammatical Framework
Krasimir Angelov
Proceedings of the Grammar Engineering Across Frameworks (GEAF) 2015 Workshop

2014

Bootstrapping Open-Source English-Bulgarian Computational Dictionary
Krasimir Angelov
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We present an open-source English-Bulgarian dictionary which is a unification and consolidation of existing and freely available resources for the two languages. The new resource can be used either as a pair of monolingual morphological lexicons or as a bidirectional translation dictionary between the languages. The structure of the resource is compatible with the existing synchronous English-Bulgarian grammar in Grammatical Framework (GF). This makes it possible to immediately plug it in as a component in a grammar-based translation system that is currently under development in the same framework. This also meant that we had to enrich the dictionary with additional syntactic and semantic information that was missing in the original resources.

Fast Statistical Parsing with Parallel Multiple Context-Free Grammars
Krasimir Angelov | Peter Ljunglöf
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

Speech-Enabled Hybrid Multilingual Translation for Mobile Devices
Krasimir Angelov | Björn Bringert | Aarne Ranta
Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics

Developing an interlingual translation lexicon using WordNets and Grammatical Framework
Shafqat Mumtaz Virk | K.V.S. Prasad | Aarne Ranta | Krasimir Angelov
Proceedings of the Fifth Workshop on South and Southeast Asian Natural Language Processing

2010

Tools for Multilingual Grammar-Based Translation on the Web
Aarne Ranta | Krasimir Angelov | Thomas Hallgren
Proceedings of the ACL 2010 System Demonstrations

2009

Incremental Parsing with Parallel Multiple Context-Free Grammars
Krasimir Angelov
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

Grammatical Framework Web Service
Björn Bringert | Krasimir Angelov | Aarne Ranta
Proceedings of the Demonstrations Session at EACL 2009

Grammar Development in GF
Aarne Ranta | Krasimir Angelov | Björn Bringert
Proceedings of the Demonstrations Session at EACL 2009