Nghia The Pham

Also published as: Nghia Pham


2020

A Large-Scale Multi-Document Summarization Dataset from the Wikipedia Current Events Portal
Demian Gholipour Ghalandari | Chris Hokamp | Nghia The Pham | John Glover | Georgiana Ifrim
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Multi-document summarization (MDS) aims to compress the content in large document collections into short summaries and has important applications in story clustering for newsfeeds, presentation of search results, and timeline generation. However, there is a lack of datasets that realistically address such use cases at a scale large enough for training supervised models for this task. This work presents a new dataset for MDS that is large both in the total number of document clusters and in the size of individual clusters. We build this dataset by leveraging the Wikipedia Current Events Portal (WCEP), which provides concise and neutral human-written summaries of news events, with links to external source articles. We also automatically extend these source articles by looking for related articles in the Common Crawl archive. We provide a quantitative analysis of the dataset and empirical results for several state-of-the-art MDS techniques.

2017

Living a discrete life in a continuous world: Reference in cross-modal entity tracking
Gemma Boleda | Sebastian Padó | Nghia The Pham | Marco Baroni
Proceedings of the 12th International Conference on Computational Semantics (IWCS) — Short papers

2016

The red one!: On learning to refer to things based on discriminative properties
Angeliki Lazaridou | Nghia The Pham | Marco Baroni
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2015

Combining Language and Vision with a Multimodal Skip-gram Model
Angeliki Lazaridou | Nghia The Pham | Marco Baroni
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Jointly optimizing word representations for lexical and sentential tasks with the C-PHRASE model
Nghia The Pham | Germán Kruszewski | Angeliki Lazaridou | Marco Baroni
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

A Multitask Objective to Inject Lexical Contrast into Distributional Semantics
Nghia The Pham | Angeliki Lazaridou | Marco Baroni
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

A practical and linguistically-motivated approach to compositional distributional semantics
Denis Paperno | Nghia The Pham | Marco Baroni
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Compositional Distributional Semantics Models in Chunk-based Smoothed Tree Kernels
Nghia The Pham | Lorenzo Ferrone | Fabio Massimo Zanzotto
Proceedings of the Third Joint Conference on Lexical and Computational Semantics (*SEM 2014)

2013

Sentence paraphrase detection: When determiners and word order make the difference
Nghia Pham | Raffaella Bernardi | Yao Zhong Zhang | Marco Baroni
Proceedings of the IWCS 2013 Workshop Towards a Formal Distributional Semantics

General estimation and evaluation of compositional distributional semantic models
Georgiana Dinu | Nghia The Pham | Marco Baroni
Proceedings of the Workshop on Continuous Vector Space Models and their Compositionality

DISSECT - DIStributional SEmantics Composition Toolkit
Georgiana Dinu | Nghia The Pham | Marco Baroni
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations