Aminul Islam

Also published as: Md. Aminul Islam


2016

Non-uniform Language Detection in Technical Writing
Weibo Wang | Abidalrahman Moh’d | Aminul Islam | Axel Soto | Evangelos Milios
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Reddit Temporal N-gram Corpus and its Applications on Paraphrase and Semantic Similarity in Social Media using a Topic-based Latent Semantic Analysis
Anh Dang | Abidalrahman Moh’d | Aminul Islam | Rosane Minghim | Michael Smit | Evangelos Milios
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

This paper introduces a new large-scale n-gram corpus that is created specifically from social media text. Two distinguishing characteristics of this corpus are its monthly temporal attribute and that it is created from 1.65 billion comments of user-generated text in Reddit. The usefulness of this corpus is exemplified and evaluated by a novel Topic-based Latent Semantic Analysis (TLSA) algorithm. The experimental results show that unsupervised TLSA outperforms all the state-of-the-art unsupervised and semi-supervised methods in SEMEVAL 2015: paraphrase and semantic similarity in Twitter tasks.

DalGTM at SemEval-2016 Task 1: Importance-Aware Compositional Approach to Short Text Similarity
Jie Mei | Aminul Islam | Evangelos Milios
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

TrWP: Text Relatedness using Word and Phrase Relatedness
Md Rashadul Hasan Rakib | Aminul Islam | Evangelos Milios
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2012

Comparing Word Relatedness Measures Based on Google n-grams
Aminul Islam | Evangelos Milios | Vlado Keselj
Proceedings of COLING 2012: Posters

2009

Real-Word Spelling Correction using Google Web 1T 3-grams
Aminul Islam | Diana Inkpen
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

2006

Second Order Co-occurrence PMI for Determining the Semantic Similarity of Words
Md. Aminul Islam | Diana Inkpen
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper presents a new corpus-based method for calculating the semantic similarity of two target words. Our method, called Second Order Co-occurrence PMI (SOC-PMI), uses Pointwise Mutual Information to sort lists of important neighbor words of the two target words. Then we consider the words which are common in both lists and aggregate their PMI values (from the opposite list) to calculate the relative semantic similarity. Our method was empirically evaluated using Miller and Charles’ (1991) 30 noun pair subset, Rubenstein and Goodenough’s (1965) 65 noun pairs, 80 synonym test questions from the Test of English as a Foreign Language (TOEFL), and 50 synonym test questions from a collection of English as a Second Language (ESL) tests. Evaluation results show that our method outperforms several competing corpus-based methods.
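
The idea described in the abstract above (rank each target word's neighbors by PMI, then aggregate the PMI values of the shared neighbors taken from the opposite list) can be illustrated with a rough Python sketch. This is not the paper's exact formulation: the function names, the toy corpus, the symmetric co-occurrence window, the single list size `beta` for both words, the exponent `gamma`, and the final normalization are all assumptions made for illustration.

```python
import math
from collections import Counter


def cooccurrence_counts(tokens, window):
    """Count word frequencies and unordered co-occurrences within `window` tokens."""
    word_counts = Counter(tokens)
    pair_counts = Counter()
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != w:
                pair_counts[frozenset((w, tokens[j]))] += 1
    return word_counts, pair_counts


def pmi(a, b, word_counts, pair_counts, total):
    """Pointwise mutual information (base 2); 0.0 if the words never co-occur."""
    joint = pair_counts.get(frozenset((a, b)), 0)
    if joint == 0:
        return 0.0
    return math.log2(joint * total / (word_counts[a] * word_counts[b]))


def top_neighbors(target, word_counts, pair_counts, total, beta):
    """Return the `beta` neighbors with the highest positive PMI as {word: pmi}."""
    scores = {}
    for w in word_counts:
        if w == target:
            continue
        score = pmi(target, w, word_counts, pair_counts, total)
        if score > 0:
            scores[w] = score
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:beta])


def soc_pmi_similarity(w1, w2, tokens, window=2, beta=5, gamma=1.0):
    """Simplified second-order similarity (assumed normalization by beta)."""
    word_counts, pair_counts = cooccurrence_counts(tokens, window)
    total = len(tokens)
    n1 = top_neighbors(w1, word_counts, pair_counts, total, beta)
    n2 = top_neighbors(w2, word_counts, pair_counts, total, beta)
    shared = set(n1) & set(n2)
    # For each target word, sum the shared neighbors' PMI values
    # as they appear in the *other* word's list.
    f1 = sum(n2[x] ** gamma for x in shared)
    f2 = sum(n1[x] ** gamma for x in shared)
    return f1 / beta + f2 / beta


# Hypothetical toy corpus, for illustration only.
toy_tokens = "car engine automobile engine banana peel".split()
print(soc_pmi_similarity("car", "automobile", toy_tokens, window=1))
print(soc_pmi_similarity("car", "peel", toy_tokens, window=1))
```

In this toy corpus, "car" and "automobile" share the neighbor "engine" and so receive a positive score, while "car" and "peel" share no neighbors. A realistic setup would need a large corpus and tuned list sizes.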