Mary Clare Shen


2019

Harvey Mudd College at SemEval-2019 Task 4: The D.X. Beaumont Hyperpartisan News Detector
Evan Amason | Jake Palanker | Mary Clare Shen | Julie Medero
Proceedings of the 13th International Workshop on Semantic Evaluation

We use the 600 hand-labeled articles from SemEval-2019 Task 4 to hand-tune a classifier with 3000 features for the Hyperpartisan News Detection task. Our final system feeds features based on bag-of-words (BoW), analysis of the article title, language complexity, and simple sentiment analysis into a naive Bayes classifier. We trained the final system on the 600,000 articles labeled by publisher; it reaches an accuracy of 0.653 on the hand-labeled test set. The most effective features are the Automated Readability Index and the presence of certain words in the title. This suggests that hyperpartisan writing uses a distinct writing style, especially in the title.
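
For readers unfamiliar with the Automated Readability Index (ARI) named in the abstract, the following is a minimal sketch of the standard formula, ARI = 4.71 × (characters / words) + 0.5 × (words / sentences) − 21.43. The function name `automated_readability_index` and the regex-based word and sentence splitting are assumptions made for illustration; the abstract does not say how the authors tokenized text or computed the score.

```python
import re


def automated_readability_index(text: str) -> float:
    """Compute the standard ARI score for a piece of text.

    The tokenization here is a simplifying assumption, not the
    paper's method: sentences end at runs of '.', '!' or '?', and
    words are alphanumeric runs with an optional apostrophe suffix.
    """
    # Split into sentences and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Extract word tokens such as "cat", "2019", or "don't".
    words = re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        # Convention chosen for this sketch: score empty input as 0.0.
        return 0.0
    # Count only letters and digits as characters.
    characters = sum(len(re.sub(r"[^A-Za-z0-9]", "", w)) for w in words)
    return (
        4.71 * characters / len(words)
        + 0.5 * len(words) / len(sentences)
        - 21.43
    )


if __name__ == "__main__":
    print(automated_readability_index("The cat sat. The dog ran."))
```

Higher scores indicate text that is harder to read. A score like this could be computed per article and passed to a classifier as one numeric feature alongside the others the abstract lists.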