Workshop on Web as Corpus (2016)


Proceedings of the 10th Web as Corpus Workshop

Paul Cook | Stefan Evert | Roland Schäfer | Egon Stemle

Automatic Classification by Topic Domain for Meta Data Generation, Web Corpus Evaluation, and Corpus Comparison
Roland Schäfer | Felix Bildhauer

Efficient construction of metadata-enhanced web corpora
Adrien Barbaresi

Topically-focused Blog Corpora for Multiple Languages
Andrew Salway | Dag Elgesem | Knut Hofland | Øystein Reigem | Lubos Steskal

The Challenges and Joys of Analysing Ongoing Language Change in Web-based Corpora: a Case Study
Anne Krause

Using the Web and Social Media as Corpora for Monitoring the Spread of Neologisms. The case of ‘rapefugee’, ‘rapeugee’, and ‘rapugee’.
Quirin Würschinger | Mohammad Fazleh Elahi | Desislava Zhekova | Hans-Jörg Schmid

EmpiriST 2015: A Shared Task on the Automatic Linguistic Annotation of Computer-Mediated Communication and Web Corpora
Michael Beißwenger | Sabine Bartsch | Stefan Evert | Kay-Michael Würzner

SoMaJo: State-of-the-art tokenization for German web and social media texts
Thomas Proisl | Peter Uhrig

UdS-(retrain|distributional|surface): Improving POS Tagging for OOV Words in German CMC and Web Data
Jakob Prange | Andrea Horbach | Stefan Thater

Babler - Data Collection from the Web to Support Speech Recognition and Keyword Search
Gideon Mendels | Erica Cooper | Julia Hirschberg

A Global Analysis of Emoji Usage
Nikola Ljubešić | Darja Fišer

Genre classification for a corpus of academic webpages
Erika Dalan | Serge Sharoff

On Bias-free Crawling and Representative Web Corpora
Roland Schäfer

EmpiriST: AIPHES - Robust Tokenization and POS-Tagging for Different Genres
Steffen Remus | Gerold Hintz | Chris Biemann | Christian M. Meyer | Darina Benikova | Judith Eckle-Kohler | Margot Mieskes | Thomas Arnold

bot.zen @ EmpiriST 2015 - A minimally-deep learning PoS-tagger (trained for German CMC and Web data)
Egon Stemle

LTL-UDE @ EmpiriST 2015: Tokenization and PoS Tagging of Social Media Text
Tobias Horsmann | Torsten Zesch