
== Program ==

=== WAC-X morning session ===

|| 9:30–9:40 ||'''Welcome and Introduction''' ||
|| 9:40–10:00 ||''Automatic Classification by Topic Domain for Meta Data Generation, Web Corpus Evaluation, and Corpus Comparison'' ||
|| ||Roland Schäfer and Felix Bildhauer ||
|| 10:00–10:30 ||''Efficient construction of metadata-enhanced web corpora'' ||
|| ||Adrien Barbaresi ||

=== WAC-X noon session ===

|| 11:00–11:30 ||''Topically-focused Blog Corpora for Multiple Languages'' ||
|| ||Andrew Salway, Dag Elgesem, Knut Hofland, Øystein Reigem and Lubos Steskal ||
|| 11:30–12:00 ||''The Challenges and Joys of Analysing Ongoing Language Change in Web-based Corpora: a Case Study'' ||
|| ||Anne Krause ||
|| 12:00–12:30 ||''Using the Web and Social Media as Corpora for Monitoring the Spread of Neologisms. The case of ’rapefugee’, ’rapeugee’, and ’rapugee’.'' ||
|| ||Quirin Würschinger, Mohammad Fazleh Elahi, Desislava Zhekova and Hans-Jörg Schmid ||

=== EmpiriST session ===

|| 13:30–13:50 ||''EmpiriST 2015: A Shared Task on the Automatic Linguistic Annotation of Computer-Mediated Communication and Web Corpora'' ||
|| ||Michael Beißwenger, Sabine Bartsch, Stefan Evert and Kay-Michael Würzner ||
|| 13:50–14:10 ||''!SoMaJo: State-of-the-art tokenization for German web and social media texts'' ||
|| ||Thomas Proisl and Peter Uhrig ||
|| 14:10–14:30 ||''UdS-(retrain|distributional|surface): Improving POS Tagging for OOV Words in German CMC and Web Data'' ||
|| ||Jakob Prange, Andrea Horbach and Stefan Thater ||

=== WAC-X and EmpiriST teaser talks ===

|| 14:30–14:35 ||''Babler - Data Collection from the Web to Support Speech Recognition and Keyword Search'' ||
|| ||Gideon Mendels, Erica Cooper and Julia Hirschberg ||
|| 14:35–14:40 ||''A Global Analysis of Emoji Usage'' ||
|| ||Nikola Ljubešić and Darja Fišer ||
|| 14:40–14:45 ||''Genre classification for a corpus of academic webpages'' ||
|| ||Erika Dalan and Serge Sharoff ||
|| 14:45–14:50 ||''On Bias-free Crawling and Representative Web Corpora'' ||
|| ||Roland Schäfer ||
|| 14:55–15:00 ||''EmpiriST: AIPHES - Robust Tokenization and POS-Tagging for Different Genres'' ||
|| ||Steffen Remus, Gerold Hintz, Chris Biemann, Christian M. Meyer, Darina Benikova, Judith Eckle-Kohler, Margot Mieskes and Thomas Arnold ||
|| 15:00–15:05 ||''bot.zen @ EmpiriST 2015 - A minimally-deep learning PoS-tagger (trained for German CMC and Web data)'' ||
|| ||Egon Stemle ||
|| 15:05–15:10 ||''LTL-UDE @ EmpiriST 2015: Tokenization and PoS Tagging of Social Media Text'' ||
|| ||Tobias Horsmann and Torsten Zesch ||

=== Posters and discussion ===

|| 15:10–16:30 ||='''WAC-X and EmpiriST poster session''' =||
|| 16:30–17:30 ||='''WAC-X and EmpiriST closing discussion''' =||
|| 17:30–18:30 ||='''Panel discussion ''Corpora, open science, and copyright reforms''''' =||