Instead of the 10th Web as Corpus Workshop (WaC-10), there will be a WaC Meeting; see WAC@eLex2015.

10th Web as Corpus Workshop (WaC-10) @ eLex 2015: CANCELLED

Unfortunately, the total number of submissions to the workshop was very low, so we could not guarantee that WAC-10 would have as full and engaging a program as previous WAC events. We considered a number of creative strategies that might have saved the workshop, but in the interest of maintaining the highest possible standards, we ultimately decided against each of them. Therefore, WAC-10 has been canceled.

We strongly believe that the long tradition of successful WAC workshops is highly worth continuing, especially since the field is constantly expanding and new technologies are emerging every day. Hence, we will immediately begin the search for an attractive co-location and thematic focus for WAC-11 in 2016.

We thank the authors who submitted papers to WAC-10, and apologize for not being able to offer a venue for presentation and publication of their work in 2015. We further thank the members of the program committee for agreeing to review for WAC-10.

August 10, 2015 (Herstmonceux Castle, Sussex, UK)

Endorsed by ACL SIGWAC.

The web has become increasingly popular as a source of linguistic data, not only within the NLP community, but also with lexicographers and linguists. Accordingly, web corpora continue to gain importance, given their size and diversity in terms of genres/text types. However, a number of issues in web corpus construction still need much research, ranging from questions of corpus design to more-technical aspects of efficient construction of large corpora. Similarly, the systematic evaluation of web corpora, for example in the form of task-based comparisons to traditional corpora, has only lately shifted into focus.

For a decade now, the ACL SIGWAC, and especially the highly successful Web as Corpus (WaC) workshops, have served as a platform for researchers interested in building and working with web-derived corpora. Past workshops have been co-located with EACL, NAACL, LREC, WWW, and Corpus Linguistics.

This year we are excited to be co-located with Electronic lexicography in the 21st century: linking lexical data in the digital age (eLex 2015). This will be the first time that WAC has co-located with a lexicography conference.

Final Call for Papers: Extended Deadline

As in previous years, the 10th Web as Corpus workshop (WAC-10) invites original contributions pertaining to all aspects of web corpora, including data collection, cleaning, duplicate removal, document filtering, linguistic post-processing and annotation, and the use of web corpora in language technology and linguistics. Because of its co-location with a lexicography conference, WAC-10 particularly encourages submissions related to the use of web corpora in lexicography.

A major challenge in the construction of web corpora is assessing and evaluating the quality of both the software used to build the corpora and the corpora themselves. WAC-10 encourages submissions related to these issues.

Submission format

All submissions should follow the ACL-IJCNLP 2015 style guidelines and must be in PDF format. The style files are available from the ACL-IJCNLP 2015 website.

Full paper submissions may consist of up to eight (8) pages of content plus any number of pages consisting of only references. Short papers may consist of up to four (4) pages of content plus any number of pages consisting of only references. Full papers will be distinguished from short papers in the proceedings.

Papers will be presented either orally or as posters at the workshop. There will be no distinction between papers presented orally and those presented as posters in the proceedings.

Reviewing of papers will be double-blind. Therefore, the paper must not include the authors' names and affiliations. Furthermore, self-references that reveal the author's identity, e.g., "We previously showed (Smith, 1991) ...", must be avoided. Instead, use citations such as "Smith (1991) previously showed ...". Papers not conforming to these requirements will be rejected without review.

We strongly recommend the use of the ACL-IJCNLP 2015 LaTeX style files or Microsoft Word style files. The style files and example documents will be available from the workshop website. We reserve the right to reject submissions that do not conform to these styles, including font and page size restrictions.

Paper submission is via EasyChair.

Organizing Committee

  • Paul Cook, University of New Brunswick (paul.cook@…)
  • Roland Schäfer, Freie Universität Berlin
  • Egon Stemle, EURAC (egon.stemle@…)

Program Committee

  • Andrea Abel, European Academy Bolzano / Bozen
  • Douglas Biber, Northern Arizona University
  • Felix Bildhauer, Freie Universität Berlin
  • Jesse Egbert, Brigham Young University
  • Stefan Evert, Friedrich-Alexander-Universität Erlangen-Nürnberg
  • Ulrich Heid, Universität Hildesheim
  • Iztok Kosem, Trojina, Institute for Applied Slovene Studies
  • Simon Krek, Jožef Stefan Institute
  • Lothar Lemnitzer, Berlin-Brandenburgische Akademie der Wissenschaften
  • Robert Lew, Adam Mickiewicz University in Poznań
  • Nikola Ljubešić, University of Zagreb
  • Carolin Müller-Spitzer, Institut für Deutsche Sprache
  • Siva Reddy, University of Edinburgh
  • Steffen Remus, TU Darmstadt
  • Pavel Rychly, Masaryk University
  • Serge Sharoff, University of Leeds
  • Carole Tiberius, Instituut voor Nederlandse Lexicologie
  • Yukio Tono, Tokyo University of Foreign Studies
  • Andreas Witt, Institut für Deutsche Sprache
  • Torsten Zesch, University of Duisburg-Essen

Important Dates

  • 1 May 2015 (GMT-12): Paper submission deadline (extended from 24 April)
  • 29 May 2015: Notification
  • 19 June 2015: Camera-ready deadline
  • 10 August 2015: WAC-10 Workshop
Last modified on Jul 11, 2015, 2:40:23 AM