6th Web as Corpus Workshop (WAC-6)

To be held in association with NAACL-HLT in Los Angeles, 5th June 2010

Sponsored by ACL SIGWAC

Invited Speaker: Patrick Pantel, Microsoft Research

Program here (added 5 May 2010)

More and more people are using Web data for linguistic and NLP research. The workshop, the sixth in an annual series, provides a venue for exploring how we can use it effectively and what we will find if we do.

We invite submissions which:

  • describe Web corpus collection projects, or modules for one part of the process (crawling, filtering, de-duplication, language-id, tokenising, indexing, ...)
  • explore characteristics of Web data from a linguistics/NLP perspective including registers, domains, frequency distributions, comparisons between datasets
  • use crawled Web data for NLP purposes (with emphasis on the data rather than the use)

Previous WAC workshops have been in Europe and Africa. The west coast of the US is the global centre for web development, hosting Google, Microsoft, Yahoo and a thousand others, so we are looking forward to visiting!

Call for Papers

Submissions should be formatted using the NAACL 2010 stylefiles, prepared for blind review, and should not exceed 8 pages plus one extra page for references. Each submission will be reviewed by at least two members of the programme committee. Accepted papers will be published in the workshop proceedings.

Organising committee

  • Adam Kilgarriff (Lexical Computing Ltd., Workshop Chair)
  • Dekang Lin (Google Inc)
  • Serge Sharoff (University of Leeds, SIGWAC Chair)

Programme committee

Organising committee plus:

  • Silvia Bernardini, U of Bologna, Italy
  • Stefan Evert, U of Osnabrück, Germany
  • Cédrick Fairon, UCLouvain, Belgium
  • William H. Fletcher, U.S. Naval Academy, USA
  • Gregory Grefenstette, Exalead, France
  • Igor Leturia, Elhuyar Fundazioa, Basque Country, Spain
  • Jan Pomikalek, Masaryk Univ, Czech Republic
  • Preslav Nakov, National U of Singapore
  • Kevin Scannell, Saint Louis U, USA
  • Gilles-Maurice de Schryver, U Gent, Belgium