Version 7 (modified by Serge Sharoff, 12 years ago)

7th Web as Corpus Workshop (WAC-7)

To be held in association with WWW2012 in Lyon, France, 17th April 2012

Sponsored by ACL SIGWAC

More and more people are using Web data for linguistic and NLP research: the Web provides an easy source of linguistic data in a great variety of languages. However, a ‘crawl’ is not ready for exploration in the same way that a traditional ‘corpus’ is. We need to turn a crawl into a corpus. The workshop, the seventh in an annual series, provides a venue for exploring what that conversion involves, how to do it, and what we find out when we do.

We invite submissions which:

  • describe Web corpus collection projects, or modules for one part of the process (crawling, filtering, de-duplication, language-id, tokenising, indexing, ...)
  • explore characteristics of Web data from a linguistics/NLP perspective, including registers, domains, frequency distributions, and comparisons between datasets
  • use crawled Web data for NLP purposes (with emphasis on the data rather than the use)

The previous WAC workshops have been co-located with various conferences in computational linguistics. This time the workshop is co-located with WWW2012, the main world conference on Web technologies and their impact on society.

Important dates

  • Submission by January 22, 2012, to be made through EasyChair
  • Notification of acceptance by February 3
  • Camera-ready copy due February 15

Submissions should be formatted using the ACM SIG stylefiles and should not exceed 8 pages, plus one extra page for references. Each submission will be reviewed by at least two members of the programme committee. Accepted papers will be published in the workshop proceedings.

Organising committee

  • Adam Kilgarriff (Lexical Computing Ltd.)
  • Jan Pomikalek (Masaryk University)
  • Serge Sharoff (University of Leeds, Workshop Chair)

Programme committee

Organising committee plus:

  • Silvia Bernardini, U of Bologna, Italy
  • Stefan Evert, U of Osnabrück, Germany
  • Cédrick Fairon, UCLouvain, Belgium
  • William H. Fletcher, U.S. Naval Academy, USA
  • Gregory Grefenstette, Exalead, France
  • Igor Leturia, Elhuyar Fundazioa, Basque Country, Spain
  • Preslav Nakov, National U of Singapore
  • Reinhard Rapp, U Mainz, Germany
  • Kevin Scannell, Saint Louis U, USA
  • Gilles-Maurice de Schryver, U Gent, Belgium
  • Pierre Zweigenbaum, LIMSI, France
