= 7th Web as Corpus Workshop (WAC-7) =

To be held in association with [http://www2012.org/ WWW2012] in Lyon, France, 17th April 2012.

Sponsored by [http://www.sigwac.org.uk ACL SIGWAC].

More and more people are using Web data for linguistic and NLP research: the Web provides an easy source of linguistic data in a great variety of languages. However, a ‘crawl’ is not ready for exploration in the same way a traditional ‘corpus’ is. We need to turn a crawl into a corpus. The workshop, the seventh in an annual series, provides a venue for exploring what this involves, how to do it, and what we find out if we do.

We invite submissions which:
 * describe Web corpus collection projects, or modules for one part of the process (crawling, filtering, de-duplication, language-id, tokenising, indexing, ...)
 * explore characteristics of Web data from a linguistics/NLP perspective, including registers, domains, frequency distributions, and comparisons between datasets
 * use crawled Web data for NLP purposes (with emphasis on the data rather than the use)

The previous WAC workshops have been co-located with various conferences in computational linguistics. This time the workshop is co-located with WWW2012, the main world conference on Web technologies and their impact on society.

== wiki:Programme ==

{{{#!comment
== Important dates ==
 * Submission by '''January 30 2012,''' to be made through [https://www.easychair.org/conferences/?conf=wac7 EasyChair]
 * Notification of acceptance by February 6
 * Camera-ready copy due February 15

Submissions should be formatted using the [http://www.acm.org/sigs/publications/proceedings-templates ACM SIG stylefiles] and should not exceed 8 pages plus an extra page for references.

Each submission will be reviewed by at least two members of the programme committee. Accepted papers will be published in the workshop proceedings.
}}}

== Organising committee ==
 * Adam Kilgarriff (Lexical Computing Ltd.)
 * Serge Sharoff (University of Leeds, Workshop Chair)

== Programme committee ==

Organising committee plus:
 * Silvia Bernardini, U of Bologna, Italy
 * Stefan Evert, U of Osnabrück, Germany
 * Cédrick Fairon, UCLouvain, Belgium
 * William H. Fletcher, U.S. Naval Academy, USA
 * Gregory Grefenstette, Exalead, France
 * Igor Leturia, Elhuyar Fundazioa, Basque Country, Spain
 * Preslav Nakov, National U of Singapore
 * Jan Pomikalek (Masaryk University)
 * Reinhard Rapp, U Mainz, Germany
 * Kevin Scannell, Saint Louis U, USA
 * Gilles-Maurice de Schryver, U Gent, Belgium
 * Pierre Zweigenbaum, LIMSI, France