| 1 | = 6th Web as Corpus Workshop (WAC-6) = |
| 2 | To be held in association with [http://naaclhlt2010.isi.edu/ NAACL-HLT] in Los Angeles, |
| 3 | 5th/6th June 2010 |
| 4 | |
| 5 | Sponsored by [http://www.sigwac.org.uk ACL SIGWAC] |
| 6 | |
| 7 | === Invited Speaker: [http://www.patrickpantel.com/ Patrick Pantel], ISI, University of Southern California === |
| 8 | |
| 9 | |
| 10 | More and more people are using Web data for linguistic and NLP research. The workshop, the sixth in an annual series, provides a venue for exploring how we can use it effectively and what we will find if we do. |
| 11 | |
| 12 | We invite submissions which: |
| 13 | * describe Web corpus collection projects, or modules for one part of the process (crawling, filtering, de-duplication, language-id, tokenising, indexing, ...) |
| 14 | * explore characteristics of Web data from a linguistics/NLP perspective, including registers, domains, frequency distributions, and comparisons between datasets |
| 15 | * use crawled Web data for NLP purposes (with emphasis on the data rather than the use) |
| 16 | Previous WAC workshops have been in Europe and Africa. The west coast of the US is the global centre for web development, hosting Google, Microsoft, Yahoo and a thousand others, so we are looking forward to visiting! |
| 17 | |
| 18 | |
| 19 | == Call for Papers == |
| 20 | * Submission by '''March 1st 2010,''' to be made through the NAACL system at https://www.softconf.com/naaclhlt2010/webascorpus/ |
| 21 | * Notification of acceptance by March 30 |
| 22 | * Camera-ready copy due April 12 |
| 23 | |
| 24 | Submissions should be formatted using the NAACL 2010 stylefiles, be anonymised for blind review, and not exceed 8 pages plus one extra page for references. The stylefiles are available at http://naaclhlt2010.isi.edu/authors.html. Each submission will be reviewed by at least two members of the programme committee. Accepted papers will be published in the workshop proceedings. |
| 25 | |
| 26 | |
| 27 | == Organising committee == |
| 28 | * Adam Kilgarriff (Lexical Computing Ltd., Workshop Chair) |
| 29 | * Dekang Lin (Google Inc) |
| 30 | * Serge Sharoff (University of Leeds, SIGWAC Chair) |
| 31 | |
| 32 | == Programme committee == |
| 33 | Organising committee plus: |
| 34 | * Silvia Bernardini, U of Bologna, Italy |
| 35 | * Stefan Evert, U of Osnabrück, Germany |
| 36 | * Cédrick Fairon, UCLouvain, Belgium |
| 37 | * William H. Fletcher, U.S. Naval Academy, USA |
| 38 | * Gregory Grefenstette, Exalead, France |
| 39 | * Igor Leturia, Elhuyar Fundazioa, Basque Country, Spain |
| 40 | * Jan Pomikalek, Masaryk Univ, Czech Republic |
| 41 | * Preslav Nakov, National U of Singapore |
| 42 | * Kevin Scannell, Saint Louis U, USA |
| 43 | * Gilles-Maurice de Schryver, U Gent, Belgium |