18 | 21 | |
19 | 22 | For almost a decade, the ACL SIGWAC, and most notably the Web as Corpus (WAC) workshops, have served as a platform for researchers interested in the compilation, processing and use of web-derived corpora as well as computer-mediated communication. Past workshops were co-located with major conferences on corpus linguistics and/or computational linguistics (such as ACL, EACL, Corpus Linguistics, LREC, NAACL, WWW). The eleventh Web as Corpus workshop (WAC-XI) emphasises the linguistic aspects of web corpus research more than the technological aspects while keeping in mind that the two are inseparable. |
20 | 23 | |
21 | 24 | The World Wide Web has become increasingly popular as a source of linguistic evidence, not only within the computational linguistics community, but also with theoretical linguists facing problems such as data sparseness or the lack of variation in traditional corpora of written language. Accordingly, web corpora continue to gain relevance, given their size and diversity in terms of genres and text types. In lexicography, web data have become a major and well-established resource with dedicated research data and an environment such as the !SketchEngine. In other areas of linguistics, the adoption rate of web corpora has been slower but steady. Furthermore, some areas of research dealing exclusively with web (or similar) data have emerged, such as the construction and exploitation of corpora based on short messages. Another example is the (manual or automatic) classification of web texts by genre, register, or – more generally speaking – text type, as well as topic area. Similarly, the areas of corpus evaluation and corpus comparison have advanced greatly through the rise of web corpora, mostly because web corpora (especially larger ones in the region of several billion tokens) are often created by downloading texts from the web unselectively with respect to their text type or content. While the composition (or stratification) of such corpora cannot be determined before their construction, it is desirable at least to evaluate it afterwards. Also, comparing web corpora to corpora that have been compiled in a traditional way is key in determining the quality of web corpora with respect to a given research question. |
39 | | == !CleanerEval first panel discussion == |
| 44 | === Submission website === |
| 45 | |
| 46 | We will use EasyChair, URL tba. |
| 47 | |
| 48 | === Submission format === |
| 49 | |
| 50 | We call for extended abstracts of 1,000–1,500 words in length (excluding references, tables, and figures). |
| 51 | Submissions must be in PDF format. Authors of accepted papers will receive minimal formatting instructions for the publication of the abstracts on the WAC-XI website in due time. |
| 52 | There will be no proceedings volume, but a successful workshop might lead to a special issue/edited volume on web (and similar) data in linguistics, for which a separate call for (full) papers would be published after the workshop. |
| 53 | |
| 54 | |
| 55 | === Important dates ===#dates |
| 56 | |
| 57 | * 13 February 2017: First Call for Workshop Papers |
| 58 | * 13 March 2017: Second Call for Workshop Papers |
| 59 | * 16 April 2017: Workshop Paper Due Date |
| 60 | * 5 June 2017: Notification of Acceptance |
| 61 | * 24 July 2017: Workshop Day |
| 62 | |
| 63 | |
| 64 | |
| 65 | === !CleanerEval first panel discussion === |
72 | | === Important dates ===#dates |
73 | | |
74 | | * 13 February 2017: First Call for Workshop Papers |
75 | | * 13 March 2017: Second Call for Workshop Papers |
76 | | * 16 April 2017: Workshop Paper Due Date |
77 | | * 5 June 2017: Notification of Acceptance |
78 | | * 24 July 2017: Workshop Day |
79 | | |
80 | | === Call for papers === #cfp |
81 | | |
82 | | tba |
83 | | |
84 | | === Submission website === |
85 | | |
86 | | We will use EasyChair, URL tba. |
87 | | |
88 | | === Submission format === |
89 | | |
90 | | We call for extended abstracts of 1,000–1,500 words in length (excluding references, tables, and figures). |
91 | | Submissions must be in PDF format. Authors of accepted papers will receive minimal formatting instructions for the publication of the abstracts on the WAC-XI website in due time. |
92 | | There will be no proceedings volume, but a successful workshop might lead to a special issue/edited volume on web (and similar) data in linguistics, for which a separate call for (full) papers would be published after the workshop. |