NTUCLE: Developing a Corpus of Learner English to Provide Writing Support for Engineering Students
Winder, Roger Vivek Placidus
Li, Shu Yun
da Costa, Luís Morgado
Date of Issue: 2017
The 4th Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA 2017)
This paper describes the creation of a new annotated learner corpus. The aim is to use this corpus to develop an automated system for corrective feedback on students’ writing. With this system, students will be able to receive timely feedback on language errors before they submit their assignments for grading. A corpus of assignments submitted by first-year engineering students was compiled, and a new error tag set for the NTU Corpus of Learner English (NTUCLE) was developed based on that of the NUS Corpus of Learner English (NUCLE), as well as on marking rubrics used at NTU. After describing the corpus, the error tag set and the annotation process, the paper presents the results of the annotation exercise and the follow-up actions. The final error tag set, which is significantly larger than the NUCLE tag set, is then presented before a brief conclusion summarising our experience and future plans.
© 2017 The author(s) (published by Asian Federation of Natural Language Processing (AFNLP)). This paper was published in Proceedings of the 4th Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA 2017) and is made available as an electronic reprint (preprint) with permission of AFNLP. The published version is available at: [https://aclanthology.coli.uni-saarland.de/papers/W17-5901/w17-5901]. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple locations via electronic or other means, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper is prohibited and is subject to penalties under law.