Title: Using rich models of language in grammatical error detection
Authors: Da Costa, Luis Morgado
Keywords: Engineering::Computer science and engineering::Computer applications
Issue Date: 2021
Publisher: Nanyang Technological University
Source: Da Costa, L. M. (2021). Using rich models of language in grammatical error detection. Doctoral thesis, Nanyang Technological University, Singapore.
Abstract: In this thesis, I show the advantages of using symbolic parsers for Grammatical Error Detection and Correction. In particular, I work with computational grammars for English and Mandarin Chinese to demonstrate that linguistically motivated research using symbolic parsers remains a highly viable approach to building educational applications. Over the course of this thesis, I will guide the reader through the entire process of creating a successful educational application that has benefited thousands of NTU students. To this end, I will start by describing the creation of two new learner corpora, one for English and one for Mandarin Chinese, through which I collected first-hand data about common errors NTU students make in these two languages. I will follow with a discussion of my contributions to ZHONG, an open-source computational grammar of Mandarin Chinese built within the theoretical framework known as Head-Driven Phrase Structure Grammar, with particular emphasis on the design of special rules capable of transforming a computational grammar into an error detection system. I will then discuss the creation of a new treebank used to train parse-ranking models that help symbolic parsers decide the most likely correction for a given error. I will conclude by describing the development of two web-based applications that exploit a mature symbolic parser to provide immediate corrective feedback for a large number of common errors. This thesis presents multiple sets of positive results. I have not only substantially increased ZHONG's coverage, but have also successfully implemented dozens of checks to detect common grammatical mistakes made by learners of Mandarin Chinese. Using the new parse-ranking models, I was also able to improve the precision of error detection in both English and Mandarin Chinese by 15% to 20%. Finally, a blended learning experiment involving more than 1,800 NTU students demonstrated the success of an application developed specifically to help improve students' writing. All developed systems, as well as most of the data collected and tagged in the course of this thesis, are released under open-source licenses.
DOI: 10.32657/10356/155214
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: IGS Theses

Files in This Item:
File: lmc_2021_ntu_phd_thesis.pdf
Size: 3.86 MB
Format: Adobe PDF

Download(s): 50 (updated on May 15, 2022)


Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.