Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/164625
Title: Automatic question generation from natural language texts
Authors: Cao, Zhen
Keywords: Engineering::Electrical and electronic engineering
Issue Date: 2022
Publisher: Nanyang Technological University
Source: Cao, Z. (2022). Automatic question generation from natural language texts. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/164625
Abstract: Question generation (QG) is the task of automatically generating questions from a variety of inputs, ranging from database information to raw text. QG systems are considered a critical component of numerous applications, including but not limited to information-seeking systems, multi-modal conversations, and intelligent tutoring and computer-assisted learning systems. Among these, generating questions from natural language texts has great practical significance. In education, forming good questions is crucial for improving academic performance and evaluating students' abilities. Meanwhile, texts with potential educational value, such as news articles and Wikipedia, are good reading resources for learning purposes. QG can generate questions for reading comprehension and serve as a component in intelligent tutoring systems. QG can also reduce the manpower and cost of creating large-scale data sets for question answering (QA) and reading comprehension. In dialog systems, QG is an essential feature for chatbots when initiating a conversation or seeking specific information from users. Therefore, developing an automatic QG system is important and timely. This thesis focuses on automatic QG from texts. The goal is to develop models that take text as input and output generated questions for downstream applications. Inspired by advances in deep learning, the thesis proposes and studies several neural QG models, each addressing a different challenge in QG. One of the main challenges faced by existing neural QG models is performance degradation caused by the one-to-many mapping inherent in QG: given a passage, many different questions can be generated.
First, the answer is included as direction information (DI) in an RNN-based sequence-to-sequence (seq2seq) model, allowing the neural model to consider what to ask and thus generate pertinent questions. A dual-encoder model is developed to learn representations of the source input text and the DI, respectively. The resulting answer-aware QG model alleviates the problem of one-to-many mapping. The one-to-many mapping in QG stems mainly from two aspects of QG, what to ask and how to ask: the former is associated with generating to-the-point questions, while the latter concerns selecting content from the source to form specific questions. To further cope with the one-to-many mapping challenge, a controllable QG (CQG) model is investigated that employs an attentive seq2seq-based generative model with a copying mechanism. The proposed CQG model incorporates query interest and auxiliary information as controllers to address the one-to-many mapping problem in QG. Two variants of embedding strategies are designed for CQG to obtain good performance. Furthermore, multi-hop QG, which requires more complex reasoning over multiple pieces of information in the input texts, is studied. To capture the global context and facilitate reasoning, a novel framework is proposed that incorporates a semantic graph of the input document, which can be regarded as auxiliary information. The model encodes the semantic graph of the input texts through graph learning. Thereafter, text-level and graph-level representations are fused to generate questions via a pre-trained language model.
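As an illustration of the copying mechanism mentioned in the abstract, the sketch below shows one common formulation (a pointer-generator style mixture), which may differ from the variant used in the thesis. The function name `copy_mixture`, the gate value `p_gen`, and the toy sentence are all hypothetical and chosen only for this example. The final word distribution mixes the decoder's vocabulary distribution with the attention weights over source tokens, so a source word outside the vocabulary can still be produced.

```python
from collections import defaultdict


def copy_mixture(p_gen, vocab_probs, attention, source_tokens):
    """Combine generation and copy distributions into one word distribution.

    p_gen         -- probability of generating from the vocabulary (assumed gate value)
    vocab_probs   -- dict mapping vocabulary word -> generation probability
    attention     -- attention weights over source positions (should sum to 1)
    source_tokens -- source words, aligned with the attention weights
    """
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")
    if len(attention) != len(source_tokens):
        raise ValueError("need exactly one attention weight per source token")

    final = defaultdict(float)
    # Generation path: scale the vocabulary distribution by p_gen.
    for word, prob in vocab_probs.items():
        final[word] += p_gen * prob
    # Copy path: add the remaining mass to the attended source words.
    for token, weight in zip(source_tokens, attention):
        final[token] += (1.0 - p_gen) * weight
    return dict(final)


# Toy example: "Paris" is not in the vocabulary but can still be copied.
source_tokens = ["Paris", "is", "the", "capital"]
attention = [0.7, 0.1, 0.1, 0.1]
vocab_probs = {"what": 0.5, "is": 0.3, "the": 0.2}
dist = copy_mixture(0.6, vocab_probs, attention, source_tokens)
```

In this toy setting, "Paris" receives probability mass only through the copy path, while "is" and "the" receive mass from both paths.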
URI: https://hdl.handle.net/10356/164625
DOI: 10.32657/10356/164625
Schools: School of Electrical and Electronic Engineering 
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: EEE Theses

Files in This Item:
File: Thesis_final_CaoZhen_revised_v2.pdf
Size: 5.38 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.