Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/75876
Full metadata record
DC Field / Value / Language
dc.contributor.author: Chong, Tze Yuang
dc.date.accessioned: 2018-07-04T12:01:48Z
dc.date.available: 2018-07-04T12:01:48Z
dc.date.issued: 2018
dc.identifier.citation: Chong, T. Y. (2018). Exploiting long context using joint distance and occurrence information for language modeling. Doctoral thesis, Nanyang Technological University, Singapore.
dc.identifier.uri: http://hdl.handle.net/10356/75876
dc.description.abstract: This thesis investigates an approach to exploiting long context based on distance and occurrence information. By modeling the joint event of distance and occurrence, the approach incorporates their inter-dependencies into the model, so that information captured from the long context can be used more effectively. This addresses a shortcoming of conventional language modeling approaches, which tend to neglect these inter-dependencies. Based on the proposed approach, a novel language model, referred to as the term-distance term-occurrence (TDTO) model, is formulated. The TDTO model estimates probabilities from the term-distance (TD) and term-occurrence (TO) events, which correspond to the distances and occurrences of words in the context. By expressing the TDTO model within a log-linear interpolation framework, the contributions of the TD and TO components to the final estimate can be tuned. In particular, since TD events, i.e. positions, within a long context are likely to be rare or unseen, the weight of the TD component can be tuned down to alleviate the data scarcity problem. A series of experiments showed that the TDTO model can exploit long context to reduce language model perplexity. On the BLLIP Wall Street Journal (WSJ) and Switchboard-1 (SWB) corpora, perplexity reductions of up to 11.2% and 6.5% were obtained with context lengths of seven and eight, respectively. The TDTO model also consistently showed lower perplexities than other conventional models for exploiting long context, such as the distant-bigram, trigger, and bag-of-words (BOW) models. The applicability of the TDTO model was examined on several tasks, including speech recognition, text classification, and word prediction, and it improved the baseline performance on all of them.
Furthermore, this thesis proposes a neural network implementation of the TDTO model, aimed at providing a better smoothing mechanism for TDTO modeling. The resulting model, referred to as the neural network based TDTO (NN-TDTO) model, was empirically shown to outperform the baseline TDTO model in both perplexity and speech recognition accuracy. On the WSJ corpus, the NN-TDTO model yielded up to 9.2% lower perplexity than the TDTO model; on the Aurora-4 speech recognition task, it achieved up to 12.9% relative reduction in word error rate. (en_US)
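The abstract describes combining the TD and TO components in a log-linear interpolation framework, with a tunable weight that can be lowered for the sparse TD component. A minimal sketch of that idea is shown below; the component scores, vocabulary, and weights are illustrative placeholders, not the thesis's trained models or values.

```python
import math

def log_linear_interpolate(scores_td, scores_to, w_td=0.3, w_to=0.7):
    """Combine per-word TD and TO log-probabilities with tunable weights,
    then renormalize so the result is a distribution over the vocabulary."""
    vocab = scores_td.keys()
    # Weighted sum of log-scores = log of the product of powered components.
    unnorm = {w: w_td * scores_td[w] + w_to * scores_to[w] for w in vocab}
    # Normalization constant (log of the partition function).
    log_z = math.log(sum(math.exp(v) for v in unnorm.values()))
    return {w: v - log_z for w, v in unnorm.items()}

# Toy example with made-up component log-probabilities for three words.
td = {"stocks": math.log(0.2), "fell": math.log(0.5), "the": math.log(0.3)}
to = {"stocks": math.log(0.4), "fell": math.log(0.4), "the": math.log(0.2)}
probs = log_linear_interpolate(td, to)
# The combined scores renormalize to a valid probability distribution.
assert abs(sum(math.exp(v) for v in probs.values()) - 1.0) < 1e-9
```

Lowering `w_td` toward zero shifts the combined estimate toward the TO component alone, which is the mechanism the abstract describes for alleviating data scarcity in the positional (TD) statistics.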
dc.format.extent: 121 p. (en_US)
dc.language.iso: en (en_US)
dc.subject: DRNTU::Engineering::Computer science and engineering (en_US)
dc.title: Exploiting long context using joint distance and occurrence information for language modeling (en_US)
dc.type: Thesis
dc.contributor.supervisor: Chng Eng Siong (en_US)
dc.contributor.school: School of Computer Science and Engineering (en_US)
dc.description.degree: Doctor of Philosophy (SCE) (en_US)
dc.identifier.doi: 10.32657/10356/75876
item.fulltext: With Fulltext
item.grantfulltext: open
Appears in Collections:SCSE Theses
Files in This Item:
main.pdf — 1.5 MB, Adobe PDF
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.