Full metadata record
DC Field | Value | Language
dc.contributor.author | Yang, Xuefeng | -
dc.identifier.citation | Yang, X. (2016). Word representation learning. Doctoral thesis, Nanyang Technological University, Singapore. |
dc.description.abstract | This dissertation studies word representation learning, which aims to learn numerical vector representations for words in natural language. The learned word vectors may be used as a dictionary for computers and applied in many natural language processing tasks. The dissertation pursues two major research directions: addressing problems that arise when word vector representations are applied, and enhancing existing word vector representations through postprocessing. The work is organized into four chapters according to the problems addressed: the effect of imbalanced word frequency, bias of context definition, multi-prototype word representation learning, and sentence vector representation learning (compositional distributional semantics). First, an inconsistency between existing word vector representations and WordNet is identified through empirical analysis. Many potential factors are explored to locate the root cause, and the inconsistency is found to be a side effect of existing word vector representation algorithms and imbalanced word frequency. To alleviate the problem, two measures based on ordinal information and piecewise linear mapping are proposed. Experimental results demonstrate the effectiveness of the proposed measures. This first study reveals that the ranking of cosine values is more robust than the cosine values themselves, which motivates improving existing word vector representations by adjusting the ranking of cosine similarity values. With the help of ranking learning, a supervised fine-tuning framework is proposed to alleviate the bias caused by context definition. As a postprocessing framework, it is compatible with all representation learning models that use vectors to represent words. Various empirical experiments show that the proposed framework can significantly improve the performance of existing word vector representations. After addressing the bias of context definition, the supervised fine-tuning framework is further enhanced to learn multi-prototype word vector representations. A mini-context word sense disambiguation method is proposed and integrated into the framework. Armed with new initialization and learning algorithms, the framework can transform a single-prototype word vector representation into a multi-prototype one. Experimental results reveal that the learned multi-prototype representation can encode different senses of polysemous words and outperforms the original single-prototype representation. Finally, a vectorial sentence model is proposed to extend word-level vector representations to sentence-level vector representations. The model is based on phrase-level semantic composition models and a recursive neural network. It has a dynamic structure that is consistent with the dependency tree of the given sentence. The model can benefit from both data-driven learning algorithms and grammar rules defined by linguistic experts. Both phrase-level and sentence-level evaluation experiments show that the proposed models are effective. | en_US
dc.format.extent | 218 p. | en_US
dc.subject | DRNTU::Engineering::Electrical and electronic engineering | en_US
dc.title | Word representation learning | en_US
dc.contributor.supervisor | Mao Kezhi | en_US
dc.contributor.school | School of Electrical and Electronic Engineering | en_US
dc.description.degree | Doctor of Philosophy (EEE) | en_US
item.fulltext | With Fulltext | -
Appears in Collections: EEE Theses
Files in This Item:
File | Description | Size | Format
Yang Xuefeng 2015.pdf (Restricted Access) | | 1.37 MB | Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.