Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/83166
Title: PSDVec: A toolbox for incremental and scalable word embedding
Authors: Li, Shaohua
Zhu, Jun
Miao, Chunyan
Keywords: Word embedding
Matrix factorization
Issue Date: 2016
Source: Li, S., Zhu, J., & Miao, C. (2016). PSDVec: A toolbox for incremental and scalable word embedding. Neurocomputing, 237, 405-409.
Series/Report no.: Neurocomputing
Abstract: PSDVec is a Python/Perl toolbox that learns word embeddings, i.e., the mapping of words in a natural language to continuous vectors that encode the semantic/syntactic regularities between the words. PSDVec implements a word embedding learning method based on a weighted low-rank positive semidefinite approximation. To scale up the learning process, we implement a blockwise online learning algorithm to learn the embeddings incrementally. This strategy greatly reduces the learning time of word embeddings on a large vocabulary, and can learn the embeddings of new words without re-learning the whole vocabulary. On 9 word similarity/analogy benchmark sets and 2 Natural Language Processing (NLP) tasks, PSDVec produces embeddings that have the best average performance among popular word embedding tools. PSDVec provides a new option for NLP practitioners.
URI: https://hdl.handle.net/10356/83166
http://hdl.handle.net/10220/42454
ISSN: 0925-2312
DOI: https://doi.org/10.1016/j.neucom.2016.05.093
Rights: © 2016 Elsevier B. V. This is the author created version of a work that has been peer reviewed and accepted for publication by Neurocomputing, Elsevier. It incorporates referee’s comments but changes resulting from the publishing process, such as copyediting, structural formatting, may not be reflected in this document. The published version is available at: [https://doi.org/10.1016/j.neucom.2016.05.093].
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Journal Articles

Files in This Item:
File: PSDVec_A_toolbox_for_incremental_and_scalable_word_embedding_accepted.pdf
Size: 334.57 kB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.