Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/141962
Title: Learning distributed sentence representations for story segmentation
Authors: Yu, Jia; Xie, Lei; Xiao, Xiong; Chng, Eng Siong
Keywords: Engineering::Computer science and engineering
Issue Date: 2018
Source: Yu, J., Xie, L., Xiao, X., & Chng, E. S. (2018). Learning distributed sentence representations for story segmentation. Signal Processing, 142, 403-411. doi:10.1016/j.sigpro.2017.07.026
Journal: Signal Processing
Abstract: Traditional sentence representations such as bag-of-words (BOW) and term frequency-inverse document frequency (tf-idf) face the problem of data sparsity and may not generalize well. Neural network based representations such as word/sentence vectors are usually trained in an unsupervised way and lack the topic information which is important for story segmentation. In this paper, we propose to learn sentence representation by using deep neural network (DNN) to directly predict the topic class of the input sentence. By using supervised training, the learned vector representation of sentences contains more topic information and is more suitable for the story segmentation task. The input of the DNN is BOW vector computed from a context window. Multiple time resolution BOW and bottleneck features (BNF) are also introduced to enhance the performance of story segmentation. As text data labeled with topic information is limited, we cluster stories into classes and use the class ID as the topic label of the stories for DNN training. We evaluated the proposed sentence representation with the TextTiling and normalized cuts (NCuts) based story segmentation methods on the topic detection and tracking (TDT2) task. Experimental results show that the proposed topical sentence representation outperforms both the BOW baseline and the recently proposed neural network based representations, i.e., word and sentence vectors.
URI: https://hdl.handle.net/10356/141962
ISSN: 0165-1684
DOI: 10.1016/j.sigpro.2017.07.026
Rights: © 2017 Elsevier B.V. All rights reserved.
Fulltext Permission: none
Fulltext Availability: No Fulltext
Appears in Collections: TL Journal Articles
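Illustration: The abstract mentions a BOW baseline evaluated with TextTiling-style segmentation. The sketch below is not the authors' implementation; it is a minimal, simplified illustration that assumes whitespace tokenization, raw term counts, and a "lowest similarity wins" rule in place of TextTiling's depth scores. The function names (`bow`, `cosine`, `gap_similarities`, `pick_boundary`) and the `block` parameter are hypothetical.

```python
import math
from collections import Counter


def bow(sentences):
    """Bag-of-words counts pooled over a list of sentences (a context window)."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.lower().split())
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def gap_similarities(sentences, block=2):
    """Similarity between the `block` sentences before and after each gap."""
    sims = []
    for gap in range(1, len(sentences)):
        left = bow(sentences[max(0, gap - block):gap])
        right = bow(sentences[gap:gap + block])
        sims.append(cosine(left, right))
    return sims


def pick_boundary(sentences, block=2):
    """Index of the sentence that starts the second story (single boundary).

    Simplification: picks the gap with the lowest similarity rather than
    computing TextTiling depth scores.
    """
    sims = gap_similarities(sentences, block)
    return sims.index(min(sims)) + 1


if __name__ == "__main__":
    demo = [
        "stocks fell sharply",
        "markets stocks dropped",
        "rain is expected tomorrow",
        "storm rain warnings",
    ]
    print(pick_boundary(demo))
```

In the approach the abstract describes, the count vectors would be replaced by sentence representations learned by a DNN trained to predict topic classes; the sketch covers only the BOW baseline side of that comparison.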
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.