Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/62840
Title: Deriving protein-protein interactions of dengue from literature by using automatic content extraction features
Authors: Huang, Yizhou
Keywords: DRNTU::Engineering::Computer science and engineering::Data
Issue Date: 2015
Abstract: Dengue fever is one of the most severe diseases spread throughout the tropics. Because of the immune response elicited among the serotypes of the dengue virus, it is very difficult to develop vaccines that protect humans from dengue infection. With advances in technology, however, researchers have turned to genetic structure as a basis for vaccine development. This project aims to regulate Automatic Content Extraction features and use them to derive protein interactions and gene regulation relations through text mining. The human gene list and dengue gene list are downloaded from an online genome mapping repository, while the texts are retrieved from abstracts of biomedical literature. Sentences are then pre-processed for further analysis. Biological knowledge and facts on gene regulation and protein interactions are generated with optimized methods and techniques. In this project, keyword-tag and word-relation-word features are extracted to describe the regulation relations. To investigate the performance of different feature sets, the project uses the Stanford Natural Language Processing tools to analyse the semantic structure of sentences. A decision tree classifier is trained on the extracted patterns to perform the prediction. The accuracy based on the keyword-tag and word-relation-word features reached 99.4%. This high accuracy arises because the feature sets also contain some features extracted from the testing dataset. To address this problem, more datasets will be used to evaluate the performance.
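Illustrative note: the abstract attributes the 99.4% accuracy to feature sets that include features drawn from the testing dataset. The following is a minimal sketch, not the report's implementation, of one way to keep the feature vocabulary restricted to the training split. The feature encodings (`kt:` for keyword-tag, `wrw:` for word-relation-word), the function names, the keyword list, and the toy sentences with placeholder names `ProteinA`, `GeneB`, and `GeneC` are all assumptions made for illustration; in practice the tags and dependency triples would come from a tagger and parser such as the Stanford tools.

```python
from typing import Dict, List, Set, Tuple

Token = Tuple[str, str]            # (word, part-of-speech tag)
Dependency = Tuple[str, str, str]  # (governor word, relation, dependent word)

# Hypothetical interaction keywords; the report's actual list is not given.
KEYWORDS = {"inhibits", "binds", "activates"}


def extract_features(tokens: List[Token], deps: List[Dependency],
                     keywords: Set[str]) -> Set[str]:
    """Collect keyword-tag and word-relation-word features for one sentence."""
    feats = set()
    for word, tag in tokens:
        if word.lower() in keywords:
            feats.add(f"kt:{word.lower()}/{tag}")
    for gov, rel, dep in deps:
        feats.add(f"wrw:{gov.lower()}-{rel}-{dep.lower()}")
    return feats


def build_vocabulary(feature_sets: List[Set[str]]) -> Dict[str, int]:
    """Index features seen in the given sentences (training split only)."""
    vocab = sorted(set().union(*feature_sets))
    return {feat: i for i, feat in enumerate(vocab)}


def vectorize(feats: Set[str], vocab: Dict[str, int]) -> List[int]:
    """Binary vector over the vocabulary; unseen features are ignored."""
    vec = [0] * len(vocab)
    for feat in feats:
        idx = vocab.get(feat)
        if idx is not None:
            vec[idx] = 1
    return vec


# Toy, hand-written parses standing in for tagger and parser output.
train_feats = extract_features(
    [("ProteinA", "NN"), ("inhibits", "VBZ"), ("GeneB", "NN")],
    [("inhibits", "nsubj", "ProteinA"), ("inhibits", "dobj", "GeneB")],
    KEYWORDS,
)
test_feats = extract_features(
    [("ProteinA", "NN"), ("binds", "VBZ"), ("GeneC", "NN")],
    [("binds", "nsubj", "ProteinA"), ("binds", "dobj", "GeneC")],
    KEYWORDS,
)

# The vocabulary is fitted on training sentences only, so features that occur
# only in the test sentence never become columns of the feature matrix.
vocab = build_vocabulary([train_feats])
train_vec = vectorize(train_feats, vocab)
test_vec = vectorize(test_feats, vocab)
```

Vectors built this way could then be passed to any decision tree implementation; the point of the sketch is only that the test split contributes no columns to the feature space.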
URI: http://hdl.handle.net/10356/62840
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: FYP Report.pdf (Restricted Access)
Description: Data Mining on Bio-Medical Literature
Size: 2.85 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.