Title: Topological analysis of protein structures with statistical learning
Authors: Lee, Si Xian
Keywords: Science::Biological sciences::Molecular biology
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Issue Date: 2018
Publisher: Nanyang Technological University
Abstract: The study of protein structure-function relationships has been a key focus in computational biology. A novel method of protein data analysis uses Persistent Homology Analysis (PHA) as a tool for protein classification. The main method of feature engineering from the generated intervals is a systematic binning approach that characterises topological features. These features are then fed into three types of statistical learning methods: SVM, tree-based methods and neural networks. The protein classification tasks are the classification of hemoglobin molecules in relaxed and taut form (task 1) and the identification of all-alpha, all-beta and alpha-beta protein domains, carried out on 450 and 900 protein samples (tasks 2 and 3, respectively). The use of a modified tree-based approach showed surprisingly stable results and attained the highest overall accuracies of 93.3% and 87.8% for tasks 2 and 3, respectively.
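The binning step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which statistics are computed per bin, so the choice here (counts of births, deaths and alive intervals in equal-width bins) is an assumption, and the names `bin_intervals`, `max_scale` and `n_bins` are hypothetical rather than taken from the report.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # (birth, death) of one topological feature


def bin_intervals(intervals: Sequence[Interval],
                  max_scale: float,
                  n_bins: int) -> List[int]:
    """Turn persistence intervals into a fixed-length feature vector.

    The filtration range [0, max_scale) is split into n_bins equal-width
    bins. For each bin we count (assumed statistics, not from the report):
      - births: intervals whose birth value falls in the bin
      - deaths: intervals whose death value falls in the bin
      - alive:  intervals that overlap the bin at all
    The three count lists are concatenated into one vector.
    """
    width = max_scale / n_bins
    births = [0] * n_bins
    deaths = [0] * n_bins
    alive = [0] * n_bins
    for birth, death in intervals:
        # Clip features that outlive the range; a death at or beyond
        # max_scale then falls in no bin and is not counted as a death.
        death = min(death, max_scale)
        for i in range(n_bins):
            lo, hi = i * width, (i + 1) * width
            if lo <= birth < hi:
                births[i] += 1
            if lo <= death < hi:
                deaths[i] += 1
            if birth < hi and death > lo:
                alive[i] += 1
    return births + deaths + alive


# Three made-up intervals; the last one never dies within the range.
example = [(0.0, 1.5), (0.5, 3.2), (2.1, float("inf"))]
features = bin_intervals(example, max_scale=4.0, n_bins=4)
print(features)
```

A vector of this kind has the same length for every protein, which is what lets it be passed to a classifier such as an SVM or a tree-based model.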
Schools: School of Physical and Mathematical Sciences 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SPMS Student Reports (FYP/IA/PA/PI)

Files in This Item:
File Description     Size       Format
Restricted Access    1.77 MB    Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.