Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/46760
Title: Prediction of functionality important sites from protein sequences
Authors: Muralidharan Nandakumar.
Keywords: DRNTU::Engineering
Issue Date: 2010
Abstract: This dissertation analyses the procedure for training a support vector machine (SVM) on imbalanced multiclass and two-class datasets so as to maximize prediction accuracy. The one-against-all approach is followed for the multiclass problem, and a standard binary SVM is used for the two-class datasets. The proposed algorithm was tested on five datasets: 5261 samples (1444 features), 5261 samples (61 features), 768 samples, 197 samples (23 features), and 267 samples (45 features). Probability estimates and decision function values are computed for the multiclass datasets, and the one-against-all accuracies and prediction accuracies are tabulated for each dataset. The multiclass SCOP datasets were classified with accuracies of 55.638% and 55.42%, whereas the two-class Pima, Parkinson's, and SPECTF heart datasets were classified with accuracies of 77%, 93.2254%, and 80.769% respectively. The algorithm used in this project needs to be tested on more datasets in the future.
Description: 57 p.
URI: http://hdl.handle.net/10356/46760
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: EEE Theses

Files in This Item:
File: EEE_THESES_110.pdf
Description: Restricted Access
Size: 5.69 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.