Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/81319
Title: Identifying protein complexes from heterogeneous biological data
Authors: Wu, Min; Xie, Zhipeng; Li, Xiaoli; Kwoh, Chee Keong; Zheng, Jie
Keywords: DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences
Issue Date: 2013
Source: Wu, M., Xie, Z., Li, X., Kwoh, C. K., & Zheng, J. (2013). Identifying protein complexes from heterogeneous biological data. Proteins: Structure, Function, and Bioinformatics, 81(11), 2023-2033.
Series/Report no.: Proteins: Structure, Function, and Bioinformatics
Abstract: With the increasing availability of diverse biological information for proteins, integration of heterogeneous data becomes more useful for many problems in proteomics, such as annotating protein functions, predicting novel protein–protein interactions and so on. In this paper, we present an integrative approach called InteHC (Integrative Hierarchical Clustering) to identify protein complexes from multiple data sources. Although integrating multiple sources could effectively improve the coverage of current insufficient protein interactome (the false negative issue), it could also introduce potential false-positive interactions that could hurt the performance of protein complex prediction. Our proposed InteHC method can effectively address these issues to facilitate accurate protein complex prediction and it is summarized into the following three steps. First, for each individual source/feature, InteHC computes the matrices to store the affinity scores between a protein pair that indicate their propensity to interact or co-complex relationship. Second, InteHC computes a final score matrix, which is the weighted sum of affinity scores from individual sources. In particular, the weights indicating the reliability of individual sources are learned from a supervised model (i.e., a linear ranking SVM). Finally, a hierarchical clustering algorithm is performed on the final score matrix to generate clusters as predicted protein complexes. In our experiments, we compared the results collected by our hierarchical clustering on each individual feature with those predicted by InteHC on the combined matrix. We observed that integration of heterogeneous data significantly benefits the identification of protein complexes. Moreover, a comprehensive comparison demonstrates that InteHC performs much better than 14 state-of-the-art approaches. All the experimental data and results can be downloaded from http://www.ntu.edu.sg/home/zhengjie/data/InteHC.
URI: https://hdl.handle.net/10356/81319
http://hdl.handle.net/10220/18179
ISSN: 0887-3585
DOI: 10.1002/prot.24365
Rights: © 2013 Wiley Periodicals, Inc. This paper was published in Proteins: Structure, Function, and Bioinformatics and is made available as an electronic reprint (preprint) with permission of Wiley Periodicals, Inc. The paper can be found at the following official DOI: http://dx.doi.org/10.1002/prot.24365. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple locations via electronic or other means, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper is prohibited and is subject to penalties under law.
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Journal Articles
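
The three steps summarized in the abstract (per-source affinity matrices, a weighted sum, then hierarchical clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy matrices, the weight values, the average-linkage rule, and the stopping threshold are all assumptions made for illustration. In the paper the weights are learned by a linear ranking SVM; here they are simply given.

```python
from itertools import combinations


def combine_scores(matrices, weights):
    """Step 2: weighted sum of per-source affinity matrices (lists of lists)."""
    n = len(matrices[0])
    return [[sum(w * m[i][j] for w, m in zip(weights, matrices))
             for j in range(n)] for i in range(n)]


def average_affinity(a, b, score):
    """Mean pairwise affinity between clusters a and b (assumed linkage rule)."""
    return sum(score[i][j] for i in a for j in b) / (len(a) * len(b))


def cluster(score, threshold):
    """Step 3: repeatedly merge the two most similar clusters until the best
    pair falls below the (hypothetical) threshold."""
    clusters = [[i] for i in range(len(score))]
    while len(clusters) > 1:
        x, y = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_affinity(clusters[p[0]], clusters[p[1]], score),
        )
        if average_affinity(clusters[x], clusters[y], score) < threshold:
            break
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Step 1 (toy data): two hypothetical sources scoring four proteins, indexed 0..3.
ppi = [[0, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0]]
coexpression = [[0.0, 0.8, 0.1, 0.0],
                [0.8, 0.0, 0.2, 0.1],
                [0.1, 0.2, 0.0, 0.6],
                [0.0, 0.1, 0.6, 0.0]]

weights = [0.7, 0.3]  # assumed values; the paper learns these with a ranking SVM
final = combine_scores([ppi, coexpression], weights)
complexes = cluster(final, threshold=0.5)
print(complexes)
```

On this toy input, proteins 0 and 1 form one predicted complex and proteins 2 and 3 another, because only those pairs score highly in both sources.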

Files in This Item:
File | Description | Size | Format
Wu_Proteins_2013.pdf | Main Article | 373.52 kB | Adobe PDF
ProteinsSupplementary.pdf | Supplementary Information | 86.97 kB | Adobe PDF
Data_Used.zip | Data used | 1.25 MB | Unknown
InteHC_individual.zip | Protein complexes predicted by InteHC, on individual data sources | 66.69 kB | Unknown
PPI.zip | Protein complexes predicted from PPI networks | 100.89 kB | Unknown
InteHC.txt | Protein Complexes Predicted by InteHC | 48.66 kB | Text
TAP-MS.zip | Protein complexes predicted from TAP-MS data | 45.2 kB | Unknown
CMBI.txt | Protein Complexes Predicted by CMBI | 103.88 kB | Text
Contact_Us.txt | Author's Contact | 116 B | Text

Scopus citations: 18 (updated on Jul 5, 2022)
Web of Science citations: 17 (updated on Jul 3, 2022)
Page view(s): 456 (updated on Sep 26, 2022)
Download(s): 412 (updated on Sep 26, 2022)


Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.