Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/145882
Title: GO2Vec : transforming GO terms and proteins to vector representations via graph embeddings
Authors: Zhong, Xiaoshi; Kaalia, Rama; Rajapakse, Jagath Chandana
Keywords: Science::Biological sciences
Issue Date: 2019
Source: Zhong, X., Kaalia, R., & Rajapakse, J. C. (2019). GO2Vec : transforming GO terms and proteins to vector representations via graph embeddings. BMC Genomics, 20, 918. doi:10.1186/s12864-019-6272-2
Project: MOE2016-T2-1-029
Journal: BMC Genomics
Abstract: Background: Semantic similarity between Gene Ontology (GO) terms is a fundamental measure for many bioinformatics applications, such as determining functional similarity between genes or proteins. Most previous research exploited information content to estimate the semantic similarity between GO terms; more recently, some research has exploited word embeddings to learn vector representations for GO terms from a large-scale corpus. In this paper, we propose a novel method, named GO2Vec, that exploits graph embeddings to learn vector representations for GO terms from the GO graph. GO2Vec combines information from both the GO graph and GO annotations, and its learned vectors can be applied to a variety of bioinformatics applications, such as calculating functional similarity between proteins and predicting protein-protein interactions. Results: We conducted two kinds of experiments to evaluate the quality of GO2Vec: (1) functional similarity between proteins on the Collaborative Evaluation of GO-based Semantic Similarity Measures (CESSM) dataset and (2) prediction of protein-protein interactions on the Yeast and Human datasets from the STRING database. Experimental results demonstrate that GO2Vec is more effective than the information content-based measures and the word embedding-based measures. Conclusion: Our experimental results demonstrate the effectiveness of using graph embeddings to learn vector representations from undirected GO and GOA graphs. Our results also demonstrate that GO annotations provide useful information for computing the similarity between GO terms and between proteins.
URI: https://hdl.handle.net/10356/145882
ISSN: 1471-2164
DOI: 10.1186/s12864-019-6272-2
Schools: School of Computer Science and Engineering 
Rights: © 2020 The Author(s). This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Journal Articles

Files in This Item:
File: s12864-019-6272-2.pdf
Size: 1.54 MB
Format: Adobe PDF

Scopus citations: 21 (updated on Mar 22, 2024)
Web of Science citations: 19 (updated on Oct 26, 2023)
Page views: 193 (updated on Mar 29, 2024)
Downloads: 73 (updated on Mar 29, 2024)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.