Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/84179
Title: MCtandem : an efficient tool for large-scale peptide identification on many integrated core (MIC) architecture
Authors: Li, Chuang
Li, Kenli
Li, Keqin
Lin, Feng
Keywords: Engineering::Computer science and engineering
Peptide Identification
Tandem Mass Spectrometry
Issue Date: 2019
Source: Li, C., Li, K., Li, K., & Lin, F. (2019). MCtandem: an efficient tool for large-scale peptide identification on many integrated core (MIC) architecture. BMC Bioinformatics, 20(1), 397. doi:10.1186/s12859-019-2980-5
Series/Report no.: BMC Bioinformatics
Abstract: Background: Tandem mass spectrometry (MS/MS)-based database searching is a widely acknowledged and widely used method for peptide identification in shotgun proteomics. However, due to the rapid growth of spectra data produced by advanced mass spectrometry and the greatly increased number of modified and digested peptides identified in recent years, current methods for peptide database searching cannot rapidly and thoroughly process large MS/MS spectra datasets. A breakthrough in efficient database search algorithms is crucial for peptide identification in computational proteomics. Results: This paper presents MCtandem, an efficient tool for large-scale peptide identification on the Intel Many Integrated Core (MIC) architecture. To support big data processing, MCtandem’s design includes a novel parallel match scoring algorithm, named MIC-SDP (spectrum dot product), and its two-level parallelization. In addition, a series of optimization strategies on both the host CPU side and the MIC side, including pre-fetching, an optimized communication-overlapping scheme, multithreading, and hyper-threading, are exploited to improve execution performance. Conclusions: For fair comparison, we first set up experiments and verified a 28-fold speedup on a single MIC over the original CPU-based implementation. We then ran MCtandem on a very large dataset on an MIC cluster (a component of the Tianhe-2 supercomputer) and achieved much higher scalability than a benchmark MapReduce-based program, MR-Tandem. MCtandem is an open-source software tool implemented in C++. The source code and the parameter settings are available at https://github.com/LogicZY/MCtandem.
URI: https://hdl.handle.net/10356/84179
http://hdl.handle.net/10220/49783
DOI: http://dx.doi.org/10.1186/s12859-019-2980-5
Rights: © 2019 The Author(s). This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Journal Articles

