Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/50628
Title: Design and implementation of parallel bioinformatics algorithms on heterogeneous computing architectures
Authors: Liu, Yongchao.
Keywords: DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences
Issue Date: 2012
Abstract: Massively parallel DNA sequencing technologies have revolutionized genomics and molecular biology by producing large volumes of high-quality DNA sequence data at a relatively low cost. However, this growth in sequence data establishes the need for more powerful computational hardware infrastructure and more sophisticated software algorithms for efficient management and analysis. This poses a number of challenges to the bioinformatics community in order to meet the compute-intensive and data-intensive requirements of current sequence analysis. This thesis makes contributions by conceiving and developing parallel algorithms for three primary research areas in bioinformatics, i.e. sequence alignment, motif discovery and genome sequencing, targeting heterogeneous computing architectures consisting of CUDA-enabled GPUs, multi-core CPUs, and CPU/GPU clusters. By combining different parallel programming models, the heterogeneous computing architectures are able to provide support for three kinds of computations: device-level computation on GPUs, node-level multi-threaded computation on shared-memory CPUs, and cluster-level parallel and distributed computation over compute nodes. The primary contributions to sequence alignment are the investigation of three parallel algorithms: CUDASW++, MSA-CUDA and MSAProbs. CUDASW++ is a CUDA-based protein sequence database search algorithm for multiple GPUs. It produces better performance in terms of execution speed and accuracy compared to other publicly available tools such as SWPS3, SW-CUDA and NCBI-BLAST+. Both MSA-CUDA and MSAProbs are multiple protein sequence aligners. MSA-CUDA accelerates the ClustalW processing pipeline using CUDA and achieves significant speedups over sequential ClustalW on a single GPU. MSAProbs is a new and practical multi-threaded aligner based on the pair hidden Markov models and partition function posterior probabilities for shared-memory CPUs. It achieves statistically significant alignment accuracy improvements over the existing top performing aligners, including ClustalW, MAFFT, MUSCLE, ProbCons, and Probalign, while demonstrating competitive speed. The primary contribution to motif discovery is the investigation of CUDA-MEME, a parallel and distributed motif discovery algorithm based on the MEME algorithm.
URI: http://hdl.handle.net/10356/50628
Schools: School of Computer Engineering 
Research Centres: Centre for High Performance Embedded Systems 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Theses

Files in This Item:
File: Ph.D thesis final version.pdf (Restricted Access)
Description: Main article
Size: 2.55 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.