Computational analysis of ChIP-seq data and its application to the reconstruction of transcriptional regulatory networks.
Date of Issue: 2011
School of Computer Engineering
Owing to the rapid growth of ultra-high-throughput sequencing technologies in recent years, ChIP-seq (Chromatin Immunoprecipitation Sequencing) has become the mainstream approach for the genome-wide study of protein-DNA interactions and histone modifications. A large number of ChIP-seq datasets have been generated and published, and their analysis poses new challenges to the bioinformatics community. In this thesis, we proposed three computational techniques for the analysis of ChIP-seq data, each suited to a different experimental design. First, we developed Peak-finder for the prediction of transcription factor binding sites from a single ChIP library. Second, we proposed a signal-noise model of ChIP-seq, from which we derived a general-purpose framework, CCAT (Control based ChIP-seq Analysis Tool), for ChIP-seq applications with negative controls. Third, we introduced an HMM (Hidden Markov Model) approach, named ChIPDiff, for the comparative analysis of two ChIP-seq libraries associated with different cell types or treatments. We then addressed the reconstruction of transcriptional regulatory networks, which depict the relationships between transcription factors and their target genes. Based on the predicted ChIP-enriched genomic sites, we proposed a probabilistic method to link transcription factor binding sites to their putative target genes. We further refined the target gene list with an integrative analysis that incorporates microarray gene expression data. We applied our approaches to large-scale ChIP-seq datasets in mESC (mouse Embryonic Stem Cell) and reconstructed the core transcriptional regulatory networks of this unique cell type.
DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences