Computational analysis and prediction of specific genomic regions forming R-loop structure and chromosomal variations associated with cancer
Date of Issue: 2015
School of Computer Engineering
Bioinformatics Research Centre
An R-loop is a structure formed co-transcriptionally between a nascent RNA and its template DNA strand, leaving the non-template DNA strand unpaired. I hypothesized that R-loops can form in many mammalian genes and are associated with transcription and genetic instability. I developed a quantitative model of R-loop forming sequences (QmRLFS) and bioinformatics tools to predict RLFSs in the human and mouse genomes. I collected the RLFSs predicted across both genomes into R-loopDB, a database of predicted R-loops (http://rloop.bii.a-star.edu.sg/). Most (60%) human and mouse genes contain RLFSs, and 11,773 evolutionarily conserved RLFSs map to 7,630 protein-coding genes and 117 ncRNA genes. Validation against experimental data showed high agreement between the model's predicted RLFSs and the experimental results. Integrative genomics analyses suggested that RLFSs could play a role in gene regulation, AID/APOBEC-mediated editing and mutagenesis, alternative splicing, and epigenetic modifications, and that they are associated with mutations in cancer, neurodegenerative diseases, and mental disorders. RLFSs therefore represent potential novel therapeutic targets. A comparison of three RLFS prediction models indicates that QmRLFS is a promising approach for researchers interested in identifying RLFSs in both small- and large-scale data.
DRNTU::Science::Biological sciences::Molecular biology