Quantitative model of R-loop forming structures reveals a novel level of RNA–DNA interactome complexity
Kwoh, Chee Keong
Date of Issue: 2012
School of Computer Engineering
An R-loop is a structure formed co-transcriptionally between the nascent RNA transcript and the DNA template, leaving the non-transcribed DNA strand unpaired. This structure can be involved in hyper-mutation and dsDNA breaks in mammalian immunoglobulin (Ig) genes, oncogenes and neurodegenerative disease-related genes. R-loops have not yet been studied at the genome scale. To identify R-loops, we developed a computational algorithm and mapped R-loop forming sequences (RLFS) onto 66 803 sequences defined by UCSC as ‘known’ genes. We found that ∼59% of these transcribed sequences contain at least one RLFS. We created R-loopDB (http://rloop.bii.a-star.edu.sg/), a database that collects all RLFS identified within over half of the human genes and links to the UCSC Genome Browser for information integration and visualisation across a variety of bioinformatics sources. We found that many oncogenes and tumour suppressors (e.g. Tp53, BRCA1, BRCA2, Kras and Ptprd) and neurodegenerative disease-related genes (e.g. ATM, Park2, Ptprd and GLDC) could be prone to significant R-loop formation. Our findings suggest that R-loops provide a novel level of RNA–DNA interactome complexity, playing key roles in gene expression control, mutagenesis, recombination, chromosomal rearrangement, alternative splicing, DNA editing and epigenetic modification. RLFSs could be used as a novel source of prospective therapeutic targets.
DRNTU::Engineering::Computer science and engineering
Nucleic Acids Research
© 2011 The Author(s) (Published by Oxford University Press). This paper was published in Nucleic Acids Research and is made available as an electronic reprint (preprint) with permission of The Author(s) (Published by Oxford University Press). The paper can be found at the following official DOI: [http://dx.doi.org/10.1093/nar/gkr1075]. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple locations via electronic or other means, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper is prohibited and is subject to penalties under law.