Integration of genomic and epigenomic features to predict meiotic recombination hotspots in human and mouse
Kwoh, Chee Keong
Przytycka, Teresa M.
Date of Issue: 2012
Conference on Bioinformatics, Computational Biology and Biomedicine (2012 : Orlando, USA)
School of Computer Engineering
The regulatory mechanism of meiotic recombination hotspots is a fundamental problem in biology, with broad impact on areas ranging from disease study to evolution. Recently, many genomic and epigenomic features have been associated with recombination hotspots, but none of them explains hotspots consistently. It is therefore highly desirable to integrate the different features into a single predictive model, and to study, with a systems approach, how the features relate to hotspots and to one another. Moreover, because recombination hotspots evolve rapidly and dynamically, the regulatory mechanisms of hotspots that are evolutionarily conserved among species remain unclear. We propose a machine learning approach that encodes genomic and epigenomic features into a support vector machine (SVM). Trained on known hotspots and coldspots in the human and mouse genomes, the model predicts hotspots from the features with good performance in both species. The model also reports a ranking of feature importance, uncovering the interactions of the features with hotspots and with one another. Applying the method to large-scale data, we identified evolutionarily conserved patterns of trans-regulators and feature importance between human and mouse hotspots. This is the first attempt to build a predictive model that identifies evolutionarily conserved mechanisms for recombination hotspots by integrating both genomic and epigenomic features.
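The general idea of training an SVM on hotspot and coldspot feature vectors and then ranking the features can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the kernel, the feature encoding, or how feature importance is computed. The sketch assumes a linear SVM trained by plain subgradient descent on the hinge loss and ranks features by the absolute value of their learned weights. The feature names and the toy values are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a linear SVM trained by subgradient descent on the
# hinge loss, with features ranked by absolute weight. The ranking rule, the
# feature names, and the toy data are assumptions, not taken from the paper.

def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=200):
    """Return (weights, bias) for labels y in {+1, -1}."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # L2 regularization: shrink the weights a little on every step.
            w = [(1.0 - lr * lam) * wj for wj in w]
            # Hinge-loss subgradient: update only when the margin is violated.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Classify one feature vector: +1 = hotspot, -1 = coldspot."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical feature names (one genomic, one epigenomic, one uninformative).
feature_names = ["sequence_motif_score", "histone_mark_signal", "uninformative_feature"]

# Toy, centered feature vectors: first three are hotspots, last three coldspots.
X = [
    [1.0, 0.8, 0.1],
    [0.9, 1.0, -0.1],
    [0.8, 0.9, 0.0],
    [-1.0, -0.9, 0.1],
    [-0.8, -1.0, -0.1],
    [-0.9, -0.8, 0.0],
]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
predictions = [predict(w, b, xi) for xi in X]

# Rank features from most to least important by absolute weight.
ranked = sorted(zip(feature_names, w), key=lambda pair: abs(pair[1]), reverse=True)
for name, weight in ranked:
    print(f"{name}: {weight:+.3f}")
```

In this toy setting the two informative features receive large weights and the uninformative one is ranked last. With real data, the features would need to be scaled comparably before weight magnitudes can be read as importance, and a nonlinear kernel would require a different ranking strategy.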
DRNTU::Engineering::Computer science and engineering
© 2012 ACM. This is the author created version of a work that has been peer reviewed and accepted for publication by Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine - BCB '12, ACM. It incorporates referees' comments, but changes resulting from the publishing process, such as copyediting and structural formatting, may not be reflected in this document. The published version is available at: [http://dx.doi.org/10.1145/2382936.2382974].