Feature selection in bioinformatics.
Date of Issue: 2012
Independent Component Analyses, Compressive Sampling, Wavelets, Neural Net, Biosystems, and Nanoengineering (10th : 2012 : Baltimore, USA)
School of Electrical and Electronic Engineering
In bioinformatics, the number of input features is often very large. For example, there are millions of single nucleotide polymorphisms (SNPs), the genetic variations that determine the difference between any two unrelated individuals. In microarrays, thousands of genes can be profiled in each test. It is important to find out which input features (e.g., SNPs or genes) are useful for classifying a certain group of people or diagnosing a given disease. In this paper, we investigate some powerful feature selection techniques and apply them to problems in bioinformatics. We are able to identify a very small number of input features sufficient for the tasks at hand, and we demonstrate this with some real-world data.
© 2012 Society of Photo-Optical Instrumentation Engineers (SPIE). This paper was published in Proceedings of SPIE-Independent Component Analyses, Compressive Sampling, Wavelets, Neural Net, Biosystems, and Nanoengineering X and is made available as an electronic reprint (preprint) with permission of Society of Photo-Optical Instrumentation Engineers (SPIE). The paper can be found at the following official DOI: [http://dx.doi.org/10.1117/12.921417]. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple locations via electronic or other means, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper is prohibited and is subject to penalties under law.