This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.

One of the daunting challenges facing biology, and consequently multidisciplinary research in computer science, is to assign biochemical and cellular functions to the thousands of hitherto uncharacterized gene products discovered by several international gene-sequencing projects. Recent advances in bioinformatics can be attributed to the ability of microarrays to rapidly and accurately monitor transcriptional behavior over an entire genome under varying conditions. Consequently, computing techniques for clustering and dimensionality reduction have played a major role in microarray data analysis for clinical decision support. In this project we hypothesize that computational, algorithmic approaches capable of (1) identifying the intrinsic behavior of microarray data and (2) modeling a classifier for autonomous physiological discovery can uncover potential biological markers for further clinical elucidation. We have designed and developed novel methods for dimensionality reduction of gene expression data to address the volume and complexity of such databases. These methods employ the principles of self-similarity and probability theory to mine the physiological knowledge embodied in these databases and to exploit it for enhanced feature identification and aggregation. Building on these methods, we have developed novel computational frameworks for autonomous supervised and unsupervised classification of tissue samples. These techniques include support vector machines, quadtree-based methods, and association rule aggregation methods.
We have also extended our research to develop a protocol for performance evaluation and benchmarking of our approaches. In our experiments, our feature and class discovery approaches proved robust and showed enhanced performance in both specificity and sensitivity.
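The two evaluation measures named above can be illustrated with a minimal sketch. It assumes binary tissue-sample labels (1 for the class of interest, 0 otherwise); the function name and the example labels are hypothetical and are not taken from the project's benchmarking protocol or its data.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary class labels.

    Hypothetical helper for illustration only; it is not the
    project's own evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive sample correctly identified
            else:
                fn += 1  # positive sample missed
        else:
            if pred == positive:
                fp += 1  # negative sample wrongly flagged
            else:
                tn += 1  # negative sample correctly rejected

    # Sensitivity: fraction of true positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of true negatives that were rejected.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up labels for eight samples, purely illustrative.
example_true = [1, 1, 1, 0, 0, 0, 0, 1]
example_pred = [1, 0, 1, 0, 0, 1, 0, 1]
example_sens, example_spec = sensitivity_specificity(example_true, example_pred)
```

Reporting both measures together matters because a classifier can score well on one while failing on the other, for example by labeling every sample as positive.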