This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.
One of the challenges facing biology in general, and consequently multidisciplinary research in computational biology, is the assignment of biochemical and cellular functions to the thousands of hitherto uncharacterized gene products discovered by several international gene-sequencing projects. Data mining offers the promise of precise, objective, and accurate in silico analysis of high-dimensional data, using knowledge discovery routines that reveal embedded patterns, trends, and anomalies in order to build models for faster and more accurate physiological discovery. Feature fusion methods can effectively integrate information from multiple data sources for reinforced learning and accurate prediction and analysis. In this research, we have developed multiple feature extraction, fusion, and mining algorithms for various bioinformatics challenges. We have developed novel methodologies for identifying cell-cyclic genes by fusing information from magnitude-dependent methods (regulation) and magnitude-independent methods (periodicity), which have been widely used on data from Saccharomyces cerevisiae synchronization experiments. In another algorithm, we hypothesize, and present evidence, that integrating a gene's phase with its primary protein sequence similarity to other genes in the dataset can enhance the identification of cell-cyclic genes. We have also developed a unique wavelet-transformation-based selective feature-filtering algorithm for feature ranking and phenotype classification.
In another aspect of our research in this area, we have developed a new association-rule-based feature extraction technique that uses the concave residues and residue parameters of proteins to find frequent spatial arrangements of residues, which are then exploited for protein classification. The presentation will include result highlights, key outcomes, and our future plans.