Molecular biology and genetics are currently at the center of an information revolution. High-throughput techniques generate very large amounts of heterogeneous data, and there is a large gap between our ability to collect data and our ability to interpret it. The challenge faced by today's researchers is to develop effective ways to analyze the vast amount of data that has been and will continue to be collected. First released in 2001, Onto-Tools is an open access software suite that partially addresses this problem. It does so by using a probabilistic functional analysis that bridges the gap between low-level, high-throughput gene expression data and high-level functional knowledge. This proposal further addresses the existing computational challenge by developing novel data analysis techniques based on latent semantic indexing and statistical analysis. These techniques will be able to discover previously unknown interactions and phenomena at the genetic level: new gene-to-gene interactions, novel gene-to-function assignments, and perturbed interactions between biological processes. Most techniques that try to predict gene functions rely on genes with known function and use sequence similarity, 3D structure similarity, or both, to extrapolate the functionality of novel genes. Our approach is innovative because it uses real expression data to extract the functions of the target genes. The proposed tool, Onto-Analyzer, will offer these capabilities as part of the Onto-Tools suite. Onto-Tools will also be enhanced by adding support for 17 new organisms.

The goal of developing advanced computational tools is also obstructed by another gap in the current knowledge. Successful public bioinformatics software has a large and demanding user community. Evolving software radically while maintaining the expected level of service usually requires large resources that are unavailable to small research groups. Therefore, migration to an open source platform is desirable. To enable this migration, the evolvability, reparability, modifiability, portability, understandability, and documentation of Onto-Tools will be improved. The proposed research will develop a set of software evolution tools to address these issues through software visualization, change impact analysis, traceability link recovery, and re-documentation. The enhanced Onto-Tools will continue to be freely available. Onto-Analyzer and the software evolution tools developed under this proposal will also be made freely available to the research community.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section: Special Emphasis Panel (ZRG1-BST-D (51))
Program Officer: Bonazzi, Vivien
Wayne State University
Biostatistics & Other Math Sci
Schools of Arts and Sciences
United States
Lin, Ho-Sheng; Siddiq, Fauzia; Talwar, Harvinder S et al. (2014) Serum prognostic biomarkers in head and neck cancer patients. Laryngoscope 124:1819-26
Ansari, Nadeem A; Bao, Riyue; Voichița, Călin et al. (2012) Detecting phenotype-specific interactions between biological processes from microarray data and annotations. IEEE/ACM Trans Comput Biol Bioinform 9:1399-409
Done, Bogdan; Khatri, Purvesh; Done, Arina et al. (2010) Predicting novel human gene ontology annotations using semantic analysis. IEEE/ACM Trans Comput Biol Bioinform 7:91-9
Hassan, Sonia S; Romero, Roberto; Pineles, Beth et al. (2010) MicroRNA expression profiling of the human uterine cervix after term labor and delivery. Am J Obstet Gynecol 202:80.e1-8
Tarca, Adi Laurentiu; Draghici, Sorin; Khatri, Purvesh et al. (2009) A novel signaling pathway impact analysis. Bioinformatics 25:75-82
Draghici, Sorin; Tarca, Adi L; Yu, Longfei et al. (2008) KUTE-BASE: storing, downloading and exporting MIAME-compliant microarray experiments in minutes rather than hours. Bioinformatics 24:738-40
Tarca, Adi L; Carey, Vincent J; Chen, Xue-wen et al. (2007) Machine learning and its applications to biology. PLoS Comput Biol 3:e116
Lin, Ho-Sheng; Talwar, Harvinder S; Tarca, Adi L et al. (2007) Autoantibody approach for serum-based detection of head and neck cancer. Cancer Epidemiol Biomarkers Prev 16:2396-405
Hassan, Sonia S; Romero, Roberto; Tarca, Adi L et al. (2007) Signature pathways identified from gene expression profiles in the human uterine cervix before and after spontaneous term parturition. Am J Obstet Gynecol 197:250.e1-7
Streicher, K L; Yang, Z Q; Draghici, S et al. (2007) Transforming function of the LSM1 oncogene in human breast cancers with the 8p11-12 amplicon. Oncogene 26:2104-14

Showing the most recent 10 out of 17 publications