Recent genomic and proteomic data sets are so disparate and complex that few studies have provided robust, sophisticated models of the latent information they contain. This project strives to make efficient and maximal use of the information generated from heterogeneous temporal genomic sources in which such special features are present. The investigator aims to develop, implement, and validate advanced Bayesian modeling techniques, such as Bayesian state space models, for studying the dynamics of heterogeneous temporal genomic metadata, with three goals: (i) inferring and predicting the genomic profiles associated with diseases and treatments; (ii) estimating important hidden biological parameters; and (iii) constructing gene-time-gene and protein-protein interaction networks and pathways for hybrid biological systems. These results should be sufficient to explain causal and probable relations among genes, treatments, and diseases, and between genes and the environment. The investigator evaluates the efficacy and sensitivity of the proposed models through detailed studies of specific diseases, using data such as time-course lymphocyte gene expression from multiple sclerosis patients treated with interferon-beta-1a, and multi-tissue polygenic data, such as kidney and liver data from animals sacrificed at 17 time points following administration of a bolus dose of MPL. A graphic display of the results from each model is provided to explore the dynamics of the modeling processes; this is an important intermediate goal because it allows visual examination of the degree of heterogeneity between models. Through the research and educational activities, the heterogeneous raw temporal genomic data are converted into scientific knowledge that advances our understanding of today's common complex diseases and biological processes and that may identify new modalities of treatment.

This project opens a novel area in statistical genomics and bioinformatics, driven by the abundance of heterogeneous temporal genomic data and methodologies. Through the research and teaching activities, the project generates systematic understanding and overall knowledge of efficient data exploration methodology, contributing primarily to statistics and computer science. If fully successful, its contribution to biology, the pharmaceutical sciences, and the medical sciences could be invaluable, potentially speeding up research, diagnosis, drug development, and medical decision making, and ultimately improving human and other life. Biomedical and genomic applications of the developed methodologies would help biologists and medical researchers better understand the underlying causes and risks of diseases, offer more powerful diagnostic tools and predictive treatments, and provide customized solutions for genomic data analysis. Both the theoretical and the practical foundations of the activity will make an impact on higher education, especially in training the current and next generation of statisticians and computational scientists to tackle the challenges of human genome research. The techniques developed in this project are used to augment and develop related courses and are made available on the internet for outreach at large.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 0604639
Program Officer: Gabor J. Szekely
Project Start:
Project End:
Budget Start: 2006-06-01
Budget End: 2009-02-28
Support Year:
Fiscal Year: 2006
Total Cost: $144,999
Indirect Cost:
Name: SUNY at Buffalo
Department:
Type:
DUNS #:
City: Buffalo
State: NY
Country: United States
Zip Code: 14260