Statistical Model Building for High Dimensional Biomedical Data

Wu, Baolin

Abstract

Current large-scale biomedical data are typically characterized by a small number of observed samples and widespread sample heterogeneity. Identifying differentially expressed genes related to the sample phenotype (e.g., cancer development) and predicting sample phenotype from gene expression are central research questions in microarray data analysis. Most existing statistical methods ignore sample heterogeneity and thus lose power. This project proposes to develop novel statistical methods that explicitly address the small sample size and sample heterogeneity issues and that can be applied very generally. The usefulness of these methods will be demonstrated on large-scale biomedical data originating from the lung and kidney transplant research projects. The transplant projects aimed to improve the molecular diagnosis and therapy of lung/kidney allograft rejection by identifying molecular biomarkers that predict allograft rejection, enabling critical early treatment and rapid, noninvasive, and economical testing.
The specific aims are: 1) develop novel statistical methods for differential gene expression detection that explicitly model sample heterogeneity; 2) develop novel statistical methods for classifying high-dimensional biomedical data that incorporate sample heterogeneity; 3) develop novel statistical methods for jointly analyzing a set of genes (e.g., genes in a pathway); 4) use the developed models and methods to answer research questions relevant to public health in the lung and kidney transplant projects, and implement and validate the proposed methods in user-friendly, well-documented software distributed to the scientific community at no charge. It is very important to identify new biomarkers of allograft rejection in lung and kidney transplant recipients. The rapid and reliable detection and prediction of rejection in easily obtainable body fluids may allow the rapid advancement of clinical interventional trials. We propose to study novel methods for analyzing large-scale biomedical data to realize their full potential for the molecular diagnosis and prognosis of transplant rejection, enabling critical early treatment.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM083345-03
Application #: 7858165
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Lyster, Peter

Project Start: 2008-08-01
Project End: 2012-05-31
Budget Start: 2010-06-01
Budget End: 2011-05-31
Support Year: 3
Fiscal Year: 2010
Total Cost: $253,269
Indirect Cost

Institution

Name: University of Minnesota Twin Cities
Department: Biostatistics & Other Math Sci
Type: Schools of Public Health
DUNS #: 555917996

City: Minneapolis
State: MN
Country: United States
Zip Code: 55455

Related projects


NIH 2011 R01 GM	Statistical Model Building for High Dimensional Biomedical Data Wu, Baolin / University of Minnesota Twin Cities	$250,488
NIH 2010 R01 GM	Statistical Model Building for High Dimensional Biomedical Data Wu, Baolin / University of Minnesota Twin Cities	$253,269
NIH 2009 R01 GM	Statistical Model Building for High Dimensional Biomedical Data Wu, Baolin / University of Minnesota Twin Cities	$256,073
NIH 2008 R01 GM	Statistical Model Building for High Dimensional Biomedical Data Wu, Baolin / University of Minnesota Twin Cities	$255,036

Publications

Guo, Bin; Wu, Baolin (2018) Statistical methods to detect novel genetic variants using publicly available GWAS summary data. Comput Biol Chem 74:76-79

Guo, Bin; Wu, Baolin (2018) Integrate multiple traits to detect novel trait-gene association using GWAS summary data with an adaptive test approach. Bioinformatics :

Guo, Bin; Wu, Baolin (2018) Powerful and efficient SNP-set association tests across multiple phenotypes using GWAS summary data. Bioinformatics :

Guo, Bin; Wu, Baolin (2018) Reader reaction on the fast small-sample kernel independence test for microbiome community-level association analysis. Biometrics 74:1120-1124

Wu, Baolin; Pankow, James S (2018) Fast and Accurate Genome-Wide Association Test of Multiple Quantitative Traits. Comput Math Methods Med 2018:2564531

Wu, Baolin; Pankow, James S (2017) Genome-wide association test of multiple continuous traits using imputed SNPs. Stat Interface 10:379-386

Wu, Baolin; Pankow, James S (2016) On Sample Size and Power Calculation for Variant Set-Based Association Tests. Ann Hum Genet 80:136-43

Wu, Baolin; Pankow, James S (2016) Sequence Kernel Association Test of Multiple Continuous Phenotypes. Genet Epidemiol 40:91-100

Wu, Baolin; Guan, Weihua; Pankow, James S (2016) On Efficient and Accurate Calculation of Significance P-Values for Sequence Kernel Association Testing of Variant Set. Ann Hum Genet 80:123-35

Wu, Baolin; Guan, Weihua (2015) Reader reaction on the generalized Kruskal-Wallis test for genetic association studies incorporating group uncertainty. Biometrics 71:556-7

Showing the most recent 10 out of 19 publications
