Statistical methods for large-scale significance and prediction analysis with app

Wu, Baolin

Abstract

Recent technological advances have produced massive biomedical data sets for statistical analysis, such as cancer microarray data. A common feature of these data is that the number of observed samples is much smaller than the number of variables/predictors, which poses challenges for statistical analysis. Identifying differentially expressed genes and predicting sample phenotype from gene expression data are two important research questions in analyzing these large-scale biomedical data. This project proposes to develop new statistical methods for large-scale prediction and significance analysis that are specifically designed to address small sample sizes and potential sample heterogeneity, incorporate existing biological information for improved inference, and can be applied very generally. The usefulness of these methods will be demonstrated with large-scale biomedical data originating from leukemia research projects. These cancer projects aim to improve molecular diagnosis and prognosis of cancer by identifying molecular biomarkers for critical early treatment and rapid, noninvasive testing.
The specific aims are: 1) develop new statistical methods for significance testing of large-scale molecular markers; 2) develop new statistical methods that appropriately model sample heterogeneity for significance testing; 3) develop new statistical methods that use gene group information to improve cancer prediction; and 4) use the developed models and methods to answer research questions relevant to public health in the leukemia cancer projects, implement and validate the proposed methods in user-friendly, well-documented software, and distribute that software to the scientific community at no charge.

Public Health Relevance

It is very important to identify new biomarkers and to study molecular prediction for leukemia patients. We propose to study novel statistical methods for analyzing large-scale biomedical data in order to realize their full potential for the molecular diagnosis and prognosis of leukemia.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Project (R01)
Project #: 5R01CA134848-02
Application #: 7835748
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Tricoli, James

Project Start: 2009-05-07
Project End: 2012-04-30
Budget Start: 2010-05-01
Budget End: 2012-04-30
Support Year: 2
Fiscal Year: 2010
Total Cost: $152,039
Indirect Cost

Institution

Name: University of Minnesota Twin Cities
Department: Biostatistics & Other Math Sci
Type: Schools of Public Health
DUNS #: 555917996

City: Minneapolis
State: MN
Country: United States
Zip Code: 55455

Related projects


NIH 2010 R01 CA	Statistical methods for large-scale significance and prediction analysis with app Wu, Baolin / University of Minnesota Twin Cities	$152,039
NIH 2009 R01 CA	Statistical methods for large-scale significance and prediction analysis with app Wu, Baolin / University of Minnesota Twin Cities	$138,978

Publications

Guo, Bin; Wu, Baolin (2018) Reader reaction on the fast small-sample kernel independence test for microbiome community-level association analysis. Biometrics 74:1120-1124

Wu, Baolin; Pankow, James S (2018) Fast and Accurate Genome-Wide Association Test of Multiple Quantitative Traits. Comput Math Methods Med 2018:2564531

Guo, Bin; Wu, Baolin (2018) Statistical methods to detect novel genetic variants using publicly available GWAS summary data. Comput Biol Chem 74:76-79

Guo, Bin; Wu, Baolin (2018) Integrate multiple traits to detect novel trait-gene association using GWAS summary data with an adaptive test approach. Bioinformatics :

Guo, Bin; Wu, Baolin (2018) Powerful and efficient SNP-set association tests across multiple phenotypes using GWAS summary data. Bioinformatics :

Wu, Baolin; Pankow, James S (2017) Genome-wide association test of multiple continuous traits using imputed SNPs. Stat Interface 10:379-386

Wu, Baolin; Guan, Weihua; Pankow, James S (2016) On Efficient and Accurate Calculation of Significance P-Values for Sequence Kernel Association Testing of Variant Set. Ann Hum Genet 80:123-35

Wu, Baolin; Pankow, James S (2016) On Sample Size and Power Calculation for Variant Set-Based Association Tests. Ann Hum Genet 80:136-43

Wu, Baolin; Pankow, James S (2016) Sequence Kernel Association Test of Multiple Continuous Phenotypes. Genet Epidemiol 40:91-100

Wu, Baolin; Guan, Weihua (2015) Reader reaction on the generalized Kruskal-Wallis test for genetic association studies incorporating group uncertainty. Biometrics 71:556-7

Showing the most recent 10 out of 19 publications
