Methods for genomic data with graphical structures

Lee, Hongzhe

Abstract

The broad, long-term objective of this project concerns the development of novel statistical methods and computational tools for statistical and probabilistic modeling of genomic data motivated by important biological questions and experiments.
The specific aim of the current project is to develop new statistical models and methods for the analysis of genomic data with graphical structures, focusing on methods for analyzing genetic pathways and networks. These include nonparametric pathway-smooth tests for two-sample and analysis of variance problems, which identify pathways with perturbed activity between two or more experimental conditions, and group Lasso and group threshold gradient descent regularized estimation procedures for pathway-smoothed generalized linear models, Cox proportional hazards models, and accelerated failure time models, which identify pathways related to various clinical phenotypes. These methods hinge on a novel integration of spectral graph theory, nonparametric methods for the analysis of multivariate data, and regularized estimation methods from statistical learning. The new methods can be applied to different types of genomic data and will ideally facilitate the identification of genes and biological pathways underlying various complex human diseases and complex biological processes. The project will also investigate the robustness, power, and efficiency of these methods and compare them with existing methods. In addition, the project will develop practical and feasible computer programs to implement the proposed methods and will evaluate their performance through application to real data from microarray gene expression studies of human heart failure, cardiac allograft rejection, and neuroblastoma. The proposed work will contribute statistical methodology for modeling genomic data with graphical structures, for studying complex phenotypes and biological systems, and for high-dimensional data analysis, and will offer insight into each of the clinical areas represented by the data sets used to evaluate these new methods.
All programs developed under this grant and detailed documentation will be made available free-of-charge to interested researchers via the World Wide Web.
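
As a rough illustration of two ingredients the abstract names, spectral graph theory and group Lasso regularization, the following Python sketch smooths a gene-level signal over a toy pathway graph and applies group-wise shrinkage to a coefficient vector. It is not the project's software: the function names, the four-gene path graph, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def graph_laplacian(adjacency):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency

def spectral_smooth(x, laplacian, k):
    """Project x onto the k Laplacian eigenvectors with the smallest
    eigenvalues, i.e. the smoothest directions on the graph."""
    _, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    basis = eigvecs[:, :k]
    return basis @ (basis.T @ x)

def group_soft_threshold(beta, groups, lam):
    """Proximal operator of lam * sum_g ||beta_g||_2 (group Lasso penalty).
    Each group is shrunk toward zero as a block, or zeroed entirely."""
    out = np.zeros_like(beta, dtype=float)
    for idx in groups:
        norm = np.linalg.norm(beta[idx])
        if norm > lam:
            out[idx] = (1.0 - lam / norm) * beta[idx]
    return out

# Hypothetical toy pathway: four genes connected in a chain 0-1-2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = graph_laplacian(A)

# Hypothetical expression values for the four genes.
x = np.array([1.0, 3.0, 2.0, 4.0])
smoothed = spectral_smooth(x, L, k=2)

# Hypothetical coefficients split into two gene groups (pathways).
beta = np.array([3.0, 4.0, 0.1, 0.2])
shrunk = group_soft_threshold(beta, [[0, 1], [2, 3]], lam=1.0)
```

In this sketch, keeping only the low-frequency Laplacian eigenvectors favors signals that vary slowly across connected genes, and the group threshold keeps or discards whole pathways at once instead of individual genes.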

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Project (R01)
Project #: 5R01CA127334-04
Application #: 7798186
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Li, Jerry

Project Start: 2007-07-01
Project End: 2012-04-30
Budget Start: 2010-05-01
Budget End: 2012-04-30
Support Year: 4
Fiscal Year: 2010
Total Cost: $289,814
Indirect Cost

Institution

Name: University of Pennsylvania
Department: Biostatistics & Other Math Sci
Type: Schools of Medicine
DUNS #: 042250712

City: Philadelphia
State: PA
Country: United States
Zip Code: 19104

Related projects


NIH 2016 R01 CA	Methods for genomic data with graphical structures Lee, Hongzhe / University of Pennsylvania	$293,687
NIH 2015 R01 CA	Methods for genomic data with graphical structures Lee, Hongzhe / University of Pennsylvania
NIH 2014 R01 CA	Methods for genomic data with graphical structures Lee, Hongzhe / University of Pennsylvania
NIH 2013 R01 CA	Methods for genomic data with graphical structures Lee, Hongzhe / University of Pennsylvania	$281,354
NIH 2012 R01 CA	Methods for genomic data with graphical structures Lee, Hongzhe / University of Pennsylvania	$304,000
NIH 2010 R01 CA	Methods for genomic data with graphical structures Lee, Hongzhe / University of Pennsylvania	$289,814
NIH 2009 R01 CA	Methods for genomic data with graphical structures Lee, Hongzhe / University of Pennsylvania	$290,671
NIH 2008 R01 CA	Methods for genomic data with graphical structures Lee, Hongzhe / University of Pennsylvania	$291,451
NIH 2007 R01 CA	Methods for genomic data with graphical structures Lee, Hongzhe / University of Pennsylvania	$292,160

Publications

Vajravelu, Ravy K; Scott, Frank I; Mamtani, Ronac et al. (2018) Medication class enrichment analysis: a novel algorithm to analyze multiple pharmacologic exposures simultaneously using electronic health record data. J Am Med Inform Assoc 25:780-789

Xia, Yin; Cai, Tianxi; Cai, T Tony (2018) Multiple Testing of Submatrices of a Precision Matrix with Applications to Identification of Between Pathway Interactions. J Am Stat Assoc 113:328-339

Sohn, Michael B; Li, Hongzhe (2018) A GLM-based latent variable ordination method for microbiome samples. Biometrics 74:448-457

Chen, Eric Z; Bushman, Frederic D; Li, Hongzhe (2017) A Model-Based Approach For Species Abundance Quantification Based On Shotgun Metagenomic Data. Stat Biosci 9:13-27

Shi, Pixu; Li, Hongzhe (2017) A model for paired-multinomial data and its application to analysis of data on a taxonomic tree. Biometrics 73:1266-1278

Zhao, Sihai Dave; Cai, T Tony; Cappola, Thomas P et al. (2017) Sparse simultaneous signal detection for identifying genetically controlled disease genes. J Am Stat Assoc 112:1032-1046

Liao, Katherine P; Sparks, Jeffrey A; Hejblum, Boris P et al. (2017) Phenome-Wide Association Study of Autoantibodies to Citrullinated and Noncitrullinated Epitopes in Rheumatoid Arthritis. Arthritis Rheumatol 69:742-749

Zhao, Sihai Dave; Cai, T Tony; Li, Hongzhe (2017) Optimal detection of weak positive latent dependence between two sequences of multiple tests. J Multivar Anal 160:169-184

Chen, Eric Z; Li, Hongzhe (2016) A two-part mixed-effects model for analyzing longitudinal microbiome compositional data. Bioinformatics 32:2611-7

Cai, T Tony; Li, Hongzhe; Liu, Weidong et al. (2016) Joint Estimation of Multiple High-dimensional Precision Matrices. Stat Sin 26:445-464

Showing the most recent 10 out of 63 publications
