Low-rank Approximation to Probe-level Data with Application to Exon Tiling Arrays

Hu, Jianhua; He, Xuming

Abstract

Findings from the Human Genome Project highlight the intricacy of interactions between cell regulation, genes and proteins. It is generally understood that biological functions and activities are governed by subsets of genes interacting with proteins in a highly controlled manner. High-throughput technologies such as microarrays are valuable for studying a large number of biological components simultaneously, but sound conclusions from these technologies depend on appropriate statistical analyses of the genomic and proteomic data. The long-term objective of this proposal is to develop statistical tools to explore gene/protein interactions and to discover how these interactions function in biological activities (e.g., induction of a disease phenotype). This proposal concerns the analysis of short oligonucleotide data, as in GeneChip studies and exon tiling arrays. Low-rank approximations to the expression data matrices play a central role in the proposed research.
The specific aims are: (1) to develop a fast and robust algorithm for low-rank approximation of a data matrix that is subject to outliers; (2) to develop diagnostic tools and statistical tests for determining whether a low-rank representation is adequate to capture gene expression profiles; (3) to develop both nonparametric and likelihood-based approaches for flagging and detecting alternative splicing with exon tiling arrays. Singular value decomposition is a starting point for the proposed work towards those specific aims. Alternating robust (outlier-resistant) regression methods will be used for Aims (1) and (3). Likelihood-based and data-adaptive methods will be developed for Aims (2) and (3). The proposed research distinguishes itself from most of the existing statistical work on microarray data in that it focuses on probe-level rather than gene-level data. The investigators believe that the standard uni-dimensional summary of gene expression data could lead to loss of important information.
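As an illustration of the general idea behind Aim (1), the following is a minimal sketch of a rank-one approximation fitted by alternating robust regression, started from the singular value decomposition. It is not the investigators' algorithm: the choice of the L1 (least absolute deviations) loss, the weighted-median solver, and the function names `weighted_median`, `l1_slopes` and `robust_rank1` are all assumptions made for this example. The rows of `X` can be thought of as probes and the columns as arrays, which is also an assumption.

```python
import numpy as np


def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight."""
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cum = np.cumsum(w)
    return v[np.searchsorted(cum, 0.5 * cum[-1])]


def l1_slopes(X, v):
    """For each row x of X, minimize sum_j |x_j - u * v_j| over the scalar u.

    Since |x_j - u * v_j| = |v_j| * |x_j / v_j - u|, a minimizer is the
    weighted median of the ratios x_j / v_j with weights |v_j|.
    """
    keep = np.abs(v) > 1e-12  # columns with v_j = 0 do not constrain u
    if not keep.any():
        return np.zeros(X.shape[0])
    ratios = X[:, keep] / v[keep]
    w = np.abs(v[keep])
    return np.array([weighted_median(r, w) for r in ratios])


def robust_rank1(X, n_iter=50, tol=1e-8):
    """Illustrative L1 rank-one fit X ~ outer(u, v) by alternating regression."""
    # Starting point: leading singular triple of X.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    u, v = U[:, 0] * s[0], Vt[0].copy()
    for _ in range(n_iter):
        u_new = l1_slopes(X, v)        # update row effects with v held fixed
        v_new = l1_slopes(X.T, u_new)  # update column effects with u held fixed
        norm = np.linalg.norm(v_new)
        if norm == 0:
            break
        # Rescale so that v has unit length; the product outer(u, v) is unchanged.
        u_new, v_new = u_new * norm, v_new / norm
        change = np.linalg.norm(np.outer(u_new, v_new) - np.outer(u, v))
        u, v = u_new, v_new
        if change < tol:
            break
    return u, v
```

Each half-step solves its block of the L1 problem exactly, so the sum of absolute residuals cannot increase from the SVD starting value. Like other alternating schemes, the iteration may stop at a local solution, and a practical method would also need to handle ranks above one and the choice of loss function.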

Public Health Relevance

Successful completion of the proposed research will lead to efficient and effective statistical tools for analyzing microarray data that have wide-ranging applications in biomedical and public health research, as evidenced by the recent discovery of target genes for cervical cancer and prostate cancer. Those tools are needed to support better applications of microarray technology in clinical and biomedical research.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM080503-03
Application #: 7860383
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Brazhnik, Paul

Project Start: 2008-07-11
Project End: 2012-05-31
Budget Start: 2010-06-01
Budget End: 2011-05-31
Support Year: 3
Fiscal Year: 2010
Total Cost: $208,665
Indirect Cost

Institution

Name: University of Texas MD Anderson Cancer Center
Department: Biostatistics & Other Math Sci
Type: Other Domestic Higher Education
DUNS #: 800772139

City: Houston
State: TX
Country: United States
Zip Code: 77030

Related projects


NIH 2011 R01 GM	Low-rank Approximation to Probe-level Data with Application to Exon Tiling Arrays Hu, Jianhua; He, Xuming / University of Texas MD Anderson Cancer Center	$206,533
NIH 2010 R01 GM	Low-rank Approximation to Probe-level Data with Application to Exon Tiling Arrays Hu, Jianhua; He, Xuming / University of Texas MD Anderson Cancer Center	$208,665
NIH 2009 R01 GM	Low-rank Approximation to Probe-level Data with Application to Exon Tiling Arrays Hu, Jianhua; He, Xuming / University of Texas MD Anderson Cancer Center	$210,824
NIH 2009 R01 GM	Low-rank Approximation to Probe-level Data with Application to Exon Tiling Arrays Hu, Jianhua; He, Xuming / University of Texas MD Anderson Cancer Center	$364,139
NIH 2008 R01 GM	Low-rank Approximation to Probe-level Data with Application to Exon Tiling Arrays Hu, Jianhua; He, Xuming / University of Texas MD Anderson Cancer Center	$221,515

Publications

Hu, Jianhua; Zhang, Liwen; Wang, Huixia Judy (2016) Sequential model selection-based segmentation to detect DNA copy number variation. Biometrics 72:815-26

Hu, Jianhua; Zhu, Hongjian; Hu, Feifang (2015) A Unified Family of Covariate-Adjusted Response-Adaptive Designs Based on Efficiency and Ethics. J Am Stat Assoc 110:357-367

Hu, Jianhua; Wang, Peng; Qu, Annie (2015) Estimating and Identifying Unspecified Correlation Structure for Longitudinal Data. J Comput Graph Stat 24:455-476

Jung, Yoonsuh; Hu, Jianhua (2015) A K-fold Averaging Cross-validation Procedure. J Nonparametr Stat 27:167-179

Maadooliat, Mehdi; Huang, Jianhua Z; Hu, Jianhua (2015) Integrating Data Transformation in Principal Components Analysis. J Comput Graph Stat 24:84-103

Jung, Yoonsuh; Huang, Jianhua Z; Hu, Jianhua (2014) Biomarker Detection in Association Studies: Modeling SNPs Simultaneously via Logistic ANOVA. J Am Stat Assoc 109:1355-1367

Maadooliat, Mehdi; Huang, Jianhua Z; Hu, Jianhua (2012) Analyzing multiple-probe microarray: estimation and application of gene expression indexes. Biometrics 68:784-92

Li, Bin; Liang, Feng; Hu, Jianhua et al. (2012) Reno: regularized non-parametric analysis of protein lysate array data. Bioinformatics 28:1223-9

Wang, Huixia Judy; Hu, Jianhua (2011) Identification of differential aberrations in multiple-sample array CGH studies. Biometrics 67:353-62

Li, Chenxi; Wei, Ying; Chappell, Rick et al. (2011) Bent line quantile regression with application to an allometric study of land mammals' speed and mass. Biometrics 67:242-9

Showing the most recent 10 out of 20 publications
