Proteins play major roles as biological effectors and diagnostic markers. One level of their complexity arises from post-translational modifications, which cannot be detected at the genome level and make it desirable to measure proteins directly. Recently, several new protein microarray technologies have emerged for this purpose. We focus on reverse-phase protein lysate arrays, which allow us to quantify the relative expression levels of a protein in many different cellular samples simultaneously. One advantage of this technology is that it requires only a small number of cells and just one antibody binding step. However, protein lysate arrays are more challenging to analyze than DNA arrays, and at present their applications remain in the exploratory stage, lacking reliable statistical tools for quantifying the information (including the uncertainty) from protein arrays. We find that it is difficult, if possible at all, to model all the samples with a simple parametric family of response curves. We propose a robust approach to quantifying protein lysate arrays by fitting a monotone nonparametric response curve to all samples on the same array. The proposed method has been shown to fit the data more adaptively, avoiding bias due to parameterization.
We aim to incorporate modern shrinkage ideas from statistics into the nonparametric approach, leading to more stable quantification in time course experiments where the number of replicates at each time point is small. We also propose to use the wild bootstrap to assess the uncertainty of the protein concentration estimates and the influence of such uncertainties on follow-up analyses. When completed, our research will enable more reliable analysis of protein lysate arrays and provide feedback to chip makers to improve the design of protein microarrays, both of which are essential to making lysate arrays a useful tool in biological and medical research.
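
As a rough illustration of two ideas named in the abstract, the sketch below fits a monotone (nondecreasing) curve by the pool-adjacent-violators algorithm and then applies a wild bootstrap to the residuals to obtain pointwise percentile bands. It is not the project's quantification method: the function names (`pava`, `wild_bootstrap`, `percentile_band`), the toy intensities, the Rademacher (+1/-1) weights, and the assumption that observations are already ordered by increasing concentration are all illustrative choices made here.

```python
import random


def pava(y):
    """Least-squares nondecreasing fit to y (pool-adjacent-violators)."""
    blocks = []  # each block stores [sum of values, number of values]
    for v in y:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fit = []
    for s, n in blocks:
        fit.extend([s / n] * n)
    return fit


def wild_bootstrap(y, n_boot=200, seed=0):
    """Refit the monotone curve on residuals flipped by random signs."""
    rng = random.Random(seed)
    fit = pava(y)
    resid = [yi - fi for yi, fi in zip(y, fit)]
    draws = []
    for _ in range(n_boot):
        # Rademacher weights: each residual keeps or flips its sign.
        y_star = [fi + ri * rng.choice((-1.0, 1.0))
                  for fi, ri in zip(fit, resid)]
        draws.append(pava(y_star))
    return fit, draws


def percentile_band(draws, lo=0.025, hi=0.975):
    """Pointwise percentile interval across bootstrap refits."""
    bands = []
    for j in range(len(draws[0])):
        col = sorted(d[j] for d in draws)
        bands.append((col[int(lo * (len(col) - 1))],
                      col[int(hi * (len(col) - 1))]))
    return bands


# Toy intensities, assumed ordered by increasing concentration (hypothetical).
y = [0.1, 0.4, 0.3, 0.9, 0.8, 1.5]
fit, draws = wild_bootstrap(y)
bands = percentile_band(draws)
for f, (low, high) in zip(fit, bands):
    print(f"{f:.3f}  [{low:.3f}, {high:.3f}]")
```

Because the wild bootstrap rescales each residual in place rather than resampling residuals across positions, it tolerates noise whose variance changes along the curve, which is one plausible reason to prefer it when only a few replicates are available per point.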

Public Health Relevance

Successful completion of the proposed research will lead to efficient and effective statistical and computing tools for analyzing protein lysate array data. Such data have wide-ranging applications in biomedical and public health research, as evidenced by the recent discovery of target proteins in signal pathway profiling related to prostate cancer. These tools are needed to support better applications of protein lysate array technology in clinical and biomedical research.

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Exploratory/Developmental Grants (R21)
Project #: 1R21CA129671-01A1
Application #: 7659879
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Li, Jerry
Project Start: 2009-08-01
Project End: 2011-07-31
Budget Start: 2009-08-01
Budget End: 2010-07-31
Support Year: 1
Fiscal Year: 2009
Total Cost: $181,024
Indirect Cost:

Name: University of Texas MD Anderson Cancer Center
Department: Biostatistics & Other Math Sci
Type: Other Domestic Higher Education
DUNS #: 800772139
City: Houston
State: TX
Country: United States
Zip Code: 77030

Hu, Jianhua; Zhang, Liwen; Wang, Huixia Judy (2016) Sequential model selection-based segmentation to detect DNA copy number variation. Biometrics 72:815-26
Hu, Jianhua; Wang, Peng; Qu, Annie (2015) Estimating and Identifying Unspecified Correlation Structure for Longitudinal Data. J Comput Graph Stat 24:455-476
Maadooliat, Mehdi; Huang, Jianhua Z; Hu, Jianhua (2015) Integrating Data Transformation in Principal Components Analysis. J Comput Graph Stat 24:84-103
Jung, Yoonsuh; Huang, Jianhua Z; Hu, Jianhua (2014) Biomarker Detection in Association Studies: Modeling SNPs Simultaneously via Logistic ANOVA. J Am Stat Assoc 109:1355-1367
Maadooliat, Mehdi; Huang, Jianhua Z; Hu, Jianhua (2012) Analyzing multiple-probe microarray: estimation and application of gene expression indexes. Biometrics 68:784-92
Li, Bin; Liang, Feng; Hu, Jianhua et al. (2012) Reno: regularized non-parametric analysis of protein lysate array data. Bioinformatics 28:1223-9
Yang, Ji-Yeon; He, Xuming (2011) A multistep protein lysate array quantification method and its statistical properties. Biometrics 67:1197-205
Wang, Huixia Judy; Hu, Jianhua (2011) Identification of differential aberrations in multiple-sample array CGH studies. Biometrics 67:353-62
Lee, Seokho; Huang, Jianhua Z; Hu, Jianhua (2010) Sparse logistic principal components analysis for binary data. Ann Appl Stat 4:1579-1601
Broglio, Steven P; Schnebel, Brock; Sosnoff, Jacob J et al. (2010) Biomechanical properties of concussions in high school football. Med Sci Sports Exerc 42:2064-71