This proposal reflects our continuing efforts to solve problems of measurement error, correlated data and longitudinal/functional (curve) data in general regression settings. With advances in technology, data of higher dimension and more complex structure are generated daily. It is common practice to apply to these new studies existing procedures that were developed for data with similar structure. Nevertheless, under certain circumstances, this practice could lead to ineffective analyses or even misleading conclusions. The investigators of this proposal will put such practice into a framework of measurement error modeling and evaluate its effectiveness and potential drawbacks in terms of inducing non-negligible biases. The knowledge gained will allow researchers to develop suitable modeling strategies and new statistical methods that best exploit the information embedded in the data. The proposed research topics have arisen naturally from several important studies. These studies include (i) a long-term longitudinal study with the goal of studying effects of life-long risk exposure on health conditions later in life, (ii) nutritional dietary measurements and metabolites, measured by multiple devices, from subjects of diverse backgrounds, (iii) multi-platform genomic datasets, including microRNA, polysomal and total mRNA, collected from the same subjects at the same time for the purpose of investigating colon cancer tumorigenesis, and (iv) a spectroscopic oblique incidence reflectometry skin-lesion diagnostic study. A shared objective behind these research projects is to advance understanding of the information embedded in the data and consequently to enhance disease prevention and early detection. The major focus of this proposal remains the development of intuitive and practical models as well as efficient and computationally feasible methods that do not impose unnecessary parametric assumptions.
Through a series of aims, this research project will provide new modeling strategies and statistical methods that (i) utilize both modeling considerations and variable selection techniques to identify suitable time periods or effective locations where changes or treatment effects exist; (ii) utilize a new mixture modeling strategy to flexibly yet effectively describe distributions of features/variables of sub-populations; (iii) effectively borrow information from seemingly unrelated observations through correlations while maintaining interpretability of outcomes; and finally (iv) utilize measurement-error modeling considerations to effectively link disease outcomes to latent features of correlated functional or longitudinal predictors. We expect our efforts in producing new statistical methods and applying them to important biomedical studies to have a significant impact on advancements in biological and medical research.

Public Health Relevance

We will develop novel modeling strategies and statistical methods to address challenges arising from analyses of several important biomedical studies. Our methods aim to improve the precision of data analyses and to draw valid conclusions despite the use of imprecise measurements due to practical limitations. A shared objective behind the proposed research projects is to provide information for obesity research, cancer prevention and early detection. Our findings will also provide an understanding of key changes that happen in young adulthood or mid-life and that could have significant impacts on health conditions later in life.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Mariotto, Angela B
University of Michigan Ann Arbor
Schools of Arts and Sciences
Ann Arbor
United States
Li, Yun; Zhu, Ji; Wang, Naisyin (2015) Regularized Semiparametric Estimation for Ordinary Differential Equations. Technometrics 57:341-350
Jiang, Bei; Wang, Naisyin; Sammel, Mary D et al. (2015) Modeling Short- and Long-Term Characteristics of Follicle Stimulating Hormone as Predictors of Severe Hot Flashes in Penn Ovarian Aging Study. J R Stat Soc Ser C Appl Stat 64:731-753
Jiang, Bei; Elliott, Michael R; Sammel, Mary D et al. (2015) Joint modeling of cross-sectional health outcomes and longitudinal predictors via mixtures of means and variances. Biometrics 71:487-97
Jiang, Bei; Sammel, Mary D; Freeman, Ellen W et al. (2015) Bayesian estimation of associations between identified longitudinal hormone subgroups and age at final menstrual period. BMC Med Res Methodol 15:106
Mukherjee, A; Chen, K; Wang, N et al. (2015) On the degrees of freedom of reduced-rank estimators in multivariate regression. Biometrika 102:457-477
Hu, Zonghui; Follmann, Dean A; Wang, Naisyin (2014) Estimation of mean response via effective balancing score. Biometrika 101:613-624
Li, Yehua; Wang, Naisyin; Carroll, Raymond J (2013) Selecting the Number of Principal Components in Functional Data. J Am Stat Assoc 108:
Zhou, Jianhui; Wang, Nae-Yuh; Wang, Naisyin (2013) Functional Linear Model with Zero-value Coefficient Function at Sub-regions. Stat Sin 23:25-50
Cho, Youngmi; Kim, Hyemee; Turner, Nancy D et al. (2011) A chemoprotective fish oil- and pectin-containing diet temporally alters gene expression profiles in exfoliated rat colonocytes throughout oncogenesis. J Nutr 141:1029-35
Kuskie, Kyle R; Smith, Jacqueline L; Wang, Naisyin et al. (2011) Effects of location for collection of air samples on a farm and time of day of sample collection on airborne concentrations of virulent Rhodococcus equi at two horse breeding farms. Am J Vet Res 72:73-9
