This proposal reflects our continuing efforts to solve problems of measurement error, correlated data, and longitudinal/functional (curve) data in general regression settings. With advances in technology, data of higher dimension and more complex structure are generated daily. It is common practice to take existing procedures that have been applied to data with similar structure and adopt them directly for these new studies. Nevertheless, under certain circumstances, this practice can lead to ineffective analyses or even misleading conclusions. The investigators will place this practice within a measurement error modeling framework and evaluate its effectiveness and its potential drawbacks in terms of inducing non-negligible biases. The knowledge gained will allow researchers to develop suitable modeling strategies and new statistical methods that best exploit the information embedded in the data. The proposed research topics have arisen naturally from several important studies. These studies include (i) a long-term longitudinal study of the effects of lifelong risk exposure on health conditions later in life, (ii) dietary and nutritional measurements and metabolites, measured by multiple devices, from subjects of diverse backgrounds, (iii) multi-platform genomic datasets, including microRNA, polysomal and total mRNA, collected from the same subjects at the same time for the purpose of investigating colon cancer tumorigenesis, and (iv) a spectroscopic oblique incidence reflectometry skin-lesion diagnostic study. A shared objective behind these research projects is to advance understanding of the information embedded in the data and, consequently, to enhance disease prevention and early detection. The major focus of this proposal remains the development of intuitive and practical models, as well as efficient and computationally feasible methods, without imposing unnecessary parametric assumptions.
Through a series of aims, this research project will provide new modeling strategies and statistical methods that (i) combine modeling considerations with variable selection techniques to identify the time periods or locations where changes or treatment effects exist, (ii) use a new mixture modeling strategy to describe, flexibly yet effectively, the distributions of features/variables of sub-populations, (iii) borrow information effectively from seemingly unrelated observations through their correlations while maintaining the interpretability of outcomes, and finally (iv) use measurement error modeling considerations to link disease outcomes effectively to latent features of correlated functional or longitudinal predictors. We expect that our efforts in producing new statistical methods and applying them to important biomedical studies will have a significant impact on advances in biological and medical research.

Public Health Relevance

We will develop novel modeling strategies and statistical methods to address challenges arising from the analysis of several important biomedical studies. Our methods aim to improve the precision of data analyses and to draw valid conclusions despite the use of measurements that are imprecise because of practical limitations. A shared objective behind the proposed research projects is to provide information for obesity research, cancer prevention, and early detection. Our findings will also improve understanding of key changes in young adulthood or mid-life that could have significant impacts on health conditions later in life.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Dunn, Michelle C
University of Michigan Ann Arbor
Schools of Arts and Sciences
Ann Arbor
United States