Dependent data methods are widely used in practice to conduct analyses for hierarchical and longitudinal studies. Formally, their use depends on a variety of assumptions, for example, parametric distributional assumptions or assumptions about the sampling mechanism. The overarching goal of our research program is to quantify sensitivity to these assumptions, identifying scenarios in which they are and are not important and developing new, more robust methods when necessary. Our focus is on assumptions surrounding the sampling mechanism, including missing data and outcome dependent sampling. Our research has four aims.
Our first aim evaluates the performance of commonly used methods for avoiding cluster-level confounding. These methods may be sensitive to data that are missing at random (MAR) but not missing completely at random (MCAR).
Aims 2 through 4 consider outcome dependent sampling mechanisms that are frequently encountered in practice.
In Aim 2 we consider the use of parametric mixed models in settings such as outcome-dependent family studies.
Aim 3 considers the situation in which the timing of measurements may be dependent on previous results (e.g., more frequent exams in prostate cancer patients with rising prostate specific antigen (PSA) test results).
Aim 4 considers a general and unifying context in which to embed the work of Aims 2 and 3, derive optimality results, and develop a platform for the creation of new statistical methods for outcome-dependent sampling. Results from our research will directly inform current statistical practice for these commonly used methods.
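
The distinction between MCAR and MAR data in Aim 1 can be illustrated with a small simulation. The sketch below is not the method proposed here; it is a minimal illustration under assumed settings (a random-intercept model with unit variances, two measurements per subject, and hypothetical observation probabilities of 0.5, 0.8, and 0.2). It shows that a naive complete-case mean of the follow-up outcome stays near the truth when missingness is MCAR but drifts when missingness depends on the observed baseline outcome (MAR).

```python
import random
import statistics


def simulate(n_subjects=20000, seed=0):
    """Return (full, mcar, mar) means of the follow-up outcome.

    Assumed model (illustrative only): each subject has a random intercept b,
    a baseline outcome y1 = b + e1 that is always observed, and a follow-up
    outcome y2 = b + e2 whose true mean is 0.
    """
    rng = random.Random(seed)
    full, mcar, mar = [], [], []
    for _ in range(n_subjects):
        b = rng.gauss(0.0, 1.0)        # subject-level random intercept
        y1 = b + rng.gauss(0.0, 1.0)   # baseline outcome (always observed)
        y2 = b + rng.gauss(0.0, 1.0)   # follow-up outcome (may be missing)
        full.append(y2)

        # MCAR: follow-up is observed with a fixed probability,
        # unrelated to any data.
        if rng.random() < 0.5:
            mcar.append(y2)

        # MAR: the probability of observing the follow-up depends only on
        # the observed baseline outcome y1 (hypothetical probabilities).
        p_observed = 0.8 if y1 > 0 else 0.2
        if rng.random() < p_observed:
            mar.append(y2)

    return (
        statistics.fmean(full),
        statistics.fmean(mcar),
        statistics.fmean(mar),
    )


if __name__ == "__main__":
    full_mean, mcar_mean, mar_mean = simulate()
    print(f"full data mean:          {full_mean:.3f}")
    print(f"complete cases (MCAR):   {mcar_mean:.3f}")
    print(f"complete cases (MAR):    {mar_mean:.3f}")
```

In this assumed setup the shared intercept makes the two outcomes positively correlated, so preferentially observing subjects with a high baseline outcome also selects subjects with a high follow-up outcome. An analysis that models the baseline outcome jointly with the follow-up would be expected to reduce this bias; the complete-case mean does not.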

Public Health Relevance

Longitudinal and dependent data analysis methods are used in a wide variety of biomedical research investigations. If the assumptions on which the models are based are incorrect in important ways, misleading research conclusions could result. Knowing which assumptions the results are sensitive to, and by how much, is important to the proper use of these methods.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section
Biomedical Computing and Health Informatics Study Section (BCHI)
Program Officer
Feuer, Eric J
University of California San Francisco
Public Health & Prev Medicine
Schools of Medicine
San Francisco
United States
Zeng, Lily; Josephson, S Andrew; Fukuda, Keiko A et al. (2015) A Prospective Comparison of Informant-based and Performance-based Dementia Screening Tools to Predict In-Hospital Delirium. Alzheimer Dis Assoc Disord 29:312-6
Neuhaus, John M; Scott, Alastair J; Wild, Christopher J et al. (2014) Likelihood-based analysis of longitudinal data from outcome-related sampling designs. Biometrics 70:44-52
Saberi, Parya; Johnson, Mallory O; McCulloch, Charles E et al. (2011) Medication adherence: tailoring the analysis to the data. AIDS Behav 15:1447-53
McCulloch, Charles E; Neuhaus, John M (2011) Prediction of random effects in linear and generalized linear models under model misspecification. Biometrics 67:270-9
Neuhaus, John M; McCulloch, Charles E; Boylan, Ross (2010) A Note on Type II Error Under Random Effects Misspecification in Generalized Linear Mixed Models. Biometrics :
Pa, Judy; Boxer, Adam; Chao, Linda L et al. (2009) Clinical-neuroimaging characteristics of dysexecutive mild cognitive impairment. Ann Neurol 65:414-23