A poor understanding of the heterogeneity of many complex diseases prevents their accurate early diagnosis and targeted interventions focused on etiology. Of particular concern, many subtypes exist among persons in the early stages of neurodegenerative diseases, each subtype with distinct contributing causes and phenotypes. Accurately diagnosing and predicting rates of progression for these illnesses will be essential for any disease-modifying treatments. To overcome this barrier, we believe it is critically important to develop and apply an innovative method of latent class analysis -- incorporating both (1) the longitudinal trajectories of a high-dimensional collection of clinical and biomarker information, and (2) the times to specific outcomes when such data are available -- in order to arrive at subclassifications that are relevant to the underlying etiologies and the rate of disease progression. Unlike current methods of latent class analysis, our new method is scalable and requires minimal modeling assumptions. In the short term, we will target the heterogeneity of mild cognitive impairment (MCI), the first clinically detectable manifestation of the intermediate stage between normal aging and dementia. We will integrate the information in two existing longitudinal data sets of persons with MCI: the National Alzheimer's Coordinating Center's Uniform Data Set (UDS), a unique resource with 29 participating NIH-funded Alzheimer's Disease Centers contributing standardized clinical and neuropathological variables on over 6500 unique MCI individuals; and the Emory Neurology-Cognitive Data Set (NeuCog), which in addition to comprehensive clinical information also contributes standardized biomarkers on 1015 unique MCI individuals with MRI scans and 529 with cerebrospinal fluid (CSF) specimens.
The specific aims of this study are to: (1) Develop a scalable method of latent trajectory class analysis that allows the researcher to model only the means, variances, and temporal correlations of the longitudinal observations; (2) Extend the method developed in Aim 1 for researchers to incorporate the times to specific clinical or neuropathological outcomes, subject to complex survival features, into the latent class analysis; (3) Apply our new statistical methods under the guidance of expert clinical scientists, using the information available in the UDS and NeuCog data sets, to identify clinicopathologically relevant subtypes of MCI; and (4) Develop freely available software to analyze data using our new statistical methods.

Public Health Relevance

Accurate early diagnosis and the ability to predict rates of progression will be essential for any disease-modifying treatments of complex neurodegenerative illnesses. A critical barrier is the extensive and poorly understood heterogeneity of these diseases. The goal of this research is to develop and apply scalable and robust statistical methods that integrate the comprehensive clinical and biological information from two existing longitudinal data sets in order to more precisely understand the relevant subtypes among individuals at an intermediate stage between normal aging and dementia.

National Institutes of Health (NIH)
National Institute on Aging (NIA)
Research Project (R01)
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Anderson, Dallas
Emory University
Biostatistics & Other Math Sci
Schools of Public Health
United States