This project provides a data science framework and a toolbox of best practices for systematic and reproducible data-driven methods for validating and deriving RDoC constructs with relevance to psychopathology. Despite recent advances in methods for data-driven constructs, results are often hard to reproduce using samples from other studies. There is a lack of systematic statistical methods and analytical design for enhancing reproducibility. To fill this gap, we will develop a data science framework, including novel scalable algorithms and software, to derive and validate RDoC constructs. Although the proposed methods will generally apply to all RDoC domains and constructs, we focus specifically on furthering understanding of the RDoC domains of cognitive control (CC) and attention (ATT) constructs implicated in attention deficit disorder (ADHD) and obsessive-compulsive disorder (OCD). Our application will use multi-modal neuroimaging, behavioral, and clinical/self-report data from large, nationally representative samples from the on Adolescent Brain Cognitive Development (ABCD) study and multiple local clinical samples with ADHD and OCD. Specifically, using the baseline ABCD samples, in aim 1, we will apply and develop methods to assess and validate the current configuration of RDoC for CC and ATT using confirmatory latent variable modeling. We will implement and develop new unsupervised learning methods to construct new computational-driven, brain-based domains from multi-modal image data.
In Aim 2, We will introduce network analysis (via Gaussian graphical models) to characterize heterogeneity in the interrelationship of RDoC measurements due to observed characteristics (i.e., age and sex). We will further model the heterogeneity of the population due to unobserved characteristics by introducing the data-driven precision phenotypes, which are the subgroup of participants with similar RDoC dimensions. We propose a Hierarchical Bayesian Generative Model and scalable algorithm for simultaneous dimension reduction and identify precision phenotypes. The model also serves as a tool to transfer information from the community sample ABCD to local clinical enriched studies.
In aim 3, we will utilize the follow-up samples from ABCD and local clinical enriched data sets to validate the results from Aims 1 and 2 and assess the clinical utility of the precision phenotypes in predicting psychological development in follow-up time. Our project will provide a suite of analytical tools to validate existing RDoC constructs and derive new, reproducible constructs by accounting for various sources of heterogeneity.

Public Health Relevance

To advance the understanding of psychopathology using dimensional constructs of measurements from multiple units of analysis, we propose reproducible statistical framework for validating and deriving RDoC constructs with relevance to psychopathology. We will use multi-modal neuroimaging, behavioral and clinical/self-report data from multiple samples to develop this framework. The design of our study consists of analyzing large, nationally representative samples, validating the results in local clinically enriched samples, and transfer information from the large community samples to local clinical samples.

National Institute of Health (NIH)
National Institute of Mental Health (NIMH)
Research Project (R01)
Project #
Application #
Study Section
Special Emphasis Panel (ZMH1)
Program Officer
Zehr, Julia L
Project Start
Project End
Budget Start
Budget End
Support Year
Fiscal Year
Total Cost
Indirect Cost
New York State Psychiatric Institute
New York
United States
Zip Code