The levels of circulating metabolites, and the interactions between these metabolites, are a function of variation in genes, proteins and the environment. As such, the metabolome provides an extremely detailed profile of the physiological state of cells. By integrating measurements of the identities and abundances of these metabolites, we can build a more comprehensive picture of the causes and consequences of disease that cannot be explained by genetics alone. Thus, it is increasingly recognized that the metabolome will become a crucial element in systems-level biomedical research. However, analyzing metabolomic data is exceptionally difficult, because the measurements are inherently noisy and involve complex high-dimensional interactions and relatively small sample sizes. To make sense of this rich data source, we need new, more powerful statistical tools. The work proposed here will generate a suite of statistical methods for estimating correlations across multiple groups of measurements, an approach we term "Multi-group spiked covariance models" (MSCov). These models achieve unprecedented accuracy by utilizing known relationships in the data (e.g., disease status, genotype, and age) to estimate correlations between metabolites. Our approach also limits false discoveries by including precise uncertainty quantification for all of the values being estimated. With MSCov, we establish a powerful framework that can be used to infer differential changes in metabolic pathways across clinical groups, to identify potential biomarkers for disease prediction and to propose targets for therapeutic drugs. To illustrate the utility of MSCov, this project will analyze a new dataset of metabolomic and lipidomic profiles from the cerebrospinal fluid of 180 human subjects, half of whom are diagnosed with neurodegenerative disease, including both APO-ε4-positive and APO-ε4-negative individuals.
When applied to these data, the methods developed here should identify novel components of metabolic pathways associated with Alzheimer's disease and Parkinson's disease. These new methods and data will be made available to the research community. This project focuses on tools for better understanding the metabolomic causes and consequences of neurodegenerative disease. However, the tools developed here will enhance our ability to compare metabolomic profiles among any groups.

Public Health Relevance

Recent studies have demonstrated the value of using metabolomic data in systems-level biomedical analyses. However, existing statistical models do not allow us to fully realize the diagnostic and predictive power of the metabolome. The work proposed here develops new statistical models that resolve existing challenges, and will apply these more powerful methods to identify diagnostic signals in cerebrospinal fluid from patients with Alzheimer's disease and Parkinson's disease.

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Small Research Grants (R03)
Project #: 1R03CA211160-01
Application #: 9218404
Study Section: Special Emphasis Panel (ZRG1-BST-U (50)R)
Program Officer: Spalholz, Barbara A
Project Start: 2016-09-15
Project End: 2017-08-31
Budget Start: 2016-09-15
Budget End: 2017-08-31
Support Year: 1
Fiscal Year: 2016
Total Cost: $145,129
Indirect Cost: $45,129

Name: University of Washington
Department: Pathology
Type: Schools of Medicine
DUNS #: 605799469
City: Seattle
State: WA
Country: United States
Zip Code: 98195

Hoffman, Jessica M; Lyu, Yang; Pletcher, Scott D et al. (2017) Proteomics and metabolomics in ageing research: from biomarkers to systems biology. Essays Biochem 61:379-388
Ma, Jing; Shojaie, Ali; Michailidis, George (2016) Network-based pathway enrichment analysis with incomplete network information. Bioinformatics 32:3165-3174