Early disease prevention, detection, and intervention are fundamental goals for advancing human health. Meanwhile, genetic risk is, for all intents and purposes, the earliest significant contributor to common, heritable disease risk. Thus, in theory, genetic profiling should be the ideal tool for early disease prevention. Yet genetic factors are rarely used directly to predict future disease risk. Rather, genetic information is typically relegated to phenotype-first scenarios: providing or confirming diagnoses for individuals with overt disease, or clarifying the genetic risk for individuals with a strong family history of disease. For modern genomics to make a significant impact on disease prevention, the use of genomic information must transition to a genotype-first approach: prediction of genetic disease risk in otherwise healthy individuals. A major barrier to this transition is our limited ability to predict the precise array of risks and the likely phenotypic expression of disease in an individual from genetic and other risk factors. The degree of disease risk and phenotypic expression conveyed to any single individual by genetic factors results from a complex interplay between direct and indirect genetic effects, other unmodifiable risk factors (age, gender, ancestry, family history), and intermediate modifiable risk factors (environment, behavior, laboratory values, health status, therapy status, etc.), many of which have their own direct genetic mediators. New approaches are required to dissect this interplay in order to personalize and contextualize the preventative actions that most effectively reduce overall disease risk. The overarching goal of this proposal is the development of innovative deep learning and machine learning approaches that integrate baseline genetic risk predictions with the measurement of traditional risk factors in order to provide more accurate and actionable predictions of disease risk.
By tying genetic risk to traditional risk factors, especially modifiable risk factors, we will enable actionability in two ways: by determining which preventative actions may be especially effective because they offset genetic risk, and by identifying modifiable risk factors that should be monitored and controlled proactively given increased genetic predisposition. To accomplish this goal, we propose to develop methods to: (1) infer the likely phenotypic expressivity of monogenic risk variants via a spatial covariance machine learning approach, (2) predict prevalent disease cases and the expected value of intermediate modifiable risk factors from polygenic and other unmodifiable risk factors, and finally (3) predict prevalent disease cases through interactions between baseline genetic expectations and observed (measured) intermediate modifiable risk factors in a deep learning framework. Adjusting age and modifiable risk factors in these trained models would then allow for the interactive projection of future disease risk and the identification of modifiable risk factors that, when manipulated, lead to the greatest change in future disease risk. We focus on the development of methods for coronary artery disease given its public health importance, the known utility of polygenic risk estimation, and the current evidence for polygene-by-environment interactions. In addition, the approach we propose integrates directly with current clinical decision support tools for coronary artery disease management. However, we will build a general framework that can be extended to any common, heritable, adult-onset condition, especially those with known heritable, traditional risk factors.
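
The idea of adjusting modifiable risk factors in a trained model to find the one whose change most reduces predicted risk can be illustrated with a minimal sketch. This is not the proposal's deep learning framework: it substitutes a small logistic model with a single polygene-by-risk-factor interaction term, and every name and coefficient below (COEF, risk, rank_interventions, the "ldl" and "sbp" factors) is a hypothetical chosen for illustration, not estimated from any data.

```python
import math

# Hypothetical coefficients, for illustration only (not fitted to any data).
COEF = {
    "intercept": -3.0,
    "prs": 0.5,        # standardized polygenic risk score
    "age": 0.04,       # per year of age above 50
    "ldl": 0.4,        # standardized LDL cholesterol (modifiable)
    "sbp": 0.3,        # standardized systolic blood pressure (modifiable)
    "prs_x_ldl": 0.2,  # assumed polygene-by-risk-factor interaction
}


def risk(prs, age, factors):
    """Predicted disease probability from a logistic model with one interaction."""
    z = COEF["intercept"] + COEF["prs"] * prs + COEF["age"] * (age - 50)
    z += COEF["ldl"] * factors["ldl"] + COEF["sbp"] * factors["sbp"]
    z += COEF["prs_x_ldl"] * prs * factors["ldl"]
    return 1.0 / (1.0 + math.exp(-z))


def rank_interventions(prs, age, factors, targets):
    """Rank modifiable factors by the risk reduction from moving each to its target.

    Each factor is changed one at a time while all others stay at their
    observed values; the result is sorted from largest to smallest reduction.
    """
    baseline = risk(prs, age, factors)
    reductions = {}
    for name, target in targets.items():
        counterfactual = dict(factors)
        counterfactual[name] = target
        reductions[name] = baseline - risk(prs, age, counterfactual)
    return sorted(reductions.items(), key=lambda item: item[1], reverse=True)


observed = {"ldl": 1.0, "sbp": 1.0}  # one SD above the mean for both factors
targets = {"ldl": 0.0, "sbp": 0.0}   # bring each factor back to the mean

high_prs_ranking = rank_interventions(2.0, 60, observed, targets)
low_prs_ranking = rank_interventions(-2.0, 60, observed, targets)
print("high genetic risk:", high_prs_ranking)
print("low genetic risk: ", low_prs_ranking)
```

Under these assumed coefficients, the same change in LDL cholesterol yields a larger risk reduction for the individual with high polygenic risk than for the one with low polygenic risk, so the top-ranked intervention differs between the two. Age can be varied in the same way to project risk forward in time.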

Public Health Relevance

Significant investment has been made in the identification of genetic risk factors for common diseases. Yet, outside of family history, genetic risk is almost never included in routine clinical risk assessments. The goal of this proposal is to develop deep learning methods for the integration of genetic and traditional risk factors into comprehensive disease risk prediction.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project (R01)
Project #
1R01HG010881-01A1
Application #
10051655
Study Section
Genetics of Health and Disease Study Section (GHD)
Program Officer
Sofia, Heidi J
Project Start
2020-08-21
Project End
2024-05-31
Budget Start
2020-08-21
Budget End
2021-05-31
Support Year
1
Fiscal Year
2020
Total Cost
Indirect Cost
Name
Scripps Research Institute
Department
Type
DUNS #
781613492
City
La Jolla
State
CA
Country
United States
Zip Code
92037