Longitudinal genetic studies provide a valuable resource for exploring the key genetic and environmental factors that affect complex traits over time. Genetic analysis of longitudinal data that incorporates trait variation over time is critical to understanding the genetic influences on, and biological variation of, complex diseases. In recent years, many genetic studies have been conducted in cohorts in which, in addition to genome sequence data, multiple measurements of a trait of interest are collected on each subject over a period of time. These studies not only provide a more accurate assessment of disease condition but also enable researchers to investigate the influence of genes on the trajectory of a trait and on disease progression. This project focuses on developing novel association testing methods for analyzing genome sequencing data at the gene level. The research will help provide insights into the underlying biology and progression of complex diseases.

In longitudinal genetic studies, and in data from the Electronic Medical Records and Genomics (eMERGE) network, phenotypic traits and genetic variants may be viewed as functional data. Functional data analysis (FDA) can serve as a valuable tool for exploring key genetic and environmental factors that affect complex traits over time. When a large number of rare variants is present, gene-based analysis is a more powerful tool for gene mapping than testing individual genetic variants. This project seeks to develop stochastic functional regression models and longitudinal sequence kernel association tests (LSKAT) to analyze longitudinal traits in population samples and in pedigree or cryptically related samples, and to analyze pleiotropic traits. FDA techniques and kernel-based approaches are used to reduce the high dimensionality of sequencing data and extract useful information. A variance-covariance structure, based on the theory of stochastic processes, is constructed to model the measurement variation and correlations of an individual's trait, and novel penalized spline models are used to estimate the trajectory mean function. The proposed methods and software will be tested and refined using real data sets and simulation studies. User-friendly software implementing the proposed methods will be developed and made publicly available.
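
The gene-based, kernel-style testing idea mentioned above can be illustrated with a minimal sketch. It is not the project's LSKAT method: it computes a generic cross-sectional SKAT-type score statistic for a single measurement per subject, assuming an intercept-only null model and a weighted linear kernel. The function name `skat_type_statistic`, the toy genotype matrix, and the weights are all hypothetical, and no p-value is computed.

```python
import numpy as np

def skat_type_statistic(y, G, weights=None):
    """Gene-level score statistic Q = sum_j (w_j * g_j' r)^2.

    y       : (n,) trait values, one per subject (assumption: a single time point)
    G       : (n, m) genotype matrix for the m variants in one gene
    weights : (m,) per-variant weights; defaults to 1 for every variant
    """
    y = np.asarray(y, dtype=float)
    G = np.asarray(G, dtype=float)
    if weights is None:
        weights = np.ones(G.shape[1])
    weights = np.asarray(weights, dtype=float)
    r = y - y.mean()               # residuals under an intercept-only null model
    s = weights * (G.T @ r)        # weighted score for each variant
    return float(s @ s)            # variants are pooled into one gene-level value

# Toy data (hypothetical): 3 subjects, 2 variants coded as minor-allele counts.
y = np.array([1.0, 2.0, 3.0])
G = np.array([[0, 1],
              [1, 0],
              [2, 0]])
Q = skat_type_statistic(y, G)      # Q = 5.0 for this toy input
```

Because the statistic sums squared per-variant scores, many rare variants contribute to a single test for the gene instead of each being tested separately, which is the motivation for gene-based analysis given in the abstract.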

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency
National Science Foundation (NSF)
Institute
Division of Mathematical Sciences (DMS)
Type
Standard Grant (Standard)
Application #
1916246
Program Officer
Gabor Szekely
Project Start
Project End
Budget Start
2019-08-01
Budget End
2022-07-31
Support Year
Fiscal Year
2019
Total Cost
$120,000
Indirect Cost
Name
Yale University
Department
Type
DUNS #
City
New Haven
State
CT
Country
United States
Zip Code
06520