Large-scale genetic datasets have revolutionized human genetic research. In the past decade, genome-wide association studies (GWAS) have identified numerous genetic variants associated with complex traits. These discoveries have informed new biology and led to novel therapeutics. Yet most studies have focused on European samples. As the next step, consortia have begun to aggregate datasets from diverse non-European populations. Most of these studies aggregate summary association statistics and perform meta-analysis instead of aggregating individual-level data, an approach that is easier to implement, equally powerful, and more protective of participants' privacy. Trans-ethnic meta-analysis poses many new analytical challenges, which demand new methodology development. In this application, we propose to develop a series of novel approaches to understand the genetic architecture of complex traits in trans-ethnic meta-analysis. Specifically, we will develop methods to assess the reproducibility of identified GWAS signals (Aim 1). We will improve models of genetic effect heterogeneity in trans-ethnic meta-analysis in order to improve the power of association analysis (Aim 2). We will also adapt the model to enhance the identification of causal variants (Aim 3) and improve risk prediction (Aim 4). Finally, we will develop innovative software architectures to implement these methods and make them scalable for meta-analysis in the sequencing age (Aim 5). To accomplish these research goals, we have assembled a synergistic research team with leading expertise in complex trait genetics, statistical genetics, and large-scale computation. In the past few years, our research team has developed software tools that are being used in hundreds of genetic studies. The team has also been extensively involved in applied data analysis.
We will continue our existing collaborations and team up with leaders in the GSCAN, GIANT, GLGC, T2D, and ICBP consortia to help advance trans-ethnic analyses of smoking and drinking addiction, anthropometric traits, lipid levels, type 2 diabetes, and blood pressure. Together, these datasets consist of >20 million phenotypic measurements on >5 million individuals. These collaborations will greatly advance our understanding of the genetic architecture of these traits, facilitate clinical translation, and maximize the impact of the methodologies we develop.

Public Health Relevance

Smoking and drinking addiction are major modifiable risk factors for a variety of human diseases, including cancer, cardiovascular disease, and respiratory disorders. Addiction behaviors are heritable. Understanding the genetic basis of addiction can aid in the discovery of the underlying biology, inform preventive strategies, and improve risk prediction, all of which are of great public health relevance.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
High Priority, Short Term Project Award (R56)
Project #
1R56HG011035-01
Application #
10291183
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Hindorff, Lucia
Project Start
2021-01-01
Project End
2021-12-31
Budget Start
2021-01-01
Budget End
2021-12-31
Support Year
1
Fiscal Year
2021
Total Cost
Indirect Cost
Name
Pennsylvania State University
Department
Public Health & Prev Medicine
Type
Schools of Medicine
DUNS #
129348186
City
Hershey
State
PA
Country
United States
Zip Code
17033