One of the major problems in human genetics is understanding the genetic causes of complex phenotypes, including neuropsychiatric traits such as autism spectrum disorders and schizophrenia. Despite tremendous work over the past few decades, the underlying biological mechanisms remain poorly understood in most cases. Recent advances in high-throughput, massively parallel genomic technologies have revolutionized the field of human genetics and promise to lead to important scientific advances. Even with this progress in data generation, analyzing and interpreting these data remains very challenging. The main focus of this proposal is the development of powerful statistical methods for integrating whole-genome sequencing data with rich functional genomics data, with the goal of improving the discovery of genes involved in autism spectrum disorders. We propose to integrate data from many different sources, including epigenetic data from projects such as ENCODE, Roadmap, and PsychENCODE; eQTL data from the GTEx, PsychENCODE, and CommonMind consortia; and data from large-scale databases of genetic variation such as ExAC and gnomAD. We will use these data to predict the functional effects of genetic variants in non-coding regions in a tissue- and cell-type-specific manner, and to generate functional maps across a large number of tissues and cell types in the human body. We can then use these maps to identify novel associations with autism in whole-genome sequencing studies. The proposed functional predictions and functional maps will be broadly available in the popular ANNOVAR database. We further propose to use these functional predictions in the analysis of almost 20,000 whole genomes from three large whole-genome sequencing studies of autism.
We believe that the proposed research is very timely and has the potential to substantially improve the analysis of non-coding genetic variation, and hence to provide new insights into the biological mechanisms underlying risk for autism and, more broadly, for other neuropsychiatric diseases.
Autism spectrum disorders are common diseases with a major impact on public health. Although coding variation has been extensively studied for its role in autism risk, the analysis of non-coding variation poses tremendous challenges. The proposed statistical methods and their application to nearly 20,000 whole genomes from three large autism whole-genome sequencing studies will improve our understanding of the biological mechanisms involved in autism, with important implications for disease treatment strategies.