We will use whole-genome sequences (WGS) from a unique set of multigenerational Utah pedigrees to explore the causes of genetic variation and the consequences of this variation for disease. We will estimate the rates of mutation and mobile element retrotransposition in 42 three-generation pedigrees, each consisting of grandparents, parents, and large numbers of offspring (626 individuals in total). Using advanced methods to detect single nucleotide variants, structural variants, and mobile element insertions in WGS data, we will address fundamental questions about mutation and mobile element evolution: In a large, well-controlled set of families, what are the rates of mutation and retrotransposition? How are these events affected by paternal and maternal age? Is variation in these rates determined by genetic factors (e.g., DNA repair genes) that segregate in families? What is the role of genomic context (e.g., GC content, recombination) in generating de novo mutations and retrotranspositions?
In addition to addressing questions about the causes of genetic variation, we will address the consequences of variation by analyzing WGS in large, multigenerational Utah pedigrees in which there is a strong excess of specific inherited diseases. Under separate funding, we are obtaining WGS from at least 3,000 pedigree members as part of the Utah Genome Project (of which the PI is the Executive Director). These families, which are part of the eight-million-member Utah Population Database, provide important advantages for the genetic analysis of Mendelian and complex diseases because both genetic and environmental heterogeneity are greatly reduced. Furthermore, large pedigrees offer the potential to follow the transmission of rare variants detected in WGS across generations as they contribute to disease causation, including the causation of common, complex diseases. They thus provide a powerful and unique resource for disease-gene identification.
We have developed the VAAST, pVAAST, and Phevor algorithms for detecting and characterizing disease-causing genes in these families. In this project, we will develop and modify these methods to address several key questions: What is the role of noncoding genetic variation in causing inherited disease? To what extent does structural variation, such as copy number variants and genomic rearrangements, contribute to inherited disease? How can existing methods be effectively adapted to identify the multiple variants that underlie susceptibility to common diseases?

Public Health Relevance

We will analyze whole-genome sequence data in a unique collection of multigenerational Utah pedigrees to answer fundamental questions about the causes and consequences of mutation and mobile element insertion. We will also develop analytic tools to find disease-causing genes in disease pedigrees.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Unknown (R35)
Study Section: Special Emphasis Panel (ZGM1)
Program Officer: Krasnewich, Donna M
University of Utah
Schools of Medicine
Salt Lake City
United States
Flygare, Steven; Hernandez, Edgar Javier; Phan, Lon et al. (2018) The VAAST Variant Prioritizer (VVP): ultrafast, easy to use whole genome variant prioritization tool. BMC Bioinformatics 19:57
Chen, Jiun-Sheng; Hu, Fulan; Kugathasan, Subra et al. (2018) Targeted Gene Sequencing in Children with Crohn's Disease and Their Parents: Implications for Missing Heritability. G3 (Bethesda) 8:2881-2888
Booth III, John N; Li, Man; Shimbo, Daichi et al. (2018) West African Ancestry and Nocturnal Blood Pressure in African Americans: The Jackson Heart Study. Am J Hypertens 31:706-714
Al-Agha, Abdulmoein Eid; Ahmed, Ihab Abdulhamed; Nuebel, Esther et al. (2018) Primary Ovarian Insufficiency and Azoospermia in Carriers of a Homozygous PSMC3IP Stop Gain Mutation. J Clin Endocrinol Metab 103:555-563
Feusier, Julie; Witherspoon, David J; Watkins, W Scott et al. (2017) Discovery of rare, diagnostic AluYb8/9 elements in diverse human populations. Mob DNA 8:9
Rustagi, Navin; Zhou, Anbo; Watkins, W Scott et al. (2017) Extremely low-coverage whole genome sequencing in South Asians captures population genomics information. BMC Genomics 18:396
Hu, Hao; Petousi, Nayia; Glusman, Gustavo et al. (2017) Evolutionary history of Tibetans inferred from whole-genome sequencing. PLoS Genet 13:e1006675
Gibson, Summer B; Downie, Jonathan M; Tsetsou, Spyridoula et al. (2017) The evolving genetic risk for sporadic ALS. Neurology 89:226-233