For nearly two decades this grant has developed statistical and computational tools vital to gene mapping. During that period, technology and genomic data changed dramatically. Expression and genotyping chips became standard scientific tools; the full genomes of a host of organisms, including the human species, were sequenced; and low-cost sequencing transitioned from fantasy to reality. The last decade has also witnessed a shift in focus from common to rare SNVs (single nucleotide variants) and from moderate-sized studies to large consortium studies. Simultaneously, computers have grown exponentially in speed and memory. These parallel advances have powered thousands of successful human gene mapping studies for both Mendelian and complex traits. Because these successes have shed light on only a fraction of the heritability of common traits, we have not yet reached the endgame of statistical genetics. There is still a need for new ideas and better software.
We plan to build on our previous successes, with particular emphasis on adapting modern methods of data mining to genetic applications. We and others have made great strides in applying penalized estimation and model selection in genomics. Genetic analysis via penalized regression easily handles non-genetic predictors, uncertainty in genotype and sequence calls, corrections for ethnic admixture, quantitative traits and disease dichotomies, gene-gene and gene-environment interactions, and both rare and common variants. Unfortunately, it is now apparent that penalized estimation is hampered by severe shrinkage and inflated false positive rates. Our recent development of proximal distance algorithms and AIC (Akaike information criterion) guided regression shows that severe shrinkage can be eliminated and false positive rates tamed. We are also convinced that haplotypes have been underexploited in genetic analysis. Haplotypes flag local gene sharing, serve as surrogates for rare variants, capture intragenic interactions, and enable both fixed and random effects QTL (quantitative trait locus) mapping. Our extensive list of aims should not be interpreted as a lack of focus. Our track record shows that we can make progress on a number of fronts simultaneously. All of our efforts are directed toward sharpening the tools of genetic analysis. As our programs SIMWALK, MENDEL, and ADMIXTURE illustrate, we are committed to translating theoretical advances into user-friendly software. These programs are notable for their comprehensiveness, speed, reliability, small memory usage, and detailed documentation. The goal of this grant is to empower the very large genetic studies on the horizon. Collectively, our Specific Aims go a long way towards that goal.
The human genome project and its offshoots have dramatically increased the amount of genetic data. In fact, the scientific community's ability to collect genetic information has now far outstripped its ability to make use of this information in understanding the basis of disease and human diversity. Our aim is to develop, implement, and freely distribute new, more efficient computational and statistical approaches that make full use of the vast amount of genetic data, and thus improve genetic researchers' ability to map and characterize genes that lead to human diseases and trait variation.
Lake, James A; Larsen, Joseph; Tran, Dan Thy et al. (2018) Uncovering the Genomic Origins of Life. Genome Biol Evol 10:1705-1714
vonHoldt, Bridgett M; Ji, Sarah S; Aardema, Matthew L et al. (2018) Activity of Genes with Functions in Human Williams-Beuren Syndrome Is Impacted by Mobile Element Insertions in the Gray Wolf Genome. Genome Biol Evol 10:1546-1553
Paul, Kimberly C; Sinsheimer, Janet S; Cockburn, Myles et al. (2018) NFE2L2, PPARGC1α, and pesticides and Parkinson's disease risk and progression. Mech Ageing Dev 173:1-8
Lin, Liang-Yu; Chun Chang, Sunny; O'Hearn, Jim et al. (2018) Systems Genetics Approach to Biomarker Discovery: GPNMB and Heart Failure in Mice and Humans. G3 (Bethesda) 8:3499-3506
Gilbert, Princess S; Wu, Jing; Simon, Margaret W et al. (2018) Filtering nucleotide sites by phylogenetic signal to noise ratio increases confidence in the Neoaves phylogeny generated from ultraconserved elements. Mol Phylogenet Evol 126:116-128
Zhang, Yiwen; Zhou, Hua; Zhou, Jin et al. (2017) Regression Models For Multivariate Count Data. J Comput Graph Stat 26:1-13
Paul, Kimberly C; Sinsheimer, Janet S; Cockburn, Myles et al. (2017) Organophosphate pesticides and PON1 L55M in Parkinson's disease progression. Environ Int 107:75-81
Mancuso, Nicholas; Shi, Huwenbo; Goddard, Pagé et al. (2017) Integrating Gene Expression with Summary Association Statistics to Identify Genes Associated with 30 Complex Traits. Am J Hum Genet 100:473-487
Zhou, Hua; Blangero, John; Dyer, Thomas D et al. (2017) Fast Genome-Wide QTL Association Mapping on Pedigree and Population Data. Genet Epidemiol 41:174-186
Kichaev, Gleb; Roytman, Megan; Johnson, Ruth et al. (2017) Improved methods for multi-trait fine mapping of pleiotropic risk loci. Bioinformatics 33:248-255
Showing the most recent 10 out of 156 publications