One of the paradoxes of modern genetics is the contrast between the tremendous technological advances in sequencing and genotyping during the past decade and the slow progress in identifying genes for complex diseases. These diseases involve subtle disruptions of biochemical and developmental pathways and display substantial genetic heterogeneity and gene-by-gene and gene-by-environment interactions. In response to these challenges, geneticists are collecting much larger samples and genotyping enormous numbers of SNPs (single nucleotide polymorphisms). To handle the massive increases in data flow and extract the maximum amount of information from available data, better statistical analysis tools must be made available to the human genetics community. The current grant supports construction of new statistical methods and their translation into user-friendly software via the widely distributed program Mendel. Under the auspices of the grant, we will tackle a series of related projects on computational statistics, association mapping, estimation of DNA copy numbers, population genetics, and software for managing and displaying human pedigree data. Our research in computational statistics revolves around three classes of optimization algorithms: MM and EM algorithms, block relaxation methods, and lasso-penalized estimation. We will apply these methods to estimation in random graphs, nonnegative matrix factorization, and multicategory discriminant analysis. These methods are also pertinent to fast logistic regression with case-control data and fast mapping of QTLs (quantitative trait loci). We further plan to develop fast tests of association based on contingency tables, robust testing procedures for multivariate traits, and algorithms for modeling gene-by-gene and gene-by-environment interactions.
Our efforts on copy number variation will focus on penalized estimation of DNA copy number by signal intensity and hidden Markov modeling of copy numbers from the Illumina genotyping platform. In population genetics, we will develop methods and software for testing Hardy-Weinberg equilibrium in pedigree data, penalized estimation of haplotype frequencies, and estimation of ethnic admixture. Finally, our software development efforts will concentrate on making Mendel more conducive to dense, genome-wide SNP data, including: parallelization of the existing Mendel code; restructuring of the data structures in Mendel; making it easier to run complete analysis routines within Mendel; and perfection of MendelPro, the graphical user interface to Mendel. This ambitious agenda is part of our coherent effort to provide a single platform for managing, displaying, and analyzing genetic data. This kind of software infrastructure is necessary if genetic epidemiology is to move rapidly forward in the twenty-first century.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM053275-16
Application #: 7790794
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Krasnewich, Donna M
Project Start: 1995-08-01
Project End: 2012-03-31
Budget Start: 2010-04-01
Budget End: 2011-03-31
Support Year: 16
Fiscal Year: 2010
Total Cost: $562,424
Indirect Cost:

Name: University of California Los Angeles
Department: Biostatistics & Other Math Sci
Type: Schools of Medicine
DUNS #: 092530369
City: Los Angeles
State: CA
Country: United States
Zip Code: 90095
Shi, Huwenbo; Mancuso, Nicholas; Spendlove, Sarah et al. (2017) Local Genetic Correlation Gives Insights into the Shared Genetic Architecture of Complex Traits. Am J Hum Genet 101:737-751
Zhou, Hua; Blangero, John; Dyer, Thomas D et al. (2017) Fast Genome-Wide QTL Association Mapping on Pedigree and Population Data. Genet Epidemiol 41:174-186
Zhang, Yiwen; Zhou, Hua; Zhou, Jin et al. (2017) Regression Models For Multivariate Count Data. J Comput Graph Stat 26:1-13
Mancuso, Nicholas; Shi, Huwenbo; Goddard, Pagé et al. (2017) Integrating Gene Expression with Summary Association Statistics to Identify Genes Associated with 30 Complex Traits. Am J Hum Genet 100:473-487
Crandall, Carolyn J; Manson, JoAnn E; Hohensee, Chancellor et al. (2017) Association of genetic variation in the tachykinin receptor 3 locus with hot flashes and night sweats in the Women's Health Initiative Study. Menopause 24:252-261
Kichaev, Gleb; Roytman, Megan; Johnson, Ruth et al. (2017) Improved methods for multi-trait fine mapping of pleiotropic risk loci. Bioinformatics 33:248-255
Thompson, Michael J; vonHoldt, Bridgett; Horvath, Steve et al. (2017) An epigenetic aging clock for dogs and wolves. Aging (Albany NY) 9:1055-1068
Paul, Kimberly C; Sinsheimer, Janet S; Cockburn, Myles et al. (2017) Organophosphate pesticides and PON1 L55M in Parkinson's disease progression. Environ Int 107:75-81
Keys, Kevin L; Chen, Gary K; Lange, Kenneth (2017) Iterative hard thresholding for model selection in genome-wide association studies. Genet Epidemiol 41:756-768
Brown, Robert; Lee, Hane; Eskin, Ascia et al. (2016) Leveraging ancestry to improve causal variant identification in exome sequencing for monogenic disorders. Eur J Hum Genet 24:113-9

Showing the most recent 10 out of 149 publications