One of the paradoxes of modern genetics is the contrast between the tremendous technological advances in sequencing and genotyping during the past decade and the slow progress in identifying genes for complex diseases. These diseases involve subtle disruptions of biochemical and developmental pathways and display substantial genetic heterogeneity and gene-by-gene and gene-by-environment interactions. In response to these challenges, geneticists are collecting much larger samples and genotyping enormous numbers of SNPs (single nucleotide polymorphisms). To handle the massive increases in data flow and extract the maximum amount of information from available data, better statistical analysis tools must be made available to the human genetics community. The current grant supports construction of new statistical methods and their translation into user-friendly software via the widely distributed program Mendel. Under the auspices of the grant, we will tackle a series of related projects on computational statistics, association mapping, estimation of DNA copy numbers, population genetics, and software for managing and displaying human pedigree data. Our research in computational statistics revolves around three classes of optimization algorithms: MM and EM algorithms, block relaxation methods, and lasso-penalized estimation. We will apply these methods to estimation in random graphs, nonnegative matrix factorization, and multicategory discriminant analysis. These methods are also pertinent to fast logistic regression with case-control data and fast mapping of QTLs (quantitative trait loci). We further plan to develop fast tests of association based on contingency tables, robust testing procedures for multivariate traits, and algorithms for modeling gene-by-gene and gene-by-environment interactions.
Our efforts on copy number variation will focus on penalized estimation of DNA copy number by signal intensity and hidden Markov modeling of copy numbers from the Illumina genotyping platform. In population genetics, we will develop methods and software for testing Hardy-Weinberg equilibrium in pedigree data, penalized estimation of haplotype frequencies, and estimation of ethnic admixture. Finally, our software development efforts will concentrate on making Mendel better suited to dense, genome-wide SNP data, including: parallelization of the existing Mendel code; restructuring of the data structures in Mendel; making it easier to run complete analysis routines within Mendel; and refinement of MendelPro, the graphical user interface to Mendel. This ambitious agenda is part of our coherent effort to provide a single platform for managing, displaying, and analyzing genetic data. This kind of software infrastructure is necessary if genetic epidemiology is to move rapidly forward in the twenty-first century.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
3R01GM053275-15S1
Application #
7989689
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Krasnewich, Donna M
Project Start
2010-01-01
Project End
2010-12-31
Budget Start
2010-01-01
Budget End
2010-12-31
Support Year
15
Fiscal Year
2010
Total Cost
$145,210
Indirect Cost
Name
University of California Los Angeles
Department
Biostatistics & Other Math Sci
Type
Schools of Medicine
DUNS #
092530369
City
Los Angeles
State
CA
Country
United States
Zip Code
90095
Lake, James A; Larsen, Joseph; Tran, Dan Thy et al. (2018) Uncovering the Genomic Origins of Life. Genome Biol Evol 10:1705-1714
vonHoldt, Bridgett M; Ji, Sarah S; Aardema, Matthew L et al. (2018) Activity of Genes with Functions in Human Williams-Beuren Syndrome Is Impacted by Mobile Element Insertions in the Gray Wolf Genome. Genome Biol Evol 10:1546-1553
Paul, Kimberly C; Sinsheimer, Janet S; Cockburn, Myles et al. (2018) NFE2L2, PPARGC1α, and pesticides and Parkinson's disease risk and progression. Mech Ageing Dev 173:1-8
Lin, Liang-Yu; Chun Chang, Sunny; O'Hearn, Jim et al. (2018) Systems Genetics Approach to Biomarker Discovery: GPNMB and Heart Failure in Mice and Humans. G3 (Bethesda) 8:3499-3506
Gilbert, Princess S; Wu, Jing; Simon, Margaret W et al. (2018) Filtering nucleotide sites by phylogenetic signal to noise ratio increases confidence in the Neoaves phylogeny generated from ultraconserved elements. Mol Phylogenet Evol 126:116-128
Keys, Kevin L; Chen, Gary K; Lange, Kenneth (2017) Iterative hard thresholding for model selection in genome-wide association studies. Genet Epidemiol 41:756-768
Crandall, Carolyn J; Manson, JoAnn E; Hohensee, Chancellor et al. (2017) Association of genetic variation in the tachykinin receptor 3 locus with hot flashes and night sweats in the Women's Health Initiative Study. Menopause 24:252-261
Zhang, Yiwen; Zhou, Hua; Zhou, Jin et al. (2017) Regression Models For Multivariate Count Data. J Comput Graph Stat 26:1-13
Paul, Kimberly C; Sinsheimer, Janet S; Cockburn, Myles et al. (2017) Organophosphate pesticides and PON1 L55M in Parkinson's disease progression. Environ Int 107:75-81
Mancuso, Nicholas; Shi, Huwenbo; Goddard, Pagé et al. (2017) Integrating Gene Expression with Summary Association Statistics to Identify Genes Associated with 30 Complex Traits. Am J Hum Genet 100:473-487