The Human Genome Project is providing powerful resources for the identification of genes that predispose to human diseases: the complete sequence of the human genome and the sequences of many other species, a catalog of most common and many rare human genetic variants and their dependency relationships, and increasingly detailed sequence annotation. Along with these resources have come increasingly efficient means of genotyping and DNA sequencing. These resources and technologies will be critical as we seek to unravel the complex etiologic basis of common human diseases. In this proposal, I address a set of statistical problems that arise in human disease gene mapping. I describe how my colleagues and I will address these problems through analytic methods, computer simulation, and application to interesting test data, and how we will generalize these solutions through the production, distribution, and support of efficient computer software.

First, we will continue to develop and test statistical designs and analysis methods for association mapping for complex human diseases. Specifically, we will: (A.1) develop two- and multi-stage methods for genetic association studies; (A.2) assess the impact of the preferential mistyping and nontyping of heterozygotes in association analysis; (A.3) develop permutation-based methods to assess the significance of association tests given multiple traits, markers, and/or groups of individuals; (A.4) model the effect of the "winner's curse" on the estimation of the strength of association in complex disease studies; and (A.5) develop a parametric statistical framework to assess disease-marker association given family data of variable structure and to assess the role of a genetic marker in explaining a linkage signal for disease.

Second, we will continue to address the impact of violating model assumptions on linkage analysis of genes for human diseases.
Specifically, we will: (B.1) assess the impact of assuming equal male and female recombination fractions when they in fact differ, under various sampling designs; and (B.2) assess the impact of modeling marker-marker linkage disequilibrium in linkage analysis given genotype data on a dense set of SNPs.

Third, we will continue to (C) develop, test, distribute, and support computer software based on the methods that arise from the other aims of this project, and update, distribute, and support our current software, including SIMLINK, RHMAP, RELPAIR, and SIBMED. Finally, we will continue to be opportunistic in identifying and addressing important statistical problems that are related to the other goals of this project. Under separate funding, we will apply the resulting methods to the analysis of data from studies of type 2 diabetes, bipolar disorder, and other complex diseases.
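As a rough illustration of the kind of permutation-based significance assessment named in aim A.3, the sketch below computes a multiple-marker-adjusted p-value by shuffling case/control labels and comparing the largest per-marker statistic against its permutation distribution. It is not the project's method: the statistic (absolute difference in mean allele count), the 0/1/2 genotype coding, and the function names `allele_count_diff`, `max_stat`, and `permutation_pvalue` are all assumptions made for this example.

```python
import random


def allele_count_diff(genotypes, is_case):
    """Absolute difference in mean allele count (0/1/2 coding) between
    cases and controls for one marker. Illustrative statistic only."""
    case = [g for g, c in zip(genotypes, is_case) if c]
    ctrl = [g for g, c in zip(genotypes, is_case) if not c]
    return abs(sum(case) / len(case) - sum(ctrl) / len(ctrl))


def max_stat(markers, is_case):
    """Largest per-marker statistic across all markers."""
    return max(allele_count_diff(m, is_case) for m in markers)


def permutation_pvalue(markers, is_case, n_perm=1000, seed=0):
    """Permutation p-value for the maximum statistic over all markers.

    Shuffling the case/control labels keeps each individual's genotypes
    together, so correlation among markers is preserved under the null.
    """
    rng = random.Random(seed)
    observed = max_stat(markers, is_case)
    labels = list(is_case)  # copy, so the caller's list is not reordered
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if max_stat(markers, labels) >= observed:
            exceed += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (exceed + 1) / (n_perm + 1)
```

Taking the maximum over markers within each permutation is one common way to account for testing many markers at once; the same pattern could in principle be extended to multiple traits or groups by widening the set of statistics over which the maximum is taken.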