Human genetics research has accelerated over the last decade owing to our evolving understanding of the human genome. With the recent completion of the International HapMap Project, the development of large-scale genotyping technology, and a rapid decline in genotyping costs, an immense amount of genotype data has been generated, which in turn raises many challenging new problems for the analysis and interpretation of the data. This application proposes developing new statistical methodologies to address a wide range of statistical issues in current candidate gene and genome-wide association (GWA) studies.

Specifically, the proposal will address the following problems. (1) Recent high-resolution genome mapping indicates that copy number variations (CNVs) are ubiquitous and common in the general population, and may play a major role in phenotypic variation.
In Aim 1, we will develop a Bayesian hidden Markov model-based algorithm for high-resolution CNV detection using whole-genome SNP genotyping data. The algorithm can incorporate data from both unrelated individuals and families. (2) Given the high density of genetic markers in large-scale candidate gene and GWA studies, it is reasonable to expect that multilocus genotypes offer more information on genetic association than single-marker analysis.
In Aim 2, we will develop a powerful multimarker test for gene-based association analysis and extend the method to analysis of gene-gene interactions. The virtue of our method lies in its ability to borrow strength from nearby markers while reducing the degrees of freedom. (3) In many disease gene-mapping studies, individuals are ascertained from a recently admixed population.
In Aim 3, we will develop novel association tests in genetics studies using recently admixed populations. By considering ancestry level and genotypes together, our method offers higher resolution and power than traditional admixture mapping methods. (4) Appropriate adjustment for multiple dependent tests has long been a problem in genetics studies, especially for studies with limited sample size and without replication datasets.
In Aim 4, we propose new methods to estimate the effective number of tests, which reflects the amount of independent information contained in the data. (5) In Aim 5, we will develop, test, distribute, and support freely available implementations of the methods proposed in this application. The methods will be evaluated through analytical approaches, computer simulations, and applications to multiple real datasets.

Recent development of large-scale genotyping technologies has led to the generation of an immense amount of genotype data, which raises many challenging new problems for the analysis and interpretation of the data. This application proposes developing new statistical methodologies that address a set of unresolved issues.