The overall goal of our research is to develop and extend powerful exact statistical tools for testing genetic association, and to incorporate these methods into two existing, widely used software packages (Cytel Studio and SAS). These tools will serve data analysts in pharmaceuticals, genetic epidemiology and public health, and other fields that require a better understanding of the genetic determinants of complex disease. Demand for such tools is rising sharply, as rapid progress in genotyping technology makes it easier and less costly to genotype sampled subjects at ever larger numbers of genetic markers. Genetic association is an observed correlation between a genetic marker under investigation and a physical trait, and it can be assessed using either traditional case-control or family-based study designs. Under both designs, there are compelling applications of permutation and exact statistical approaches that are computationally challenging and that are either unavailable in current software or implemented in a way that requires excessive memory or computation time. The computational innovations developed in this project will fill this gap, significantly improving the efficiency and power of existing tools for testing genetic association under both family-based and case-control designs. During Phase I, we will build a prototype computer program that includes (i) exact family-based tests for both biallelic and multiallelic markers, and (ii) a permutation procedure that simultaneously tests genetic association under several modes of inheritance (recessive, dominant, additive, or codominant). We will also investigate the feasibility of incorporating these procedures into a SAS PROC, complementing and extending the genetic association procedures currently implemented in SAS JMP Genomics.
In Phase II, we will integrate our Phase I tools into Cytel's StatXact system and into the SAS JMP Genomics system as an external procedure. We will also (i) extend the exact family-based procedures to accommodate haplotype data, (ii) develop and implement algorithms for permutation approaches to large-scale screening experiments, (iii) incorporate exact versions of basic genetic epidemiologic procedures, and (iv) incorporate efficient Monte Carlo sampling tools so that the exact procedures remain practical for larger data sets.
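As an illustration only, the idea of a permutation procedure that tests several modes of inheritance at once, combined with Monte Carlo sampling of permutations, can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the algorithm to be developed in this project: it assumes a case-control design, genotypes coded as 0, 1, or 2 copies of a putative risk allele, and a maximum over three single-degree-of-freedom trend statistics. The codominant (two-degree-of-freedom) test is omitted because it is not on the same scale. All function and variable names are hypothetical.

```python
import random

# Assumed genotype scores for each mode of inheritance, indexed by the
# number of copies (0, 1, 2) of the putative risk allele.
MODE_SCORES = {
    "recessive": (0, 0, 1),
    "dominant": (0, 1, 1),
    "additive": (0, 1, 2),
}


def trend_statistic(scores, status):
    """Return N * r^2, where r is the Pearson correlation between the
    genotype scores and the 0/1 case status (a trend-type statistic)."""
    n = len(scores)
    mean_x = sum(scores) / n
    mean_y = sum(status) / n
    sxx = sum((x - mean_x) ** 2 for x in scores)
    syy = sum((y - mean_y) ** 2 for y in status)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no evidence of association
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, status))
    return n * sxy * sxy / (sxx * syy)


def max_statistic(score_vectors, status):
    """Largest trend statistic over the modes of inheritance considered."""
    return max(trend_statistic(scores, status) for scores in score_vectors)


def max_mode_permutation_test(genotypes, status, n_perm=2000, seed=0):
    """Monte Carlo permutation p-value for the maximum statistic.

    Case/control labels are shuffled while genotypes stay fixed, so the
    reference distribution accounts for testing all modes simultaneously.
    """
    rng = random.Random(seed)
    score_vectors = [
        [scores[g] for g in genotypes] for scores in MODE_SCORES.values()
    ]
    observed = max_statistic(score_vectors, status)
    shuffled = list(status)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance so floating-point ties count as "at least as large".
        if max_statistic(score_vectors, shuffled) >= observed - 1e-12:
            hits += 1
    # Adding one to numerator and denominator keeps the estimate above zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Because the maximum is recomputed for every shuffled data set, a single p-value covers all three modes without a separate multiple-testing correction. A production implementation would need exact enumeration or more efficient sampling to scale to large marker sets, which this sketch does not attempt.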
Rapid progress in genotyping technology is making it easier and less costly to measure increasingly large numbers of genetic markers in sampled individuals. These markers can be used to identify new genes potentially associated with many complex diseases. This project will provide genetics researchers with more accurate and efficient statistical tools for analyzing data from genetic association studies.