A central goal of human genetics is to fully understand the causes of human disease, for in this knowledge lies the hope of improved treatments and perhaps, ultimately, prevention of those diseases. The central challenges of human genetics are driven by the inherent complexity of most diseases, combined with the sheer physical size of the human genome. Only a few years ago, the size of the genome alone was sufficient to prevent substantial progress towards understanding all but the simplest of human diseases. Until very recently, there was simply no technology that could hope to acquire genetic information at the scale necessary to address the causes of complex disease. Those technological barriers are now falling rapidly. We currently possess technology capable of acquiring large amounts of genomic information, and we know with certainty that these technologies will only get better, faster and cheaper in the years to come. If we are to make progress towards our ultimate goal, though, we will need numerous advances in our ability to efficiently design high-throughput custom assays, to collect high-throughput laboratory data, and, most importantly, to interpret those data with respect to their contributions to human disease. This proposal highlights methods to analyze high-throughput, large-scale biological data in a powerful and efficient manner, in an attempt to discover how genes contribute to human disease.

Specific Aim 1: To develop analytical and computational techniques to perform whole-genome association studies on hundreds of thousands of Single Nucleotide Polymorphisms (SNPs) simultaneously, using all available haplotypic data, in ways that are both computationally tractable and highly powered to find association when it does exist.
A. To use these techniques for nuclear families (two parents and offspring) and simple categorical phenotypes (diseased vs. not diseased).
B. To extend these techniques to population-level data (no family information, only case vs. control).
C. To extend these techniques to quantitative data (e.g., blood pressure, blood glucose level).
D. To extend these techniques to full pedigree (extended family) data.

Specific Aim 2: To provide automated tools to help biologists take advantage of these advances.
A. To provide automated tools to collect the data from high-throughput experiments.
B. To provide tools for automated analysis of those data.
C. To provide reports of the analysis in a format human geneticists can easily interpret and use.
|Johnston, Henry Richard; Hu, Yijuan; Cutler, David J (2015) Population genetics identifies challenges in analyzing rare variants. Genet Epidemiol 39:145-8|
|Johnston, Henry R; Cutler, David J (2013) A comprehensive search for recombinogenic motifs in the human genome. PLoS One 8:e62920|
|Jakubek, Yasminka A; Cutler, David J (2012) A model of binding on DNA microarrays: understanding the combined effect of probe synthesis failure, cross-hybridization, DNA fragmentation and other experimental details of affymetrix arrays. BMC Genomics 13:737|
|Johnston, Henry R; Cutler, David J (2012) Population demographic history can cause the appearance of recombination hotspots. Am J Hum Genet 90:774-83|
|Wingo, Thomas S; Lah, James J; Levey, Allan I et al. (2012) Autosomal recessive causes likely in early-onset Alzheimer disease. Arch Neurol 69:59-64|
|Carney, Amanda E; Sanders, Rebecca D; Garza, Kerry R et al. (2009) Origins, distribution and expression of the Duarte-2 (D2) allele of galactose-1-phosphate uridylyltransferase. Hum Mol Genet 18:1624-32|
|Kohler, Jared R; Cutler, David J (2007) Simultaneous discovery and testing of deletions for disease association in SNP genotyping studies. Am J Hum Genet 81:684-99|