Methods Development
1) The Amelioration of Inflated Type I Errors with Tiled Regression
The last few months of 2016 saw the publication of two projects that investigated type I error rates in Tiled Regression. The first examined how the minor allele frequency of the single nucleotide variant (SNV), the degree of departure from normality of the trait, and the position of the SNVs affect type I error rates in the Genetic Analysis Workshop (GAW) 19 whole exome sequence data. To test the distribution of the type I error rate, five simulated traits were considered: standard normal and gamma distributed traits; two transformed versions of the gamma trait (log10 and rank-based inverse normal transformations); and trait Q1 provided by GAW 19. Tests of association were performed with standard linear regression, and average type I error rates were determined for minor allele frequency classes. Rare SNVs (minor allele frequency < 0.05) showed substantially inflated type I error rates for non-normally distributed traits, and the inflation increased as the minor allele frequency decreased. The inflation of average type I error rates also increased as the significance threshold decreased. Normally distributed traits did not show inflated type I error rates with respect to the minor allele frequency for rare SNVs (Schwantes-An et al. 2016).
In the second, Sung et al. (2016) investigated the effects of different sets of critical values on type I error rates in tiled regression with genotype data from the Trinity Student Study (TSS). Two hundred replications of simulated null traits from the standard normal distribution were analyzed using four different sets of critical values for stepwise regression at each stage of tiled regression.
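As a rough illustration of how type I error rates of this kind can be estimated from simulated null traits, the following Python sketch analyzes 200 replicates of a standard normal trait at several critical values. It is not the tiled regression procedure: it substitutes single-SNP marginal linear regression for stepwise regression within tiles, uses independently simulated genotypes rather than TSS data, and approximates p-values with the normal distribution. The four thresholds, the sample sizes, and the function names are all hypothetical.

```python
import math
import random


def marginal_pvalue(genotypes, trait):
    """Two-sided p-value for the slope of trait ~ genotype.

    Uses a normal approximation to the t statistic (an assumption that is
    reasonable only for moderately large samples).
    """
    n = len(trait)
    mean_g = sum(genotypes) / n
    mean_y = sum(trait) / n
    sgg = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((y - mean_y) ** 2 for y in trait)
    sgy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, trait))
    if sgg == 0.0 or syy == 0.0:
        return 1.0  # monomorphic SNP or constant trait: nothing to test
    r = sgy / math.sqrt(sgg * syy)
    if abs(r) >= 1.0:
        return 0.0  # perfect fit
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))


def null_error_rates(n_people=200, n_snps=20, n_reps=200,
                     critical_values=(0.05, 0.01, 0.005, 0.001), seed=1):
    """Return {critical value: (per-test rate, aggregate rate)} under the null.

    The aggregate rate is the fraction of replicates with at least one
    false positive among all SNPs tested.
    """
    rng = random.Random(seed)
    # Hypothetical genotypes: independent SNPs with minor allele frequency 0.3.
    genos = [[(rng.random() < 0.3) + (rng.random() < 0.3)
              for _ in range(n_people)] for _ in range(n_snps)]
    rejections = {a: 0 for a in critical_values}
    reps_with_hit = {a: 0 for a in critical_values}
    for _ in range(n_reps):
        # Null trait: standard normal, unrelated to every SNP.
        trait = [rng.gauss(0.0, 1.0) for _ in range(n_people)]
        pvals = [marginal_pvalue(g, trait) for g in genos]
        for a in critical_values:
            hits = sum(p < a for p in pvals)
            rejections[a] += hits
            reps_with_hit[a] += hits > 0
    return {a: (rejections[a] / (n_reps * n_snps), reps_with_hit[a] / n_reps)
            for a in critical_values}


if __name__ == "__main__":
    for alpha, (per_test, aggregate) in sorted(null_error_rates().items()):
        print(f"alpha={alpha}: per-test={per_test:.4f}, aggregate={aggregate:.4f}")
```

In this simplified setting the aggregate rate can never fall below the per-test rate, which is why the choice of critical value matters more as the number of tests grows.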
The results of the second study indicate that the multicollinearity among the SNPs considered and the aggregate type I error rates decreased through the three tiling stages; the region-specific type I error rates were slightly lower than the nominal critical values at the tile level; and the critical value at the tile level was between two aggregate type I error rates defined under two different assumptions about the number of tests (the number of SVs and the number of tiles).
2) Ad hoc Replication with Complementary Pairs Stability Selection (ComPaSS)
Methods development in 2017 has also focused on reducing type I errors with a different approach. Results from association studies are traditionally confirmed by replicating the findings in an independent dataset. Although replication studies may be comparable for the main phenotype(s) of interest, secondary phenotypes are unlikely to be comparable across studies, which makes replication of these phenotypes problematic. An approach based on complementary-pairs stability selection (ComPaSS-GWAS) may be considered as an ad hoc alternative to replication. In this approach, the sample is randomly split into two conditionally independent halves multiple times, and a GWAS is performed on each half. Similar in spirit to testing for association with independent discovery and replication samples, a marker is corroborated if its p-value is significant in both halves of the sample. Simulation experiments were performed to evaluate the type I error rate and power of ComPaSS-GWAS using simulated data based on genotypes from the Trinity Student Study (TSS). Phenotypes were simulated under both a non-genetic model (null hypothesis) and a genetic model. Simulation results show that this approach reduced the type I error rate with only a small reduction in power. The previously published pyridoxal 5'-phosphate (PLP) phenotype from the TSS was used to validate this approach.
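The splitting procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the ComPaSS-GWAS implementation: the association test is a single-marker linear regression with a normal approximation to the p-value, the halves are formed by simple random splitting, and each marker is summarized by the proportion of splits in which it is significant in both halves. That summary, the significance threshold, the number of splits, the simulated data, and all function names are hypothetical choices made for the example.

```python
import math
import random


def assoc_pvalue(genotypes, trait):
    """Two-sided p-value for trait ~ genotype (normal approximation)."""
    n = len(trait)
    mean_g = sum(genotypes) / n
    mean_y = sum(trait) / n
    sgg = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((y - mean_y) ** 2 for y in trait)
    sgy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, trait))
    if sgg == 0.0 or syy == 0.0:
        return 1.0  # monomorphic marker or constant trait: nothing to test
    r = sgy / math.sqrt(sgg * syy)
    if abs(r) >= 1.0:
        return 0.0  # perfect fit
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))


def compass_support(marker_columns, trait, n_splits=50, alpha=0.01, seed=0):
    """For each marker, the proportion of random half-splits in which the
    marker is significant at level alpha in BOTH complementary halves."""
    rng = random.Random(seed)
    n = len(trait)
    order = list(range(n))
    counts = [0] * len(marker_columns)
    for _ in range(n_splits):
        rng.shuffle(order)
        half_a, half_b = order[: n // 2], order[n // 2:]
        y_a = [trait[i] for i in half_a]
        y_b = [trait[i] for i in half_b]
        for j, col in enumerate(marker_columns):
            p_a = assoc_pvalue([col[i] for i in half_a], y_a)
            p_b = assoc_pvalue([col[i] for i in half_b], y_b)
            if p_a < alpha and p_b < alpha:
                counts[j] += 1
    return [c / n_splits for c in counts]


def simulate_sample(n=400, n_markers=10, maf=0.3, effect=0.8, seed=2):
    """Hypothetical data: marker 0 affects the trait, the rest are null."""
    rng = random.Random(seed)
    cols = [[(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]
            for _ in range(n_markers)]
    trait = [effect * cols[0][i] + rng.gauss(0.0, 1.0) for i in range(n)]
    return cols, trait


if __name__ == "__main__":
    cols, trait = simulate_sample()
    for j, s in enumerate(compass_support(cols, trait)):
        print(f"marker {j}: significant in both halves in {s:.0%} of splits")
```

Requiring significance in both halves is what makes the criterion stricter than a single test at the same threshold, at the cost of testing each half with half the sample size.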
The results from the validation study were compared to, and were consistent with, those obtained from previously published independent replication data and functional studies (Sabourin et al., submitted).
Software Development
The tiled regression methodology has been implemented in the Tiled Regression Analysis Package (TRAP); version 2.0 of the software includes additional penalized regression models and was released in June 2016. The package is freely available on the NHGRI website: http://research.nhgri.nih.gov/software/TRAP.
Collaborations
Craniosynostosis
Justice et al. (2012) reported a genome-wide association study (GWAS) of non-syndromic sagittal craniosynostosis, and the associations were replicated in an independent Caucasian population of 186 unrelated probands with non-syndromic sagittal craniosynostosis and 564 unaffected controls. Zebrafish were used to test expression of the previously identified conserved non-coding regulatory elements and to determine whether expression with the identified sequence variants differed from wild-type expression. To accomplish this, a putative regulatory element was created with site-directed mutagenesis and inserted into the Zebrafish Enhancer Detection (ZED) vector construct. Embryos were screened by fluorescence microscopy for red and green fluorescent protein (RFP and GFP, respectively) expression. Embryos demonstrating RFP/GFP expression were grown to adulthood and bred with wild-type fish. Several germline-transmitting founders were identified for each ZED vector construct, and their progeny were screened for patterns of RFP/GFP expression, again using fluorescence microscopy. GFP expression in the fish with the risk allele (C) appears to occur in the midbrain and hindbrain, while in the fish with the wild-type (T) allele, GFP expression was observed in the midbrain-hindbrain boundary (Justice et al. 2017, in press).
Other ongoing collaborations
1) The ClinSeq project. Les Biesecker (NIH/NHGRI)
2) Genetic analysis of neuro-anatomic quantitative traits in patients with ADHD. Dr. Philip Shaw (NIH/NHGRI)
3) Statistical properties of fixed and mixed effects models. Dr. Ruzong Fan (NIH/NICHD and Georgetown University)