Methods Development

Much of the work in the past year has focused on the development of linear regression based methods for intra-familial tests of association for quantitative traits that address non-independence at both the marker and the observational level. These methods use multiple, stepwise and/or spatial regression in regions that are bounded by recombination hotspots (areas of increased recombination) over the entire genome. This Tiled Regression approach is being used to test for trait-marker associations in genome-wide association studies and in sequence data. It allows both the inclusion of genetic markers that are physically very close together (in linkage disequilibrium) and the inclusion of family data. With this approach, it becomes practical to analyze hundreds of thousands or millions of markers and their significant gene x gene interaction terms. The approach can substantially reduce the total number of tests to a number closer to the number of tiles than to the number of markers. Furthermore, the tiled approach can be incorporated into a linear regression framework that allows for non-independence between observations, incorporating features from the Regression of Offspring on Mid-Parent (ROMP) and Generalized Estimating Equations (GEE) approaches. If successful for candidate gene studies using SNPs, this approach will be applied to targeted sequence variant data currently being generated for the ClinSeq project.

Computer Simulations

During the past year, computer simulation has been used in two projects to investigate the statistical properties of the methods involved.

Lack of agreement among methods used to test for intra-familial association

In a study of platelet aggregation in the GeneSTAR project, Herrera-Galeano et al. 2007 used several different intra-familial association methods and found little correlation between results.
In this study, computer simulation was used to investigate the lack of agreement among methods. The Genetic Analysis Simulation Program (G.A.S.P.) was used to generate 500 samples, each with 200 nuclear families with sibship size three. A quantitative trait was simulated based on a single biallelic locus with equally frequent alleles. The underlying genetic model was additive, and the heritabilities considered were 0 (the null hypothesis), 0.01 and 0.05. Four tests of association were performed: ASSOC, FBAT, linear regression with GEE (SASGEE) and ROMP. Pair-wise Pearson correlations and Spearman rank correlations of the resulting p-values were calculated. McNemar tests using 0.001 as the cutoff value were performed to test for significant differences between the results of each pair of methods. When the heritability attributable to the locus was 0.05 or greater, there was fairly good agreement between SASGEE and ASSOC, somewhat lesser agreement between ROMP and ASSOC, and little agreement between FBAT and ASSOC. The results under the null hypothesis were somewhat correlated for the SASGEE-ASSOC pair, less correlated for the ROMP-ASSOC pair and almost completely uncorrelated for the FBAT-ASSOC pair. Clearly, the kind of information being used by ROMP and FBAT differs from that used by ASSOC and by linear regression with generalized estimating equations. In general, pair-wise Spearman rank correlations were higher than Pearson correlations.

Intra-familial tests of association in the presence of genetic heterogeneity

The statistical properties of any proposed method should be compared to those of accepted methods currently in use. In this study, the statistical power of intra-familial tests of association is compared in the presence of genetic heterogeneity. G.A.S.P. v3.3 was used to simulate two subpopulations. In the first, denoted the association subpopulation, the trait was due to a single causative SNP with equal allele frequencies and additive allelic effects.
In the second, the non-association subpopulation, the trait was due to a random effect. The two subpopulations were combined in different proportions (100:0%, 50:50% and 30:70%) for the association:non-association subpopulations. Three study designs were considered: 2000 unrelated individuals, 400 nuclear families with sibship size 3, and 117 extended three-generation families, each with 17 individuals. The trait heritability was fixed at 0.05. Two thousand replications were generated for each experiment. The power of three intra-familial tests of association (ASSOC, FBAT, ROMP) was compared to that of an analysis of variance (ANOVA) of the unrelated individuals for each combination of the two subpopulations. For combinations of 100:0%, 50:50% and 30:70%, the power of ASSOC and ROMP was quite good, with ASSOC performing better than the ANOVA for all combinations. The power for ROMP improved from the nuclear to the extended family sample, because each extended family comprised four nuclear families, but closely approximated that of the likelihood-based ASSOC. When a combination of 10:90% association:non-association subpopulations was considered, ASSOC performed considerably better than the ANOVA of unrelated individuals, which performed better than ROMP, which in turn performed better than FBAT, although the power to detect an association was greatly reduced for all methods because of the very small number of individuals from the association subpopulation. Somewhat surprisingly, in the 10:90% combination the power of the likelihood-based ASSOC, which incorporates all the phenotypic and genotypic information in the family, did not improve when comparing extended families to nuclear families.

Collaborations

Familial Idiopathic Scoliosis

Several analyses focusing on candidate regions and phenotypic subsets have been completed, and manuscripts have either been submitted or are in preparation.
These include:
1) A study of susceptibility loci in FIS families with at least one individual with a triple curve, in which candidate regions have been identified on chromosomes 6 and 10 (Marosy et al. submitted).
2) Statistical genetic analysis of a replication sample of families with familial idiopathic scoliosis, with characteristics nearly identical to those of the sample analyzed in Miller et al. 2005, which suggests that, of the 11 regions identified in the original study, at least 7 can be considered true replications (Behnemann et al. in preparation).
3) Statistical genetic analysis of STRPs and SNPs on chromosomes 9 and 16 (in preparation).
4) A study based on the presence of males with surgery (submitted).

Other ongoing collaborations include:
1) the ClinSeq project (Les Biesecker, NIH/NHGRI)
2) the India Diabetes Project (Rasika Mathias, Johns Hopkins University School of Medicine)
3) the GeneSTAR project (Diane and Lewis Becker, Johns Hopkins University School of Medicine)
4) Clinical characterization of NF1 (Douglas Stewart, NIH/NHGRI)
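
Illustrative sketch of the agreement analysis

The agreement analysis described under Computer Simulations (pair-wise Pearson and Spearman rank correlations of p-values, and McNemar tests at a 0.001 cutoff) can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the code used in the study. The function names and the example p-values are hypothetical, and the exact binomial form of the McNemar test is an assumption, since the report does not state which variant was used.

```python
import math
import statistics


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mcnemar_exact(p_first, p_second, cutoff=0.001):
    """Exact two-sided McNemar test on discordant significance calls.

    Returns (only_first, only_second, p_value), where only_first counts
    replicates significant for the first method but not the second.
    """
    only_first = sum(1 for a, b in zip(p_first, p_second) if a < cutoff <= b)
    only_second = sum(1 for a, b in zip(p_first, p_second) if b < cutoff <= a)
    n = only_first + only_second
    if n == 0:
        return only_first, only_second, 1.0
    smaller = min(only_first, only_second)
    # Lower tail of Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(math.comb(n, k) for k in range(smaller + 1)) / 2 ** n
    return only_first, only_second, min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical p-values from two methods over the same replicates.
    method_one = [0.0002, 0.03, 0.0009, 0.4, 0.0005, 0.12]
    method_two = [0.0004, 0.01, 0.02, 0.6, 0.0008, 0.0007]
    print("Pearson:", pearson(method_one, method_two))
    print("Spearman:", spearman(method_one, method_two))
    print("McNemar:", mcnemar_exact(method_one, method_two))
```

In a setting like the one described above, each list would hold one p-value per simulated sample for a given method, and the three statistics would be computed for every pair of methods.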