Genotype-calling and analytic programs developed for nuclear DNA are not adequate for analyzing mitochondrial DNA (mtDNA) variation and its possible effects on aging-related traits. Each cell has 100-10,000 mtDNA copies that can vary at any site (heteroplasmy), so any of the 4 bases can appear at a given position in different copies. We have developed an algorithm specifically to identify variants in mtDNA; it incorporates the sequencing error rate of each base in each sequence read and is flexible enough to allow different allele fractions at a variant site across individuals. It has thus far been successful in determining homoplasmies in the mtDNA sequence from 300 individuals of a total of 2,000 sequenced; and, especially because the Sardinian cohort is highly inter-related, we have been able to identify variants that are newly arising in children relative to their mothers and other relatives.

Repeated visits can increase the accuracy of data and thereby provide more highly significant results with a sample of a given size. To take advantage of them, instead of using the average of multiple measurements, we have developed an empirical Bayes shrinkage estimator that summarizes the multiple measurements. Simulations and analysis of real data from the SardiNIA data set, in which many traits were measured at 3 visits over a 10-year period, show that combining values from repeated visits in an association study yields an expected increase in GWAS signals compared with using a single visit. Furthermore, we have shown that in unbalanced data sets (that is, with different individuals having different numbers of visits), the shrinkage estimator further improves GWAS signals relative to the average.
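The idea of shrinking repeated measurements, rather than averaging them, can be illustrated with a minimal sketch. This is not the estimator used in the study, whose exact form is not given here. It assumes a simple one-way random-effects model in which each individual's mean is pulled toward the cohort mean by a weight that depends on that individual's number of visits, with variance components estimated by the ANOVA method of moments. The function name `shrink_repeated_measures` and the example values are hypothetical.

```python
from statistics import mean


def shrink_repeated_measures(measurements):
    """Illustrative empirical Bayes shrinkage of per-individual trait means.

    measurements: dict mapping an individual ID to a list of trait values,
    one per visit. Individuals may have different numbers of visits.
    Returns a dict mapping each ID to its shrunken summary value.
    Assumption: one-way random-effects model, not the study's own estimator.
    """
    ids = list(measurements)
    k = len(ids)
    if k < 2:
        raise ValueError("need at least two individuals")

    counts = {i: len(measurements[i]) for i in ids}
    means = {i: mean(measurements[i]) for i in ids}
    n_total = sum(counts.values())
    grand_mean = sum(sum(v) for v in measurements.values()) / n_total

    # Pooled within-individual variance (visit-to-visit noise).
    ss_within = sum((x - means[i]) ** 2 for i in ids for x in measurements[i])
    df_within = n_total - k
    var_within = ss_within / df_within if df_within > 0 else 0.0

    # Between-individual variance by the ANOVA method of moments,
    # with the usual correction n0 for unequal numbers of visits.
    ss_between = sum(counts[i] * (means[i] - grand_mean) ** 2 for i in ids)
    ms_between = ss_between / (k - 1)
    n0 = (n_total - sum(c * c for c in counts.values()) / n_total) / (k - 1)
    var_between = max((ms_between - var_within) / n0, 0.0)

    shrunk = {}
    for i in ids:
        # Fewer visits -> noisier mean -> smaller weight -> more shrinkage.
        denom = var_between + var_within / counts[i]
        weight = var_between / denom if denom > 0 else 0.0
        shrunk[i] = grand_mean + weight * (means[i] - grand_mean)
    return shrunk


if __name__ == "__main__":
    # Hypothetical unbalanced data: 3, 1, and 2 visits.
    visits = {"a": [10.0, 12.0, 11.0], "b": [20.0], "c": [15.0, 16.0]}
    for person, value in shrink_repeated_measures(visits).items():
        print(person, round(value, 3))
```

In this sketch, an individual with a single visit is shrunk proportionally more toward the cohort mean than one with three visits, which is one way unbalanced data can be handled differently from a plain average.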