To help analyze and understand aging-related complex traits that are affected by many genes and environmental factors, we develop statistical algorithms for the analysis of genome-wide genotyping and high-throughput sequencing studies. Our new computational tools efficiently extract additional types of data from whole-genome sequences, for example by identifying mitochondrial DNA (mtDNA) variants and estimating mtDNA copy number. For experimental tests of the algorithms, we are capitalizing on the special advantages of the InCHIANTI project (see Annual Report AG001050) and the SardiNIA project (see Annual Report AG000675) to help assemble mitochondrial sequence data and multiple phenotypic data in the two Italian cohorts. To study mtDNA variation and copy number in large-scale consortium data, we have developed two computational programs that together provide a general solution for the analysis of mtDNA dynamics in whole-genome sequencing studies. One program (mitoCaller) is designed specifically to identify mtDNA variants; the other (mitoCalc) infers mtDNA copy number per cell directly from genome sequences. Applying the programs to leukocyte sequences of 2,000 SardiNIA participants and 1,000 InCHIANTI participants, we have shown that heteroplasmies (mtDNA variants with more than one allele at a site) increase with age, and that copy number has relatively high heritability and is correlated with metabolic traits, particularly central fat levels. In more recent work, we have increased the speed of mitoCalc 100-fold (fastMitoCalc). The new program is being applied to white blood cells of 65,000 deeply sequenced individuals (TOPMed program, NHLBI) for GWAS on copy number. We have also begun developing programs to test possible effects of mtDNA variants on traits and diseases.
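As an illustration of how copy number can be inferred from whole-genome sequences, a common approach compares the average sequencing coverage of mtDNA with that of the autosomes, which are present in two copies per cell. The sketch below assumes this coverage-ratio estimator; it is not the mitoCalc or fastMitoCalc source code, and the function name, parameter names, and example coverage values are hypothetical.

```python
def estimate_mtdna_copy_number(mt_mean_coverage, autosomal_mean_coverage, ploidy=2):
    """Estimate mtDNA copies per cell from average sequencing coverage.

    Assumption (not taken from the report): autosomal DNA is present in
    `ploidy` copies per cell, so the mtDNA-to-autosomal coverage ratio,
    scaled by the ploidy, approximates the number of mtDNA copies per cell.
    """
    if autosomal_mean_coverage <= 0:
        raise ValueError("autosomal coverage must be positive")
    if mt_mean_coverage < 0:
        raise ValueError("mtDNA coverage cannot be negative")
    return ploidy * mt_mean_coverage / autosomal_mean_coverage


# Hypothetical sample: 3000x average mtDNA coverage, 30x autosomal coverage.
copies = estimate_mtdna_copy_number(3000.0, 30.0)
print(f"Estimated mtDNA copies per cell: {copies:.0f}")
```

In practice the two coverage values would be computed from aligned reads, and the choice of autosomal regions used for the denominator affects both speed and accuracy.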
With our expertise in studying mtDNA variation, we have an ongoing collaborative effort to study a special structural feature of DNA, G-quadruplex (G4) structures, as potential DNA roadblocks that perturb the mitochondrial replication machinery. We used computational analyses of 3,000 individual genomes from two Italian cohorts to demonstrate an association between G4s and mtDNA variation. Using the software G4Hunter to predict G4-forming regions in mtDNA, we found statistically significant enrichment of mutations in stable G4 regions, with preferential occurrence of variants in the loop segments of G4 structures. Biochemical studies demonstrated that G4 structures potently block DNA synthesis by the human mitochondrial replicative polymerase, and that this block could be overcome by the G4-resolving helicase Pif1. Altogether, the computational and biochemical approaches indicate that mtDNA point mutations are enriched at stable G4 structures, consistent with replisome stalling at G-quadruplexes and reliance on error-prone DNA synthesis.
In another study, we have created a program that uses machine learning methods to measure effective rates of aging for individuals. We assessed the extent to which an individual's physiological age could be determined as a composite score inferred from a broad range of biochemical and physiological traits from the SardiNIA and InCHIANTI longitudinal studies of aging. Physiological age inferred from our framework was highly correlated with chronological age (R² > 0.8). We then defined a physiological aging rate (PAR) for each subject, a continuous trait measured as the ratio of the subject's predicted physiological age to his or her chronological age. We found that PARs were reproducible across follow-up studies, heritable (h² = 0.3), and predictive of lifespan and mortality. Genome-wide association studies (GWAS) on the PARs identified both previously established age-associated loci and several new genetic associations. Our findings support a whole-body, pathology-independent aging effect that can be summarized by the physiological aging rate, and our method can be used to evaluate the efficacy of treatments that target aging-related processes and disease.
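The PAR definition can be illustrated with a short calculation. The sketch below covers only the ratio step described above; the machine learning model that predicts physiological age is not reproduced, and the function name and the example ages are hypothetical.

```python
def physiological_aging_rate(predicted_physiological_age, chronological_age):
    """Return the PAR: predicted physiological age divided by chronological age.

    Values above 1 indicate a subject who appears physiologically older than
    his or her chronological age; values below 1 indicate the opposite.
    """
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return predicted_physiological_age / chronological_age


# Hypothetical subjects: (predicted physiological age, chronological age).
subjects = {"subject_A": (75.0, 50.0), "subject_B": (45.0, 50.0)}
for name, (predicted, chronological) in subjects.items():
    par = physiological_aging_rate(predicted, chronological)
    print(f"{name}: PAR = {par:.2f}")
```

Because the PAR is a continuous trait, it can be carried directly into heritability estimation or GWAS as a quantitative phenotype.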