To help analyze and understand aging-related complex traits, which are affected by many genes and environmental factors, we are developing statistical algorithms for the analysis of genome-wide genotyping and high-throughput sequencing studies. Our new computational tools provide the means to analyze additional types of data, e.g., to identify mitochondrial DNA (mtDNA) variants and to estimate mtDNA copy number efficiently from whole-genome sequences. For experimental tests of the algorithms, we are capitalizing on the special advantages of the SardiNIA project (see Annual Report AG000675) and the InCHIANTI project (see Annual Report AG001050) to help assemble mitochondrial sequence data and multiple phenotypic data in the two Italian cohorts. At the same time, we are carrying out genome-wide association studies (GWAS) and epidemiological analyses with NIA and other collaborators for a series of age-related traits.
To study mtDNA variation and copy number in large-scale consortium data, we have developed two computational programs that together provide a general solution for the analysis of mtDNA dynamics based on whole-genome sequencing studies. One program (mitoCaller) is designed specifically to identify mtDNA variants; the other (mitoCalc) infers mtDNA copy number per cell directly from genome sequences. Applying the programs to leukocyte sequences of 2,000 SardiNIA participants and 1,000 InCHIANTI participants, we have shown that heteroplasmies (mtDNA variants with more than one allele at a site) increase with age, and that copy number is relatively highly heritable and is correlated with metabolic traits, particularly central fat levels. In more recent work, we have increased the speed of mitoCalc 100-fold (fastMitoCalc). The new program is being applied to white blood cell sequences of 65,000 deeply sequenced individuals (TOPMed program, NHLBI) for GWAS on copy number.
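As an illustration of how copy number can be inferred from sequencing coverage, the sketch below scales the ratio of mean mtDNA coverage to mean autosomal coverage by the two autosomal copies present in a diploid cell. This is a minimal sketch under that assumption, not the mitoCalc or fastMitoCalc implementation; the function and parameter names are hypothetical.

```python
def mtdna_copy_number(mt_coverage: float,
                      autosomal_coverage: float,
                      autosomal_ploidy: int = 2) -> float:
    """Estimate mtDNA copies per cell from mean sequencing coverage.

    Assumption (not taken from the programs described above): reads are
    sampled in proportion to DNA abundance, so the coverage ratio,
    scaled by the autosomal ploidy, approximates copies per cell.
    """
    if autosomal_coverage <= 0:
        raise ValueError("autosomal_coverage must be positive")
    if mt_coverage < 0:
        raise ValueError("mt_coverage must be non-negative")
    return autosomal_ploidy * mt_coverage / autosomal_coverage


if __name__ == "__main__":
    # Hypothetical sample: 3000x mean mtDNA coverage, 30x mean autosomal coverage.
    print(mtdna_copy_number(3000.0, 30.0))
```

In practice the two mean coverages would be computed from aligned reads; that step is omitted here because it depends on the alignment tooling used.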
We have also initiated the development of programs to test possible effects of mtDNA variants on traits and diseases.
In another study, we have created a program that uses machine learning methods to measure effective rates of aging for individuals. We assess the extent to which an individual's physiological age can be determined as a composite score inferred from a broad range of biochemical and physiological data collected in the SardiNIA population study. We apply machine learning strategies to data for 6,000 Sardinian participants, who ranged in age from 12 to 81. The best predictive models are selected from multiple combinations of dimensionality reduction, classification, and regression algorithms. They reach very strong correlations (R > 0.9) between predicted and actual ages, and show relative stability across successive visits of the same individuals (R > 0.5). We then define an Effective Rate of Aging (ERA) for each participant, a trait measured as the ratio of an individual's predicted age to his or her chronological age. The inference that individuals have a characteristic rate of aging is supported by the finding that, in the SardiNIA cohort, the inferred values of ERA show a genetic heritability of 40%. This has been sufficient to initiate genome-wide association studies to identify genetic variants influencing the rate of aging.
In an ongoing collaborative effort, we study a special structural feature of DNA, G-quadruplex (G4) structures, as potential DNA roadblocks that perturb the mitochondrial replication machinery. Computational analyses of whole-genome sequences from two large Italian cohorts demonstrated an association between G4s and mitochondrial DNA variation. We found significant enrichment of mutations in stable G4 regions, with preferential enrichment of variants in the loop segments of G4 structures. Biochemical studies demonstrated that G4 structures potently block DNA synthesis by the human mitochondrial replicative polymerase, a block that could be overcome by the G4-resolving helicase Pif1. Altogether, our results suggest that mitochondrial variants are enriched at stable G4 structures, which effectively delay or stall the mitochondrial replisome in vitro.
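The Effective Rate of Aging described above is a ratio of predicted age to chronological age, which can be sketched as follows. This is only an illustration of the definition; the function name is hypothetical, and the predicted age is assumed to come from a separately trained model that is not shown.

```python
def effective_rate_of_aging(predicted_age: float,
                            chronological_age: float) -> float:
    """Return ERA = predicted age / chronological age.

    Values above 1 indicate a predicted (physiological) age older than
    the chronological age; values below 1 indicate a younger one.
    """
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return predicted_age / chronological_age


if __name__ == "__main__":
    # Hypothetical participant: predicted age 55, chronological age 50.
    print(effective_rate_of_aging(55.0, 50.0))
```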