Tandem Repeat Expansions (TREs), most commonly of triplet repeats such as poly(CAG), are known to underlie >30 different human neurological diseases. While the majority of TREs identified to date have been found in late-onset neurodegenerative disorders such as hereditary ataxias and Huntington disease, TREs have also been identified in patients with Alzheimer's disease (AD) and certain types of dementia. In addition to expansions of short tandem repeats (those with motif sizes between 1 and 6 base pairs), copy number variation of larger repeats with motifs ≥20 bp, also known as Variable Number of Tandem Repeats (VNTRs), has recently been linked to risk of AD. However, despite this evidence that variation in tandem repeat (TR) sequences can act as the causative mutation in some cases of AD and dementia, there have been no concerted efforts in AD cohorts either to systematically screen for novel TREs or to genotype VNTR copy numbers. Newly developed bioinformatic approaches for analyzing Whole Genome Sequencing (WGS) data now provide an opportunity to fill this knowledge gap. Drawing on the expertise we have gained working on other large datasets, we propose to apply these approaches to 4,750 genomes sequenced by the Alzheimer's Disease Sequencing Project that are available to the community, and to use these data to investigate two hypotheses:

1. We hypothesize that some cases of AD are caused by rare, highly penetrant pathogenic TREs. Using novel bioinformatic tools that can identify TREs, we will search for rare TREs that are observed only in AD samples, or that show significant enrichment in AD cases compared to controls, and are thus likely to be causative for AD. Potentially pathogenic TREs will then be validated by PCR or long-read sequencing in available DNA samples.

2. We hypothesize that common polymorphic copy number variation of VNTRs can act as a genetic risk factor for AD. We have developed a novel approach based on read depth to estimate copy number of VNTRs from sequencing data. We will analyze available WGS data from 1,643 sporadic late-onset AD samples and 2,253 unrelated controls, generating copy number estimates for ~154,000 VNTRs genome-wide, which will be used to test for association between VNTR copy number and AD status in a case-control study.

Given that TREs and polymorphic variation in VNTRs both represent established mutational mechanisms that contribute to a variety of late-onset neurodegenerative conditions, we believe that the study of TR variation in AD represents a logical step that has a high likelihood of uncovering novel genetic causes of AD.
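The read-depth idea behind the second hypothesis can be illustrated with a minimal sketch. This is not the investigators' method, which the abstract does not describe in detail; it is one simple way such an analysis could be set up, under stated assumptions. The function names (`estimate_copy_number`, `permutation_p_value`), the scaling of VNTR depth by a diploid baseline depth, and the choice of a permutation test on the difference in mean copy number are all hypothetical and chosen for illustration only. A real pipeline would also need to correct for GC content, mappability, and batch effects.

```python
import random
from statistics import mean


def estimate_copy_number(vntr_depth, baseline_depth, ref_copies, ploidy=2):
    """Estimate total VNTR copies (summed over alleles) from read depth.

    Assumption: depth scales linearly with copy number, so a sample that
    carries the reference copy count on every allele has
    vntr_depth == baseline_depth.
    """
    if baseline_depth <= 0:
        raise ValueError("baseline_depth must be positive")
    return ploidy * ref_copies * vntr_depth / baseline_depth


def permutation_p_value(cases, controls, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in mean copy number."""
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    observed = abs(mean(cases) - mean(controls))
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign case/control labels
        diff = abs(mean(pooled[:n_cases]) - mean(pooled[n_cases:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical numbers: a VNTR with 10 reference copies, sequenced at
# 45x inside the repeat against a 30x diploid baseline.
copies = estimate_copy_number(vntr_depth=45.0, baseline_depth=30.0, ref_copies=10)

# Toy copy number estimates for one VNTR in six cases and six controls.
p = permutation_p_value([30, 31, 32, 33, 34, 35], [20, 21, 22, 23, 24, 25])
```

With ~154,000 VNTRs tested genome-wide, any per-locus p-value from a test like this would also need correction for multiple testing before being interpreted.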

Public Health Relevance

Although >30 tandem repeat expansions (TREs) have been identified in a variety of late-onset neurological diseases, including in patients diagnosed with Alzheimer's disease (AD) and dementia, there have been no concerted efforts to systematically screen for novel TREs in AD cohorts. Similarly, recent studies have identified copy number variation in large tandem repeats as a risk factor for AD. Using novel bioinformatics approaches, in this proposal we will analyze whole genome sequencing data from 4,750 AD samples and controls generated by the Alzheimer's Disease Sequencing Project to identify novel TREs and large tandem repeat variants that are responsible for AD.

Agency
National Institutes of Health (NIH)
Institute
National Institute of Neurological Disorders and Stroke (NINDS)
Type
Research Project (R01)
Project #
3R01NS105781-03S1
Application #
10102561
Study Section
Genetics of Health and Disease Study Section (GHD)
Program Officer
Miller, Daniel L
Project Start
2020-08-15
Project End
2023-07-31
Budget Start
2020-08-15
Budget End
2023-07-31
Support Year
3
Fiscal Year
2020
Total Cost
Indirect Cost
Name
Icahn School of Medicine at Mount Sinai
Department
Genetics
Type
Schools of Medicine
DUNS #
078861598
City
New York
State
NY
Country
United States
Zip Code
10029