Next-generation sequencing (NGS) has the potential to profile all clinically relevant genetic variants simultaneously in a single genetic test. However, clinical variant discovery pipelines have mostly focused on coding single nucleotide variants (SNVs), regulatory SNVs and small indels. This proposal aims to make repeat analysis a standard component of existing pipelines, focusing in particular on short tandem repeats (STRs), variable number tandem repeats (VNTRs), and low-copy repeats or segmental duplications. Together, these repeats account for 8% of the human genome, but are implicated in a disproportionately large number of Mendelian diseases. The proposed methods are primarily aimed at Illumina sequencing, which forms the vast majority of current Mendelian sequencing pipelines, but also include alternative technologies such as Pacific Biosciences and 10X Genomics. The first aim develops algorithms for discovery of repeat variants currently inaccessible from NGS. In the second aim, the PIs propose to generate gold-standard validation data for Mendelian repeats using multiple technologies. In the third aim, the PIs will integrate the proposed methods into existing NGS pipelines for clinical variant discovery, and also apply them to large existing data-sets to obtain genotype frequencies of large control populations. The project serves an unmet need by augmenting Mendelian variant pipelines to include highly relevant disease variants.

Public Health Relevance

Next-generation sequencing (NGS) has the potential to profile all clinically relevant genetic variants simultaneously in a single genetic test. This project aims to augment existing pipelines to include highly relevant disease variants found in repeated regions that account for over 8% of the genome and are implicated in a large number of Mendelian diseases.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project (R01)
Project #
5R01HG010149-03
Application #
9960540
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Sofia, Heidi J
Project Start
2018-09-14
Project End
2022-06-30
Budget Start
2020-07-01
Budget End
2021-06-30
Support Year
3
Fiscal Year
2020
Total Cost
Indirect Cost
Name
University of California, San Diego
Department
Biostatistics & Other Math Sci
Type
Schools of Arts and Sciences
DUNS #
804355790
City
La Jolla
State
CA
Country
United States
Zip Code
92093
Bakhtiari, Mehrdad; Shleizer-Burko, Sharona; Gymrek, Melissa et al. (2018) Targeted genotyping of variable number tandem repeats with adVNTR. Genome Res 28:1709-1719