Toxoplasma gondii is an important opportunistic pathogen of humans that can cause severe disease in the developing fetus and in people with HIV/AIDS. Despite extensive efforts by the research community to sequence, assemble and annotate multiple genomes for this organism, these genome sequences remain incomplete because of repetitive and uncloneable sequence. A major reason for this knowledge gap is that the sequencing technologies used (1st and 2nd generation) cannot fully resolve these loci. Because thousands of base pairs of sequence are missing and/or unassembled, the research community cannot make fully effective use of the data, which are hosted on the EuPathDB Bioinformatics Resource Center (BRC). Here we propose to resequence and generate de novo assemblies for multiple T. gondii isolates (as well as two other species that serve as comparators) using 3rd generation sequencing and chromosome conformation-based sequencing approaches, and then to annotate these assemblies and integrate them into the EuPathDB BRC. Our preliminary data show the feasibility of this approach: we have used it to revise the karyotype for T. gondii (discovering that it harbors 13, rather than 14, chromosomes), increase the total genome assembly by ~2 Mb, and perform genome-wide analyses of structural and/or copy number variation at loci with a known role in T. gondii pathogenesis. The proposed studies are responsive to RFA PA-19-068, "Secondary Analysis of Existing Datasets for Advancing Infectious Disease Research," by specifically using data outside of the EuPathDB BRC (our de novo assemblies and annotations) to improve the utility of data within the EuPathDB BRC (gene expression, annotation and proteomics data, for example). Moreover, the analysis pipeline will rely on the existing genome sequence data within the EuPathDB BRC to identify sequence differences between our new assemblies and those hosted by the BRC.
In addition to the expertise of the PI in genome sequencing and in the function of multicopy loci encoding pathogenesis determinants, the proposed studies benefit from the assembled team, which includes experts in Chromosome Conformation Capture-based sequencing approaches (Le Roch) and in sequence assembly and annotation (Lorenzi).

Public Health Relevance

Toxoplasma is an important opportunistic pathogen of humans that has infected over a billion people worldwide and can cause severe disease in HIV/AIDS patients, the developing fetus and other susceptible children and adults. Our goal is to improve existing genome sequence assemblies for this organism using the latest technologies.

Agency
National Institutes of Health (NIH)
Institute
National Institute of Allergy and Infectious Diseases (NIAID)
Type
Exploratory/Developmental Grants (R21)
Project #
1R21AI154386-01
Application #
10048453
Study Section
Pathogenic Eukaryotes Study Section (PTHE)
Program Officer
Joy, Deirdre A
Project Start
2020-06-10
Project End
2022-05-31
Budget Start
2020-06-10
Budget End
2021-05-31
Support Year
1
Fiscal Year
2020
Total Cost
Indirect Cost
Name
University of Pittsburgh
Department
Biology
Type
Schools of Arts and Sciences
DUNS #
004514360
City
Pittsburgh
State
PA
Country
United States
Zip Code
15260