Disseminated coccidioidomycosis (DCM) is a rare and potentially life-threatening consequence of infection with Coccidioides spp., soil-dwelling fungal pathogens native to the deserts of the Southwestern USA. Why a small subset (<5%) of otherwise healthy people develop this outcome after infection while most do not is largely unknown. However, evidence points to genetics, primarily variation affecting the immune system. To discover the systems-level genetic patterns and pathways associated with DCM, we will examine the differential distribution of variants in biologically meaningful gene sets at genome-wide scale to find patterns that underlie disease susceptibility. Focusing on aggregated, systems-level sets allows us to find patterns despite cross-patient differences, and substantially increases our statistical power by reducing the number of features directly tested. This study will be the first of its scope and kind, using the largest cohort ever assembled for this disease: DNA collected from 147 susceptible DCM cases and 388 resistant controls who presented with self-limited pulmonary coccidioidomycosis (PUL). The data and results gathered under this proposal will thus provide a unique resource and lay important foundations for the study of DCM pathogenesis. The DNA has been both (i) genotyped genome-wide for common variation and (ii) exome sequenced to identify rare, protein-altering variation. In our first Aim, we will study the genotype data in three ways. First, we will test for association between DCM and variation in the human leukocyte antigen (HLA) region, which plays an important role in many infectious diseases. Second, we will examine the distribution of patient genotypes at infection-relevant response expression quantitative trait locus (reQTL) variant sets to model whether DCM and PUL participants differ in ways correlated with their phenotype.
These reQTL variant sets are groups of DNA positions where different alleles can cause stronger or weaker gene expression responses after infection or stimulation; differences between DCM and PUL participants at those positions could imply that their capacity to respond to Coccidioides differs. Third, we will conduct a pathway-association study using both hypothesis-driven and unbiasedly selected pathways (gene sets) to test whether these gene sets are enriched for variants associated with DCM. For our second Aim, we will analyze the rare variants found in patient exomes to test whether DCM participants have an excess of rare, protein-damaging mutations in immune or other candidate pathways. We will use a two-step version of the small-sample-size-optimized Sequence Kernel Association Test to compare the distribution of these mutations between DCM and PUL participants. Using a pathway, or gene set, approach allows us to detect differentially impacted systems rather than requiring each participant to carry the same single-gene mutation. Results of the studies under these two Aims will lead to a better understanding of the human genetic variation associated with DCM, help us understand this emerging infectious disease (NIAID Category C), identify people at lower or higher risk, and ultimately build towards improvements in clinical care and patient experience.
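The rationale for aggregating variants at the gene set level can be illustrated with a minimal sketch. It is not the two-step Sequence Kernel Association Test named above; it substitutes a much simpler carrier-collapsing permutation test, and the gene names, participant data, and function names are all hypothetical. Each participant is reduced to a single flag (carries a qualifying variant in any pathway gene, or not), so cases who carry variants in different genes of the same pathway still contribute to one shared signal.

```python
import random

def pathway_carrier_flags(carried_genes, pathway):
    """Return 1 for each participant with a qualifying variant in any pathway gene, else 0."""
    return [1 if genes & pathway else 0 for genes in carried_genes]

def permutation_p_value(flags, is_case, n_perm=2000, seed=0):
    """One-sided permutation test: are pathway carriers over-represented among cases?"""
    rng = random.Random(seed)
    n_cases = sum(is_case)
    observed = sum(f for f, c in zip(flags, is_case) if c)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # After shuffling, the first n_cases entries act as a random "case" group.
        if sum(shuffled[:n_cases]) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data: genes in which each participant carries a rare, damaging variant.
pathway = {"GENE_A", "GENE_B", "GENE_C"}  # hypothetical immune pathway
cases = [{"GENE_A"}, {"GENE_B"}, {"GENE_C"}, {"GENE_A"}, set(), {"GENE_B"}]
controls = [set() for _ in range(8)] + [{"GENE_C"}, {"GENE_X"}]

carried_genes = cases + controls
is_case = [1] * len(cases) + [0] * len(controls)

flags = pathway_carrier_flags(carried_genes, pathway)
p_value = permutation_p_value(flags, is_case)
print("pathway carriers among cases:", sum(flags[:len(cases)]), "of", len(cases))
print("permutation p-value:", p_value)
```

In this toy data no single gene is mutated in more than two cases, yet five of the six cases carry a variant somewhere in the pathway, which is the kind of pattern a per-gene test would be poorly placed to detect.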
We will compare the DNA (genome-wide genotype and exome sequence data) of 147 well-phenotyped people who developed disseminated coccidioidomycosis (DCM, also known as disseminated Valley Fever) with that of 388 people who had self-limited pulmonary coccidioidomycosis (PUL) to identify genetic patterns and differences associated with risk. In analyzing the genotype data, we will look for associations between DCM and human leukocyte antigen (HLA) haplotypes, known immune response variant sets (reQTLs), and biological process pathways. In analyzing the exome sequence data, we will determine whether DCM patients carry more rare, damaging variation, or a different distribution of it, in candidate and unbiasedly selected biological process pathways.