We have established a large multi-center collaboration, GRASP (Genome Research in African American Scleroderma Patients), comprising 24 centers outside the NIH, and have enrolled the largest cohort of African American scleroderma (systemic sclerosis, SSc) patients. To date, we have collected DNA samples from 1200 African American SSc patients and have received DNA samples from 1200 controls from Charles Rotimi at NHGRI. Serum samples from 1039 controls have been tested for anti-nuclear antibody (ANA), and DNA samples from ANA-negative controls have been used for further genetic studies. For the genome-wide association study (GWAS), 1008 SSc patients and 1008 controls have been genotyped on the Illumina Multi-Ethnic Global Array (MEGA), which contains 1.7 million markers, and on a second, custom Illumina OmniExpressExome array, which contains 1 million markers. We will genotype an additional 432 samples (both patients and controls) on the MEGA array for replication and joint analysis. Whole exome sequencing (WES) has been performed on 400 SSc patients and 482 controls. Based on our preliminary analysis, 392 genes from WES and 44 genes from GWAS have undergone targeted resequencing in another 600 SSc patients and 360 controls. After quality control filtering of the Illumina MEGA array data, 934 patients and 946 controls remained and were included in the analysis. High-quality variants from the MEGA array were used for imputation against the 1000 Genomes Phase 3 v5 reference panel. GWAS of the MEGA array data identified class II human leukocyte antigen (HLA) genes as the strongest risk factor for SSc susceptibility in the African American population. We are analyzing the HLA region using two approaches: classical HLA allele association and single nucleotide polymorphism (SNP) association. After imputation of classical HLA types, the HLA type most significantly associated with SSc was a predominantly African allele, HLA-DRB1*08:04, with an odds ratio of 2.95 (95% CI 2.26-3.85).
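As a reading aid for the HLA-DRB1*08:04 estimate, the sketch below back-calculates the log-scale standard error and z statistic implied by the reported odds ratio and interval. It assumes the interval is a Wald-type 95% interval that is symmetric on the log-odds scale, which the abstract does not state; the function name `wald_summary` is illustrative only.

```python
import math


def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover log-scale quantities from an odds ratio and its 95% CI.

    Assumption (not stated in the abstract): the interval is a Wald
    interval, i.e. exp(log(OR) +/- z_crit * SE).
    """
    log_low = math.log(ci_low)
    log_high = math.log(ci_high)
    # Half the interval width on the log scale equals z_crit * SE.
    se = (log_high - log_low) / (2 * z_crit)
    # For a Wald interval the geometric midpoint should reproduce the OR.
    midpoint_or = math.exp((log_low + log_high) / 2)
    # z statistic for the null hypothesis OR = 1.
    z = math.log(odds_ratio) / se
    return se, midpoint_or, z


# Values reported for HLA-DRB1*08:04 in the text.
se, midpoint_or, z = wald_summary(2.95, 2.26, 3.85)
print(f"SE(log OR) = {se:.3f}, midpoint OR = {midpoint_or:.2f}, z = {z:.2f}")
```

If the geometric midpoint of the interval matches the reported odds ratio, the numbers are at least internally consistent with a Wald interval; the same check can be applied to the other odds ratios quoted in this abstract.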
Regression analysis conditioning on the disease-associated alleles identified another African DRB1 allele, *11:02, as well as HLA-DPB1*13:01 and HLA-DRB4*01:01, as independent contributors to disease risk. Among African American patients, 34.6% carry the African ancestry HLA alleles DRB1*08:04 or *11:02, compared with 16.3% of controls. Stratifying the SSc samples by autoantibody status revealed very strong and specific HLA allele associations: HLA-DRB1*08:04 increased risk 7.2-fold in the anti-fibrillarin antibody subset of African American SSc, and HLA-DPB1*13:01 increased risk 4.1-fold in the anti-topoisomerase I antibody subset. HLA-DRB1*08:04 is associated with a specific amino acid change at position 74 in the peptide-binding groove of the DRB1 molecule, and this change could lead to recognition of a specific self-antigen (i.e., fibrillarin). Likewise, HLA-DPB1*13:01 is associated with a specific amino acid change at position 76 in the peptide-binding groove of the DPB1 molecule, and this change could lead to recognition of a specific self-antigen (i.e., topoisomerase I). The top SNP in the GWAS was rs35915063, near the HLA-DQB1 gene, with P = 2.2 × 10⁻¹⁷ and OR = 1.96 (95% CI 1.7-2.3). Two non-HLA, African ancestry-specific loci, IFT43/TGFB3 and FSD2/HOMER2, were also identified. In an eQTL analysis of GTEx RNA sequencing data from sun-exposed skin of the lower leg, we observed decreased expression of TGFB3 associated with the minor allele in the IFT43/TGFB3 region. We have performed WES on 400 patients and 482 controls using a Nimblegen capture kit that targets 64 Mb of coding exons and miRNA regions plus 32 Mb of untranslated regions. We divided the exomic variants into three categories based on frequency and deleteriousness.
Next, we performed gene-level association tests in the discovery series using the Combined Multivariate and Collapsing (CMC), C-alpha, optimized sequence kernel association test (SKAT-O), and kernel-based adaptive cluster (KBAC) methods for all three variant categories. Our preliminary analysis of the WES data for overall SSc, using the three variant categories, identified 392 genes at P < 0.005 by SKAT-O. Using a Nimblegen capture-based target enrichment kit, 436 genes selected from WES and GWAS have been sequenced in 600 SSc patients and 360 controls. We will use the meta-SKAT test to perform a gene-level meta-analysis. This approach will be taken for overall SSc as well as for clinical and autoantibody subsets of SSc. In samples carrying variants in genes found to be enriched for rare and low-frequency variants, those variants will be confirmed by Sanger sequencing. We will also search for rare, homozygous coding variants in SSc patients and analyze them for aggregation in a gene or pathway. Each gene enriched for rare and low-frequency variants will have its own story and may involve fibrosis, cytokine signaling, inflammatory pathways, or epithelial-to-mesenchymal transition. These genes could also be part of a common pathway, and studying the dysregulation of that pathway may yield greater insight into SSc pathogenesis.
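To illustrate the idea behind the gene-level tests named above, the following is a minimal sketch of a collapsing burden test: all rare variants in a gene are collapsed into a single carrier indicator per sample, and carrier counts in patients and controls are compared with a one-sided Fisher exact test. This is a simplified stand-in for the collapsing step of CMC, not the CMC, SKAT-O, C-alpha, or KBAC procedures themselves and not the project's analysis pipeline. The function names and the toy genotypes are hypothetical.

```python
from math import comb


def collapse(genotypes):
    """Collapse per-variant minor-allele counts into one carrier flag per sample.

    `genotypes` is a list of samples; each sample is a list of minor-allele
    counts (0, 1, or 2) for the rare variants in one gene.
    """
    return [int(any(count > 0 for count in sample)) for sample in genotypes]


def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher exact P value for carrier enrichment in cases.

    Sums hypergeometric probabilities of seeing at least `case_carriers`
    carriers among cases, given the total number of carriers.
    """
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    upper = min(carriers, n_cases)
    tail = sum(
        comb(carriers, k) * comb(total - carriers, n_cases - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / comb(total, n_cases)


def burden_test(case_genotypes, control_genotypes):
    """Collapse both groups and test for an excess of carriers in cases."""
    case_flags = collapse(case_genotypes)
    control_flags = collapse(control_genotypes)
    return fisher_one_sided(
        sum(case_flags), len(case_flags), sum(control_flags), len(control_flags)
    )


# Hypothetical toy data: 4 patients and 4 controls, 3 rare variants in one gene.
toy_cases = [[0, 1, 0], [0, 0, 0], [1, 0, 0], [0, 0, 2]]
toy_controls = [[0, 0, 0], [0, 0, 0], [0, 1, 0], [0, 0, 0]]
p_value = burden_test(toy_cases, toy_controls)
print(f"one-sided burden P = {p_value:.4f}")
```

A pure collapsing test of this kind loses power when a gene carries both risk and protective variants, which is one reason variance-component methods such as SKAT-O are run alongside burden-style tests.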

Support Year: 1
Fiscal Year: 2019
Name: National Institute of Arthritis and Musculoskeletal and Skin Diseases