Once a patient is diagnosed with Fanconi anemia (FA), identifying the causative gene and mutations is an arduous task. The conventional screening process is sequential and multi-step, and is therefore inefficient and expensive. FA genes are large, with multiple exons, and harbor a wide spectrum of compound heterozygous mutations spread throughout each gene, including large genomic deletions. As a result, the molecular diagnosis of nearly half of the 800 families enrolled in the International Fanconi Anemia Registry (IFAR) remained unknown. We employed massively parallel sequencing to sequence large (2 Mb) regions of the genome representing all FA and related DNA-repair pathway genes. We designed comparative genomic hybridization arrays (aCGH) to explore large copy number variants in the same set of genes. We also employed RNA-seq to determine the pathogenicity of unsuspected variants that result in aberrant splicing. The use of complementary technologies allowed for successful identification of mutations in FA genes in 43 individuals: FANCA (17), FANCB (4), FANCC (5), FANCD1 (1), FANCE (1), FANCD2 (3), FANCF (2), FANCG (2), FANCI (1), FANCJ (4) and FANCL (3). This strategy is an effective approach for identifying the variants underlying a genetically highly heterogeneous disorder such as FA, and ensures timely and efficient molecular diagnosis of future enrollees.
Though FA patients can carry mutations in any of the 16 known genes, about two-thirds are affected by mutations in the FANCA gene. Thus, for all FA individuals, checking for FANCA mutations may serve as an efficient initial step. Earlier, we used Sanger sequencing to sequence the FANCA coding region and splice junctions in DNA from 195 FA patients. This year, we explored a next-generation sequencing methodology, TruSeq Custom Amplicon, for screening all 43 FANCA exons along with 100 bp of the adjacent regions in DNA from 58 patients.
Sixty-seven custom amplicons (200 bp in length) were designed, targeting a total of 14,642 bp that covered nearly the entire length of the RefSeq FANCA transcript (6090 of the 6191 bp). After capture, sequencing, and alignment to the reference genome, the sequence depth ranged widely, from 204 to 6215 (median 2762), except for four exons (1, 6-7, 15), where the depth was much lower and ranged from 20 to 78. Of the 58 DNA samples sequenced, we found two FANCA mutations in 24, one in 24, and none in ten. The TruSeq Custom Amplicon method allows efficient evaluation of sequence variations in a large number of DNA samples at once, and the read depth (hundreds- to thousands-fold) should allow detection of variants present in a small proportion of patient DNA.
Deletions account for a substantial proportion of mutations in FANCA. As part of a comprehensive effort to identify all the disease-causing mutations for patients enrolled in the IFAR, we analyzed 202 FA families for deletion and insertion mutations using high-throughput methods, including comparative genomic hybridization arrays (aCGH). The arrays contained 135,000 50-mer probes, spaced at an average interval of 37 bp, spanning up to 200 kb upstream and downstream of the 15 known FA genes and 12 other functionally relevant genes. We found deletions in 98 families: 88 FANCA, seven FANCC, two FANCD2, and one FANCB. The precise boundaries identified by aCGH enabled the design of PCR assays across the deleted regions, followed by cloning and sequencing across the breakpoints. Fifty-two FANCA deletion ends and one FANCC deletion end were found to extend beyond the gene boundaries, potentially affecting neighboring genes. Eighty percent of the FANCA deletion breakpoints are Alu-Alu mediated, predominantly by AluY elements. Individual Alu hotspots were identified in introns 21, 17 and 5.
Defining the haplotypes of four FANCA deletions shared by multiple families revealed that three share a common ancestry, and all are of recent origin. We are now employing MLPA to detect deletions in FANCA exons in patient DNA samples available only in quantities insufficient for aCGH analysis. Detailed characterization of deletions is critical for a better understanding of the FA phenotypes. In summary, our sequencing and aCGH efforts have identified a spectrum of FA gene mutations in over 230 patients. Our goal is to comprehensively catalog mutations in all patients enrolled in the IFAR.
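
The remark that read depths in the hundreds to thousands should allow detection of variants present in a small proportion of patient DNA can be illustrated with a simple binomial calculation. The sketch below is only an illustration under stated assumptions, not the analysis used in this project: the 5% variant fraction, the threshold of five supporting reads, and the neglect of sequencing error are all hypothetical choices, and the function name is invented for the example. Only the read depths are taken from the text above.

```python
from math import comb


def detection_probability(depth, variant_fraction, min_alt_reads):
    """Probability of observing at least `min_alt_reads` variant reads.

    Assumes each read independently carries the variant with probability
    `variant_fraction` (a binomial model that ignores sequencing error).
    """
    p_below_threshold = sum(
        comb(depth, k)
        * variant_fraction ** k
        * (1 - variant_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below_threshold


# Read depths reported in the text: low-coverage exons (20, 78) and the
# minimum, median, and maximum for the remaining amplicons (204, 2762, 6215).
REPORTED_DEPTHS = [20, 78, 204, 2762, 6215]

# Hypothetical parameters chosen for illustration only.
VARIANT_FRACTION = 0.05
MIN_ALT_READS = 5

for depth in REPORTED_DEPTHS:
    p = detection_probability(depth, VARIANT_FRACTION, MIN_ALT_READS)
    print(f"depth {depth:>5}: P(>= {MIN_ALT_READS} variant reads) = {p:.4f}")
```

Under these assumptions the detection probability rises steeply with depth, which is consistent with the caution implied for the four low-coverage exons: a low-fraction variant could plausibly go unseen at a depth of 20 to 78 even when it would be readily observed at the median depth.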

National Human Genome Research Institute