The mouse t-complex on chromosome 17 has long been a focus of interest for developmental biologists. Because many embryonic lethal mutations map to the region, and genes important in development are therefore inferred to reside there, it is a prime target for high-priority sequence analysis. The findings of our collaborators that the region contains a large number of otherwise unknown genes, including many expressed selectively in the ectoplacental cone, both underline the importance of the region and provide probes to screen for substrates for long-range sequence analysis. We have thus far recovered 11.6 Mb of DNA in BACs, representing just over one third of the estimated 30 Mb region. This includes a contiguous 5 Mb stretch from Brachyury to Sod2, a 1.4 Mb contig covering Fgd2 through Tbp, a 2.5 Mb region spanning MGD cM positions 10.0-12.0, and a 1.7 Mb contig covering the tw5 and Tir1 critical regions. The remaining portions exist in seed contigs, which are being actively extended. Overall, the current map comprises 592 STSs (i.e., 5.0 STSs/100 kb) and 400 BACs. To date, a total of 7.5 Mb of sequence has been generated from 38 BAC clones with greater than 99.9% accuracy. The database now contains 5.8 Mb of this sequence, of which 1.75 Mb has been fully annotated and the rest is represented as HTGS phase 1. The remaining 1.7 Mb is being annotated for submission. The GC content of the sequenced region varies from 40% to a high of 50%, with a corresponding range of gene densities. Our main tools for sequence analysis are comparison with the GenBank EST database, the exon-predicting program GRAIL, the repetitive sequence-finding program CENSOR, and the use of CpG content as an assay for CpG islands and genes.
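As an illustration of how GC content and CpG content can be used as an assay for CpG islands, the following is a minimal sketch in Python. It is not the pipeline used in this project. The function names, the window and step sizes, and the thresholds (GC fraction above 0.5 and an observed/expected CpG ratio above 0.6 in windows of at least 200 bp, the commonly cited Gardiner-Garden and Frommer criteria) are assumptions chosen for the example.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def candidate_windows(seq, window=200, step=100, min_gc=0.5, min_oe=0.6):
    """Return (start, end, gc, obs_exp) for windows passing both thresholds.

    The default thresholds are assumptions for illustration only.
    """
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = gc_fraction(chunk)
        oe = cpg_obs_exp(chunk)
        if gc > min_gc and oe > min_oe:
            hits.append((start, start + window, gc, oe))
    return hits


if __name__ == "__main__":
    # Toy sequence: AT-rich flanks around a CpG-dense core.
    toy = "AT" * 100 + "CG" * 100 + "AT" * 100
    for hit in candidate_windows(toy):
        print(hit)
```

In practice, overlapping candidate windows would be merged and the resulting intervals compared against EST matches and exon predictions before a region is treated as a likely gene-associated CpG island.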
Thus far, analysis and annotation of 17 BACs has defined the gene structures of the following known genes in the region: T, T2, QkII, multiple isoforms of Qk-1, Plg, Slca22a1, Slca22a2, Slca22a3, Igf2r, Thbs2, Dll1, Psmb1, Tbp, Pdcd2, RabIIb, MyoIf, Zfp54, and Zfp51. We have also placed at defined positions in this contig genes that were previously known but had not been precisely localized within the t-complex. These include the Parkinson-related-2 (Parkin2), Brp44, Lysophosphatidic acid acyltransferase, Mekk4, Zfp118, Stromelysin, Clc-7, Ubiquitin conjugating enzyme, and Pdpk1 genes, and a gene for a Zn metalloendopeptidase. The sequence of one BAC, which has been published, identified 11 genes, of which 5 are known: Rsp29, Als, Nubp3, Jsap1, and Ndk3. The remaining 6 predicted genes are new. Other novel genes predicted in the studies thus far include genes homologous to human brain-expressed KIAA0183, two metalloproteases, two KRAB zinc-finger genes, two genes with zinc-finger motifs, a gene with a PHD finger domain, and a gene similar to MORC, a nuclear protein required for mouse spermatogenesis. BACs carrying candidate genes from the embryonic lethal loci are being used to construct transgenes to test their role by complementation.
Abe, Kuniya; Yuzuriha, Misako; Sugimoto, Michihiko et al. (2004) Gene content of the 750-kb critical region for mouse embryonic ectoderm lethal tcl-w5. Mamm Genome 15:265-76
Kargul, G J; Nagaraja, R; Shimada, T et al. (2000) Eleven densely clustered genes, six of them novel, in 176 kb of mouse t-complex DNA. Genome Res 10:916-23