Repbase Update is a relational database of repetitive elements representing over 8500 families and subfamilies of transposable elements (TEs) from eukaryotes. Each family is identified by a unique name, and its annotation includes the species of origin, systematic classification, keywords, references to the scientific literature, names of contributors, and brief commentaries, among other details. Most of the 5198 repetitive families added to Repbase Update (RU) during the last cycle are either unreported anywhere else or have been thoroughly revised. Our approach is based on computer-assisted reconstruction and analysis of public DNA sequence data. Original contributions are first reported in an electronic journal named Repbase Reports. RU is used throughout the world by academic and research institutions for basic research and genome annotation. The database has become a unique resource for individual research projects of biological and medical importance and for the creation of secondary databases. During the next five years, RU needs to grow at a rate of ~1000 entries per year to meet the demand created by genome-sequencing projects. This information will be extracted, reconstructed, analyzed, annotated, classified, indexed and made available to researchers over the internet. In addition, we will discover and study diverse superfamilies and conserved repeats, which are of primary interest for cutting-edge research.
We propose the following specific aims to meet the challenge: (1) continue detection, reconstruction, annotation and electronic distribution of reference sequences for repetitive families from all sequenced eukaryotic species; (2) continue studies of TEs in newly sequenced eukaryotic genomes in collaboration with sequencing consortia; (3) continue systematic identification and analysis of new superfamilies and classes of TEs; (4) excavate and characterize remnants of TEs overrepresented in conserved non-coding regions and cis-regulatory modules in eukaryotic genomes; and (5) organize two conferences devoted to stimulating Repbase-related research, data submission, data dissemination, training and standardization of repeat nomenclature.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Biotechnology Resource Grants (P41)
Study Section: Special Emphasis Panel (ZLM1-AP-M (J2))
Program Officer: Ye, Jane
Genetic Information Research Institute
Mountain View
United States
Hubley, Robert; Finn, Robert D; Clements, Jody et al. (2016) The Dfam database of repetitive DNA families. Nucleic Acids Res 44:D81-9
Kojima, Kenji K (2015) A New Class of SINEs with snRNA Gene-Derived Heads. Genome Biol Evol 7:1702-12
Kojima, Kenji K; Jurka, Jerzy (2015) Ancient Origin of the U2 Small Nuclear RNA Gene-Targeting Non-LTR Retrotransposons Utopia. PLoS One 10:e0140084
Kojima, Kenji K; Jurka, Jerzy (2013) A superfamily of DNA transposons targeting multicopy small RNA genes. PLoS One 8:e68260
Wheeler, Travis J; Clements, Jody; Eddy, Sean R et al. (2013) Dfam: a database of repetitive DNA based on profile hidden Markov models. Nucleic Acids Res 41:D70-82
Jurka, Jerzy; Bao, Weidong; Kojima, Kenji K et al. (2012) Distinct groups of repetitive families preserved in mammals correspond to different periods of regulatory innovations in vertebrates. Biol Direct 7:36
Groenen, Martien A M; Archibald, Alan L; Uenishi, Hirohide et al. (2012) Analyses of pig genomes provide insight into porcine demography and evolution. Nature 491:393-8
Lehnert, Stefan; Kapitonov, Vladimir; Thilakarathne, Pushpike J et al. (2011) Modeling the asymmetric evolution of a mouse and rat-specific microRNA gene cluster intron 10 of the Sfmbt2 gene. BMC Genomics 12:257
Jurka, Jerzy; Bao, Weidong; Kojima, Kenji K (2011) Families of transposable elements, population structure and the origin of species. Biol Direct 6:44
Kojima, Kenji K; Kapitonov, Vladimir V; Jurka, Jerzy (2011) Recent expansion of a new Ingi-related clade of Vingi non-LTR retrotransposons in hedgehogs. Mol Biol Evol 28:17-20

Showing the most recent 10 out of 46 publications