Experiments on model organisms are fundamental to biomedical research and underpin the research that leads to healthcare advances. Five of the most important models are mouse (developmental and behavioural studies), nematode (developmental and parasitological studies), budding yeast (fundamental molecular studies), rat (pharmacological, behavioural and neurological studies) and zebrafish (developmental, neurological and toxicological studies). Model Organism Databases (MODs) have been established to capture and curate the wealth of data on these model organisms. Modern biology has produced the complete DNA sequence ("genome") of the human as well as of these model organisms. In turn, this has led to a new era of research in which experiments are carried out at the whole-genome scale. The success of genomics has created a challenge: to integrate genomic datasets within the MODs in such a way that all scientists, not only specialist bioinformaticians, can query them and extract data flexibly. InterMine software was developed to greatly increase the power and flexibility with which scientists can utilize genomic data, first as part of previous work in support of another model organism, the fruit fly, and more recently to manage the data from the multi-institutional NIH-funded modENCODE project. InterMine was designed to be applied easily to other organisms and areas of biology.
The aim of this project is to apply the InterMine software to the above five MODs: mouse, nematode, budding yeast, rat and zebrafish. This provides a number of advantages to each database: user-community-driven functionality that is not yet available; a standard interface common to all MODs; and greater interoperation between MODs, providing a common set of tools to compare and contrast the properties of genes and proteins within this set of organisms, a feature that is not generally available today. This project will be carried out as a collaboration between the team that developed InterMine, based in Cambridge, UK, and the teams that develop and maintain the five MODs, based at the Jackson Laboratory (mouse, MGI), the Ontario Institute for Cancer Research (nematode, WormBase), Stanford University (yeast, SGD), the Medical College of Wisconsin (rat, RGD) and the University of Oregon (zebrafish, ZFIN). This proposal provides one staff member per site, and the resulting team will work together to transfer data into, and add analysis tools to, InterMine databases that will be integrated at each MOD site. A benefit of working together in this way is that developments at one site can immediately benefit the others. By the end of the project the MODs will be able to provide far greater functionality to their research communities, and improvements to the underpinning InterMine software will be freely available to the broader biological database community. The proposed project is unique in its integration of experimental results across the major model organisms. This integration is essential for advancing our understanding of molecular genetics, cell biology, developmental biology, physiology and, most importantly, human health and disease.

Public Health Relevance

The recent decoding of the human genome sequence has unprecedented implications for the future of human healthcare through improved understanding of human development, functioning, aging and disease. However, much of the experimental work required to fully understand these events cannot be done in humans and must therefore be carried out in so-called model organisms. The proposed project will address a pressing need to improve the efficiency with which the huge amounts of model organism data being generated can be integrated, analyzed and compared: improved understanding of model systems will lead to improved understanding of humans and thus to better disease diagnosis, prognosis, prevention and cure.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Bonazzi, Vivien
University of Cambridge
United Kingdom
Zip Code: CB2 1TN
Kalderimis, Alex; Lyne, Rachel; Butano, Daniela et al. (2014) InterMine: extensive web services for modern biology. Nucleic Acids Res 42:W468-72
Howe, Douglas G; Bradford, Yvonne M; Conlin, Tom et al. (2013) ZFIN, the Zebrafish Model Organism Database: increased support for mutants and transgenics. Nucleic Acids Res 41:D854-60
Cherry, J Michael; Hong, Eurie L; Amundsen, Craig et al. (2012) Saccharomyces Genome Database: the genomics resource of budding yeast. Nucleic Acids Res 40:D700-5
Balakrishnan, Rama; Park, Julie; Karra, Kalpana et al. (2012) YeastMine--an integrated data warehouse for Saccharomyces cerevisiae data as a multipurpose tool-kit. Database (Oxford) 2012:bar062
Bradford, Yvonne; Conlin, Tom; Dunn, Nathan et al. (2011) ZFIN: enhancements and updates to the Zebrafish Model Organism Database. Nucleic Acids Res 39:D822-9
Howe, Douglas G; Frazer, Ken; Fashena, David et al. (2011) Data extraction, transformation, and dissemination through ZFIN. Methods Cell Biol 104:311-25