Seattle Children's Research Institute is awarded a grant to build a community-driven, high-throughput proteomics analysis environment called SPIRE (Systematic Protein Investigative Research Environment). SPIRE is designed to be modular and will integrate the best publicly available methods and open community standards. The broad applicability of proteomics analyses in biology is hindered by low knowledge penetration throughout the community regarding how and why tools are used, and which tools should be used, and when, to produce statistically sound results. The proliferation of piecemeal, single-use tools and the absence of comprehensive approaches place a high integration and management burden on researchers. A solution is to create an environment that emphasizes integration, modularity, flexibility, ease of use, and community involvement. The SPIRE modules will encompass each stage of mass spectrometry proteomics analyses, such as raw data uploads, experimental design, peptide and protein identification, protein expression measures, and analysis. SPIRE development will be tailored to, and driven by, the needs of the research community. SPIRE is a community science project for proteomics and, as a result, will drastically accelerate proteomics-based biological and environmental research and empower the community as a whole.

In conjunction with SPIRE, the team will create InSPIRE, a web-based proteomics resource for middle-school education. InSPIRE will provide resources, reviews, and reference materials related to protein science to "inspire" the next generation of researchers. Additional information about SPIRE and InSPIRE will be available at www.proteinspire.org.

Project Report

Proteins are the building blocks of living things, and although a lot is known about many of them, there is much more still to be discovered. Further, proteins are just part of the complex set of molecules that work together in all of us. Scientists around the world are working to understand these molecules by gathering data about DNA (genomics), RNA (transcriptomics), and proteins (proteomics) for humans and other organisms. We aspire to know which genes and proteins are identified; where (organism, tissue, condition, experiment) and when (developmental stage, time point) they are expressed; how much of each of these genes and proteins is expressed; and how these genes and proteins interact and in what biological processes (pathways).

This project resulted in two new community resources for omics data, a meta-data checklist for omics data, and an educational resource for school-age children and the public about protein research. SPIRE (Systematic Proteomics Investigative Research Environment, www.proteinspire.org) integrates search tools and statistical models into a freely available web-based proteomics research pipeline. By implementing novel methods to identify proteins, SPIRE improves researchers' ability to identify proteins in biological samples and provides a consistent method for combining results across experiments. During this project we identified the proteomics community's great need and desire for a standardized and integrated database of protein expression. Using the ability of SPIRE to re-process and re-analyze publicly available proteomics data, we created MOPED (originally the Model Organism Protein Expression Database, moped.proteinspire.org). Because proteins are only part of the complexity, we have since expanded MOPED into a multi-omics resource (Multi-Omics Profiling Expression Database).

MOPED enables researchers to examine data for proteins and genes (transcripts) in one place, making investigations easier and more informative. MOPED supports querying, browsing, and data visualization of expression data across organisms (human, yeast, worm and mouse), tissues (e.g. brain, lung, blood), conditions (e.g. anaerobic, acidic), and pathways (sets of proteins working together within the organism). MOPED also links to resources around the world, such as Entrez, GeneCards, GO, PDB and UniProt, so that even more information can be accessed.

To improve data quality and support reproducibility, we created a multi-omics checklist. The checklist gathers information about the experimental design, experimental protocols, instrumentation, and data processing and analysis methods, so that researchers can determine which experiments are similar enough to be compared with confidence. Through DELSA (Data-Enabled Life Sciences Alliance, www.delsaglobal.org) community outreach and the Advisory Committee, we continued to refine the omics-specific checklist using a combination of established biological terms and relationships available through Open Biological and Biomedical Ontologies, best community practices, and developments from the NSF-funded communities.

In addition to the scientific resources, the project has developed an educational resource called InSPIRE for K-12 education, consisting of a hands-on demonstration of proteomics. This demonstration has been presented to nearly 30,000 students through the Pacific Science Center and elsewhere over a period of four years. The presentation has been summarized and is being disseminated through the Kolker Lab website (www.kolkerlab.org/projects/principles-of-proteomics).

The resources created by this NSF-funded project provide the scientific community, students and the public with the means to explore a wide variety of omics data and experiments. This will enable them to make data-driven biological discoveries and better understand complex biological systems.

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Application #: 0969929
Program Officer: Peter H. McCartney
Project Start:
Project End:
Budget Start: 2010-09-01
Budget End: 2014-08-31
Support Year:
Fiscal Year: 2009
Total Cost: $1,973,699
Indirect Cost:
Name: Children's Hospital and Regional Medical Center
Department:
Type:
DUNS #:
City: Seattle
State: WA
Country: United States
Zip Code: 98105