Within the biological research community, access to high-quality data is essential for producing reliable results that contribute to an understanding of life and nature. Billions of dollars are invested each year to generate such data. However, data is being generated far faster than the scientific community can locate the best data for a given project. The traditional route, in which a researcher waits for peer-reviewed publications to be text-mined and then searches PubMed for interesting conclusions, is too slow. A new approach is therefore needed to accelerate the search, discovery, and analysis process that is at the heart of scientific inquiry. The proposed technology, OmicSeq, addresses this by extracting signal directly from raw data, thus avoiding the publication cycle, which itself can be biased and typically reflects only a fraction of the value of the data. Moreover, researchers seeking quality data can use the search portal offered by OmicSeq to rapidly identify publicly available (or privately generated) data sets known to be correlated with a given gene or pathway query. In effect, OmicSeq does for biological data what Google has done for web pages: it finds the data most closely related to a genomic query. This accelerates the research cycle and allows investigators to reuse and re-examine repository data that would otherwise go unused. OmicSeq saves the researcher time (and therefore money) and helps extract more value from the data that has already been generated.

This I-Corps team plans to market the OmicSeq system using the software-as-a-service (SaaS) business model. While a basic version of OmicSeq will be provided free online, the team will make the majority of the database and data mining tools available only to customers who pay an annual licensing fee. A similar model has been adopted by numerous bioinformatics service providers. The team expects to attract customers in academia and industry, as well as research institutes and government agencies. In surveying university research institutions, the team has determined that annual site licenses range from $20,000 to $50,000, depending on the degree of access offered. However, the team is actively conducting research to determine the appropriate licensing costs for pharmaceutical companies, which typically pay a much higher license fee for at least two reasons: 1) pharmaceutical companies have private data, so security is of utmost importance; and 2) the volume of their data can be much larger than the publicly processed data sources that comprise the base offering of OmicSeq. Most of the proposed work at this point relates to accurately assessing the pharmaceutical market.

Project Start:
Project End:
Budget Start: 2016-05-15
Budget End: 2018-01-31
Support Year:
Fiscal Year: 2016
Total Cost: $50,000
Indirect Cost:
Name: Emory University
Department:
Type:
DUNS #:
City: Atlanta
State: GA
Country: United States
Zip Code: 30322