Neomorphic Software, Inc. intends to develop new algorithmic analysis methods, software implementations, and annotated biosequence data sets for the elucidation of new genes and of genomic and proteomic relationships. The motivation behind this SBIR grant is to develop new methods for the improved annotation of genomic, cDNA, and protein data, given some or all of these types of data in concert. Faced with the massive efforts to sequence expressed sequence tags (ESTs) and genomic DNA in both the public and private sectors, Neomorphic aims to derive knowledge from these data through new statistical analysis techniques. The SBIR Phase II research project will build on the success of Phase I, in which new hidden Markov model (HMM)-based algorithmic methods were invented for the alignment, error correction, and homology identification of ESTs and for the identification of genes in genomic DNA. Phase II research will focus on the annotation of nucleic acid sequences, with specific emphasis on (1) the identification of protein motifs, domains, and remote homologies that would aid in the classification of EST sequences that include relatively high rates of indels and substitutions and are currently unclassified, and (2) the identification and functional characterization of new genes using EST and protein homology information from preliminary consensus genomic DNA obtained from low-coverage shotgun sequencing. The new analysis methods will aid scientists in assimilating evidence for the precise annotation of genomic or transcriptional (cDNA) data, including intron/exon boundaries, untranslated regions (UTRs), transcription start sites and other regulatory elements, codon structure, frame shifts, base-call corrections, single nucleotide polymorphisms (SNPs), alternative splicing, putative protein prediction, and associations with homologous protein sequences, families, and motifs.

Proposed Commercial Applications

Our software will allow biotechnology and pharmaceutical companies to mine EST databases for critical new lead targets and, as further human genomic sequence becomes available and new functional genomics platforms are developed, to fully characterize human genes involved in critical disease pathways. We will contribute substantial value-added information to both public and private biosequence databases, greatly enhancing the value of this vital data.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Small Business Innovation Research Grants (SBIR) - Phase II (R44)
Project #
5R44HG001801-03
Application #
6181637
Study Section
Special Emphasis Panel (ZRG1-SSS-Y (01))
Program Officer
Brooks, Lisa
Project Start
1998-05-22
Project End
2001-06-30
Budget Start
2000-07-01
Budget End
2001-06-30
Support Year
3
Fiscal Year
2000
Total Cost
$433,341
Indirect Cost
Name
Neomorphic Software, Inc.
Department
Type
DUNS #
City
Berkeley
State
CA
Country
United States
Zip Code
94710