The broader impact/commercial potential of this Small Business Innovation Research (SBIR) project is the development of an artificial intelligence-based software platform to elucidate inherited diseases by identifying causative genetic factors across the whole genome, not just the small portion of the genome called the exome. This technology will help solve the longstanding problem of multigenic inherited diseases and help unlock the full potential of gene therapy modalities such as CRISPR, which require knowledge of the causal mutation for targeting. A defined genetic target makes truly curative therapeutics and early, accurate diagnostics possible. Through this platform, pharmaceutical companies will benefit from a shortened drug development cycle and a lower risk of clinical trial failure, while diagnostics companies will be able to develop fast and accurate molecular diagnostics targeting the mutations identified by the platform.

This SBIR Phase I project proposes to build a proof-of-concept software platform that employs a combination of supervised and unsupervised machine learning algorithms to process, sort, and analyze human whole genome sequencing data from an Amyotrophic Lateral Sclerosis (ALS) cohort with a defined genetic background. ALS is an incurable, debilitating disease whose genetic causes remain largely unknown. Identifying the causal mutations will enable the development of efficacious treatments for the condition. The platform has three main objectives. First, it will process full-size, high-coverage human whole genome data automatically through the pipeline in a scalable manner. Second, it will identify the genetic background of a test subject with a top-5 error rate of <1%, an important verification step that minimizes incorrect cohort stratification caused by false ancestry self-reports. Third, it will rapidly identify at least one genetic feature known to be associated with ALS (e.g., variants in SOD1 or C9ORF72), which will help provide early validation for the platform.
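
The top-5 error rate in the second objective can be illustrated with a minimal sketch. It assumes the ancestry classifier returns one score per candidate genetic background for each subject; a prediction counts as an error only when the true background is absent from the five highest-scoring candidates. The function name, the six-class toy scores, and the labels are hypothetical and are not taken from the project.

```python
from typing import Sequence


def top_k_error_rate(
    scores: Sequence[Sequence[float]],
    true_labels: Sequence[int],
    k: int = 5,
) -> float:
    """Fraction of subjects whose true class is not among the k top-scoring classes."""
    if len(scores) != len(true_labels):
        raise ValueError("scores and true_labels must have the same length")
    if len(scores) == 0:
        raise ValueError("at least one subject is required")

    misses = 0
    for row, truth in zip(scores, true_labels):
        # Rank class indices from highest score to lowest.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if truth not in ranked[:k]:
            misses += 1
    return misses / len(scores)


# Hypothetical toy data: 4 subjects, 6 candidate genetic backgrounds.
toy_scores = [
    [0.50, 0.20, 0.10, 0.10, 0.05, 0.05],
    [0.05, 0.10, 0.40, 0.30, 0.10, 0.05],
    [0.30, 0.25, 0.20, 0.15, 0.07, 0.03],
    [0.10, 0.10, 0.10, 0.10, 0.10, 0.50],
]
toy_labels = [0, 3, 5, 5]

top5_error = top_k_error_rate(toy_scores, toy_labels, k=5)
meets_target = top5_error < 0.01  # the <1% threshold stated in the objective
```

On real data the same calculation would be run over a held-out set of subjects with verified ancestry, and the objective is met when the resulting rate falls below 0.01.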

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Start:
Project End:
Budget Start: 2018-07-01
Budget End: 2019-06-30
Support Year:
Fiscal Year: 2018
Total Cost: $225,000
Indirect Cost:
Name: Genetic Intelligence
Department:
Type:
DUNS #:
City: New York
State: NY
Country: United States
Zip Code: 10029