Automated genome-phenome analysis

The declining cost of whole exome sequencing (WES) is nearing the point at which the spread of WES into clinical practice will be limited largely by the cost of interpreting the results and comparing them to the patient's clinical findings. This project tests the feasibility of reducing this interpretation cost by pairing automated genome sequencing with an automated comparison of the patient's findings to the "phenotype" of findings of known diseases. The project uses the SimulConsult diagnostic tool to provide "phenome analysis" and then integrates it with genome analysis results to provide an automated genome-phenome analysis.
Aim 1 is to compute unified severity scores for genome-phenome analysis so as to replace the current methods, which use iterative manual modifications of Boolean filtering of variants. The new approach is a one-pass method based on quantitative severity scores that are then processed by comparison to the phenome. This approach combines many assessments of gene variants provided by SeattleSeq, including conservation scores, read quality scores and variant frequency in the population, to automatically construct quantitative severity scores. To refine the weightings of the inputs to the quantitative severity score, 10 patients will be analyzed for whom SimulConsult has already been used to assist in diagnosis. This builds upon the ability, added to SimulConsult in 2012, to import and process the "variant table" of WES results that includes the HGNC gene name, severity score, and zygosity. Also, the ability will be added to import more than one variant table and compute with the intersection (i.e., variants present in both) so familial genetic information can be incorporated.
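As an illustration of the one-pass idea in Aim 1, the following is a minimal sketch and not SimulConsult's actual method. It assumes each input has already been scaled to the range 0 to 1, and the `Variant` fields, the `WEIGHTS` values, and the function names are all hypothetical. It combines conservation, read quality and population frequency into one severity score, builds variant-table rows of gene name, severity score and zygosity, and keeps only the variants present in both of two family members' lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str             # HGNC gene name
    change: str           # hypothetical variant identifier
    zygosity: str         # e.g. "het" or "hom"
    conservation: float   # assumed scaled to 0..1, higher = more conserved
    read_quality: float   # assumed scaled to 0..1, higher = more reliable call
    pop_frequency: float  # allele frequency in the population, 0..1


# Hypothetical weights, chosen only for illustration.
WEIGHTS = {"conservation": 0.5, "read_quality": 0.2, "rarity": 0.3}


def severity_score(v: Variant) -> float:
    """Weighted sum of the inputs; rarer variants score higher."""
    rarity = 1.0 - v.pop_frequency
    return (WEIGHTS["conservation"] * v.conservation
            + WEIGHTS["read_quality"] * v.read_quality
            + WEIGHTS["rarity"] * rarity)


def variant_table(variants):
    """One row per variant: (gene, severity score, zygosity)."""
    return [(v.gene, round(severity_score(v), 3), v.zygosity) for v in variants]


def shared_variants(first, second):
    """Variants from `first` that also appear in `second`, matched on gene and change."""
    keys = {(v.gene, v.change) for v in second}
    return [v for v in first if (v.gene, v.change) in keys]
```

Matching on gene and change rather than on the whole record lets two affected relatives share a variant even when their zygosity or read quality differs. A single weighted sum replaces the repeated manual adjustment of Boolean filters with one quantitative pass over the variants.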
Aim 2 is to assess the effectiveness of automated genome-phenome analysis in identifying known disease-causing genes in patients by retrospectively analyzing 20 patient cases in which WES was already performed on a family with two or more affected members and a known disease-causing mutation was found. The diagnostic accuracy will be assessed by (1) the rank of the correct diagnosis and (2) the probability assigned by the software. This will compare the genome alone, phenome alone, and genome + phenome approaches, as well as other situations involving incidence and onset ages.
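The two accuracy measures in Aim 2 can be sketched as follows. This is an illustrative sketch under stated assumptions: the software's output is taken to be a mapping from diagnosis name to assigned probability, and the function names `rank_and_probability` and `compare_approaches` are hypothetical, not part of SimulConsult.

```python
def rank_and_probability(differential, correct):
    """Return (rank, probability) of the correct diagnosis.

    `differential` maps diagnosis name -> probability assigned by the software.
    Rank 1 is best; the rank is one more than the number of diagnoses with a
    strictly higher probability. A diagnosis that is missing from the
    differential gets rank None and probability 0.0.
    """
    if correct not in differential:
        return None, 0.0
    p = differential[correct]
    rank = 1 + sum(1 for q in differential.values() if q > p)
    return rank, p


def compare_approaches(results, correct):
    """Apply both measures to each approach's differential.

    `results` maps an approach label (for example "genome", "phenome",
    "genome+phenome") to the differential produced under that approach.
    """
    return {name: rank_and_probability(diff, correct)
            for name, diff in results.items()}
```

Calling `compare_approaches` once per retrospective case gives a rank and a probability for each approach, which can then be summarized across the 20 cases.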
Aim 3 is to determine the need for having genomes from others in the family, by assessing differences between examining only the proband and utilizing information on a second affected family member.
The overall goal is to create and test the capability for making WES more practical to analyze and more accurate by integrating phenome information with the genome information, combining two independent assessments of the diagnosis. Today, interpretation costs exceed reimbursement rates, and interviews with relevant labs suggest a need for lower costs. As the phenotype becomes known for a greater fraction of genetic abnormalities, the applicability of the automated genome-phenome analysis and the market for it will grow.

Public Health Relevance

Automated genome-phenome analysis. With the declining cost of whole genome sequencing, the main cost of such testing is becoming the cost of interpreting the huge amount of data that is generated. This project combines the power of using diagnostic software to examine all known diagnoses (the phenome) with the power of whole exome sequencing to examine the genome. In automating the genome-phenome analysis, this project brings the power of genome analysis to clinical practice - lowering costs while increasing accuracy.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Small Business Innovation Research Grants (SBIR) - Phase I (R43)
Study Section
Special Emphasis Panel (ZRG1-IMST-J (15))
Program Officer
Bonazzi, Vivien
Simulconsult, Inc.
Chestnut Hill
United States