High-throughput Epistasis Screening using Genetical Genomics

A fast software tool is proposed for identifying potential sets of interacting genes involved in human disease pathways. A meta-analysis of marker and expression-trait studies is performed using penalized regression software running in parallel on commodity graphics cards. The research team includes experts from genomics, statistics and software acceleration. Data will come from published studies. Initial results suggest promise for our approach.

Epistasis is a key area of investigation in the elucidation of human-disease pathways. eQTL experiments have shown promise in identifying epistasis for given expression traits. We will leverage the success of eQTLs by employing the results of GWAS experiments to suggest specific expression traits to study. In this way we will exploit the findings of multiple, disparate studies in an overall meta-analysis of a disease trait.

Various forms of regression analysis are currently used to screen eQTL data for epistasis, especially stepwise linear regression. We will employ penalized regression techniques, because of their speed advantage, their ability to identify multiple candidates simultaneously and their relative novelty. We will apply several distinct types of penalized regression, each with its own predictor-selection characteristics. We have strong in-house expertise in penalized regression.

As more and larger genomic data sets become available, effective means for combining and mining them become essential. The sheer mass of the data, moreover, will require high-performance software in order to provide analysis in reasonable time. Parallel computation is one promising area for improving software performance. We will employ the new generation of inexpensive, widely available graphics coprocessors to run our software in parallel. Successful application will demonstrate that relevant, large-data bioinformatics solutions can be implemented on modestly priced desktop hardware.
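
As a rough illustration of how penalized regression can screen for epistasis, the sketch below fits an L1-penalized (lasso) model over marker main effects and all pairwise marker products, then ranks the terms that keep a nonzero coefficient. It is not the software described in this project: the lasso is only one of several possible penalties, the code runs serially on the CPU rather than on a graphics card, and the function names, the 0/1/2 genotype coding, the penalty value and the simulated data are all assumptions made for the example.

```python
import itertools
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def lasso_coordinate_descent(columns, y, lam, n_sweeps=150):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    `columns` is a list of centred predictor columns; `y` is the centred trait.
    """
    n = len(y)
    beta = [0.0] * len(columns)
    resid = list(y)  # residual y - X b, kept up to date after every update
    col_sq = [sum(v * v for v in col) / n for col in columns]
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):
            if col_sq[j] == 0.0:
                continue  # a constant column carries no information
            # Correlation of column j with the residual that excludes its own fit.
            rho = sum(c * r for c, r in zip(col, resid)) / n + col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new_beta
    return beta


def screen_pairwise_epistasis(genotypes, trait, lam):
    """Fit main effects plus all pairwise products; return nonzero terms, largest first.

    Labels are tuples of marker indices: (j,) for a main effect, (j, k) for a pair.
    """
    n_markers = len(genotypes[0])
    labels, columns = [], []
    for j in range(n_markers):
        labels.append((j,))
        columns.append(center([row[j] for row in genotypes]))
    for j, k in itertools.combinations(range(n_markers), 2):
        labels.append((j, k))
        columns.append(center([row[j] * row[k] for row in genotypes]))
    beta = lasso_coordinate_descent(columns, center(trait), lam)
    hits = [(label, b) for label, b in zip(labels, beta) if b != 0.0]
    return sorted(hits, key=lambda item: -abs(item[1]))


def simulate(n_samples=200, n_markers=6, seed=0):
    """Hypothetical toy data: the trait depends only on the product of markers 1 and 3."""
    rng = random.Random(seed)
    genotypes = [
        [rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_markers)]
        for _ in range(n_samples)
    ]
    trait = [row[1] * row[3] + rng.gauss(0.0, 0.2) for row in genotypes]
    return genotypes, trait


if __name__ == "__main__":
    genotypes, trait = simulate()
    for label, coef in screen_pairwise_epistasis(genotypes, trait, lam=0.1):
        print(label, round(coef, 3))
```

Because every candidate term is penalized in a single fit, several interacting pairs can surface at once, in contrast to the one-term-at-a-time behavior of stepwise regression. The inner sums over samples are independent across predictors, which is the kind of work that lends itself to parallel hardware.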

Public Health Relevance

Personalized medicine is based on the observation that susceptibility to disease has a strong genetic component. This genetic component consists of groups of highly interacting genes. We will develop high-speed software able to process the huge amounts of data needed to identify these interactions and the role they play in disease susceptibility.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Small Business Innovation Research Grants (SBIR) - Phase I (R43)
Project #
1R43HG005936-01
Application #
8002139
Study Section
Special Emphasis Panel (ZRG1-IMST-E (15))
Program Officer
Struewing, Jeffery P
Project Start
2010-09-27
Project End
2012-08-31
Budget Start
2010-09-27
Budget End
2012-08-31
Support Year
1
Fiscal Year
2010
Total Cost
$181,857
Indirect Cost
Name
Insilicos
Department
Type
DUNS #
126643241
City
Seattle
State
WA
Country
United States
Zip Code
98109