Multi-point and multi-locus analysis of genomic association data

Li, Jing

Abstract

Genome-wide association studies (GWAS) provide a new and powerful approach to investigate the effect of inherited genetic variation on the risk of complex diseases. With recent advances in genotyping technology, genome-wide association studies are now becoming a reality, and data from GWAS are expected to accumulate at an accelerating rate. Despite tremendous efforts in developing efficient algorithms for mapping complex diseases and traits, single-locus approaches remain the primary method for GWAS. However, multiple genetic factors, environmental factors, and their interactions are known to play an important role in the etiology of complex diseases. Novel and practical approaches that simultaneously model multiple variables and their interactions across hundreds of thousands of single nucleotide polymorphisms (SNPs) are greatly needed. In this project, we propose to develop efficient algorithms and practical statistical tools to address two important problems in the context of genome-wide association studies: multi-point analysis and multi-locus analysis. For multi-point analysis, our Dynamic Hidden Chain Markov Model (DHCMM) can jointly model historical recombination and mutations, haplotype structures and frequencies, and associations, and is expected to be more effective than existing approaches. For multi-locus analysis, we propose to use an advanced machine learning approach to jointly screen SNPs that are predictive of diseases. Our integrated software system MAVEN will facilitate management, analysis, visualization, and results sharing of GWA data using cutting-edge technologies. The true value of GWAS depends on the development of effective computational models and tools. We anticipate that this research project will greatly accelerate the understanding of the genetic architecture of complex diseases.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 2R01LM008991-04
Application #: 7652746
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane

Project Start: 2006-03-15
Project End: 2011-07-31
Budget Start: 2009-08-01
Budget End: 2010-07-31
Support Year: 4
Fiscal Year: 2009
Total Cost: $951,009
Indirect Cost

Institution

Name: Case Western Reserve University
Department: Engineering (All Types)
Type: Schools of Engineering
DUNS #: 077758407

City: Cleveland
State: OH
Country: United States
Zip Code: 44106

Related projects


NIH 2010 R01 LM	Multi-point and multi-locus analysis of genomic association data	Li, Jing / Case Western Reserve University	$930,276
NIH 2009 R01 LM	Multi-point and multi-locus analysis of genomic association data	Li, Jing / Case Western Reserve University	$951,009
NIH 2008 R01 LM	Efficient Analysis of SNPs & Haplotypes with Applications in Gene Mapping	Li, Jing / Case Western Reserve University	$409,445
NIH 2007 R01 LM	Efficient Analysis of SNPs & Haplotypes with Applications in Gene Mapping	Li, Jing / Case Western Reserve University	$393,862
NIH 2006 R01 LM	Efficient Analysis of SNPs & Haplotypes with Applications in Gene Mapping	Li, Jing / Case Western Reserve University	$422,754

Publications

Wang, Wenhui; Yang, Sen; Zhang, Xiang et al. (2014) Drug repositioning by integrating target information through a heterogeneous network model. Bioinformatics 30:2923-30

Wang, Wei-Bung; Jiang, Tao; Gardner, Shea (2013) Detection of homologous recombination events in bacterial genomes. PLoS One 8:e75230

Wang, Wenhui; Yin, Xiaolin; Soo Pyon, Yoon et al. (2013) Rare variant discovery and calling by sequencing pooled samples with overlaps. Bioinformatics 29:29-38

Wang, Wenhui; Yang, Sen; Li, Jing (2013) Drug target predictions based on heterogeneous graph inference. Pac Symp Biocomput :53-64

Hayes, Matthew; Li, Jing (2013) Bellerophon: a hybrid method for detecting interchromosomal rearrangements at base pair resolution using next-generation sequencing data. BMC Bioinformatics 14 Suppl 5:S6

Azad, Rajeev K; Li, Jing (2013) Interpreting genomic data via entropic dissection. Nucleic Acids Res 41:e23

Pirola, Yuri; Bonizzoni, Paola; Jiang, Tao (2012) An efficient algorithm for haplotype inference on pedigrees with recombinations and mutations. IEEE/ACM Trans Comput Biol Bioinform 9:12-25

Li, Xin; Li, Jing (2012) Haplotype inference. Methods Mol Biol 850:411-21

Xie, Minzhu; Li, Jing; Jiang, Tao (2012) Detecting genome-wide epistases based on the clustering of relatively frequent items. Bioinformatics 28:5-12

Hayes, Matthew; Pyon, Yoon Soo; Li, Jing (2012) A model-based clustering method for genomic structural variant prediction and genotyping using paired-end sequencing data. PLoS One 7:e52881

Showing the most recent 10 out of 54 publications
