Genome-wide association studies (GWAS) are commonplace despite the lack of a comprehensive bioinformatics approach to analyzing the data. The common method of analysis is to apply parametric statistics and then adjust for the large number of tests performed in order to limit false positives (i.e., type I errors). This agnostic approach is preferred by some because it makes no assumptions about which genes or genomic regions might be important; the logic is that the data should tell us where the important genetic variants are. The goal of our proposed research program is to directly compare this agnostic approach with a bioinformatics approach that selects associated SNPs based on expert knowledge about biochemical pathways and gene function. We propose to develop a bioinformatics approach for selecting SNPs from a GWAS using knowledge about the biology of the genes being studied and the molecular pathology of disease (AIM 1). To do so, we will modify and extend the Exploratory Visual Analysis (EVA) database and software, which was originally designed for microarray studies with pilot funding from the NLM BISTI program. We will then use this bioinformatics approach alongside an agnostic statistical approach to detect SNPs associated with plasma levels of tissue plasminogen activator (t-PA) and plasminogen activator inhibitor 1 (PAI-1) in a large population-based sample of Caucasians (n=2000) from the PREVEND study in Groningen, The Netherlands (AIM 2). SNPs identified by both methods in the PREVEND study will be evaluated first for replication in an independent population-based sample of Caucasians (n=2000) from the Rotterdam Study in The Netherlands and then for validation in a population-based sample of Blacks (n=2000) from the HeART Study in Ghana, Africa (AIM 3). Finally, we will compare how many and which SNPs replicate and validate under the statistical approach and under the bioinformatics approach (AIM 4).
Our working hypothesis is that the bioinformatics approach will yield more validated SNPs, and hence more genuine associations, than the agnostic statistical approach.
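
The contrast between the two selection strategies can be illustrated with a minimal sketch. It rests on several assumptions that are not in the abstract: the multiple-testing adjustment is taken to be a Bonferroni correction (the abstract does not name one), the knowledge-based approach is reduced to testing only a pre-selected candidate list (the proposed EVA-based method is more involved), and all SNP names, p-values, and the candidate list are made up for illustration.

```python
# Hypothetical illustration only: SNP names, p-values, and the
# candidate list below are invented, not taken from any study.

def bonferroni_hits(p_values, alpha=0.05):
    """Return the SNPs whose p-value passes a Bonferroni-adjusted threshold.

    Bonferroni is an assumed choice of correction: the threshold is
    alpha divided by the number of tests performed.
    """
    threshold = alpha / len(p_values)
    return {snp for snp, p in p_values.items() if p <= threshold}


def knowledge_filtered_hits(p_values, candidate_snps, alpha=0.05):
    """Test only SNPs pre-selected from prior knowledge.

    Fewer tests are performed, so the corrected threshold is less strict,
    but SNPs outside the candidate list can never be detected.
    """
    subset = {snp: p for snp, p in p_values.items() if snp in candidate_snps}
    return bonferroni_hits(subset, alpha) if subset else set()


# Made-up association p-values for five SNPs.
p_values = {
    "snp_a": 1e-9,
    "snp_b": 0.02,
    "snp_c": 0.2,
    "snp_d": 0.03,
    "snp_e": 0.6,
}
# Made-up list of SNPs in genes flagged by pathway knowledge.
pathway_snps = {"snp_b", "snp_c"}

agnostic = bonferroni_hits(p_values)                        # 5 tests, threshold 0.05 / 5
informed = knowledge_filtered_hits(p_values, pathway_snps)  # 2 tests, threshold 0.05 / 2

print("agnostic:", sorted(agnostic))
print("informed:", sorted(informed))
```

In this toy setting the two strategies select different SNPs: the agnostic analysis keeps only the strongest signal, while the knowledge-based analysis recovers a weaker signal inside the candidate list and misses everything outside it. Which set holds up in independent samples is the kind of question the replication and validation aims are designed to answer.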

Public Health Relevance

The technology for measuring information about the human genome is advancing rapidly. Despite these advances, the computational methods for analyzing the resulting data have not kept pace. We will develop new computer algorithms and software that can be used to identify genetic biomarkers of common human diseases, and then compare this approach with an analysis strategy based only on statistical methods.

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 5R01LM010098-04
Application #: 8332339
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane
Project Start: 2009-09-30
Project End: 2013-09-29
Budget Start: 2012-09-30
Budget End: 2013-09-29
Support Year: 4
Fiscal Year: 2012
Total Cost: $312,154
Indirect Cost: $94,945
Name: Dartmouth College
Department: Genetics
Type: Schools of Medicine
DUNS #: 041027822
City: Hanover
State: NH
Country: United States
Zip Code: 03755
Madan, Juliette C; Hoen, Anne G; Lundgren, Sara N et al. (2016) Association of Cesarean Delivery and Formula Supplementation With the Intestinal Microbiome of 6-Week-Old Infants. JAMA Pediatr 170:212-9
Greene, Anna C; Giffin, Kristine A; Greene, Casey S et al. (2016) Adapting bioinformatics curricula for big data. Brief Bioinform 17:43-50
Frost, H Robert; Shen, Li; Saykin, Andrew J et al. (2016) Identifying significant gene-environment interactions using a combination of screening testing and hierarchical false discovery rate control. Genet Epidemiol 40:544-557
Yao, Xiaohui; Yan, Jingwen; Kim, Sungeun et al. (2016) Two-dimensional enrichment analysis for mining high-level imaging genetic associations. Brain Inform :
Du, Lei; Huang, Heng; Yan, Jingwen et al. (2016) Structured sparse CCA for brain imaging genetics via graph OSCAR. BMC Syst Biol 10 Suppl 3:68
Lin, Honghuang; Mueller-Nurasyid, Martina; Smith, Albert V et al. (2016) Gene-gene Interaction Analyses for Atrial Fibrillation. Sci Rep 6:35371
Huang, Minjun; Graham, Britney E; Zhang, Ge et al. (2016) Evolutionary triangulation: informing genetic association studies with evolutionary evidence. BioData Min 9:12
Meddens, Claartje A; Harakalova, Magdalena; van den Dungen, Noortje A M et al. (2016) Systematic analysis of chromatin interactions at disease associated loci links novel candidate genes to inflammatory bowel disease. Genome Biol 17:247
Qiu, Jingya; Moore, Jason H; Darabos, Christian (2016) Studying the Genetics of Complex Disease With Ancestry-Specific Human Phenotype Networks: The Case of Type 2 Diabetes in East Asian Populations. Genet Epidemiol 40:293-303
Cheng, Samantha; Andrew, Angeline S; Andrews, Peter C et al. (2016) Complex systems analysis of bladder cancer susceptibility reveals a role for decarboxylase activity in two genome-wide association studies. BioData Min 9:40
