Integrated discovery and hypothesis testing of new associations in rare diseases

Rabadan, Raul

Abstract

Rare diseases are studied in isolated laboratories, forgotten by mainstream pharmaceutical companies, and considered almost academic curiosities. Finding variables that correlate with or cause rare diseases (a condition is rare when it affects fewer than 1 person in 2,000) is a difficult task. The low number of cases and the sparse nature of the reports make it difficult to obtain statistically significant, meaningful results. There are two ways to avoid these problems. The first is to integrate reported cases and associations to generate enough statistical power. The second is to use an independent data set that is big enough to cover rare cases. Each of the two methods has intrinsic problems. For instance, a literature search puts together different studies, each with its own biases in population, methodology and objectives. On the other hand, blind searches for associations in big databases introduce a large number of false positives due to multiple hypothesis testing. These problems could be avoided by developing innovative methods that allow the integration of information and methodologies in the literature and longitudinal databases. To achieve this goal, we propose a team that combines expertise in natural language processing systems (Carol Friedman), electronic health records (George Hripcsak), and statistics in combined databases and computational virology (Raul Rabadan). This team will generate an interdisciplinary approach to mine and integrate the literature and the dataset collected at Columbia/New York Presbyterian Hospital. Identifying unusual correlations in rare diseases is the first step to understanding the origin of the diseases and to finding a cure for them. We hypothesize that we will develop effective methods aimed at improving our understanding of rare diseases by combining hypothesis testing and hypothesis discovery, and by integrating information from the literature and from the patient record to obtain increased statistical power. This will involve using natural language processing and statistical methods to mine both the literature and the electronic health record (EHR).
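
The multiple hypothesis testing problem mentioned in the abstract can be illustrated with a short sketch. The abstract does not say which correction, if any, the project uses; the Benjamini-Hochberg procedure below is a standard technique chosen here purely for illustration. The function names and the example p-values are hypothetical.

```python
def expected_false_positives(num_tests, alpha=0.05):
    """Expected number of false positives when every null hypothesis is true
    and each test is run at significance level alpha with no correction."""
    return num_tests * alpha


def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of hypotheses rejected by the Benjamini-Hochberg
    procedure, which controls the false discovery rate at level alpha.

    Illustrative only: not taken from the grant described above.
    """
    m = len(p_values)
    # Rank the hypotheses by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Reject every hypothesis ranked at or below that cutoff.
    return sorted(order[:cutoff])


if __name__ == "__main__":
    # Hypothetical p-values from ten candidate disease associations.
    p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
                0.060, 0.074, 0.205, 0.212, 0.216]
    uncorrected = [i for i, p in enumerate(p_values) if p <= 0.05]
    print("Rejected without correction:", uncorrected)
    print("Rejected with Benjamini-Hochberg:", benjamini_hochberg(p_values))
    print("Expected false positives in 10,000 blind tests:",
          expected_false_positives(10_000))
```

In this made-up example, five associations pass an uncorrected 0.05 threshold but only two survive the correction, which is the kind of gap a blind search across a large database would have to account for.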

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 3R01LM010140-02S1
Application #: 8142701
Study Section: Special Emphasis Panel (ZLM1-AP-E (M3))
Program Officer: Sim, Hua-Chuan

Project Start: 2009-07-01
Project End: 2011-06-30
Budget Start: 2010-07-01
Budget End: 2011-06-30
Support Year: 2
Fiscal Year: 2010
Total Cost: $10,000
Indirect Cost

Institution

Name: Columbia University (N.Y.)
Department: Internal Medicine/Medicine
Type: Schools of Medicine
DUNS #: 621889815

City: New York
State: NY
Country: United States
Zip Code: 10032

Related projects


NIH 2010 R01 LM	Integrated discovery and hypothesis testing of new associations in rare diseases Rabadan, Raul / Columbia University (N.Y.)	$531,496
NIH 2010 R01 LM	Integrated discovery and hypothesis testing of new associations in rare diseases Rabadan, Raul / Columbia University (N.Y.)	$10,000
NIH 2009 R01 LM	Integrated discovery and hypothesis testing of new associations in rare diseases Rabadan, Raul / Columbia University (N.Y.)	$533,007

Publications

Pasqualucci, Laura; Khiabanian, Hossein; Fangazio, Marco et al. (2014) Genetics of follicular lymphoma transformation. Cell Rep 6:130-40

Chan, Joseph M; Rabadan, Raul (2013) Quantifying pathogen surveillance using temporal genomic data. MBio 4:e00524-12

Trifonov, Vladimir; Pasqualucci, Laura; Tiacci, Enrico et al. (2013) SAVI: a statistical algorithm for variant frequency identification. BMC Syst Biol 7 Suppl 2:S2

Anthony, S J; St Leger, J A; Pugliares, K et al. (2012) Emergence of fatal avian influenza in New England harbor seals. MBio 3:e00166-12

Vilar, Santiago; Harpaz, Rave; Uriarte, Eugenio et al. (2012) Drug-drug interaction through molecular structure similarity analysis. J Am Med Inform Assoc 19:1066-74

Silverstein, Samuel C; Rabadan, Raul (2012) How many neutrophils are enough (redux, redux)? J Clin Invest 122:2776-9

Singh, Devendra; Chan, Joseph Minhow; Zoppoli, Pietro et al. (2012) Transforming fusions of FGFR and TACC genes in human glioblastoma. Science 337:1231-5

Dapito, Dianne H; Mencin, Ali; Gwak, Geum-Youn et al. (2012) Promotion of hepatocellular carcinoma by the intestinal microbiota and TLR4. Cancer Cell 21:504-16

Greenbaum, Benjamin D; Li, Olive T W; Poon, Leo L M et al. (2012) Viral reassortment as an information exchange between viral segments. Proc Natl Acad Sci U S A 109:3341-6

Ntziachristos, Panagiotis; Tsirigos, Aristotelis; Van Vlierberghe, Pieter et al. (2012) Genetic inactivation of the polycomb repressive complex 2 in T cell acute lymphoblastic leukemia. Nat Med 18:298-301

Showing the most recent 10 out of 21 publications
