Penn State Biomedical Big Data to Knowledge (B2D2K) Training Program

Ritchie, Marylyn; Honavar, Vasant; Li, Runze

Abstract

The BD2K initiative was developed by the NIH to enable biomedical researchers to capitalize on the Big Data being generated, foster new discovery, and increase biological knowledge. The need to train a new generation of skilled scientists in computation, informatics, and statistics to surmount the challenges of big data analysis for biological and biomedical science is widely recognized. An important recommendation with respect to big data computing was to build capacity by training the workforce in the relevant quantitative sciences such as bioinformatics, biomathematics, biostatistics, and clinical informatics. Basic science and biomedical advances rely increasingly on very large, complex datasets generated by high-throughput -omic and other biological technologies, and sound statistical reasoning and sophisticated computational techniques are needed throughout the process of analysis and discovery. This includes all stages of investigation, from experimental design and data pre-processing, de-noising, and normalization, to integrating multiple datasets, testing hypotheses, and visualizing data in interactive and informative ways. The new challenges posed by high-dimensional and complex data require that life and computer scientists working with big data acquire a substantive understanding of statistics and bioinformatics, and that statisticians working in this area, in turn, acquire a substantive understanding of biological principles, experimental technologies, and computation. These areas of expertise will converge into an interdisciplinary domain where existing statistical and computational tools are used and combined effectively, and novel methods are generated, to promote innovation and discovery in big data analysis for biomedical science.
This interdisciplinary communication is essential for the emergence of a new cadre of researchers who can effectively communicate with their peers in the complementary disciplines required for tackling real big data problems important to the life sciences. The Biomedical Big Data to Knowledge (B2D2K) Training Program at The Pennsylvania State University will bring together Data Science researchers and educators from five colleges at Penn State (the Colleges of Science, Engineering, Health and Human Development, Information Sciences and Technology, and Medicine), along with Geisinger Health System, to create a truly transformative multi-disciplinary predoctoral training environment. The goal of the B2D2K program is to train a diverse cohort of next-generation biomedical data scientists with a deep knowledge of Data Science to develop novel algorithmic and statistical methods for building predictive, explanatory, and causal models through integrative analyses of disparate types of biomedical data (including Electronic Health Records, genomics, behavioral, socio-economic, and environmental data) to advance science and improve health. We believe that the investment in this generation of data scientists will be critical to seeing 'Biomedical Big Data' utilized to its full potential.

Public Health Relevance

The Biomedical Big Data to Knowledge (B2D2K) Training Program at The Pennsylvania State University will bring together Data Science researchers and educators to create a truly transformative multi-disciplinary predoctoral training environment. The goal of the B2D2K program is to train a diverse cohort of next-generation biomedical data scientists.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Institutional National Research Service Award (T32)
Project #: 5T32LM012415-02
Application #: 9248417
Study Section: Special Emphasis Panel (ZRG1-IMST-T (50)R)
Program Officer: Ye, Jane

Project Start: 2016-04-01
Project End: 2021-03-31
Budget Start: 2017-04-01
Budget End: 2018-03-31
Support Year: 2
Fiscal Year: 2017
Total Cost: $282,768
Indirect Cost: $15,036

Institution

Name: Pennsylvania State University
Department: Biochemistry
Type: Schools of Arts and Sciences
DUNS #: 003403953

City: University Park
State: PA
Country: United States
Zip Code: 16802

Related projects


NIH 2020 T32 LM	Penn State Biomedical Big Data to Knowledge (B2D2K) Training Program Broach, James R.; Honavar, Vasant G.; Li, Runze / Pennsylvania State University
NIH 2019 T32 LM	Penn State Biomedical Big Data to Knowledge (B2D2K) Training Program Broach, James R.; Honavar, Vasant G.; Li, Runze / Pennsylvania State University
NIH 2018 T32 LM	Penn State Biomedical Big Data to Knowledge (B2D2K) Training Program Broach, James R.; Honavar, Vasant G.; Li, Runze; Ritchie, Marylyn D. / Pennsylvania State University
NIH 2017 T32 LM	Penn State Biomedical Big Data to Knowledge (B2D2K) Training Program Ritchie, Marylyn D.; Honavar, Vasant G.; Li, Runze / Pennsylvania State University	$282,768
NIH 2016 T32 LM	Penn State Biomedical Big Data to Knowledge (B2D2K) Training Program Honavar, Vasant G.; Li, Runze; Ritchie, Marylyn D. / Pennsylvania State University

Publications

Basile, Anna Okula; Ritchie, Marylyn DeRiggi (2018) Informatics and machine learning to define the phenotype. Expert Rev Mol Diagn 18:219-226

Li, Runze; Ren, Jian-Jian; Yang, Guangren et al. (2018) Asymptotic Behavior of Cox's Partial Likelihood and its Application to Variable Selection. Stat Sin 28:2713-2731

Tian, Yuan; Nichols, Robert G; Cai, Jingwei et al. (2018) Vitamin A deficiency in mice alters host and gut microbial metabolism leading to altered energy homeostasis. J Nutr Biochem 54:28-34

El-Manzalawy, Yasser; Hsieh, Tsung-Yu; Shivakumar, Manu et al. (2018) Min-redundancy and max-relevance multi-view feature selection for predicting ovarian cancer survival using multi-omics data. BMC Med Genomics 11:71

Kürüm, Esra; Hughes, John; Li, Runze et al. (2018) Time-varying copula models for longitudinal data. Stat Interface 11:203-221

Coble, Joel L; Sheldon, Kathryn E; Yue, Feng et al. (2017) Identification of a rare LAMB4 variant associated with familial diverticulitis through exome sequencing. Hum Mol Genet 26:3212-3220

Hubbard, Troy D; Murray, Iain A; Nichols, Robert G et al. (2017) Dietary Broccoli Impacts Microbial Community Structure and Attenuates Chemically Induced Colitis in Mice in an Ah receptor dependent manner. J Funct Foods 37:685-698

Hall, Molly A; Wallace, John; Lucas, Anastasia et al. (2017) PLATO software provides analytic framework for investigating complexity beyond genome-wide association studies. Nat Commun 8:1167

Kim, Dokyoon; Basile, Anna O; Bang, Lisa et al. (2017) Knowledge-driven binning approach for rare variant association analysis: application to neuroimaging biomarkers in Alzheimer's disease. BMC Med Inform Decis Mak 17:61

Walia, Rasna R; El-Manzalawy, Yasser; Honavar, Vasant G et al. (2017) Sequence-Based Prediction of RNA-Binding Residues in Proteins. Methods Mol Biol 1484:205-235

Showing the most recent 10 out of 13 publications
