In our rapidly evolving information era, methods for handling the large quantities of data obtained in biomedical research have emerged as powerful tools for confronting critical research questions, with significant impacts in domains ranging from genomics to health informatics to environmental research. The NIH's Big Data to Knowledge (BD2K) Training Consortium is expected to empower current and future generations of researchers with a comprehensive understanding of the data science ecosystem: the ability to explore, prepare, analyze, visualize, and interpret Big Data. To these ends, we propose a novel Training Coordinating Center (TCC) to coordinate the diverse activities occurring within the BD2K Training Consortium into a synergistic training effort. The TCC will create an inclusive and collaborative virtual environment - entitled "Big Data U" - serving trainees from a wide spectrum of educational backgrounds and scientific domains. Big Data U will make personalized educational resources easily accessible and facilitate novel research collaborations through scientific rotations. We will harvest the web to automatically identify, model, and incorporate online resources into an Educational Resource Discovery Index (ERuDIte) and a Big Data U Knowledge Map. This unique system will alleviate the burden of sifting through hundreds of educational resources and searching across multiple research and training program websites, allowing users to easily determine which resources are didactically significant and correspond to the appropriate scientific domain of interest, level of education, and learning objective. Over the long term, our efforts will cultivate a diverse network of data scientists who can propagate their knowledge and experience for generations to come. Our PI and team have a demonstrated commitment to training in biomedical data science.
The University of Southern California is ideally suited to host this NIH BD2K effort, having a strong history of data science training and having recently founded two new master's programs of relevance to Big Data biomedicine. The TCC is the logical extension of our outstanding track record in data science, and we will leverage our comprehensive experience and infrastructure in developing the TCC.

Public Health Relevance

A novel Training Coordinating Center (TCC) will be developed to coordinate the diverse activities occurring within the BD2K Training Consortium into a synergistic training effort. The TCC will create an inclusive and collaborative virtual environment - entitled Big Data U - serving trainees from a wide spectrum of educational backgrounds and scientific domains, and will facilitate novel research collaborations through scientific rotations.

Agency: National Institutes of Health (NIH)
Institute: National Institute of Environmental Health Sciences (NIEHS)
Type: Resource-Related Research Projects--Cooperative Agreements (U24)
Project #: 1U24ES026465-01
Application #: 9044624
Study Section: Special Emphasis Panel (ZRG1-VH-J (90))
Program Officer: Shreffler, Carol K
Project Start: 2015-09-30
Project End: 2018-07-31
Budget Start: 2015-09-30
Budget End: 2016-07-31
Support Year: 1
Fiscal Year: 2015
Total Cost: $2,125,666
Indirect Cost: $725,666

Name: University of Southern California
Department: Neurology
Type: Schools of Medicine
DUNS #: 072933393
City: Los Angeles
State: CA
Country: United States
Zip Code: 90089

Garmire, Lana X; Gliske, Stephen; Nguyen, Quynh C et al. (2016) The training of next generation data scientists in biomedicine. Pac Symp Biocomput 22:640-645
Van Horn, John Darrell (2016) Opinion: Big data biomedicine offers big higher education opportunities. Proc Natl Acad Sci U S A 113:6322-4
Eickhoff, Simon; Nichols, Thomas E; Van Horn, John D et al. (2016) Sharing the wealth: Neuroimaging data repositories. Neuroimage 124:1065-8