Mounting amounts of diverse biomedical data have been generated. Extracting meaningful information from these datasets has relied on the efforts of informaticians, who are extensively trained in computer science but have little to no training in biology. Similarly, biologists are generally not proficient in analyzing, annotating, and translating their large datasets into valuable biomedical insights. In addition, there has been an overall lack of public understanding of the importance of Big Data science, which hinders enthusiasm for advancing data science in the biomedical field. To bridge the gaps that exist among data generation, interpretation, and awareness, our training program will provide critical data science education to current biomedical researchers, expand the data science workforce in the biomedical field, and elicit broad public recognition of data science. Accordingly, we have engineered an integrated training program with four specific aims: 1) To empower current biomedical researchers to manage and interpret Big Data by gaining proficiency in data science software tools; 2) To utilize the training component as an interactive testing field for software packages developed by the Data Science Research (DSR) component, where user critiques and feedback will refine the software tools to a professional grade and help the community capture the full value of Big Data; 3) To cultivate a new generation of developers with transdisciplinary expertise in both computational biology and biomedical informatics; and 4) To heighten public awareness of and enthusiasm for the substantial opportunities embedded within computational biology, which has the potential to transform biomedical research and medicine. To achieve these aims, we have constructed three trainee-oriented modules: the Biomedical Researcher/User-Oriented Module, the Big Data Science Researcher-Oriented Module, and the General Public-Oriented Module.
A trans-institutional collaboration has been organized among UCLA, TSRI, UMMC, and EMBL-EBI, all of which have demonstrated outstanding track records in education. This collaboration will ensure successful execution of the training component, supported by distinguished experts and meritorious educators from a wide breadth of disciplines spanning -omics, bioinformatics, and computational science.

Public Health Relevance

The challenges of biomedical Big Data are multifaceted. Advances in biomedical sciences using Big Data will require an adequate workforce with the appropriate data science expertise and skills, including those in computational biology, biomedical informatics, and related areas. Users of Big Data software tools and resources must be trained to use them well. This Training Component is designed to address these challenges.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Specialized Center--Cooperative Agreements (U54)
Study Section: Special Emphasis Panel (ZRG1-BST-R (52))
Program Officer: Lyster, Peter
University of California Los Angeles
Los Angeles
United States