Problems in data generation, acquisition, management, analysis, visualization, and interpretation, which have always been important players in biomedical science, now assume leading roles in the massive effort to understand health and disease. The unprecedented size, complexity, and heterogeneity of big biomedical data demand research that will allow us to extract knowledge from data more efficiently in order to make better predictions, to characterize biological systems, and generally to enable subsequent investigation. The research activity directed towards these problems is an amalgamation of multiple disciplines: computer sciences, statistics/biostatistics, and specific biomedical science areas all offer critical insights into biomedical data science, or what we call bio-data science in this proposal. This science affects basic biological investigations as well as translational studies and clinical research. Taking advantage of standing PhD programs, the collaborative research infrastructure, and various initiatives in data science and big data at UW, we propose cross training in bio-data science for pre-doctoral students. Trainees will come from one of three focus areas, will complete course work in all three areas, and will be trained in interdisciplinary research, computing infrastructure, and the responsible conduct of research on their way to discovering new knowledge in their PhD thesis work.

Public Health Relevance

Modern biological, medical, and health studies often involve large heterogeneous data sets from which useful, accurate information cannot be efficiently extracted with available methods. Research to improve the analysis of biomedical big data is active at the interface of computer sciences, statistics, and various biomedical domains, such as genomics and brain science. We propose to train researchers for this interface in order to advance developments in areas of biomedicine that rely on big data.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Institutional National Research Service Award (T32)
Study Section
Special Emphasis Panel (ZRG1-IMST-T (50)R)
Program Officer
Ye, Jane
University of Wisconsin-Madison
Biostatistics & Other Math Sci
Schools of Arts and Sciences
United States
Bartels, Christie M; Ramly, Edmond; Johnson, Heather M et al. (2018) Connecting Rheumatology Patients to Primary Care for High Blood Pressure: Specialty clinic protocol improves follow-up and population blood pressures. Arthritis Care Res (Hoboken) :
Bacher, Rhonda; Chu, Li-Fang; Leng, Ning et al. (2017) SCnorm: robust normalization of single-cell RNA-seq data. Nat Methods 14:584-586
Gasch, Audrey P; Yu, Feiqiao Brian; Hose, James et al. (2017) Single-cell RNA sequencing reveals intrinsic and extrinsic regulatory heterogeneity in yeast responding to stress. PLoS Biol 15:e2004050
Ye, Shuyun; Bacher, Rhonda; Keller, Mark P et al. (2017) Statistical Methods for Latent Class Quantitative Trait Loci Mapping. Genetics 206:1309-1317
Vreede, Andrew P; Johnson, Heather M; Piper, Megan et al. (2017) Rheumatologists Modestly More Likely to Counsel Smokers in Visits Without Rheumatoid Arthritis Control: An Observational Study. J Clin Rheumatol 23:273-277
Bacher, Rhonda; Kendziorski, Christina (2016) Design and computational analysis of single-cell RNA-sequencing experiments. Genome Biol 17:63