In this dawning era of `Big Data' it is vital to recruit and train the next generation of biomedical data scientists. The collection of `Big Data' in the biomedical sciences is growing rapidly and has the potential to address many of today's pressing medical needs, including personalized medicine, eradication of disease, and curing cancer. Realizing the benefits of `Big Data' will require a new generation of leaders in (bio)statistical and computational methods who can develop the approaches and tools necessary to unlock the information contained in large heterogeneous datasets. There is a great need for scientists trained in this specialized, highly heterogeneous, and interdisciplinary new field. Thus, the recruitment of talented undergraduates in science, technology, engineering and mathematics (STEM) programs is vital to our ability to tap into the potential that `Big Data' offers and to meet the challenges that it presents. The University of Michigan Undergraduate Summer Institute: Transforming Analytical Learning in the Era of Big Data will draw on the expertise and experience of faculty from four departments in four schools at the University of Michigan: Biostatistics in the School of Public Health, Computer Science in the College of Engineering, Statistics in the College of Literature, Science, and the Arts, and Information Science in the School of Information. The faculty instructors and mentors have backgrounds in Statistics, Computer Science, Information Science and Biological Sciences. They have active research programs in a broad spectrum of methodological areas, including data mining, natural language processing, statistical and machine learning, large-scale optimization, matrix computation, medical computing, health informatics, high-dimensional statistics, distributed computing, missing data, causal inference, data management and integration, signal processing and imaging.
The diseases and conditions they study include obesity, cancer, diabetes, cardiovascular disease, neurological disease, kidney disease, injury, macular degeneration and Alzheimer's disease. The areas of biology include neuroscience, genetics, genomics, metabolomics, epigenetics and socio-behavioral science. Selected undergraduate trainees will have strong quantitative skills and a background in STEM. The summer institute will combine coursework, designed to raise the participants' skills and interest to a level sufficient for them to consider pursuing graduate studies in `Big Data' science, with an in-depth mentoring component that will allow the participants to research a specific topic or project utilizing `Big Data'. Our pilot offering in 2015 met with tremendous enthusiasm, drawing 153 applications for 20 positions and a yield rate of 80% on the offers we extended. We plan to build on the success of this initial offering in the next three-year funding cycle of this grant (2016-2018). The overarching goal of our summer institute in big data is to recruit and train the next generation of big data scientists using a non-traditional, action-based learning paradigm. This six-week summer institute will recruit a group of approximately 30 undergraduates nationally and expose them to diverse techniques, skills and problems in the field of Big Data. They will be taught and mentored by a team of interdisciplinary faculty, reflecting the shared intellectual landscape needed for Big Data research. The program will conclude with a capstone symposium showcasing the students' research through poster and oral presentations. There will also be lectures by UM researchers and outside guests, and a professional development workshop to prepare the students for graduate school.
The resources developed for the summer institute, including lectures, assignments, projects, template code and datasets, will be freely available through a wiki page so that this format can be replicated anywhere in the world. This democratic dissemination plan will give undergraduate students across the world access to teaching and training material in this new field.
We propose a six-week summer institute, 'Transforming Analytical Learning in the Era of Big Data', to be held at the Department of Biostatistics, University of Michigan, Ann Arbor, with a group of approximately 30 undergraduates recruited nationally each year from 2016 to 2018. We plan to expose them to diverse techniques, skills and problems in the field of Big Data. They will be taught and mentored by a team of interdisciplinary faculty from Biostatistics, Statistics, Computer Science and Engineering, reflecting the shared intellectual landscape needed for Big Data research. The program will conclude with a capstone symposium showcasing the students' research through poster and oral presentations. There will also be lectures by UM researchers and outside guests, and a professional development workshop to prepare the students for graduate school. The resources developed for the summer institute, including lectures, assignments, projects, template code and datasets, will be freely available through a wiki page so that this format can be replicated anywhere in the world. This democratic dissemination plan will give people across the world access to teaching and training material in this new field. The overarching goal of our summer institute in big data is to recruit and train the next generation of big data scientists using a non-traditional, action-based learning paradigm.