The Computational Social Science Training Program (CSSTP) at UC Berkeley provides training in advanced analytics to predoctoral students in the social and behavioral sciences studying health topics covered by the Eunice Kennedy Shriver National Institute of Child Health and Human Development. CSSTP is a new program that combines Berkeley's long-standing strength in quantitative social and behavioral science with its nationally recognized campus programs in data science education, practice, and research. It will serve five entering trainees per year over five years. The training faculty includes 22 social scientists who have exemplary records of developing and applying novel statistical methods to health-related social/behavioral science problems, as well as 13 data scientists who are leading figures in the foundations of mathematics, statistics/biostatistics, and computer science. Trainees, who will be drawn from a diverse pool of students in six social science doctoral programs, are provided with a rigorous and tailored program designed to teach a team science-based approach to problem solving and to emphasize the analysis of intensive or voluminous longitudinal data and high-density, large-sample, or population-level agency databases. Each trainee is supported by a dual-preceptor model that pairs the trainee with a social sciences faculty mentor and a data science mentor, who help to facilitate the trainee's progress through the program. CSSTP trainees are provided with community space at the Berkeley Institute for Data Science (BIDS), a dynamic multi-disciplinary data science research center, where trainees work alongside other data science fellows in residence. After completing their first-year course requirements in their home departments, trainees formally enter the program in their second year of graduate school, devise an individual development plan, and take a core two-semester course in computational social science, team-taught by training faculty.
This course introduces students to essential data science methods and tools, including Python programming, data management, natural language processing, machine learning, causal inference, and responsible conduct and reproducibility of research, through lectures, in-depth discussion of social science applications, and small group learning exercises. In the following year, students apply these skills through placements on collaborative health-related research teams or labs on campus and/or with external industry partners, thus developing skills in advanced analytics through research practice involving the development and implementation of new methods. Additional training tailored to student needs and interests is provided through elective courses, a weekly computational social science workshop series, and ongoing working groups at the Berkeley Institute for Data Science and the Social Science D-Lab, a campus hub for data science training and research for social scientists. CSSTP's benefits will ripple out to the greater campus and beyond by stimulating new faculty collaborations and by creating a critical mass of rigorously trained computational social science students who will be competitive and qualified for jobs in rapidly evolving, data-intensive fields.
Advanced data analytics is transforming social and behavioral science research on medicine and health. This training program in Computational Social Science will equip a diverse cadre of social and behavioral science health investigators to conduct novel research using advanced computational and statistical methods, preparing them for transdisciplinary careers in data analytics. All aspects of the training will emphasize rigor in research and reproducibility of research results.