The advent of Big Data has created the opportunity to monitor, store, and manipulate abundant data with the potential to address various human health issues; however, Big Data also presents utilization challenges and requires a trained, diverse workforce able to explore, analyze, and synthesize Big Data types ranging from imaging to genomics. California State University, Fullerton (CSUF), a Hispanic- and Asian and Pacific Islander-serving institution, in partnership with the Big Data for Discovery Science (BDDS) NIH Centers of Excellence at the University of Southern California (USC), proposes the Big Data Discovery and Diversity through Research Education Advancement and Partnerships (BD3-REAP) program. The BD3-REAP program's primary objective is to train underrepresented students in Big Data science (BDs) exploration, computation, and synthesis by combining BDs didactic learning with comprehensive undergraduate BDs research experiences that investigate neuroimaging, genomics, proteomics, and epidemiologic data types in relation to brain health. The BD3-REAP program has two specific aims. The first is to train and engage, through CSUF/USC faculty-mentored yet student-owned research experiences, three cohorts of six predominantly underrepresented students (total n=18) in an in-depth, 2-year, research-intensive BDs program on neuroimaging, proteomics, genomics, and epigenetics data types, and comparative open-source databases, in relation to brain health.
Specifically, students will: (a) engage in small-group research and scientific discovery at CSUF (academic year) and USC (summer experience); (b) increase BDs computational and analytic skills and self-efficacy by participating in curricular/didactic training integrated with in-depth research experience; (c) prepare and present research posters; (d) improve scientific writing and oral communication skills; and (e) participate in BD3-REAP program advisement to ensure program/major graduation and to increase knowledge of, and linkages to, BDs career pathways and graduate school entry.
The second aim, on curricula development, is to develop a novel BDs educational framework across two colleges (Natural Science and Mathematics; Health and Human Development) and three departments, specifically by developing and integrating new curricula that broaden students' understanding of BDs scope, utilization, and application to health in four existing courses and a new course on Big Data, reaching nearly 320 students yearly and approximately 1300 diverse students over the entire program. We have the epidemiologic/behavioral, computational, statistical, and neuroscience expertise to develop the BD3-REAP research education program, which strengthens BDs student research capacities and participation/belonging in the broader scientific community. Further, the BD3-REAP program aims to establish a pipeline of underrepresented BDs students into graduate studies and career placement who may serve as BDs role models in the scientific community and for subsequent underrepresented students.

Public Health Relevance

Scientific discovery through exploring and understanding vast amounts of data, including neuroimaging, proteomics, genomics, and epidemiologic data types, is key to improving human health; however, there is a gap in trained underrepresented scientists capable of appropriately utilizing, computing, accessing, and characterizing these Big Data and of communicating the findings to the communities these scientists represent. This gap underscores the need to train a diverse undergraduate population in Big Data science (BDs). Therefore, this project will develop a faculty-mentored yet student-owned intensive BDs research experience in neuroimaging, genomics, and epidemiologic data types for underrepresented students. By establishing a talented, committed, and sustained pipeline of underrepresented students in BDs higher education and careers, this project will help develop diverse scientists who reflect community demographics and are capable of managing, analyzing, and intelligently organizing and conveying biomedical/health information to scientific and local communities.

National Institutes of Health (NIH)
National Institute on Minority Health and Health Disparities (NIMHD)
Education Projects (R25)
Study Section
Special Emphasis Panel (ZRG1-HDM-K (51))
Program Officer
Zhang, Xinzhi
California State University Fullerton
Other Health Professions
Schools of Allied Health Profes
United States
Canner, Judith E; McEligot, Archana J; Pérez, María-Eglée et al. (2017) Enhancing Diversity in Biomedical Data Science. Ethn Dis 27:107-116
Cuajungco, Math P; Kiselyov, Kirill (2017) The mucolipin-1 (TRPML1) ion channel, transmembrane-163 (TMEM163) protein, and lysosomal zinc handling. Front Biosci (Landmark Ed) 22:1330-1343
Zhou, Bo; Moorman, David E; Behseta, Sam et al. (2016) A Dynamic Bayesian Model for Characterizing Cross-Neuronal Interactions During Decision-Making. J Am Stat Assoc 111:459-471
McEligot, Archana Jaiswal; Behseta, Sam; Cuajungco, Math P et al. (2015) Wrangling Big Data Through Diversity, Research Education and Partnerships. Calif J Health Promot 13:vi-ix