Big data offer an opportunity to study specific control populations (age / sex / environmental factors / demographics / genetics) and identify substantive homogeneous sub-cohorts so that one may understand the roles that potential factors play in brain development, differentiating abnormal trajectories from normal development. The image processing, statistical, and informatics tools to effectively and efficiently use big data imaging archives for quantitative population-level research and personalized medicine do not yet exist. This research will enable discovery science on a scale considerably larger than routinely possible with traditional study designs by creating novel informatics resources that tie archives of 3-D images into accessible research databases. This research will discover genetic and environmental factors that influence an individual's brain development and characterize the developing human brain through personal developmental trajectories. To accomplish this goal, new informatics technologies will be created to enable (1) image processing and segmentation based on image content in the context of heterogeneous, low-quality, and error-prone data with minimal human oversight and (2) routine archival, query, and image processing of large medical imaging datasets. This research will impact the areas of (1) informatics via novel computation models, (2) neuroscience via a new structural model of brain development, and (3) public health via newly accessible data sets for research. The science and technology innovations enabled by using big data to understand personalized brain development will be communicated through a tiered approach. Outreach to the K-12 audience will target conceptualizing design criteria, inspiring students with interactive demonstrations, and providing capabilities for students to apply key concepts in hands-on engineering projects.
For advanced students and researchers, new accessible course materials and online modules will be developed so that others may build upon the foundations established by this research.

Novel software, data wrangling tools, and resources will be created through two research thrusts organized around a novel test bed infrastructure and synthesized in a third education/outreach thrust. Thrust 1 (Personal Brain Trajectories) will focus on extracting meaningful information from medical images at scale through (1) creating automated methods robust to variations in image quality, acquisition, and transfer errors, and (2) enabling efficient human-in-the-loop control at scale. The research will extend novel statistical models for image content labeling while adapting quality control techniques from industrial engineering. Thrust 2 (Novel Storage & Processing) will create novel medical imaging data models to describe data acquisition / retrieval, storage, cleaning, access / security, query, and processing by integrating medical imaging standards with big data architecture derived from social network and e-commerce communities. This infrastructure will provide practical access to petabyte imaging archives, integrate with existing data workflows, and effectively function with commodity hardware. The PI will develop and release a reference test bed to evaluate new technologies in the context of computer-aided detection (CADe) of brain abnormalities while considering age, sex, and demographics. Using the test bed, researchers and students will be able to efficiently evaluate existing and emerging image processing software to screen for potential prognostic markers. In Thrust 3 (Education and Outreach), the research results will be integrated into two classes targeting undergraduate students, and interactive online modules will be created and released through an established graduate student/faculty training program. Each summer, an undergraduate and a high school student will participate in research by implementing and extending research contributions within an interactive demonstration platform.
In the second through fifth summers, a high school teacher will assist in the development of curricula targeting high school students using the demonstration platform. High school students and teachers will be recruited from Nashville Metro schools with high underrepresented minority / reduced cost lunch populations. These efforts will create an open-source, open-hardware system for public demonstration and K-12 classroom exercises.

Agency
National Science Foundation (NSF)
Institute
Division of Information and Intelligent Systems (IIS)
Application #
1452485
Program Officer
Sylvia Spengler
Project Start
Project End
Budget Start
2015-02-01
Budget End
2021-08-31
Support Year
Fiscal Year
2014
Total Cost
$465,951
Indirect Cost
Name
Vanderbilt University Medical Center
Department
Type
DUNS #
City
Nashville
State
TN
Country
United States
Zip Code
37235