Collecting data is easier than ever before, but extracting actionable information from it is more challenging than ever. Society needs a workforce that understands data science and can apply it creatively to solve problems. In this project, a novel, introductory data science course and undergraduate summer research program will be designed to motivate students early in their academic studies to pursue careers in data science by connecting them with authentic projects and stakeholders. The data-rich industry of water and wastewater treatment (W/WWT) will provide an extensive portfolio of tractable projects. Although data are abundant and easy to collect in the W/WWT industry, methods of monitoring, maintenance, and sensor calibration lag behind the state of the art. This funded work will create opportunities to critically assess the suitability of current methods and produce creative alternatives. Recruiting from diverse and underrepresented populations of students, this project will produce the next generation of data scientists who are ready to fill "mid-level" data science positions. At the same time, this project will help W/WWT facility operators reduce costs and improve water quality by utilizing the information in their data.
Statisticians, computer scientists, and environmental engineers at Baylor University and Colorado School of Mines (Mines) will collaborate to (i) develop a three-credit, prerequisite-free sophomore-level course; (ii) organize a five-week data science summer program; and (iii) cultivate relationships with W/WWT stakeholders and community colleges (CCs). The course will introduce data science through inquiry-driven modules to attract students who may not have previously considered a data science career. It will be offered in parallel at both universities each year and will weave throughout not only W/WWT facility problems but also problems of water scarcity linked to climate change, agricultural demands, and urbanization. The summer program will be developed with a singular focus on solving W/WWT problems with data. A diverse cohort will be recruited from Baylor, Mines, and CC partners. A one-week pre-program coding boot camp will be offered to bolster skills. PIs will curate and oversee team projects designed to develop data acumen, teamwork, and communication. Established relationships with urban and rural W/WWT utilities; manufacturers of W/WWT systems for decentralized use; W/WWT operators; and academic partners will provide data and problem context. All project data will be well documented and made freely available. The student and program-level outcomes will be formally assessed, with results disseminated through publication in peer-reviewed journals and conference presentations. Methods for optimal operation and monitoring developed by student teams will be shared with our W/WWT stakeholders and rural industry service organizations through instructional videos and technical reports.
NSF's Harnessing the Data Revolution Data Science Corps program focuses on building capacity for harnessing the data revolution at the local, state, national, and international levels to help unleash the power of data in the service of science and society. Projects in this program are being jointly funded by the NSF's Harnessing the Data Revolution Big Idea; the Directorate for Computer and Information Science and Engineering, Division of Information and Intelligent Systems; the Directorate for Education and Human Resources, Division of Undergraduate Education; the Directorate for Mathematical and Physical Sciences, Division of Mathematical Sciences; and the Directorate for Social, Behavioral and Economic Sciences, Office of Multidisciplinary Activities and Division of Behavioral and Cognitive Sciences.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.