This project aims to improve the big-data preparedness of diverse STEM audiences by defining a comprehensive framework called TIDE (Timely Introduction of Emerging Data-intensive computing). The project creates a certificate program and provides methods to institutionalize the framework. It is motivated and guided by collective experience in grid computing and in solving domain-specific problems in the life sciences and environmental engineering.
Data-intensive computing has received much attention as a way to address the data deluge brought about by advances in distributed systems and Internet-based computing. An innovative programming model called MapReduce, together with a petascale distributed file system that supports it, has fundamentally changed approaches to large-scale data storage and processing. However, no systematic approach exists for teaching big-data concepts to STEM undergraduates. TIDE addresses this gap through the following objectives: (a) define a set of core competencies required for research (to advance the field) and for practical application design (to build systems) in data-intensive computing; (b) define a certificate program of five courses that address these competencies: data structures and algorithms, distributed systems, data-intensive computing, a domain-specific course, and a capstone project solving a data-intensive problem, all developed from existing courses as part of the TIDE project; (c) define and develop the curriculum for the courses, including teaching materials such as laboratory exercises and case studies for practical experimentation; (d) broaden participation in the certificate program; (e) assess the progress and effectiveness of the program; and (f) provide strategies for educators to adopt the TIDE framework effectively.