Instruments and observations are generating data at increasing rates, which creates growing challenges in data management and computation. Addressing these challenges is critical because these large datasets enable major breakthroughs across many scientific disciplines. Instruments and observations can generate terabytes of data per day, which often must be transferred from remote locations, field stations, or core facilities to the user's home system for storage and analysis. To address these challenges, the University of Chicago (UChicago) will acquire and operate a Data Lifecycle Instrument (DaLI) to manage and share data from instruments and observations at UChicago and the Marine Biological Laboratory (MBL). DaLI will simplify data management for researchers, allowing them to acquire, transfer, process, and share data from instruments and observations in a single workflow and to make their data available to a larger community of users. UChicago, MBL, and collaborating Minority Serving Institutions will use DaLI as a training instrument to prepare students to meet the data challenges of the 21st century, and DaLI will support several outreach programs.
The Data Lifecycle Instrument (DaLI) for management and sharing of data from instruments and observations will enable researchers to (a) acquire, transfer, process, and store data from experiments and observations in a unified workflow, (b) manage data collections over their entire lifecycle, and (c) share and publish data. DaLI will also (d) enhance outreach and education opportunities and (e) provide a replicable model for other institutions. DaLI will create a scalable, seamless, and replicable infrastructure for data management and sharing that enables new transformative science and helps researchers implement best practices in data management. The DaLI platform will consist of four pools of resources: a high-performance compute resource for pre- and post-processing of data, a high-performance storage pool, a low-cost storage pool, and a tape backup pool. In addition to hardware, DaLI is designed to include software tools that provide intuitive interfaces for data lifecycle management and integrate with campus and national cyberinfrastructure.