Full Center Proposal for an I/UCRC for Intelligent Storage 0934396 University of Minnesota; David Lilja 0934401 University of California-Santa Cruz; Ethan Miller
The purpose of this proposal is to start a new I/UCRC, "Intelligent Storage," to conduct research on new storage architectures and storage system designs, new data models, and new ways to access and deliver data. The lead of the proposed Center will be the University of Minnesota, with a site at the University of California-Santa Cruz.
The goals of the proposed Center are to develop innovative storage systems and new storage architectures, solve long-term data preservation issues, develop efficient benchmarking, tracing, performance management and tuning tools for I/O and input systems, propose solutions that ensure data/information privacy and security, and explore ways to save energy in data centers. The proposed Center will build on the respective universities' research talent and technology transfer skills to attract industrial partners who will subsequently play a significant role in planning, selecting, and implementing the output of the research.
The broader impact of the potential research outcomes includes fostering the advancement of science and technology, making society more efficient and secure, providing better health-care delivery, and better ways of preserving information. Industry participation will enhance the students' educational experience by providing a pipeline for talented engineers and scientists to industry. The proposed Center is committed to enhancing the education process by bringing input from industry, developing new courses at both undergraduate and graduate levels, and emphasizing the diversity of the student population. The Center also has plans to recruit more female and under-represented minority students and faculty into its research group. Research results will be disseminated to the general public via journal publications and conference presentations.
The University of California, Santa Cruz site of the NSF I/UCRC Center for Research in Intelligent Storage (CRIS) ended on June 30, 2013, at which time it became the independent I/UCRC Center for Research in Storage Systems (CRSS). During the four years it was part of CRIS, UCSC worked closely with over ten storage-related companies on advanced research into archival storage systems, deduplication, non-volatile memory systems, next-generation disk technologies, scalable storage, and search and indexing. During this time, the UCSC site produced 4 PhDs and 6 MS students, most of whom have gone on to work at member companies; several more students supported by CRIS are expected to complete their MS and PhD degrees in the next year. The UCSC CRIS site also published over 25 papers during this time, including over 10 that were collaborations with CRIS industrial sponsors. This combination of graduates taking positions in industry and ongoing research collaboration led to strong technology transfer from CRIS to sponsor companies. CRIS industrial sponsors continue to incorporate our research advances into their products. Our SmartSSD work is being used by Samsung to develop more efficient SSDs by providing a mechanism to leverage the computing power in individual SSDs. This technology will further increase performance of systems built from flash drives, accelerating performance of next-generation storage devices. Our research into data organization for shingled disk is being used by Seagate to improve performance of hard drives that use new techniques for storing data more densely. This research will allow Seagate to provide higher capacity hard drives without increasing cost or decreasing performance. Hewlett Packard is integrating our deduplication research into their technology portfolio, allowing them to reduce the amount of storage necessary to store large data sets, thus decreasing cost. 
NetApp and LSI have based advances in distributing data in large disk arrays on algorithms developed by CRIS at UCSC. These techniques allow distribution of replicated data across hundreds of disks while ensuring that resiliency goals are met (e.g., three copies of each disk block, but no two copies may be stored on disks in the same shelf), even as disks are added to or removed from the system. Our archival storage research is also serving as a launching point for programs at several CRIS sponsors, including Seagate. Adoption of technology for archival storage is typically slower, given the long time frame across which data is stored, and we expect increasing use of our research results in the next few years. Our investigation of archival storage access patterns has, for the first time, shown how archival storage systems evolve over decades; this knowledge will be invaluable for companies building long-term data storage systems for the next few decades. Beyond specific research results, the UCSC CRIS site has served as a model for university-industry interactions for other UCSC groups. Given UCSC's proximity to Silicon Valley, this model is seeing increasing use, further deepening ties between diverse university research groups and companies based in the world's premier center for technology development in information technology and other fields.