This project, "Accelerating the Build-out of a Dedicated Network for Education and Research in Big Data Science and Engineering", seeks to significantly expand and upgrade an existing network at Penn State. The existing network links five separate data centers over 10 gigabit per second (Gbps) connections and has enabled virtual co-location of computational and data resources in a way that is transparent to the research community. However, these few private network links cannot be used directly by the academic community to move data between their laboratories or local facilities and the distributed data centers. The Penn State research community currently relies on the integrated enterprise data backbone to connect to resources at campus data centers or at national centers. While some of these connections reach a peak rate of 1 Gbps, most operate at peak rates of less than 100 Mbps, which significantly limits the ability of researchers to move large datasets to locations within or outside the campus. The significant expansion of the network being undertaken in this project will make several hundred 10 Gbps ports available in laboratories and offices across two major campuses of Penn State. This expansion is accomplished by deploying 48-port 10 Gbps Ethernet switches in the 12 separate buildings with the highest concentration of faculty who rely on large computational and data resources for advanced teaching and research. Each of these 12 switches is slated to have two 10 Gbps uplinks, one to each core router for redundancy. Both core routers will have 10 Gbps and 100 Gbps ports for external connectivity. The overall goal of the project is to provide at least a ten-fold increase in end-to-end network connectivity in labs and offices, making sustained 10 Gbps connections ubiquitous between faculty laboratories, data-generating instruments, select classrooms, and local and national computational and data resources.
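As a rough illustration of the figures quoted above, the following sketch works through the port count, the edge-to-uplink oversubscription ratio, and idealized transfer times. The switch, port, and uplink numbers come from the text; the 1 TB dataset size is a hypothetical example, and the ratio assumes both uplinks carry traffic at the same time (if one is held purely as a standby, the ratio doubles). Protocol overhead and end-host limits are ignored, so real transfers would be slower.

```python
# Figures taken from the project description.
BUILDINGS = 12            # one switch per building
PORTS_PER_SWITCH = 48     # 10 Gbps access ports per switch
PORT_GBPS = 10
UPLINKS_PER_SWITCH = 2    # one uplink to each core router
UPLINK_GBPS = 10


def total_access_ports(buildings=BUILDINGS, ports=PORTS_PER_SWITCH):
    """Number of 10 Gbps access ports across all buildings."""
    return buildings * ports


def oversubscription_ratio(ports=PORTS_PER_SWITCH, port_gbps=PORT_GBPS,
                           uplinks=UPLINKS_PER_SWITCH, uplink_gbps=UPLINK_GBPS):
    """Access capacity divided by uplink capacity for one switch.

    Assumption: both uplinks are active simultaneously.
    """
    return (ports * port_gbps) / (uplinks * uplink_gbps)


def transfer_hours(dataset_terabytes, link_gbps):
    """Ideal time in hours to move a dataset, ignoring all overhead."""
    bits = dataset_terabytes * 1e12 * 8
    return bits / (link_gbps * 1e9) / 3600


print(total_access_ports())        # 576 ports ("several hundred")
print(oversubscription_ratio())    # 24.0, i.e. 24:1 per switch

# Hypothetical 1 TB dataset at the link rates mentioned in the text.
for label, gbps in [("100 Mbps", 0.1), ("1 Gbps", 1), ("10 Gbps", 10)]:
    print(f"{label}: {transfer_hours(1, gbps):.2f} h")
```

Under these assumptions, a 1 TB transfer drops from roughly 22 hours at 100 Mbps to about 13 minutes at 10 Gbps, which is the kind of difference the ten-fold (or greater) connectivity goal is aimed at.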
The network expansion seeks to accomplish three key goals: (i) make it possible to use Big Data in advanced undergraduate and graduate classes, (ii) support research projects funded by NSF and other federal agencies across the university, and (iii) eliminate data-related barriers to independent curiosity-driven research. The network expansion will significantly accelerate discovery processes for a research community spread across a large number of academic departments and multidisciplinary areas.
The network expansion project will impact the education and research training of graduate and undergraduate students. It will have a discovery-accelerating impact on a large number of federally funded projects at Penn State. It will not only support rapidly growing needs in existing research projects but will also likely spawn entirely new research projects involving Big Data and large-scale computations in several multidisciplinary areas. It will lead to a broadening of participation by women and minorities in research projects in STEM disciplines and the social sciences at Penn State and beyond. It will help extend the national CI and advance NSF-wide CIF21 goals, since the proposed network will be very tightly integrated with national cyberinfrastructure, making large-scale collaborative research and sharing of computational and data resources far more frequent.
There will be a focused effort, through publications and several meeting presentations targeted at the national academic and research communities, to disseminate the results obtained and the implementation lessons learned under this project. Utilization and performance data will be made available to assist with similar research cyberinfrastructure projects at peer institutions. The design and implementation of the network will be documented and made available as a publicly accessible report.
The NSF-funded Penn State project, "Accelerating the Build-out of a Dedicated Network for Education and Research in Big Data Science and Engineering" [award: 1245980], created a high-speed Research Network connecting various research labs to central computation facilities and to analysis and visualization resources. The Research Network also served to inform the next generation of the Penn State University Network. The Penn State Research Network will allow Penn State researchers from many disciplines to better utilize, analyze, and visualize "big data" and to facilitate interdisciplinary as well as inter-institutional collaborations. The design goal was to increase the volume of data that can be analyzed, the variety of data that can be analyzed, and the speed at which data can be moved from where it is generated or captured to where it will ultimately be analyzed. One immediate impact of the Penn State Research Network is the ability to acquire large volumes of atmospheric and oceanic data from a variety of sources for the purpose of numerical weather prediction. Faster access to more data and a greater variety of data allows near-real-time inputs to weather forecasting models, particularly those for severe weather events and real-time storm tracking. In addition, the Penn State Research Network will facilitate the movement of gene sequences from next-generation DNA sequencing instruments located in laboratory facilities on the University Park campus to central computation and storage facilities, at speeds and data sizes that match current sequencer capabilities. In this way, more sequences can be analyzed and retained to improve discovery in genetic research. The cybersecurity of the Research Network was paramount as the network was designed and built. In order to increase performance on the Research Network, traditional network filters such as firewalls were eliminated.
To compensate for the removal of these filters, a great deal of attention was placed on monitoring the network for cyberattacks and taking real-time action to stop those attacks or monitor them more closely. Other NSF-funded software (e.g., Bro) was deployed on the Research Network to monitor, secure, and record network events.