Retaining collected data and sharing it with the broader research community is necessary for developing a better understanding of the impact of urbanization on the global ecology and environment. The data-intensive instrument to be installed at CUNY College of Staten Island will provide data analysis and data asset management capabilities that allow researchers to share, federate, and retain data, and will provide for reproducibility and traceability. Researchers will be able to organize, analyze, manage, and annotate metadata, and to search and share their data sets. The proposed interface will use a service-oriented architecture based on an open standards framework to provide a library of core services for managing data and metadata. The instrument includes a hierarchical storage system with at least a 1.4 petabyte disk farm and at least a 1.2 petabyte robotic tape system.
The instrument will enable the expanded use of computations for a large number of researchers in the biological, ecological, environmental, and economic disciplines. In environmental studies, it will facilitate the creation of a "virtual" urban-oceanographic-atmospheric-land observatory allowing four significant Research Centers that are studying various aspects of the effects of urbanization to now retain and share their data.
The instrument will help to accelerate the research required to understand and forecast the impact of megacity and urban planning decisions on the local and global environment, on ecology and ecosystem services, on energy systems, and on economic systems. It will have similar benefits for researchers in other disciplines. The Research Centers are committed to using the instrument to share their data and the results of their research with each other and the broader research community. Each of the participating Research Centers is multi-institutional. Many of them are led by, and most include, Minority Serving Institutions. Consistent with NSF policy and CUNY traditions, each of these Research Centers has extensive internship and training programs that encourage women and minority student participation.
Fall 2009 enrollment statistics for the 23 institutions that comprise CUNY show that 53% of its 259,515 full-time students were Black or Hispanic and 60% were female. Fall 2009 statistics showed 27,962 students enrolled in Science, Technology, Engineering, and Mathematics disciplines, including 65 American Indian/Native American, 6,195 Asian/Pacific Islander, 7,878 Black, 6,473 Hispanic, and 7,351 White students; 34% of the students were female. These statistics and CUNY's College Now, Teacher Academy, and Discovery Institute outreach programs attest to CUNY's commitment to broadening participation and to the outreach initiatives of the NSF.
The City University of New York (CUNY) High-performance Computing (HPC) Center implemented a Digital Data Storage and Management System (DSMS) to support research in environmental science, the biological sciences, financial engineering, and economic science. The goal of the project was to create a DSMS that enables researchers to use the CUNY HPC Center environment more effectively and to satisfy NSF and other agency requirements for data management. The objectives were to implement a DSMS that would: (1) convert the HPC environment from a set of server-centric systems to a data-centric environment; (2) support data sharing and provide for the organization of data files by research project, particularly for projects with many participating researchers, by implementing tools to support the creation of publishable and searchable metadata libraries; and (3) provide for reliable backup and short-term preservation of research data. The DSMS is operational. Examples of its use follow.
In the aftermath of Hurricane Sandy, a number of studies were commissioned by the City of New York and the US Department of Housing and Urban Development to evaluate future storm threats to the New York Metropolitan Area and approaches to making the City more resilient. The studies included the development of revised flood maps that incorporated the potential effects of sea level rise and identified areas that might be subject to flooding. These maps are an important long-term planning tool for two reasons: (1) areas flooded by Sandy were outside of the identified flood zones in the existing Federal Emergency Management Agency (FEMA) flood zone maps and (2) the newest FEMA maps do not include the potential effects of sea level rise. Another study evaluated the potential benefits of creating oyster bed reefs off the coast of Staten Island to protect newly created sand dune barriers and curtail inland flooding by reducing the intensity of waves.
Research at the CUNY Cooperative Research in Environmental Sensing Science and Technology Institute (CREST) focuses on all aspects of remote sensing and sensor development, satellite remote sensing, ground-based field measurements, data processing and analysis, modeling, and forecasting. CREST recruits and trains undergraduate, master's, and doctoral students, with a focus on under-represented minorities in the environmental sciences. CREST is led by CUNY and brings together Hampton University, University of Puerto Rico at Mayaguez, California State University, and University of Maryland Baltimore County. A major activity of CREST has been the development of the weather research and forecasting model with urban parameterizations (uWRF). A 1 km grid version of this model is being run operationally to provide real-time forecast outputs that are managed within the DSMS. Sensor data is acquired from NOAA satellites; ground-based vertical profiling lidars, sodars, and radars; surface weather stations; and constituent sampling stations. The sensor data is preprocessed at CREST and transmitted to the CUNY HPC Center, where the data products from the uWRF model are merged with data products retrieved from the network of observational assets managed by CREST. Micro-scale air quality and weather forecasts are prepared and automatically transmitted to CREST, where they are made available to the research community. A unique feature of the uWRF model is its ability to predict building energy demands at neighborhood scales. The metadata tagging capabilities of iRODS are being used strategically so that the lidar information can provide surface energy flux estimates that contribute to the accuracy of the model predictions. This use case of the DSMS is a good example of the value that such a cyber-environment can add by hosting a "Big Data" platform to prototype novel and potentially valuable technologies. Critical data sets are maintained on the DSMS so that the validation records of this cyber-physical system can be archived.
The Urban Microbial Community Diversity Project (NSF Grant 1323225) works with faculty from eight of CUNY's four-year colleges and four of its community colleges to incorporate research-based microbiology/genomics projects into their biology laboratory courses. The curriculum includes teaching the protocols for determining the types of bacteria present in samples collected from urban sites such as subway stations; procedures for analysis, including PCR amplification/sequencing; protocol development; use of high-performance computing; and data display.
Researchers and students affiliated with the CUNY graduate program in the biosciences, as well as the Museum of Natural History, examine processes that lead to the diversification of species in number, form, and ecology. They use Next Generation Sequencing techniques to create phylogenetic maps, or trees, of various species. Subsequently, the phylogenetic data is integrated with morphological and ecological data (including remotely sensed data and GIS) in a computationally intensive framework to reveal how species, ecology, and morphology evolve together over time and to examine the processes associated with adaptive radiations.
These are a few examples of the activities at CUNY that make use of the instrumentation funded by NSF Grant ACI-1126113.