BigData: Small: DCM: OpenFlow Enabled Hadoop over Local and Wide Area Clusters
Robert L. Grossman, University of Chicago

In recent years, data-intensive programming using Hadoop and MapReduce has become increasingly important. As normally deployed, Hadoop's implementation of MapReduce in a multi-rack cluster depends upon the top-of-rack switches and the aggregation switches connecting multiple racks. Using Hadoop effectively at scale across many racks therefore requires expensive network switches and routers that are complex to configure and maintain.

Software-defined networks using OpenFlow have proven in many cases to offer good performance at lower cost and to be simpler to manage. The first goal of this proposal is to contribute to the development of a new version of Hadoop, called Hadoop-OFP. The basic idea of Hadoop-OFP is to integrate OpenFlow-enabled switches with Hadoop in order to: i) improve performance; ii) lower the cost of the hardware required; and iii) simplify the management of the cluster.

Large data flows are also a critical component of data-intensive computing. Unfortunately, configuring networks to manage large data flows can be challenging. A second goal of this proposal is to develop a tool that can configure OpenFlow-enabled networks to more efficiently handle the large data flows that arise in data-intensive computing.
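As an illustrative sketch only, and not the proposal's actual design, the kind of controller logic such a tool might install can be modeled as follows. The byte threshold, class names, and rule fields here are all hypothetical assumptions; a real deployment would speak the OpenFlow protocol to physical switches rather than use an in-memory flow table.

```python
# Toy model of an OpenFlow-style controller that detects large ("elephant")
# flows and installs higher-priority forwarding rules for them.
# Hypothetical sketch: threshold, field names, and ports are assumptions.

from dataclasses import dataclass, field

ELEPHANT_BYTES = 100 * 1024 * 1024  # hypothetical threshold: 100 MB


@dataclass
class FlowRule:
    match: tuple    # (src_ip, dst_ip) the rule matches on
    priority: int   # higher-priority rules are checked first
    out_port: int   # port the switch forwards matching packets to


@dataclass
class ToyController:
    flow_table: list = field(default_factory=list)
    byte_counts: dict = field(default_factory=dict)

    def observe(self, src, dst, nbytes, fast_port=2):
        """Account for traffic; promote a flow once it crosses the threshold."""
        key = (src, dst)
        self.byte_counts[key] = self.byte_counts.get(key, 0) + nbytes
        if self.byte_counts[key] >= ELEPHANT_BYTES and not self._has_rule(key):
            # Route the large flow over a dedicated high-capacity path.
            self.flow_table.append(FlowRule(key, priority=200, out_port=fast_port))

    def _has_rule(self, key):
        return any(r.match == key for r in self.flow_table)

    def lookup(self, src, dst, default_port=1):
        """Return the output port: the highest-priority matching rule wins."""
        rules = [r for r in self.flow_table if r.match == (src, dst)]
        if rules:
            return max(rules, key=lambda r: r.priority).out_port
        return default_port


ctl = ToyController()
ctl.observe("10.0.0.1", "10.0.0.2", 10 * 1024 * 1024)   # small so far: default path
ctl.observe("10.0.0.1", "10.0.0.2", 200 * 1024 * 1024)  # now an elephant flow
```

After the second call the cumulative count crosses the threshold, so lookups for that flow return the fast port (2) while all other traffic stays on the default port (1).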

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Type: Standard Grant (Standard)
Application #: 1251201
Program Officer: Robert Chadduck
Budget Start: 2013-06-01
Budget End: 2017-05-31
Fiscal Year: 2012
Total Cost: $749,806
Name: University of Chicago
City: Chicago
State: IL
Country: United States
Zip Code: 60637