The ever-shrinking feature size in CMOS process technology has enabled the integration of a large number of devices such as cores, caches, and other special engines in a single chip. The growing popularity of Chip Multiprocessors (CMPs) has ushered in a communication-centric system in which the design of the interconnection architecture has a significant impact on overall system performance as well as on the power dissipation and area of a chip. To overcome the problems of traditional interconnects, the Network-on-Chip (NoC), which uses switch-based networks, has been widely accepted as a promising architecture to orchestrate chip-wide communication. Although there has been significant research on NoC designs, there is still no unified design methodology that integrates system and NoC design.
The investigator is developing a comprehensive design paradigm for exploring the on-chip interconnect design space, focusing especially on how it interacts with the rest of the CMP architecture. This research program comprises four intertwined research objectives. First, simulation testbeds and traffic analysis methodologies are developed to understand the interplay between applications and system architecture. Traffic analysis tools are used to capture the runtime behavior of various applications in CMPs. Second, solutions for high and predictable performance within power and area budgets in current and future technology generations are provided. Third, the PI is developing a domain-specific NoC design for CMP memory systems. As the last objective, the PI explores new opportunities and challenges posed by future applications of next-generation CMPs. The research is integrated into the education curriculum through existing and new graduate courses and through undergraduate research programs.
The emergence of Chip Multiprocessors (CMPs) has triggered a paradigm shift from computation-centric to communication-centric system design as the number of cores in a chip increases. Moreover, as technology advances, global wire delays do not scale down as fast as gate delays. To overcome the problems of traditional interconnects, the Network-on-Chip (NoC), which uses switch-based networks, has been widely accepted as a promising architecture to orchestrate chip-wide communication. Although there has been significant research on NoC designs, there is still no unified design methodology that integrates the system and NoC design spaces and considers their interaction in terms of performance, area, and power in a cohesive fashion. This project developed a paradigm for exploring the on-chip interconnect design space, focusing especially on how it interacts with the rest of the CMP architecture. The project achieved four major objectives. First, a comprehensive framework that includes simulation testbeds and traffic analysis methodologies was developed to understand the interplay between applications, system architecture, and on-chip interconnects. The platform has been used to capture the communication behavior of scientific, server, and multimedia applications, which is critical for identifying the requirements of the NoC design. We proposed a phase-based traffic analysis method to capture run-time application behavior. Building on this, we proposed the notion of globally coordinated on-chip networks, in which application communication behavior, captured by traffic profiling, is used in the design and configuration of on-chip networks so that prevailing traffic flows are well supported in a globally coordinated manner. This has been applied to the design of a hybrid network consisting of a mesh augmented with configurable multidrop (bus-like) spanning channels, which serve as express paths for the traffic flows that benefit from them according to the characterized traffic profile.
Second, the design and analysis of on-chip interconnects for high performance, low power, and area efficiency were investigated. To this end, traditional design aspects such as router architecture, buffer optimization, and flow control were revisited in the context of new technologies. We proposed an Adaptive Physical Channel Regulator (APCR) for NoC routers to exploit the abundant on-chip wiring resources. The flit size in an APCR router is smaller than the physical channel width (phit size), which provides finer-granularity flow control. An APCR router allows flits from different packets or flows to share the same physical channel in a single cycle. Third, we explored how to exploit emerging technologies in future NoC designs. Nanophotonics has been proposed as a way to build low-latency, high-bandwidth NoCs for future CMPs. Recent nanophotonic NoC designs adopt token-based arbitration coupled with credit-based flow control, which leads to low bandwidth utilization. In this work, we proposed handshake schemes for nanophotonic interconnects in CMPs. These schemes eliminate the traditional credit-based flow control, reduce the average token waiting time, and ultimately improve network throughput. The other emerging technology we explored is STT-MRAM (Spin-Torque Transfer Magnetic RAM), which offers high density and near-zero leakage power. However, its long latency and high power consumption in write operations still need to be addressed. We explored the design issues in using STT-MRAM for NoC input buffers. Motivated by the short intra-router latency, we proposed a hybrid input buffer design that uses both SRAM and STT-MRAM to hide the long write latency efficiently. Finally, we investigated intelligent network management schemes to improve the performance and reduce the power consumption of existing NoC architectures.
In this study, we accelerated network communication by exploiting the temporal locality of communication, with minimal additional hardware cost in an existing state-of-the-art router architecture. We proposed a pseudo-circuit scheme that reserves crossbar connections to create pseudo-circuits, which are sharable partial circuits within a single router. The scheme reuses the previous arbitration information to bypass switch arbitration when the next flit traverses the same pseudo-circuit.
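The pseudo-circuit idea can be illustrated with a small behavioral sketch. This is not the project's router implementation: the class name `PseudoCircuitRouter`, the method `traverse`, and the counters are hypothetical, and the model makes simplifying assumptions (one flit is handled per call, switch arbitration always grants the request, and a pseudo-circuit is torn down only when another input port claims the same output port). It shows only the core behavior described above: a flit that follows the same input-to-output connection as the previous one skips switch arbitration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PseudoCircuitRouter:
    """Toy model of one router's crossbar with pseudo-circuit reuse (illustrative only)."""

    num_ports: int
    # circuit[in_port] = out_port of the crossbar connection left reserved
    # by the most recent flit from that input port.
    circuit: Dict[int, int] = field(default_factory=dict)
    bypassed: int = 0    # flits that skipped switch arbitration
    arbitrated: int = 0  # flits that went through switch arbitration

    def traverse(self, in_port: int, out_port: int) -> bool:
        """Forward one flit; return True if switch arbitration was bypassed."""
        if not (0 <= in_port < self.num_ports and 0 <= out_port < self.num_ports):
            raise ValueError("port index out of range")

        # Hit: the reserved connection matches, so the earlier arbitration
        # result is reused and the flit goes straight to the crossbar.
        if self.circuit.get(in_port) == out_port:
            self.bypassed += 1
            return True

        # Miss: arbitrate (assumed to always grant in this sketch) and tear
        # down any pseudo-circuit that another input holds on this output.
        for other_in, other_out in list(self.circuit.items()):
            if other_out == out_port:
                del self.circuit[other_in]
        self.circuit[in_port] = out_port
        self.arbitrated += 1
        return False
```

In this sketch, two consecutive flits from input port 0 to output port 2 result in one arbitration followed by one bypass. If input port 1 then sends a flit to output port 2, the earlier pseudo-circuit is terminated, and the next flit from port 0 to port 2 has to arbitrate again.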