Portable embedded systems place ever-increasing demands on high-performance, low-power microprocessor design. This has triggered a paradigm shift in the embedded processor industry: embedded processor architects are permanently altering their roadmaps to incorporate multiple cores on the same chip to preserve exponential computational speed-up (to continue Moore's law). While the industry focus is on putting a higher number of cores on a single chip, the key challenge is to optimally architect these processors to meet stringent real-time constraints in virtualized environments with different real-time operating systems (RTOS). Moreover, these processors will work on diverse application types with sporadic, event-driven data. In such complex systems, optimal performance can only be realized by co-designing the RTOS kernels, the virtualization environment, and the processor micro-architecture.
The goal of this research is to develop a cycle-accurate simulation platform for the micro-architectural exploration of future kilocore-scale heterogeneous embedded chip-multiprocessors (ECMPs), together with an interface to boot RTOS-es on the simulator and to enable virtualization on the ECMPs. The integrated framework is flexible and modular for designing a wide variety of ECMPs and RTOS-es, flexibly threaded for fast simulation of thousands of cores on a wide range of computing platforms, assertion-based and check-pointed for quickly regenerating and changing complex trigger conditions required for debugging, instantly-bootable to reduce the RTOS booting time from power-up to simulation, enabled with deep-chip-vision for better observability of silicon behavior at the architectural level, and open-sourced for non-commercial use. The RTOS integration environment provides software interfaces and libraries to the architect and the OS designer to simulate the execution of embedded applications on different exploratory ECMPs, in the presence of different virtualization scenarios and RTOS scheduling algorithms. This project promotes inter-disciplinary research among computer architects, RTOS designers, Computer-Aided Designers, and embedded application programmers at different universities and research institutions. The various education components woven into the project will foster team-based research, learning, and teaching among the university student participants.
The goal of this research was to develop a hierarchical cycle-accurate simulation platform for the microarchitectural exploration of future large-scale embedded chip-multiprocessors (ECMPs), together with an interface to boot real-time responsive operating systems (OS) on the simulator. The integrated framework is flexible and modular, efficiently threaded for fast simulation of multiple cores on a wide range of computing platforms, checkpointed for quickly regenerating and changing complex trigger conditions required for debugging, instantly-bootable to reduce the OS booting time from power-up to simulation, equipped with deep-chip-vision for better observability of silicon behavior at the architectural level, and open-sourced for non-commercial use. The OS integration environment provides software interfaces and libraries to the architect and the OS designer to simulate the execution of embedded applications on different exploratory ECMPs. Some significant technical achievements include:
Design and implementation of a cycle-accurate simulator for highly multithreaded multicore processors. The software is modular and written in C++.
Integration of the platform with the instruction-level Sun SAM simulator.
Development of power models using component synthesis on the Berkeley 45nm gate library.
Development of a scalable hierarchical methodology for simulating many-core processors.
Comparison of machine learning algorithms in exploring microarchitecture design.
Application to the design of next-generation network processors and smart-grid processors.
Contribution of an open-sourced cycle-accurate SPARC simulator to the community (GNU licensed).
Provision of a framework for exploring scheduling algorithms along with architectural exploration and code tuning for a wide variety of applications.
The broader impacts of this research include the following. The coding- and testing-intensive nature of this project enabled us to provide research assistantships to 10 MS students and 4 PhD students on a rotating basis, which contributed to their theses, projects, and dissertations. The core knowledge and expertise gained by the students were in the areas of Computer Architecture, Software Programming, Scheduling, Operating Systems, Simulation Platforms, System Level Design, and VLSI Chip Design. This research generated interest from international research groups and Intel Research Labs. We use the CASPER simulator in our Computer Architecture class to introduce students to a modern processor.