Future computing chips, both in mobile devices such as smartphones and in data-center servers, will contain many processors, memories, and specialized functional units connected by a sophisticated on-chip network. The on-chip network is a critical design element that influences the performance, power consumption, and cost of the chip. Hence, there is a compelling need for tools and techniques that explore the design space of an on-chip network quickly, so that networks can be optimized for a given application or market segment. Trace-based simulation is widely used to design and optimize on-chip networks. However, it can lead to incorrect and misleading conclusions about network behavior because a trace does not accurately model the packet injection rate of the application: the injection times recorded in a trace are fixed, whereas in the running application the injection of a packet can depend on when earlier packets arrive. In this project, the investigators develop techniques to overcome this limitation, enabling rapid design space exploration of on-chip networks with accuracy approaching that of full-system simulation and simulation time similar to that of trace-based simulation. Dependencies between packets are inferred by sampling multiple runs of an application on a fully connected network topology with different link latencies. Traces augmented with this dependency information model the packet injection rate of an application more accurately. FPGA-based acceleration is used to collect and analyze traces, and a fast multithreaded network simulator capable of processing traces augmented with packet dependency information is developed. The broader impact of the work will come through a validated repository of benchmark traces augmented with packet dependency information and a multithreaded network simulator, both of which the research community can use to design and optimize on-chip networks with hundreds of processors.
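The difference between a plain trace and a dependency-augmented trace can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration and not the investigators' actual tools or algorithms: the `Packet` record, the `replay` and `infer_deps` functions, the toy network in which every packet takes the same fixed latency, and the inference heuristic (a packet is taken to depend on another when the gap between that packet's arrival and its own injection stays constant across runs with different latencies).

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    pid: int
    recorded_inject: int                      # injection cycle seen when the trace was captured
    deps: list = field(default_factory=list)  # ids of packets that must arrive first
    gap: int = 0                              # cycles from last dependency arrival to injection


def replay(trace, latency, use_deps):
    """Return {pid: injection cycle}, assuming every packet takes `latency` cycles."""
    inject, arrive = {}, {}
    for p in trace:  # assumes each packet is listed after the packets it depends on
        if use_deps and p.deps:
            # Dependency-aware replay: injection moves with the network's latency.
            inject[p.pid] = max(arrive[d] for d in p.deps) + p.gap
        else:
            # Plain trace replay: the recorded timestamp is used unchanged.
            inject[p.pid] = p.recorded_inject
        arrive[p.pid] = inject[p.pid] + latency
    return inject


def infer_deps(runs):
    """Guess dependencies from runs at different latencies (hypothetical heuristic).

    `runs` maps latency -> {pid: injection cycle} and needs at least two latencies.
    Packet b is taken to depend on packet a when (inject_b - arrive_a) is the
    same non-negative value in every run.
    """
    pids = sorted(next(iter(runs.values())))
    found = {}
    for b in pids:
        for a in pids:
            if a == b:
                continue
            gaps = {inj[b] - (inj[a] + lat) for lat, inj in runs.items()}
            if len(gaps) == 1 and min(gaps) >= 0:
                found[b] = (a, gaps.pop())
    return found


# A toy trace captured at latency 10: packet 1 waits for packet 0, packet 2 for packet 1.
trace = [
    Packet(0, 0),
    Packet(1, 15, deps=[0], gap=5),
    Packet(2, 28, deps=[1], gap=3),
]

print(replay(trace, 20, use_deps=False))  # {0: 0, 1: 15, 2: 28}
print(replay(trace, 20, use_deps=True))   # {0: 0, 1: 25, 2: 48}

runs = {lat: replay(trace, lat, use_deps=True) for lat in (10, 20)}
print(infer_deps(runs))                   # {1: (0, 5), 2: (1, 3)}
```

When the latency doubles from 10 to 20 cycles, the plain replay still injects packets at the recorded cycles, which overstates the injection rate the slower network would see. The dependency-aware replay delays the dependent packets instead.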