The goal of this research is to investigate and develop the techniques needed to produce a tool that allows the applications programmer and the computer designer to work closely together to increase uniprocessor performance. Computational units, such as floating-point units, are becoming faster and more readily available because of rapid advances in technology and Computer-Aided Design (CAD) tools. Performance, however, is limited by the rate at which data can be supplied to those units. To achieve good uniprocessor performance, one must efficiently utilize the computational units, memory, and memory ports. This may require that the applications programmer redesign the algorithms to better exploit the unique features of the architecture, that the computer designer use CAD tools to design special-purpose processors that exploit the unique features of the algorithms, or, for best results, both. This research will investigate and develop (1) techniques for extracting statistics on the utilization of these processor resources and (2) methods for displaying the performance statistics in a form that benefits both the applications programmer and the computer designer. The results will permit the applications programmer and the computer designer to interact closely in efficiently solving challenging problems, such as climate modeling, vehicle dynamics, semiconductor modeling, and quantum chromodynamics, that are beyond the scope of existing computers.