Over the past 15 years, processor speeds in modern computers have increased at a much faster rate than main memory speeds. As a result, relative memory latencies in both sequential and parallel computers have become quite large and continue to increase rapidly. These trends motivate this research, which addresses how to monitor, analyze, and improve computer performance in the face of a large processor-memory performance gap. Because of this gap, it is increasingly important to develop performance monitoring systems that allow programmers, compiler writers, and system designers to identify and tune the portions of their designs where memory overhead limits performance. To accomplish these goals, this project focuses on four areas: (i) exploring hybrids of compile-time and run-time software-based techniques for application memory performance analysis, (ii) combining these hybrid monitoring techniques with dynamic compilation to implement flexible, low-overhead, detailed performance tools, (iii) designing and implementing novel, software-aware hardware performance monitoring techniques, and (iv) designing software tools based on this novel hardware support. Studying hardware and software solutions concurrently yields insight into the strengths and weaknesses of each approach. The end result of this research is a suite of performance tools that spans a range of monitoring detail provided and hardware support required.