This paper describes a combined approach for improving locality that uses the hardware performance monitors of modern processors and program-centric code annotations to guide thread scheduling on SMPs. The approach relies on a shared state cache model to compute expected thread footprints in the cache on-line. The accuracy of the model has been analyzed by simulations involving a set of parallel applications. We demonstrate how the cache model can be used to implement several practical locality-based thread scheduling policies with little overhead. Active Threads, a portable, high-performance thread system, has been built and used to investigate the performance impact of locality scheduling for several applications.
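As a rough illustration of what computing expected footprints on-line might involve, the sketch below uses a simple probabilistic model that is an assumption of this example, not necessarily the model developed in the paper. It assumes each cache miss evicts a uniformly random line in a cache of `cache_lines` lines, and that the miss counts come from a hardware performance counter. The function names and the greedy `pick_thread` policy are hypothetical, and the effect of state-sharing annotations is not modeled.

```python
# Hypothetical sketch: expected cache footprints under a uniform-eviction
# assumption. None of these names come from the paper or from Active Threads.

def decayed_footprint(footprint, misses, cache_lines):
    """Expected number of a descheduled thread's lines still cached after
    other threads have taken `misses` cache misses on the same processor."""
    survive = (1.0 - 1.0 / cache_lines) ** misses  # chance one line survives
    return footprint * survive

def grown_footprint(footprint, misses, cache_lines):
    """Expected footprint of the running thread after it takes `misses`
    misses, starting from `footprint` lines already in the cache."""
    not_owned = (cache_lines - footprint) * (1.0 - 1.0 / cache_lines) ** misses
    return cache_lines - not_owned

def pick_thread(footprints, misses_since, cache_lines):
    """Greedy locality policy (an assumption): run the ready thread with the
    largest expected footprint remaining on this processor.

    footprints   -- thread id -> footprint when the thread last left the CPU
    misses_since -- thread id -> misses counted on this CPU since then
    """
    return max(
        footprints,
        key=lambda t: decayed_footprint(footprints[t], misses_since[t], cache_lines),
    )

if __name__ == "__main__":
    ready = {"a": 200, "b": 150}
    misses = {"a": 2000, "b": 10}
    print(pick_thread(ready, misses, cache_lines=1024))
```

In this sketch thread `a` left more state behind, but most of it is expected to have been evicted by the 2000 intervening misses, so the policy prefers thread `b`.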
Last change: 3/29/98
The Sather Team (email@example.com)