Benjamin Lee is pursuing his research interests in
the areas of high-performance computing and computer architecture
as a graduate student at Harvard University. He is advised by
Professor
David Brooks.
The costs of detailed single-core simulation impede chip multiprocessor (CMP)
design space exploration when such detailed simulation must be performed
on a per-processor basis. Benjamin is currently assessing the potential
of statistical regression models to capture performance and power trade-offs
of uni-processors as architectural design parameters vary. These models will
enable architects to leverage uni-processor simulation results in multi-core
research analytically and at low fixed cost, instead of relying on per-core
simulations whose costs scale superlinearly as the number of cores increases.
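A minimal sketch of the idea follows, with hypothetical design parameters and made-up sample values rather than data from the actual studies: fit an ordinary least squares regression to a handful of simulated design points, then query the model for a configuration that was never simulated.

```python
import numpy as np

# Hypothetical sketch: fit a linear model mapping uni-processor design
# parameters to a simulated performance metric, then predict unsampled
# configurations. The parameter names and numbers below are illustrative,
# not values from the actual work.

# Sampled design points: (issue_width, cache_kb, pipeline_depth).
X = np.array([
    [2,  256, 12],
    [4,  512, 16],
    [4, 1024, 20],
    [8,  512, 24],
    [8, 2048, 16],
], dtype=float)
perf = np.array([0.9, 1.4, 1.6, 1.7, 2.1])  # simulated performance (e.g., BIPS)

# Ordinary least squares on log-transformed predictors, with an intercept.
A = np.column_stack([np.ones(len(X)), np.log(X)])
coef, *_ = np.linalg.lstsq(A, perf, rcond=None)

# Predict performance for a configuration that was never simulated:
# the model replaces a detailed (and expensive) simulation of this point.
query = np.array([6.0, 1024.0, 20.0])
print(f"predicted performance: {coef[0] + np.log(query) @ coef[1:]:.2f}")
```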
Power and thermal characteristics are increasingly considered primary design
constraints in addition to performance. Although these metrics may be well
understood in the single-core domain, their trade-offs for novel multi-core
architectures (e.g., heterogeneous or adaptive cores) are relatively
unexplored. Benjamin is continuing to study these and other interesting
CMP architectures.
Benjamin Lee pursued his research interests in the areas of computer
architecture and high-performance computing applied to scientific
computing as an undergraduate researcher at the University of
California, Berkeley. He was affiliated with the
Berkeley Benchmarking
and OPtimization Group (BeBOP) and worked with Professors
Jim Demmel and
Kathy Yelick,
and with post-doc
Rich Vuduc.
Prior work with BeBOP includes a project to implement parallel
sparse matrix kernels with POSIX threads, extending earlier serial
performance tuning efforts to parallel architectures such as
shared memory multiprocessors (SMPs).
In particular, Benjamin implemented parallel sparse
matrix-vector multiply for shared memory multiprocessors. This
implementation incorporated prior performance tuning techniques,
such as register blocking and cache blocking, as well as code
organization strategies such as loop unrolling, removing false
dependencies, and using machine-sympathetic language constructs.
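The row-partitioning idea at the heart of such an implementation can be sketched as follows. The sketch is in Python for brevity, whereas the actual kernels were written in C with POSIX threads and layered register and cache blocking on top; the function names and the even row split are illustrative.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Illustrative sketch of row-partitioned parallel sparse matrix-vector
# multiply over a CSR matrix. Each worker owns a contiguous, disjoint
# block of rows, so no synchronization is needed on the output vector.

def spmv_rows(indptr, indices, data, x, y, r0, r1):
    # Compute y[r0:r1] = A[r0:r1, :] @ x for one worker's row block.
    for i in range(r0, r1):
        acc = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            acc += data[k] * x[indices[k]]
        y[i] = acc

def parallel_spmv(indptr, indices, data, x, nthreads=4):
    n = len(indptr) - 1
    y = np.zeros(n)
    # Even split by rows; a tuned version would balance by nonzero count.
    bounds = np.linspace(0, n, nthreads + 1, dtype=int)
    with ThreadPoolExecutor(nthreads) as pool:
        for t in range(nthreads):
            pool.submit(spmv_rows, indptr, indices, data, x, y,
                        bounds[t], bounds[t + 1])
    return y
```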
Other work with BeBOP includes a project to optimize symmetric
sparse matrix-vector multiply in an effort to automate the tuning of
linear algebra computational kernels to reflect the capabilities of
current compiler and hardware technologies.
In particular, Benjamin examined the effects of various optimizations
on the performance of symmetric sparse matrix-vector multiply including
algorithmic, data structure, compiler, and architecture-specific
optimizations. These optimizations seek to exploit the symmetric
structure of the matrix in conjunction with existing optimizations
for general sparse matrix-vector multiply.
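The core symmetry optimization can be sketched as follows, again in illustrative Python and without the register blocking and other tuning applied in the actual work: store only one triangle of the matrix, and let each off-diagonal nonzero contribute to two entries of the result, roughly halving the matrix storage and memory traffic.

```python
import numpy as np

# Illustrative sketch: sparse matrix-vector multiply where only the lower
# triangle (including the diagonal) of a symmetric matrix is stored in CSR
# form. Each off-diagonal nonzero a_ij is applied twice, once for each of
# the symmetric halves, so only half the matrix is read from memory.

def sym_spmv(indptr, indices, data, x):
    n = len(indptr) - 1
    y = np.zeros(n)
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            j, a = indices[k], data[k]
            y[i] += a * x[j]        # contribution of a_ij
            if j != i:
                y[j] += a * x[i]    # mirrored contribution of a_ji
    return y
```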
Benjamin's work in the field of scientific computing and performance
optimization yielded significant performance gains. In particular,
symmetry-exploiting optimizations improved performance by as much as a
factor of 2.6. Furthermore, hardware performance modeling
allowed the formulation of upper bounds on performance; these bounds
made it possible to evaluate how effectively the optimizations
approached the theoretical maximum. Details
regarding performance tuning techniques and performance
modeling are included in
published work.
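For a flavor of how such bounds arise, consider a simple bandwidth argument, where the numbers below are assumed for illustration and are not figures from the published work: sparse matrix-vector multiply performs roughly two floating-point operations per stored nonzero while streaming the matrix from memory, so sustained memory bandwidth caps the achievable flop rate.

```python
# Back-of-the-envelope performance upper bound for sparse matrix-vector
# multiply. All machine and matrix parameters below are illustrative
# assumptions, not measurements from the published work.

nnz = 5_000_000         # stored nonzeros
bytes_per_nnz = 12      # 8-byte value + 4-byte column index (CSR)
bandwidth = 2.0e9       # assumed sustained memory bandwidth, bytes/s

# The matrix must stream through memory at least once per multiply,
# which bounds the running time from below and the flop rate from above.
time_lb = nnz * bytes_per_nnz / bandwidth   # seconds
flops = 2 * nnz                             # one multiply and one add per nonzero
print(f"upper bound: {flops / time_lb / 1e6:.0f} Mflop/s")

# Storing only one triangle of a symmetric matrix roughly halves this
# traffic, which is part of why symmetry-exploiting optimizations pay off.
```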