Optimizing DDA Code on a Power5 Processor
Date: Monday, June 8, 2009
Time: 10:30am - 11:30am
Location: LBNL Bldg. 50F, Room 1647
Speaker:
Adam Jundt
Center for Computation and Technology
Louisiana State University
Abstract:
This talk presents an implementation of the Discrete Dipole
Approximation method used for computationally simulating the light
scattering effects of atmospheric particles, and the results of
optimizing the code to run on an IBM Power5 processor. The original
implementation suffered from excessive execution time caused by a high
number of cache accesses and a low rate of instructions per cycle. The
goal of this project was to improve the code's performance through a
series of step-by-step optimizations. The first and second stages
relied on optimization options provided by IBM's compiler, xlC. The
next stage applied manual optimizations to the code, concentrating
mainly on loop fusion. The final stage incorporated OpenMP into the
code to take advantage of the multiple cores available on the Power5
processor. With the IBM High Performance Toolkit, we recorded the
change in the number of L1 data cache misses and references,
instructions per cycle, and execution time after each phase of
optimization. Using the
original source code with no optimizations as the base for the
experiments, a speedup of 12x was obtained for "compiler only"
optimizations, with an overall speedup of 42x after all optimizations
were performed.
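
For readers unfamiliar with the two manual techniques named in the abstract, the following is a minimal sketch of loop fusion combined with an OpenMP parallel loop. It is not taken from the speaker's DDA code: the function names, the arrays x and y, and the scale-then-sum kernel are hypothetical stand-ins chosen only to show the shape of the transformation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel, NOT from the DDA code discussed in the talk.
// Unfused version: two separate passes, so y is streamed through the
// data cache twice (once to update it, once to accumulate the sum).
double scale_then_sum_unfused(std::vector<double>& y,
                              const std::vector<double>& x, double a) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += y[i] * y[i];
    }
    return sum;
}

// Fused version: one pass, so each y[i] is reused while it is still in
// a register or in L1. The OpenMP pragma splits the iterations across
// threads and combines the per-thread partial sums; it is ignored if
// the compiler is not invoked with OpenMP enabled.
double scale_then_sum_fused(std::vector<double>& y,
                            const std::vector<double>& x, double a) {
    assert(x.size() == y.size());
    const long n = static_cast<long>(x.size());
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        const double yi = a * x[i] + y[i];
        y[i] = yi;
        sum += yi * yi;
    }
    return sum;
}
```

Both functions compute the same result. The fused form touches each array element once per call, which is the kind of change that would be expected to lower L1 data cache references; with a parallel floating-point reduction, the summation order can differ between runs, so results may differ in the last bits for general inputs.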
Host of Seminar:
Tony Drummond