Introduction to High-Performance Simulation
Tutorial at ESS 99 by
Graham Horton
University of Erlangen-Nuremberg, Dept. of Computer Science X
Martensstrasse 3, 91058 Erlangen
e-mail: horton@informatik.uni-erlangen.de
Almost all very-large-scale computations are simulations of natural phenomena
or technical systems. In addition, the problems to be solved - such as the
"Grand Challenge" problems - are often of significant scientific, economic or
social importance. Their complexity is such that they can require
impractically long computation times even on the fastest available computers.
The field of high-performance simulation addresses this situation by
seeking new techniques for reducing the computation time for important classes
of simulation problems. There are essentially three main approaches to
achieving this, which will all be covered in the tutorial.
The first and most obvious approach is to perform the simulations on faster
computers. This almost always requires a substantial reformulation of the
simulation algorithm, and sometimes even a complete re-design and re-coding
of the program; this is particularly true for high-performance parallel
computers. Modern parallel supercomputers now offer performance in the
Teraflop range, and even workstation clusters can yield aggregate
performance in the 10 GFLOPS range.
In the tutorial we will therefore
examine common techniques and pitfalls involved in parallel simulation.
Secondly, new, faster algorithms can be developed which directly reduce
the computational complexity of the simulation problem. In recent years,
techniques such as multi-level algorithms have yielded orders-of-magnitude
performance improvements for many kinds of simulation problems.
We will therefore examine these and other techniques for developing faster
simulation algorithms.
Finally, a given simulation program can be tuned to achieve maximum performance
on a particular computer or class of computers. This tuning can involve the
reorganisation of the code, the reorganisation of the data or the
reorganisation of the underlying algorithm. In the tutorial we will explain
the basic architectural principles of modern computers and show how to improve
simulation programs to derive maximum performance from them. Examples will
be given demonstrating the significant performance benefits that can
be achieved - often by relatively simple means.