

Developing parallel high-performance applications is error-prone and time-consuming. Optimisation tools can ease performance tuning considerably, either as a single stand-alone tool or as a tool chain of more or less integrated tools covering different aspects of the optimisation process. In the present paper, we demonstrate the benefits of the latter approach on the industrial combustion modelling software RECOM-AIOLOS. The applied tool chain comprises both low-level and high-level analysis of the application. At the low level, the MAQAO tool statically analyses the assembly code generated by the compiler, aiming at possible optimisations at loop level. Another important aspect is identifying and optimising bottlenecks in memory access and cache utilisation. At a higher level, efficient usage of parallel programming paradigms (MPI, OpenMP) is verified with the VAMPIR and SCALASCA frameworks. Combining the different optimisation strategies leads to a significant overall performance improvement.