https://www.dagstuhl.de/14371

### September 7–12, 2014, Dagstuhl Seminar 14371

# Adjoint Methods in Computational Science, Engineering, and Finance

## Organizers

Nicolas R. Gauger (TU Kaiserslautern, DE)

Michael Giles (University of Oxford, GB)

Max D. Gunzburger (Florida State University, US)

Uwe Naumann (RWTH Aachen, DE)


## Documents

Dagstuhl Report, Volume 4, Issue 9


## Summary

The demand for meaningful numerical simulation of physical, chemical, biological, economic, and financial phenomena in computational science, engineering, and finance (CSEF) has grown alongside the performance of continuously improving computer systems. As a result, we are (and will remain) faced with a large and growing number of highly complex numerical simulation codes that run at the limit of the available high-performance computing (HPC) resources. These codes often result from the discretization of systems of partial differential equations (PDEs). Their run time correlates with the spatial and temporal resolution, which often needs to be very high in order to capture the real behavior of the underlying system. There is no doubt that the available hardware will always be used to the extreme. Improvements in the run time of the simulations must therefore be sought through research in numerical algorithms and their efficient implementation on HPC architectures.

Problem sizes are often in the billions of unknowns, and with emerging large-scale computing systems this size is expected to increase by a factor of one thousand over the next five years. Moreover, simulations are increasingly used for design optimization and parameter identification, which are even more complex and require the highest possible computational performance as well as fundamental enabling algorithmic technology. Derivatives of certain objectives of these numerical models with respect to a potentially very large number of model parameters are crucial for the highly desirable transition from pure simulation to optimization. Approximating these derivatives via finite difference quotients often lacks the required accuracy; more importantly, it may be computationally infeasible for a large parameter space. Until recently, adjoint numerical programs were written by hand to overcome this problem. Such programs compute (large) gradients with machine accuracy at a small constant multiple of the computational cost of the underlying primal simulation. Due to the enormous size of most numerical simulation codes, the manual procedure may take several person-years; moreover, manually written adjoint codes are error-prone and hard to maintain as the primal simulation evolves. Computer scientists have therefore developed special software tools based on the principles of algorithmic differentiation (AD) to generate discrete adjoint code automatically. This method has gained considerable acceptance within the CSEF community, as illustrated by numerous successful case studies presented in the proceedings of the six international conferences on AD held to date. See http://www.autodiff.org for details.

**Illustrative Example:** Classical applications of adjoint methods arise in the context of large-scale inverse problems, such as the estimation of unknown or uncertain parameters of mathematical models for real-world problems implemented as computer programs. Imagine optimizing the shape of an aircraft with the objective of maximizing its lift. The continuous mathematical domain (the surface of the aircraft) is typically discretized by generating a mesh with a potentially very large number of points spread over the whole surface. Optimization aims to adapt the position of these points in 3D space such that the objective is met while various constraints (e.g. a prescribed volume) are satisfied at the same time. A naive approach might run a potentially very large number of primal numerical simulations with changing mesh configurations, identifying an optimum only within this very limited search space.

Derivative-based approaches use information on the sensitivity of the objective at the given mesh configuration with respect to changes in the positions of all mesh points (the gradient) in order to make a deterministic decision about the next configuration to be considered. The sensitivities can be approximated through local perturbations of the position of each mesh point (finite difference quotients). A single optimization step would thus require a number of primal simulations that is of the order of the number of degrees of freedom (three spatial coordinates for each mesh point) induced by the mesh. This approach is practically infeasible as a single simulation may easily run for several minutes (if not hours) on the latest HPC architectures. The approximation of a single gradient would take months (if not years) for a mesh with only one million points.
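The scaling argument above can be made concrete with a toy sketch (hypothetical Python, not taken from any code discussed at the seminar): approximating the gradient of a scalar objective with respect to n parameters by one-sided difference quotients costs n + 1 objective evaluations, so the cost grows linearly with the number of design variables.

```python
import numpy as np

def finite_difference_gradient(J, x, h=1e-6):
    """Approximate the gradient of J at x with one-sided difference
    quotients: requires n + 1 evaluations of J for n parameters,
    which is what makes this approach infeasible for large meshes."""
    J0 = J(x)
    grad = np.empty(x.size)
    for i in range(x.size):
        xp = x.copy()
        xp[i] += h          # perturb one "mesh coordinate" at a time
        grad[i] = (J(xp) - J0) / h
    return grad

# Toy objective J(x) = sum(x**2); its exact gradient is 2 * x.
x = np.linspace(0.0, 1.0, 5)
g = finite_difference_gradient(lambda v: float(np.sum(v**2)), x)
```

With a million mesh points and three coordinates per point, the same loop would require roughly three million primal simulations per gradient, matching the infeasibility argument in the text.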

Adjoint methods deliver the gradient at the cost of only a few (between two and ten) primal simulations. Continuous adjoint methods derive an adjoint version of the primal mathematical model analytically and then solve the resulting adjoint model numerically. While this approach promises low computational cost (approximately two primal simulations), it can be mathematically challenging, and its results may be numerically inconsistent with the primal numerical simulation. To the best of our knowledge, the derivation of continuous adjoint models has not yet been automated.
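Why a single adjoint solve suffices regardless of the number of parameters can be seen from a standard derivation (sketched here in generic notation, not tied to any particular model from the seminar): consider a discretized objective $J(u, p)$ whose state $u$ is determined by the residual equation $R(u, p) = 0$.

```latex
% Chain rule for the reduced objective J(u(p), p), where du/dp solves
% (\partial R/\partial u)(du/dp) = -\partial R/\partial p:
\frac{\mathrm{d}J}{\mathrm{d}p}
  = \frac{\partial J}{\partial p}
  - \lambda^{\top}\,\frac{\partial R}{\partial p},
\qquad
\left(\frac{\partial R}{\partial u}\right)^{\!\top}\lambda
  = \left(\frac{\partial J}{\partial u}\right)^{\!\top}.
```

One linear adjoint solve for $\lambda$ replaces the $n$ linearized solves that computing $\mathrm{d}u/\mathrm{d}p$ column by column would require, which is why the gradient costs only a small constant multiple of the primal simulation.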

Discrete adjoint methods rely on the algorithmic differentiation of the primal numerical model, thus avoiding the potential numerical inconsistencies induced by the continuous adjoint. Depending on the mode of implementation of AD, the level of maturity of the AD tool, and the expertise of the user, the computational cost can range between 2 and 20 primal simulations, sometimes even more. Still, this cost is independent of the number of mesh points (referring to the above example). Solutions to problems arising in adjoint methods require expertise in both theoretical and applied computer science as well as in numerical analysis. Robust methods for the data-flow reversal within adjoint code are built on special graph partitioning and coloring algorithms. Their implementation on modern HPC architectures (e.g., using MPI and/or OpenMP) has an impact on simulation software design and data management. The use of accelerators has been considered only recently, with many as-yet-unsolved problems. Static and dynamic program analysis and compiler construction techniques have been developed to facilitate the semi-automatic generation of discrete adjoint code. The exploration of a potential extension of these techniques to continuous adjoint code was one of the subjects of this seminar. Other conceptual problems discussed included functional-analytic aspects of adjoint methods and their impact on practical implementation, combinatorial problems in adjoint code generation and their computational complexities, and simulation software engineering guidelines in the light of adjoint methods.
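The core mechanism behind discrete adjoints — recording the primal computation and sweeping adjoints backward through it — can be sketched in a few lines of Python. This is a minimal illustrative tape, not any of the AD tools referenced above; all class and function names are invented for the example.

```python
import math

class Var:
    """One tape node: stores the primal value, the parent nodes with
    their local partial derivatives, and an adjoint accumulator."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # pairs (parent Var, local partial)
        self.adjoint = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def sin(self):
        return Var(math.sin(self.value), ((self, math.cos(self.value)),))

def grad(output):
    """Reverse sweep: topologically order the tape (the data-flow
    reversal mentioned above), then propagate adjoints from the
    output back to every input in a single pass."""
    order, seen = [], set()
    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)
    visit(output)
    output.adjoint = 1.0
    for node in reversed(order):
        for parent, partial in node.parents:
            parent.adjoint += partial * node.adjoint

# f(x, y) = x * y + sin(x); df/dx = y + cos(x), df/dy = x.
x, y = Var(2.0), Var(3.0)
f = x * y + x.sin()
grad(f)
```

One backward sweep fills in the adjoints of all inputs at once, so the cost of the gradient is a small constant multiple of the primal evaluation, independent of the number of inputs — the property the text attributes to discrete adjoint code.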

Adjoint methods borrow from a variety of subfields of computer science and applied mathematics, including high-performance and combinatorial scientific computing, program analysis and compiler construction, functional analysis, and numerical analysis and linear algebra, and they are relevant to a wide range of potential application areas. As such, the topic lends itself to a series of seminars taking more detailed looks at the respective subjects. With this seminar we intend to initiate a sequence of related events alternating between the Leibniz Center for Informatics at Schloss Dagstuhl and the Mathematisches Forschungsinstitut Oberwolfach, thus emphasizing the obvious synergies between computer science and mathematics in the given context.

**Summary text license**

Creative Commons BY 3.0 Unported license

Nicolas R. Gauger, Michael Giles, Max D. Gunzburger, and Uwe Naumann

## Classification

- Modelling / Simulation

## Keywords

- Continuous adjoints
- Discrete adjoints
- High-performance scientific computing
- Algorithmic differentiation