AA5 – Variational Problems in Data-Driven Applications

Project

AA5-5 (was EF1-25)

Wasserstein Gradient Flows for Generalised Transport in Bayesian Inversion

Project Heads

Martin Eigel, Claudia Schillings, Gabriele Steidl

Project Members

Robert Gruhlke

Project Duration

01.01.2023 − 31.12.2024

Located at

FU Berlin

Description

Generalised Wasserstein gradient flows connect measure transport and interacting particle systems. The project combines the analysis of efficient numerical methods for gradient flows, the associated SDEs, and compressed functional approximations in the context of Bayesian inversion with parametric PDEs and image reconstruction tasks.

Related Publications

Eigel, M., Gruhlke, R., & Sommer, D. (2024). Less interaction with forward models in Langevin dynamics: Enrichment and Homotopy. SIADS, to appear.

Eigel, M., Gruhlke, R., Kirstein, M., Schillings, C., & Sommer, D. (2024). Generative Modelling with Tensor Train approximation of Hamilton-Jacobi-Bellman equations. arXiv preprint, arXiv:2402.15285.

Gruhlke, R., Miranda, C., Nouy, A., & Trunschke, P. (2024). Optimal Sampling for stochastic and natural gradient descent. arXiv preprint, arXiv:2402.03113.

Gruhlke, R., & Moser, D. (2024). Automatic differentiation within hierarchical tensor formats. (in preparation)

Gruhlke, R., Kaarnioja, V., & Schillings, C. (2024). Quasi-Monte Carlo meets kernel cubature. (in preparation)

Gruhlke, R., & Hertrich, J. (2024). Neural JKO Sampling with Importance Correction. (in preparation)

Gruhlke, R., & Resseguier, V. (2024). Diffusion models with multiplicative noise and rotational invariant distributions. (in preparation)

Berner, J., Gruhlke, R., Richter, L., & Sommer, D. (2024). Pathwise tensor train approximation of Hamilton-Jacobi-Bellman equations: A BSDE perspective. (in preparation)

Guth, P., Gruhlke, R., & Schillings, C. (2024). One-shot manifold learning for design of experiment. (in preparation)

Related Media

Homotopy enhanced Langevin dynamics

Covariance-preconditioned Langevin dynamics, also known as affine invariant Langevin dynamics (ALDI), combined with an additional homotopy enhancement provide a powerful tool for fast convergence to multimodal distributions. Here the homotopy is based on a convex combination of the log-target density and an auxiliary log-density, which introduces intermediate potentials in the Langevin dynamics that change over time. Different homotopy designs (dashed lines) show different convergence speeds with respect to the number of evaluations of the target potential. Fewer evaluations of the target potential are preferable, e.g. when each evaluation is expensive, as in Bayesian inference.
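The homotopy idea can be sketched in a few lines. The following is only an illustration under simplifying assumptions, not the method of the publication: it uses plain (unpreconditioned) overdamped Langevin dynamics in one dimension with an Euler-Maruyama discretization, so the covariance preconditioning of ALDI is omitted. The double-well target, the wide Gaussian auxiliary potential, the linear schedule and all function names are hypothetical choices.

```python
import math
import random

def grad_target(x):
    # Gradient of an assumed bimodal target potential V(x) = (x^2 - 4)^2 / 8,
    # whose density exp(-V) has modes near x = -2 and x = +2.
    return x * (x * x - 4.0) / 2.0

def grad_aux(x, sigma=3.0):
    # Gradient of the assumed auxiliary potential V0(x) = x^2 / (2 sigma^2).
    return x / (sigma * sigma)

def homotopy_langevin(n_particles=200, n_steps=2000, dt=0.01, ramp=0.5, seed=0):
    rng = random.Random(seed)
    # All particles start in the right-hand mode only.
    xs = [rng.gauss(2.0, 0.1) for _ in range(n_particles)]
    target_evals = 0
    for k in range(n_steps):
        # Hypothetical linear schedule: lam grows from 0 to 1 during the
        # first `ramp` fraction of the steps and stays at 1 afterwards.
        lam = min(1.0, k / (ramp * n_steps))
        for i in range(n_particles):
            x = xs[i]
            # Drift of the intermediate potential (1 - lam) * V0 + lam * V.
            g = (1.0 - lam) * grad_aux(x)
            if lam > 0.0:
                g += lam * grad_target(x)
                target_evals += 1
            xs[i] = x - dt * g + math.sqrt(2.0 * dt) * rng.gauss(0.0, 1.0)
    return xs, target_evals
```

Because the early intermediate potentials are close to the flat auxiliary one, particles can cross between the two wells before the barrier of the target builds up. The counter `target_evals` tracks the number of interactions with the target potential, the cost measure mentioned above.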

Generative modeling: From Gaussian to Multimodality

Density trajectory from a unimodal standard Gaussian to a non-symmetric multimodal density that is not a Gaussian mixture. The trajectory is defined through an Ornstein-Uhlenbeck process and its time-reversed counterpart. The drift term of the reverse process depends on the score, which is obtained by solving a Hamilton-Jacobi-Bellman equation. The latter results from a Hopf-Cole transformation of the Fokker-Planck equation associated with the forward Ornstein-Uhlenbeck process.
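As a worked sketch, the transformation can be written out for one particular normalization of the Ornstein-Uhlenbeck process. The choice of unit drift rate and diffusion coefficient sqrt(2) below is an assumption; the normalization is not stated here, and other choices change the constants but not the structure.

```latex
% Assumed forward Ornstein-Uhlenbeck process on R^d and its Fokker-Planck equation:
\mathrm{d}X_t = -X_t\,\mathrm{d}t + \sqrt{2}\,\mathrm{d}W_t, \qquad
\partial_t p_t = \nabla\cdot(x\,p_t) + \Delta p_t .

% Hopf-Cole transformation V_t := -\log p_t, i.e. p_t = e^{-V_t},
% yields a Hamilton-Jacobi-Bellman type equation:
\partial_t V_t = \Delta V_t - |\nabla V_t|^2 + x\cdot\nabla V_t - d .

% The score is \nabla \log p_t = -\nabla V_t, and the time-reversed
% process Y_s \sim p_{T-s} reads:
\mathrm{d}Y_s = \bigl(Y_s + 2\,\nabla\log p_{T-s}(Y_s)\bigr)\,\mathrm{d}s
              + \sqrt{2}\,\mathrm{d}\widetilde{W}_s .
```

As a consistency check, the stationary density of this process is the standard Gaussian with V(x) = |x|^2/2 + const, for which the right-hand side of the transformed equation vanishes.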

Optimal Sampling for alternating steepest descent on tensor networks

Many objective functions of minimization problems can be expressed as the expectation of a loss. A common solution strategy is to minimize a corresponding empirical mean estimate. Unfortunately, the deviation between the exact and the empirical minimizer then depends on the sample size. As an alternative, we empirically project the gradient of the exact objective function onto the tangent space. Descent is ensured by optimal weighted least squares approximation within an alternating minimization scheme.
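The building block, an empirical projection by optimally weighted least squares, can be illustrated on a fixed linear space. The sketch below is an assumption-laden toy example, not the project's algorithm: it projects a scalar function (rather than a gradient) onto the span of three Legendre polynomials on [-1, 1], samples from the density proportional to the sum of squared basis functions, and weights each sample by the inverse of that density. All function names are hypothetical.

```python
import math
import random

def basis(x):
    # Legendre polynomials, orthonormal w.r.t. the uniform probability measure on [-1, 1].
    return [1.0, math.sqrt(3.0) * x, math.sqrt(5.0) * (3.0 * x * x - 1.0) / 2.0]

def sample_optimal(n, rng):
    # Rejection sampling from the density k(x)/3 w.r.t. the uniform measure,
    # where k(x) = sum_j phi_j(x)^2 is bounded by 9 on [-1, 1].
    pts = []
    while len(pts) < n:
        x = rng.uniform(-1.0, 1.0)
        if rng.random() * 9.0 <= sum(p * p for p in basis(x)):
            pts.append(x)
    return pts

def solve(a, b):
    # Gaussian elimination with partial pivoting for a small dense system a c = b.
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def weighted_least_squares(f, n=60, seed=0):
    # Returns the coefficients of the empirical projection of f in the basis above.
    rng = random.Random(seed)
    pts = sample_optimal(n, rng)
    gram = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for x in pts:
        phi = basis(x)
        w = 3.0 / sum(p * p for p in phi)  # weight = inverse of the sampling density
        for i in range(3):
            rhs[i] += w * f(x) * phi[i] / n
            for j in range(3):
                gram[i][j] += w * phi[i] * phi[j] / n
    return solve(gram, rhs)
```

For a function that already lies in the space, such as `lambda x: x * x`, the projection reproduces its coefficients up to rounding, independently of the random sample.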

Automatic differentiation within hierarchical tensor formats

The main goal of this project is the minimization of objective functions defined on high-dimensional Euclidean tensor spaces. When the objective function allows for cheap evaluations on the Riemannian manifold of tensors given in hierarchical tree-based low-rank format, we construct an inexpensive approach to obtain Riemannian gradients based on automatic differentiation. Examples of this type include (empirical) regression and completion problems.

This approach in turn overcomes the curse of dimensionality that arises when Riemannian gradients are computed as projections of (intractable) Euclidean gradients onto the tangent space.

Low-rank tensor formats define a non-linear approximation class of tensors. In particular, they are multilinear in their component tensors and form a subclass of tensor networks: multigraphs with edge identities and additional dangling edges that represent the indices of the full tensor.

This type of topology allows for efficient (sub-)contractions, which are required for the local projections that determine the degrees of freedom in the Riemannian gradient.

Tensor network representation of the projections for the degrees of freedom associated with an interior node, as part of the overall Riemannian gradient.
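To illustrate why the network topology makes contractions cheap, here is a minimal sketch for the tensor train format, the special case of a tree-based format with a linear tree. It evaluates a single entry of the full tensor by a chain of small vector-matrix products instead of forming the full tensor. The nested-list data layout and the function names are hypothetical.

```python
def tt_entry(cores, index):
    # cores[k][i] is the r_{k-1} x r_k matrix (a list of rows) that belongs to
    # index value i in mode k; the boundary ranks are r_0 = r_d = 1.
    vec = [1.0]
    for core, i in zip(cores, index):
        mat = core[i]
        # Contract the running row vector with the selected matrix slice.
        vec = [sum(vec[a] * mat[a][b] for a in range(len(vec)))
               for b in range(len(mat[0]))]
    return vec[0]

def additive_tt(a, b, c):
    # Rank-2 tensor train representing T[i, j, k] = a[i] + b[j] + c[k].
    g1 = [[[ai, 1.0]] for ai in a]                 # slices of size 1 x 2
    g2 = [[[1.0, 0.0], [bj, 1.0]] for bj in b]     # slices of size 2 x 2
    g3 = [[[1.0], [ck]] for ck in c]               # slices of size 2 x 1
    return [g1, g2, g3]
```

For example, `tt_entry(additive_tt([1.0, 2.0], [10.0, 20.0, 30.0], [100.0, 200.0]), (1, 2, 0))` evaluates the entry 2 + 30 + 100. The cost of one evaluation grows linearly with the number of modes, whereas the number of entries of the full tensor grows exponentially.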

Quasi-Monte Carlo meets kernel cubature

The main goal of this project is to develop a kernel cubature technique based on the concept of optimal sampling. Optimal sampling can be used to define empirical projections onto linear spaces, with errors bounded, up to a constant, by the best-approximation error. While for general L2 functions such bounds hold in expectation, for functions in an RKHS they hold almost surely.

The latter case applies to the analysis of high-order Quasi-Monte Carlo methods, where the kernel of the RKHS is known. Hence, we refine the analysis of best approximation in these RKHSs and derive an almost surely convergent quadrature that yields optimal rates.
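A generic kernel cubature rule can be sketched as follows. This is a textbook construction under stated assumptions, not the optimal-sampling-based technique of the project: the nodes are simply given, the RKHS is assumed to be the periodic Sobolev (Korobov) space of smoothness 1 on [0, 1], and the weights solve the linear system formed by the kernel matrix and the kernel mean. All function names are hypothetical.

```python
import math

def kernel(x, y):
    # Assumed reproducing kernel K(x, y) = 1 + 2 pi^2 B_2({x - y}) with the
    # Bernoulli polynomial B_2(t) = t^2 - t + 1/6 and {.} the fractional part.
    t = (x - y) % 1.0
    return 1.0 + 2.0 * math.pi ** 2 * (t * t - t + 1.0 / 6.0)

def solve(a, b):
    # Gaussian elimination with partial pivoting for a small dense system a w = b.
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def cubature_weights(nodes):
    gram = [[kernel(x, y) for y in nodes] for x in nodes]
    # Kernel mean: the integral of K(x, .) over [0, 1] equals 1 for every x,
    # because B_2 integrates to zero over one period.
    mean = [1.0] * len(nodes)
    return solve(gram, mean)

def integrate(f, nodes, weights):
    return sum(w * f(x) for w, x in zip(weights, nodes))
```

By construction the rule integrates every kernel section `kernel(., x_i)` at a node exactly. For equispaced nodes the kernel matrix is circulant, so all weights coincide.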

Neural JKO Sampling with Importance Correction

The main goal is the numerical approximation of Wasserstein gradient flows, using the formalism of generalized minimizing movements, also known as the Jordan-Kinderlehrer-Otto (JKO) scheme. For this we first discretize the JKO scheme and then use continuous normalizing flows to approximate the proximal mapping with respect to the previously obtained distribution. However, Wasserstein gradient flows are known to behave poorly for multimodal distributions. Hence the resulting composition of transport maps is enriched by layers of rejection and resampling steps based on importance weights.
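For reference, one step of the JKO scheme can be written as a proximal step in the Wasserstein-2 metric. The notation below is an assumption made for illustration: a generic functional F is used, since the functional is not specified here (for sampling, a typical choice would be the Kullback-Leibler divergence to the target), and the map-based reformulation is a common one that presumes an absolutely continuous previous iterate.

```latex
% One JKO step with step size \tau > 0:
\mu_{k+1} \in \operatorname*{arg\,min}_{\mu \in \mathcal{P}_2(\mathbb{R}^d)}
  \Bigl\{ \mathcal{F}(\mu) + \frac{1}{2\tau}\, W_2^2(\mu, \mu_k) \Bigr\}.

% Reformulation over transport maps T, e.g. parametrized by a
% continuous normalizing flow, with \mu_{k+1} = (T_{k+1})_\# \mu_k:
T_{k+1} \in \operatorname*{arg\,min}_{T}
  \Bigl\{ \mathcal{F}(T_\# \mu_k)
        + \frac{1}{2\tau} \int |T(x) - x|^2 \,\mathrm{d}\mu_k(x) \Bigr\}.
```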

One-shot manifold learning for design of experiment

Common approaches to design of experiment tasks suffer from poor sample complexity due to an underlying nested sampling mechanism. In this project we avoid the nested sampling drawback by solving the associated optimization task with a one-shot method. To this end, we consider a model class that approximates the underlying physical model and solve the optimization task by updating the model class coefficients and the design parameter in parallel.

Diffusion models with multiplicative noise and rotational invariant distributions

In this project we extend the view of diffusion models for sample generation to SDEs with multiplicative noise. Here the multiplicative noise is defined through a linear combination of Brownian motions and skew-symmetric operators. SDEs of this type result from the spatial discretization of an SPDE that appears in the modeling of fluid dynamics. We analyze properties of the associated Fokker-Planck equations and show that any invariant measure must be rotationally invariant, in accordance with the underlying physical behavior.
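The role of skew-symmetry can be seen in a toy example. The sketch below rests on several assumptions that are not stated here: the Stratonovich interpretation of the noise, a single skew-symmetric generator in two dimensions, and no drift or additive noise. Under these assumptions the flow over a time step is a rotation by the Brownian increment, so the Euclidean norm is preserved along every path. The function names are hypothetical.

```python
import math
import random

def rotate(x, angle):
    # Action of exp(angle * A) for the skew-symmetric matrix A = [[0, -1], [1, 0]].
    c, s = math.cos(angle), math.sin(angle)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

def simulate(x0, n_steps=1000, dt=0.01, seed=0):
    # Assumed Stratonovich SDE dX = A X o dW. Its exact flow over one step is
    # the rotation exp(A * dW), so the scheme preserves the norm of X.
    rng = random.Random(seed)
    x = x0
    for _ in range(n_steps):
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)  # Brownian increment
        x = rotate(x, dw)
    return x
```

Since the noise only moves the state along circles around the origin, it cannot single out a direction, which is the mechanism behind the rotational invariance discussed above.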

Pathwise tensor train approximation of Hamilton-Jacobi-Bellman equations

In this project we define forward and reverse diffusion processes with drift terms defined through an unknown control function. Then, using techniques from stochastic optimal control, we aim to find the optimal control function, which is known a priori to be the (intractable) score function. To do so, we propose to learn the control via policy iteration on the path space by solving a backward stochastic differential equation (BSDE). In particular, we use manifold optimization with low-rank tensor formats to represent the control function on sample trajectories. This representation is computed by solving a regression problem.