AA5 – Variational Problems in Data-Driven Applications

Project

AA5-5 (was EF1-25)

Wasserstein Gradient Flows for Generalised Transport in Bayesian Inversion

Project Heads

Martin Eigel, Claudia Schillings, Gabriele Steidl

Project Members

Robert Gruhlke

Project Duration

01.01.2023 − 31.12.2024

Located at

FU Berlin

Description

Generalised Wasserstein gradient flows connect measure transport and interacting particle systems. The project combines the analysis of efficient numerical methods for gradient flows, the associated SDEs, and compressed functional approximations in the context of Bayesian inversion with parametric PDEs and image reconstruction tasks.

Related Publications

Eigel, M., Gruhlke, R., & Sommer, D. (2022). Less interaction with forward models in Langevin dynamics. arXiv preprint arXiv:2212.11528.

Eigel, M., Gruhlke, R., Kirstein, M., Schillings, C., & Sommer, D. (2023). Diffusion generative modelling by directly solving the Hamilton-Jacobi-Bellman equation of stochastic optimal control using tensor networks. (in preparation)

Gruhlke, R., & Moser, D. (2023). Automatic differentiation within hierarchical tensor formats. (in preparation)

Gruhlke, R., Miranda, C., Nouy, A., & Trunschke, P. (2023). Optimal sampling for alternating steepest descent on tensor networks. (in preparation)

Related Media

Homotopy-enhanced Langevin dynamics

Covariance-preconditioned Langevin dynamics, also known as affine-invariant Langevin dynamics (ALDI), combined with an additional homotopy enhancement provides a powerful tool for fast convergence to multimodal distributions. Here the homotopy is based on a convex combination of the log-target density and an auxiliary log-density, introducing intermediate potentials in the Langevin dynamics that change over time. Different homotopy designs (dashed lines) yield different convergence speeds with respect to the number of evaluations of the target potential. Fewer interactions with the latter are preferred, e.g. when evaluating the potential is expensive, as in Bayesian inference.
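As an illustration, the following minimal sketch runs unadjusted overdamped Langevin dynamics with a linear homotopy schedule. It omits the covariance preconditioning of ALDI for brevity, and the double-well target, Gaussian auxiliary density, step size and schedule are illustrative assumptions rather than the project's actual configuration.

```python
import numpy as np

def homotopy_langevin(grad_v_target, grad_v_aux, x0, n_steps=2000, dt=1e-2,
                      beta=lambda s: s, seed=0):
    """Unadjusted Langevin dynamics with a time-dependent homotopy potential.

    Intermediate potential: V_s = beta(s) * V_target + (1 - beta(s)) * V_aux,
    so the drift is the same convex combination of the two gradients.
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for k in range(n_steps):
        b = beta((k + 1) / n_steps)  # homotopy parameter, moves towards 1
        drift = -(b * grad_v_target(x) + (1.0 - b) * grad_v_aux(x))
        x += dt * drift + np.sqrt(2.0 * dt) * rng.standard_normal(x.shape)
    return x

# Example: bimodal double-well target, broad Gaussian auxiliary density.
grad_target = lambda x: 4.0 * x * (x**2 - 1.0)   # gradient of V(x) = (x^2 - 1)^2
grad_aux = lambda x: x / 4.0                     # gradient of the N(0, 4) potential
x0 = 2.0 * np.random.default_rng(1).standard_normal((500, 1))
particles = homotopy_langevin(grad_target, grad_aux, x0)
```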

Generative modelling: from Gaussian to multimodality

Density trajectory from a unimodal standard normal Gaussian to a non-symmetric multimodal density that is not of Gaussian-mixture type. The trajectory is defined through an Ornstein-Uhlenbeck process and its time-reversed counterpart. The drift term in the reverse process is defined via the score, which is obtained by solving the Hamilton-Jacobi-Bellman equation. The latter results from a Hopf-Cole transformation of the Fokker-Planck equation associated with the forward Ornstein-Uhlenbeck process.
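A minimal Euler-Maruyama sketch of this forward/reverse pair is given below. The score ∇ log p_t is assumed to be supplied externally; in the project it comes from the tensor-network solution of the HJB equation, while here the exact score of a standard normal serves as a stand-in, so that the reverse run reproduces N(0, 1).

```python
import numpy as np

def forward_ou(x0, T=2.0, n=500, seed=1):
    """Forward Ornstein-Uhlenbeck process dX = -X dt + sqrt(2) dW."""
    rng = np.random.default_rng(seed)
    dt = T / n
    x = x0.copy()
    for _ in range(n):
        x += -x * dt + np.sqrt(2.0 * dt) * rng.standard_normal(x.shape)
    return x

def reverse_ou(y0, score, T=2.0, n=500, seed=2):
    """Time-reversed counterpart dY = [Y + 2 * score(Y, t)] dt + sqrt(2) dW,
    integrated forward in the reversed time variable tau = T - t."""
    rng = np.random.default_rng(seed)
    dt = T / n
    y = y0.copy()
    for k in range(n):
        t = T - k * dt  # physical time of the forward process
        y += (y + 2.0 * score(y, t)) * dt \
             + np.sqrt(2.0 * dt) * rng.standard_normal(y.shape)
    return y

# Stand-in score of a standard normal: the reverse run then samples N(0, 1).
gauss_score = lambda y, t: -y
samples = reverse_ou(np.random.default_rng(3).standard_normal((1000, 1)), gauss_score)
```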

Optimal sampling for alternating steepest descent on tensor networks

Many objective functions in minimization problems can be expressed as the expectation of a loss. A common solution strategy is to minimize a corresponding empirical mean estimate. Unfortunately, the deviation between the exact and the empirical minimizer then depends on the sample size. As an alternative, we empirically project the gradient of the exact objective function onto the tangent space. Descent is ensured by optimal weighted least-squares approximation within an alternating minimization scheme.
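The optimal weighted least-squares building block can be sketched in its simplest setting, a one-dimensional orthonormal Legendre basis (in the spirit of Cohen and Migliorati), rather than on a tensor network: samples are drawn from the inverse-Christoffel density of the basis by rejection sampling and weighted by its reciprocal. All names and the target function are illustrative.

```python
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(4)
m = 6  # dimension of the approximation space

def basis(x):
    """Legendre basis, orthonormal w.r.t. the uniform measure on [-1, 1]."""
    B = np.stack([legendre.legval(x, np.eye(m)[j]) for j in range(m)], axis=-1)
    return B * np.sqrt(2.0 * np.arange(m) + 1.0)

def sample_optimal(n):
    """Rejection sampling from the optimal density rho = (1/m) * sum_j b_j^2,
    which is bounded by m for this basis."""
    xs = np.empty(0)
    while xs.size < n:
        x = rng.uniform(-1.0, 1.0, size=4 * n)
        keep = rng.uniform(0.0, m, size=4 * n) < (basis(x) ** 2).mean(axis=-1)
        xs = np.concatenate([xs, x[keep]])[:n]
    return xs

def weighted_lsq(f, n):
    """Weighted least-squares projection of f with weights w = 1 / rho, which
    keep the empirical Gramian well-conditioned in expectation."""
    x = sample_optimal(n)
    B = basis(x)
    sw = np.sqrt(1.0 / (B ** 2).mean(axis=-1))
    coeffs, *_ = np.linalg.lstsq(sw[:, None] * B, sw * f(x), rcond=None)
    return coeffs

coeffs = weighted_lsq(np.sin, n=200)  # degree-5 polynomial approximation of sin
```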

Automatic differentiation within hierarchical tensor formats

The main goal of this project is the minimization of objective functions defined on high-dimensional Euclidean tensor spaces. In the case that the objective function allows for cheap evaluations on the Riemannian manifold of tensors given in a hierarchical tree-based low-rank format, we construct a cheap approach for obtaining Riemannian gradients based on automatic differentiation. Examples of this type include (empirical) regression and completion problems.

This approach in turn overcomes the curse of dimensionality that arises when computing Riemannian gradients as projections of (intractable) Euclidean gradients onto the tangent space.
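The idea can be sketched on the simplest tree tensor network, a rank-r matrix X = U Vᵀ, for a completion loss that is cheap to evaluate on the observed entries. The core gradients below are written out by hand as a stand-in for automatic differentiation; up to the gauge conditions handled in the actual hierarchical format, they assemble into the Riemannian gradient without the full (here n-by-n) Euclidean gradient ever being formed. The dimensions and the quadratic loss are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
n, r, n_obs = 1000, 5, 5000

# Rank-r ground truth and a sparse set of observed entries (completion problem).
A = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
rows, cols = rng.integers(0, n, n_obs), rng.integers(0, n, n_obs)

U = rng.standard_normal((n, r))
V = rng.standard_normal((n, r))

def loss(U, V):
    """Empirical completion loss; only the observed entries of U @ V.T are
    evaluated, which is the 'cheap evaluation on the manifold'."""
    pred = np.einsum('ik,ik->i', U[rows], V[cols])
    return 0.5 * np.sum((pred - A[rows, cols]) ** 2)

def core_gradients(U, V):
    """Gradients of the loss w.r.t. the cores U and V, i.e. what automatic
    differentiation of the evaluation map returns. The full n-by-n Euclidean
    gradient is never formed."""
    res = np.einsum('ik,ik->i', U[rows], V[cols]) - A[rows, cols]
    gU, gV = np.zeros_like(U), np.zeros_like(V)
    np.add.at(gU, rows, res[:, None] * V[cols])
    np.add.at(gV, cols, res[:, None] * U[rows])
    return gU, gV

# One alternating descent step on the cores.
gU, gV = core_gradients(U, V)
U -= 1e-3 * gU
V -= 1e-3 * gV
```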

Low-rank tensor formats define a non-linear approximation class of tensors; in particular, they are multilinear and form a subclass of tensor networks: multigraphs with edge identities and additional dangling edges representing the indices of the full tensor.

This type of topology allows for the efficient (sub-)contractions required to define the local projections that determine the degrees of freedom in the Riemannian gradient.

Tensor network representation defining the projections for the degrees of freedom associated with an interior node, as part of the overall Riemannian gradient.