EF1 – Extracting dynamical Laws from Complex Data



Kernel Ensemble Kalman Filter and Inference

Project Heads

Péter Koltai, Nicolas Perkowski

Project Members

Ilja Klebanov

Project Duration

01.04.2021 − 31.12.2024

Located at

FU Berlin


We propose to combine recent advances in the computation of conditional (posterior probability) distributions via Hilbert space embeddings with the stochastic analysis of partially observed dynamical systems, exemplified by ensemble Kalman methods, in order to develop, analyse, and apply novel learning methods for strongly nonlinear, multimodal problems.

Given a hidden Markov model, the task of filtering refers to the inference of the current hidden state from all observations up to that time. One of the most prominent filtering techniques is the so-called ensemble Kalman filter (EnKF), which approximates the filtering distribution by an ensemble of particles in the Monte Carlo sense.
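
In symbols, and with notation assumed here purely for illustration (hidden states $x_t$, observations $y_t$, transition density $p(x_t \mid x_{t-1})$ and likelihood $p(y_t \mid x_t)$), the filtering distribution $p(x_t \mid y_{1:t})$ satisfies the standard two-step recursion:

```latex
\begin{align*}
p(x_t \mid y_{1:t-1}) &= \int p(x_t \mid x_{t-1}) \, p(x_{t-1} \mid y_{1:t-1}) \, \mathrm{d}x_{t-1}
  && \text{(prediction)} \\
p(x_t \mid y_{1:t}) &= \frac{p(y_t \mid x_t) \, p(x_t \mid y_{1:t-1})}{\int p(y_t \mid x) \, p(x \mid y_{1:t-1}) \, \mathrm{d}x}
  && \text{(update via Bayes' rule)}
\end{align*}
```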

While its prediction step is straightforward, the analysis or update step (i.e. the incorporation of the new observation via Bayes’ rule) is only a crude approximation based on the Gaussian conditioning formula. This formula is exact for Gaussian distributions and linear models but, in general, cannot be expected to reproduce the filtering distribution, even in the limit of large ensemble size.
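
One prediction and analysis cycle of a stochastic (perturbed-observation) EnKF can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes a linear observation operator `H` with Gaussian noise covariance `R`, and the names `enkf_forecast`, `enkf_analysis` and `dynamics` are hypothetical.

```python
import numpy as np


def enkf_forecast(ensemble, dynamics, rng):
    """Prediction step: push every particle through the (hypothetical) model."""
    return np.array([dynamics(x, rng) for x in ensemble])


def enkf_analysis(ensemble, y_obs, H, R, rng):
    """Analysis step via the Gaussian conditioning formula.

    ensemble: (N, d) forecast particles, y_obs: (m,) observation,
    H: (m, d) linear observation operator, R: (m, m) noise covariance.
    """
    n_particles = ensemble.shape[0]
    anomalies = ensemble - ensemble.mean(axis=0)
    C = anomalies.T @ anomalies / (n_particles - 1)   # empirical covariance (d, d)
    S = H @ C @ H.T + R                               # innovation covariance (m, m)
    K = np.linalg.solve(S, H @ C).T                   # Kalman gain C H^T S^{-1}, (d, m)
    # Perturb the observation independently for every particle.
    eps = rng.multivariate_normal(np.zeros(len(y_obs)), R, size=n_particles)
    innovations = y_obs + eps - ensemble @ H.T        # (N, m)
    return ensemble + innovations @ K.T


# Linear-Gaussian demo: prior N(0, 1), observation y = x + N(0, 1) noise, y = 2.
rng = np.random.default_rng(0)
prior = rng.normal(0.0, 1.0, size=(20000, 1))
H = np.array([[1.0]])
R = np.array([[1.0]])
posterior = enkf_analysis(prior, np.array([2.0]), H, R, rng)
print(posterior.mean(), posterior.var())
```

In this linear-Gaussian demo the exact posterior is N(1, 0.5), and the updated ensemble should approximate it up to Monte Carlo error. For non-Gaussian priors or nonlinear observation models the same update is only an approximation, which is the error discussed above.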

On the other hand, as we have found in our previous Math+ project (TrU-2), the Gaussian conditioning formula is exact for any random variables after embedding them into so-called reproducing kernel Hilbert spaces (RKHS), a methodology widely used by the machine learning community under the term “conditional mean embedding”.
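
To illustrate the idea, the following sketch uses the standard regularised, uncentred empirical estimator of the conditional mean embedding (kernel ridge regression weights), which is a common textbook variant and not necessarily the estimator studied in the project. The Gaussian kernel, the bandwidth, the regularisation parameter and all function names are assumptions chosen for the example.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Gram matrix k(a_i, b_j) for samples a: (n, d) and b: (q, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth**2))


def cme_weights(x_train, x_query, bandwidth=0.5, reg=1e-4):
    """Weights w(x) such that E[f(Y) | X = x] is approximated by sum_i w_i(x) f(y_i)."""
    n = x_train.shape[0]
    gram = gaussian_kernel(x_train, x_train, bandwidth)      # (n, n)
    k_query = gaussian_kernel(x_train, x_query, bandwidth)   # (n, q)
    return np.linalg.solve(gram + n * reg * np.eye(n), k_query)


def linear_conditioning(x_train, y_train, x_query):
    """Gaussian conditioning formula with empirical moments (scalar X and Y)."""
    cov_xy = np.cov(x_train[:, 0], y_train)[0, 1]
    slope = cov_xy / x_train[:, 0].var(ddof=1)
    return y_train.mean() + slope * (x_query[:, 0] - x_train[:, 0].mean())


# Nonlinear toy model (an assumption for this demo): Y = X^2 + small noise.
rng = np.random.default_rng(0)
x_train = rng.normal(0.0, 1.0, size=(1000, 1))
y_train = x_train[:, 0] ** 2 + 0.1 * rng.normal(size=1000)
x_query = np.array([[0.0], [1.5]])

weights = cme_weights(x_train, x_query)          # (1000, 2)
kernel_estimate = weights.T @ y_train            # estimate of E[Y | X = x]
linear_estimate = linear_conditioning(x_train, y_train, x_query)
print(kernel_estimate, linear_estimate)
```

Since X is symmetric, the covariance between X and X² vanishes, so the linear formula returns roughly the unconditional mean of Y for every query point, whereas the kernel-based estimate follows the true conditional mean x².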

Therefore, the question of how these two approaches can be combined arises quite naturally. The aim of this project is to eliminate the error introduced by the Gaussian approximation in the analysis step, which comes on top of the Monte Carlo error, by embedding the EnKF methodology into RKHSs. A further advantage of such an embedding is the potential to treat nonlinear state spaces such as curved manifolds or sets of images, graphs, or strings, for which the conventional EnKF cannot even be formulated.

Related Publications

  • I. Klebanov, I. Schuster, and T. J. Sullivan. A rigorous theory of conditional mean embeddings. SIAM J. Math. Data Sci., 2020.
  • M. Mollenhauer, S. Klus, C. Schütte, and P. Koltai. Kernel autocovariance operators of stationary processes: Estimation and convergence, 2020. arXiv:2004.00891.
  • I. Schuster, M. Mollenhauer, S. Klus, and K. Muandet. Kernel conditional density operators. In Proceedings of the 23rd AISTATS 2020, Proceedings of Machine Learning Research, 2020.
  • H. C. Yeong, R. T. Beeson, N. S. Namachchivaya, and N. Perkowski. Particle filters with nudging in multiscale chaotic systems: With application to the Lorenz ’96 atmospheric model. J. Nonlinear Sci., 30(4):1519-1552, 2020.
  • A. Bittracher, S. Klus, B. Hamzi, P. Koltai, and C. Schütte. Dimensionality reduction of complex metastable systems via kernel embeddings of transition manifolds, 2019. arXiv:1904.08622.
  • N. B. Kovachki and A. M. Stuart. Ensemble Kalman inversion: a derivative-free technique for machine learning tasks. Inverse Probl., 35(9):095005, 2019.
  • J. Diehl, M. Gubinelli, and N. Perkowski. The Kardar-Parisi-Zhang equation as scaling limit of weakly asymmetric interacting Brownian motions. Comm. Math. Phys., 354(2):549-589, 2017.
  • C. Schillings and A. M. Stuart. Analysis of the ensemble Kalman filter for inverse problems. SIAM J. Numer. Anal., 55(3):1264-1290, 2017.
  • O. G. Ernst, B. Sprungk, and H.-J. Starkloff. Analysis of the ensemble and polynomial chaos Kalman filters in Bayesian inverse problems. SIAM/ASA J. Uncertain. Quantif., 3(1):823-851, 2015.
  • K. Fukumizu, L. Song, and A. Gretton. Kernel Bayes’ rule: Bayesian inference with positive definite kernels. J. Mach. Learn. Res., 14(1):3753-3783, 2013.
  • P. Imkeller, N. S. Namachchivaya, N. Perkowski, and H. C. Yeong. Dimensional reduction in nonlinear filtering: A homogenization approach. Ann. Appl. Probab., 23(6):2290-2326, 2013.
  • M. Goldstein and D. Wooff. Bayes Linear Statistics: Theory and Methods. Wiley Series in Probability and Statistics. John Wiley & Sons, Ltd., Chichester, 2007.
  • G. Evensen. Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res. Oceans, 99(C5):10143-10162, 1994.

Related Pictures

While the framed formula for the conditional expectation is not valid for general random variables, it is always true for their versions embedded into an RKHS.
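
For reference, the Gaussian conditioning formula and its embedded counterpart can be written as follows. The notation is assumed here and not taken from the picture: $\mu$ denotes means, $C$ denotes (cross-)covariance operators, $\varphi$ and $\psi$ are the feature maps of the RKHSs for $X$ and $Y$, $\dagger$ is a pseudo-inverse and $*$ the adjoint. The embedded identity holds under technical assumptions of the kind treated in the publication by Klebanov, Schuster, and Sullivan listed above.

```latex
% Gaussian conditioning formula (exact for jointly Gaussian (X, Y)):
\mathbb{E}[Y \mid X = x] = \mu_Y + C_{YX} C_X^{-1} (x - \mu_X)

% Same structure after embedding X and Y into RKHSs:
\mathbb{E}[\psi(Y) \mid X = x]
  = \mu_{\psi(Y)}
  + \bigl( C_{\varphi(X)}^{\dagger} \, C_{\varphi(X)\psi(Y)} \bigr)^{*}
    \bigl( \varphi(x) - \mu_{\varphi(X)} \bigr)
```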