EF1 – Extracting Dynamical Laws from Complex Data

Project

EF1-6

Graph Embedding for Analyzing the Microbiome

Project Heads

Tim Conrad, Stefan Klus, Gregoire Montavon

Project Members

Kateryna Melnyk (FU)

Project Duration

01.02.2019 – 31.01.2022

Located at

FU Berlin

Description

Only about 1 out of 10 cells in our body is actually a human cell. We are colonized by a diverse community of bacteria, archaea and viruses, jointly referred to as the microbiome (or microbiota). They have a strong influence on our health and are sometimes even called our second genome. Catalyzed by digitalization in the health-system and new high-throughput omics technologies, research on the human microbiome has grown exponentially in recent years and. has significantly increased our understanding about the role that these microbes play for our health and their connection to major diseases.

It has also been found that although the constitution of the microbiome is constantly changing throughout our lives (in response to environmental factors and other stimuli), a healthy human microbiome (i.e., a healthy community composition) can be considered as a meta-stable state lying in a minimum of some ecological stability landscape. Most studies aiming at understanding these dynamics, however, are focused on statistical constitution analysis, omitting more complex interactions.

The goal of this project is to develop new mathematical methods to allow a better understanding of complex systems modeled by time-evolving networks. With the aid of these methods, we will study microbiome interactions and the dynamic processes behind them.

We suggest two new approaches for analyzing time-evolving and high-dimensional graphs:

A deep-learning based approach for learning a graph embedding for which a classifier can be learned and that can be analyzed using classical time-series methods;
Using kernel-based methods for directly analyzing graph time-series data;

Overview of the project. (a) and (b): Using time-series data of an individual’s microbiome, we can create interaction networks that represent the microbial state (here: orange and green) at different time-points. Task 1 (c1): Find a function φ that embeds the given graphs in some low-dimensional space H where different states are separated by some hyper-plane ω. Task 2 (c2): Learn a graph kernel for time-series analysis. (d): Based on the low-dimensional embedding φ and the graph kernel with feature map ψ, dynamics of the system such as meta-stable states can be analyzed using e.g. Markov State Models.

We developed a kernel-based method – GraphKKE that is capable of learning embeddings of time-evolving graphs preserving temporal changes in a low-dimensional space. The method is based on the spectral analysis of transfer operators, such as the Perron–Frobenius or Koopman operator in a reproducing kernel Hilbert space.

Selected Publications

Eberle, O. and Büttner, J and Kräutli, F. and Müller, K.R. and Valleriani, M. and Montavon, G. Building and Interpreting Deep Similarity Models. IEEE Transactions on Pattern Analysis and Machine Intelligence, preprint (2020).
Schnake, T. and Eberle, O. and Lederer, J. and Nakajima, S. and Schütt, K.T and Müller, K.T. and Montavon, G. Higher-Order Explanations of Graph Neural Networks via Relevant Walks. CoRR abs/2006.03589 (2020).
Klus, S. and Nüske, F. and Peitz, S. and Niemann, J.H. and Clementi, C. Schütte, C. Data-driven approximation of the Koopman generator: Model reduction, system identification, and control. Physica D: Nonlinear Phenomena, 406:132416, (2020)
Klus, S. and Schuster, I. and Muandet, K.. Eigendecompositions of Transfer Operators in Reproducing Kernel Hilbert Spaces. Journal of Nonlinear Science, 30(1):283-315, (2020)
Rams, M. and Conrad, T.O.F. Dictionary Learning for transcriptomics data reveals type-specific gene modules in a multi-class setting. Information Technology (2020)
Kauffmann, J. and Esders, M. and Montavon, G. and Samek, W. and Müller, K.R.. From Clustering to Cluster Explanations via Neural Networks. CoRR abs/1906.07633 (2019)
Iravani, S. and Conrad, T.O.F. Deep Learning for Proteomics Data for Feature Selection and Classification. Lecture Notes in Computer Science, 11713 (2019)
Melnyk, K. and Klus, S. and Montavon, G. and Conrad, T.O.F. GraphKKE: Graph Kernel Koopman Embedding for Human Microbiome Analysis. Applied Network Science, 96(5), (2020).
Cvetkovic, N. and Conrad, T.O.F., and Lie, H.C. A convergent discretisation method for transition path theory for diffusion processes. SIAM Multiscale Modeling and Simulation, 19(1), (2021).
J.-H. Niemann, S. Klus, and C. Schu ̈tte. Data-driven model reduction of agent-based systems using the koopman generator. PLOS ONE, 16(5), 2021.
T. Schnake, O. Eberle, J. Lederer, S. Nakajima, K. T. Schutt, K.-R. Mueller, and G. Montavon. Higher-order explanations of graph neural networks via relevant walks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.

Selected Pictures

An overview of GraphKKE method.

(a) Learning transfer operators using graph kernels, where k(·, ·) is a graph kernel and K_k is the Koopman operator. (b) In the learned embedding space it is possible to detect metastable states and to determine distinct substructures.

Please insert any kind of pictures (photos, diagramms, simulations, graphics) related to the project in the above right field (Image with Text), by choosing the green plus image on top of the text editor. (You will be directed to the media library where you can add new files.)
(We need pictures for a lot of purposes in different contexts, like posters, scientific reports, flyers, website,…
Please upload pictures that might be just nice to look at, illustrate, explain or summarize your work.)

As Title in the above form please add a copyright.

And please give a short description of the picture and the context in the above textbox.

Don’t forget to press the “Save changes” button at the bottom of the box.

If you want to add more pictures, please use the “clone”-button at the right top of the above grey box.