Hackathon on

Small Data Analysis

January 22-24, 2024

Young Academy – Organizers

Martin Hanik (TUB)

Luca Donati (ZIB)

Elodie Maignant (ZIB)

Johannes Zonker (ZIB)

Florian Beier (TUB)

Robert Beinert (TUB)


The semester is organized within the framework of the Berlin Mathematics Research Center MATH+ and supported by the Einstein Foundation Berlin. We are committed to fostering an atmosphere of respect, collegiality, and sensitivity. Please read our MATH+ Collegiality Statement.


The “Hackathon on Small Data Analysis” took place from the 22nd to the 24th of January 2024 at the Villa Engler, the former director’s mansion of the botanical garden in Berlin. The three-day event brought together an interdisciplinary group of 30 young scientists so that they could learn about knowledge-driven approaches for small data analysis, gain first practical experience, and catch a glimpse of exciting new research questions. Participants only had to bring a laptop and lots of motivation.


What is a Hackathon?

Random people coming together and writing some more or less useful code. In our case, we wanted to bring together young scientists from different research backgrounds who are interested in small data analysis. They worked in teams of up to six people. The Hackathon aimed to spark exciting future research collaborations. But we also had a lot of fun and good food.


The following was “expected” from the participants:

  • You are interested in imaging. You are not expected to be an expert in anything or even fully understand the project descriptions below. The Hackathon is about learning new things.
  • You have some experience in programming. No worries if you don’t know the difference between run-time and compile-time polymorphism in C++, but some basic coding experience is needed.
  • You have a laptop. We will provide high-performance computers, but you will need to bring your own laptop.


Below is the list of projects that were tackled during the hackathon.

Cardiac motion estimation using shape trajectories
This project used (dynamic) heart models to estimate the anatomical changes of the heart during the cardiac cycle. An approach was developed to estimate cardiac motion accurately from 3D cardiac MR acquisitions. The motion model was used for a statistical analysis of patient populations.

Geometric learning for quantitative analysis of stone tool reduction sequences
Refitted lithic debris is a promising source of information for understanding the variability in stone tool production, and thus cultural transmission, among prehistoric populations. This project aimed to explore and adapt geometric processing and analysis methods for refitted core reduction nodules. One target was to derive a classifier to infer the removal characteristics of lithic flakes. Another was to encode entire reduction sequences as points in a geometric space, thereby providing a foundation for downstream learning tasks such as cluster analysis.

Gromov–Wasserstein-Based Shape, Graph and Image Analysis
The Gromov–Wasserstein (GW) distance is an optimal transport-based metric that allows for embedding-free comparisons and matchings of gauged measure spaces such as 3D Euclidean shapes. This project aimed to explore the use of the GW distance for interpolation and classification tasks on small datasets including surface scans, images and, more generally, graphs. To suit the individual tasks, we employed linear and multi-marginal methods. To account for outliers, data labels and prior information, we sought to incorporate several generalizations such as unbalanced, fused and keypoint-guided GW.
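For orientation, one common textbook formulation of the GW distance between two metric measure spaces is sketched below. The notation ($X$, $Y$, $d_X$, $d_Y$, $\mu$, $\nu$, $\pi$) is introduced here for illustration and is not taken from the project material; some authors include an additional factor of 1/2, and the gauged setting replaces the metrics by more general gauge functions.

```latex
% Quadratic Gromov-Wasserstein distance between (X, d_X, \mu) and (Y, d_Y, \nu).
% \Pi(\mu, \nu) denotes the set of couplings, i.e. probability measures
% on X \times Y with marginals \mu and \nu.
\[
  \mathrm{GW}_2(\mu, \nu)^2
  = \inf_{\pi \in \Pi(\mu, \nu)}
    \int_{X \times Y} \int_{X \times Y}
      \bigl| d_X(x, x') - d_Y(y, y') \bigr|^2
    \, \mathrm{d}\pi(x, y) \, \mathrm{d}\pi(x', y')
\]
```

Only distances within each space enter the objective, which is why no common embedding of the two spaces is needed.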

Learning Molecular Dynamics using neural networks
Molecular Dynamics simulations generate the time evolution of the atomic positions of a molecular system. However, the events of interest from a chemical-biological point of view, namely the transitions between metastable states, are only rarely observed, even in very long trajectories. This project aimed to develop new methodologies to determine such events and estimate kinetic physical properties using neural networks.

Hunter-Gatherer Dynamics through Agent-based Modeling
In this project, we discussed the basics of agent-based modeling with Markov processes and how to apply the concepts in the context of hunter-gatherer societies. A running environment for Matlab, Python, or a similar language was required, and some familiarity with the chosen language was recommended. We built the models and visualization scripts from scratch; environmental datasets were provided.
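To give a flavor of what agent-based modeling with a Markov process can look like, here is a minimal sketch in Python. It is not the model built during the hackathon: the ring of patches, the function name `simulate`, and the parameters `resources`, `stay_prob`, and `seed` are all assumptions made for illustration. Each agent's next patch depends only on its current patch, which is what makes the dynamics Markovian.

```python
import numpy as np


def simulate(resources, n_agents=20, n_steps=50, stay_prob=0.5, seed=0):
    """Toy discrete-time Markov agent model on a ring of patches (hypothetical).

    Each agent stays on its patch with probability `stay_prob`. Otherwise it
    moves to its left or right neighbour, chosen with probability proportional
    to that neighbour's resource level. All resource levels must be positive.

    Returns an integer array of shape (n_steps + 1, n_patches) holding the
    number of agents on each patch at each time step.
    """
    rng = np.random.default_rng(seed)
    resources = np.asarray(resources, dtype=float)
    n_patches = len(resources)

    # Random initial placement of the agents.
    positions = rng.integers(0, n_patches, size=n_agents)
    counts = np.zeros((n_steps + 1, n_patches), dtype=int)
    counts[0] = np.bincount(positions, minlength=n_patches)

    for t in range(1, n_steps + 1):
        left = (positions - 1) % n_patches
        right = (positions + 1) % n_patches
        # Probability of picking the right neighbour, given that the agent moves.
        p_right = resources[right] / (resources[left] + resources[right])
        moves = rng.random(n_agents) >= stay_prob
        goes_right = rng.random(n_agents) < p_right
        target = np.where(goes_right, right, left)
        positions = np.where(moves, target, positions)
        counts[t] = np.bincount(positions, minlength=n_patches)

    return counts


# Example: five patches, the middle one is richer in resources.
counts = simulate([1.0, 1.0, 5.0, 1.0, 1.0])
print(counts[-1])  # agents per patch after the last step
```

A real model would replace the toy resource vector with the environmental datasets and add births, deaths, and resource depletion; the sketch only shows the basic transition step.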


Check out the GitHub page of the event for the results of the projects, including the code that was written over the three days.



Below are more images from the hackathon.


The Hackathon took place in Villa Engler – the former director’s mansion of the botanical garden in Berlin.

The address is: Altensteinstraße 2, 14195 Berlin.