Method Development Unit

Project

MDU-3

Scaling Limits of Deep Neural Networks: Rough Analysis Meets Optimization

Project Heads

Christian Bayer, Peter K. Friz

Project Members

Florin Suciu

Project Duration

01.10.2025 – 30.09.2027

Located at

TU Berlin

Description

We investigate the training of deep neural networks, in particular the Transformer architecture, using rough analysis. We reveal continuous-time scaling regimes and stability properties that are essential for understanding convergence, and we capture the limiting behavior both at initialization and during training.
