Scaling Limits of Deep Neural Networks: Rough Analysis Meets Optimization
Project Heads
Christian Bayer, Peter K. Friz
Project Members
Florin Suciu
Project Duration
01.10.2025 – 30.09.2027
Located at
TU Berlin
Description
We investigate the training of deep neural networks, in particular the Transformer architecture, using rough analysis. We reveal continuous-time scaling regimes and stability properties that are essential for understanding convergence, and that capture the limiting behavior both at initialization and during training.