EF1 – Extracting Dynamical Laws from Complex Data

Project

EF1-5

On robustness of deep neural networks

Project Heads

Christian Bayer, Peter Friz

Project Members

Nikolas Tapia (TU / WIAS)

Project Duration

01.01.2019 – 31.12.2020

Located at

TU Berlin / WIAS

Description

Deep residual neural networks [He, Zhang, Ren, Sun 2016] are an important recent class of deep neural networks. Their incremental nature invites an interpretation as the Euler discretization of a differential equation [Haber and Ruthotto 2017]. We suggest a far-reaching generalization using signatures and rough path analysis. In particular, we develop a new discrete rough path framework geared at difference equations, which allows us to obtain tight stability estimates for the output of a residual neural network in terms of the weight matrices.
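The Euler-discretization viewpoint can be illustrated with a minimal sketch (all names and sizes here are hypothetical, not part of the project's code): a stack of residual blocks x_{k+1} = x_k + f(W_k, x_k) is precisely the explicit Euler scheme, with unit step size, for the controlled ODE dx/dt = f(W(t), x).

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def resnet_forward(x0, weights):
    """Toy residual network: x_{k+1} = x_k + ReLU(W_k x_k).

    Each layer is one explicit Euler step (step size 1) of the
    ODE dx/dt = ReLU(W(t) x), with the skip connection carrying x_k.
    """
    x = x0
    for W in weights:
        x = x + relu(W @ x)  # skip connection + residual branch
    return x

# Hypothetical small example: 10 layers acting on a 4-dimensional feature.
rng = np.random.default_rng(0)
d = 4
weights = [0.1 * rng.standard_normal((d, d)) for _ in range(10)]
x_out = resnet_forward(np.ones(d), weights)
```

The stability question studied in the project then becomes: how does x_out change when the sequence of weights (viewed as a discrete path) is perturbed?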

Project Webpages

Selected Publications

  1. P. K. Friz, M. Hairer. A Course on Rough Paths. 2nd ed., Universitext, Springer (2020).
  2. C. Bellingeri, A. Djurdjevac, P. K. Friz, N. Tapia. Transport and continuity equations with (very) rough noise (2020). arXiv:2002.10432 [math.AP].
  3. J. Diehl, K. Ebrahimi-Fard, N. Tapia. Time-warping invariants of multidimensional time series. Acta Appl. Math. (2020). doi:10.1007/s10440-020-00333-x.
  4. E. Celledoni, P. E. Lystad, N. Tapia. Signatures in Shape Analysis: an Efficient Approach to Motion Identification. In Geometric Science of Information (GSI 2019), Lecture Notes in Computer Science vol. 11712, F. Nielsen, F. Barbaresco, eds. (2019). doi:10.1007/978-3-030-26980-7_3.
  5. N. Tapia, L. Zambotti. The geometry of the space of branched rough paths. Proc. London Math. Soc. 121, no. 2 (2020), pp. 220–251. doi:10.1112/plms.12311.
  6. C. Bayer, P. K. Friz, N. Tapia. Robustness of Residual Networks via Rough Path Techniques (2020). In preparation.

Selected Pictures

Evolution of a single weight and feature
The plot shows a weight trajectory (yellow) and the associated feature value across the network, measured at each skip connection. The feature evolves under the ReLU activation.

The weights are taken from an actual trained network from He et al.

Bound on the output size of a selected feature
The picture shows the size of a selected feature at the output layer (yellow) versus the analytical bound (blue).
Since the trained weights turn out to have high 1-variation, our analysis yields sharper control for larger values of p.
Bound on the output size of a selected feature
Example of output from a network with smoothed-out weights, chosen to have small 1-variation.

The picture shows that even in this case, choosing p > 1 can improve the a priori bounds.

Bound on the relative output size for different inputs
The picture shows the relative difference between two output values of the same feature, for a fixed set of weights.

One observes that, due to the high variability of the trained weights, the a priori bound on the deviation is sharper when we are allowed to choose p > 1.
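The role of p-variation in these bounds can be sketched numerically (a hypothetical illustration, not the project's code): for a discrete weight path, the p-variation along the time grid, (Σ_k |x_{k+1} − x_k|^p)^{1/p}, shrinks as p grows when the path is rough, which is why bounds in terms of p-variation with p > 1 can beat 1-variation bounds for trained weights.

```python
import numpy as np

def grid_p_variation(path, p):
    """p-variation of a discrete path along its own time grid:
    ( sum_k |x_{k+1} - x_k|^p )^(1/p)."""
    increments = np.abs(np.diff(path))
    return np.sum(increments ** p) ** (1.0 / p)

# A rough, oscillating trajectory (stand-in for one trained weight
# tracked across layers): a random walk with increments of size 0.1.
rng = np.random.default_rng(1)
w = np.cumsum(rng.choice([-1.0, 1.0], size=100)) * 0.1

v1 = grid_p_variation(w, 1)  # 1-variation: 99 * 0.1 = 9.9
v2 = grid_p_variation(w, 2)  # 2-variation: sqrt(99 * 0.01) ≈ 0.995
```

For this rough path the 2-variation is an order of magnitude smaller than the 1-variation, matching the observation above that the a priori bounds improve when p > 1 is allowed.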
