AA3 – Next Generation Networks

Project

AA3-12

On the Expressivity of Neural Networks

Project Heads

Christian Haase, Martin Skutella

Project Members

Moritz Grillo

Project Duration

01.08.2022 − 31.07.2025

Located at

TU Berlin

Description

The project aims at a better understanding of the expressivity of neural networks (NNs). More precisely, we want to characterize the class of functions that can be represented by NNs with ReLU activations and a given architecture. In this context, it is of particular interest to derive upper and lower bounds on the size (width and depth) of ReLU NNs that exactly compute certain functions.

Proving lower bounds on the size of NNs

In the context of general representations of functions, we observe that the function computed by a ReLU NN is continuous and piecewise linear (CPWL), since it is a composition of affine transformations and the ReLU function. Conversely, it is known that any CPWL function can be represented by a ReLU NN whose depth is logarithmic in the input dimension. In [1] it has been conjectured that this logarithmic bound is tight, that is, that there exist CPWL functions which cannot be represented by ReLU NNs of smaller depth. The conjecture has been proven for input dimension n = 4 under a natural assumption, using techniques from mixed-integer optimization. Additionally, in [2] it has been shown that logarithmic depth is necessary to compute the maximum of n numbers when only integer weights are allowed. This result is based on the duality between neural networks and Newton polytopes through tropical geometry. One of the primary goals of this project is to prove or disprove the conjecture in full generality.
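
To illustrate the depth picture sketched above, the following minimal Python sketch (assuming only NumPy) computes the exact maximum of four numbers by a tournament of pairwise maxima, using the standard identity max(a, b) = a + ReLU(b − a); each round corresponds to one additional hidden ReLU layer, so the depth grows logarithmically in the number of inputs. This is a textbook construction, not a result of the project; the open conjecture concerns whether such logarithmic depth is also necessary.

    import numpy as np

    def relu(x):
        return np.maximum(x, 0.0)

    def pairwise_max(a, b):
        # max(a, b) = a + ReLU(b - a): one hidden ReLU layer with integer weights
        return a + relu(b - a)

    def max_of_four(x):
        # tournament of pairwise maxima: two rounds, i.e. depth logarithmic in n
        m01 = pairwise_max(x[0], x[1])
        m23 = pairwise_max(x[2], x[3])
        return pairwise_max(m01, m23)

    x = np.array([3.0, -1.5, 7.2, 0.4])
    assert np.isclose(max_of_four(x), x.max())  # exact computation, not an approximation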

Topological simplification with NNs

Furthermore, in the context of classification tasks, recent empirical studies have revealed that neural networks operate by changing topology, transforming a topologically complex dataset into a topologically simpler one as it passes through the layers. This process of topological simplification has been quantified using Betti numbers, which are algebraic invariants of topological spaces. In our work, we use the same measure to establish both lower and upper bounds on the topological simplification that a ReLU neural network with a given architecture can achieve. In doing so, we contribute to a deeper understanding of the expressivity of ReLU neural networks in the realm of binary classification problems, shedding light on their ability to capture the underlying topological structure of the data. In particular, the results show that deep ReLU neural networks are exponentially more powerful than shallow ones in terms of topological simplification [3].
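
As a small numerical illustration of this Betti-number viewpoint (and not of the constructions used in [3]), the sketch below estimates only the zeroth Betti number b_0, i.e. the number of connected components, of the positive decision region of a two-layer ReLU network on a discretized grid. The weights are random placeholders, and the grid-based count is merely a proxy for the exact topological invariants studied in the project.

    import numpy as np
    from collections import deque

    rng = np.random.default_rng(0)

    # a small two-layer ReLU network R^2 -> R with random placeholder weights
    W1, b1 = rng.standard_normal((8, 2)), rng.standard_normal(8)
    w2 = rng.standard_normal(8)

    def f(points):
        hidden = np.maximum(points @ W1.T + b1, 0.0)  # ReLU hidden layer
        return hidden @ w2                            # linear output layer

    # sign pattern of f on a grid over [-3, 3]^2
    n = 200
    xs = np.linspace(-3.0, 3.0, n)
    X, Y = np.meshgrid(xs, xs)
    positive = f(np.column_stack([X.ravel(), Y.ravel()])).reshape(n, n) > 0

    # flood fill over grid cells: count connected components (b_0) of the positive region
    seen = np.zeros_like(positive, dtype=bool)
    components = 0
    for i in range(n):
        for j in range(n):
            if positive[i, j] and not seen[i, j]:
                components += 1
                queue = deque([(i, j)])
                seen[i, j] = True
                while queue:
                    a, b = queue.popleft()
                    for u, v in ((a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)):
                        if 0 <= u < n and 0 <= v < n and positive[u, v] and not seen[u, v]:
                            seen[u, v] = True
                            queue.append((u, v))

    print("estimated b_0 of the positive decision region:", components)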

Deciding Basic Properties of Functions Computed by NNs

In view of safety-critical applications, the verification of trained networks is of great importance and necessitates a thorough understanding of essential properties of the function computed by a ReLU network, including characteristics like injectivity and surjectivity. Recently, Puthawala et al. [JMLR 2022] gave a characterization of injectivity of a ReLU layer, which yields an exponential-time algorithm; the exact computational complexity of deciding injectivity, however, remained open. We answer this question by proving coNP-completeness of deciding injectivity of a ReLU layer. On the positive side, as our main result, we present a parameterized algorithm which yields fixed-parameter tractability of the problem with respect to the input dimension. In addition, we characterize surjectivity for two-layer ReLU networks with one-dimensional output. Remarkably, the decision problem turns out to be the complement of a basic network verification task. We prove NP-hardness for surjectivity, implying a stronger hardness result than previously known for the network verification problem. Finally, we reveal interesting connections to computational convexity by formulating the surjectivity problem as a zonotope containment problem [4].
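
To give a flavor of this connection to computational convexity, the sketch below (assuming NumPy and SciPy) decides the much simpler question of whether a single point lies in a zonotope, by solving a linear feasibility problem. The surjectivity formulation in [4] leads to a genuine zonotope containment problem rather than this membership test, which is where the hardness enters.

    import numpy as np
    from scipy.optimize import linprog

    def point_in_zonotope(point, center, generators):
        # the zonotope is {center + generators @ lam : lam in [-1, 1]^m};
        # membership of `point` is a linear feasibility problem in lam
        m = generators.shape[1]
        res = linprog(
            c=np.zeros(m),                      # feasibility only, no objective
            A_eq=generators,
            b_eq=point - center,
            bounds=[(-1.0, 1.0)] * m,
            method="highs",
        )
        return res.status == 0                  # 0 = feasible/optimal, 2 = infeasible

    # toy zonotope in the plane with three generators (illustrative values)
    G = np.array([[1.0, 0.0, 0.5],
                  [0.0, 1.0, 0.5]])
    c = np.zeros(2)
    print(point_in_zonotope(np.array([1.2, 0.3]), c, G))  # True
    print(point_in_zonotope(np.array([3.0, 3.0]), c, G))  # False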

 

By pursuing these research directions, our project aims to advance the understanding of NN expressivity, encompassing exact function representation, topological analysis in classification tasks, and the computational complexity of deciding basic properties of the computed functions.

External Website

Related Publications

  • [1] Christoph Hertrich, Amitabh Basu, Marco Di Summa, and Martin Skutella. Towards lower bounds on the depth of ReLU neural networks. SIAM Journal on Discrete Mathematics, 37(2):997–1029, 2023.
  • [2] Christian Haase, Christoph Hertrich, and Georg Loho. Lower bounds on the depth of integral ReLU neural networks via lattice polytopes, 2023.
  • [3] Ekin Ergen and Moritz Grillo. Topological expressivity of ReLU neural networks. In Shipra Agrawal and Aaron Roth, editors, Proceedings of the Thirty Seventh Conference on Learning Theory, volume 247 of Proceedings of Machine Learning Research, pages 1599–1642. PMLR, 2024.
  • [4] Vincent Froese, Moritz Grillo, and Martin Skutella. Complexity of deciding injectivity and surjectivity of ReLU neural networks. arXiv:2405.19805, May 2024.

Related Pictures

A 3-layer ReLU neural network represented as an acyclic digraph
ReLU activation function
The preimage of a ReLU neural network F, where the light gray annuli are mapped to the negative numbers.