AA3 – Next Generation Networks

Project

AA3-12

On the Expressivity of Neural Networks

Project Heads

Christian Haase, Martin Skutella

Project Members

Moritz Grillo

Project Duration

01.08.2022 − 31.07.2025

Located at

TU Berlin

Description

The project aims at a better understanding of the expressivity of neural networks (NNs). More precisely, we want to characterize the class of functions that can be represented by NNs with ReLU activations and a given architecture. In this context, it is of particular interest to derive upper and lower bounds on the size (width and depth) of ReLU NNs that exactly compute certain functions.
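For reference, the function computed by such a network is an alternating composition of affine maps and the componentwise ReLU; the depth and width appearing below are exactly the size parameters we want to bound. A ReLU NN with k hidden layers of widths n_1, ..., n_k computes

    \[
      f \;=\; T_{k+1} \circ \sigma \circ T_k \circ \cdots \circ \sigma \circ T_1,
      \qquad \sigma(z)_j = \max(z_j, 0),
    \]

where the T_i are affine maps, the depth is the number k of hidden layers, and the width is max_i n_i.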

Proving lower bounds on the size of NNs

In the context of general representations of functions, we observe that the function computed by a ReLU NN is continuous and piecewise linear (CPWL), since it is a composition of affine transformations and the ReLU function. Conversely, it is known that every CPWL function can be represented by a ReLU NN whose depth is logarithmic in the input dimension. In [1] it has been conjectured that this logarithmic bound is tight, i.e., that there exist CPWL functions that cannot be represented by ReLU NNs of sublogarithmic depth. This conjecture has been proven, under a natural assumption, for input dimension n = 4 using techniques from mixed-integer optimization. Additionally, in [2] it has been shown that depth logarithmic in n is indeed necessary to compute the maximum of n numbers when only integer weights are allowed. This result is based on a duality between neural networks and Newton polytopes arising from tropical geometry. One of the primary goals of this project is to either prove or disprove the conjecture in its full generality.
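To make the logarithmic upper bound concrete, the following minimal NumPy sketch (an illustration of the standard construction, not of the techniques in [1, 2]) computes the maximum of n numbers with a binary tree of two-input max gadgets, i.e., with ceil(log2 n) rounds of ReLU layers; the conjecture asserts that this depth cannot be reduced substantially.

    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    def max2(a, b):
        # One hidden ReLU layer with three neurons computes max(a, b) exactly:
        # max(a, b) = ReLU(a - b) + ReLU(b) - ReLU(-b)
        h = relu(np.array([a - b, b, -b]))
        return h[0] + h[1] - h[2]

    def max_n(xs):
        # Binary tree of max2 gadgets: ceil(log2 n) rounds of hidden layers.
        xs = list(xs)
        while len(xs) > 1:
            paired = [max2(xs[i], xs[i + 1]) for i in range(0, len(xs) - 1, 2)]
            if len(xs) % 2 == 1:
                # An odd element is passed through; in an actual network this is
                # the identity gadget x = ReLU(x) - ReLU(-x).
                paired.append(xs[-1])
            xs = paired
        return xs[0]

    print(max_n([3.5, -1.0, 2.0, 7.25]))  # prints 7.25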

Topological simplification with NNs

Furthermore, in the context of classification tasks, recent empirical studies have revealed that neural networks operate by changing topology, transforming a complex dataset into a topologically simpler one as it passes through the layers. This process of topological simplification has been quantified using Betti numbers, which are algebraic invariants of topological spaces. In our work, we use the same measure to establish both lower and upper bounds on the topological simplification that a ReLU neural network with a given architecture can achieve. In doing so, we contribute to a deeper understanding of the expressivity of ReLU neural networks for binary classification problems, shedding light on their ability to capture the underlying topological structure of the data. While our current work establishes upper bounds for a subset of the relevant Betti numbers, we also aim to find effective upper bounds for the remaining Betti numbers in this context.
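As a small, self-contained illustration (chosen for this description, not taken from our results): the CPWL function max(|x|, |y|) is computable by a ReLU network with two hidden layers, and it maps the square annulus {1 <= max(|x|, |y|) <= 2}, which has Betti numbers b_0 = 1 and b_1 = 1, onto the interval [1, 2], which has b_0 = 1 and b_1 = 0; the loop is destroyed.

    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    def net(x, y):
        # Hidden layer 1: |x| = ReLU(x) + ReLU(-x), and likewise |y|.
        h1 = relu(np.array([x, -x, y, -y]))
        ax, ay = h1[0] + h1[1], h1[2] + h1[3]
        # Hidden layer 2: since ay >= 0, max(ax, ay) = ReLU(ax - ay) + ReLU(ay).
        h2 = relu(np.array([ax - ay, ay]))
        return h2[0] + h2[1]

    # Sample the square annulus and check that its image is (approximately) [1, 2].
    rng = np.random.default_rng(0)
    pts = rng.uniform(-2.0, 2.0, size=(10000, 2))
    ring = [(x, y) for x, y in pts if 1.0 <= max(abs(x), abs(y)) <= 2.0]
    vals = [net(x, y) for x, y in ring]
    print(min(vals), max(vals))  # close to 1.0 and 2.0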

Exact solutions in combinatorial optimization via NNs

Lastly, in the field of combinatorial optimization, neural networks can be viewed as models of real-valued computation. From this perspective, it is of great interest to discover neural networks that can solve or approximate combinatorial optimization problems. Previous work has shown the feasibility of implementing dynamic programming and approximation schemes for the knapsack problem and other classical problems in combinatorial optimization [3]. Additionally, it has been demonstrated that the Minimum Spanning Tree and Maximum Flow problems can be solved by polynomial-size NNs [4]. However, it remains unclear whether the existence of a polynomial-time algorithm for a combinatorial optimization problem implies the existence of a neural network capable of solving the same problem. The limited set of available operations and, in particular, the absence of branching, owing to the continuous nature of the function computed by an NN, prevent the direct implementation of classical algorithms for many combinatorial optimization problems. As a consequence, new algorithms need to be developed to effectively address these challenges.
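As a simplified sketch of this dynamic-programming viewpoint (in the spirit of [3, 4], but not their constructions), the following NumPy code unrolls the Bellman-Ford shortest-path recursion into rounds of min gadgets, min(a, b) = a - ReLU(a - b). Note that the graph structure is hard-wired into the architecture, so no data-dependent branching is needed.

    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    def min2(a, b):
        # One ReLU neuron plus a skip connection: min(a, b) = a - ReLU(a - b).
        return a - relu(a - b)

    def shortest_paths(n, edges, source):
        # Unrolled Bellman-Ford: n - 1 rounds, each a fixed layer of min gadgets.
        # The Python loops below only wire up the (fixed) architecture; all
        # arithmetic applied to the inputs is affine or ReLU.
        BIG = 1e6  # large constant standing in for "unreachable"
        d = [BIG] * n
        d[source] = 0.0
        for _ in range(n - 1):
            new_d = []
            for v in range(n):
                best = d[v]
                for (u, w, c) in edges:
                    if w == v:
                        best = min2(best, d[u] + c)
                new_d.append(best)
            d = new_d
        return d

    edges = [(0, 1, 2.0), (1, 2, 1.5), (0, 2, 5.0)]  # (tail, head, cost)
    print(shortest_paths(3, edges, 0))  # [0.0, 2.0, 3.5]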

By pursuing these research directions, our project aims to advance our understanding of NN expressivity, encompassing function representation, topological analysis in classification tasks, and the solution of combinatorial optimization problems.

Related Publications

  • [1] Christoph Hertrich, Amitabh Basu, Marco Di Summa, and Martin Skutella. Towards lower bounds on the depth of ReLU neural networks. In Advances in Neural Information Processing Systems, 2021. URL: https://openreview.net/forum?id=0OWwNh-4in1.
  • [2] Christian Haase, Christoph Hertrich, and Georg Loho. Lower bounds on the depth of integral ReLU neural networks via lattice polytopes, 2023.
  • [3] Christoph Hertrich and Martin Skutella. Provably good solutions to the knapsack problem via neural networks of bounded size. CoRR, abs/2005.14105, 2020. URL: https://arxiv.org/abs/2005.14105.
  • [4] Christoph Hertrich and Leon Sering. ReLU neural networks for exact maximum flow computation. CoRR, abs/2102.06635, 2021. URL: https://arxiv.org/abs/2102.06635.

Related Pictures

A 3-layer ReLU neural network represented as an acyclic digraph
The ReLU activation function
The preimage of a ReLU neural network F, where the light gray annuli are mapped to negative numbers.