Project Heads
Christian Haase, Martin Skutella
Project Members
Moritz Grillo
Project Duration
01.08.2022 − 31.07.2025
Located at
TU Berlin
The project aims at a better understanding of the expressivity of neural networks (NNs). More precisely, we want to characterize the class of functions that can be represented by NNs with ReLU activations and a given architecture. In this context, it is of particular interest to derive upper and lower bounds on the size (width and depth) of ReLU NNs that exactly compute certain functions.
Proving lower bounds on the size of NNs
In the context of general representations of functions, we observe that the function computed by a ReLU NN is continuous and piecewise linear (CPWL), since it is a composition of affine transformations and the ReLU function. Conversely, it is known that any CPWL function can be represented by a ReLU NN of depth logarithmic in the input dimension. In [1] it has been conjectured that this logarithmic bound is tight; in other words, there exist CPWL functions that cannot be represented by any ReLU NN of sublogarithmic depth. This conjecture has been proven under a natural assumption for input dimension n = 4 using techniques from mixed-integer optimization. Additionally, in [2] it has been shown that logarithmic depth is necessary to compute the maximum of n numbers when only integer weights are allowed. This result is based on the duality between neural networks and Newton polytopes via tropical geometry. One of the primary goals of this project is to prove or disprove the conjecture in its full generality.
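The role of the maximum function in these depth bounds can be made concrete with the classical identity max(x, y) = y + ReLU(x − y), which realizes the maximum of two numbers with a single hidden ReLU unit; pairing results tournament-style then computes the maximum of n numbers in a number of rounds logarithmic in n. The following minimal sketch (an illustration, not the construction from [1] or [2]) traces this idea:

```python
def relu(z):
    # the ReLU activation: max(z, 0)
    return max(z, 0.0)

def max2(x, y):
    # exact identity: max(x, y) = y + relu(x - y),
    # realizable with a single hidden ReLU unit
    return y + relu(x - y)

def max_n(values):
    # tournament-style pairing: ceil(log2 n) rounds of max2,
    # mirroring the logarithmic-depth ReLU construction
    vals = list(values)
    while len(vals) > 1:
        paired = [max2(vals[i], vals[i + 1])
                  for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2 == 1:
            paired.append(vals[-1])
        vals = paired
    return vals[0]
```

Whether such maxima can be computed with fewer layers (with real weights) is exactly the open question behind the conjectured tightness of the logarithmic bound.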
Topological simplification with NNs
Furthermore, in the context of classification tasks, recent empirical studies have revealed that neural networks operate by changing topology, transforming a complex dataset into a topologically simpler one as it passes through the layers. This process of topological simplification has been quantified using Betti numbers, which are algebraic invariants of topological spaces. In our work, we use the same measure to establish both lower and upper bounds on the topological simplification that a ReLU neural network with a given architecture can achieve. In doing so, we contribute to a deeper understanding of the expressivity of ReLU neural networks for binary classification problems, shedding light on their ability to capture the underlying topological structure of the data. In particular, the results show that deep ReLU neural networks are exponentially more powerful than shallow ones in terms of topological simplification [3].
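To give intuition for the measure involved: the zeroth Betti number β₀ of a space counts its connected components, and for a finite point cloud at scale ε it can be read off the graph joining points at distance at most ε. The toy computation below (a self-contained illustration, not the method of [3]) counts components with a union-find structure:

```python
def betti0(points, eps):
    # zeroth Betti number of a point cloud at scale eps:
    # number of connected components of the graph that joins
    # any two points at Euclidean distance <= eps
    parent = list(range(len(points)))

    def find(i):
        # union-find root lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dist = sum((a - b) ** 2
                       for a, b in zip(points[i], points[j])) ** 0.5
            if dist <= eps:
                parent[find(i)] = find(j)

    return len({find(i) for i in range(len(points))})
```

A layer that merges the clusters of one class, for instance, decreases β₀ of that class; bounding how much such invariants can drop per layer is what yields the architecture-dependent bounds.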
Deciding Basic Properties of Functions Computed by NNs
In view of safety-critical applications, the verification of trained networks is of great importance and necessitates a thorough understanding of essential properties of the function computed by a ReLU network, including characteristics such as injectivity and surjectivity. Recently, Puthawala et al. [JMLR 2022] gave a characterization of injectivity of a ReLU layer, which implies an exponential-time algorithm. However, the exact computational complexity of deciding injectivity remained open. We answer this question by proving coNP-completeness of deciding injectivity of a ReLU layer. On the positive side, as our main result, we present a parameterized algorithm which yields fixed-parameter tractability of the problem with respect to the input dimension. In addition, we characterize surjectivity for two-layer ReLU networks with one-dimensional output. Remarkably, the decision problem turns out to be the complement of a basic network verification task. We prove NP-hardness of deciding surjectivity, implying a stronger hardness result than previously known for the network verification problem. Finally, we reveal interesting connections to computational convexity by formulating the surjectivity problem as a zonotope containment problem [4].
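Why injectivity of a ReLU layer is delicate can be seen directly: the map x ↦ ReLU(Wx) clips every coordinate that the affine map sends below zero, so two distinct inputs landing in the nonpositive orthant collide. The sketch below (an illustrative witness, not the decision algorithm from [4]) exhibits such a collision for the identity weight matrix:

```python
def relu_layer(W, x):
    # a ReLU layer: componentwise ReLU applied to the
    # matrix-vector product W x
    return [max(sum(w * xi for w, xi in zip(row, x)), 0.0)
            for row in W]

# identity weights in two dimensions
W = [[1.0, 0.0],
     [0.0, 1.0]]

# both inputs have all coordinates mapped to negative values,
# so the ReLU clips each to the zero vector: a collision
# witnessing non-injectivity of this layer
out1 = relu_layer(W, [-1.0, -2.0])
out2 = relu_layer(W, [-3.0, -0.5])
```

Certifying that no such pair of inputs exists for a given weight matrix is the coNP-complete problem; the fixed-parameter algorithm exploits that the number of relevant sign patterns is controlled by the input dimension.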
By pursuing these research directions, our project aims to advance our understanding of NN expressivity, encompassing function representation, topological analysis in classification tasks, and the computational complexity of deciding basic properties of the functions.