Gabriele Steidl (TU), Andrea Walther (HU)
01.04.2021 − 31.03.2024
Recently, it was pointed out that many activation functions appearing in neural networks are proximity operators. Based on these findings, we have concatenated proximity operators defined with respect to different norms within a so-called proximal neural network (PNN). If this network includes tight frame analysis or synthesis operators as linear operators, it is itself an averaged operator and consequently nonexpansive. In particular, our approach is naturally related to other methods for controlling the Lipschitz constant of neural networks, which provably increase the robustness against adversarial attacks. Moreover, with Lipschitz networks, vanishing or exploding gradients during training can be avoided, which increases the stability of the training. However, so far our PNNs are neither convolutional nor sparse. These attributes would make them much more useful in practice.
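The nonexpansiveness claim can be illustrated numerically. The following is a minimal sketch, not the project's implementation: it assumes layers of the form x -> T^T sigma(T x + b), where T has orthonormal columns (a Parseval frame analysis operator) and sigma is soft thresholding, the proximity operator of a scaled 1-norm. The function names, dimensions, and random weights are hypothetical choices made for illustration only.

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximity operator of lam * ||.||_1 (soft shrinkage activation)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def random_stiefel(n, d, rng):
    """Random n x d matrix with orthonormal columns (T.T @ T = I), n >= d."""
    q, _ = np.linalg.qr(rng.standard_normal((n, d)))
    return q

def pnn_layer(x, T, b, lam):
    """One layer x -> T^T sigma(T x + b) with sigma = soft thresholding."""
    return T.T @ soft_threshold(T @ x + b, lam)

# Hypothetical toy network: 3 layers, input dimension 8, hidden dimension 16.
rng = np.random.default_rng(0)
d, n, lam = 8, 16, 0.1
layers = [(random_stiefel(n, d, rng), 0.1 * rng.standard_normal(n))
          for _ in range(3)]

def pnn(x):
    """Concatenation of the layers above."""
    for T, b in layers:
        x = pnn_layer(x, T, b, lam)
    return x

# Empirical check: ||pnn(x) - pnn(y)|| / ||x - y|| should never exceed 1.
ratios = []
for _ in range(1000):
    x, y = rng.standard_normal(d), rng.standard_normal(d)
    ratios.append(np.linalg.norm(pnn(x) - pnn(y)) / np.linalg.norm(x - y))
max_ratio = max(ratios)
```

In this sketch every layer has Lipschitz constant at most 1 because T^T has spectral norm 1, soft thresholding is nonexpansive, and T is an isometry. The random test is only an empirical sanity check of that bound, not a proof.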
We aim to construct convolutional proximal neural networks with sparse filters, to analyze their behavior, and to develop stochastic minimization algorithms for their training. We want to apply them to solve various inverse problems within a plug-and-play setting, where we intend to give convergence guarantees for the corresponding algorithms.
To this end, we want to tackle the following tasks:
1. Modeling: We want to construct convolutional PNNs in two steps. First, we will consider arbitrary convolution filters. To this end, we will work on matrix algebras (circulant matrices, -algebra) which are subsets of the Stiefel manifold. Here, stochastic gradient descent algorithms working directly on these matrix algebras can be applied. Second, we want to restrict ourselves to sparse filters. Here, (nonsmooth) constraints may appear in the minimization problem for learning the network. We have to provide an appropriate model that solves the tasks but remains trainable.
2. Algorithm: The sparse convolutional model for learning PNNs described above requires constructing new algorithms or modifying existing ones. Fortunately, we have recently implemented an inertial SPRING algorithm, and we hope to adapt it to minimize the novel functional. Further, we want to consider variance-reduced estimators other than the currently used SARAH estimator. We intend to adapt algorithmic differentiation techniques for estimating the involved Lipschitz constants.
3. Applications in inverse problems: We want to apply our convolutional PNNs within Plug-and-Play methods to solve certain inverse problems in image processing. First, our approach can be used for denoising. The advantage is that the network does not have to be trained separately for each noise level, but only for a single one. Then we want to consider deblurring and inpainting problems. Besides the forward-backward Plug-and-Play framework, we will also deal with ADMM Plug-and-Play. The primal-dual algorithm of Chambolle and Pock may also be of interest in this direction. Using our Lipschitz networks, we hope to give convergence guarantees for the methods. For that, we have to adapt a parameter in a certain fixed point equation. To learn this parameter, which is related to the noise level, we want to apply one-shot optimization. Here, we have to address in particular the choice of the preconditioners within these algorithms so that convergence is ensured and the convergence speed is improved.
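The forward-backward Plug-and-Play idea from the third task can be sketched as follows for a one-dimensional inpainting problem. This is a rough illustration under stated assumptions, not the project's method: the trained convolutional PNN is replaced by a hypothetical stand-in denoiser (soft thresholding in an orthonormal DCT basis), and the test signal, mask, step size, and threshold are arbitrary choices. What the stand-in shares with a PNN is nonexpansiveness, which is the property the hoped-for convergence guarantees rely on.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix, built with NumPy only."""
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (j + 0.5) * k / n)
    C[0, :] /= np.sqrt(2.0)
    return C

def denoiser(x, T, lam):
    """Stand-in for a trained network (assumption): x -> T^T soft(T x).
    Nonexpansive because T is orthogonal and soft thresholding is a prox."""
    z = T @ x
    return T.T @ (np.sign(z) * np.maximum(np.abs(z) - lam, 0.0))

def pnp_forward_backward(y, mask, T, lam, step=1.0, iters=200):
    """Iterate x <- D(x - step * grad f(x)) with f(x) = 0.5 * ||mask * x - y||^2."""
    x = y.copy()
    residuals = []
    for _ in range(iters):
        grad = mask * (mask * x - y)       # gradient of the data term
        x_new = denoiser(x - step * grad, T, lam)
        residuals.append(np.linalg.norm(x_new - x))
        x = x_new
    return x, residuals

# Hypothetical test problem: a signal that is sparse in the DCT basis,
# observed with noise on roughly 70% of its samples.
rng = np.random.default_rng(0)
n = 128
T = dct_matrix(n)
coeffs = np.zeros(n)
coeffs[[2, 5, 11]] = [4.0, -3.0, 2.0]
clean = T.T @ coeffs
mask = (rng.random(n) < 0.7).astype(float)
y = mask * (clean + 0.05 * rng.standard_normal(n))

x_rec, residuals = pnp_forward_backward(y, mask, T, lam=0.05)
```

Since the gradient of the data term has Lipschitz constant 1 here, the step size 1 lies in the admissible range (0, 2), and both the gradient step and the stand-in denoiser are nonexpansive. Consequently, the fixed-point residuals stored in `residuals` cannot increase from one iteration to the next, which is a simple quantity to monitor when swapping in a learned denoiser.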
M. Hasannasab, J. Hertrich, S. Neumayer, G. Plonka, S. Setzer, and G. Steidl. Parseval proximal neural networks. The Journal of Fourier Analysis and its Applications, vol. 26, pp. 1–31, 2020.
J. Neumann, C. Schnörr, and G. Steidl. Combined SVM-based feature selection and classification. Machine Learning, vol. 61, pp. 129–150, 2005.
J. Hertrich, S. Neumayer, and G. Steidl. Convolutional proximal neural networks and plug-and-play algorithms. arXiv preprint arXiv:2011.02281, 2020.