Master's thesis presentation. Muhammad Waleed is advised by Dr. Felix Dietrich.

# Previous talks at the SCCS Colloquium

# Muhammad Waleed Bin Khalid: Efficient Kernel Flows Optimization for Neural Network Induced Gaussian Process Kernels

SCCS Colloquium

A ubiquitous task in data-driven learning is to use available information to construct accurate models that generalize to unseen inputs. Kernel-based learning methods are one such technique, but they rely on a good prior choice of kernel. Often, expert knowledge is required about which type of kernel suits the available data and which kernel hyper-parameters yield optimal performance. The Kernel Flows method is a learning technique that adapts the kernel to a given dataset without requiring as much expert knowledge. It rests on a simple rationale: if a kernel is good, the predictions should not change much when the amount of input data is reduced. The algorithm comes in two flavors: a parametric version, where the kernel hyper-parameters are adapted, and a non-parametric one, where the input data itself is transformed to suit the base kernel.
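
This rationale is usually made quantitative as follows (the notation is assumed here and follows the common formulation of Kernel Flows for kernel interpolation). Draw a random batch of training points with labels $y_b$, then a random half of that batch with labels $y_c$, and let $K_b$ and $K_c$ be the kernel matrices on the batch and on the half-batch. With $u_b$ and $u_c$ the corresponding kernel interpolants, the relative change measured in the norm induced by the kernel is

$$
\rho = \frac{\lVert u_b - u_c \rVert_K^2}{\lVert u_b \rVert_K^2} = 1 - \frac{y_c^\top K_c^{-1} y_c}{y_b^\top K_b^{-1} y_b}, \qquad 0 \le \rho \le 1 .
$$

A value of $\rho$ close to 0 means that discarding half of the batch barely changes the prediction, which is the sense in which the kernel is "good".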

Closely related is the Gaussian process, and more specifically the Gaussian process induced by a Neural Network. It can be shown that a fully connected (dense) Neural Network (NN) in the limit of infinite width, and a Convolutional Neural Network (CNN) in the limit of infinitely many filters, are each equivalent to a Gaussian process with a kernel that depends on the respective architecture. For such a network, the kernel is parameterized by the variances of the learnable parameters, i.e., the variance of the weights and the variance of the biases, hence only two numbers per layer of the original architecture.
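
To illustrate the "two numbers per layer" parameterization, here is a minimal sketch of such a kernel for a fully connected network. It assumes ReLU activations (the abstract does not fix an activation), the standard closed-form arc-cosine expression for the ReLU expectation, and a first layer scaled by the input dimension. The function name `nngp_relu_kernel` and its arguments are hypothetical and are not taken from the thesis.

```python
import numpy as np

def nngp_relu_kernel(X1, X2, weight_vars, bias_vars):
    """Kernel matrix of an infinitely wide dense ReLU network (sketch).

    weight_vars[l] and bias_vars[l] are the weight and bias variances of
    affine layer l, so the kernel has two hyper-parameters per layer.
    Assumes no input has zero variance (e.g. no all-zero row with zero bias).
    """
    d = X1.shape[1]
    sw, sb = weight_vars[0], bias_vars[0]
    # First affine layer: scaled inner product of the inputs.
    K12 = sb + sw * (X1 @ X2.T) / d
    K11 = sb + sw * np.sum(X1 ** 2, axis=1) / d  # k(x, x) for rows of X1
    K22 = sb + sw * np.sum(X2 ** 2, axis=1) / d  # k(x, x) for rows of X2
    # Each further affine layer is preceded by a ReLU nonlinearity.
    for sw, sb in zip(weight_vars[1:], bias_vars[1:]):
        norm = np.sqrt(np.outer(K11, K22))
        cos_t = np.clip(K12 / norm, -1.0, 1.0)
        theta = np.arccos(cos_t)
        # E[relu(u) relu(v)] for jointly Gaussian (u, v) with this covariance.
        relu_cov = norm * (np.sin(theta) + (np.pi - theta) * cos_t) / (2 * np.pi)
        K12 = sb + sw * relu_cov
        # On the diagonal theta = 0, so E[relu(u)^2] = k(x, x) / 2.
        K11 = sb + sw * K11 / 2
        K22 = sb + sw * K22 / 2
    return K12
```

For example, `nngp_relu_kernel(X, X, [2.0, 2.0, 2.0], [0.1, 0.1, 0.1])` evaluates the kernel of a network with three affine layers and two ReLU layers on the rows of `X`; these six variances are the only quantities a hyper-parameter optimizer would need to adjust.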

In this thesis, we will elaborate on both concepts and subsequently combine them by using the Kernel Flows algorithm to optimize Neural Network induced Gaussian process (NNGP) kernels for kernel ridge regression tasks. We will explore the parametric version of the Kernel Flows algorithm with NNGP kernels as the base kernels to be optimized. While the Kernel Flows algorithm can provide appreciable results with only a few data points, it is nevertheless computationally expensive when kernels from deep Neural Networks are involved. Hence, we will also provide efficient implementations of the proposed method, which lead to variants of the parametric Kernel Flows algorithm that use different optimization techniques from the one used originally. Finally, we will compare the results obtained with these optimized kernels, along with the computational cost of achieving them.
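
A minimal sketch of a parametric Kernel Flows loop is given below, to show where the computational cost arises: every step requires kernel matrices and linear solves on a random batch. It is not the implementation from the thesis. It assumes plain stochastic gradient descent, central finite differences as a stand-in for automatic differentiation, and a one-parameter RBF kernel as a stand-in for an NNGP kernel. All names (`rbf`, `rho`, `kernel_flows_fit`) and default values are hypothetical.

```python
import numpy as np

def rbf(X1, X2, log_params):
    # Stand-in kernel: one hyper-parameter (length scale), stored in log space.
    length_scale = np.exp(log_params[0])
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * length_scale ** 2))

def rho(log_params, kernel, Xb, yb, sub, nugget=1e-4):
    # Kernel Flows loss: 1 - (y_c^T K_c^-1 y_c) / (y_b^T K_b^-1 y_b),
    # where c is the sub-sample `sub` of the batch b.
    Kb = kernel(Xb, Xb, log_params) + nugget * np.eye(len(Xb))
    Kc = Kb[np.ix_(sub, sub)]
    num = yb[sub] @ np.linalg.solve(Kc, yb[sub])
    den = yb @ np.linalg.solve(Kb, yb)
    return 1.0 - num / den

def kernel_flows_fit(kernel, X, y, log_params, steps=200, batch=32,
                     lr=0.1, eps=1e-4, seed=0):
    rng = np.random.default_rng(seed)
    log_params = np.array(log_params, dtype=float)
    for _ in range(steps):
        # Random batch, then a random half of that batch.
        idx = rng.choice(len(X), size=batch, replace=False)
        sub = rng.choice(batch, size=batch // 2, replace=False)
        Xb, yb = X[idx], y[idx]
        # Central finite-difference gradient of rho (assumption: a real
        # implementation would more likely use automatic differentiation).
        grad = np.zeros_like(log_params)
        for i in range(len(log_params)):
            step = np.zeros_like(log_params)
            step[i] = eps
            grad[i] = (rho(log_params + step, kernel, Xb, yb, sub)
                       - rho(log_params - step, kernel, Xb, yb, sub)) / (2 * eps)
        log_params -= lr * grad
    return log_params
```

The per-step cost is dominated by the batch kernel matrix and the two linear solves, and the finite-difference gradient multiplies that cost by twice the number of hyper-parameters. With a deep network kernel, each kernel evaluation is itself a layer-by-layer recursion, so this overhead grows with depth.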

We will also explore the non-parametric version of the Kernel Flows algorithm. In particular, we will examine the problems of unnatural perturbations of data points and poor convergence highlighted in previous works, in order to understand why such abnormalities exist, propose solutions that aim to remedy them, and test these solutions.