Non-convex learning with stochastic algorithms

In recent years, deep learning has led to unprecedented advances in a wide range of applications, including natural language processing, reinforcement learning, and speech recognition. Despite abundant empirical evidence of the success of neural networks, the theoretical properties of deep learning remain poorly understood and are a subject of active investigation. One foundational aspect that has attracted considerable interest is the generalization behavior of neural networks, that is, the ability of a trained network to perform well on unseen data. A better understanding of generalization also has significant practical importance, as it can provide guidance and intuition for designing more effective deep learning algorithms. Our proposal has two primary objectives: (1) develop an algorithm to address the “generalization gap” problem in deep learning, namely how to reduce the loss in generalization performance that occurs when training with large batches; and (2) investigate non-vacuous generalization bounds in deep learning through the PAC-Bayes and uniform stability frameworks.

Mufan Bill Li
Faculty Supervisor: 
Daniel Roy; Murat Erdogdu
Partner University: