Understanding Empirical Risk Minimization via Information Theory

Deep neural networks (DNNs) are a class of machine learning models inspired by biological neural networks. Learning with deep neural networks has enjoyed huge empirical success in recent years across a wide variety of tasks. Recently, many researchers in the machine learning community have become interested in the generalization mystery: why do overparameterized DNNs perform well on previously unseen data, even though they have far more parameters than training samples? The information-theoretic approach to studying generalization is one framework for answering this question.
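One representative result in this line of work (a minimal statement following Xu and Raginsky, 2017; the notation below is ours, added only for illustration) bounds the expected generalization gap by the mutual information between the training sample and the learned weights:

% Expected generalization-gap bound via mutual information.
% Assumes the loss \ell(w, Z) is \sigma-sub-Gaussian for every w
% and that the training set S consists of n i.i.d. samples.
\[
  \bigl|\,\mathbb{E}\bigl[L_\mu(W) - L_S(W)\bigr]\bigr|
  \;\le\;
  \sqrt{\frac{2\sigma^{2}}{n}\, I(S; W)},
\]
where $L_\mu(W)$ is the population risk, $L_S(W)$ is the empirical risk on the sample $S$, and $I(S;W)$ is the mutual information between the training data and the weights returned by the learning algorithm. Intuitively, an algorithm that memorizes less about its training set is guaranteed to generalize better.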

Efficient Deep Learning Methods that Only Require Few Labeled Data

In this postdoc, we plan to focus on computer vision tasks where existing deep learning methods require large numbers of labeled samples to work well. Acquiring labeled samples is time-consuming and often impractical. We therefore investigate three classes of methods that alleviate the label-scarcity problem: active learning, weakly-supervised learning, and few-shot learning. In active learning, the goal is to select and label the most informative samples so as to maximize model performance while reducing labeling costs. In weakly-supervised learning, the goal is to train models from weak labels, i.e., annotations that are coarser or noisier than full labels but much cheaper to obtain. In few-shot learning, the goal is to recognize new concepts from only a handful of labeled examples.
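As a concrete illustration of the active-learning selection step, here is a minimal uncertainty-sampling sketch (Python/NumPy; the model is assumed to expose a scikit-learn-style predict_proba, and the names select_batch and unlabeled_pool are ours, purely for illustration):

import numpy as np

def select_batch(model, unlabeled_pool, batch_size=16):
    """Pick the unlabeled samples the model is least confident about,
    so an annotator can label them in the next round.

    unlabeled_pool is an (N, d) feature array; model.predict_proba
    returns an (N, num_classes) array of class probabilities."""
    probs = model.predict_proba(unlabeled_pool)
    confidence = probs.max(axis=1)   # probability of the top class
    # Lowest top-class probability = most uncertain = most informative.
    return np.argsort(confidence)[:batch_size]

The indices returned by select_batch would then be sent to annotators, the newly labeled samples added to the training set, and the model retrained; the loop repeats until the labeling budget is exhausted.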

Separating Syntax and Semantics for Semantic Parsing

The study of language disorders, theoretical linguistics, and neuroscience suggests that language competence involves two interacting systems, typically dubbed syntax and semantics. However, few state-of-the-art deep learning approaches for natural language processing explicitly model two distinct systems of representation. While achieving impressive performance on linguistic tasks, such models commonly fail to generalize systematically. For example, unlike humans, a model that learns a new verb like "jump" in isolation cannot reliably combine it with known words (as in "jump twice" or "jump and run").

Few-shot Generative Adversarial Networks

The most successful computer vision approaches are based on deep learning architectures, which typically require a large amount of labeled data that can be impractical or expensive to acquire. Few-shot learning techniques have therefore been proposed to learn new concepts from just one or a few annotated examples. However, unsupervised methods such as generative adversarial networks (GANs) still require huge amounts of training data. This project will therefore focus on few-shot learning for GANs.

Graph-based learning and inference: models and algorithms

Learning from relational data is crucial for modeling the processes found in many application domains, ranging from computational biology to social networks. In this project, we propose to develop modeling techniques that combine the advantages of the approaches found in two fields of study: Machine Learning (through graph neural networks) and Statistical Learning (through statistical relational learning methods).

Link Prediction on Knowledge Graphs with Graph Neural Networks

Knowledge graphs store facts as relations between pairs of entities. In this work, we address the question of link prediction in knowledge graphs. Our general approach broadly follows neighborhood-aggregation schemes such as that of Graph Convolutional Networks (GCNs), which were in turn motivated by spectral graph convolutions. Our proposed model will aggregate information from neighbouring entities and relations. Contrary to most existing knowledge graph completion methods, our model is expected to work in the inductive setting: predicting relations for entities not seen during training.
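To make the aggregation step concrete, here is a minimal sketch of one relational neighborhood-aggregation layer (plain NumPy; the function name, the translational head-plus-relation message, and the mean pooling are all our illustrative assumptions, not the project's final design):

import numpy as np

def relational_aggregate(entity_emb, relation_emb, triples, num_entities):
    """One aggregation step over a knowledge graph.

    entity_emb:   (num_entities, d) entity embeddings
    relation_emb: (num_relations, d) relation embeddings
    triples:      iterable of (head, relation, tail) index triples
    Each entity receives the mean of messages from its incoming
    (neighbour entity, relation) pairs."""
    d = entity_emb.shape[1]
    messages = np.zeros((num_entities, d))
    counts = np.zeros(num_entities)
    for head, rel, tail in triples:
        # Simple translational message: neighbour embedding + relation embedding.
        messages[tail] += entity_emb[head] + relation_emb[rel]
        counts[tail] += 1
    counts = np.maximum(counts, 1)      # entities with no neighbours keep zero vectors
    return messages / counts[:, None]   # mean over incoming messages

Because an entity's update depends only on its neighbours and the relations connecting them, replacing the learned entity embeddings with input features would in principle let the same layer be applied to entities unseen during training, which is what the inductive setting requires.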

Unsupervised Learning of 3D Scenes from Images using a View-based Representation

We would like to address the problem of 3D reconstruction from 2D images, that is, developing a machine learning algorithm that takes a regular photo as input and generates a full three-dimensional reconstruction of its contents. Such technology can be used for creative applications or to help the coming generation of robots better understand their surroundings.

An information-theoretic framework for understanding generalization in neural networks

Deep neural networks (DNNs) are a class of machine learning models inspired by biological neural networks. DNNs are general function approximators, which is why they can be applied to almost any machine learning problem; applications range from visual object recognition in computer vision to unsupervised text translation. DNNs are prone to overfitting because they usually have many more parameters than available training samples. Nevertheless, they usually achieve low error on test data.