Deep neural networks (DNNs) are a class of machine learning algorithms inspired by biological neural networks. Learning with deep neural networks has enjoyed huge empirical success in recent years across a wide variety of tasks. Lately, many researchers in the machine learning community have become interested in the generalization mystery: why do overparameterized DNNs perform well on previously unseen data, even though they have far more parameters than training samples? The information-theoretic approach to studying generalization is one of the frameworks proposed to answer this question.
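A representative result in this framework (due to Xu and Raginsky) bounds the expected generalization gap of a learning algorithm by the mutual information between its input sample and its output weights. As a sketch: if $S = (Z_1, \dots, Z_n)$ is the training sample, $W$ the learned weights, $L_\mu(W)$ and $L_S(W)$ the population and empirical risks, and the loss is $\sigma$-subgaussian, then

```latex
\left| \, \mathbb{E}\!\left[ L_\mu(W) - L_S(W) \right] \right|
\;\le\;
\sqrt{\frac{2\sigma^2}{n}\, I(S; W)}
```

Intuitively, an algorithm that extracts only a few bits of information from its training data cannot overfit it badly, regardless of how many parameters the model has.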
In this postdoc, we plan to focus on computer vision tasks where existing deep learning methods require large numbers of labeled samples to work well. Acquiring labeled samples is time-consuming and often impractical. We therefore investigate three classes of methods that alleviate the label-scarcity problem: active learning, weakly supervised learning, and few-shot learning. In active learning, the goal is to label the most informative samples so as to maximize model performance while reducing labeling costs. In weakly supervised learning, the goal is to train models using weak labels, e.g. coarse image-level tags rather than precise annotations. In few-shot learning, the goal is to learn new concepts from only a handful of labeled examples.
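As a concrete illustration of active learning, the sketch below implements one common acquisition strategy, uncertainty sampling: from a pool of unlabeled samples, pick the ones whose predicted class distribution has the highest entropy. The function name and toy probabilities are illustrative, not part of the project.

```python
import numpy as np

def uncertainty_sampling(probs, k):
    """Select the k pool samples whose predicted class distribution
    has the highest entropy, i.e. where the model is least certain."""
    # probs: (n_samples, n_classes) array of predicted class probabilities
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    # indices of the k most uncertain samples, to be sent for labeling
    return np.argsort(entropy)[-k:][::-1]

# Toy unlabeled pool: the model is confident on the first two samples
# and nearly uniform (uncertain) on the third.
pool_probs = np.array([
    [0.95, 0.03, 0.02],
    [0.10, 0.85, 0.05],
    [0.34, 0.33, 0.33],
])
print(uncertainty_sampling(pool_probs, 1))  # -> [2]
```

Entropy is only one of several acquisition functions; margin- or committee-based criteria are common alternatives.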
Studies of language disorders, theoretical linguistics, and neuroscience suggest that language competence involves two interacting systems, typically dubbed syntax and semantics. However, few state-of-the-art deep learning approaches to natural language processing explicitly model two different systems of representation. While achieving impressive performance on linguistic tasks, they commonly fail to generalize systematically. For example, unlike for humans, learning a new verb like "jump" in isolation is insufficient for models to combine it with known words (as in "jump twice" or "jump and run").
The most successful computer vision approaches are based on deep learning architectures, which typically require a large amount of labeled data that can be impractical or expensive to acquire. Few-shot learning techniques were therefore proposed to learn new concepts from just one or a few annotated examples. However, even unsupervised methods such as generative adversarial networks (GANs) still require huge amounts of data for training. This project will therefore focus on few-shot learning for GANs.
Learning from relational data is crucial for modeling the processes found in many application domains, ranging from computational biology to social networks. In this project, we propose to develop modeling techniques that combine the advantages of approaches from two fields of study: machine learning (through graph neural networks) and statistical learning (through statistical relational learning methods).
Knowledge graphs store facts as relations between pairs of entities. In this work, we address the problem of link prediction in knowledge graphs. Our general approach broadly follows neighbourhood aggregation schemes such as that of Graph Convolutional Networks (GCNs), which were in turn motivated by spectral graph convolutions. Our proposed model will aggregate information from neighbouring entities and relations. In contrast to most existing knowledge graph completion methods, our model is expected to work in the inductive setting: predicting relations for entities not seen during training.
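A minimal sketch of the kind of neighbourhood aggregation described above: each neighbour contributes a message built from its entity embedding and the embedding of the connecting relation, and the messages are mean-pooled. The message function (entity plus relation embedding) and the toy graph are illustrative assumptions, not the proposed model itself.

```python
import numpy as np

def aggregate_neighbours(entity_emb, relation_emb, neighbours):
    """Mean-aggregate messages from an entity's neighbourhood.
    Each neighbour is an (entity_id, relation_id) pair; a message is
    formed here by adding the relation embedding to the neighbour's
    entity embedding (one of many possible message functions)."""
    messages = [entity_emb[e] + relation_emb[r] for e, r in neighbours]
    return np.mean(messages, axis=0)

# Toy knowledge graph with 3 entities and 2 relation types,
# each embedded in 4 dimensions.
rng = np.random.default_rng(0)
entity_emb = rng.normal(size=(3, 4))    # one 4-d embedding per entity
relation_emb = rng.normal(size=(2, 4))  # one 4-d embedding per relation

# Neighbourhood of a target entity: entity 1 via relation 0,
# and entity 2 via relation 1.
h = aggregate_neighbours(entity_emb, relation_emb, [(1, 0), (2, 1)])
print(h.shape)  # (4,)
```

In the inductive setting, the entity inputs would come from node features rather than a fixed embedding table, so the same aggregation can be applied to entities unseen during training.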
We would like to address the problem of 3D reconstruction from 2D images: developing a machine learning algorithm that takes a regular photo as input and generates a full three-dimensional reconstruction of the photo's contents. Such technology can be used creatively, or to help the coming generation of robots better understand their surroundings.
DNNs are general-purpose function approximators, which is why they can be applied to almost any machine learning problem; applications range from visual object recognition in computer vision to machine translation in natural language processing. DNNs are prone to overfitting because they usually have many more parameters than available training data. Nevertheless, they usually achieve low error on test data.