Computer vision researchers have been moving beyond simple image classification and tackling more complex tasks such as object localization, detection, and semantic segmentation. However, many of the proposed methods require large amounts of annotated data, such as segmentation masks, which are expensive and time-consuming to acquire. Moreover, those methods cannot segment object categories that were not present in the training set.
Few-shot segmentation alleviates both those problems by learning end-to-end to segment new object categories from few examples.
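One common family of few-shot segmentation methods is prototype-based: a class prototype is pooled from the support image's features under its mask, and query pixels are labeled by similarity to that prototype. The sketch below is a minimal, hypothetical illustration of that idea (the thresholded cosine similarity and the function names are assumptions, not the project's actual method):

```python
import numpy as np

def masked_average_pool(features, mask):
    """Class prototype: average of support features under the binary mask.
    features: (H, W, C) feature map; mask: (H, W) in {0, 1}."""
    weights = mask[..., None]
    return (features * weights).sum(axis=(0, 1)) / (mask.sum() + 1e-8)

def segment_query(query_features, prototype, threshold=0.5):
    """Label each query pixel by cosine similarity to the support prototype."""
    norms = np.linalg.norm(query_features, axis=-1) * np.linalg.norm(prototype)
    sims = (query_features @ prototype) / (norms + 1e-8)
    return (sims > threshold).astype(np.uint8)
```

In a trained system both feature maps would come from a shared backbone; here they are raw arrays purely to show the pooling-and-matching structure.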
Recommender systems (RS) are intended to be a personalized decision support tool, where decisions can take the form of products to buy (e.g., Amazon), movies to watch (e.g., Netflix), online news to read (e.g., Google News), or even individuals to screen for a medical condition (e.g., personalized medicine). For digital users, RS play an essential role, since the available content (and hence possible actions) grows exponentially.
In this project, we propose a continual learning approach to address the problem of forward transfer in complex reinforcement learning tasks. Concretely, we propose a model that learns how to combine a series of general modules in a deep learning architecture, so that generalization emerges through the composition of those modules. This is of vital importance for Element AI: it enables reusable solutions that scale with new data, improving overall performance without the need to learn a new model for every problem.
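One simple way to make module composition learnable is a soft, attention-weighted mixture: each task learns logits over a shared bank of modules, and the task's output is the weighted sum of module outputs. This is only a hedged sketch of the general idea; the function names and the softmax gating are assumptions, not the project's actual architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def compose(modules, logits, x):
    """Soft composition: an attention-weighted mix of module outputs.
    Training only the logits lets a new task select which reusable
    modules to combine, instead of learning a model from scratch."""
    weights = softmax(logits)                     # one weight per module
    outputs = np.stack([m(x) for m in modules])   # (n_modules, ...)
    return (weights[:, None] * outputs).sum(axis=0)
```

In practice the modules would be neural sub-networks and the gating could itself be input-dependent; the fixed logits here just show the compositional mechanism.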
One of the approaches portfolio managers commonly use to build portfolios is to rank the underlying assets based on predicted stock returns, as well as other aspects of the portfolio such as its risk. In this project, we aim to apply different deep learning techniques to the problem of stock ranking. The features we want to use to train our models are derived mainly from fundamental company data, including the quarterly and annual filings of publicly traded companies.
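Ranking models are often trained with a pairwise objective rather than by regressing returns directly: the model is penalized whenever it scores a lower-return stock above a higher-return one. A minimal sketch of such a loss (a RankNet-style logistic pairwise loss; an illustrative assumption, not necessarily the loss this project will use):

```python
import numpy as np

def pairwise_ranking_loss(scores, returns):
    """Average logistic loss over all ordered stock pairs: for every pair
    where stock i's realized return beats stock j's, penalize the model
    if it does not score i above j."""
    loss, n_pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if returns[i] > returns[j]:  # i should outrank j
                loss += np.log1p(np.exp(-(scores[i] - scores[j])))
                n_pairs += 1
    return loss / max(n_pairs, 1)
```

The O(n²) pair loop is for clarity; real implementations sample pairs or vectorize over a universe of stocks per rebalancing date.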
The inference of causal relationships is a problem of fundamental interest in science. Compared to models that rely on mere correlations, causal models allow us to anticipate the effect of a change in a system. Such causal models have applications ranging from government policy making to personalized medicine. However, learning causal models from data is a challenging task, since it requires large data sets and, in some cases, conducting costly or invasive experiments. In this project, we propose a new method to learn causal models using less data.
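The gap between correlation and causation can be made concrete with a tiny structural causal model. In the sketch below (an illustrative toy, not the proposed method), a confounder Z influences both X and Y, so a regression of Y on observational X overestimates X's causal effect, while intervening on X recovers it:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, do_x=None):
    """Toy SCM: Z -> X -> Y and Z -> Y. Passing do_x severs the Z -> X
    edge, modeling an intervention do(X = do_x). True effect of X on Y: 1.5."""
    z = rng.normal(size=n)
    x = 2.0 * z + rng.normal(size=n) if do_x is None else np.full(n, float(do_x))
    y = 1.5 * x + z + rng.normal(size=n)
    return x, y

x, y = simulate(100_000)
observational_slope = np.cov(x, y)[0, 1] / x.var()  # ~1.9: inflated by Z
causal_effect = (simulate(100_000, 1.0)[1].mean()
                 - simulate(100_000, 0.0)[1].mean())  # ~1.5: true effect
```

Learning such models from data means recovering the graph and mechanisms without knowing them in advance, which is precisely where large sample sizes or experiments are normally needed.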
Learning from relational data is crucial for modeling the processes found in many application domains, ranging from computational biology to social networks. In this project, we propose to develop new modeling techniques that combine the advantages of the approaches found in two fields of study: Machine Learning (through graph neural networks and transformer networks) and Statistical Learning (through statistical relational learning methods).
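The core operation of a graph neural network is message passing: each node updates its representation by combining its own features with an aggregate of its neighbours'. A minimal, hypothetical single layer (GraphSAGE-style mean aggregation; function names and the ReLU choice are assumptions for illustration):

```python
import numpy as np

def gnn_layer(adj, features, w_self, w_neigh):
    """One mean-aggregation message-passing layer: each node combines its
    own features with the mean of its neighbours' features, then applies
    a ReLU nonlinearity.
    adj: (N, N) binary adjacency; features: (N, C)."""
    deg = adj.sum(axis=1, keepdims=True)
    neigh_mean = adj @ features / np.maximum(deg, 1)
    return np.maximum(features @ w_self + neigh_mean @ w_neigh, 0.0)
```

Stacking several such layers lets information propagate along multi-hop relations, which is the structural prior that statistical relational learning methods express symbolically.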
Human perception has developed the ability to decompose scenes into fine-grained elements. This lays the foundation for strong generalization to new situations, where the base concepts can be recomposed to interpret objects never seen before. While it has been shown that, in the general case, proper decomposition is not possible, new paradigms provide provable decomposition in constrained environments. We hypothesize that the multiple sensory systems of human perception offer a strong signal for decomposing scenes properly.
Humans recognise objects in the world by leveraging multi-modal sensory inputs beyond visual ones (images and videos). Touch-based information (haptics) carries rich information about structure, shape, and other objectness properties. In this work, we will study and learn cross-modal representations between vision and touch. To connect the two modalities, we plan to introduce a zero-shot classification task: recognising unseen object categories from the ShapeNet dataset using haptic signals.
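Once vision and touch are embedded in a shared space, zero-shot classification can reduce to nearest-prototype matching: an unseen object's haptic embedding is assigned to the class whose vision-derived prototype is closest. A hedged sketch of that inference step (the function and its inputs are illustrative assumptions; the encoders producing the embeddings are omitted):

```python
import numpy as np

def zero_shot_classify(haptic_embedding, class_prototypes):
    """Assign a haptic query to the class whose (vision-derived) prototype
    is most similar under cosine similarity in the shared embedding space.
    class_prototypes: dict mapping class name -> embedding vector."""
    names = list(class_prototypes)
    protos = np.stack([class_prototypes[n] for n in names])
    protos = protos / np.linalg.norm(protos, axis=1, keepdims=True)
    h = haptic_embedding / np.linalg.norm(haptic_embedding)
    return names[int(np.argmax(protos @ h))]
```

Because the prototypes come from vision, classes never touched during training can still be recognised from touch alone, which is what makes the task zero-shot.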
We’d like to address the problem of 3D reconstruction from 2D images: developing a machine learning algorithm that takes a regular photo as input and generates a full 3D reconstruction of the photo’s contents. Such technology can be used creatively or to help the coming generation of robots better understand their surroundings.
This fundamental research project investigates semantic visual navigation tasks, such as asking a household robot to “go find my keys”. We seek to enhance the efficacy of repeated search tasks within the same environment by explicitly building, maintaining, and exploiting a map of locations that the robot has previously explored. We also seek to exploit prior location-to-location, object-within-location, and object-to-object relationships from similar environments (e.g. within a common cultural region) to improve semantic visual navigation in unseen environments.
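The "repeated search" idea can be illustrated with a toy semantic map that accumulates object sightings per explored location and ranks locations by how often the target was seen there. This is a hypothetical sketch of the data structure only (class and method names are assumptions); the real system would couple such priors with the robot's navigation policy:

```python
from collections import defaultdict

class SemanticMap:
    """Toy object-within-location memory: count sightings of each object
    at each explored location, then rank locations for a repeated search
    by how often the target object was observed there."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def record(self, location, obj):
        """Log one sighting of `obj` at `location`."""
        self.counts[location][obj] += 1

    def best_locations(self, obj):
        """Explored locations, most-frequent sightings of `obj` first."""
        return sorted(self.counts, key=lambda loc: -self.counts[loc][obj])
```

Counts are the simplest possible prior; a fuller version would smooth them with object-to-object and location-to-location statistics pooled from similar environments, as the project proposes.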