Utilizing pre-trained vision models for object-centric reinforcement learning

Human perception relies heavily on decomposing a scene into smaller discrete entities (objects, or parts of objects). This decomposition enables complex reasoning and prediction that humans take for granted but that remain difficult to reproduce in artificial systems. This project explores the advantages of building this intuition about human perception into a Reinforcement Learning (RL) algorithm. We will do this by employing computer vision models trained to identify objects on large and diverse datasets, so that general knowledge of what an “object” is has already been learned and stored in their weights, and little to no additional training is required on our end. The segmentations these models produce will then be fed into an RL agent as its input. We hope this will improve the agent's ability to solve the environment it is in, be it a video game or a robot control task.
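To illustrate how a segmentation could be fed into an RL agent, here is a minimal sketch. It assumes the vision model has already produced an integer mask per frame (0 for background, one id per object); the function name `object_features`, the chosen features (normalized centroid and area fraction), and the `max_objects` parameter are hypothetical choices for illustration, not the project's actual design.

```python
import numpy as np


def object_features(mask: np.ndarray, max_objects: int = 4) -> np.ndarray:
    """Convert an integer segmentation mask (0 = background) into a
    fixed-size vector: per object, normalized centroid (y, x) and area fraction.

    Hypothetical helper: the feature choice is an assumption for illustration.
    """
    height, width = mask.shape
    features = np.zeros((max_objects, 3), dtype=np.float32)
    # Keep at most max_objects object ids, skipping the background id 0.
    object_ids = [i for i in np.unique(mask) if i != 0][:max_objects]
    for slot, object_id in enumerate(object_ids):
        ys, xs = np.nonzero(mask == object_id)
        features[slot] = (
            ys.mean() / height,    # normalized centroid row
            xs.mean() / width,     # normalized centroid column
            ys.size / mask.size,   # fraction of the frame the object covers
        )
    # Unused slots stay zero so the observation has a constant length.
    return features.ravel()


# Hypothetical 4x4 frame with two segmented objects (ids 1 and 2).
mask = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 2],
    [0, 0, 0, 2],
])
observation = object_features(mask)
```

The resulting `observation` is a flat vector of length `3 * max_objects` that could replace or complement raw pixels as the agent's input. Padding unused slots with zeros keeps the input size fixed when the number of detected objects changes between frames.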

Faculty Supervisor:

Igor Gilitschenski

Student:

Partner:

National Technical University of Ukraine

Discipline:

Computer science

Sector:

Artificial Intelligence

University:

University of Toronto

Program: