Goal-Conditioned Reinforcement Learning

The goal of this project is to improve the methodology behind goal-conditioned learning. As in traditional reinforcement learning, an agent interacts with an environment. However, instead of being trained to maximize return, the agent is given a rollout-specific goal and is trained to reach it by the end of the trajectory. This goal-conditioned paradigm is particularly promising for applications where the objective changes from episode to episode, for example controlling a robot or a drone across different tasks, or self-driving vehicles, where the destination might change between episodes. In this project, we will explore potential improvements within the goal-conditioned framework in both the discrete and continuous action-space settings.
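To make the setup concrete, the following is a minimal sketch of a goal-conditioned rollout, not the project's actual method. It assumes a toy one-dimensional environment with a discrete action space, a sparse reward that is nonzero only when the goal is reached, and a hand-written policy; the names `goal_reward`, `greedy_policy`, and `rollout` are hypothetical and chosen only for illustration.

```python
import random


def goal_reward(state, goal):
    """Sparse reward (an assumption): 1.0 only when the state equals the goal."""
    return 1.0 if state == goal else 0.0


def greedy_policy(state, goal):
    """Hypothetical goal-conditioned policy: take one step toward the goal."""
    if state < goal:
        return 1
    if state > goal:
        return -1
    return 0


def rollout(policy, start, goal, horizon, size=10):
    """Run one episode on a line of `size` cells.

    Returns the final state and whether the goal was reached at the
    end of the trajectory.
    """
    state = start
    for _ in range(horizon):
        action = policy(state, goal)  # the policy sees both state and goal
        state = min(max(state + action, 0), size - 1)  # stay inside the line
    return state, goal_reward(state, goal) == 1.0


if __name__ == "__main__":
    rng = random.Random(0)
    for episode in range(3):
        goal = rng.randrange(10)  # a new goal is sampled for every episode
        final_state, reached = rollout(greedy_policy, start=0, goal=goal, horizon=10)
        print(episode, goal, final_state, reached)
```

The key difference from a standard rollout is that the policy and the reward both take the goal as an input, so the same agent can be evaluated against a different target in each episode. In practice the hand-written policy would be replaced by a learned one.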

Faculty Supervisor:

Arvind Gupta

Student:

Panteha Naderian

Partner:

Layer 6 AI

Discipline:

Computer science

Sector:

Professional, scientific and technical services

University:

University of Toronto