On-the-fly World Models for Quadrupedal Robots

Deep reinforcement learning (DRL) for quadrupedal robot control has recently become tractable: serviceable control policies can be obtained in simulation in under an hour. However, transferring learned policies from simulation to the real world is still an arduous task. We propose a novel two-agent setup that combines the strengths of model-free and model-based DRL by introducing a new agent alongside the model-free control policy. The new agent is tasked with regularizing differences between operating environments: it generates world models on the fly that act as regularizers over the control policy’s inputs. With this setup, we aim to obtain a reliable sim2real strategy that helps take quadrupedal control policies out of simulation and into the real world.

Faculty Supervisor:

Liam Paull

Student:

Partner:

Kyoto University

Discipline:

Computer science

Sector:

Education

University:

Université de Montréal

Program: