Using few demonstration videos to improve RL agent’s one-shot performance

(1) Ocado Technology, a division of Ocado Group, specializes in AI-driven robotics and automated fulfillment solutions for online grocery retailers. The company develops machine learning models, robotic control systems, and computer vision technologies to improve warehouse automation. As a partner in this project, Ocado will provide mentorship, computational resources, proprietary datasets, and robotic simulation environments, supporting research into reinforcement learning (RL) for robotic manipulation.
(2) A key challenge for Ocado is reducing reliance on costly and complex data collection for training RL agents. Traditional Imitation Learning (IL) methods require large-scale expert demonstrations, limiting scalability. Additionally, RL-based robotic systems struggle with generalization, requiring extensive retraining. This project will explore whether a few low-overhead demonstration videos can improve RL efficiency, leveraging vision-language models (VLMs) and imitation learning to enhance one-shot learning.
(3) By improving RL efficiency, Ocado can accelerate the deployment of AI-driven robots while reducing training costs, dependence on manual labor, and operational expenses. This will make warehouse automation more capable and easier to scale. Beyond Ocado, the research contributes to AI-driven automation more broadly, benefiting industries such as manufacturing, logistics, and healthcare.

Faculty Supervisor:

Igor Gilitschenski

Student:

Partner:

Ocado Technology

Discipline:

Computer science

Sector:

Professional, scientific and technical services

University:

University of Toronto

Program: