Off-Policy Reinforcement Learning (RL) for a Production Robotics Application

Kindred offers eCommerce retailers a solution for rapid order fulfilment from their distribution centres. The solution (SORT) combines a so-called put-wall with a humanoid robot. The robot picks up items from orders, scans them, and places each item in a cubby of the put-wall according to its scan code. The robot comprises a gripper, a 6-degree-of-freedom arm, and a stereo vision module, as well as other electronics and mechanical housing. The proposed research will explore reinforcement learning techniques that feed data recorded from Kindred's production robots back into learning algorithms, in order to generate new, better ways for those robots to pick, scan, and stow eCommerce customers' orders.

Faculty Supervisor:

Florian Shkurti

Student:

Bryan Chan

Partner:

Kindred Systems Inc

Discipline:

Computer science

Sector:

University:

University of Toronto

Program: