Luxembourg Research Visit

Deep Reinforcement Learning (DRL) for robotic grasping has been actively studied in recent years. Every DRL agent needs a reward function that tells it, as it interacts with its environment, how desirable or suitable the action it takes in each state is. The reward in DRL is usually formulated as a linear summation of reward components, which is inefficient for learning the priorities among multiple objectives. Hierarchical reward methods have been proposed to enable a robot to learn multi-objective tasks, such as achieving autonomy or performing human-like merging actions in driving. The reward hierarchies are formulated with logical or weighted connections. Logical connections are strict constraints: the higher level of the hierarchy must be learned before the lower level. Weighted connections are soft constraints: the higher and lower levels are learned together, each scaled by a weight. For this project, I propose replacing a single coded reward such as a linear summation with an extended hierarchical reward function that includes the gripper actions as a decision variable and can learn multi-objective tasks. In my work, these multi-objective tasks could be divided into approaching, attempting to grasp, grasping, and stabilizing.
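
To make the difference between the three reward formulations concrete, the following is a minimal sketch, not the project's actual reward design. It assumes each of the four phases yields a component reward in [0, 1], ordered from the highest level (approaching) to the lowest (stabilizing). The function names, the thresholds, and the geometric `decay` weighting are all hypothetical choices made for illustration.

```python
from typing import Sequence

# Phases ordered from highest to lowest level (assumed ordering).
PHASES = ("approach", "attempt", "grasp", "stabilize")


def linear_reward(components: Sequence[float], weights: Sequence[float]) -> float:
    """Flat baseline: a weighted linear summation of all components."""
    return sum(w * r for w, r in zip(weights, components))


def logical_hierarchy_reward(
    components: Sequence[float], thresholds: Sequence[float]
) -> float:
    """Strict (logical) connection, sketched as gating.

    A level contributes only if every higher level has reached its
    threshold; the first unmet level still contributes, then we stop.
    """
    total = 0.0
    for r, t in zip(components, thresholds):
        total += r
        if r < t:  # this level is not yet satisfied: block lower levels
            break
    return total


def weighted_hierarchy_reward(
    components: Sequence[float], decay: float = 0.5
) -> float:
    """Soft (weighted) connection, sketched as geometric weights.

    All levels contribute at once, but level i is scaled by decay**i,
    so higher levels dominate without fully blocking lower ones.
    """
    return sum((decay ** i) * r for i, r in enumerate(components))


# Example step: the approach is complete, but the grasp attempt is poor.
step = [1.0, 0.4, 0.9, 0.9]
flat = linear_reward(step, [1.0] * len(PHASES))
strict = logical_hierarchy_reward(step, [0.8] * len(PHASES))
soft = weighted_hierarchy_reward(step)
```

In this example the flat sum rewards the lower-level components even though the grasp attempt has not succeeded, the strict version ignores everything below the unmet level, and the soft version sits in between.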

Faculty Supervisor:

George Zheng Hong Zhu

Student:

Partner:

Université du Luxembourg

Discipline:

Engineering

Sector:

Education

University:

York University

Program: