Related projects
Discover more projects across a range of sectors and disciplines, from AI to cleantech to social innovation.
Deep Reinforcement Learning (DRL) for robotic grasping has been actively studied in recent years. A DRL agent needs a reward function to interact with its environment and to judge how desirable or suitable the actions it takes in each state are. The reward in DRL is usually formulated as a linear summation of reward components, which is inefficient for learning multi-objective priorities. Hierarchical reward methods have been proposed to enable a robot to learn multi-objective tasks such as achieving autonomy or human-like merging actions for driving. The reward hierarchies are formulated with logical or weighted connections. Logical connections are strict constraints, where the higher-level hierarchy must be learned before the lower-level hierarchy. Weighted connections are soft constraints, where the higher-level and lower-level hierarchies are learned together under a weighting. For this project, what I have in mind is to replace a single coded reward such as a linear summation with an extended hierarchical reward function that includes gripper actions as a decision variable and can learn multi-objective tasks. In my work, these multi-objective tasks could be divided into four phases: approaching, attempting to grasp, grasping, and stabilizing.
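The difference between the three reward formulations described above can be illustrated with a minimal Python sketch. It is not the project's actual reward function: the function names, the per-phase scores in [0, 1], the `threshold` and `coupling` parameters, and the choice to rank the phases from approaching (highest level) to stabilizing (lowest level) are all assumptions made for illustration.

```python
from typing import Sequence

# Hypothetical phase order, highest-level objective first (an assumption).
PHASES = ("approach", "attempt_grasp", "grasp", "stabilize")


def linear_reward(components: Sequence[float], weights: Sequence[float]) -> float:
    """Flat baseline: weighted sum of all reward components."""
    return sum(w * r for w, r in zip(weights, components))


def logical_hierarchy_reward(components: Sequence[float], threshold: float = 0.8) -> float:
    """Strict constraint: a lower level contributes only if every
    higher level has reached the (assumed) satisfaction threshold."""
    total = 0.0
    for r in components:
        total += r
        if r < threshold:
            break  # lower levels are locked until this level is satisfied
    return total


def weighted_hierarchy_reward(components: Sequence[float], coupling: float = 0.5) -> float:
    """Soft constraint: all levels contribute at once, but each lower
    level is scaled down when the levels above it score poorly."""
    total = 0.0
    gate = 1.0
    for r in components:
        total += gate * r
        # coupling=1.0 removes the hierarchy; coupling=0.0 gates fully.
        gate *= coupling + (1.0 - coupling) * r
    return total


# Example: approach is done, the grasp attempt is only half satisfied.
scores = [1.0, 0.5, 1.0, 1.0]
flat = linear_reward(scores, [1.0, 1.0, 1.0, 1.0])
strict = logical_hierarchy_reward(scores)
soft = weighted_hierarchy_reward(scores)
```

In this example the flat sum still credits the grasping and stabilizing scores in full, the logical hierarchy ignores them until the grasp attempt is satisfied, and the weighted hierarchy credits them at a reduced rate. Gripper actions as a decision variable are not modeled in this sketch.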
George Zheng Hong Zhu
Université du Luxembourg
Engineering
Education
York University
Globalink Research Award
Find the perfect opportunity to put your academic skills and knowledge into practice!
The strong support from governments across Canada, international partners, universities, colleges, companies, and community organizations has enabled Mitacs to focus on the core idea that talent and partnerships power innovation, and innovation creates a better future.