Mastering Multi-modal Decision-making with World-Models

The quest to build an artificial intelligence (A.I.) that is general and able to solve multiple tasks, as humans do, has attracted growing interest in the last few years. Current A.I. systems can communicate with humans and provide sensible answers and ideas in the form of text. However, if A.I. systems are to be used pervasively in real-world applications, they need the ability to sense, interpret, reason, and act in the world. This project aims to bridge the gap between A.I. language systems and reality by employing world models, a recent idea that has been applied to many decision-making tasks, such as video games, simulations, and robotics. By combining language systems with world models, the project aims to build generalist A.I. systems with the potential to solve numerous real-world tasks, including visual question answering and web-browser-based tasks, two challenges that are being extensively researched at ServiceNow.

Faculty Supervisor:

Aaron Courville

Student:

Partner:

ServiceNow Canada

Discipline:

Computer science

Sector:

Professional, scientific and technical services

University:

Université de Montréal

Program: