Enhancing Vision Encoders for GUI Navigation with VLMs

(1) the main activities of the partner
ServiceNow develops a platform for client organizations to manage and automate large-scale processes across various industries. In 2020, ServiceNow acquired Element AI to strengthen its presence in the Artificial Intelligence (AI) research landscape and the Canadian AI ecosystem. This acquisition enabled the development of AI-driven products that improve the platform’s capabilities. ServiceNow Research has made significant contributions to the field of foundation models, notably in Natural Language Processing (NLP), and has a strong presence in developing generative models for different data domains.
(2) the challenges the partner aims to solve through this project
Navigating GUIs poses significant obstacles for AI agents, especially in ensuring generalization across diverse tasks and domains. Current vision-language models struggle with compositionality, grounding abstract plans within pixel-based interfaces, and recovering from errors. This project seeks to address these issues by developing modular architectures that enhance planning, grounding, and execution. It also integrates reinforcement learning to improve the alignment between predicted actions and interface outcomes, with the goal of making GUI agents more robust.
(3) the anticipated social or economic benefits of the project for the partner organization(s)
Developing effective web agents has the potential to enable various applications, such as automating administrative tasks, assisting new employees with tool usage during onboarding, and making web navigation more accessible for individuals with visual or other impairments. Progress in this area can lead to the creation of prototypes and proofs of concept for ServiceNow, supporting the integration of web assistants into production and advancing the product roadmap.

Faculty Supervisor:

Bang Liu

Student:

Partner:

ServiceNow Canada

Discipline:

Computer science

Sector:

Professional, scientific and technical services

University:

Université de Montréal

Program:

Accelerate
