Improving AI by Learning Directly from Human Preferences

Our project aims to improve how AI models learn from human feedback. Current methods assume that human preferences can be reduced to a single “reward” value, but research shows that this assumption does not always hold. We will investigate whether an AI algorithm can learn from human preferences without relying on rewards. If it can, we will design and test algorithms that learn without a reward. If it cannot, we will explore the cases in which rewards are necessary. This research will promote AI systems that are better aligned with human preferences, benefiting the scientific community and industry sectors such as healthcare and education. The insights gained from this project will advance Canada’s AI research and reinforce the global leadership of both Mila and Stanford in AI.

Faculty Supervisor:

Doina Precup

Student:

Partner:

Stanford University

Discipline:

Computer science

Sector:

Education

University:

McGill University

Program: