Answer Quality Assessment for Retrieval-Enhanced Generation via Conformal Relevance and Factuality

This project focuses on improving the reliability of Retrieval-Augmented Generation (RAG) systems, which are AI systems that generate answers using external databases. While these systems help reduce inaccuracies, it is still hard for users to know how trustworthy or relevant the answers are. To address this, the intern will explore a new method that evaluates the quality of RAG-generated answers by providing clear confidence scores and conformal prediction sets, without needing predefined answers for comparison. The approach will use a technique called conformal prediction to ensure reliability across aspects such as factual accuracy and relevance. The research will involve reviewing existing methods, identifying gaps, developing new theoretical approaches, and testing them on real-world datasets. For the partner organization, this project offers a scalable framework to evaluate and improve the trustworthiness of AI-generated content. The outcome can help enhance user trust in AI systems, especially in critical applications such as customer support, finance, and healthcare, where reliable information is essential.
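
As a rough illustration of the conformal prediction idea mentioned above, the following is a minimal sketch of generic split conformal prediction, not the project's own method. It assumes each candidate answer already has a nonconformity score (lower meaning the answer looks more reliable); how such scores would be computed for relevance or factuality is not described here. The function names and all numbers are hypothetical.

```python
import math

def conformal_threshold(calibration_scores, alpha):
    """Split-conformal quantile of nonconformity scores.

    With n calibration scores, take the ceil((n + 1) * (1 - alpha))-th
    smallest score; if that rank exceeds n, the threshold is infinite.
    """
    n = len(calibration_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(calibration_scores)[rank - 1]

def prediction_set(candidate_scores, threshold):
    """Keep every candidate whose nonconformity score is within the threshold."""
    return [name for name, score in candidate_scores.items() if score <= threshold]

# Hypothetical nonconformity scores from a held-out calibration set.
calibration = [0.05, 0.10, 0.12, 0.20, 0.25, 0.30, 0.35, 0.40, 0.55, 0.70]
threshold = conformal_threshold(calibration, alpha=0.2)

# Hypothetical scores for three candidate answers to a new question.
candidates = {"answer_a": 0.15, "answer_b": 0.45, "answer_c": 0.60}
kept = prediction_set(candidates, threshold)
print(threshold, kept)
```

Under the usual exchangeability assumption between calibration and test examples, a set built this way contains an acceptable answer with probability at least 1 - alpha; a smaller alpha gives a larger, more cautious set.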

Faculty Supervisor:

Ga Wu

Student:

Partner:

Layer 6 AI

Discipline:

Computer science

Sector:

Finance and Insurance; Professional, scientific and technical services

University:

Dalhousie University

Program: