Multimodal RAG Explainability

The Multimodal RAG Explainability (MRAGE) project aims to develop an interactive tool for explaining Large Multimodal Models (LMMs) augmented with retrieval capabilities. LMMs represent a significant advancement in AI: they can understand and generate content across multiple modalities, such as text and images. Using the retrieval-augmented generation (RAG) methodology, the project queries external knowledge sources to give the model relevant context and reduce its hallucinations. Counterfactual reasoning will be used to formulate explanations by identifying the input data that directly affects the answer the model generates. This approach will illuminate the decision-making process of LMMs, enhancing their transparency and interpretability.
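The counterfactual idea can be illustrated with a minimal sketch: remove retrieved items from the input and check whether the answer changes. This is not the project's implementation. The function names (`toy_answer`, `counterfactual_sources`), the `max_size` parameter, and the toy model are all hypothetical stand-ins chosen for illustration, and image inputs are represented here only by text captions.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple


def toy_answer(question: str, sources: Sequence[str]) -> str:
    """Hypothetical stand-in for a retrieval-augmented model call.

    Answers "Paris" if any retrieved source mentions it, otherwise "unknown".
    """
    for source in sources:
        if "Paris" in source:
            return "Paris"
    return "unknown"


def counterfactual_sources(
    question: str,
    sources: Sequence[str],
    answer_fn: Callable[[str, Sequence[str]], str],
    max_size: int = 2,
) -> List[Tuple[int, ...]]:
    """Return the smallest sets of source indices whose removal changes the answer."""
    baseline = answer_fn(question, sources)
    for size in range(1, max_size + 1):
        found = []
        for removed in combinations(range(len(sources)), size):
            kept = [s for i, s in enumerate(sources) if i not in removed]
            if answer_fn(question, kept) != baseline:
                found.append(removed)
        if found:
            # Stop at the smallest removal size that flips the answer.
            return found
    return []


retrieved = [
    "The Eiffel Tower is in Paris.",
    "Berlin is the capital of Germany.",
    "Image caption: a photo of a bridge at night.",
]
explanation = counterfactual_sources("Where is the Eiffel Tower?", retrieved, toy_answer)
```

In this toy setting, the sets returned are the retrieved items the answer depends on. With a real model, each call to `answer_fn` would be a full generation, so the number of removal sets tried would need to be bounded.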

The project could significantly impact fields such as healthcare, law, and finance, where understanding the rationale behind AI outputs is crucial for trust and responsible deployment. The interactive tool will let users inspect the reasoning behind model outputs and identify potential biases or errors.

Faculty Supervisor:

Lukasz Golab

Student:

Partner:

Taras Shevchenko National University of Kyiv

Discipline:

Computer science

Sector:

Artificial Intelligence; Information and Communications Technology

University:

University of Waterloo

Program: