Interpretability and explainability of Deep Neural Networks

In recent years, complex deep neural networks have achieved outstanding performance on challenging tasks such as language modeling, classification, and machine translation, yet they remain a mystery to their users. Because of their hidden, hard-to-comprehend internal structure and their sheer size, they are often referred to as “black box” models. At the same time, the widespread adoption of machine learning algorithms has increased the need to trust these models before employing them for decision-making in critical situations. Making critical decisions concerning humans without understanding the justification for such decisions is unacceptable, both ethically and legally. As a result, there is growing interest in the machine learning community in interpreting black-box models and deriving explanations that describe their behavior, and research on explainability and interpretability has grown accordingly. In this research, we plan to design mathematical methods to explain the behavior of these models.

Faculty Supervisor:

Stan Matwin

Student:

Partner:

University of Pisa

Discipline:

Computer science

Sector:

Education

University:

Dalhousie University

Program: