A framework to enhance deep learning systems’ trustworthiness against out-of-distribution examples
In the past decade, deep learning models have achieved state-of-the-art performance on a wide variety of tasks, outperforming classical machine learning models and, on some tasks, even humans in accuracy. However, prior research has shown that these models are vulnerable to out-of-distribution and adversarial inputs. Ideally, a deep learning model should reject such inputs, yet in practice it often produces highly confident predictions for them. In this research, we develop a framework that assesses a deep learning model’s vulnerability to such inputs, detects and rejects these malicious inputs, and enhances deep learning models so that they produce less confident predictions for them.
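As a purely illustrative sketch, and not the framework developed in this work, the detect-and-reject idea can be pictured as thresholding a classifier's maximum softmax confidence: inputs for which the model is insufficiently confident are flagged and rejected rather than assigned a label. The model, threshold value, and helper function below are hypothetical stand-ins.

```python
# Illustrative maximum-softmax-probability rejection (assumed baseline, not the
# paper's method). The classifier, threshold, and data are placeholders.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in classifier: 20 input features mapped to 5 class logits.
model = torch.nn.Linear(20, 5)
model.eval()

REJECT_THRESHOLD = 0.8  # assumed confidence cut-off; would be tuned per task


def classify_or_reject(x: torch.Tensor):
    """Return a predicted label, or None if confidence falls below the threshold."""
    with torch.no_grad():
        probs = F.softmax(model(x), dim=-1)
    confidence, label = probs.max(dim=-1)
    if confidence.item() < REJECT_THRESHOLD:
        return None  # treat the input as suspicious / out-of-distribution
    return label.item()


# Example usage on a random input vector.
result = classify_or_reject(torch.randn(20))
print("rejected" if result is None else f"predicted label {result}")
```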