Visual attention in deep learning for detection and classification

Visual attention is the mechanism of dynamically and selectively focusing on a subset of the visual input for detailed analysis; it is part of the early stages of visual perception in primates. It has been successfully integrated into many artificial visual recognition systems, with applications in image classification, object detection, object sequence recognition, image captioning, and visual question answering. This research will explore various visual attention mechanisms with the goal of improving the generalization, robustness, and efficiency of the deep convolutional neural network (DCNN) models and algorithms developed for Epson’s core computer vision and machine learning technologies and products.

Jing Huang
Faculty Supervisor: Sanja Fidler