Sparse Representations for Embodied AI
Understanding scenes that represent real-world environments is a challenging problem at the intersection of Computer Vision and Deep Learning, and a necessary prerequisite for Embodied AI. Embodied AI is an emerging field within Machine Learning that focuses on the challenges that must be addressed to successfully deploy edge devices such as drones and robots. In this setting, an agent must estimate the semantics of its environment as well as navigate it efficiently to solve a variety of tasks, some of which may involve other agents. The Embodiment Hypothesis states that intelligence emerges from the interaction between an agent and its perception of the environment it is embodied within. This project establishes an empirical study to validate this hypothesis for Deep Reinforcement Learning (DRL) agents trained on environments derived from both simulation and real-world data. We use DRL to model a Partially Observable Markov Decision Process (POMDP) that describes optimal behavior for joint navigation-perception problems. The Embodiment Hypothesis implies that an end-to-end treatment of this problem should outperform conventional methods for training computer vision models.
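To make the POMDP framing concrete, the sketch below shows a toy partially observable environment and a Bayesian belief update over its hidden state. This is an illustrative example only: the environment, its parameters, and the greedy policy are hypothetical, not the agents or environments studied in this project.

```python
import random

# Minimal POMDP sketch: the agent never sees the true state s, only a
# noisy observation o, and must therefore act on a belief over states.
# All names and parameters here are illustrative assumptions.

class CorridorPOMDP:
    """Agent on a 1-D corridor of n cells; the goal is the last cell.
    Observations report the true position with probability 1 - noise."""

    def __init__(self, n=5, noise=0.2, seed=0):
        self.n = n
        self.noise = noise
        self.rng = random.Random(seed)
        self.state = 0  # hidden true position

    def observe(self):
        if self.rng.random() < self.noise:
            return self.rng.randrange(self.n)  # corrupted observation
        return self.state

    def step(self, action):  # action in {-1, +1}
        self.state = min(self.n - 1, max(0, self.state + action))
        reward = 1.0 if self.state == self.n - 1 else -0.01
        done = self.state == self.n - 1
        return self.observe(), reward, done


def update_belief(belief, obs, noise, n):
    """Observation-only Bayesian filter: P(s | o) proportional to P(o | s) P(s).
    (A full filter would also fold in the transition model.)"""
    likelihood = [(1 - noise) if s == obs else noise / (n - 1) for s in range(n)]
    posterior = [b * l for b, l in zip(belief, likelihood)]
    z = sum(posterior)
    return [p / z for p in posterior]


env = CorridorPOMDP()
n = env.n
belief = [1.0 / n] * n  # uniform prior over the hidden state
total_reward, done = 0.0, False
while not done:
    # Greedy hand-coded policy: always move right toward the goal; a DRL
    # agent would instead learn a policy conditioned on observations.
    obs, reward, done = env.step(+1)
    belief = update_belief(belief, obs, env.noise, n)
    total_reward += reward
print(round(total_reward, 2))  # -> 0.97 (transitions are deterministic)
```

A DRL agent in this setting replaces both the hand-coded policy and the explicit filter: a recurrent policy network ingests raw observations and implicitly maintains a belief in its hidden state.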