Deep Learning for Action Recognition, Localization and Parsing

The goal of this project is to develop algorithms that allow machines to understand and describe actions in video. For example, we would like computers to be able to classify videos according to the actions taking place in them, and to annotate the videos according to where and when those actions take place. To tackle this problem, the student will design and train a deep neural network, loosely inspired by the neural networks in the human brain. Deep neural networks already outperform humans on certain aspects of image understanding; the current project aims to extend this success to the domain of video understanding. Apart from the inherent scientific interest of this problem, addressing it will serve a variety of important real-world applications, such as video retrieval, security and surveillance, and human-computer interaction. The work from this project will be submitted to a top-tier computer vision conference.

Faculty Supervisor:

Kosta Derpanis

Student:

Adam Harley

Partner:

Discipline:

Computer science

Sector:

University:

Ryerson University

Program:

Globalink