Exploring and Improving Self-supervised Methods for Large-scale Video Recognition

With advances in modern technology, especially faster networks, video is becoming an increasingly important media type. Given its wide range of potential applications, video recognition has received considerable attention. However, video recognition is a non-trivial task: complex neural networks require large amounts of training data, but annotated data are hard to acquire. As a result, there is a growing tendency to rely on self-supervised learning approaches that can make use of unlabeled data. Some progress has been made, but the topic is still at a preliminary stage, with considerable room for improvement. This project aims to investigate this topic, design and experiment with more efficient algorithms, and train on larger-scale datasets. The expected result is an improved large-scale pretrained video recognition model that achieves competitive performance.

Faculty Supervisor:

Animesh Garg

Student:

Keyu Long

Partner:

Layer 6 AI

Discipline:

Computer science

Sector:

Professional, scientific and technical services

University:

University of Toronto