Automated Video Content Chaptering via Machine Learning

This research will focus on building a deep learning model that analyzes the video and audio streams of input video material and produces something like a table of contents, for example:
* Minutes 0-3: Introduction. Our topic is binomial coefficients.
* Minutes 3-4: Problem source.
* Minutes 4-6: Details of the mathematics. Calculations of values.
* Minutes 6-7: Conclusions. Applications of binomial coefficients. Further reading.

Video chaptering involves deducing cues, possibly from the transcript (word choices such as “to begin” or “in conclusion”), the audio (clapping, pauses), or the video itself (movement, scene cuts). Such video analysis is a major challenge because of the huge data volumes involved, so the internship will be undertaken in concert with existing video-chaptering research being conducted in the Vision & Image Processing group at the University of Waterloo. The key goal will be the construction of small, efficient networks, to see which features emerge as significant in the chaptering process. The intern will gain expertise in both machine learning and statistical analysis, and will be expected to contribute to research in both areas.
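
As a rough illustration of the transcript cues mentioned above, the following Python sketch scans timestamped transcript segments for cue phrases. It is not the project's method, which is intended to be a learned deep model. The function name `find_transcript_cues`, the `(start_seconds, text)` segment format, and the sample transcript are assumptions made for this example; only the phrases "to begin" and "in conclusion" come from the description.

```python
import re
from typing import List, Tuple

# Cue phrases taken from the project description; a real system would
# presumably learn such cues rather than hard-code them.
CUE_PHRASES = ("to begin", "in conclusion")

# Whole-phrase, case-insensitive match for any cue phrase.
_CUE_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in CUE_PHRASES) + r")\b",
    re.IGNORECASE,
)


def find_transcript_cues(
    segments: List[Tuple[float, str]],
) -> List[Tuple[float, str]]:
    """Return (start_seconds, cue_phrase) for each segment containing a cue.

    `segments` is a hypothetical format: a list of (start time in seconds,
    transcript text) pairs. Only the first cue in each segment is reported.
    """
    cues = []
    for start, text in segments:
        match = _CUE_PATTERN.search(text)
        if match:
            cues.append((start, match.group(0).lower()))
    return cues


if __name__ == "__main__":
    # Made-up transcript, loosely following the example chapters above.
    transcript = [
        (0.0, "To begin, our topic is binomial coefficients."),
        (180.0, "Where does this problem come from?"),
        (360.0, "In conclusion, binomial coefficients have many applications."),
    ]
    for start, phrase in find_transcript_cues(transcript):
        print(f"{start:6.1f}s  {phrase}")
```

Candidate boundaries found this way would only be one signal; the description suggests combining them with audio and video cues.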

Faculty Supervisor:

Paul Fieguth

Partner:

Taras Shevchenko National University of Kyiv

Discipline:

Computer science

Sector:

Artificial Intelligence

University:

University of Waterloo

Program: