Described Video and Language Detection on Audio Tracks Using Machine Learning

Bell Media receives content from a variety of providers and also produces content in-house. There are standards for tagging audio tracks with metadata; however, many facilities (including Bell) do not adhere to them. Bell currently classifies unlabeled audio tracks manually, which is inefficient and time-consuming given the large volume of digital media that Bell holds and receives. Bell is developing a single ingest pipeline to accelerate the labeling and processing of the media files it receives. This research project will examine two features that Bell Media would like to include in the ingest pipeline. The first is the ability to automatically classify the language of a media file's audio track, primarily English or French. The second is the ability to identify which audio track carries the described video information. In this research, we will develop machine learning solutions for these two problems.

Faculty Supervisor:

Shahram Shirani

Student:

Yasamin Fazliani

Partner:

Bell Canada

Discipline:

Engineering - computer / electrical

Sector:

Information and cultural industries

University:

McMaster University