Speaker Diarization for Audio Transcription

This research addresses speaker diarization as a means of facilitating automated speech transcription. The difficulty of the problem depends on how much prior knowledge is provided to the system. Depending on the type and amount of information available about the number and characteristics of the speakers, the task ranges from 1-to-N matching, where each voice is compared against a set of known speaker templates, to a pure clustering problem, where no prior knowledge is available.
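
To make the two ends of this range concrete, the following is a minimal sketch and not the method developed in this research. It assumes each audio segment has already been reduced to a fixed-length speaker embedding, and it uses cosine similarity with a greedy threshold-based clustering purely for illustration. The function names, the toy two-dimensional vectors, and the 0.8 threshold are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_speaker(embedding, templates):
    """1-to-N matching: return the enrolled speaker with the closest template."""
    return max(templates, key=lambda name: cosine(embedding, templates[name]))


def cluster_segments(embeddings, threshold=0.8):
    """No prior knowledge: greedy clustering of segment embeddings.

    Each segment joins the most similar existing cluster if the similarity
    reaches the (assumed) threshold; otherwise it starts a new cluster.
    """
    centroids = []  # running vector sum per cluster (direction = mean direction)
    labels = []
    for emb in embeddings:
        best, best_sim = None, threshold
        for idx, centroid in enumerate(centroids):
            sim = cosine(emb, centroid)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            centroids.append(list(emb))
            labels.append(len(centroids) - 1)
        else:
            centroids[best] = [c + e for c, e in zip(centroids[best], emb)]
            labels.append(best)
    return labels


# Toy embeddings (hypothetical): two segments per speaker.
segments = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]

# Known speakers: compare against enrolled templates.
templates = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
known = [match_speaker(s, templates) for s in segments]

# Unknown speakers: discover the grouping from the data alone.
unknown = cluster_segments(segments)
```

In the matching case the output is a speaker identity for every segment, whereas in the clustering case the output is only an anonymous cluster index, and the number of speakers is itself inferred from the data.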