Speech signals propagating in enclosed environments are distorted by two important environment-related factors: (a) multiple reflections of the signal from the walls and other objects in the room, called coloration for early reflections and reverberation for late reflections, and (b) competing acoustic signals from sound sources other than the speaker, called background noise. Such distortions not only degrade perceived speech quality and intelligibility for human listeners (whether they are listening to the distorted speech directly, over a telephone, or through an assistive listening device), but also hamper automatic speech and speaker recognition systems. To mitigate these effects, speech enhancement algorithms have been widely used, as have acoustic models matched to the environmental characteristics in the case of automatic speech/speaker recognition applications.
While there are several methods for experimentally measuring the effect of environmental distortions given a clean reference signal, such methods cannot be used in real-time applications because a reference signal is seldom available. Therefore, so-called blind measures (i.e., measures that do not require a reference signal) have to be employed.
We have recently proposed non-intrusive measures for estimating speech quality, intelligibility, and reverberation time. These measures were shown to accurately estimate speech quality and intelligibility across noise-only, reverberation-only, and noise-plus-reverberation listening conditions. Adapted versions of these metrics were also shown to estimate speech quality and intelligibility in complex listening environments for hearing aid and cochlear implant users. The metrics performed in line with state-of-the-art measures, with the added benefit of not requiring access to a clean reference signal.
Automatically assessing acoustic environment characteristics can help improve the performance of speech enhancement algorithms. Most speech enhancement methods use low-level features extracted from the distorted speech signal, such as the estimated signal-to-noise ratio, as a proxy for the amount of distortion present in the signal, and rely on this information to adjust how the enhancement algorithm works. Higher-level characteristics, such as reverberation time and speech quality/intelligibility, have only recently begun to be explored.
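To illustrate how a low-level feature can steer an enhancement algorithm, the following is a minimal sketch, not the method used in this project. It assumes a crude energy-based estimate in which the quietest frames approximate the noise floor, and a hypothetical linear mapping from estimated signal-to-noise ratio to a suppression strength between 0 and 1. The function names, frame length, and thresholds are all illustrative assumptions.

```python
import math


def frame_energies(samples, frame_len=160):
    """Mean power of each non-overlapping frame (trailing partial frame dropped)."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def estimate_snr_db(samples, frame_len=160, noise_fraction=0.1):
    """Crude SNR estimate: treat the quietest frames as noise only (assumption)."""
    energies = sorted(frame_energies(samples, frame_len))
    if not energies:
        raise ValueError("signal is shorter than one frame")
    n_noise = max(1, int(len(energies) * noise_fraction))
    noise_power = max(sum(energies[:n_noise]) / n_noise, 1e-12)
    total_power = sum(energies) / len(energies)
    # Speech power is what remains after subtracting the noise floor.
    speech_power = max(total_power - noise_power, 1e-12)
    return 10.0 * math.log10(speech_power / noise_power)


def suppression_strength(snr_db, low_db=0.0, high_db=20.0):
    """Hypothetical mapping: full suppression at or below low_db, none at or above high_db."""
    t = (snr_db - low_db) / (high_db - low_db)
    return 1.0 - min(1.0, max(0.0, t))
```

An enhancement algorithm could call `suppression_strength(estimate_snr_db(segment))` once per recorded segment and scale its noise reduction accordingly. A higher-level measure such as estimated reverberation time could be mapped to a control parameter in the same way.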
In this project, we aim to develop environment-aware speech enhancement algorithms that take into account the predictions of our blind measures of acoustic environment characteristics. As a first step toward using these measures in speech enhancement algorithms, it is important to understand how they behave in real-world, time-varying environments. To that end, we will develop a smartphone application that tracks the evolution of our blind measures over time. The application will periodically record audio segments, then compute and log the measures. The user will be able to tag measurements as corresponding to specific places (e.g., inside a room, in an automobile, on the street) and annotate them with comments and quality scores. The information stored by the application will later be analysed by researchers and used to detect possible limitations of the blind measures. The application will later be extended to perform environment-aware speech enhancement as well; however, this is outside the scope of this short-term project.
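The record, compute, log, and tag workflow described above could be organised roughly as follows. This is a minimal sketch under stated assumptions, not the application's actual design: the class and field names are hypothetical, audio capture is left out, and `rms_level_db` is only a placeholder standing in for the project's blind measures, which are not specified here.

```python
import json
import math
import time
from dataclasses import asdict, dataclass, field
from typing import Callable, Dict, List, Optional, Sequence

Measure = Callable[[Sequence[float]], float]


def rms_level_db(segment: Sequence[float]) -> float:
    """Placeholder measure (assumption): RMS level in dB relative to full scale."""
    if not segment:
        raise ValueError("empty segment")
    mean_square = sum(x * x for x in segment) / len(segment)
    return 10.0 * math.log10(max(mean_square, 1e-12))


@dataclass
class Measurement:
    timestamp: float
    values: Dict[str, float]
    place: Optional[str] = None          # e.g. "room", "automobile", "street"
    comment: Optional[str] = None
    quality_score: Optional[int] = None  # rating scale is an assumption


@dataclass
class MeasurementLog:
    measures: Dict[str, Measure]
    records: List[Measurement] = field(default_factory=list)

    def log_segment(self, segment: Sequence[float],
                    timestamp: Optional[float] = None) -> Measurement:
        """Compute every registered measure on one audio segment and store the result."""
        values = {name: fn(segment) for name, fn in self.measures.items()}
        stamp = time.time() if timestamp is None else timestamp
        record = Measurement(stamp, values)
        self.records.append(record)
        return record

    def tag(self, index: int, place: Optional[str] = None,
            comment: Optional[str] = None,
            quality_score: Optional[int] = None) -> Measurement:
        """Attach user annotations to a stored measurement."""
        record = self.records[index]
        if place is not None:
            record.place = place
        if comment is not None:
            record.comment = comment
        if quality_score is not None:
            record.quality_score = quality_score
        return record

    def to_json(self) -> str:
        """Serialise all records for later analysis."""
        return json.dumps([asdict(r) for r in self.records])
```

In use, a timer would call `log_segment` on each newly recorded segment, and the user interface would call `tag` when the user annotates an entry. Keeping the measures in a dictionary of callables means a new blind measure can be added without changing the log format.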
Tiago Falk
Blas Kolic
Journalism / Media studies and communication
Globalink