Diverging from the GPUs: The case for alternative architectures for training ML algorithms

Machine learning has ushered in many breakthroughs in areas such as computer vision, natural language processing, speech recognition, and recommendation systems. The models used in these applications contain many parameters that must be learned, and training them often requires massive amounts of computation. As a result, graphics processing units (GPUs) have seen wide adoption in recent years for training these large-scale models. While there has been a lot of work on GPUs, little attention has been given to other architectures for machine learning. This project therefore plans to explore a series of architectures that diverge from the beaten path of GPUs, through modeling and simulation on state-of-the-art machine learning workloads. The study aims to determine the performance trade-offs of these architectures to help design better machine learning accelerators.

Faculty Supervisor:

Maryam Mehri Dehnavi

Student:

Partner:

AMD Canada

Discipline:

Computer science

Sector:

Manufacturing; Professional, scientific and technical services

University:

University of Toronto

Program: