Applied next-generation AI accelerator algorithm-hardware co-optimization: using quantization, sparsity and hardware constraints during neural net training

This work explores software and hardware co-optimization for deep neural network (DNN) inference applications. Once a model is trained to sufficient accuracy, it is deployed to make inferences or predictions on new inputs. With increasing performance, more people are using these models for tasks such as translation, self-driving cars and […]
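The abstract mentions applying quantization during neural net training. A minimal sketch of the "fake quantization" idea commonly used in quantization-aware training is shown below; the function name and per-tensor symmetric scheme are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def fake_quantize(w, num_bits=8):
    """Simulate uniform symmetric quantization during training:
    snap values to an integer grid but return floats, so the rest
    of the training pipeline (and, in a real framework, gradients
    via a straight-through estimator) can proceed unchanged."""
    qmax = 2 ** (num_bits - 1) - 1        # e.g. 127 for 8-bit
    scale = np.max(np.abs(w)) / qmax      # per-tensor scale (assumption)
    if scale == 0:
        return w
    return np.round(w / scale) * scale    # quantize, then dequantize

# Example: 4-bit quantization collapses weights onto 15 levels
w = np.linspace(-1.0, 1.0, 1000)
wq = fake_quantize(w, num_bits=4)
print(len(np.unique(wq)))  # → 15
```

During training the forward pass sees the quantized values, so the network learns weights that remain accurate under the reduced precision the target accelerator provides.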
