Classification and segmentation of automotive service data - ON-183

Preferred Disciplines: Engineering, Physics, Computer Science, Machine Learning (Master's, PhD or Post-Doc)
Company: Pitstop
Project Length: 4-6 months (1 unit)
Desired start date: As soon as possible
Location: Toronto, ON
No. of Positions: 1-2
Preferences: Local candidates are preferred, but this is not a hard requirement

About the Company: 

Pitstop is a startup based in Toronto, Ontario, that has developed a mobile and backend platform for predictive maintenance as a service.

Project Description:

This project will support the Pitstop mission to provide predictive analytics by developing classification and segmentation methods for automotive service records. This data is collected from dealerships and car companies by the Pitstop Connect platform and details the costs, parts and labour involved in repairing vehicles. The goal is to classify records into failure categories and to extract and attribute features of the records that reflect evolving mechanical issues.

Our platform is currently configured to extract patterns from technical time-series vehicle data and uses service records as outcome measures. Predictive analytics assumes that mechanical issues evolve over time within the time-series data, but service data has not been analysed in this way.

The project is focussed on the following questions:

  1. How accurately can an automatic classification system determine the repair performed within a standard scheme (e.g. fuel, air, electrical, tire/brakes, ignition) by natural language processing (NLP) analysis of service records?
  2. How accurately can an automatic classification system predict the part to be replaced based on reported symptoms?
  3. Do service records and technical time-series data provide redundant or complementary information, and how can they be systematically combined?

Research Objectives:

  • Evaluate and refactor our current approach to service record classification using records accumulated over the last year. Re-evaluate the efficacy of support vector machine (SVM), Bayes, and other models.
  • Develop part-replacement predictors based on the available service history for a restricted set of non-consumable engine components using NLP methods.
  • Develop a model for the time evolution of mechanical failures that combines longitudinal service records and technical (PID) data for a restricted set of vehicle makes, models and engine components.
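
As an illustration of the first objective, the following is a minimal sketch of a multinomial naive Bayes text classifier of the kind that could serve as a baseline when comparing models. It is not Pitstop's current implementation. The example records, the `NaiveBayesClassifier` class and the `tokenize` helper are hypothetical, and the labels are taken from the example scheme in question 1.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()             # records seen per class
        self.token_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for tok in tokenize(text):
                self.token_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            total = sum(self.token_counts[label].values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.class_counts[label] / n_docs)
            for tok in tokens:
                count = self.token_counts[label][tok]
                score += math.log(
                    (count + self.alpha) / (total + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical service-record snippets, invented for illustration only.
TRAIN = [
    ("replaced fuel pump and fuel filter", "fuel"),
    ("cleaned fuel injectors, fuel pressure low", "fuel"),
    ("replaced battery and alternator belt", "electrical"),
    ("battery not charging, replaced alternator", "electrical"),
    ("replaced front brake pads and rotors", "tire/brakes"),
    ("rotated tires and resurfaced brake rotors", "tire/brakes"),
    ("replaced spark plugs and ignition coil", "ignition"),
    ("misfire on cylinder 3, replaced ignition coil", "ignition"),
]
HELD_OUT = [
    ("fuel pump noisy, replaced fuel filter", "fuel"),
    ("alternator failed, battery drained", "electrical"),
    ("brake pads worn, replaced rotors", "tire/brakes"),
    ("replaced spark plugs after misfire", "ignition"),
]

model = NaiveBayesClassifier().fit(
    [text for text, _ in TRAIN], [label for _, label in TRAIN]
)
predictions = [model.predict(text) for text, _ in HELD_OUT]
accuracy = sum(
    pred == label for pred, (_, label) in zip(predictions, HELD_OUT)
) / len(HELD_OUT)
print(f"held-out accuracy: {accuracy:.2f}")
```

On real service records, accuracy would need to be measured on a much larger held-out set and compared against SVM and other candidate models. The toy data here only shows the shape of such an evaluation.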


The participant will collaborate with the Pitstop data science team and develop algorithms in Python, Node.js or other suitable data science environments for deployment on our platform.
The machine learning algorithms to be compared will be analysed, selected, trained and evaluated based on known properties of the datasets.

Expertise and Skills Needed:

Experience analysing multivariable data sets and working with relational databases, strong coding skills, and knowledge of and/or experience with basic machine learning algorithms.

For more info or to apply to this applied research position, please:

  1. Check your eligibility and find more information about open projects.
  2. Obtain approval from your supervisor, then apply through the webform, sending your CV along with a link to your supervisor’s university webpage.