Summarization of Canada’s Parliamentary Proceedings with NLP/ML - ON-419

Desired discipline(s): Engineering - computer / electrical, Engineering, Computer science, Mathematical Sciences, Statistics / Actuarial sciences
Company: Alphabyte Solutions, Inc.
Project Length: 4 to 6 months
Preferred start date: 02/15/2021
Language requirement: English
Location(s): Vaughan, ON, Canada
No. of positions: 1
Preferred institutions: McGill University, University of Alberta, University of Toronto

Search across Mitacs’ international networks - check this box if you’d also like to receive profiles of researchers based outside of Canada: 
No

About the company: 

Alphabyte is a proudly Toronto-based tech start-up working on big data analytics and machine learning, primarily in the e-commerce and construction fields. We believe that, with a little thoughtfulness and determination, anything can be done a little better and faster. We thrive on thinking outside the box while remaining committed to innovating and responsibly bringing about positive change.

Please describe the project:

At our research division, Alphabyte Research Lab, we have started to apply state-of-the-art machine learning models to solve real-world problems. One of these projects, known as Parlawatch, aims to improve democratic participation and governance by automatically summarizing Canada’s parliamentary proceedings for the general public and providing useful insights from them (e.g. sentiment analysis, emotion classification).

We currently have a treasure trove of valuable parliamentary proceedings data and a functioning software stack in place to perform the summarization and other tasks. However, we lack the research talent to fully exploit the project’s potential and develop a well-polished and mature product. The successful candidate will train and fine-tune our NLP/ML models, diagnose and fix bugs, and independently propose new features and lead their development.

Required expertise/skills: 

About the Candidate

  • Resourceful, creative, detail-oriented, works well under pressure
  • Team-oriented and highly organized
  • Self-starter, requires minimal supervision, and manages open-ended requirements well
  • Excellent oral and written communication in English
  • Strong research skills with publication record at top conferences/journals
  • Quick learner, able to independently acquire domain knowledge as needed
  • Able to translate problem requirements into technical solutions with ease

Required Skills

  • Holds a Bachelor’s degree or higher in a quantitative field
  • Currently pursuing a Master’s degree or higher in a quantitative field
  • Strong understanding of machine learning and natural language processing architectures
  • Demonstrated hands-on experience with deep learning, especially NLP
  • Python: pandas, NumPy, scikit-learn, PyTorch, TensorFlow

Nice to Have Skills

  • Working experience with SQL and relational databases strongly preferred
  • Experience with data visualization (e.g. Power BI, Tableau)
  • Experience developing machine learning systems end to end (i.e. data loading, data cleansing, model training, fine-tuning, production-level code)
  • Broad experience in different data modalities, e.g. images, time-series, financial, geospatial data
  • Experience with front-end development, UX, visual design
  • Broad knowledge in different disciplines, especially current affairs, Canadian government, politics, and journalism
  • Experience with Microsoft Azure suite