A.I. Engine Modernization - ON-534

Project type: Research
Desired discipline(s): Engineering - computer / electrical, Engineering, Computer science, Mathematical Sciences, Languages and linguistics, Social Sciences & Humanities
Company: Alexa Translations
Project Length: 6 months to 1 year
Preferred start date: As soon as possible.
Language requirement: Bilingual
Location(s): Toronto, ON, Canada; Canada
No. of positions: 1
Desired education level: College, Undergraduate/Bachelor, Master's, PhD, Postdoctoral fellow
Search across Mitacs’ international networks - check this box if you’d also like to receive profiles of researchers based outside of Canada: 
Yes

About the company: 

At Alexa Translations, we build trust. Since 2002, we have grown our reputation in the language services industry by forging long-term relationships in the legal, financial, marketing, technical and government sectors, delivering customized and premium service. Helping our clients reach their business goals is the foundation of our success. We help the world’s largest and most prestigious professional services firms, financial institutions, asset managers, consumer product brands, and governments with translation solutions that elevate the way they do business.

Our innovative machine translation tool is specifically trained for the Canadian legal and financial markets. It delivers complex, industry-specific translations with unprecedented quality and unmatched speed.

Our Enhanced Machine Translation Service Attributes: 

  • Simultaneously translates up to 100 documents in seconds
  • Flexible API integration with most website software for real-time translation of web content
  • Supports seamless integration with leading CAT tools used by in-house translation teams
  • Client-specific translation memory and term-base intelligence
  • Integrates seamlessly with Alexa’s suite of human translation services

Project description:

The high-level strategic goal is to build an adaptive automatic translation ecosystem to address the real needs of clients from the legal, financial, and government sectors and support the long-term sustainable growth of the company’s business model.

The A.I. Engine Modernization project will:

  • Develop an elastic data curation system for Machine Translation
  • Develop a Translation Memory mining system
  • Develop a cutting-edge custom Machine Translation service
  • Build an integrated online translation platform

Use Cases

  • Automatic preparation and application of custom data for a new user with no prior data
  • Automatic extraction of a translation term base for a user with translation memory data
  • Search for translation precedents
  • Customizable text translation
  • Customizable end-to-end document translation
  • Translation project management, post-edit, and precedent-based translation
  • Customizable translation in a 3rd-party application through the Alexa A.I. translation API

Methodology/Techniques

  • Automatic data augmentation
  • Intelligent sentence and word alignment
  • Scalable and adaptive data curation pipeline
  • Precedent translation search and retrieval
  • Term base extraction
  • Translation memory clustering
  • Custom Neural Machine Translation (CNMT) algorithm development
  • Custom Neural Machine Translation (CNMT) implementation
  • Self-Learning Rule-Based Machine Translation (SL-RBMT) algorithm development
  • Self-Learning Rule-Based Machine Translation (SL-RBMT) implementation
  • Standard CAT editor
  • Precedent translation-based CAT editor
  • Enhanced UI development
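
As an illustration of what "precedent translation search and retrieval" could look like at its simplest, the sketch below ranks translation memory entries by fuzzy similarity to a new source sentence. It is an assumption-laden example, not the project's method: the function name `search_precedents`, the in-memory list of (source, target) pairs, and the 0.6 threshold are all hypothetical, and the character-level matching from Python's standard `difflib` stands in for whatever retrieval technique the project would actually develop.

```python
from difflib import SequenceMatcher


def search_precedents(query, memory, threshold=0.6, top_k=3):
    """Return up to top_k (score, source, target) tuples, best match first.

    memory is a list of (source_sentence, target_sentence) pairs.
    The threshold and top_k defaults are illustrative assumptions.
    """
    scored = []
    for source, target in memory:
        # Character-level similarity in [0, 1]; case is ignored.
        score = SequenceMatcher(None, query.lower(), source.lower()).ratio()
        if score >= threshold:
            scored.append((score, source, target))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Hypothetical translation memory with two English-French pairs.
memory = [
    ("The agreement is governed by the laws of Ontario.",
     "La convention est régie par les lois de l'Ontario."),
    ("The fund invests primarily in Canadian equities.",
     "Le fonds investit principalement dans des actions canadiennes."),
]

matches = search_precedents(
    "This agreement is governed by the laws of Ontario.", memory
)
for score, source, target in matches:
    print(f"{score:.2f}  {source}  ->  {target}")
```

A production system would more likely index the memory and compare sentence embeddings or token-level edit distance, but the interface (query in, ranked precedents out) would stay much the same.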

Required expertise/skills: 

  • Hands-on experience in applied machine learning, predictive modeling, statistical analysis, and data mining
  • Hands-on experience with data science, big data, and data engineering
  • Bilingual candidates are preferred
  • Hands-on experience with various big data technologies in one or more ecosystems (AWS or Microsoft Azure)
  • Experience with popular Python ML/DL tools and libraries such as scikit-learn, OpenNMT-tf, pandas, and NumPy
  • Hands-on experience with TensorFlow Serving or PyTorch model serving
  • Hands-on experience with TensorFlow/PyTorch
  • Experience with MLOps tools such as MLflow
  • Solid understanding of foundational statistics concepts and ML algorithms: linear/logistic regression, Random Forest, boosting, XGBM, k-NN, Naive Bayes, Decision Trees, SVM, etc.
  • Excellent theoretical and practical knowledge of deep learning architectures such as LSTM, RNN, CNN, and Transformer-based models
  • Deep understanding of fundamental and recent NLP and NLU concepts and models, such as word2vec, GloVe, BERT, GPT, and T5
  • Hands-on experience working with deep learning libraries such as TensorFlow/PyTorch
  • Solid understanding of, and practical experience with, containerizing deep learning models for scalable deployment