A.I. Translation Engine Modernization & Enhancements - ON-674

Project type: Research
Desired discipline(s): Engineering - computer / electrical, Engineering, Computer science, Mathematical Sciences, Languages and linguistics, Social Sciences & Humanities
Company: Alexa Translations
Project Length: Longer than 1 year
Preferred start date: As soon as possible.
Language requirement: Bilingual
Location(s): Toronto, ON, Canada; Canada
No. of positions: 2
Desired education level: PhD, Postdoctoral fellow
Open to applicants registered at an institution outside of Canada: No

About the company: 

At Alexa Translations, we build trust. Since 2002, we have grown our reputation in the language services industry by forging long-term relationships in the legal, financial, marketing, technical, and government sectors, delivering customized and premium service. Helping our clients reach their business goals is the foundation of our success.

We help the world’s largest and most prestigious professional services firms, financial institutions, asset managers, consumer product brands, and governments with translation solutions that elevate the way they do business.

Our innovative machine translation tool is specifically trained for the Canadian legal and financial markets. It delivers complex, industry-specific translations with unprecedented quality and unmatched speed.

Our Enhanced Machine Translation Service Attributes:

  • Simultaneously translates up to 100 documents in seconds
  • Flexible API integration with most website software for real-time translation of web content
  • Supports seamless integration with leading CAT tools used by in-house translation teams
  • Provides client-specific translation memory and term-base intelligence
  • Integrates seamlessly with Alexa’s suite of human translation services

Describe the project:

The Alexa Translations A.I. Engine needs to be modernized from both an architecture and an infrastructure perspective. The plan is to move from a monolithic architecture to a serverless architecture and to introduce new features, such as Document-level Machine Translation and Adaptive/Responsive translation, that will keep us ahead of our competitors.

The A.I. Translation Engine Modernization & Enhancements project will:

  • Enable our Translation Engine to provide accurate segment translation based on its document context and ensure consistent translation of terms and phrases throughout the document
  • Enable the A.I. Engine to learn from human post-edits in real time, significantly improving translation quality and translator productivity
  • Enable our A.I. Engine to become a cloud-native application

The tasks will include:

  • Curate document-level bitext data: Curate document-level bitext data in the law/finance/government domains
  • Develop and implement document-level MT algorithms: Survey, develop, implement, and evaluate English-French document-level MT algorithms that meet both accuracy and latency goals
  • Develop responsive MT algorithms: Survey and develop English-French responsive MT algorithms
  • Implement responsive MT backend: Implement English-French responsive MT backend
  • Test and deploy responsive MT system: Test and deploy the full English-French adaptive MT system that meets accuracy, latency, and human-oriented goals

Methodology/Techniques:

  • Data augmentation, neural sentence alignment, text classification, etc.
  • Concatenation-based methods, context encoding, extended attention models, etc.
  • Attention-based model, optimizer and learning rate experimentation, batch fine-tuning, hybrid model updating program
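As a rough illustration of the concatenation-based methods listed above, the sketch below prepends a fixed number of preceding source sentences to each sentence before it is sent to the MT model. It is a minimal example under stated assumptions, not a description of the Alexa Translations engine: the function name `build_context_inputs`, the `<sep>` separator token, and the `context_size` parameter are all hypothetical.

```python
from typing import List

# Hypothetical separator token (an assumption for this sketch); a real system
# would reserve a dedicated symbol in the model vocabulary.
SEP = "<sep>"


def build_context_inputs(
    sentences: List[str], context_size: int = 2, sep: str = SEP
) -> List[str]:
    """Prefix each source sentence with up to `context_size` preceding sentences.

    The result is one input string per sentence, in document order, with the
    context sentences and the current sentence joined by the separator token.
    """
    if context_size < 0:
        raise ValueError("context_size must be non-negative")

    inputs = []
    for i, current in enumerate(sentences):
        # Take at most `context_size` sentences immediately before sentence i.
        start = max(0, i - context_size)
        window = sentences[start:i] + [current]
        inputs.append(f" {sep} ".join(window))
    return inputs


if __name__ == "__main__":
    doc = [
        "The Lender shall notify the Borrower.",
        "It shall do so in writing.",
        "Notice is deemed received after five days.",
    ]
    for line in build_context_inputs(doc, context_size=1):
        print(line)
```

With `context_size=1`, the second input carries the first sentence as context, which gives the model a chance to resolve "It" to "the Lender". Larger windows supply more context at the cost of longer inputs and higher latency, which is the accuracy/latency trade-off named in the tasks.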

Required expertise/skills: 

  • Hands-on experience in applied machine learning, predictive modeling and analysis, statistical analysis, and data mining
  • Experience with popular Python ML/DL tools and libraries such as scikit-learn, OpenNMT-tf, pandas, NumPy, etc.
  • Hands-on experience with TensorFlow Serving or TorchServe
  • Hands-on experience with TensorFlow/PyTorch
  • Experience with MLOps tools such as MLflow
  • Solid understanding of foundational statistics concepts and ML algorithms: linear/logistic regression, Random Forest, boosting, XGBoost, k-NN, Naive Bayes, Decision Trees, SVM, etc.
  • Excellent theoretical and practical knowledge of deep learning architectures such as RNN, LSTM, CNN, and Transformer-based models
  • Deep understanding of fundamental and recent concepts in NLP and NLU, such as word2vec, GloVe, BERT, GPT, and T5 models
  • Hands-on experience working with deep learning libraries such as TensorFlow/PyTorch
  • Solid understanding of, and practical experience with, containerizing deep learning models for scalable deployment