Table extraction for logistics and supply chain industries using machine learning - QC-465

Project type: Research
Preferred discipline(s): Engineering - computer / electrical, Engineering, Computer science, Mathematical sciences, Mathematics
Company: Mely.AI Technologies
Project length: More than one year
Desired start date: As soon as possible
Language requirement: English
Location(s): Montreal, QC, Canada
Number of positions: Flexible
Desired education level: College, Undergraduate/Bachelor's, Master's, PhD, Postdoctoral research
Open to applicants registered at an institution outside of Canada: No

About the company: Mely.AI Technologies is an AI-powered solutions company that helps enterprises in supply chain and logistics accelerate their digital transformation.

Our proprietary Smart Document Extraction engine automatically, accurately, and rapidly extracts key information from documents, including lengthy commercial invoices, complicated packing lists, and non-standardized bills of lading or waybills. It removes back-office inefficiencies, saving 90% of time and 80% of cost by eliminating manual labour from low-value activities such as data entry.

Project description:

Automated table extraction is an active field of research. Irregularities such as nested headers, merged cells, and missing grid lines have made generalized approaches difficult. In this project, we hope to apply the latest deep learning research to read borderless tables in standardized forms like invoices, certificates of analysis, and other corporate documents.

Our software platform currently reads templated manifests but has difficulty reading tables. By developing this technology, we can help logistics and supply chain companies in their digital transformation journeys. The candidate may see the direct impact of their work reflected on our platform.

The candidate would have the following responsibilities:

  • Overseeing and developing the models needed to train a generic, highly accurate model for table extraction
  • Setting up an MLOps pipeline to measure model performance in a production setting
  • Working with the current data scientist and with the software development and business teams to develop a commercial product

We expect to use mostly computer vision and deep learning algorithms, but may also use natural language processing algorithms. Methodologies may evolve with the candidate as the project develops.

Required expertise or skills:

The ideal candidate would have the following skills:

  • Background in form and document extraction
  • Experience with computer vision
  • Experience with deep learning in the computer vision domain
  • Experience deploying models to a cloud environment (AWS, Azure, GCP)
  • Presentation skills to offer insights about how the technology can help expand the commercial side of the business
  • Proficiency in Python (required)
  • Proficiency in English
  • Experience with OpenCV, Keras, TensorFlow, or scikit-learn

Experience with natural language processing is nice to have but not required.