Related projects
Discover more projects across a range of sectors and disciplines, from AI to cleantech to social innovation.
To validate structured trade contracts for correct language and economic terms, existing document understanding systems use machine learning, natural language understanding, and text analysis to extract data elements from financial documents. This approach can be extended to a wide variety of financial documents, especially customer-provided reference material. The project focuses on document information extraction: extracting the key data elements from financial documents and validating them against the internal source of record.
The existing OCR-based approach suffers from error propagation (especially on noisy input), slow processing due to the large model size, and the need for retraining on new document classes. To address these issues, the first objective is to improve OCR engine performance by exploring OCR engines combined with pre-processing and post-processing techniques. The second objective is to implement OCR-less models that learn jointly from image and text context.
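As a rough illustration of the post-processing and validation steps described above, the following Python sketch cleans OCR'd field values and compares them with a source of record. It is not the project's implementation: the field names, the table of look-alike characters, and the 0.9 similarity threshold are all assumptions chosen for the example, and only the Python standard library is used.

```python
import re
from difflib import SequenceMatcher

# Assumed (hypothetical) OCR confusions for numeric fields:
# letters that are commonly misread in place of digits.
DIGIT_FIXES = str.maketrans(
    {"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"}
)


def clean_amount(raw: str) -> str:
    """Post-process a numeric field: fix look-alike characters, drop separators."""
    fixed = raw.translate(DIGIT_FIXES)
    return re.sub(r"[^0-9.]", "", fixed)


def clean_text(raw: str) -> str:
    """Collapse whitespace and lowercase so comparison ignores layout noise."""
    return " ".join(raw.split()).lower()


def validate(extracted, record, numeric_fields, threshold=0.9):
    """Return {field: bool} for every field in the source of record.

    Numeric fields must match exactly after cleaning; text fields must
    reach the (assumed) similarity threshold.
    """
    results = {}
    for field, expected in record.items():
        raw = extracted.get(field, "")
        if field in numeric_fields:
            results[field] = clean_amount(raw) == clean_amount(expected)
        else:
            ratio = SequenceMatcher(
                None, clean_text(raw), clean_text(expected)
            ).ratio()
            results[field] = ratio >= threshold
    return results


# Hypothetical OCR output and source-of-record values.
extracted = {
    "notional": "1,OOO,000.00",
    "counterparty": "Acme  Capita1 Ltd",
    "currency": "USD",
}
record = {
    "notional": "1000000.00",
    "counterparty": "Acme Capital Ltd",
    "currency": "CAD",
}

report = validate(extracted, record, numeric_fields={"notional"})
# report == {"notional": True, "counterparty": True, "currency": False}
```

Exact matching for amounts and tolerant matching for names is one possible design choice: a single misread character in a counterparty name is usually recoverable, while a tolerated difference in a numeric term could hide a real discrepancy.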
Gerald Penn
Scotiabank
Computer science
Finance and Insurance
University of Toronto
Accelerate
The strong support from governments across Canada, international partners, universities, colleges, companies, and community organizations has enabled Mitacs to focus on the core idea that talent and partnerships power innovation, and innovation creates a better future.