Leveraging Advanced NLP Models and LLMs for Document Set Understanding with Multi-Document Semantic Relations

Docugami is a document engineering company registered in British Columbia that transforms how businesses create and execute critical business documents. Leveraging breakthrough Natural Language Processing (NLP) and Large Language Models (LLMs), Docugami enhances productivity, compliance, and insight across industries such as finance, law, and business operations.
This project addresses a key objective for Docugami: improving document set understanding by developing advanced methods for understanding, chunking, and extracting semantic relations across large collections of similar business documents. Traditional approaches struggle with complex document structures, limiting automation and knowledge discovery. Additionally, connecting and understanding the relations between pieces of document information, which are usually lost within the documents, plays a vital role in driving business efficiency, streamlining processes, and unlocking valuable insights within the business sector.
The anticipated benefits for Docugami include enhanced AI-driven document intelligence, leading to more accurate information retrieval, document summarization, and knowledge extraction, and an improved ability to create integrated and intelligent business systems that better serve the needs of small businesses and large organizations. The project also aims to contribute to the open-source community by documenting findings, sharing code, and collaborating on projects related to document understanding using LLMs.

Faculty Supervisor:

Gerald Penn

Student:

Partner:

Docugami Canada, Inc

Discipline:

Computer science

Sector:

Information and cultural industries

University:

University of Toronto

Program:

Accelerate
