Document Understanding
Managing digital content (documents, screenshots, webpages, etc.) spread across many apps and platforms is an increasingly pervasive problem for professionals and businesses alike. Most existing solutions focus on the storage and distribution of digital content, and there is still a gap in the market for a tool that addresses the content management problem by understanding the text content itself. Charli AI strives to address this gap by providing an easy-to-use platform for everything content-related, such as filing and folder organization, optimal search, team collaboration, and analysis-driven insights and actions (for example, reports and reminders).
We propose to conduct a set of empirical studies on document classification and information extraction from these documents. Specifically, character-level language models, pre-trained neural language models, and transfer learning techniques in zero-shot and few-shot settings will be studied. The studies aim to determine the best document embedding for document classification and the best Named Entity Recognition models for information extraction.
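As a rough illustration of few-shot document classification over character-level features, the sketch below embeds each document as a bag of character trigrams and assigns a new document to the label whose summed example vector is most similar under cosine similarity. It is a minimal stand-in under stated assumptions, not the project's method: the proposal concerns neural language models, whereas this uses plain n-gram counts, and the function names, labels, and sample texts are all hypothetical.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in the lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_centroids(labeled_examples):
    """Sum the n-gram counts of the examples for each label."""
    centroids = {}
    for text, label in labeled_examples:
        centroids.setdefault(label, Counter()).update(char_ngrams(text))
    return centroids


def classify(text, centroids):
    """Return the label whose summed vector is closest to the text."""
    vec = char_ngrams(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical few-shot training set: two examples per label.
few_shot = [
    ("Invoice number 1042, total amount due: $350.00", "invoice"),
    ("Invoice for consulting services, payment due in 30 days", "invoice"),
    ("Meeting notes: discussed project timeline and next steps", "notes"),
    ("Notes from the weekly team meeting on the roadmap", "notes"),
]

centroids = build_centroids(few_shot)
prediction = classify("Amount due on invoice 2210", centroids)
print(prediction)
```

In an empirical comparison of the kind described, `char_ngrams` is the piece that would be swapped for each candidate document embedding, while the nearest-centroid step stays fixed so that differences in accuracy can be attributed to the embedding.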
Fatemeh Hendijani Fard
Charli
Computer science
Information and cultural industries
The University of British Columbia - Okanagan
Accelerate