Document Engineering via Semantic Correlations - Year Two
We propose to develop smart algorithms for document generation by advancing the state of natural language processing and document intelligence. We envision a next generation of business applications that can parse and understand documents, compose documents automatically, and respond intelligently to voice commands. Our industrial partner, Koneka Inc., has a document automation platform that generates documents by assembling clauses (content blocks) based on a set of user inputs. We propose to extend the company's platform by designing advanced algorithms that meet its R&D needs. We will design and implement a taxonomy framework that supports a weighted synonym approach, together with an algorithm that efficiently measures the correlation between an input text and a database of objects. We also aim to define a cost-effective infrastructure that can analyze a very large set of input texts, for several types of documents in different domains.
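
As a purely illustrative sketch of what a weighted synonym taxonomy and a text-to-object correlation measure might look like, the Python example below maps surface terms to canonical concepts with a synonym weight and ranks stored objects by cosine similarity of their concept vectors. The taxonomy entries, the function names (concept_vector, correlation, rank_objects), and the choice of cosine similarity are all assumptions made for illustration; they are not the project's or Koneka's actual design.

```python
import math
import re
from collections import defaultdict

# Hypothetical taxonomy: surface term -> (canonical concept, synonym weight in [0, 1]).
# A weight of 1.0 marks the canonical term; lower weights mark looser synonyms.
TAXONOMY = {
    "lease": ("lease", 1.0),
    "rental": ("lease", 0.8),
    "tenant": ("tenant", 1.0),
    "renter": ("tenant", 0.9),
    "employee": ("employee", 1.0),
    "salary": ("compensation", 1.0),
    "wage": ("compensation", 0.7),
}


def concept_vector(text):
    """Map a text to a sparse vector of concept -> accumulated synonym weight."""
    vec = defaultdict(float)
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in TAXONOMY:
            concept, weight = TAXONOMY[token]
            vec[concept] += weight
    return dict(vec)


def correlation(a, b):
    """Cosine similarity between two sparse concept vectors (0.0 if either is empty)."""
    dot = sum(w * b.get(concept, 0.0) for concept, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_objects(text, objects):
    """Score every object description against the input text, best match first."""
    query = concept_vector(text)
    scored = [
        (name, correlation(query, concept_vector(description)))
        for name, description in objects.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical content blocks standing in for the database of objects.
OBJECTS = {
    "lease_clause": "The tenant shall pay the lease monthly",
    "pay_clause": "The employee receives a salary",
}

ranking = rank_objects("rental agreement for a renter", OBJECTS)
```

In this sketch the input text shares no literal words with either stored clause, yet the synonym weights still link "rental" and "renter" to the lease clause. A production system would likely need tokenization, stemming, and an index over the object database rather than a linear scan.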