NLP Techniques for Automated Entity Recognition

The primary goal of this project is to explore new and existing Natural Language Processing (NLP) techniques to improve the performance and further the automation of Knote’s text analysis software, specifically its entity recognition. Entity recognition is the process of identifying every word or group of words in a collection of documents that belongs to a given entity category, such as proper names or chemical compounds. We will study the applicability of classic, statistically driven approaches to classification and evaluate the viability of newer techniques that make use of semantic encoding (such as word2vec).

Colton Chapin
Cole Boudreau
Matthew Arnold
Faculty Supervisor: Frank Rudzicz
Partner University: