Email Mining, Modeling, and Visualization

This project will develop and test a data mining, visualization, and modeling technique designed specifically for email, using publicly available datasets. The mining stage will consist of gathering email and other potentially related datasets and cleaning them. Cleaning will involve removing duplicate or unnecessary information and labeling the data with basic information to ease training in the later steps. Next, the data will be visualized (as graphs, charts, etc.) so that it can be more easily understood and a training model can be developed. Lastly, the data will be analyzed to obtain a statistical model that provides new insight into the information, such as its general topics, the flow of conversations, and sentiment, in each case using novel and modern algorithms and processes.

Cole Boudreau
Faculty Supervisor: Frank Rudzicz
Partner University: