Building an automated text mining algorithm to extract location-based information

Text documents often mention geographic locations by name. Mapping these place names to specific geographic locations that can be used in a Geographic Information System (GIS) requires considerable human effort. The task is especially challenging when the same place name refers to multiple places, as is common with waterbodies (e.g., lakes and rivers). In this project we will develop an integrated system that brings together Natural Language Processing, Machine Learning, and one or more existing text mining tools for geographic location extraction. The system will evaluate an online document and match any place names it contains with geographic locations that can be used by a GIS.
Goldstream Publishing Inc. operates The Angler’s Atlas, a sport fishing service that provides detailed maps and related information for lakes and rivers across North America. Keeping information current for the thousands of catalogued waterbodies is a challenge for the company. However, a significant volume of relevant user-generated information exists on various websites, and it could be used if location information could be attached to these online posts. The outcome of this project will be a tool that automatically assigns locations to user-generated online content.
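To make the place-name ambiguity problem concrete, the following is a minimal Python sketch of matching names in a post against a gazetteer and resolving duplicates from context. It is a simple rule-based stand-in, not the Natural Language Processing and Machine Learning system the project proposes. The `Place` record, the `GAZETTEER` entries, their coordinates, and the `find_places` function are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    name: str
    region: str
    lat: float  # placeholder coordinate, not real survey data
    lon: float  # placeholder coordinate, not real survey data


# Hypothetical gazetteer: the same waterbody name appears in two regions.
GAZETTEER = [
    Place("Long Lake", "British Columbia", 50.0, -120.0),
    Place("Long Lake", "Ontario", 45.0, -80.0),
    Place("Clear River", "Alberta", 56.0, -119.0),
]


def find_places(text, gazetteer=GAZETTEER):
    """Return (name, Place or None) for each gazetteer name found in text.

    A name shared by several places is resolved only when exactly one
    candidate's region is also mentioned in the text; otherwise the
    result is None, meaning the match is ambiguous.
    """
    lowered = text.lower()
    results = []
    for name in sorted({p.name for p in gazetteer}):
        pattern = r"\b" + re.escape(name.lower()) + r"\b"
        if not re.search(pattern, lowered):
            continue
        candidates = [p for p in gazetteer if p.name == name]
        in_context = [p for p in candidates if p.region.lower() in lowered]
        if len(in_context) == 1:
            results.append((name, in_context[0]))   # region cue resolves it
        elif len(candidates) == 1:
            results.append((name, candidates[0]))   # name is unique anyway
        else:
            results.append((name, None))            # still ambiguous
    return results


post = "Caught two rainbows at Long Lake last weekend, best trip in British Columbia."
for name, place in find_places(post):
    print(name, "->", place)
```

A post that names only "Long Lake" with no regional cue stays unresolved in this sketch, which is the case where richer contextual evidence of the kind the project targets would be needed.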

Negar Hassanpour
Faculty Supervisor: 
Dr. Liang Chen