Topic Segmentation for Text Mining on Legal Documents

Text mining is the process of automatically extracting structured knowledge from
unstructured, natural language documents. It aims to support users in dealing with large
amounts of textual information. Examples of specific text mining tasks are entity detection,
summarization, and opinion mining. Due to the complexity and ambiguity of natural language,
this analysis is broken down into individual processing steps, which are based on techniques
from the fields of machine learning, natural language processing, and semantic computing.

In this project, the goal is to enrich the text mining pipelines developed at KeaText for the
processing of legal documents. Specifically, the analysis is to be enriched with a topic
segmentation module that is tailored to the specific domain and application requirements.
Automatic topic segmentation, also known as text tiling, structures documents into individual
parts, each representing a distinct theme. Topic segmentation is known to improve several
information retrieval and text analysis tasks. In this project, the following tasks are to be
completed...

Faculty Supervisor: Rene Witte

Student:

Partner: Keatext

Discipline: Computer science

Sector: Manufacturing

University: Concordia University

Program: Accelerate
