Topic Segmentation for Text Mining on Legal Documents


Text mining is the process of automatically extracting knowledge from unstructured, natural language documents. It aims to support users in dealing with large amounts of textual information. Examples of specific text mining tasks are entity detection, summarization, and opinion mining. Due to the complexity and ambiguity of natural language, this analysis is broken down into individual processing steps, which are based on techniques from the fields of machine learning, natural language processing, and semantic computing.

Parameter Uncertainty and Model Adequacy for GLMs Applied to Property/Casualty Insurance Data

Accurate forecasting is of crucial importance in managing insurance risks and ensuring a solvent and profitable operation. In recent years, the property/casualty insurance industry has adopted generalized linear models (GLMs) to improve the fit and prediction accuracy of its insurance portfolio models. Yet the interdependence between the different insurance covers included in packaged products, such as car insurance, needs to be accounted for in the GLM in order to be included in the predictive process.

Connecting to Cascades

Each summer, Globalink students undertake a research project with a Canadian university, which allows them to experience state-of-the-art research facilities and Canadian society and to build friendships with local students.

Portrait of a past landscape to chart the future

Her research project, in which she developed a portrait of the preindustrial forest in the Mauricie region and which helped AbitibiBowater gain a desired environmental certification, fit very well with her wider research interests as a modeler. “Accelerate provides something relatively few other internship programs offer: its short, four to six month time frame, gives you a lot of flexibility to pursue a very defined project within a larger research context,” Dr. Tittler explains.

Word Segmentation in Handwritten Documents Using Genetic Programming

Word segmentation in handwritten documents is a difficult task because the intra-word spacing (i.e. the space between parts of the same word) is sometimes wider than the inter-word spacing (i.e. the space between two consecutive words). Many different approaches to segmenting words have been proposed so far. However, these segmentation approaches usually rely on manually tuned parameters, meaning that they do not take the properties of the document into account in order to calibrate the parameters automatically.
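
As a rough illustration of what calibrating a parameter from the document itself can mean, the sketch below derives a gap threshold from the gaps measured on a page instead of using a fixed value. It is not the genetic programming approach of this project: it swaps in a simple Otsu-style split of the gap widths, and the function names, the gap values and the assumption that gaps have already been measured in pixels are all hypothetical.

```python
from typing import List, Sequence


def calibrate_gap_threshold(gaps: Sequence[float]) -> float:
    """Choose a gap threshold from the document's own gap widths.

    Otsu-style: try every split of the sorted gaps into "small" and
    "large" groups and keep the one with the highest between-class
    variance. Illustrative stand-in only, not the project's method.
    """
    ordered = sorted(gaps)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two gaps to calibrate a threshold")
    total = sum(ordered)
    best_score, best_threshold = -1.0, float(ordered[0])
    left_sum = 0.0
    for i in range(1, n):
        left_sum += ordered[i - 1]
        if ordered[i] == ordered[i - 1]:
            continue  # no threshold fits between two equal values
        w_left, w_right = i / n, (n - i) / n
        mean_left = left_sum / i
        mean_right = (total - left_sum) / (n - i)
        score = w_left * w_right * (mean_left - mean_right) ** 2
        if score > best_score:
            best_score = score
            best_threshold = (ordered[i - 1] + ordered[i]) / 2
    return best_threshold


def mark_word_breaks(gaps: Sequence[float], threshold: float) -> List[bool]:
    """Flag each gap as a word break (True) or a within-word gap (False)."""
    return [g > threshold for g in gaps]


# Hypothetical gap widths (pixels) from two documents; the second writer
# uses spacing three times as wide as the first.
narrow_doc = [2, 3, 2, 14, 3, 2, 15, 3]
wide_doc = [6, 9, 6, 42, 9, 6, 45, 9]

narrow_threshold = calibrate_gap_threshold(narrow_doc)
wide_threshold = calibrate_gap_threshold(wide_doc)
```

With a threshold tuned by hand on the first document and reused on the second, the within-word gaps of width 9 would be misread as word breaks; recomputing the threshold per document avoids that in this toy case.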

Research topics in oil lubrication for aircraft gas turbine engines

All mechanical systems with moving parts face a serious challenge due to the friction phenomena that occur at the interface of two surfaces in relative motion. The effect of friction, which is the slow degradation of the surfaces in contact, could be significantly reduced through lubrication. Although a significant number of investigations have been carried out on friction phenomena, there is limited academic interest in applying the findings to the real world. The present set of internships is directed towards the implementation of such findings.

Developing highly sensitive biosensors for MRSA bacterial detection

Nosocomial infections are a growing problem in Canadian hospitals: they can kill as many as 8,000 patients per year, and the associated expenses reach at least $100 million annually. Clostridium difficile (C. difficile) and methicillin-resistant Staphylococcus aureus (MRSA) are among the most common bacteria involved. For example, C. difficile killed more than 600 people in Quebec alone between 2003 and 2005. Early detection of bacteria will improve both the control of their spread to multiple patients in hospitals and the efficacy of treatment.

Keywords Detection in Handwritten Documents

The long-term aim of this project is to develop techniques and software for the processing of unconstrained handwritten documents. The short-term goals are 1) the enhancement, “de-noising” and removal of artifacts in degraded digital handwritten document images, 2) text-line and word segmentation independent of scripts or symbols and 3) identification of a small set of keywords in handwritten document images for document classification, retrieval or other purposes.