Time series clustering and classification

Financial indicators of an individual firm may take the form of time series, vectors, or even richer data, such as text or images. The purpose of this work is to explore and develop methods for dealing with such data, and in particular for clustering or classifying it into similar groups. In the project, the intern will develop tools for determining whether a client should be issued a loan.

Large data manipulation research to bolster the Canadian insurance industry and the macro-economy

Nuera has been acquiring customers and data for the past 3 years and is launching an internal initiative to mine the data for insurance claims trends. The initiative will analyze and report on our data sets to find trends that drive claims frequency and severity, in an effort to reduce claims costs and, as a result, consumer insurance pricing. We will also identify customer behaviors and how those behaviors contribute to insurance purchasing and claims.

Aboveground Storage Tank (AST) testing using a statistical approach

Building on the original Statistical Inventory Reconciliation (SIR) Test Method (Quantitative), K-fold cross-validation is used to increase the probability of detection, P(D), and decrease the probability of false alarm, P(FA), by adjusting K, which affects both bias and standard deviation. There is a trade-off between bias and variance: very flexible models (overfit) have low bias and high variance, while relatively rigid models (underfit) have high bias and low variance. A larger K gives lower bias but a larger standard deviation. K-fold cross-validation is also particularly useful when the data set is small.
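The K-fold procedure mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say which model is fitted to the tank data, so a simple least-squares line stands in for it, and the function names (`k_fold_indices`, `fit_line`, `cross_val_mse`) are hypothetical. It does not compute P(D) or P(FA).

```python
import random
from statistics import mean

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x (stand-in model, an assumption)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def cross_val_mse(xs, ys, k, seed=0):
    """Return one held-out mean squared error per fold."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit on the k-1 training folds only, then score on the held-out fold.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
        scores.append(mean(errors))
    return scores
```

Calling `cross_val_mse(xs, ys, k)` for several values of `k` and comparing the mean and spread of the returned scores is one way to examine how the choice of K changes the estimate.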

Machine Learning Strategies in the Physical North American Power Markets

Machine learning techniques have been applied to the financial industry for some time. They have allowed large utilities and generators to better forecast their needs, and the prices they will pay, leading to a generally more efficient grid. However, very little research has been done that could benefit power marketers, who do not have a load to serve or a generating facility to manage. The application of machine learning techniques has yielded great results in the financial industry.

The Genetics of Blood Biomarkers in COPD

Chronic obstructive pulmonary disease (COPD) is a progressive airway disease characterized by persistent inflammation of the airways. It is a major cause of global morbidity and mortality and is predicted to become the third leading cause of death by 2020. Biomarkers may be useful for diagnosing the disease, given that the commonly used lung function measures correlate poorly with both symptoms and other measures of disease progression. However, the relationship between biomarkers and COPD is still elusive.

Anomaly detection and simulation for unlabeled sensor data

Rapid developments in statistics and machine learning have demonstrated unprecedented performance in making cognitive business decisions. Quartic.ai aims to use state-of-the-art machine learning technology to help manufacturers assess and maintain the quality of their industrial units, which suffer damage due to continuous usage and normal wear and tear. Such damage needs to be detected early to prevent further losses. The data in this domain are recorded using sensors at various stages in the process flow.

Modeling and Measuring Insurance Risks Considering IFRS 17 Framework

The objective of the project is to design a model that determines the capital requirements associated with property and casualty insurance business lines for an insurer that is compliant with the new IFRS 17 framework (international accounting framework). Several subcomponents of the model will be developed, such as a dynamic model embedding dependence for the evolution of incurred but not reported (IBNR) claims, a risk measurement component with risk measures and an allocation framework for capital requirements across the various business branches of the insurer.
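To make the risk measurement and allocation components more concrete, here is a minimal sketch. The abstract does not name the risk measures or the allocation rule, so everything here is an assumption: the example uses empirical Value-at-Risk (VaR) and Tail Value-at-Risk (TVaR) on simulated loss scenarios, and allocates the total TVaR to each business line by its average loss in the worst scenarios. All function names and the sample figures are hypothetical.

```python
def tail_indices(totals, level):
    """Indices of the worst scenarios, roughly the top (1 - level) share."""
    n = len(totals)
    m = min(max(1, round((1 - level) * n)), n)
    worst_first = sorted(range(n), key=lambda i: totals[i], reverse=True)
    return worst_first[:m]

def value_at_risk(totals, level):
    """Empirical VaR: the largest total loss that is not in the tail."""
    n = len(totals)
    m = len(tail_indices(totals, level))
    return sorted(totals)[max(n - m - 1, 0)]

def tail_value_at_risk(totals, level):
    """Empirical TVaR: the average total loss over the tail scenarios."""
    tail = tail_indices(totals, level)
    return sum(totals[i] for i in tail) / len(tail)

def allocate_capital(losses_by_line, level):
    """Split the total TVaR across lines by their average loss in the tail."""
    names = list(losses_by_line)
    n = len(losses_by_line[names[0]])
    totals = [sum(losses_by_line[name][i] for name in names) for i in range(n)]
    tail = tail_indices(totals, level)
    return {
        name: sum(losses_by_line[name][i] for i in tail) / len(tail)
        for name in names
    }

# Hypothetical losses for two business lines over ten simulated scenarios.
losses = {
    "property": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "casualty": [10, 9, 8, 7, 6, 5, 4, 3, 2, 30],
}
capital = allocate_capital(losses, level=0.9)
```

Because each line's share is measured in the same tail scenarios as the total, the allocated amounts add up to the TVaR of the aggregate loss, which is the main appeal of this kind of allocation rule.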

Advanced Data Science Research for Social Good II

Municipal governments and urban centres across Canada are being inundated with data that have the potential to improve public service. Despite this, local governments do not have enough data expertise to extract insight from these overwhelming datasets. At the same time, highly qualified personnel (HQP) in the domains of data science and urban analytics lack opportunities to work closely with local government to address this gap.

Machine learning applied to drilling in open pit mines

The project involves identifying changes in mineralization during the drilling of the blast holes. During drilling, an experienced driller is able, to a certain extent, to detect signals that indicate that a zone change is occurring: vibration in the cabin, rotation rate, etc.

Quantitative risk measurement techniques for insurers

This project will help Sun Life Financial build, implement and validate quantitatively sophisticated, state-of-the-art models of its risk portfolio. This will result in a better quantitative and qualitative understanding of the company's risk, liability and capital profile, and thus in a more effective risk management decision-making process.