Going Beyond Thin Credit, the use of Account Data

The business partner is interested in finding ways to further automate small business lending and annual renewals. In recent years, as computing, data storage, and data movement have become more efficient, the use of deposit data in lending has become more prevalent in the industry. Among industry risk managers, it is widely accepted that deposit account information is strongly predictive of borrower risk level. However, there are no widespread industry tools, comparable to credit scores, that make use of deposit data.

New Order of Risk Management: Theory and applications in the era of systemic risk (NORM)

To transform the way we think of and manage risk, in this research program we develop a comprehensive theory of systemic risk that combines the physical and social dimensions of risk, its spatial and temporal domains, and its primary and secondary channels of impact within a framework that acknowledges differences in risk vulnerability and susceptibility.

Managing tree models plasticity and mixing GLMs with regression trees for insurance ratemaking

Predicting policyholders' claims over a year is crucial for a Property-Casualty insurance company. These expenditures, popularly called losses, are incurred by the insurer when reimbursing the policyholders' claims. The insurance company is required to pay any legitimate claim made by a policyholder; in exchange, the policyholder pays the company an amount of money, called the premium, to buy this entitlement. The annual premium must be calculated with precision to ensure a fair deal on both sides.
It is the task of actuaries to set premiums for all policyholders; this is called ratemaking.

Statistical framework and methodology for risk and privacy in complex and high-dimensional data

Modern data collection and storage result in complex and high-dimensional databases: they include a large number of variables with many interactions. At the same time, accessing and releasing information that is, or is derived from, personal information involves complex challenges in terms of the potential for inappropriate disclosure (e.g., identification).
In this project we propose to develop a statistical methodology that can inform the evaluation of privacy assurances while preserving the statistical utility of complex, high-dimensional health data.

Simulator for Distributed Quantum Computation

Distributed computing is a model in which components of a software system are shared among multiple computers to improve efficiency and performance. The growing interest in cloud computing scenarios that incorporate both distributed computing capabilities and heterogeneous hardware presents a significant opportunity for network operators. The aim of this research is to develop a purpose-built discrete-event simulator for distributed quantum computing and to identify further challenges and open problems arising from the design of a distributed quantum computing ecosystem.
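
As a rough illustration of what "discrete-event" means here, the sketch below shows a generic event loop: events sit in a priority queue ordered by timestamp, and the simulated clock jumps from one event to the next. It is not the project's simulator. The class name Simulator, its methods, and the idea of passing the simulator to each event callback are all assumptions made for this example, and nothing quantum-specific is modelled.

```python
import heapq
import itertools


class Simulator:
    """Minimal discrete-event loop (hypothetical sketch, not the project's design)."""

    def __init__(self):
        self.now = 0.0                      # current simulated time
        self._queue = []                    # heap of (time, sequence, action)
        self._counter = itertools.count()   # tie-breaker for events at equal times

    def schedule(self, delay, action):
        """Schedule `action(simulator)` to run `delay` time units from now."""
        if delay < 0:
            raise ValueError("delay must be non-negative")
        heapq.heappush(self._queue, (self.now + delay, next(self._counter), action))

    def run(self, until=float("inf")):
        """Process events in timestamp order until the queue empties or `until` is passed."""
        while self._queue and self._queue[0][0] <= until:
            time, _, action = heapq.heappop(self._queue)
            self.now = time                 # the clock jumps straight to the event
            action(self)
```

An event callback can itself call schedule, which is how a message sent by one simulated node could trigger a later arrival event at another node.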

Developing Proprietary Natural Resource Business Models with Machine Learning Approaches for Feedstocks Susceptible to a By-Product First Solution

This research project aims to employ advanced machine learning and data science models to develop proprietary natural resource business models that identify feedstocks susceptible to a by-product first solution.

Complex network based data analysis for shared mobility

This project aims to build a new analysis toolkit for shared bikes, e-scooters, and cars. The method is based on a mathematical theory called complex network theory. Like the internet and the human brain, a transportation system possesses a network structure. To study its topological properties, we can calculate indices that encode information about the network. Theoretical and empirical studies have proposed and examined various models based on the spatial network structure of transportation, and especially of shared mobility.
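
To make the idea of network indices concrete, here is a minimal sketch that computes three standard ones (node degree, graph density, and local clustering coefficient) from a list of trips. The choice of these three indices, the function name network_indices, and the representation of a trip as an undirected (start station, end station) pair are assumptions for illustration; the project's toolkit may use different indices and a directed or weighted network.

```python
from itertools import combinations


def network_indices(trips):
    """Compute simple topological indices for an undirected station network.

    `trips` is an iterable of (station_a, station_b) pairs (hypothetical format).
    Repeated trips and trips that start and end at the same station are ignored.
    """
    neighbors = {}
    for a, b in trips:
        if a == b:
            continue
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    n = len(neighbors)
    m = sum(len(s) for s in neighbors.values()) // 2   # each edge is counted twice
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0

    degree = {node: len(s) for node, s in neighbors.items()}

    clustering = {}
    for node, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            clustering[node] = 0.0
            continue
        # Count pairs of neighbours that are themselves directly connected.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in neighbors[u])
        clustering[node] = 2 * links / (k * (k - 1))

    return {"density": density, "degree": degree, "clustering": clustering}
```

For example, calling network_indices([("S1", "S2"), ("S2", "S3"), ("S1", "S3"), ("S3", "S4")]) describes a triangle of stations with one extra station attached to S3, so S3 has the highest degree and S1 has a clustering coefficient of 1.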

Sentiment analysis on cryptocurrencies and stocks

The objective of this project is to create market sentiment indicators for cryptocurrencies. Market sentiment indicators are built from text analysis of exchanges on Twitter and other sources, which may include embedded or referenced images (e.g., price curves) and videos. The text dataset for cryptocurrencies may have a very different structure from that of traditional currencies. For example, cryptocurrency tweets may reference ‘mining’, which is a concept that does not exist for traditional currencies.

Evaluation of the accuracy of the office-based virtual surgical planning for Orthognathic and Implant surgery

Orthognathic surgery is a procedure used to correct facial deformities and is a mainstay treatment in the oral and maxillofacial surgery field. Another common procedure in the field of oral and maxillofacial surgery is dental implant surgery that is used to replace missing teeth. Currently, dental implants are one of the most preferred treatment options for recreating tooth form and function. The surgical procedure is complex and requires extensive planning and accuracy to obtain a successful outcome.

Combating money laundering networks with the firefighting problem

The Firefighter Problem is a deterministic, discrete-time model of the spread of a fire on the nodes of a graph. Consider a network in which bank accounts are nodes and an edge between two accounts represents a transaction from one account to the other. Imagine we have a suspicious bank account with suspicious transactions possibly tied to money laundering. We view this suspicious bank account as the place where a fire breaks out. Then, those accounts that receive money from the suspicious bank account are considered suspect.
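
The spreading process described above can be sketched as a small simulation. In each round a limited number of accounts are protected (for example, frozen or reviewed), and suspicion then spreads to every unprotected account that received money from an already suspect account. This is only an illustration under stated assumptions: the function name simulate_firefighter, the budget parameter, and the greedy rule of protecting the threatened account with the most outgoing transactions are hypothetical choices, and the sketch only ever protects accounts that are about to be reached, which is a simplification of the general problem.

```python
def simulate_firefighter(edges, source, budget=1):
    """Simulate fire spread with a greedy defender (hypothetical sketch).

    `edges` is an iterable of (payer, payee) pairs, `source` is the account
    where the fire starts, and `budget` is how many accounts can be protected
    per round. Returns the sets (burned, protected).
    """
    successors = {source: set()}
    for payer, payee in edges:
        successors.setdefault(payer, set()).add(payee)
        successors.setdefault(payee, set())

    burned = {source}
    protected = set()
    while True:
        # Accounts that would catch fire this round if left unprotected.
        threatened = {v for u in burned for v in successors[u]} - burned - protected
        if not threatened:
            break
        # Greedy assumption: protect the threatened accounts that pay the most others.
        chosen = sorted(threatened, key=lambda a: (-len(successors[a]), a))[:budget]
        protected.update(chosen)
        burned |= threatened - protected
    return burned, protected
```

With transactions A→B, A→C, B→D, C→E, C→F and the fire starting at A, the greedy rule first protects C (which pays two accounts), then D, so only A and B end up burned.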