Development of an improved generative adversarial network method for data augmentation and its application in environmental and financial domains

This project aims to enlarge image datasets without running new experiments or collecting physical checks. Instead, image data augmentation is implemented with generative adversarial networks (GANs), which generate new images from the original images using different algorithms. GANs have a generator and a discriminator.

Enhanced Graph Convolutional Networks using Local Structural Information

Over the past few years, Graph Convolutional Networks (GCNs) have achieved state-of-the-art performance in machine learning tasks on graph data and have been widely applied to real-world problems across different fields, such as traffic prediction, user behavior analysis, and fraud detection. However, real-world networks often have heterogeneous degree distributions, such as power-law distributions.

Generative models for controlled generation of synthetic sequence-based datasets

At a high level, the goal of this project is to create a system for producing synthetic datasets based on real data. As a large financial crime detection firm, Verafin deals with large volumes of sensitive data that must be kept private. However, the firm is also interested in collaborating with academics to gain new insights into its data.

Hunt for the Super-Spreaders — A Complex Networks Approach

Contagious diseases, such as SARS and COVID-19, cause great damage to human life and the world economy. Pathogens spread among individuals through the contact network. Most social networks are observed to show a power-law degree distribution, implying that hubs exist in these networks. Finding the underlying super-spreaders (hubs) and isolating or immunizing them can dramatically reduce pathogen spread.
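The idea of finding hubs and isolating them can be illustrated with a minimal sketch. It assumes a tiny, hypothetical contact network (the `EDGES` list is invented for illustration and is not project data), treats the highest-degree person as the super-spreader, and uses the size of the largest connected group as a rough proxy for how far a pathogen could travel. The project itself may use quite different methods; this only shows the intuition.

```python
from collections import deque

# Hypothetical toy contact network: each pair is one contact between two people.
# Person 0 is deliberately given many contacts so that it acts as a hub.
EDGES = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (5, 6), (6, 7)]


def build_graph(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def top_hubs(adj, k):
    """Return the k nodes with the most contacts (ties broken by node id)."""
    return sorted(adj, key=lambda n: (-len(adj[n]), n))[:k]


def remove_nodes(adj, removed):
    """Return a copy of the network with the given nodes isolated (deleted)."""
    removed = set(removed)
    return {n: neigh - removed for n, neigh in adj.items() if n not in removed}


def largest_component(adj):
    """Size of the largest connected group, found by breadth-first search."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


contact = build_graph(EDGES)
hubs = top_hubs(contact, 1)            # the single best-connected person
after = remove_nodes(contact, hubs)    # network once that person is isolated

print("hub(s):", hubs)
print("largest group before:", largest_component(contact))
print("largest group after:", largest_component(after))
```

In this toy network, isolating the one hub shrinks the largest connected group from 8 people to 3, which is the effect the paragraph above describes. Degree is only one possible way to rank candidates for isolation.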

Interpretable dimensionality reduction of multivariate time series data using LSTM based autoencoders

Data collection over time is a common practice in many large organizations, including financial institutions and health care providers, often with the goal of using this data to predict future challenges and opportunities. While this data may contain valuable information, it is often unstructured, coming from different sources and recorded at different times. This lack of structure makes extracting useful information difficult, as most standard statistical and machine learning tools are designed to work with data in a fixed structure.

Comparative assessment of Machine Learning methods for fraud detection and improving the interpretability of the best model

Machine learning algorithms are being used in a wide range of applications. Machine learning is a branch of computer science in which a system learns from data and makes decisions. Financial fraud is an increasing hazard in the financial industry, and it is important to detect fraudulent transactions. Machine learning algorithms can be used to decide whether a transaction is fraudulent. After the system makes its prediction, it is important for users to understand the reason behind it.

Detecting Credit Transaction Fraudulent Behavior Using Recurrent Neural Networks

Fraudulent activities are hard to detect, yet they cost financial institutions millions of dollars in monetary losses and legal costs every year, as criminals find new, more sophisticated ways to conduct financial crime in credit transactions. This research project examines novel ways of detecting fraudulent behavior using powerful tools such as Recurrent Neural Networks, a type of machine learning model that is well suited to sequence or historical data.

Application of Different Machine Learning and Data Mining Algorithms in the Detection of Financial Fraud

Detection of financial fraud is a priority for financial institutions. A variety of techniques and models can be used to address the problem. However, as fraudsters become more inventive and adaptive, they have been able to penetrate conventional protective methods. This is one of the main reasons for the growth in financial fraud activity, despite the efforts of financial institutions, government, and law enforcement agencies.

Deep Fraud Detection

Financial fraud is a serious global issue that causes considerable damage at great expense. Statistical analysis and machine learning tools can help financial institutions detect different types of fraud. In some cases, however, mislabeling and the cost of classification may actually increase the volume of ‘false positives’ for supervised methods. Because normal transactions in financial domains far outnumber anomalous ones, it is challenging to classify anomalies correctly.
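A small worked example may help show why this imbalance is a problem. The sketch below assumes a hypothetical batch of 1,000 transactions of which 10 are fraudulent (invented numbers, not project data) and compares two made-up detectors: one that never flags anything, and one that catches most fraud but also flags many normal transactions.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives (1 = fraud, 0 = normal)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred):
    """Return accuracy, precision, recall and the false-positive count."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "false_positives": fp,
    }


# Hypothetical labels: 10 fraudulent transactions followed by 990 normal ones.
y_true = [1] * 10 + [0] * 990

# Detector A (hypothetical): never flags anything.
always_normal = [0] * 1000

# Detector B (hypothetical): catches 8 of the 10 frauds,
# but also flags 42 normal transactions by mistake.
eager = [1] * 8 + [0] * 2 + [1] * 42 + [0] * 948

baseline = summarize(y_true, always_normal)
flagging = summarize(y_true, eager)

print("never flags:", baseline)
print("flags often:", flagging)
```

Under these assumed numbers, the detector that never flags anything is 99% accurate yet finds no fraud, while the second detector finds 8 of the 10 frauds at the price of 42 false positives. Accuracy alone is therefore a poor guide when normal transactions dominate.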

Customer Segmentation Using Feature Set Generated from Customer Transaction Data

There is a high demand for automated fraud and money laundering detection and prevention systems, since such activities cost the financial industry millions every year. A key problem in detection techniques is the accurate and descriptive profiling of accounts. Thus, it is important to identify the salient features in transaction data that would enable us to profile accounts accurately.