Generative models for controlled generation of synthetic sequence-based datasets
At a high level, the goal of this project is to create a system for producing synthetic datasets based on real data. As a large financial crime detection firm, Verafin handles large volumes of sensitive data that must be kept private; however, the company is also interested in collaborating with academics to gain new insights into its data. In this project, we will use recent developments in generative modeling to design a system that creates synthetic datasets that are nearly indistinguishable from the real data they mimic, without exposing information that must be kept private. In addition to making it easier for Verafin to collaborate with external parties, the synthetic data would allow Verafin to create more realistic product demonstrations for potential clients.