Related projects
Discover more projects across a range of sectors and disciplines, from AI to cleantech to social innovation.
Generating synthetic data is important for a number of machine learning problems at Mastercard, especially generating additional data for imbalanced problems and enabling data sharing. The data is mostly tabular, and the literature offers a number of techniques for generating tabular data. However, most of these techniques do not work on large datasets or fail to generate differentially private datasets. We have already done some work in this area (see https://link.springer.com/chapter/10.1007/978-3-030-92310-5_60 ). However, the problem is not “solved” yet: it is difficult to generate differentially private datasets from large training sets, and metrics such as machine learning efficacy can drop sharply. The intern will be asked to improve on the algorithms currently available in the literature from both a privacy and an accuracy standpoint.
Wael El-Dakhakhni
Mastercard
Computer science
Professional, scientific and technical services
McMaster University
Accelerate
The strong support from governments across Canada, international partners, universities, colleges, companies, and community organizations has enabled Mitacs to focus on the core idea that talent and partnerships power innovation, and innovation creates a better future.