Duplicate detection for billing systems

The merger of several subsidiaries into TELUS left some customer records duplicated across the combined data set. Algorithms are needed to identify these duplicate records. TELUS currently uses a deterministic algorithm for this. In this project, we will investigate whether machine learning can help detect duplicates. Solving this problem has several parts. First, we must preprocess the data and select features from the TELUS records that are useful to the model. Next, a probabilistic model must be selected, implemented, and tuned. Finally, the proposed model must be tested and compared with the system currently in use.
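As a rough illustration of the steps described above (preprocessing, feature selection, and a probabilistic model), the following minimal sketch scores a pair of records with a log-likelihood-ratio weight in the style of Fellegi-Sunter record linkage. This is not the project's method, which the description leaves open. The field names, the agreement probabilities, the similarity threshold, and the decision cutoff are all assumptions chosen for illustration, and the sample records are made up.

```python
import math
from difflib import SequenceMatcher

# Hypothetical field names; the real TELUS record schema is not described here.
FIELDS = ("name", "phone", "postal_code")

# Illustrative probabilities (assumed, not estimated from any data):
# m = P(field agrees | records are duplicates)
# u = P(field agrees | records are not duplicates)
M_PROB = {"name": 0.95, "phone": 0.90, "postal_code": 0.85}
U_PROB = {"name": 0.01, "phone": 0.001, "postal_code": 0.05}


def normalize(value):
    """Preprocessing: lowercase and keep only letters and digits."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def agrees(a, b, threshold=0.85):
    """Feature: True if two field values are similar after normalization.

    Missing values are treated as disagreement in this sketch.
    """
    a, b = normalize(a), normalize(b)
    if not a or not b:
        return False
    return SequenceMatcher(None, a, b).ratio() >= threshold


def match_score(rec_a, rec_b):
    """Sum per-field log-likelihood ratios; higher means more likely duplicate."""
    score = 0.0
    for field in FIELDS:
        m, u = M_PROB[field], U_PROB[field]
        if agrees(rec_a.get(field, ""), rec_b.get(field, "")):
            score += math.log(m / u)
        else:
            score += math.log((1 - m) / (1 - u))
    return score


def is_duplicate(rec_a, rec_b, cutoff=0.0):
    """Decision rule: flag the pair when the score exceeds an assumed cutoff."""
    return match_score(rec_a, rec_b) > cutoff


# Made-up example records.
a = {"name": "Jane A. Smith", "phone": "(604) 555-0100", "postal_code": "V6T 1Z4"}
b = {"name": "jane a smith", "phone": "604-555-0100", "postal_code": "V6T1Z4"}
c = {"name": "Robert Lee", "phone": "250 555 0199", "postal_code": "V8W 2Y2"}

print(is_duplicate(a, b))
print(is_duplicate(a, c))
```

In practice the m and u probabilities would be estimated from labelled or unlabelled record pairs rather than fixed by hand, and the cutoff would be tuned against the existing deterministic system's results.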

Faculty Supervisor:

David Poole

Student:

Bahare Fatemi

Partner:

TELUS

Discipline:

Computer science

Sector:

Information and communications technologies

University:

University of British Columbia

Program:

Accelerate
