Related projects
Discover more projects across a range of sectors and disciplines, from AI to cleantech to social innovation.
A large amount of health-related data is available only in unstructured form (free-form text). To share this data for secondary purposes, it must first be de-identified to protect against inappropriate disclosure of personal health information (PHI). PARAT Text is Privacy Analytics' de-identification software for unstructured data. It automatically discovers and marks PHI in a variety of document formats using gazetteers and a set of rules. The primary limitation of this tool is that it depends on the knowledge of human experts and on gazetteer lists, and it lacks contextual knowledge. I plan to explore unsupervised and semi-supervised machine learning approaches to make PHI discovery more robust. This should yield more robust methods for handling text data, which might broaden the partner organization's consumer base.
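To make the gazetteer-plus-rules idea concrete, here is a minimal Python sketch of that style of PHI marking. It is not PARAT Text's implementation: the gazetteer entries, the regular-expression rules, and the function names `find_phi` and `mark_phi` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical gazetteer entries, for illustration only.
GAZETTEER = {
    "NAME": ["Jane Doe", "John Smith"],
    "LOCATION": ["Ottawa"],
}

# Hypothetical pattern rules, for illustration only.
RULES = {
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def find_phi(text):
    """Return sorted (start, end, label) spans from gazetteer lookup and rules."""
    spans = []
    for label, entries in GAZETTEER.items():
        for entry in entries:
            pattern = re.compile(r"\b" + re.escape(entry) + r"\b", re.IGNORECASE)
            spans.extend((m.start(), m.end(), label) for m in pattern.finditer(text))
    for label, pattern in RULES.items():
        spans.extend((m.start(), m.end(), label) for m in pattern.finditer(text))
    return sorted(spans)


def mark_phi(text):
    """Replace each detected span with a [LABEL] tag, skipping overlapping spans."""
    out, cursor = [], 0
    for start, end, label in find_phi(text):
        if start < cursor:
            continue  # span overlaps one that was already marked
        out.append(text[cursor:start])
        out.append(f"[{label}]")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)
```

In this sketch, a sentence such as "Seen by J. Doe." passes through unmarked because "J. Doe" is not in the list, which illustrates the dependence on gazetteer coverage and the lack of contextual knowledge described above.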
Diana Inkpen
Varada Kolhatkar
Privacy Analytics
Engineering - computer / electrical
Information and communications technologies
University of Ottawa
Accelerate