Interfaces and algorithms to interactively improve medical datasets for machine learning - BC-372
Preferred Disciplines: Engineering / Computer Science. Master's or Ph.D.
Project length: 8 to 12 months (2 units)
Desired start date: May 1, 2018
Location: Vancouver, BC
No. of Positions: 1
Preferences: Must be able to work onsite at the Vancouver office. Language: English
Company: Galiano Medical Solutions Inc.
Galiano Medical Solutions Inc. is creating leading edge medical image processing technology that exploits machine learning to empower physicians and improve patient care.
The success of our algorithms depends on the availability of high-quality data, which in our current project means working with chest x-ray images (CXRs) that are accurately labeled with the findings that a radiologist would report in their examination.
Publicly available datasets exist to help us build our algorithms; for example, the National Institutes of Health (NIH) recently released a large set of labeled CXR images. The NIH collection has over 110,000 images – sufficient in quantity for machine learning studies – but to label a collection of this size, the labels were computer-generated from radiologists’ reports. Because of this automated reading process, many of the labels are incorrect. Presenting these “weakly-labeled” images to human experts for review allows the label quality to be improved, a critical step on the path to better image processing solutions.
The goal of this project is to identify and adapt algorithms and their corresponding user interfaces to make significant improvements to the weakly-labeled CXR dataset, while leveraging the limited time available from expert label reviewers. As the larger machine learning community – far beyond the medical field – relies on weakly-labeled but large publicly-available datasets, this study has a potentially broad impact.
Background and required skills
- Preferred methods for:
  - choosing which images to show an expert for re-labeling
  - re-labeling images that the expert doesn’t see, based on image-image similarity
- Measurements of the success of each of the algorithms attempted
- Data ingestion
- Front-end design
- Similarity training
- Algorithm experimentation
- Review and completion
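As a rough illustration of the two method families listed above, the following is a minimal sketch, not the project's chosen approach. It assumes a single binary finding per image, a model-estimated probability for each weak label, and a precomputed non-zero embedding vector per image. The function names (`select_for_review`, `propagate_labels`) and the specific techniques (entropy-based uncertainty sampling, k-nearest-neighbor voting on cosine similarity) are hypothetical choices made for the example.

```python
import numpy as np


def select_for_review(weak_label_probs, budget):
    """Pick the `budget` images whose weak label is least certain (highest entropy)."""
    p = np.clip(np.asarray(weak_label_probs, dtype=float), 1e-12, 1 - 1e-12)
    entropy = -(p * np.log(p) + (1 - p) * np.log(1 - p))
    # Highest entropy first; a stable sort keeps ties in input order.
    return np.argsort(-entropy, kind="stable")[:budget]


def propagate_labels(embeddings, reviewed_idx, reviewed_labels, k=3):
    """Relabel each image by majority vote of its k most similar expert-reviewed images."""
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # assumes no zero vectors
    reviewed_idx = np.asarray(reviewed_idx)
    reviewed_labels = np.asarray(reviewed_labels)
    # Cosine similarity of every image to every reviewed image.
    sims = x @ x[reviewed_idx].T
    k = min(k, len(reviewed_idx))
    nearest = np.argsort(-sims, axis=1, kind="stable")[:, :k]
    votes = reviewed_labels[nearest].mean(axis=1)
    labels = (votes >= 0.5).astype(int)
    labels[reviewed_idx] = reviewed_labels  # expert labels are kept as given
    return labels
```

In this sketch the expert's limited time is spent only on the images returned by `select_for_review`, and their corrections are then spread to unreviewed images by `propagate_labels`. Measuring success would require comparing the propagated labels against a held-out set of expert-verified labels.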
Expertise and Skills Needed:
- Essential Skills
  - Experience implementing numerical computational algorithms
  - Demonstrable knowledge of software development best practices
  - Proficiency in Python development in a Linux environment
  - Experience in at least three of the following Desired Skills
- Desired Skills
  - Machine Learning or Data Science solution development
  - Computer Vision application development
  - User Interface design and implementation
  - Docker, ReactJS, Bootstrap, TensorFlow, Flask or OpenCV development
  - Commercial software development in an Agile/DevOps environment
For more info or to apply to this applied research position, please