Cross-Modal Recipe Retrieval

The goal of cross-modal recipe retrieval is to design systems that can find a digital recipe given a user's image of the food, or find a food image given the recipe's ingredients or cooking instructions. Such a cross-modal retrieval task requires a common image-text representation space that embeds the semantic information of each modality along with the cross-modal mutual information. With the advent of large-scale datasets such as Recipe1M, the scalability-accuracy tradeoff of cross-modal embedding methods has drawn growing attention in recent years. The main goal of this project is to use (and improve) state-of-the-art cross-modal embedding methods to retrieve a recipe from a large recipe dataset with low latency and computational demand, and to recommend similar recipes based on the queried recipe.
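
To make the idea of a shared image-text space concrete, the following is a minimal sketch of retrieval by cosine similarity. It is not the project's method: the embedding vectors are hand-written toy values standing in for the outputs of trained image and recipe encoders, and the names `cosine`, `retrieve`, `recipe_index`, and `image_embedding` are hypothetical. The sketch also assumes every vector is non-zero.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query, index, k=2):
    """Return the names of the k index entries most similar to the query."""
    scored = [(cosine(query, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Toy recipe embeddings (assumed values; a real system would obtain these
# from a trained recipe text encoder).
recipe_index = {
    "margherita pizza": [0.9, 0.1, 0.0],
    "pepperoni pizza": [0.8, 0.3, 0.1],
    "chocolate cake": [0.0, 0.2, 0.9],
}

# Toy embedding of a query food image in the same space (assumed value).
image_embedding = [1.0, 0.1, 0.0]

top_recipes = retrieve(image_embedding, recipe_index, k=2)
print(top_recipes)
```

The same `retrieve` function could serve the recommendation step by passing a recipe's own embedding as the query. This exhaustive scan costs time linear in the number of recipes, which is where the scalability-accuracy tradeoff arises for a dataset the size of Recipe1M.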

Faculty Supervisor:

Scott Sanner

Student:

Partner:

LG Electronics Canada, Inc.

Discipline:

Computer science

Sector:

Professional, scientific and technical services

University:

University of Toronto

Program: