This is my intern project at Grubhub. Given a user’s input image, it returns the $k$ most similar images in the database, where $k$ is also specified by the user.
Tools used include OpenCV, Theano, Keras and Google Vision.
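As a rough illustration of the retrieval step, here is a minimal sketch that ranks database images by similarity to a query. It assumes each image has already been turned into a feature vector (for example by a CNN) and that cosine similarity is the ranking measure; neither detail is stated above, and the names `cosine_similarity` and `top_k_similar` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k_similar(query_vec, db_vecs, k):
    """Return indices of the k database vectors most similar to query_vec.

    Assumption: similarity is cosine similarity over precomputed features.
    Ties are broken by database index so the result is deterministic.
    """
    scored = [(cosine_similarity(query_vec, vec), idx)
              for idx, vec in enumerate(db_vecs)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


# Toy 2-D "features" standing in for real image embeddings.
db = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
query = [1.0, 0.0]
nearest = top_k_similar(query, db, k=2)
```

A full sort is fine for a small database; for a large one, a partial sort or an approximate nearest-neighbor index would be the usual replacement.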
With a balanced retrieval dataset constructed from Food-11’s training set (11 categories in total), the system achieves a precision@5 of 0.64 and a precision@15 of 0.77 on its evaluation set.
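For reference, precision@k can be computed as sketched below. This assumes a retrieved image counts as relevant when it shares the query's category, which is the usual convention for a labeled dataset but is not confirmed here; the function names and toy labels are hypothetical and do not reproduce the reported numbers.

```python
def precision_at_k(retrieved_labels, query_label, k):
    """Fraction of the top-k retrieved items whose label matches the query.

    Assumption: "relevant" means "same category as the query image".
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top = retrieved_labels[:k]
    hits = sum(1 for label in top if label == query_label)
    return hits / k


def mean_precision_at_k(results, k):
    """Average precision@k over (query_label, retrieved_labels) pairs."""
    scores = [precision_at_k(retrieved, query_label, k)
              for query_label, retrieved in results]
    return sum(scores) / len(scores)


# Toy example: labels of the ranked results for two queries.
results = [
    ("soup", ["soup", "soup", "rice", "soup", "egg"]),
    ("rice", ["rice", "rice", "rice", "rice", "rice"]),
]
single = precision_at_k(results[0][1], "soup", 5)
average = mean_precision_at_k(results, 5)
```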
Snapshot of Sample Output