Problem statement

Imagine you have just come back from a vacation and are starting to plan your next adventure. You have a few questions:

  • Where should I travel next?

  • What if I want somewhere similar?

  • What about somewhere new and different?

  • I am lazy… Is there an app for that?

 

Solution overview

The answer is yes. Introducing MemoTrek!

MemoTrek is an application that takes your travel photos as input and makes personalized recommendations for future travel destinations. It provides two types of recommendations: you-may-also-likes for a similar type of experience and something-different for new adventures.

Strategy

As the name suggests, Memo stands for memory, meaning the user's photo library; Trek stands for the database from which recommendations are made. This is essentially a mapping problem: which entry in the database is most similar to the user input? It is not a trivial one, however, because the two sides are different kinds of data: the user input is images, while the database entries are text. The task is therefore to transform both data types into the same space for an apples-to-apples comparison.

I started with the user library and applied a convolutional neural network that describes each photo as a mixture of categories. The percentage assigned to a category is the probability that the image belongs to it. I trained the network on my own images, starting from the VGG16 model, and achieved an accuracy of 96%.

 

To reach high accuracy with a limited number of images, I trained the neural network using "bottleneck features" based on a VGG16 model. The details can be found here.
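The idea behind bottleneck features can be sketched roughly as follows: the pre-trained convolutional layers are run once to turn each image into a fixed feature vector, and only a small classifier on top of those vectors is trained. The sketch below is not the project's code. It assumes the features are already computed and replaces them with synthetic random clusters, and it swaps in a plain softmax classifier written in NumPy for the trained head; the function names, sizes, and learning rate are all illustrative.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_head(features, labels, n_classes, lr=0.01, epochs=200):
    """Fit a softmax classifier on fixed, precomputed features."""
    n, d = features.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        probs = softmax(features @ W + b)
        grad = (probs - onehot) / n  # gradient of the mean cross-entropy w.r.t. the logits
        W -= lr * (features.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b

# Synthetic stand-in for bottleneck features (NOT real VGG16 output):
# one noisy cluster of feature vectors per category.
rng = np.random.default_rng(0)
n_classes, dim, n_images = 3, 16, 150
centers = 3.0 * rng.normal(size=(n_classes, dim))
labels = rng.integers(0, n_classes, size=n_images)
features = centers[labels] + rng.normal(size=(n_images, dim))

W, b = train_head(features, labels, n_classes)
probs = softmax(features @ W + b)
mixture = probs[0]  # category probabilities for the first image
train_accuracy = float((probs.argmax(axis=1) == labels).mean())
```

Each row of probs sums to 1, so it can be read as the mixture of categories for one photo. Because only the small head is trained, a limited number of images can be enough.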

 

I then used a Wikipedia API to download a text description for each item in the database. Each description is about 200 words long on average, and each word is converted to a 300-dimensional vector with Google's pre-trained word2vec model. Those vectors are averaged to obtain the overall meaning of the passage, represented by the location_vector. I applied the same technique to the mixture of categories from the user library to obtain the user_vector. I then calculated the similarity between the two vectors using cosine similarity.
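The averaging step can be sketched as below. This is a minimal illustration, not the project's code: the embeddings dictionary is a hypothetical three-dimensional toy table standing in for the 300-dimensional word2vec model, average_vector is an illustrative helper name, and weighting each category word by its predicted probability is an assumption about how the mixture of categories is combined.

```python
import numpy as np

def average_vector(words, embeddings, weights=None):
    """Return the (optionally weighted) mean embedding of the known words."""
    known = [(i, w) for i, w in enumerate(words) if w in embeddings]
    if not known:
        raise ValueError("none of the words has an embedding")
    vecs = np.array([embeddings[w] for _, w in known], dtype=float)
    if weights is None:
        return vecs.mean(axis=0)
    wts = np.array([weights[i] for i, _ in known], dtype=float)
    return (wts[:, None] * vecs).sum(axis=0) / wts.sum()

# Toy 3-dimensional embeddings (hypothetical values, for illustration only).
embeddings = {
    "beach": [1.0, 0.0, 0.0],
    "ocean": [0.8, 0.2, 0.0],
    "mountain": [0.0, 1.0, 0.0],
    "museum": [0.0, 0.0, 1.0],
}

# Words without an embedding ("a", "by", "the") are skipped.
location_vector = average_vector("a beach by the ocean".split(), embeddings)

# Category words weighted by assumed classifier probabilities.
user_vector = average_vector(["beach", "mountain"], embeddings, weights=[0.75, 0.25])
```

Both vectors end up in the same space, which is what makes the image side and the text side directly comparable.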

Think of cosine similarity as a measure of the angle between two vectors in this high-dimensional space: the smaller the angle, the higher the similarity. MemoTrek selects the three location vectors closest to the user_vector as you-may-also-likes, and the three furthest from it as new adventures.
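The selection step can be sketched as below, assuming the location vectors are held in a dictionary keyed by destination name. The names recommend and location_vectors, the placeholder destinations A to F, and the two-dimensional toy vectors are all illustrative and do not come from the project.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def recommend(user_vector, location_vectors, k=3):
    """Return the k most similar and the k least similar location names."""
    scores = {
        name: cosine_similarity(user_vector, vec)
        for name, vec in location_vectors.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    similar = ranked[:k]            # smallest angles first
    different = ranked[-k:][::-1]   # largest angles first
    return similar, different

# Toy 2-dimensional vectors for six placeholder destinations.
user_vector = [1.0, 0.0]
location_vectors = {
    "A": [1.0, 0.1],
    "B": [1.0, 0.3],
    "C": [1.0, 0.6],
    "D": [0.5, 1.0],
    "E": [0.0, 1.0],
    "F": [-1.0, 0.2],
}

similar, different = recommend(user_vector, location_vectors)
```

Because cosine similarity ignores vector length, a long description and a short one are compared only by direction. Note that the two lists overlap if the database holds fewer than 2k entries.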

Web app demo

Summary

The project made substantial use of the following skills:

APIs, Deep Learning (Convolutional Neural Networks), Flask