The Story Of How Airbnb Implemented Neural Networks To Improve Their Search Feature

Published on October 31, 2018

by Abhijeet Katte

Search is a core feature of any business website or app. A good search engine connects customers with relevant products and leads to better sales. The engineering team at Airbnb considers its search ranking algorithm to be its biggest machine learning success story.

The team tried a gradient boosted decision tree model, but the gains soon plateaued. In a new paper, data scientists at Airbnb describe how they tried to break through that plateau using neural networks. The paper covers how Airbnb applied deep learning to improve search and how the team went about its deep learning journey.

Airbnb is one of the biggest marketplaces in the world. It is a two-sided marketplace, which makes it especially difficult to operate: hosts put out their places for rent, and visitors and tourists book them. Most bookings on Airbnb start with a search for a geographic location on the website or the app, and the search system responds by ranking listings from the inventory. After many experiments to improve the search results, Airbnb made sweeping changes to the algorithm and decided to incorporate deep learning into the search system.

The data science team at Airbnb moved its existing search ranking system to a deep learning model that had to perform at scale. Airbnb began considering neural networks for search because many of its other components already used machine learning. The team also recommends that other teams consider deep learning for better search results.

Search Ranking Model

The team says that the move to deep learning was not immediate and that it was the result of careful planning. The Airbnb paper focuses on one key part of the puzzle: given all available listings, how should they be ranked according to a guest’s likelihood of actually booking the place?

The team observed that many guests run multiple searches before making a booking decision and open the listings they like to view more details. All search sessions are stored, and a successful session is one that ends with a booking. These sessions are used to train the machine learning models. The new model is trained to learn a scoring function that ranks the booked listing as high as possible in successful sessions.

Booked listings are assigned a relevance score of 1 and all other listings a score of 0. On the complexity of deep learning models, the Airbnb team learned that building very intricate neural network models is difficult and that it is better to start out with simpler models.
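
The labelling scheme above can be sketched in a few lines of Python. This is only an illustration: the SearchSession container, its field names and the choice to skip sessions without a booking are assumptions made for the example, not details taken from the Airbnb paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class SearchSession:
    # Hypothetical container; the field names are illustrative only.
    listing_ids: List[str]     # listings shown to the guest during the session
    booked_id: Optional[str]   # listing booked at the end, or None if no booking


def label_session(session: SearchSession) -> List[Tuple[str, int]]:
    """Return (listing_id, relevance) pairs: 1 for the booked listing, 0 otherwise."""
    if session.booked_id is None:
        # Assumption: sessions without a booking yield no training examples.
        return []
    return [(lid, 1 if lid == session.booked_id else 0)
            for lid in session.listing_ids]


def booked_rank(session: SearchSession, score: Callable[[str], float]) -> int:
    """1-based position of the booked listing when listings are sorted by score."""
    ranked = sorted(session.listing_ids, key=score, reverse=True)
    return ranked.index(session.booked_id) + 1
```

A scoring function is doing its job on a session when booked_rank returns a small number, ideally 1, meaning the listing the guest actually booked was placed at the top of the results.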

During the deep learning journey, the team at Airbnb tried many neural network models. They started with a simple neural network trained with an L2 regression loss, followed by a LambdaRank NN that takes the pairwise preferences of bookers into account, followed by a Decision Tree/Factorisation Machine NN, and finally a deep neural network model with 195 features and ReLU layers.
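
To make the difference between the first two stages concrete, here is a minimal sketch of the two kinds of loss. The function names are hypothetical and the code is not Airbnb's implementation. The pairwise loss shown is a plain logistic loss on score differences; LambdaRank additionally weights each pair by how much swapping the two listings would change a ranking metric such as NDCG, which this sketch leaves out.

```python
import math
from typing import List


def l2_loss(scores: List[float], labels: List[int]) -> float:
    """Pointwise L2 regression loss: mean squared error against the 0/1 labels."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)


def pairwise_logistic_loss(booked_score: float, other_scores: List[float]) -> float:
    """Mean of log(1 + exp(-(booked - other))) over the listings that were not booked.

    The loss is small when the booked listing scores well above the others
    and large when a listing that was not booked outscores it.
    """
    if not other_scores:
        return 0.0
    total = 0.0
    for s in other_scores:
        diff = booked_score - s
        # Numerically stable form of log(1 + exp(-diff)).
        total += max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return total / len(other_scores)
```

The pointwise loss judges each listing's score in isolation, while the pairwise loss only cares about whether the booked listing is scored above the ones the guest passed over, which is closer to what a ranking actually needs.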

Stages Of Evolution

One important question that comes up, and that the team addresses well, is: was trying all these models and stages necessary to get to a great final model? Couldn’t the team have skipped straight to the last model and achieved great results much earlier? Deep learning has achieved great things in many areas of computing, but its impact on search and allied topics is hard to measure.

Since there is no human performance to compare against and the logs stored for browsing sessions do not hold any ground truth, it is hard to assess the impact of deep learning on this problem. The authors address this exact problem in the paper. “Other researchers note the difficulty in using human evaluation even for familiar shopping items. For our application, these difficulties are further exacerbated due to the novelty of the inventory,” they said.

The paper also documents the team’s failed models well. Some of the failed attempts the team describes are:

Listing ID as an embedding feature in the neural network, similar to word embeddings
Multi-task learning

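Explaining why the listing ID feature failed, the team said, “When items can be repeated without constraints, such as online videos or words in a language, there is no limit to the amount of user interaction an item can have.” A listing, by contrast, can only be booked a limited number of times, which leaves little data from which to learn a reliable embedding for each listing ID.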

The rest of the paper goes into the details of hyperparameter tuning, feature engineering, feature importance, feature distribution and model interpretability.

Lessons To Learn

Airbnb began this work at a time when enthusiasm for deep learning was at its peak. The team talks about its deep learning journey saying, “Over time we realized that moving to deep learning is not a drop-in model replacement at all; rather it’s about scaling the system. As a result, it required rethinking the entire system surrounding the model.” The team says it would wholeheartedly recommend deep learning because the technology opens up many improvements down the line. Previously, most of the team’s time went into feature engineering; now researchers can spend their time on better problems. To sum it all up, the team says, “Two years after taking the first steps towards applying neural networks to search ranking, we feel we are just getting started.”