Secret Behind YouTube’s Great Machine Learning Enabled Video Recommendations Finally Revealed

Recommender systems are among the most valued tools on the modern internet. Good recommender systems increase retention on video and content platforms and give users a better experience. Ask any ardent Netflix or Hotstar viewer and you will see how many of their viewing choices are shaped by the algorithms and models running behind these platforms.

Researchers at YouTube recently published details of the algorithm running under the hood, throwing much-needed light on how the biggest video platform in the world operates. YouTube has over 1.9 billion monthly users who consume content in over 80 languages. Zhe Zhao and other scientists at Google published a paper titled Recommending What Video to Watch Next: A Multitask Ranking System to unveil the workings of the large recommender system.


The paper explains how the scientists framed recommendation as a large-scale multi-objective ranking problem. The algorithm balances many ranking objectives, including several kinds of user feedback. To manage these objectives, the researchers explored techniques such as Multi-gate Mixture-of-Experts. They also report how much difference the improvements made on the world’s biggest video platform.

YouTube’s New Recommendation Engine

The researchers' core goal was to build an algorithm that improves recommendations for billions of users on YouTube. They also had to address two important issues:

  1. Multimodal feature space: A candidate video has to be ranked using features from several modalities, such as video content, thumbnail, audio, title and description. According to the researchers, the two big hurdles are bridging the semantic gap from low-level video features and getting the machine learning model to learn from a sparse distribution of items for collaborative filtering.
  2. Scalability: Since the platform is used by more than a billion users, scalability becomes a huge issue. Some of the features the model needs are only available online, at serving time, and cannot be fetched beforehand.

Researchers say in the paper, “At the candidate generation stage, we retrieve a few hundred candidates from a huge corpus. Our ranking system provides a score for each candidate and generates the final ranked list.” Candidate generation relies on multiple algorithms, each of which captures a different notion of similarity between the query video and a candidate video. One of them, for example, retrieves candidates based on how often a video has been watched together with the query video.
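The two stages described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not YouTube's implementation: the function names (build_cowatch_counts, generate_candidates, rank), the session data and the fixed scores standing in for the learned ranking model are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cowatch_counts(sessions):
    """Count how often each ordered pair of videos shares a watch session."""
    counts = defaultdict(Counter)
    for session in sessions:
        # dict.fromkeys removes repeats within a session and keeps order.
        unique_videos = list(dict.fromkeys(session))
        for a, b in permutations(unique_videos, 2):
            counts[a][b] += 1
    return counts


def generate_candidates(query_video, cowatch_counts, limit=300):
    """Stage 1: retrieve the videos most often co-watched with the query."""
    neighbours = cowatch_counts.get(query_video, Counter())
    return [video for video, _ in neighbours.most_common(limit)]


def rank(candidates, score_fn):
    """Stage 2: score every candidate and sort from best to worst."""
    return sorted(candidates, key=score_fn, reverse=True)


# Toy watch sessions (hypothetical data).
sessions = [
    ["cats", "dogs", "birds"],
    ["cats", "dogs"],
    ["cats", "dogs", "fish"],
    ["cats", "birds"],
]
cowatch = build_cowatch_counts(sessions)
candidates = generate_candidates("cats", cowatch, limit=2)

# Stand-in for the learned ranking model: a fixed score per video.
toy_scores = {"dogs": 0.2, "birds": 0.9, "fish": 0.5}
ranked = rank(candidates, lambda video: toy_scores[video])
```

Here candidate generation keeps only the two videos most often co-watched with "cats", and the ranking stage is free to reorder them. In the real system the first stage narrows a huge corpus to a few hundred videos so that the more expensive ranking model only has to score that short list.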

The ranking algorithm, in turn, relies on two kinds of user feedback:

  1. Engagement behaviours, collected from clicks and watches
  2. Satisfaction behaviours, collected from likes and dismissals 

The researchers chose the learning-to-rank framework to solve this problem. They explain, “We model our ranking problem as a combination of classification problems and regression problems with multiple objectives.” Ranking with multiple objectives is hard because the objectives can conflict with one another. To mitigate these conflicts, the researchers used Multi-gate Mixture-of-Experts (MMoE), a recently introduced technique.
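As a rough illustration of what "a combination of classification problems and regression problems" can mean in training, the sketch below adds up one loss per objective: cross-entropy for yes/no signals and squared error for numeric ones. The task names (click, like, watch_minutes), the equal weights and the example values are assumptions made for illustration and are not taken from the paper.

```python
import math


def binary_cross_entropy(label, prob, eps=1e-7):
    """Loss for a classification objective such as click / no click."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def squared_error(target, prediction):
    """Loss for a regression objective such as watch time."""
    return (target - prediction) ** 2


def multi_objective_loss(labels, predictions, task_types, weights=None):
    """Weighted sum of per-task losses (weights default to 1.0 per task)."""
    weights = weights or {task: 1.0 for task in task_types}
    total = 0.0
    for task, kind in task_types.items():
        if kind == "classification":
            task_loss = binary_cross_entropy(labels[task], predictions[task])
        else:
            task_loss = squared_error(labels[task], predictions[task])
        total += weights[task] * task_loss
    return total


# Hypothetical objectives: two engagement/satisfaction classifiers, one regressor.
task_types = {
    "click": "classification",
    "like": "classification",
    "watch_minutes": "regression",
}
labels = {"click": 1, "like": 0, "watch_minutes": 4.0}
predictions = {"click": 0.8, "like": 0.1, "watch_minutes": 3.0}
loss = multi_objective_loss(labels, predictions, task_types)
```

Because every objective contributes to the same total, improving one prediction can pull shared parameters away from what another objective needs. That tension is the conflict MMoE is meant to ease.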

MMoE is a soft-parameter sharing model designed specifically to model task conflicts. The researchers say, “The MMoE layer is designed to capture the task differences without requiring significantly more model parameters compared to the shared-bottom model. The key idea is to substitute the shared ReLU layer with the MoE layer and add a separate gating network for each task.”
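The idea in the quote can be sketched as a small forward pass in plain Python: a set of experts shared by all tasks, plus one softmax gate per task that decides how to mix them. This is a minimal sketch, not the production model. The layer sizes, the random weights, the single-layer experts, the single linear unit per task and all function names are assumptions chosen to keep the example short.

```python
import math
import random


def linear(x, weights):
    """Matrix-vector product; weights holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu_layer(x, weights):
    """One dense layer followed by a ReLU activation."""
    return [max(0.0, z) for z in linear(x, weights)]


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mmoe_forward(x, expert_weights, gate_weights, tower_weights):
    """Return one prediction and one gate vector per task."""
    # Every expert sees the same input and is shared across all tasks.
    expert_outputs = [relu_layer(x, w) for w in expert_weights]
    width = len(expert_outputs[0])
    predictions, gates = [], []
    for gate_w, tower_w in zip(gate_weights, tower_weights):
        # Each task has its own gate: a softmax over the experts.
        gate = softmax(linear(x, gate_w))
        # Task-specific weighted mixture of the expert outputs.
        mixed = [sum(g * out[j] for g, out in zip(gate, expert_outputs))
                 for j in range(width)]
        # Task-specific output: here a single linear unit per task.
        predictions.append(sum(w * m for w, m in zip(tower_w, mixed)))
        gates.append(gate)
    return predictions, gates


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
input_dim, expert_dim, num_experts, num_tasks = 4, 3, 8, 2
expert_weights = [random_matrix(rng, expert_dim, input_dim)
                  for _ in range(num_experts)]
gate_weights = [random_matrix(rng, num_experts, input_dim)
                for _ in range(num_tasks)]
tower_weights = [[rng.uniform(-1.0, 1.0) for _ in range(expert_dim)]
                 for _ in range(num_tasks)]

x = [0.5, -1.2, 0.3, 0.9]
predictions, gates = mmoe_forward(x, expert_weights, gate_weights, tower_weights)
```

Since each task learns its own gate, two conflicting objectives can lean on different experts instead of fighting over a single shared layer, while the experts themselves remain shared. That sharing is why the parameter count stays close to that of a shared-bottom model.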

In the network proposed by the YouTube researchers, each expert is implemented as a multilayer perceptron with ReLU activations. For each task k, the MMoE layer produces a gated mixture of the expert outputs, which is passed to the last hidden layer hk to produce the prediction yk. The researchers also took care to reduce biases such as position bias and selection bias. The new MMoE-based neural network with 8 experts resulted in a +0.45% improvement in engagement metrics and a +3.07% improvement in satisfaction metrics.



The research paper from Google shows how putting more thought and effort into the ranking problem can reap great rewards for any media recommendation engine. The researchers have succeeded in creating a scalable, improved end-to-end ranking system with built-in mechanisms to handle various kinds of data bias.

Abhijeet Katte
A thorough data geek, Abhijeet spends most of his day building and writing about intelligent systems. He also has deep interests in philosophy, economics and literature.
