Secret Behind YouTube’s Great Machine Learning-Enabled Video Recommendations Finally Revealed

Recommender systems are among the most treasured tools on the modern internet. Good recommender systems increase retention on video and content platforms and provide an excellent experience for users. Just ask any ardent Netflix or Hotstar viewer and you will see how many of their viewing choices are influenced by the algorithms and models running behind these platforms.

Researchers at YouTube recently published details of the algorithm running under the hood, shedding much-needed light on how the biggest video platform in the world operates. YouTube has over 1.9 billion monthly users who consume content in over 80 languages. Zhe Zhao and other scientists at Google published a paper titled Recommending What Video to Watch Next: A Multitask Ranking System to unveil the workings of the large recommender system.

The paper explains how the scientists framed the recommendation problem as a large-scale multi-objective ranking system. The algorithm handles many ranking objectives derived from user feedback. To manage these objectives, the researchers used techniques such as Multi-gate Mixture-of-Experts (MMoE). They also report that the improvements they propose made a substantial difference on the world’s biggest video platform.

YouTube’s New Recommendation Engine

The scientists’ core goal is an algorithm that improves recommendations for billions of users on YouTube. They also had to consider two important issues:

  1. Multimodal feature space: A candidate video has to be ranked using multiple kinds of features, such as video content, thumbnail, audio, title and description. According to the researchers, the two big hurdles are bridging the semantic gap from low-level video features and getting the machine learning model to learn from a sparse distribution of items for collaborative filtering.
  2. Scalability: Since the platform is used by more than a billion users, scalability becomes a major concern. Some of the features the model needs are only available online and cannot be fetched beforehand.

Researchers say in the paper, “At the candidate generation stage, we retrieve a few hundred candidates from a huge corpus. Our ranking system provides a score for each candidate and generates the final ranked list.” Candidate generation relies on multiple algorithms, each of which captures one aspect of similarity between the query video and a candidate video. One such algorithm retrieves candidates based on how frequently a video is watched together with the query video.
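The two stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the co-watch log, the function names and the toy scores are hypothetical, and a production system would retrieve a few hundred candidates from a huge corpus rather than two from a short list.

```python
from collections import Counter

# Hypothetical co-watch log: (query_video, video_watched_next) pairs.
CO_WATCH_LOG = [
    ("v1", "v2"), ("v1", "v2"), ("v1", "v3"),
    ("v1", "v4"), ("v1", "v3"), ("v1", "v2"),
    ("v2", "v1"),
]

# Stand-in for the scores a trained ranking model would produce.
TOY_SCORES = {"v2": 0.3, "v3": 0.9, "v4": 0.5}


def generate_candidates(query_video, log, limit):
    """Stage 1: keep the videos most often watched after the query video."""
    counts = Counter(nxt for query, nxt in log if query == query_video)
    return [video for video, _ in counts.most_common(limit)]


def rank(candidates, score_fn):
    """Stage 2: order the candidates by model score, highest first."""
    return sorted(candidates, key=score_fn, reverse=True)


candidates = generate_candidates("v1", CO_WATCH_LOG, limit=2)
ranked = rank(candidates, lambda video: TOY_SCORES.get(video, 0.0))
print(candidates, ranked)
```

Candidate generation is cheap and keeps only a short list, so the more expensive ranking model has to score far fewer videos.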

On the other hand, the ranking algorithm runs on two important inputs from users:

  1. Engagement behaviours, collected from clicks and watches
  2. Satisfaction behaviours, collected from likes and dismissals 

The researchers chose the learning-to-rank framework to solve this problem. They explain, “We model our ranking problem as a combination of classification problems and regression problems with multiple objectives.” Ranking with multiple objectives is hard because the objectives can conflict with one another. To mitigate such conflicts, the researchers used Multi-gate Mixture-of-Experts (MMoE), a recently introduced technique.
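To make the “combination of classification problems and regression problems” concrete, here is a minimal sketch of a multi-task training loss. The task names, the choice of cross-entropy and squared error, and the weighted sum are assumptions made for illustration; the article does not state how the paper combines its objectives.

```python
import math

# Hypothetical tasks: two binary classification objectives, one regression.
TASKS = {
    "click": "classification",
    "like": "classification",
    "watch_time": "regression",
}


def binary_cross_entropy(probability, label):
    """Loss for a yes/no objective such as a click or a like."""
    eps = 1e-12  # keeps log() away from zero
    p = min(max(probability, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def squared_error(prediction, target):
    """Loss for a numeric objective such as watch time."""
    return (prediction - target) ** 2


def multitask_loss(predictions, labels, weights):
    """Weighted sum of per-task losses (the weighting is an assumption)."""
    total = 0.0
    for task, kind in TASKS.items():
        if kind == "classification":
            loss = binary_cross_entropy(predictions[task], labels[task])
        else:
            loss = squared_error(predictions[task], labels[task])
        total += weights[task] * loss
    return total


predictions = {"click": 0.5, "like": 0.5, "watch_time": 3.0}
labels = {"click": 1, "like": 0, "watch_time": 2.0}
weights = {"click": 1.0, "like": 1.0, "watch_time": 1.0}
total_loss = multitask_loss(predictions, labels, weights)
```

Raising the weight of one task pushes the shared parts of the model toward that objective, which is one way conflicts between objectives show up in practice.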

MMoE is a soft-parameter sharing model specifically designed to model task conflicts. The researchers say, “The MMoE layer is designed to capture the task differences without requiring significantly more model parameters compared to the shared-bottom model. The key idea is to substitute the shared ReLU layer with the MoE layer and add a separate gating network for each task.”
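The idea in the quote, shared experts plus one gating network per task, can be sketched as a forward pass in plain Python. This is a simplified illustration rather than the paper's implementation: the class name, the dimensions, the random weights and the single-layer experts are all assumptions.

```python
import math
import random


def relu(vector):
    return [max(0.0, value) for value in vector]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(vector):
    peak = max(vector)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in vector]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class MMoELayer:
    """Hypothetical MMoE layer: shared experts, one softmax gate per task."""

    def __init__(self, input_dim, expert_dim, num_experts, num_tasks, seed=0):
        rng = random.Random(seed)
        # Each expert is a single ReLU layer here (a simplifying assumption).
        self.experts = [random_matrix(expert_dim, input_dim, rng)
                        for _ in range(num_experts)]
        # One gating matrix per task, producing one weight per expert.
        self.gates = [random_matrix(num_experts, input_dim, rng)
                      for _ in range(num_tasks)]

    def gate_weights(self, x):
        return [softmax(matvec(gate, x)) for gate in self.gates]

    def forward(self, x):
        expert_outputs = [relu(matvec(expert, x)) for expert in self.experts]
        mixed_per_task = []
        for weights in self.gate_weights(x):
            # Task-specific weighted sum of the shared expert outputs.
            mixed = [
                sum(w * output[j] for w, output in zip(weights, expert_outputs))
                for j in range(len(expert_outputs[0]))
            ]
            mixed_per_task.append(mixed)
        return mixed_per_task


layer = MMoELayer(input_dim=4, expert_dim=3, num_experts=8, num_tasks=2)
x = [0.2, -0.1, 0.4, 0.7]
task_inputs = layer.forward(x)  # one mixed vector per task
```

Every task sees the same experts, but each gate can weight them differently, which is how the layer lets tasks share parameters without forcing them to agree.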

In the network proposed by the YouTube researchers, the experts are multilayer perceptrons with ReLU activations. For each task k, the MMoE layer passes a task-specific mix of the expert outputs to that task's last hidden layer hk, which produces the prediction yk. The researchers also take care to reduce biases in the training data, such as position bias and selection bias. The new MMoE-based neural network with 8 experts resulted in a +0.45% improvement in engagement metrics and a +3.07% improvement in satisfaction metrics.



The research paper from Google shows how putting more thought and effort into the ranking problem can pay off for any media recommendation engine. The researchers succeeded in building a scalable, improved end-to-end ranking system with built-in mechanisms to handle various kinds of data bias.

Abhijeet Katte
As a thorough data geek, most of Abhijeet's day is spent in building and writing about intelligent systems. He also has deep interests in philosophy, economics and literature.
