One of the most popular social networks, Instagram has witnessed exponential growth over the past few years. Last year, the social media network reached one billion monthly active users, and it is projected to surpass 111 million in 2019.
According to reports, over half of the Instagram community visits Instagram Explore every month to discover new photos, videos, and stories relevant to their interests. Instagram Explore is a recommendation engine that surfaces the most relevant content to users in real time with the help of artificial intelligence and machine learning. Recently, researchers at Facebook AI research unveiled the novel engineering solutions behind it, along with a detailed overview of the key elements that make Instagram Explore work effectively.
How It Works
As mentioned, Instagram Explore recommends the most relevant content out of billions of options in real time, which poses a number of machine learning challenges. The researchers tackled these problems by creating a series of custom query languages, lightweight modelling techniques, and tools that enable high-velocity experimentation. Together, these techniques and tools constitute an AI system that extracts 65 billion features and makes 90 million predictions per second.
While developing the recommender engine, the researchers addressed three important needs:
- The ability to conduct rapid experimentation at scale.
- A stronger signal on the breadth of people’s interests.
- A computationally efficient way to ensure that the recommendations are both high quality and fresh.
Foundational Tools
To address those needs, the researchers developed the following foundational tools:
IGQL
The researchers built IGQL, a domain-specific language optimised for retrieving candidates in recommender systems. It is a custom meta-language that provides the right level of abstraction and assembles all candidate-retrieval algorithms in one place. In essence, IGQL is used to identify the most relevant accounts based on individual interests.
IGQL is both statically validated and high-level. Its execution is optimised in C++, which helps minimise both latency and compute resources. It lets engineers focus on the ML and business logic behind recommendations, and it provides a high degree of code reusability.
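The idea of composing reusable retrieval steps can be illustrated with a minimal Python sketch. This is not IGQL syntax, which the article does not show; the `Query` class and the `where`, `order_by` and `take` helpers are hypothetical names invented purely to illustrate how a high-level, chainable query can be built from shared building blocks.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Query:
    """Hypothetical chainable query; illustrates composition only, not IGQL."""
    steps: List[Callable[[list], list]] = field(default_factory=list)

    def then(self, step):
        # Return a new Query so that partial queries can be reused safely.
        return Query(self.steps + [step])

    def run(self, items):
        # Apply each step in order to the running list of candidates.
        for step in self.steps:
            items = step(items)
        return items


def where(predicate):
    """Keep only the items for which the predicate holds."""
    return lambda items: [x for x in items if predicate(x)]


def order_by(key):
    """Sort items by the given key, highest first."""
    return lambda items: sorted(items, key=key, reverse=True)


def take(n):
    """Keep the first n items."""
    return lambda items: items[:n]


# A shared base query that several recommendation surfaces could reuse.
base = Query().then(where(lambda m: m["recommendable"]))
top_two = base.then(order_by(lambda m: m["score"])).then(take(2))

# Toy media records (assumed fields: id, score, recommendable).
media = [
    {"id": "a", "score": 0.3, "recommendable": True},
    {"id": "b", "score": 0.9, "recommendable": False},
    {"id": "c", "score": 0.7, "recommendable": True},
    {"id": "d", "score": 0.5, "recommendable": True},
]
result = [m["id"] for m in top_two.run(media)]
```

Because `then` returns a new object, `base` stays unchanged and can be extended in different ways, which is one simple way to get the kind of reuse described above.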
Ig2vec
Ig2vec is a word2vec-like embedding framework used to learn account embeddings, which help to identify topically similar accounts efficiently. The framework works by treating the account IDs that a user interacts with as a sequence of words in a sentence. By applying the same techniques as word2vec to these sequences, it becomes possible to predict the accounts with which a user is likely to interact in a particular session within the Instagram app.
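As a rough illustration of the word2vec analogy, the sketch below turns one session of account IDs into skip-gram style (target, context) training pairs. The function name, the window size and the account IDs are assumptions made for the example; the article does not say which word2vec variant or parameters ig2vec actually uses.

```python
def skipgram_pairs(session, window):
    """Return (target, context) account-ID pairs within a sliding window."""
    pairs = []
    for i, target in enumerate(session):
        # Context positions are those within `window` steps of the target.
        lo = max(0, i - window)
        hi = min(len(session), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, session[j]))
    return pairs


# One hypothetical session: accounts a user interacted with, in order.
session = ["acct_1", "acct_2", "acct_3"]
pairs = skipgram_pairs(session, window=1)
```

An embedding model trained on such pairs would place accounts that tend to appear in the same sessions close together.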
If a user interacts with a sequence of accounts in the same session, those accounts are more likely to be topically related than a random selection of accounts. To exploit this, a distance metric between two accounts is defined, usually cosine distance or dot product. Based on this metric, a K Nearest Neighbour (KNN) lookup is performed to find the topically similar accounts for any account in the embedding.
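The lookup can be sketched with NumPy as a brute-force cosine-similarity search. The account names and two-dimensional vectors below are made up for illustration, and a production system would presumably rely on an approximate nearest-neighbour index rather than scanning every account.

```python
import numpy as np


def cosine_knn(embeddings, query_index, k):
    """Return indices of the k accounts most similar to the query account."""
    # Normalise rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms
    sims = unit @ unit[query_index]
    # Exclude the query account itself from the results.
    sims[query_index] = -np.inf
    # Sorting the negated similarities puts the most similar accounts first.
    return np.argsort(-sims)[:k].tolist()


# Toy embeddings for five hypothetical accounts.
account_names = ["cooking_a", "cooking_b", "travel_a", "travel_b", "cooking_c"]
account_vectors = np.array([
    [1.0, 0.1],
    [0.9, 0.2],
    [0.1, 1.0],
    [0.2, 0.9],
    [0.8, 0.0],
])

neighbours = cosine_knn(account_vectors, query_index=0, k=2)
neighbour_names = [account_names[i] for i in neighbours]
```

Here the two nearest neighbours of `cooking_a` are the other two cooking accounts, since their vectors point in nearly the same direction.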
Ranking Distillation Model
The researchers introduced a ranking distillation model to help preselect candidates before applying the more complex ranking models. The approach works by training a super-lightweight model that learns from and tries to approximate the main ranking models. The inputs and outputs of the main ranking models are recorded, and the distillation model is then trained on this recorded data with a limited set of features and a simpler neural network structure to replicate the results.
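The distillation idea can be sketched as follows. To keep the example short and dependency-light, the student here is a linear model fitted by least squares rather than a small neural network, and the "teacher" is a made-up scoring function standing in for the expensive ranking models; all names, feature counts and sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)


def teacher_score(features):
    """Stand-in for an expensive ranking model that uses all four features."""
    return np.tanh(features @ np.array([1.5, -0.5, 0.8, 0.3]))


def student_inputs(features):
    """The student sees only the first two features plus a bias term."""
    return np.column_stack([features[:, :2], np.ones(len(features))])


# 1. Record the teacher's inputs and outputs.
recorded_features = rng.normal(size=(2000, 4))
recorded_scores = teacher_score(recorded_features)

# 2. Fit the lightweight student to approximate the recorded teacher scores.
student_weights, *_ = np.linalg.lstsq(
    student_inputs(recorded_features), recorded_scores, rcond=None
)


def student_score(features):
    return student_inputs(features) @ student_weights


# 3. Preselect with the cheap student, then rank survivors with the teacher.
candidates = rng.normal(size=(500, 4))
shortlist = np.argsort(-student_score(candidates))[:50]
final_order = shortlist[np.argsort(-teacher_score(candidates[shortlist]))]
```

The expensive model now scores 50 candidates instead of 500, at the cost of occasionally dropping an item that the student underestimates.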
Wrapping Up
With these three tools in place, the researchers split the Explore recommendation system into two main stages: the candidate generation stage (also called the sourcing stage) and the ranking stage.
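A toy version of this two-stage split might look like the sketch below. The data structures, function names and scores are hypothetical and chosen only to show how sourcing narrows the pool before ranking orders it.

```python
def generate_candidates(seed_accounts, similar_accounts, media_by_account):
    """Sourcing stage: collect media posted by accounts similar to the seeds."""
    candidates = []
    seen = set()
    for seed in seed_accounts:
        for account in similar_accounts.get(seed, []):
            for media in media_by_account.get(account, []):
                # Skip media already collected via another account.
                if media not in seen:
                    seen.add(media)
                    candidates.append(media)
    return candidates


def rank(candidates, score, limit):
    """Ranking stage: order candidates by score and keep the best `limit`."""
    return sorted(candidates, key=score, reverse=True)[:limit]


# Hypothetical lookup tables for one user.
similar_accounts = {"seed_a": ["acct_1", "acct_2"]}
media_by_account = {"acct_1": ["m1", "m2"], "acct_2": ["m2", "m3"]}
predicted_scores = {"m1": 0.2, "m2": 0.9, "m3": 0.5}

candidates = generate_candidates(["seed_a"], similar_accounts, media_by_account)
recommendations = rank(candidates, predicted_scores.get, limit=2)
```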
According to the researchers, the ongoing ML challenge is to find new and exciting ways to help the Instagram community discover the most interesting and relevant content on the platform.