# A Complete Guide To TensorFlow Recommenders (with Python code)

TensorFlow Recommenders (TFRS) is an open-source TensorFlow package that simplifies the building, evaluation, and deployment of advanced recommender models.

Developing a comprehensive recommendation system is a tedious and complicated effort for novices and experts alike. It involves several steps: obtaining a dataset, building embeddings, and, most importantly, writing all of the model code. To reduce this complexity, the TensorFlow team has released an open-source package called TensorFlow Recommenders. In this article, we discuss the concept behind TensorFlow Recommenders and, by implementing it, see how quickly we can set up a system. The major points to be covered are listed below.

## Points to be Discussed

1. What are TensorFlow Recommenders?
2. Retrieval System
3. Implementing TensorFlow Recommenders

Let us begin with the discussions.

## What are TensorFlow Recommenders?

TensorFlow Recommenders (TFRS) is an open-source TensorFlow package that simplifies the building, evaluation, and deployment of advanced recommender models. Built on TensorFlow 2.x, TFRS allows us to create and evaluate flexible candidate nomination models and to freely incorporate item, user, and context information into recommendation models. We can also train multi-task models that optimize several recommendation objectives at the same time. Finally, the trained models can be served efficiently using TensorFlow Serving.

## Retrieval System

Many recommender systems aim to extract a few good recommendations from a pool of millions of candidates. A recommender system’s retrieval stage tackles this “needle in a haystack” challenge by identifying a shortlist of promising candidates from a large candidate list. TensorFlow Recommenders simplifies the process by making it easy to build two-tower retrieval models. Such models retrieve data in two steps:

1. Converting user input into an embedding
2. Identifying the best options in the embedding space
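The two steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the user ids, the candidate names, and the random embedding tables are made up, and the `retrieve` helper is hypothetical rather than part of TFRS.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary and candidate list (not taken from MovieLens).
user_index = {"u1": 0, "u2": 1}
candidates = ["Movie A", "Movie B", "Movie C", "Movie D"]

# Stand-ins for trained towers: random embedding tables of dimension 8.
user_table = rng.normal(size=(len(user_index), 8))
candidate_table = rng.normal(size=(len(candidates), 8))


def retrieve(user_id: str, k: int = 2) -> list:
    # Step 1: convert the user input into an embedding.
    query = user_table[user_index[user_id]]
    # Step 2: score every candidate by dot product and keep the k best.
    scores = candidate_table @ query
    top = np.argsort(-scores)[:k]
    return [candidates[i] for i in top]


shortlist = retrieve("u1", k=2)
print(shortlist)
```

In a real system the two tables would be learned during training, and the exhaustive scoring in step 2 is usually replaced by an approximate nearest-neighbour search once the candidate list becomes large.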

TFRS is built on TensorFlow 2.x and Keras, making it both familiar and user-friendly. It’s designed to be modular (so we can easily tweak specific layers and metrics) while still forming a cohesive whole (so that the individual components work well together).

## Implementing TensorFlow Recommenders

To give you an idea of how to use TensorFlow Recommenders, I’m demonstrating here a basic use case based on TensorFlow’s official implementation. We train a basic model for movie recommendations using the MovieLens dataset. This dataset contains information about which movies a user watched and what ratings they gave to those movies.

We’ll use this dataset to train a model that predicts which movies a user will watch and which they won’t. The so-called two-tower model is a common and effective design for this type of task: a neural network with two sub-models that learn representations for queries and candidates separately. The score of a given query-candidate pair is simply the dot product of the outputs of these two towers (depicted in the animation below).

Two-Tower Analogy

On the query side, the inputs can be anything: user ids, search queries, or timestamps. On the candidate side, they can be movie titles, descriptions, synopses, or lists of starring actors. We’ll keep things basic in this example by using user ids for the query tower and movie titles for the candidate tower.

Now let’s quickly set up our environment by installing and importing the dependencies.

!pip install -q tensorflow-recommenders

import tensorflow_datasets as tfds
import tensorflow_recommenders as tfrs

import numpy as np
import tensorflow as tf

from typing import Dict, Text
import pprint

Now we will prepare the dataset, taken from TensorFlow Datasets. The MovieLens dataset comes in two parts: ratings, which holds attributes related to movies and users, and movies, which holds information about each movie. To see how quickly a recommender can be built, we will use only user_id and movie_title.

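# ratings data (assumed here: the 100k variant, as in the official TFRS quickstart)
rating = tfds.load('movielens/100k-ratings', split='train')
# features of all the movies
movies = tfds.load('movielens/100k-movies', split='train')

# limiting the features
rating = rating.map(lambda x: {'movie_title': x['movie_title'], 'user_id': x['user_id']})
movies = movies.map(lambda x: x['movie_title'])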

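To implement the two-tower model, we first need vocabularies that map the raw user_ids and movie titles to integer indices. These indices are later fed to Keras Embedding layers, which map them into an embedding vector space.

# tf.keras.layers.StringLookup is the non-experimental path (TF 2.6 and later)
user_id_vocabulary = tf.keras.layers.StringLookup(mask_token=None)
user_id_vocabulary.adapt(rating.map(lambda x: x['user_id']))

movies_title_vocabulary = tf.keras.layers.StringLookup(mask_token=None)
movies_title_vocabulary.adapt(movies)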
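To make the lookup step concrete, here is a minimal NumPy sketch of what a string vocabulary followed by an embedding table does. The helpers `build_vocabulary` and `lookup` are hypothetical names, the ids are made up, and the token ordering is a simplification: the real StringLookup layer orders tokens by frequency, while this sketch simply sorts them. Index 0 is reserved for out-of-vocabulary strings, which is our understanding of the layer’s default behaviour when mask_token=None.

```python
import numpy as np


def build_vocabulary(values):
    # Index 0 is reserved for out-of-vocabulary strings; known tokens start at 1.
    unique = sorted(set(values))
    return {token: i + 1 for i, token in enumerate(unique)}


def lookup(vocabulary, values):
    # Unknown strings fall back to the out-of-vocabulary index 0.
    return np.array([vocabulary.get(v, 0) for v in values])


user_ids = ["42", "7", "42", "13"]  # made-up ids for illustration
vocab = build_vocabulary(user_ids)

# The table needs one extra row for the out-of-vocabulary slot.
embedding_table = np.random.default_rng(0).normal(size=(len(vocab) + 1, 64))

indices = lookup(vocab, ["7", "42", "unknown"])
vectors = embedding_table[indices]
print(indices, vectors.shape)
```

This is also why the embedding layer is sized from the vocabulary size rather than from the number of distinct ids: the extra row gives unseen users or titles a valid, if uninformative, embedding.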



Below we define the class that holds the recommendation model. It has two methods: __init__() and compute_loss(). In __init__() we set up the primary components of our model, i.e., the user_id and movie_title representations and the retrieval task. compute_loss() defines how the loss is computed during model training.

class MovieLensModel(tfrs.Model):

    def __init__(
            self,
            user_model: tf.keras.Model,
            movie_model: tf.keras.Model,
            task: tfrs.tasks.Retrieval):
        super().__init__()

        # Set up user and movie representations.
        self.user_model = user_model
        self.movie_model = movie_model

        # Set up a retrieval task.
        self.task = task

    def compute_loss(self, features: Dict[Text, tf.Tensor], training=False) -> tf.Tensor:
        # Define how the loss is computed.

        user_embeddings = self.user_model(features["user_id"])
        movie_embeddings = self.movie_model(features["movie_title"])

        # The retrieval task turns the two sets of embeddings into a loss.
        return self.task(user_embeddings, movie_embeddings)
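To show how a retrieval task can turn the two sets of embeddings into a loss, here is a NumPy sketch of an in-batch softmax loss. As far as we know, this is close to what tfrs.tasks.Retrieval does by default, but treat it as an assumption rather than the library’s exact code. The function name `in_batch_softmax_loss` is hypothetical.

```python
import numpy as np


def in_batch_softmax_loss(user_embeddings, movie_embeddings):
    # scores[i, j] is the dot product of user i with movie j in the batch.
    scores = user_embeddings @ movie_embeddings.T
    # Row-wise log-softmax, shifted by the row maximum for numerical stability.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # The movie user i actually watched sits on the diagonal; every other
    # movie in the batch acts as a negative example. Sum over the batch.
    return float(-np.trace(log_probs))


# Perfectly aligned embeddings give a loss close to zero.
aligned = 10.0 * np.eye(4)
loss_aligned = in_batch_softmax_loss(aligned, aligned)

# Uninformative (all-zero) embeddings give the loss of a uniform guess.
zeros = np.zeros((4, 3))
loss_uniform = in_batch_softmax_loss(zeros, zeros)

print(loss_aligned, loss_uniform)
```

The design choice worth noting is that no explicit negative examples are needed: the other movies in the same batch serve as negatives, which is why a reasonably large batch size helps.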



Now we will define the user model and the movie model as Keras Sequential models, and the retrieval task using TFRS.

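users_model = tf.keras.Sequential([
    user_id_vocabulary,
    tf.keras.layers.Embedding(user_id_vocabulary.vocabulary_size(), 64)])
movie_model = tf.keras.Sequential([
    movies_title_vocabulary,
    tf.keras.layers.Embedding(movies_title_vocabulary.vocabulary_size(), 64)])

# Retrieval task with a top-k metric computed over all candidate movies.
task = tfrs.tasks.Retrieval(metrics=tfrs.metrics.FactorizedTopK(
    movies.batch(128).map(movie_model)))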
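The metric attached to the task reports top-k accuracy: how often the movie a user actually watched lands among the k highest-scoring candidates. Here is a minimal NumPy sketch of that idea, assuming dot-product scoring over all candidates. The function `top_k_accuracy` and the toy numbers are hypothetical and not taken from TFRS.

```python
import numpy as np


def top_k_accuracy(query_embeddings, true_ids, candidate_embeddings, k):
    # Score every query against every candidate.
    scores = query_embeddings @ candidate_embeddings.T
    # Score of the true candidate for each query.
    true_scores = scores[np.arange(len(true_ids)), true_ids]
    # Rank = number of candidates scoring strictly higher than the true one.
    ranks = (scores > true_scores[:, None]).sum(axis=1)
    # A query counts as a hit when its true candidate is within the top k.
    return float((ranks < k).mean())


candidate_embeddings = np.eye(3)
query_embeddings = np.array([
    [1.0, 0.0, 0.0],   # true candidate 0 ranks first
    [0.6, 1.0, 0.0],   # true candidate 1 ranks first
    [0.9, 0.8, 0.5],   # true candidate 2 ranks last
])
true_ids = np.array([0, 1, 2])

top1 = top_k_accuracy(query_embeddings, true_ids, candidate_embeddings, k=1)
top3 = top_k_accuracy(query_embeddings, true_ids, candidate_embeddings, k=3)
print(top1, top3)
```

Because this scores every query against every candidate, the metric can be slow on large candidate sets, which is one reason to evaluate it less often than the loss.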

Now let us create, compile, and train a retrieval model.

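model = MovieLensModel(users_model, movie_model, task)
# The optimizer choice is an assumption here, following the official TFRS quickstart.
model.compile(optimizer=tf.keras.optimizers.Adagrad(0.5))
model.fit(rating.batch(4096), epochs=3)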

To validate the model’s recommendations, the TFRS BruteForce layer is employed. The BruteForce layer is indexed with candidate representations that have already been computed, allowing us to find top movies in response to a query by computing the query-candidate score for all available candidates:

recommends = tfrs.layers.factorized_top_k.BruteForce(model.user_model)
recommends.index_from_dataset(movies.batch(100).map(lambda title: (title, model.movie_model(title))))

id_ = input('Enter the user_id: ')
_, titles = recommends(np.array([str(id_)]))
print('Top recommendation for user',id_,titles[0, :3])


## Conclusion

In this article, we took four steps to build a movie recommendation system from the ratings given by users. We imported the data and selected a few features for simplicity. Then we built embedding representations using Keras preprocessing layers. After that, we defined a class for the TFRS model, along with the user and movie models and the retrieval task. Finally, we combined everything in the MovieLensModel class, trained it, and ran inference. This post was about how to get started building a recommendation system with TensorFlow Recommenders.

Vijaysinh is an enthusiast in machine learning and deep learning. He is skilled in ML algorithms, data manipulation, handling and visualization, model building.