A Complete Guide To Tensorflow Recommenders (with Python code)

Developing a comprehensive recommendation system is tedious and complicated for novices and experts alike. It involves several steps: obtaining a dataset, embedding users and items as vectors, and, most importantly, writing all of the model code. To reduce this complexity, TensorFlow has released an open-source package called TensorFlow Recommenders. In this article, we discuss the concepts behind TensorFlow Recommenders and, through an implementation, see how quickly a system can be set up. The major points covered are listed below.

Points to be Discussed

  1. What are Tensorflow Recommenders?
  2. Retrieval System
  3. Implementing Tensorflow Recommenders

Let us begin with the discussions.

What are Tensorflow Recommenders?

TensorFlow Recommenders (TFRS) is an open-source TensorFlow package that simplifies building, evaluating, and deploying advanced recommender models. TFRS, which is based on TensorFlow 2.x, lets us create and assess flexible candidate nomination models and freely incorporate item, user, and context information into recommendation models. We can also train multi-task models that optimize several recommendation objectives at the same time. Finally, the resulting models can be served efficiently with TensorFlow Serving.

Retrieval System

Many recommender systems aim to extract a few good recommendations from a pool of millions of candidates. A recommender system’s retrieval stage tackles the “needle in a haystack” challenge of identifying a shortlist of promising candidates from a large candidate list. Thankfully TensorFlow Recommenders simplifies the process by constructing two-tower retrieval models. Such models retrieve data in two steps:

  1. Converting user input into an embedding
  2. Identifying the best options in the embedding space
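
The two steps above can be sketched in plain Python. This is only an illustration of the idea, not how TFRS implements it: the embedding tables, the user ids, and the helper name `retrieve` are made up for the example, and a real system would learn the vectors rather than hard-code them.

```python
import heapq

# Hypothetical, hand-written embeddings; a trained model would learn these.
USER_EMBEDDINGS = {
    "user_a": [1.0, 0.0],
    "user_b": [0.0, 1.0],
}
MOVIE_EMBEDDINGS = {
    "Action Movie": [0.9, 0.1],
    "Romance Movie": [0.1, 0.9],
    "Documentary": [0.5, 0.5],
}


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve(user_id, k=2):
    # Step 1: convert the query (here, a user id) into an embedding.
    query = USER_EMBEDDINGS[user_id]
    # Step 2: score every candidate in the embedding space and keep the top k.
    scored = ((dot(query, emb), title) for title, emb in MOVIE_EMBEDDINGS.items())
    return [title for _, title in heapq.nlargest(k, scored)]


print(retrieve("user_a"))  # ['Action Movie', 'Documentary']
```

With millions of candidates, the second step is usually the expensive one, which is why the candidate embeddings are typically computed once and reused for every query.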

TFRS is built on TensorFlow 2.x and Keras, making it both familiar and user-friendly. It is designed to be modular (so we can easily tweak specific layers and metrics) while still working as a whole (so that the individual components fit together well).

Implementing Tensorflow Recommenders

To give you an idea of how to use TensorFlow Recommenders, I’ll demonstrate a basic use case based on TensorFlow’s official example. We train a basic movie recommendation model on the MovieLens dataset, which records which movies each user watched and the ratings they gave to those movies.

We’ll use this dataset to train a model that predicts which movies a user will watch and which they won’t. A frequent and effective design for this type of task is the so-called two-tower model: a neural network with two sub-models that learn representations for queries and candidates separately. The score of a particular query-candidate pair is simply the dot product of the outputs of the two towers (depicted in the animation below).

Two-Tower Analogy

On the query side, the inputs can be anything: user ids, search queries, or timestamps. On the candidate side, they can be movie titles, descriptions, synopses, or lists of starring actors. We’ll keep things basic in this example by using user ids for the query tower and movie titles for the candidate tower.

Now let’s quickly set up our environment by installing and importing the dependencies.

!pip install -q tensorflow-recommenders
!pip install -q --upgrade tensorflow-datasets
 
import tensorflow_datasets as tfds
import tensorflow_recommenders as tfrs
 
import numpy as np
import tensorflow as tf
 
from typing import Dict, Text
import pprint

Now we will prepare the dataset, which is taken from TensorFlow Datasets. The MovieLens data comes in two parts: a ratings dataset that holds attributes related to movies and users, and a movies dataset that holds information about each movie. To see how quickly a recommender can be built, we will use only user_id and movie_title.

# ratings data
rating = tfds.load('movielens/100k-ratings', split='train')
# features of all the movies
movies = tfds.load('movielens/100k-movies', split='train')
 
# limiting the features
rating = rating.map(lambda x: {'movie_title': x['movie_title'], 'user_id': x['user_id']})
movies = movies.map(lambda x: x['movie_title'])

To implement the two-tower design, we need a user tower that maps user_ids into a high-dimensional vector space and, similarly, a movie tower for movie_titles. As a first step, we build vocabularies that convert the raw string ids and titles into integer indices. These indices will later be fed into Keras Embedding layers.

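# Map raw user ids (strings) to contiguous integer indices.
# On TensorFlow versions before 2.6 this layer lives under
# tf.keras.layers.experimental.preprocessing.StringLookup.
user_id_vocabulary = tf.keras.layers.StringLookup(mask_token=None)
user_id_vocabulary.adapt(rating.map(lambda x: x['user_id']))
 
# Do the same for movie titles.
movies_title_vocabulary = tf.keras.layers.StringLookup(mask_token=None)
movies_title_vocabulary.adapt(movies)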
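
To make the role of these lookup layers concrete, here is a rough plain-Python sketch of what a string lookup does: it builds a vocabulary from the data and maps each string to an integer index, reserving index 0 for unseen (out-of-vocabulary) strings. The names `build_vocabulary` and `lookup` are hypothetical, and ordering tokens by frequency is an assumption about the Keras layer's behavior, not a reimplementation of it.

```python
from collections import Counter

OOV_INDEX = 0  # index reserved for strings not seen while building the vocabulary


def build_vocabulary(values):
    """Map each distinct string to an integer index, most frequent first."""
    counts = Counter(values)
    # Sort by descending count, then alphabetically so the result is deterministic.
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    return {token: i + 1 for i, token in enumerate(ordered)}


def lookup(vocabulary, value):
    """Return the integer index for a string, or OOV_INDEX if it is unknown."""
    return vocabulary.get(value, OOV_INDEX)


user_ids = ["42", "7", "42", "138", "42", "7"]  # made-up ids
vocab = build_vocabulary(user_ids)

print(vocab)                                      # {'42': 1, '7': 2, '138': 3}
print(lookup(vocab, "42"), lookup(vocab, "999"))  # 1 0
print(len(vocab) + 1)                             # 4 slots, including the OOV index
```

The integer indices are what an embedding table consumes, so the table needs one row per vocabulary entry plus one for the out-of-vocabulary slot.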

Below we define the class that holds the recommendation model. It has two methods: __init__() and compute_loss(). In __init__() we set up the primary components of the model, i.e., the user and movie representations and the retrieval task. compute_loss() defines how the loss is computed during training.

class MovieLensModel(tfrs.Model):
 
  def __init__(
      self,
      user_model: tf.keras.Model,
      movie_model: tf.keras.Model,
      task: tfrs.tasks.Retrieval):
    super().__init__()
 
    # Set up user and movie representations.
    self.user_model = user_model
    self.movie_model = movie_model
 
    # Set up a retrieval task.
    self.task = task
 
  def compute_loss(self, features: Dict[Text, tf.Tensor], training=False) -> tf.Tensor:
    # Define how the loss is computed.
 
    user_embeddings = self.user_model(features["user_id"])
    movie_embeddings = self.movie_model(features["movie_title"])
 
    return self.task(user_embeddings, movie_embeddings)
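
The retrieval task turns the two batches of embeddings into a single loss value. A common way to do this, and as far as I understand what `tfrs.tasks.Retrieval` does by default, is an in-batch softmax: every user is scored against every movie in the batch, the movie the user actually watched is treated as the positive, and the other movies in the batch act as negatives. The sketch below is plain Python with made-up vectors; the function name `in_batch_softmax_loss` and the choice to sum over the batch are assumptions for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def in_batch_softmax_loss(user_embeddings, movie_embeddings):
    """Sum of cross-entropy losses where the positive for row i is movie i."""
    total = 0.0
    for i, user in enumerate(user_embeddings):
        # Score this user against every movie in the batch.
        logits = [dot(user, movie) for movie in movie_embeddings]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of the positive pair.
        total += log_norm - logits[i]
    return total


users = [[2.0, 0.0], [0.0, 2.0]]
aligned = [[2.0, 0.0], [0.0, 2.0]]  # each user points toward its own movie
swapped = [[0.0, 2.0], [2.0, 0.0]]  # each user points toward the other movie

aligned_loss = in_batch_softmax_loss(users, aligned)
swapped_loss = in_batch_softmax_loss(users, swapped)
print(aligned_loss, swapped_loss)
```

The loss is small when each user embedding has a high dot product with its own movie and a low one with the others, which is the behavior training pushes the two towers toward.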

Now we will define the user model and the movie model as Keras Sequential models, and the retrieval task using TFRS.

# Each tower: a string lookup followed by a 64-dimensional embedding.
users_model = tf.keras.Sequential([
    user_id_vocabulary,
    tf.keras.layers.Embedding(user_id_vocabulary.vocabulary_size(), 64)])
movie_model = tf.keras.Sequential([
    movies_title_vocabulary,
    tf.keras.layers.Embedding(movies_title_vocabulary.vocabulary_size(), 64)])
 
task = tfrs.tasks.Retrieval(metrics=tfrs.metrics.FactorizedTopK(
    movies.batch(128).map(movie_model)))
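
The `FactorizedTopK` metric reports how often the true movie appears among the k highest-scoring candidates for a user. A minimal plain-Python sketch of that idea follows, with made-up embeddings and a hypothetical helper `top_k_accuracy`; it is not the TFRS implementation.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def top_k_accuracy(query_embeddings, true_indices, candidate_embeddings, k):
    """Fraction of queries whose true candidate ranks within the top k."""
    hits = 0
    for query, true_index in zip(query_embeddings, true_indices):
        scores = [dot(query, c) for c in candidate_embeddings]
        # Candidate indices sorted from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        if true_index in ranked[:k]:
            hits += 1
    return hits / len(query_embeddings)


candidates = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
queries = [[1.0, 0.1], [0.1, 1.0]]
true_movies = [0, 2]  # index of the movie each user actually watched

print(top_k_accuracy(queries, true_movies, candidates, k=1))  # 0.5
print(top_k_accuracy(queries, true_movies, candidates, k=2))  # 1.0
```

Because the metric compares each query against the full candidate set, it is computed over all movie embeddings rather than only the movies in the current batch.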

Now let us create, compile, and train a retrieval model.

model = MovieLensModel(users_model,movie_model,task)
model.compile(optimizer=tf.keras.optimizers.Adagrad(0.5))
model.fit(rating.batch(4096), epochs=3)

To validate the model’s recommendations, the TFRS BruteForce layer is employed. The BruteForce layer is indexed with candidate representations that have already been computed, allowing us to find top movies in response to a query by computing the query-candidate score for all available candidates:

recommends = tfrs.layers.factorized_top_k.BruteForce(model.user_model)
recommends.index_from_dataset(movies.batch(100).map(lambda title: (title, model.movie_model(title))))
 
id_ = input('Enter the user_id: ')
_, titles = recommends(np.array([str(id_)]))
print('Top recommendation for user',id_,titles[0, :3])
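
Conceptually, a brute-force index stores the precomputed candidate embeddings next to their identifiers and, at query time, scores the query against all of them. The class below is a plain-Python sketch of that pattern with made-up data. `BruteForceIndex` and its methods are hypothetical names, and it returns a `(scores, identifiers)` pair to mirror the `_, titles` unpacking used above.

```python
class BruteForceIndex:
    """Toy exhaustive top-k search over precomputed candidate embeddings."""

    def __init__(self, query_model):
        # query_model maps a raw query (e.g. a user id) to an embedding.
        self.query_model = query_model
        self.identifiers = []
        self.embeddings = []

    def index(self, pairs):
        # pairs: iterable of (identifier, embedding) computed ahead of time.
        for identifier, embedding in pairs:
            self.identifiers.append(identifier)
            self.embeddings.append(embedding)

    def __call__(self, query, k=3):
        q = self.query_model(query)
        scored = [
            (sum(a * b for a, b in zip(q, emb)), ident)
            for ident, emb in zip(self.identifiers, self.embeddings)
        ]
        # Sort by score only, highest first.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        top = scored[:k]
        return [s for s, _ in top], [ident for _, ident in top]


# Hypothetical stand-in for a trained user tower.
user_vectors = {"42": [1.0, 0.0], "7": [0.0, 1.0]}
index = BruteForceIndex(lambda user_id: user_vectors[user_id])
index.index([
    ("Movie A", [0.9, 0.2]),
    ("Movie B", [0.1, 0.8]),
    ("Movie C", [0.6, 0.6]),
])

_, titles = index("42", k=2)
print(titles)  # ['Movie A', 'Movie C']
```

Exhaustive scoring is exact but scales linearly with the number of candidates, so it suits small catalogs like this one and quick sanity checks.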

Conclusion 

In this article, we built a movie recommendation system from user rating records in four steps. We imported the data and selected a minimal set of features for simplicity. Then we built vocabularies for the user ids and movie titles using Keras preprocessing layers. After that, we defined a class for the TFRS model, along with the user and movie towers and the retrieval task. Finally, we combined everything in the MovieLensModel class, trained it, and used it for inference. This post showed how to get started building a recommendation system with TensorFlow Recommenders.
