Reinforcement Learning For Better Recommender Systems

Collaborative Interactive Recommenders (CIRs) are a class of recommender systems that emerged from the need to make recommendations user-specific. As online services grew, providers had to develop strategies to maximise customer engagement.

However, these systems rely on machine learning, and the results are only as good as the data they are fed. Human decisions emerge from complex processes and are often made in a large, hazy grey area, which keeps human-like intelligence in machines a distant goal. That said, algorithms are being designed to come closer to it.

In an attempt to make better decisions and recommendations, ML developers from Google merged reinforcement learning and recommender systems.



The next generation of recommenders is expected to be modelled around sequential user interaction, optimising for users’ long-term engagement and overall satisfaction. Good algorithmic and modelling techniques for CIRs therefore need to capture the dynamics of user interaction. Setting aside questions of user interface design and natural language interaction, this makes CIRs a natural setting for reinforcement learning (RL).

To that end, the researchers have built a general-purpose simulation platform, dubbed RecSim, to facilitate the study of reinforcement learning algorithms in recommender systems. RecSim has also been open-sourced.


RecSim As A Platform

RecSim allows both researchers and practitioners to test the limits of existing RL methods in synthetic recommender settings.

RecSim simulates a recommender agent’s interaction with an environment, in which the agent acts by making recommendations to users. Both the users and the items being recommended are simulated.

The simulations are driven by traits such as popularity, interests, demographics and frequency.
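The interaction loop described above can be sketched in a few lines of plain Python. This is a toy illustration, not the RecSim API: `ToyUser`, `RandomAgent` and `run_episode` are hypothetical names, each item is reduced to a single topic id, and the click rule and interest update are assumptions made up for the example.

```python
import random


class ToyUser:
    """Hypothetical simulated user with one interest score per topic."""

    def __init__(self, num_topics, rng):
        # Interests lie in [-1, 1]; higher means the user likes the topic more.
        self.interests = [rng.uniform(-1.0, 1.0) for _ in range(num_topics)]

    def respond(self, topic, rng):
        # Assumed click rule: probability rises linearly with interest.
        p_click = 0.5 + 0.5 * self.interests[topic]
        clicked = rng.random() < p_click
        if clicked:
            # Assumed dynamics: consuming a topic nudges interest up (capped at 1).
            self.interests[topic] = min(1.0, self.interests[topic] + 0.1)
        return clicked


class RandomAgent:
    """Placeholder agent that recommends a uniformly random candidate."""

    def __init__(self, rng):
        self.rng = rng

    def recommend(self, candidates):
        return self.rng.randrange(len(candidates))


def run_episode(num_steps=20, num_topics=3, num_candidates=5, seed=0):
    """Run one simulated session and return the number of clicks."""
    rng = random.Random(seed)
    user = ToyUser(num_topics, rng)
    agent = RandomAgent(rng)
    total_reward = 0
    for _ in range(num_steps):
        # Simulated items: each candidate is represented only by its topic id.
        candidates = [rng.randrange(num_topics) for _ in range(num_candidates)]
        choice = agent.recommend(candidates)
        clicked = user.respond(candidates[choice], rng)
        total_reward += int(clicked)  # reward of 1 per click
    return total_reward
```

Calling `run_episode(seed=1)` twice returns the same click count, since all randomness flows through one seeded generator. A learning agent would replace `RandomAgent` and use the click signal to improve its recommendations over time.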

So how is this different from conventional approaches?

When an RL agent recommends something to a user, a few traits are scored higher or lower depending on whether the user accepts the recommendation. So far, this sounds like a typical recommendation system. With RecSim, however, a developer can author these traits. The features in a user choice model can be customised, and the agent is rewarded for making the right recommendation.
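One common way to write a user choice model is a multinomial logit (softmax) over the recommended slate. The sketch below assumes that form for illustration; the function names, the dot-product affinity score and the "no click" option are hypothetical choices made here, not taken from RecSim.

```python
import math
import random


def choice_probabilities(user_traits, slate, no_click_score=0.0):
    """Softmax over item affinities, with one extra 'no click' option at the end.

    user_traits: list of floats authored by the developer.
    slate: list of items, each a list of floats of the same length.
    """
    # Affinity of the user for each item: dot product of trait vectors.
    scores = [sum(u * d for u, d in zip(user_traits, item)) for item in slate]
    scores.append(no_click_score)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def simulate_choice(user_traits, slate, rng):
    """Sample the user's response; return (chosen_index_or_None, reward)."""
    probs = choice_probabilities(user_traits, slate)
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    if index == len(slate):
        return None, 0.0  # the user ignored the whole slate
    return index, 1.0  # the agent is rewarded for an accepted recommendation
```

For a user with traits `[1.0, 0.0]` and a slate `[[1.0, 0.0], [0.0, 1.0]]`, the first item gets the highest probability, while the second item and the "no click" option tie. Changing the authored traits changes which recommendations earn a reward.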

The team behind RecSim at Google believes that this simulation platform can be used to test algorithm performance and robustness to different assumptions about user behaviour.
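As a rough illustration of testing an algorithm under different assumptions about user behaviour, the sketch below compares a random recommender with a simple epsilon-greedy one. Everything here is hypothetical: the user is reduced to a fixed click probability per item, and `simulate` and `compare` are names invented for this example rather than RecSim functions.

```python
import random


def simulate(agent_kind, click_probs, num_steps=2000, epsilon=0.1, seed=0):
    """Return the average click rate of an agent against a stylised user.

    agent_kind: "random" or "epsilon_greedy".
    click_probs: assumed probability that the user clicks each item.
    """
    rng = random.Random(seed)
    num_items = len(click_probs)
    clicks = [0] * num_items
    shows = [0] * num_items
    total = 0
    for _ in range(num_steps):
        explore = agent_kind == "random" or rng.random() < epsilon
        if explore:
            item = rng.randrange(num_items)
        else:
            # Exploit the best observed click rate; unseen items are tried first.
            item = max(
                range(num_items),
                key=lambda i: clicks[i] / shows[i] if shows[i] else float("inf"),
            )
        clicked = rng.random() < click_probs[item]
        shows[item] += 1
        clicks[item] += int(clicked)
        total += int(clicked)
    return total / num_steps


def compare(click_probs):
    """Evaluate both agents under the same assumed user behaviour."""
    return {kind: simulate(kind, click_probs) for kind in ("random", "epsilon_greedy")}
```

Running `compare([0.1, 0.2, 0.9])` models a user with one strongly preferred item, where learning pays off. Running `compare([0.4, 0.4, 0.4])` models an indifferent user, where the two agents should perform about the same. Swapping in other click probabilities is a cheap way to probe how sensitive an algorithm is to the assumed user model.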

RecSim was created to facilitate the following:

  • Investigate the intersection of RL and recommender systems;
  • Encourage reproducibility and model-sharing;
  • Rapidly test and refine models and algorithms in simulation, before incurring the potential cost of live experiments; and
  • Ease academic-industry collaboration through the release of “realistic” stylised models of user behaviour without revealing user data or sensitive industry strategies.

RecSim aims to support simulations that mimic the user behaviour found in real recommender systems, and to serve as a controlled environment for developing and assessing recommender models and algorithms, especially reinforcement learning systems designed for sequential user-system interaction.

Future Direction

As Google researchers see a promising future in pursuing this reinforcement-recommender model investigation, they plan to develop the following add-ons:

  • Develop methodologies to fit stylised user models to usage logs, to partially address the gap between reality and simulation;
  • Develop APIs using TensorFlow to facilitate model specification and learning;
  • Scale up simulation and inference algorithms using accelerators and distributed execution; and
  • Establish mixed-mode interaction models that will be the de facto standard for modern CIRs.

They posit in their work that modern collaborative interactive recommenders will cover a variety of system actions (e.g., preference elicitation, providing endorsements, navigation chips) and user responses (e.g., example critiquing, indirect/direct feedback, query refinements), not to mention unstructured natural language interaction.

Researchers hope that RecSim will serve as a valuable resource that bridges the gap between recommender systems and RL research.


Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
