Reinforcement Learning For Better Recommender Systems

Collaborative Interactive Recommenders (CIRs) are a class of recommender systems that emerged out of the need to make recommendations user-specific. The growth of online services has pushed service providers to up their game by developing strategies to maximise customer engagement.

However, this is still machine learning, and the results are only as good as the data the models are fed. Human decisions emerge from complex systems and are usually made in a large, hazy grey area, which makes human-like intelligence in machines a distant dream. That said, algorithms are being designed to come closer to it.

In an attempt to make better decisions and recommendations, ML developers from Google merged reinforcement learning and recommender systems.

The next generation of recommenders is expected to be modelled around sequential user interaction, optimising for users’ long-term engagement and overall satisfaction. Good algorithmic and modelling techniques for CIRs therefore need to capture the dynamics of user interaction. Setting aside questions of user interface design and natural language interaction, this makes CIRs a natural setting for reinforcement learning (RL).

So the researchers have built a general-purpose simulation platform, dubbed RecSim, to facilitate the study of reinforcement learning algorithms in recommender systems. RecSim has also been open-sourced.

RecSim As A Platform

RecSim allows both researchers and practitioners to test the limits of existing RL methods in synthetic recommender settings.

RecSim simulates a recommender agent’s interaction with an environment, in which the agent acts by making recommendations to users. Both the users and the items being recommended are simulated.

The simulations are driven by traits such as popularity, interests, demographics and frequency, among others.
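The interaction loop described above can be sketched in a few lines of plain Python. This is a toy illustration only, not RecSim's API: the class names `SimulatedUser` and `EpsilonGreedyAgent`, the three topics and their click probabilities are all hypothetical, and the agent is a simple epsilon-greedy bandit standing in for a full RL method.

```python
import random


class SimulatedUser:
    """Hypothetical user whose hidden interests drive click behaviour."""

    def __init__(self, interests, rng):
        self.interests = interests  # topic -> click probability (assumed values)
        self.rng = rng

    def respond(self, topic):
        # Returns 1 (click) or 0 (skip) for the recommended topic.
        return 1 if self.rng.random() < self.interests[topic] else 0


class EpsilonGreedyAgent:
    """Toy agent that mostly recommends the topic with the best observed click rate."""

    def __init__(self, topics, epsilon, rng):
        self.topics = list(topics)
        self.epsilon = epsilon
        self.rng = rng
        self.clicks = {t: 0 for t in self.topics}
        self.shows = {t: 0 for t in self.topics}

    def _rate(self, topic):
        # Unseen topics get an optimistic rate so each is tried at least once.
        if self.shows[topic] == 0:
            return 1.0
        return self.clicks[topic] / self.shows[topic]

    def recommend(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.topics)  # explore
        return max(self.topics, key=self._rate)  # exploit

    def update(self, topic, reward):
        self.shows[topic] += 1
        self.clicks[topic] += reward


def run_simulation(steps=2000, seed=0):
    rng = random.Random(seed)
    user = SimulatedUser({"sports": 0.1, "music": 0.6, "news": 0.3}, rng)
    agent = EpsilonGreedyAgent(user.interests.keys(), epsilon=0.1, rng=rng)
    total_reward = 0
    for _ in range(steps):
        topic = agent.recommend()     # agent acts
        reward = user.respond(topic)  # simulated user reacts
        agent.update(topic, reward)   # agent learns from the reward
        total_reward += reward
    return agent, total_reward
```

Calling `run_simulation()` returns the trained agent and the total number of clicks it collected. Because the user is simulated, the same experiment can be repeated with different seeds or interest profiles at no cost to real users.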

So, how is this different from conventional approaches?

When an RL agent recommends something to a user, a few traits are scored higher depending on whether the user accepts the recommendation. This still sounds like a typical recommendation system. However, with RecSim, a developer can author these traits. The features in a user choice model can be customised further as the agent gets rewarded for making the right recommendations.
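As a rough illustration of what an authored user choice model could look like, here is a minimal sketch of a softmax (multinomial logit) choice over a slate of recommended items. It is not taken from RecSim's code: the function names, the two-dimensional feature vectors and the dot-product affinity score are assumptions made for this example.

```python
import math
import random


def choice_probabilities(user_interests, slate):
    """Softmax over user-item affinity scores (dot product of feature vectors)."""
    scores = [sum(u * f for u, f in zip(user_interests, item)) for item in slate]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def choose(user_interests, slate, rng):
    """Sample the index of the item the simulated user picks from the slate."""
    probs = choice_probabilities(user_interests, slate)
    return rng.choices(range(len(slate)), weights=probs, k=1)[0]


# Hypothetical authored traits: [interest in music, interest in sports]
user = [1.0, 0.0]
slate = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
probs = choice_probabilities(user, slate)
picked = choose(user, slate, random.Random(0))
```

Here the developer controls the user's traits and each item's features directly, so the item that best matches the user's interests is the most likely to be chosen, and the agent's reward can be tied to that choice.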

The team behind RecSim at Google believes that this simulation platform can be used to test algorithm performance and robustness to different assumptions about user behaviour.

RecSim was created to facilitate the following:

  • Investigate the intersection of RL and recommender systems; 
  • Encourage reproducibility and model-sharing; 
  • Rapidly test and refine models and algorithms in simulation, before incurring the potential cost of live experiments; and 
  • Ease academic-industry collaboration through the release of “realistic” stylised models of user behaviour without revealing user data or sensitive industry strategies.

RecSim’s aim is to support simulations that mimic the user behaviour that is found in real recommender systems and serve as a controlled environment for developing and assessing recommender models and algorithms, especially reinforcement learning systems designed for sequential user-system interaction.

Future Direction

The Google researchers see promise in continuing this line of work on RL-based recommenders, and they plan to pursue the following:

  • Develop methodologies to fit stylised user models to usage logs, to partially address the gap between reality and simulation; 
  • Develop APIs using TensorFlow to facilitate model specification and learning; 
  • Scale up simulation and inference algorithms using accelerators and distributed execution; and 
  • Establish mixed-mode interaction models that will be the de facto standard for modern CIRs.

They posit in their work that modern collaborative interactive recommenders will cover a variety of system actions (such as preference elicitation, providing endorsements and navigation chips) and user responses (e.g., example critiquing, indirect/direct feedback, query refinements), not to mention unstructured natural language interaction.

Researchers hope that RecSim will serve as a valuable resource that bridges the gap between recommender systems and RL research.

PS: The story was written using a keyboard.
Ram Sagar

I have a master's degree in Robotics and I write about machine learning advancements.