Collaborative Interactive Recommenders (CIRs) are a class of recommender systems that emerged out of the need to make recommendations user-specific. The growth of online services has pushed providers to up their game, developing strategies to maximise customer engagement.

However, these systems are machine learning models, and their results are only as good as the data they are fed. Since humans are the product of complex systems, decisions are usually made in a large, hazy grey area. This makes human-like intelligence in machines a distant dream. That said, algorithms are being designed to come ever closer to it.

In an attempt to make better decisions and recommendations, ML developers at Google have merged reinforcement learning and recommender systems.

The next generation of recommenders is forecast to be modelled around sequential user interaction, optimising users' long-term engagement and overall satisfaction. The importance of modelling the dynamics of user interaction when devising good algorithmic and modelling techniques for CIRs is plain. Setting aside questions of user interface design and natural language interaction, this makes CIRs a natural setting for reinforcement learning (RL).

So the researchers have built a general-purpose simulation platform, dubbed RecSim, to facilitate the study of reinforcement learning algorithms in recommender systems. RecSim has also been open-sourced.

RecSim As A Platform

RecSim allows both researchers and practitioners to test the limits of existing RL methods in synthetic recommender settings.

RecSim simulates a recommender agent's interaction with an environment, where the agent acts by making recommendations to users.
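This agent–environment loop can be sketched in plain Python. Note that this is an illustrative toy, not RecSim's actual API: RecSim defines its environments, user models, and choice models through its own (TensorFlow-based) interfaces, and every class and parameter below is invented for the sketch. It assumes a hidden user interest vector, a multinomial-logit choice over a recommended slate, and a simple interest-drift rule.

```python
import math
import random

random.seed(0)

NUM_TOPICS = 4   # dimensionality of item features and user interests
SLATE_SIZE = 2   # items recommended per step

class SimulatedUser:
    """A stylised user: a hidden interest vector over topics."""

    def __init__(self):
        self.interests = [random.random() for _ in range(NUM_TOPICS)]

    def choose(self, slate):
        """Multinomial-logit choice over the slate plus a 'no click' option.

        Returns the index of the chosen item, or None for no click.
        """
        scores = [sum(i * f for i, f in zip(self.interests, item))
                  for item in slate]
        scores.append(0.0)  # utility of not clicking anything
        exps = [math.exp(s) for s in scores]
        r, acc = random.random() * sum(exps), 0.0
        for idx, e in enumerate(exps):
            acc += e
            if r <= acc:
                return idx if idx < len(slate) else None
        return None

    def update(self, item):
        """Clicked items nudge the user's interests (simple drift model)."""
        self.interests = [0.9 * i + 0.1 * f
                          for i, f in zip(self.interests, item)]

def random_item():
    """An item is just a feature vector over the same topics."""
    return [random.random() for _ in range(NUM_TOPICS)]

# A trivial agent: recommend random slates; clicks are the reward signal
# an RL agent would learn to maximise.
user = SimulatedUser()
clicks = 0
for step in range(100):
    slate = [random_item() for _ in range(SLATE_SIZE)]
    choice = user.choose(slate)
    if choice is not None:
        clicks += 1
        user.update(slate[choice])

print("clicks over 100 steps:", clicks)
```

Replacing the random-slate agent with one that learns from the click reward is exactly the kind of experiment RecSim is built to host, with the user model and choice model swapped for authored, configurable versions.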
Both the user and the items being recommended are simulated. The simulations are based on traits such as popularity, interests, demographics, and frequency of engagement.

So, how is this different from conventional approaches?

When an RL agent recommends something to a user, then depending on the user's acceptance, certain traits are scored higher. This still sounds like a typical recommender system. With RecSim, however, a developer can author these traits: the features in a user choice model can be customised, and the agent is rewarded for making the right recommendation.

The team behind RecSim at Google believes this simulation platform can be used to test an algorithm's performance and its robustness to different assumptions about user behaviour.

RecSim was created to facilitate the following: investigating the intersection of RL and recommender systems; encouraging reproducibility and model-sharing; rapidly testing and refining models and algorithms in simulation, before incurring the potential cost of live experiments; and easing academic-industry collaboration through the release of "realistic" stylised models of user behaviour, without revealing user data or sensitive industry strategies.

RecSim's aim is to support simulations that mimic the user behaviour found in real recommender systems and to serve as a controlled environment for developing and assessing recommender models and algorithms, especially reinforcement learning systems designed for sequential user-system interaction.

Future Direction

As Google researchers see a promising future in pursuing this reinforcement-learning-and-recommender investigation, they plan to develop the following add-ons: methodologies for fitting stylised user models to usage logs, to partially close the gap between reality and simulation; APIs using TensorFlow to facilitate model specification and learning; scaled-up simulation and inference algorithms using accelerators and distributed execution; and mixed-mode interaction models intended to become the de facto standard for modern CIRs.

They posit in their work that modern collaborative interactive recommenders will cover a variety of system actions, such as preference elicitation, providing endorsements, and navigation chips, as well as user responses (e.g., example critiquing, indirect/direct feedback, query refinements), not to mention unstructured natural language interaction.

The researchers hope that RecSim will serve as a valuable resource that bridges the gap between recommender systems and RL research.