Hands-On Guide To Recommendation System Using Collaborative Filtering

Published on November 2, 2020
by Ankit Das

Recommendation systems aim to predict users' preferences and surface the products they are most likely to buy or find interesting. Organizations that use recommender systems focus on increasing sales through highly personalized offers and an improved customer experience. Netflix, Amazon, and others use recommender systems to help their customers find the right products or films.

In this article, we will discuss the types of recommendation systems and cover the collaborative filtering method in detail, with an implementation.

Types Of Recommendation System

1. Collaborative Filtering

Collaborative filtering finds similar users or similar items and estimates a user's rating for an item from the ratings given by those similar users or to those similar items. There are several ways to calculate this estimate.

User-Based: The system finds users who have rated items in the same way. Suppose user A likes movies 1, 2, and 3 and user B likes movies 1 and 2; the system will then recommend movie 3 to B.

Item-Based: Here, the system finds items that tend to be liked by the same users. For example, if A and B both like movies 1 and 3, and C likes movie 3, the system will recommend movie 1 to user C.
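The item-based example above can be sketched on toy data. The `likes` table and the `recommend_for` helper below are hypothetical, and counting co-likes is a simplification used only for illustration; the implementation later in this article uses Pearson correlation instead.

```python
import pandas as pd

# Hypothetical like matrix: rows are users, columns are movies, 1 = liked.
likes = pd.DataFrame(
    {"movie_1": [1, 1, 0], "movie_2": [0, 0, 0], "movie_3": [1, 1, 1]},
    index=["A", "B", "C"],
)

# Item-item co-occurrence: how many users liked both movies of each pair.
co_likes = likes.T.dot(likes)

def recommend_for(user):
    """Rank movies the user has not liked by co-occurrence with liked ones."""
    liked = likes.columns[(likes.loc[user] == 1).to_numpy()]
    # Sum the co-like counts against every movie the user already likes.
    scores = co_likes[liked].sum(axis=1)
    # Do not recommend what the user already likes.
    scores = scores.drop(liked)
    return scores.sort_values(ascending=False)

print(recommend_for("C"))
```

User C likes only movie 3, which two other users liked together with movie 1, so movie 1 ranks first for C.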

2. Content Based Filtering

It works on the principle of similar content. If a user watches a movie of one genre and rates it highly, the system looks for well-rated movies of the same genre and recommends them to the user.
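As a rough illustration of this idea, the sketch below ranks movies by genre overlap with a movie the user liked. The catalogue, the `min_rating` cutoff, and the use of Jaccard similarity are all assumptions made for this example, not part of the original article.

```python
# Hypothetical catalogue: title -> (set of genres, average rating).
catalogue = {
    "Movie A": ({"Action", "Thriller"}, 4.2),
    "Movie B": ({"Action", "Adventure"}, 3.9),
    "Movie C": ({"Romance", "Drama"}, 4.5),
    "Movie D": ({"Action", "Thriller"}, 2.1),
}

def jaccard(a, b):
    """Share of genres the two sets have in common."""
    return len(a & b) / len(a | b)

def content_based(liked_title, min_rating=3.5):
    """Return (title, similarity) pairs for well-rated movies sharing genres."""
    liked_genres, _ = catalogue[liked_title]
    candidates = []
    for title, (genres, avg_rating) in catalogue.items():
        # Skip the liked movie itself and anything rated below the cutoff.
        if title == liked_title or avg_rating < min_rating:
            continue
        similarity = jaccard(liked_genres, genres)
        if similarity > 0:
            candidates.append((title, similarity))
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)

print(content_based("Movie A"))
```

Here Movie D shares both genres with Movie A but is filtered out by its low average rating, and Movie C shares no genre, so only Movie B is returned.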

In this article, we will cover the item-based collaborative filtering approach to recommend items to the user.

Code Implementation

The movie dataset can be downloaded from the following link.

Import the required library and load the data files.

import pandas as pd

# Load the movie metadata, the user ratings, and the user tags
movies = pd.read_csv("movies.csv", encoding="Latin1")
Ratings = pd.read_csv("ratings.csv")
Tags = pd.read_csv("tags.csv", encoding="Latin1")
movies.head()

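Now we need to merge the two datasets, movies and ratings, and then pivot the result into a user-by-movie matrix.

# Join on the column(s) the two frames share, then drop the columns we do not need
ratings = pd.merge(movies, Ratings).drop(['genres', 'timestamp'], axis=1)
print(ratings.shape)
ratings.head()

# One row per user, one column per movie title; unrated cells are NaN
UserRatings = ratings.pivot_table(index=['userId'], columns=['title'], values='rating')
UserRatings.head()

print("Before: ", UserRatings.shape)
# Keep movies that have at least 10 ratings, then treat missing ratings as 0
UserRatings = UserRatings.dropna(thresh=10, axis=1).fillna(0, axis=1)
print("After: ", UserRatings.shape)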
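To make the pivot and threshold step concrete, here is a minimal sketch on hypothetical data. The `toy` frame and the threshold of 3 are assumptions chosen so the effect is visible on a handful of rows.

```python
import pandas as pd

# Hypothetical long-format ratings: one row per (user, movie) rating.
toy = pd.DataFrame({
    "userId": [1, 1, 2, 2, 3],
    "title": ["Film X", "Film Y", "Film X", "Film Y", "Film X"],
    "rating": [4.0, 3.0, 5.0, 2.0, 1.0],
})

# Wide format: users as rows, movies as columns, NaN where a user gave no rating.
wide = toy.pivot_table(index="userId", columns="title", values="rating")
print(wide)

# Keep only movies with at least 3 non-missing ratings, then zero-fill the rest.
filtered = wide.dropna(thresh=3, axis=1).fillna(0)
print(filtered)
```

Film Y has only two ratings, so it is dropped. Note that zero-filling treats "not rated" as a rating of 0, which influences any correlation computed afterwards.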

Suppose we measure the distance between two rating vectors using Euclidean distance. If the vectors point in the same direction but differ in magnitude, the calculated distance will be large even though the rating patterns are alike. To overcome this problem, we measure the angle between the vectors rather than the Euclidean distance; this measure is called cosine similarity (or cosine distance). Another approach is Pearson correlation, which is cosine similarity computed after subtracting each vector's mean.
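The difference between the three measures can be checked numerically. The vectors `a`, `b`, and `c` below are made up for illustration, and the helper names are hypothetical.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def pearson(u, v):
    """Pearson correlation: cosine similarity of the mean-centred vectors."""
    return cosine_similarity(u - u.mean(), v - v.mean())

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])  # same direction as a, twice the magnitude
c = np.array([4.0, 5.0, 6.0])  # a shifted up by a constant 3

print("Euclidean a-b:", np.linalg.norm(a - b))   # large despite the same pattern
print("Cosine    a-b:", cosine_similarity(a, b)) # ignores the magnitude
print("Cosine    a-c:", cosine_similarity(a, c)) # affected by the constant shift
print("Pearson   a-c:", pearson(a, c))           # the shift is subtracted away
```

Scaling a vector leaves its cosine similarity unchanged, while adding a constant to every rating does not; Pearson correlation removes that offset by centring each vector first.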

Let's implement this using the Pearson correlation approach.

# Pairwise Pearson correlation between columns, i.e. a movie-by-movie matrix
corrMatrix = UserRatings.corr(method='pearson')
corrMatrix.head(10)
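The code above correlates the zero-filled matrix. As an alternative that this article does not use, missing ratings can be left as NaN and `DataFrame.corr` asked to require a minimum number of shared raters through its `min_periods` parameter. The `sparse` frame below is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical user x movie matrix with missing ratings kept as NaN.
sparse = pd.DataFrame({
    "Film X": [5.0, 4.0, np.nan, 1.0],
    "Film Y": [4.0, 5.0, 2.0, np.nan],
    "Film Z": [np.nan, np.nan, 3.0, 4.0],
})

# Only correlate pairs of movies that at least 2 users have both rated;
# pairs with fewer shared raters come back as NaN.
sparse_corr = sparse.corr(method="pearson", min_periods=2)
print(sparse_corr)
```

Film X and Film Z share a single rater, so their correlation is reported as NaN instead of being driven by filled-in zeros.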

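def get_similar(movie_name, rating):
    # Weight every movie's correlation with movie_name by how far the user's
    # rating lies from the scale midpoint (2.5); ratings below it give negative weights
    similar_ratings = corrMatrix[movie_name] * (rating - 2.5)
    similar_ratings = similar_ratings.sort_values(ascending=False)
    return similar_ratings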

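Here, we score all movies for a user who likes romantic movies: The Reader is rated 5 and Alice in Wonderland 3, while Aliens and 2001: A Space Odyssey are rated 1 and 2. Ratings below 2.5 contribute negatively, so movies correlated with the last two are pushed down the list.

romantic_movies = [("Reader, The (2008)", 5), ("Alice in Wonderland (2010)", 3), ("Aliens (1986)", 1), ("2001: A Space Odyssey (1968)", 2)]

# Collect one row of weighted correlations per rated movie
# (DataFrame.append was removed in pandas 2.0, so we build the frame from a list)
rows = []
for movie, rating in romantic_movies:
    rows.append(get_similar(movie, rating))
similar_movies = pd.DataFrame(rows)
similar_movies.head(10)

# Add up the weighted correlations per movie and show the 20 best candidates
similar_movies.sum().sort_values(ascending=False).head(20)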
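The weighting and summing steps can be traced by hand on a small matrix. Everything in the sketch below (the `toy_ratings` frame, the film names, and the `toy_get_similar` helper) is hypothetical; it mirrors the approach above and additionally drops the movies the user has already rated, which the code above does not do.

```python
import pandas as pd

# Hypothetical zero-filled user x movie ratings (0 = not rated).
toy_ratings = pd.DataFrame(
    {
        "Film A": [5, 4, 1, 0],
        "Film B": [4, 5, 0, 1],
        "Film C": [1, 0, 5, 4],
        "Film D": [0, 1, 4, 5],
    },
    index=[1, 2, 3, 4],
)
toy_corr = toy_ratings.corr(method="pearson")

def toy_get_similar(movie_name, rating, midpoint=2.5):
    """Correlations with movie_name, weighted by the rating's offset from the midpoint."""
    return toy_corr[movie_name] * (rating - midpoint)

# The user loved Film A and disliked Film C.
watched = [("Film A", 5), ("Film C", 1)]
rows = pd.DataFrame([toy_get_similar(movie, rating) for movie, rating in watched])

# Sum the contributions per movie and leave out what was already rated.
scores = rows.sum().drop([movie for movie, _ in watched])
scores = scores.sort_values(ascending=False)
print(scores)
```

Film B correlates positively with the liked Film A and negatively with the disliked Film C, so both ratings push it up; Film D shows the opposite pattern and ends with a negative score.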

Let's do the same for a user who likes action movies such as Skyfall and Mission: Impossible III.

action_movies = [("Skyfall (2012)", 5), ("Mission: Impossible III (2006)", 4), ("Toy Story 3 (2010)", 2), ("2 Fast 2 Furious (Fast and the Furious 2, The) (2003)", 4)]

# Collect one row of weighted correlations per rated movie
# (DataFrame.append was removed in pandas 2.0, so we build the frame from a list)
rows = []
for movie, rating in action_movies:
    rows.append(get_similar(movie, rating))
similar_movies = pd.DataFrame(rows)
similar_movies.head(10)

# Add up the weighted correlations per movie and show the 20 best candidates
similar_movies.sum().sort_values(ascending=False).head(20)

Conclusion

I hope this article has given you a basic idea of how item-based collaborative filtering works in recommendation systems. From here, you can explore user-based collaborative filtering, hybrid models, and content-based filtering, and build your own recommendation system.
The complete code of the above implementation is available at AIM's GitHub repository. Please visit this link to find the notebook of this code.

Ankit Das

A data analyst with expertise in statistical analysis, data visualization ready to serve the industry using various analytical platforms. I look forward to having in-depth knowledge of machine learning and data science. Outside work, you can find me as a fun-loving person with hobbies such as sports and music.