Hands-On Guide To Recommendation System Using Collaborative Filtering

Recommendation systems aim to anticipate users' preferences and predict which products they are most likely to find interesting or to purchase. Organizations that use recommender systems focus on increasing sales through highly personalized offers and an improved customer experience. Companies such as Netflix and Amazon use recommender systems to help their customers find the right products or films.

In this article, we discuss recommendation systems and their types, and cover the collaborative filtering method in detail with an implementation.


Types Of Recommendation System

1. Collaborative Filtering

Collaborative filtering finds similar users or similar items and estimates a user's rating for an item from the ratings already given by those similar users or to those similar items.

   User-Based: The system finds users who have rated items in a similar way. Suppose user A likes movies 1, 2 and 3, and user B likes movies 1 and 2; the system will then recommend movie 3 to B.

   Item-Based: Here, the system finds items that tend to be liked by the same users. For example, if A and B like movies 1 and 3, and C likes movie 3, the system will recommend movie 1 to user C.

2. Content-Based Filtering

It works on the principle of similar content. If a user watches a movie of one genre and rates it highly, the system looks for other well-rated movies of the same genre and recommends them to the user.

In this article, we will cover the item-based collaborative filtering approach to recommend items to the user.
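The item-based example above (A and B like movies 1 and 3, C likes movie 3) can be sketched in a few lines. This is only an illustration: it uses a simple co-occurrence count as the similarity measure rather than the Pearson correlation used later in the article, and the names `likes`, `co_occurrence` and `recommend_for` are hypothetical.

```python
import pandas as pd

# Toy "likes" matrix from the example: rows are users, columns are movies,
# 1 means the user liked the movie and 0 means no interaction.
likes = pd.DataFrame(
    {"movie_1": [1, 1, 0], "movie_2": [0, 0, 0], "movie_3": [1, 1, 1]},
    index=["A", "B", "C"],
)

# Item-item co-occurrence: how many users liked both movies in each pair.
co_occurrence = likes.T.dot(likes)

def recommend_for(user):
    """Rank movies the user has not liked by co-occurrence with the ones they did like."""
    liked = likes.columns[likes.loc[user] == 1]
    scores = co_occurrence[liked].sum(axis=1).drop(liked)
    return scores.sort_values(ascending=False)

print(recommend_for("C"))
```

For user C, movie 1 ranks first because both A and B liked it together with movie 3.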

Code Implementation

The movie dataset can be downloaded from the following link.

Import pandas and load the data files.

import pandas as pd
# The movie and tag files are read with Latin-1 encoding
movies = pd.read_csv("movies.csv", encoding="Latin1")
Ratings = pd.read_csv("ratings.csv")
Tags = pd.read_csv("tags.csv", encoding="Latin1")

Now we need to merge the two datasets, movies and ratings.

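# Join movies and ratings on their shared column(s), then drop the columns we do not need
ratings = pd.merge(movies, Ratings).drop(['genres', 'timestamp'], axis=1)
# One row per user, one column per movie title; unrated movies are NaN
UserRatings = ratings.pivot_table(index=['userId'], columns=['title'], values='rating')
print("Before: ", UserRatings.shape)
# Keep movies with at least 10 ratings, then fill the remaining gaps with 0
UserRatings = UserRatings.dropna(thresh=10, axis=1).fillna(0)
print("After: ", UserRatings.shape)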
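To see what the pivot and the `dropna(thresh=...)` filter do, here is a minimal sketch on a made-up ratings table. The data and the names `toy`, `wide` and `filtered` are hypothetical, and a threshold of 2 is used instead of 10 so the effect is visible on six rows.

```python
import pandas as pd

# Hypothetical long-format ratings, one row per (user, movie) pair.
toy = pd.DataFrame({
    "userId": [1, 1, 2, 2, 3, 3],
    "title": ["Movie A", "Movie B", "Movie A", "Movie B", "Movie A", "Movie C"],
    "rating": [4.0, 5.0, 3.0, 4.0, 2.0, 1.0],
})

# Wide user x movie matrix; movies a user has not rated become NaN.
wide = toy.pivot_table(index="userId", columns="title", values="rating")

# Keep only movies with at least 2 ratings (the article uses thresh=10),
# then treat the remaining missing ratings as 0.
filtered = wide.dropna(thresh=2, axis=1).fillna(0)

print("Before: ", wide.shape)
print("After: ", filtered.shape)
print(filtered)
```

Movie C has a single rating and is dropped, while the missing rating of user 3 for Movie B is filled with 0.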

Suppose we measure the distance between two rating vectors using Euclidean distance. The calculated distance can be large even when the two vectors point in nearly the same direction, for example when one set of ratings is consistently higher than the other. To overcome this problem, we measure the angle between the vectors rather than the Euclidean distance between them. This measure is called cosine similarity. Another approach is Pearson correlation, which is cosine similarity computed after subtracting each vector's mean.
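The difference between the three measures can be checked numerically. The sketch below uses made-up vectors `a`, `b` and `c`, and the helper names `cosine_similarity` and `pearson` are hypothetical; it is meant as an illustration, not as the article's implementation.

```python
import numpy as np

# Hypothetical rating vectors: b has the same pattern as a at twice the scale,
# c has the same pattern as a shifted up by a constant.
a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])
c = a + 10.0

def cosine_similarity(x, y):
    """Cosine of the angle between x and y."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def pearson(x, y):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    return cosine_similarity(x - x.mean(), y - y.mean())

print("Euclidean a-b:", float(np.linalg.norm(a - b)))
print("Cosine a-b:   ", cosine_similarity(a, b))
print("Cosine a-c:   ", cosine_similarity(a, c))
print("Pearson a-c:  ", pearson(a, c))
```

Scaling leaves the cosine similarity unchanged even though the Euclidean distance grows, and subtracting the means additionally makes Pearson correlation insensitive to a constant shift in the ratings.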

Let's implement this using the Pearson correlation approach.

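# Pairwise Pearson correlation between movie columns (item-item similarity)
corrMatrix = UserRatings.corr(method='pearson')

def get_similar(movie_name, rating):
    # Weight the correlations by how far the rating is from the 2.5 midpoint:
    # liked movies pull similar titles up, disliked movies push them down
    similar_ratings = corrMatrix[movie_name] * (rating - 2.5)
    similar_ratings = similar_ratings.sort_values(ascending=False)
    return similar_ratings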

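Here, we take a user with a taste for romantic movies, who rated Reader and Alice in Wonderland higher than Aliens and 2001: A Space Odyssey, and compute the correlation-weighted scores of all movies against each of these ratings.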

romantic_movies = [("Reader, The (2008)", 5), ("Alice in Wonderland (2010)", 3), ("Aliens (1986)", 1), ("2001: A Space Odyssey (1968)", 2)]
# Build one row of scores per rated movie (DataFrame.append is not available in pandas 2.x)
similar_movies = pd.DataFrame([get_similar(movie, rating) for movie, rating in romantic_movies])

Let's do the same for a user who prefers action movies and rated Skyfall and Mission: Impossible III highly.

action_movies = [("Skyfall (2012)", 5), ("Mission: Impossible III (2006)", 4), ("Toy Story 3 (2010)", 2), ("2 Fast 2 Furious (Fast and the Furious 2, The) (2003)", 4)]
# Build one row of scores per rated movie (DataFrame.append is not available in pandas 2.x)
similar_movies = pd.DataFrame([get_similar(movie, rating) for movie, rating in action_movies])
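The code above stops at a table with one row of scores per rated movie. One plausible way to turn such a table into a ranked list, shown here as an assumption rather than as the article's own step, is to sum the scores per movie and sort. The sketch is self-contained and runs on made-up ratings; `toy_ratings`, `toy_corr`, `toy_get_similar`, `rated` and `ranking` are hypothetical names, and `toy_get_similar` mirrors the weighting used in `get_similar`.

```python
import pandas as pd

# Hypothetical user x movie ratings, with 0 standing for "not rated" as in the article.
toy_ratings = pd.DataFrame({
    "Action 1": [5, 4, 1, 0, 5],
    "Action 2": [4, 5, 1, 1, 4],
    "Romance 1": [1, 0, 5, 4, 1],
    "Romance 2": [0, 1, 4, 5, 2],
})
toy_corr = toy_ratings.corr(method="pearson")

def toy_get_similar(movie_name, rating):
    # Same weighting as get_similar: ratings above 2.5 count for, ratings below count against.
    return toy_corr[movie_name] * (rating - 2.5)

# A user who loved an action movie and disliked a romance.
rated = [("Action 1", 5), ("Romance 1", 1)]
scores = pd.DataFrame([toy_get_similar(movie, rating) for movie, rating in rated])

# Sum the scores per movie, drop the movies the user already rated, and rank the rest.
ranking = scores.sum().drop([movie for movie, _ in rated]).sort_values(ascending=False)
print(ranking)
```

Summing lets a disliked movie lower the score of titles correlated with it, so in this toy data the second action movie ends up above the second romance.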


This article has given a basic idea of how item-based collaborative filtering works in recommendation systems. As next steps, you can explore user-based collaborative filtering, content-based filtering and hybrid models. With these building blocks, you can start building your own recommendation system.
The complete code of the above implementation is available in AIM's GitHub repository. Please visit this link to find the notebook of this code.

Ankit Das
A data analyst with expertise in statistical analysis, data visualization ready to serve the industry using various analytical platforms. I look forward to having in-depth knowledge of machine learning and data science. Outside work, you can find me as a fun-loving person with hobbies such as sports and music.
