Guide to Product Recommendation Using ML.NET

Product Recommendation

Product recommendation in machine learning refers to the task of recommending products to a customer based on their purchase history. A product recommender system is an ML model that suggests items, content or services that a specific user is likely to buy or engage with. Here, we use Amazon’s product co-purchasing network dataset to create a C# .NET Core console application that works as a product recommender system.

In our previous articles, we have already covered the basics of ML.NET and implementation of an image classifier using the framework. Let us move forward to another use case of ML.NET i.e. product recommendation.

Two types of recommendation systems


Current recommendation systems can be broadly divided into two categories:

  1. Content-based filters

Such filters use information/features related to the products themselves rather than using users’ preferences. For instance, using movies’ genre, star cast, year of release, duration and so on as features to recommend movies to the viewers.

  2. Collaborative filters

 Unlike content-based ones, these filters take users’ choices and feedback into consideration. Recommending movies to a viewer based on the historical data of ratings given by different viewers to different movies is an example of collaborative filtering.

Prerequisites 

  • Use Visual Studio 2019 or higher version
  • Or use Visual Studio 2017 version 15.6 or higher with the .NET Core cross-platform development workload installed

Algorithm used

ML.NET uses collaborative filtering methods for building recommendation systems. It does so by providing an algorithm called Matrix Factorization (MF) which you can implement using the MatrixFactorizationTrainer class.

Visit this page to understand what Matrix Factorization is and how it works.
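
To make the idea concrete, here is a minimal sketch, independent of ML.NET, of how a factorized model scores a pair: each product is represented by a small latent vector, and the score of a pair is the dot product of the two vectors. The class name, the vectors and their values are hypothetical and chosen only for illustration; a real trainer learns such vectors from the data.

```csharp
using System;

public static class MatrixFactorizationSketch
{
    // Score of a (row, column) pair = dot product of their latent vectors.
    public static float Score(float[] rowFactors, float[] columnFactors)
    {
        if (rowFactors.Length != columnFactors.Length)
            throw new ArgumentException("Latent vectors must have the same length.");

        float sum = 0f;
        for (int i = 0; i < rowFactors.Length; i++)
            sum += rowFactors[i] * columnFactors[i];
        return sum;
    }

    public static void Main()
    {
        // Hypothetical latent vectors of rank 2 (made-up values).
        float[] productA = { 1.0f, 0.5f };
        float[] productB = { 2.0f, 0.0f };
        float[] productC = { 0.0f, 1.0f };

        Console.WriteLine(Score(productA, productB)); // 1*2 + 0.5*0
        Console.WriteLine(Score(productA, productC)); // 1*0 + 0.5*1
    }
}
```

The length of the latent vectors corresponds to the rank of the factorization; a larger rank can capture more structure at the cost of longer training.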

Dataset used

The Amazon dataset used here consists of pairs of product IDs: a product and a product that was co-purchased with it. It originally comes from the Stanford Network Analysis Platform (SNAP). The data is based on the Amazon website’s well-known feature called ‘Customers Who Bought This Item Also Bought’.

The dataset is available on its SNAP page: https://snap.stanford.edu/data/amazon0302.html

Implementation steps

Create a C# .NET Core console application. Then install the Microsoft.ML NuGet package, along with the Microsoft.ML.Recommender package, which provides the matrix factorization trainer used below.

Open the Program.cs file and replace the ‘using’ statements with the following ones:

using System;
using System.IO;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers;

Path definitions 

Define the path locations of your dataset and model.

 // Relative path of the data folder
 private static string DatasetPath = @"../../../Data";
 // Relative path of the dataset
 private static string TrainDataRelPath = $"{DatasetPath}/amazon0302.txt";
 // Absolute location of the dataset
 private static string TrainDataAbsPath = GetAbsolutePath(TrainDataRelPath);

 // Relative path of the model folder
 private static string Model = @"../../../Model";
 // Relative path of the model
 private static string ModelRelPath = $"{Model}/model.zip";
 // Absolute location of the model
 private static string ModelAbsPath = GetAbsolutePath(ModelRelPath);

The GetAbsolutePath() function is defined as follows:

 public static string GetAbsolutePath(string relativePath)
 {
     // Resolve the path relative to the folder containing the running assembly
     FileInfo root = new FileInfo(typeof(Program).Assembly.Location);
     string folderPath = root.Directory.FullName;
     string fullPath = Path.Combine(folderPath, relativePath);
     return fullPath;
 }

 Click here to understand the FileInfo class.

Context creation

Inside the Main() method, instantiate the MLContext class.

MLContext myContext = new MLContext();

The ‘myContext’ object will be shared across all the objects involved in the model creation workflow.

Data loading

Download the dataset from https://snap.stanford.edu/data/amazon0302.html and place it in the Data folder as amazon0302.txt.

Change the column names in the header so that the dataset looks as follows:

 ProductID  CoPurchaseProductID
 0  1
 0  20
 1  32 
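
The header change can also be scripted. The following is a minimal sketch, assuming the raw SNAP file is tab-separated and begins with comment lines prefixed by '#' (an assumption about the file layout, not something stated above). The class and method names are hypothetical helpers introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DatasetPreparation
{
    // Drops '#' comment lines and blank lines, then prepends the new header.
    public static IEnumerable<string> WithHeader(IEnumerable<string> rawLines)
    {
        yield return "ProductID\tCoPurchaseProductID";
        foreach (string line in rawLines)
        {
            if (line.Length == 0 || line[0] == '#')
                continue;
            yield return line;
        }
    }

    public static void Main(string[] args)
    {
        // Hypothetical usage: <program> <rawFile> <preparedFile>
        // (the two paths are expected to differ)
        if (args.Length == 2)
        {
            File.WriteAllLines(args[1], WithHeader(File.ReadLines(args[0])));
            return;
        }

        // Without arguments, demonstrate on a small in-memory sample.
        string[] sample = { "# FromNodeId\tToNodeId", "0\t1", "0\t20", "1\t32" };
        foreach (string line in WithHeader(sample))
            Console.WriteLine(line);
    }
}
```

Streaming the lines with File.ReadLines avoids loading the whole file into memory, which is convenient for a dataset of this size.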

Read the training data using TextLoader by defining the schema for the product co-purchase dataset:

 var trainData = myContext.Data.LoadFromTextFile(
     path: TrainDataAbsPath,
     // define the schema
     columns: new[]
     {
         // column for target label
         new TextLoader.Column("Label", DataKind.Single, 0),
         // column for ProductID
         new TextLoader.Column(name: nameof(ProductEntry.ProductID),
             dataKind: DataKind.UInt32,
             source: new[] { new TextLoader.Range(0) },
             keyCount: new KeyCount(262111)),
         // column for CoPurchaseProductID
         new TextLoader.Column(name: nameof(ProductEntry.CoPurchaseProductID),
             dataKind: DataKind.UInt32,
             source: new[] { new TextLoader.Range(1) },
             keyCount: new KeyCount(262111))
     },
     hasHeader: true,
     separatorChar: '\t');

Among the parameters of TextLoader.Column(), ‘dataKind’ is the data type of the items in the column, ‘source’ defines the source index ranges of the column, and ‘keyCount’ specifies the number of distinct values in a key column (here 262111, the number of products in the dataset).

Click here to know more about the TextLoader.Column class.

Define the model training pipeline

As the data is already in encoded form, no extra transforms are needed; we only set the options of the MatrixFactorizationTrainer, namely the column names, the loss function and a few hyperparameters.

 MatrixFactorizationTrainer.Options opt = new MatrixFactorizationTrainer.Options();
 opt.MatrixColumnIndexColumnName = nameof(ProductEntry.ProductID);
 opt.MatrixRowIndexColumnName = nameof(ProductEntry.CoPurchaseProductID);
 opt.LabelColumnName = "Label";
 opt.LossFunction = MatrixFactorizationTrainer.LossFunctionType.SquareLossOneClass;
 // hyperparameters
 opt.Alpha = 0.01;
 opt.Lambda = 0.025;
 // rank of the factorization (this option was named K in pre-1.0 releases)
 opt.ApproximationRank = 100;
 opt.C = 0.00001;

Pass the options to the MatrixFactorization trainer

var estimator = myContext.Recommendation().Trainers.MatrixFactorization(opt);

Train the estimator on the training data

ITransformer model = estimator.Fit(trainData);

Use the model for predictions

Define two classes for the prediction engine: one for its input and one for its output.

 public class Copurchase_prediction
 {
     // predicted score for co-purchased product
     public float Score { get; set; }
 }

 public class ProductEntry
 {
     [KeyType(count: 262111)]
     public uint ProductID { get; set; }

     [KeyType(count: 262111)]
     public uint CoPurchaseProductID { get; set; }
 }

Create Prediction Engine

var predeng = myContext.Model.CreatePredictionEngine<ProductEntry, Copurchase_prediction>(model);

Using the prediction engine, predict the score for product #50 being the co-purchased product of product #2:

 var pred = predeng.Predict(
     new ProductEntry()
     {
         ProductID = 2,
         CoPurchaseProductID = 50
     });

Run the console application.

Output interpretation

The score produced by the matrix factorization trainer is a numerical measure of how likely one product is to be bought together with the other. It is not a probability, but a higher score indicates a higher likelihood. For a given product, scores are computed for multiple candidate products, and the one with the highest score is recommended as the co-purchased product.
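
The ranking step described above can be sketched as follows. This is only an illustration: the class name, the TopN helper and the stand-in scoring function with made-up values are hypothetical, and the scorer is passed in as a delegate so that the sketch runs without a trained model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TopRecommendations
{
    // Scores every candidate against the given product and returns the
    // n candidates with the highest scores, best first.
    public static List<(uint ProductId, float Score)> TopN(
        Func<uint, uint, float> score, uint productId,
        IEnumerable<uint> candidates, int n)
    {
        return candidates
            .Where(c => c != productId)
            .Select(c => (ProductId: c, Score: score(productId, c)))
            .OrderByDescending(p => p.Score)
            .Take(n)
            .ToList();
    }

    public static void Main()
    {
        // Stand-in scorer with made-up values: candidates closer to ID 10 score higher.
        Func<uint, uint, float> fakeScore =
            (product, other) => 1f / (1 + Math.Abs((int)other - 10));

        foreach (var (id, s) in TopN(fakeScore, 2, new uint[] { 5, 9, 10, 30 }, 2))
            Console.WriteLine($"{id}: {s}");
    }
}
```

With the trained model, the scorer could wrap the prediction engine, for example (p, c) => predeng.Predict(new ProductEntry { ProductID = p, CoPurchaseProductID = c }).Score.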

  • Refer to the GitHub repository and dive deeper into such an interesting use case of ML.NET – product recommendation!
Nikita Shiledarbaxi
A zealous learner aspiring to advance in the domain of AI/ML. Eager to grasp emerging techniques to get insights from data and hence explore realistic Data Science applications as well.
