Now Reading
Introduction To Infer.NET – A Framework For Probabilistic Programming in DOTNET

Introduction To Infer.NET – A Framework For Probabilistic Programming in DOTNET

Infer.NET

Introduction

Infer.NET is a framework for making Bayesian inference on graphical models. The user specifies the factors and variables of a graphical model. Infer.NET analyses them and creates a schedule for making inference on the model. The model can then be queried for marginal distributions.

Microsoft Research and .NET Foundation developed Infer.NET. It is also a part of ML.NET machine learning framework. (Read this article if you are unfamiliar with ML.NET).

Originally written in C# language, the development of Infer.NET began Cambridge, UK in 2004. It was initially released for academic purposes in 2008 and then was open-sourced in 2018. 

Infer.NET is used internally at Microsoft as the machine learning engine in some of its products such as Office, Azure, and Xbox. It is already in use in several projects related to social networking, healthcare, computational biology, web search, machine vision, etc. It enables the creation and handling of complex models in these projects with just a few lines of code.

Before moving on to the details of Infer.NET, let us first understand some of the underlying concepts.

Probabilistic Programming

Probabilistic programming is a programming paradigm in which statistical models are specified for which inference occurs automatically. Try to understand it through a lucid example given below.

Suppose, a programmer scribbles a word on a touch screen. He may be certain that he wanted to write the word “bull”, but the application may remain in ambiguity about whether the scribbled hand-writing meant “hull” or “bull”. Probabilistic programming resolves such uncertainty. Wondering how? Read further to know!

Conventional variables in programming such as a bool or an int variable need some well-defined value. On the contrary, probabilistic programming approach uses the concept of random variables. They are nothing but the extensions of standard data types and can represent uncertain values. Each random variable represents a set or range of possible values. It also has an associated distribution which assigns a probability to each of the possible values. 

Click here to read about probabilistic programming in detail.

Bayesian Inference

Bayes inference is a methodology which forms the basis of a probabilistic programming paradigm. It enables reasoning backwards from observation to its origin – say, from scribble to the possible words and their probabilities based on a probabilistic model. It also includes a robust way of assessing the models’ quality (termed as ‘evidence’) which one can use to choose the best model.

Overview of Infer.NET

Microsoft’s Infer.NET framework can be used from any .NET language, including C++, C#, F#, Visual Basic and IronPython

It significantly simplifies the process of implementing probabilistic programming by providing:

  •  A modelling API which simplifies the creation of probabilistic models.
  •  An inference engine, which combines the model with your observations, and handles the complex inference process.

The goal of Infer.NET is to allow Machine Learning algorithms to be created in hours instead of several days or weeks. This is a challenging task since the framework needs to cope with a huge variety of real-world problems and fast enough to cope with today’s internet-scale datasets. The flexibility in Infer.NET comes from a rich modelling language that allows complex customized models to be designed and quickly tested. 

Highlighting features of Infer.NET

  • It supports continuous as well as discrete variables. These variables can be combined using factors including arithmetic operations, linear algebra, range and positivity constraints, boolean operators and many more.
  • It has been designed to be scalable to large models and large datasets. 
  • It acts as a compiler that converts a model description directly into source code to perform inference with no overhead costs. 
  • To achieve accurate inference in different models, it provides multiple algorithms to choose the one suited for your task.
  •  Its message-passing architecture allows you to use standard tools to step through the inference procedure and examine individual messages making it easy to identify and fix inference problems. 

How Infer.NET works?

The steps followed in sequence for the entire process summarized in the above diagram are as follows:

  1. Create a model definition (using modelling API)
  2. Specify a set of inference queries relating to the model.
  3. Pass the model definition and inference queries (Running Inference) to the model compiler.
  4. The compiler creates source code required to compute those queries on the model by applying the specified inference algorithm. 
  5. Compile the source code to create a compiled algorithm. 
  6. According to the user-specified settings, using a set of observed values, the inference engine executes the compiled algorithm. It thus produces the marginal distributions requested in the queries. The process can be repeated for different settings of the observed values without recompiling the algorithm.

Practical implementation

Here is an example of performing gender prediction task using Infer.NET. We have used Bayes Point Machine, a probabilistic model designed for classification. 

Suppose, we have the data of 1000 people in terms of height and weight of each person. The task is to predict the gender of an individual based on his height and weight information. For instance, if his height is said 160cm and he weighs 65kg, what is the probability of his being a male?

Click here to know about the probabilistic model Bayes Point Machine used for this demonstration.

Steps for implementation in Visual Studio

  • Open Microsoft Visual Studio.
  • Click on “New Project” 
  • Select “Console Application”. Name the application.
  • Right-click on “References”, choose “Add Reference…” , browse to the bin directory in the Infer.NET release and add the following three assemblies:

(i) Microsoft.ML.Probabilistic.Learners.dll

(ii) Microsoft.ML.Probabilistic.Learners.Classifier.dll

(iii) Microsoft.ML.Probabilistic.dll

Suppose, the height and weight of each person are stored in memory as instances of Vector type. (Microsoft.ML.Probabilistic.Math) and the corresponding gender (“Male” or “Female”) is stored in a string type object. Thus, the data has an array of 1000 (dense) Vector objects and an array of 1000 string objects.

Mappings

Infer.NET provides a mechanism called ‘mapping’ which allows you to specify in which format the learner should take the input data. It defines the way input data is passed for inference. It avoids unnecessary data conversions and enables choosing the format as per your convenience.

See Also
time series analysis and forecasting with R

Visit this page to learn about BPS’s two kinds of mappings (viz. IClassifierMapping and IBayesPointMachineClassifierMapping). 

Here, we are using IClassifierMapping interface, implemented using the following classes:

 public class ClassifierMapping  : IClassifierMapping<IList<Vector>, int, IList<string>, string, Vector>  
 {  
    //specify which instances are present in the data batch fed to the classifier
      public IEnumerable<int> GetInstances(IList<Vector> featureVectors)  
     {  
         for (int instance = 0; instance < featureVectors.Count; instance++)  
         {  
             yield return instance;  
         }  
     }  
 //specify how to get the feature values for a given instance
     public Vector GetFeatures(int instance, IList<Vector> featureVectors)  
     {  
         return featureVectors[instance];  
     }  
 //specify how to get the ground truth label for a given instance
     public string GetLabel(  
         int instance, IList<Vector> featureVectors, IList<string> labels)  
     {  
         return labels[instance];  
     }  
 //distinct class labels present in the data
     public IEnumerable<string> GetClassLabels(  
         IList<Vector> featureVectors = null, IList<string> labels = null)  
     {  
         return new[] { "Female", "Male" };  
     }  
 } 

Instantiate the ClassifierMapping

var mapping = new ClassifierMapping();

Create BPM binary classifier

var classifier = BayesPointMachineClassifier.CreateBinaryClassifier(mapping);

Train the classifier on the data

classifier.Train(trainingSet.FeatureVectors, trainingSet.Labels);

Where,

 trainingSet.FeatureVectors: array of Vector type which contains height and weight measurements as well as a bias 

trainingSet.Labels: array of strings showing gender label

Make predictions on unseen data

var predictions = classifier.PredictDistribution(testSet.FeatureVectors);

Suppose, we have to predict the gender of a person with weight 50kg and height 163cm. Then, testSet. FeatureVectors is an array containing a single Vector with the centred height and weight feature values for 163cm and 50kg and a constant which acts as bias.

Estimate the probability

string estimate = classifier.Predict(InstanceOfInterest, testSet.FeatureVectors);

(Read about the Predict() function of BPS here).

The output will be either ‘Male’ or ‘Female’ depending upon the training data and test sample’s information we feed to the model.

Source: https://dotnet.github.io/infer/userguide/Learners/Bayes%20Point%20Machine%20classifiers/Introduction.html

References

Refer to the following sources to have an in-depth understanding of Infer.NET:

What Do You Think?

Subscribe to our Newsletter

Get the latest updates and relevant offers by sharing your email.
Join our Telegram Group. Be part of an engaging community

Copyright Analytics India Magazine Pvt Ltd

Scroll To Top