Guide To Vowpal Wabbit: A State-of-the-art Library For Interactive Machine Learning

Vowpal Wabbit is a flexible open-source project designed to tackle complex interactive machine learning tasks. With Microsoft Research and (earlier) Yahoo! Research as major project contributors, Vowpal Wabbit results from intensive community research and contributions since 2007. It provides you with rapid, online and active machine learning solutions for supervised learning and reinforcement learning. 

 Vowpal Wabbit supports Windows, macOS and Ubuntu operating systems. To date, C#, command line and Python packages of Vowpal Wabbit are available for Windows OS, while Java configuration is yet to be released. For macOS and Ubuntu, C# and Java packages will be out soon.

Most common applications of Vowpal Wabbit

  1. Reductions

  2. Contextual bandits, in which the learner learns from real-time behaviour to choose among distinct actions in a particular context. Before proceeding, refer to this page if you are unfamiliar with the contextual bandits approach.
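
To make the idea concrete, here is a minimal, self-contained sketch of a contextual bandit loop that uses an epsilon-greedy rule. It is only a toy illustration and not how Vowpal Wabbit implements contextual bandits; the class name `EpsilonGreedyBandit`, its methods and the "morning" context are all hypothetical.

```python
import random
from collections import defaultdict


class EpsilonGreedyBandit:
    """Toy contextual bandit: tracks the average cost of each (context, action) pair."""

    def __init__(self, n_actions, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.total_cost = defaultdict(float)
        self.count = defaultdict(int)

    def _avg_cost(self, context, action):
        # Unseen (context, action) pairs default to an average cost of 0.0
        n = self.count[(context, action)]
        return self.total_cost[(context, action)] / n if n else 0.0

    def choose(self, context):
        # Explore with probability epsilon, otherwise pick the cheapest known action
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(1, self.n_actions + 1)
        actions = range(1, self.n_actions + 1)
        return min(actions, key=lambda a: self._avg_cost(context, a))

    def update(self, context, action, cost):
        # Record the cost observed for the action that was actually taken
        self.total_cost[(context, action)] += cost
        self.count[(context, action)] += 1


# With epsilon=0.0 the learner never explores and always exploits what it has seen
bandit = EpsilonGreedyBandit(n_actions=2, epsilon=0.0)
bandit.update("morning", 1, cost=1.0)
bandit.update("morning", 2, cost=0.5)
chosen = bandit.choose("morning")
```

Note that feedback arrives only for the action that was taken, which is what distinguishes this setting from ordinary supervised learning.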

Highlighting features of Vowpal Wabbit

  1. Reinforcement learning
  • Learning to search (Learning2Search): It is a guided reinforcement learning technique for complex joint prediction tasks, which learns to search through the search space defined by the problem.
  • Contextual bandit approach: It enables continuous adaptation, as the learning algorithm tests various actions and learns on its own which one yields the highest reward in a particular situation.

2. Supervised learning

  • Some classification algorithms of Vowpal Wabbit can run in logarithmic time for problems having many possible output classes (such tasks are termed ‘extreme multi-class learning’). Such fast classifiers are useful for applications like recommendation systems and document tagging.
  • Vowpal Wabbit provides several algorithms for ‘active learning,’ i.e., picking which samples to label provided a source of unlabeled samples.

3. Interactive learning: Vowpal Wabbit enables online machine learning, which does not require all the input data to be available before the algorithm learns to infer. It allows learning from an expanding data source for problems that change over time.

4. Efficient learning: Vowpal Wabbit can handle problems with a huge number of sparse features. It also scales well because the size of the feature set, and hence the memory footprint, is independent of the training data size.

5. Versatile learning

  • Vowpal Wabbit can be deployed on the command line, as a daemon, as a library and as a service via MMLSpark and Microsoft Azure Cognitive Services Personalizer.
  • A flexible input format is supported, e.g. features given as free-form text, or features combined from multiple sources for ranking problems.
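
The 'efficient learning' point above is usually achieved through feature hashing: Vowpal Wabbit hashes feature names into a fixed-size weight table (2^18 slots unless changed with the -b option), so memory does not grow with the amount of data. The sketch below only illustrates the idea under stated assumptions: `hash_features` is a hypothetical helper, and it uses MD5 from Python's hashlib rather than Vowpal Wabbit's own hash function.

```python
import hashlib


def hash_features(tokens, num_bits=18):
    """Map string features to a sparse {index: value} dict with 2**num_bits slots."""
    size = 1 << num_bits
    vec = {}
    for tok in tokens:
        # Assumption: MD5 stands in for the library's real hash function
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        idx = int.from_bytes(digest[:8], "little") % size
        # Repeated (or colliding) features accumulate in the same slot
        vec[idx] = vec.get(idx, 0.0) + 1.0
    return vec


sparse = hash_features(["a", "c", "a"])
```

However many distinct feature names appear in the data, the indices always stay below 2**num_bits, at the price of occasional collisions.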

Practical implementation

Here’s a demonstration of solving a contextual bandits problem using Vowpal Wabbit. The code has been implemented in Google Colab with Python 3.7.10 and vowpalwabbit 8.9.0. A step-wise explanation of the code is as follows:

  1. Install Python package for Vowpal Wabbit
 !pip install vowpalwabbit 
 !pip install boost      #framework to interface Python and C++
 !apt-get install libboost-program-options-dev zlib1g-dev libboost-python-dev -y
  2. Import required libraries
 import numpy as np
 import pandas as pd
 import sklearn 
 from vowpalwabbit import pyvw 
  3. Create sample training data
 training_data = [{'action': 1, 'cost': 2, 'prob': 0.3, 'f1': 'a', 'f2': 'c', 'f3': ''},
                  {'action': 3, 'cost': 1, 'prob': 0.2, 'f1': 'b', 'f2': 'd', 'f3': ''},
                  {'action': 4, 'cost': 0, 'prob': 0.6, 'f1': 'a', 'f2': 'b', 'f3': ''},
                  {'action': 2, 'cost': 1, 'prob': 0.4, 'f1': 'a', 'f2': 'b', 'f3': 'c'},
                  {'action': 3, 'cost': 2, 'prob': 0.7, 'f1': 'a', 'f2': 'd', 'f3': ''}]

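Where ‘action’ is the action that was taken, ‘cost’ is the cost observed for that action, ‘prob’ denotes the probability with which the action was chosen, and ‘f1’, ‘f2’ and ‘f3’ denote features.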
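
Each of these records is later turned into a single line of text in the form "action:cost:probability | features". As a rough sketch of that conversion, the hypothetical helper `to_vw_cb_example` below (not part of the vowpalwabbit package) builds such a line and skips empty feature values:

```python
def to_vw_cb_example(features, action=None, cost=None, prob=None):
    """Build a text example; omit the label part when no action is given."""
    # Drop empty strings so that missing features leave no stray spaces
    tokens = " ".join(str(f) for f in features if f != "")
    if action is None:
        return "| " + tokens
    return f"{action}:{cost}:{prob} | {tokens}"


row = {'action': 1, 'cost': 2, 'prob': 0.3, 'f1': 'a', 'f2': 'c', 'f3': ''}
labelled = to_vw_cb_example([row['f1'], row['f2'], row['f3']],
                            row['action'], row['cost'], row['prob'])
unlabelled = to_vw_cb_example(['b', 'c', ''])
```

The labelled form is the kind of string used for training, while the unlabelled form (features only) is what a prediction call would receive.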

  4. Convert the above training data from a list into a Pandas dataframe.

training_df = pd.DataFrame(training_data)

  5. Add proper index to the training dataframe
 #create a column named ‘index’
 training_df['index'] = range(1, len(training_df) + 1)
 #set the newly created column as the index column
 training_df = training_df.set_index("index") 

Training data:

  6. Repeat steps (3), (4) and (5) for creating test data and form its dataframe
 testing_data = [{'f1': 'b', 'f2': 'c', 'f3': ''},
             {'f1': 'a', 'f2': '', 'f3': 'b'},
             {'f1': 'b', 'f2': 'b', 'f3': ''},
             {'f1': 'a', 'f2': '', 'f3': 'b'}]
 testing_df = pd.DataFrame(testing_data)
 # Add index to data frame
 testing_df['index'] = range(1, len(testing_df) + 1)
 testing_df = testing_df.set_index("index") 

Test data:

  7. Create a contextual bandit with four possible actions (1, 2, 3 and 4)

vw = pyvw.vw("--cb 4")

‘pyvw’ is the Python wrapper around the pylibvw class. --cb is the contextual bandit module, which optimizes the predictor using already collected data, without further exploration. The ‘4’ in “--cb 4” above denotes the number of possible actions.
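
Learning from already collected data is possible because every logged example records the probability with which its action was chosen. One standard way to use that probability is inverse propensity scoring (IPS), sketched below on the action, cost and probability values of the sample training data. This is an illustration of the general idea, not necessarily the estimator that --cb applies internally; `ips_cost` is a hypothetical helper, and features are ignored for simplicity.

```python
logged = [
    {"action": 1, "cost": 2, "prob": 0.3},
    {"action": 3, "cost": 1, "prob": 0.2},
    {"action": 4, "cost": 0, "prob": 0.6},
    {"action": 2, "cost": 1, "prob": 0.4},
    {"action": 3, "cost": 2, "prob": 0.7},
]


def ips_cost(logged, target_action):
    """IPS estimate of the average cost of always playing target_action."""
    total = 0.0
    for row in logged:
        # Only examples where the logged action matches contribute,
        # re-weighted by 1 / probability of having logged that action
        if row["action"] == target_action:
            total += row["cost"] / row["prob"]
    return total / len(logged)


estimates = {a: ips_cost(logged, a) for a in range(1, 5)}
best_action = min(estimates, key=estimates.get)
```

Dividing by the logging probability compensates for the fact that rarely chosen actions are under-represented in the log.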

  8. Call learn() method for each training example to perform an online update.
 #Extract action, its cost, probability and features of each training sample
 for i in training_df.index:
   action = training_df.loc[i, "action"]
   cost = training_df.loc[i, "cost"]
   probability = training_df.loc[i, "prob"]
   feature1 = training_df.loc[i, "f1"]
   feature2 = training_df.loc[i, "f2"]
   feature3 = training_df.loc[i, "f3"]
   # Construct the ith example in the required vw format.
   learn_ex = str(action) + ":" + str(cost) + ":" + str(probability) + " | " + str(feature1) + " " + str(feature2) + " " + str(feature3)
   #Perform actual learning by calling learn() on the ith example
   vw.learn(learn_ex)
  9. Perform predictions on the test set. Construct the examples as done in step (8) but exclude labels and pass them to the predict() method.
 print("test sample  action")
 #extract features of each test sample
 for i in testing_df.index:
   feature1 = testing_df.loc[i, "f1"]
   feature2 = testing_df.loc[i, "f2"]
   feature3 = testing_df.loc[i, "f3"]
   #construct the test sample in required vw format
   test_ex = "| " + str(feature1) + " " + str(feature2) + " " + str(feature3)
   #Make prediction on the ith test sample
   choice = vw.predict(test_ex)
   #Print the instance number and predicted choice of action
   print("    "+str(i)+"\t\t"+str(choice))


Vowpal Wabbit output

According to the training data’s cost structure (action 4 is the only action logged with a cost of 0), the contextual bandit assigns each test instance to action 4, as can be seen from the above output.


For more applications and a detailed understanding of Vowpal Wabbit, refer to the following sources:


Nikita Shiledarbaxi
A zealous learner aspiring to advance in the domain of AI/ML. Eager to grasp emerging techniques to get insights from data and hence explore realistic Data Science applications as well.
