Beginner's Guide To Data Visualisation With Matplotlib

When it comes to Python and its visualisation capabilities, Matplotlib is widely regarded as the mother of all visualisation libraries. It is a very popular library that makes it straightforward to produce impressive plots with Python.

In our previous articles, we introduced you to some of the most popular plotting libraries such as Pandas plots, Seaborn, Plotly and Cufflinks. Some of these, including Pandas plots and Seaborn, are built on top of Matplotlib, which makes it an important library to know.

In this article, we will introduce you to Matplotlib and will take you through a hands-on session to plot beautiful visualisations.

Plotting With Matplotlib

Matplotlib supports a wide variety of plots from the basic line and scatter plots to advanced multi-dimensional plots. We will start with basic plots and will discuss some of the best practices to make an attractive and intuitive visualisation.

Installing Matplotlib

Use the pip installer to install Matplotlib into your working environment. Type and execute the following command in your terminal.

pip install matplotlib

If you are using the Anaconda distribution, use conda install matplotlib to install the library instead.

Let’s make some plots!

Importing the libraries

import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

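The %matplotlib inline line is an IPython magic command, not a Python function. It makes plots render directly below the cell in a Jupyter Notebook, and it should be left out when running the code as a plain Python script.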

Importing the dataset
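The dataset itself is not reproduced in this article. As a stand-in, here is a minimal sketch, assuming a DataFrame named data with columns A to D filled with standard-normal random numbers. The row count, the seed and the distribution are assumptions for illustration, not taken from the original.

```python
import numpy as np
import pandas as pd

# Assumption: 50 rows of standard-normal values in columns A, B, C and D.
# The seed only makes the sketch reproducible; it is not from the article.
rng = np.random.default_rng(seed=0)
data = pd.DataFrame(rng.standard_normal((50, 4)), columns=list("ABCD"))

print(data.head())  # show the first five rows
```

Any DataFrame with numeric columns named A, B, C and D should work with the plotting code in the rest of the article.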

Here we will use a simple data set made of random numbers. It is held in a Pandas DataFrame named data with the columns A, B, C and D.

Quick Plots

Matplotlib's pyplot interface lets us create plots through simple function calls. These quick plots help us visualise data with little effort: we call a plotting function such as plot and pass the data for the axes as arguments.

Simple Plot

Let us plot a simple line plot to depict how the value of A changes for each observation in the dataset.

plt.plot(data['A'], c = 'b', label = 'A')
plt.title("Change In A")
plt.xlabel("Index")
plt.ylabel("A")
plt.legend()

• In the above code block, we pass a column of the dataset (data['A']) to the plot function of pyplot, which plots its values against the index. The c argument sets the colour of the plot. In the above case c = 'b' sets the colour of the plot to blue. The label argument names the line for the legend.
• The title function sets the passed string as the main title of the plot. The xlabel and ylabel functions label the x-axis and y-axis respectively.
• The legend function displays the legend, listing every plotted line that has a label.
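A legend only lists lines that carry a label. The sketch below illustrates this with two lines on one plot. It assumes stand-in random data (the column values, seed and the title text are illustrative, not from the original dataset).

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in data (assumption): standard-normal columns A and B.
rng = np.random.default_rng(seed=0)
data = pd.DataFrame(rng.standard_normal((50, 2)), columns=["A", "B"])

# Each line gets its own label, so legend() has entries to show.
plt.plot(data["A"], c="b", label="A")
plt.plot(data["B"], c="r", label="B")
plt.title("Change In A And B")
plt.xlabel("Index")
plt.ylabel("Value")
legend = plt.legend()

# Collect the legend entries to confirm which lines are listed.
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
# In a plain script, call plt.show() to display the figure.
```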

Scatter Plot

Let us draw a scatter plot of the values of columns A and B against each other.

plt.scatter(x = data['A'],y = data['B'],s = 500, c = 'r', marker = "*",alpha = 1, linewidths = 1, edgecolors='b')
plt.title("Scattered A vs B")
plt.xlabel("A")
plt.ylabel("B")

The scatter function in the above code block draws each pair of x and y values as a separate marker.

• x = data['A'] sets the x-axis with values from the feature A of the dataset
• y = data['B'] sets the y-axis with values from the feature B of the dataset
• marker = "*" sets the plot symbol as *. The Matplotlib documentation lists all available markers.
• s = 500 sets the size of the marker symbol; s is the marker area in points squared
• c = 'r' sets the fill colour of the marker symbol to red
• alpha = 1 makes the markers fully opaque (0 is fully transparent)
• linewidths = 1 sets the border width of the marker symbol used
• edgecolors = 'b' sets the colour of the marker symbol edge to blue

Matplotlib accepts HTML hex colour codes as colour arguments. Take the six-digit hex code of your choice, prefix it with a '#' symbol and pass it as the colour parameter, for example c = '#DAFF33'.
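To check a colour code before using it in a plot, the helpers in matplotlib.colors can be used. This is a small sketch; the hex code is just an example value.

```python
from matplotlib.colors import is_color_like, to_hex, to_rgb

hex_code = "#DAFF33"             # six-digit hex code prefixed with '#'
rgb = to_rgb(hex_code)           # (red, green, blue), each between 0 and 1

print(is_color_like(hex_code))   # True: Matplotlib can parse it
print(is_color_like("DAFF33"))   # False: the '#' prefix is required
print(to_hex(rgb))               # converts back to a lowercase hex string
```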

Pie Chart

plt.pie(x = [10,10,30,25,5,20], explode = [0.1,0.1,0.1,0.1,0.1,0.1], labels = ['A','B','C','D','E','F'], radius=2, autopct='%.01f%%',shadow=True, textprops={'size': 'smaller'})

The above code block produces a pie chart for the values passed in as x.

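• explode = [0.1,0.1,0.1,0.1,0.1,0.1] offsets each wedge from the centre by 0.1 times the radius.
• labels = ['A','B','C','D','E','F'] sets the label for each wedge.
• radius = 2 sets the radius of the pie (the default is 1).
• autopct = '%.01f%%' prints each wedge's share of the total as a percentage with one decimal place inside the wedge.
• shadow = True draws a shadow beneath the pie.
• textprops = {'size': 'smaller'} reduces the size of the text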
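The percentages that autopct prints are each value divided by the total, times 100. The sketch below computes the shares by hand and reads back the text Matplotlib generates. It uses the object-based ax.pie call rather than plt.pie, and the variable names are illustrative.

```python
import matplotlib.pyplot as plt

values = [10, 10, 30, 25, 5, 20]
labels = ["A", "B", "C", "D", "E", "F"]

# Each wedge's share is value / total * 100; autopct formats that number.
shares = [100 * v / sum(values) for v in values]

fig, ax = plt.subplots()
wedges, texts, autotexts = ax.pie(
    values,
    explode=[0.1] * len(values),  # offset every wedge by 0.1 of the radius
    labels=labels,
    autopct="%.1f%%",             # one decimal place plus a literal % sign
)

# The text objects inside the wedges hold the formatted percentages.
pct_labels = [t.get_text() for t in autotexts]
print(pct_labels)
```

Because these values happen to sum to 100, each printed percentage matches the raw value.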

Creating Sophisticated Plots With Objects

Matplotlib also has object-based plotting which adds flexibility to its plots. Let’s look at some examples.

Subplots

• Initializing a figure object with a 2×2 grid of Axes (plt.subplots creates the figure itself, so a separate plt.figure call is not needed)

fig, axes = plt.subplots(2, 2, tight_layout=True)

• Setting a title

fig.suptitle('Change In A, B & C')

• Plotting Index vs A in subplot[0,0]– 0th row and 0th column

axes[0,0].plot(data['A'])

• Plotting Index vs B in subplot[0,1]– 0th row and 1st column

axes[0,1].plot(data['B'])

• Plotting Index vs C in subplot[1,0]– 1st row and 0th column

axes[1,0].plot(data['C'])

• Plotting a bar chart of A against D for A values greater than 0 in subplot[1,1]– 1st row and 1st column

axes[1,1].bar(data[data['A']>0]['D'], data[data['A']>0]['A'])
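The steps above can be put together in one runnable sketch. It assumes stand-in random data (not the article's dataset) and uses a loop over axes.flat for the three line plots, which is one possible way to avoid repeating the plot call.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in data (assumption): standard-normal columns A, B, C and D.
rng = np.random.default_rng(seed=0)
data = pd.DataFrame(rng.standard_normal((50, 4)), columns=list("ABCD"))

fig, axes = plt.subplots(2, 2, tight_layout=True)
fig.suptitle("Change In A, B & C")

# axes is a 2x2 NumPy array; axes.flat walks it row by row.
for ax, column in zip(axes.flat, ["A", "B", "C"]):
    ax.plot(data[column])
    ax.set_xlabel("Index")
    ax.set_ylabel(column)

# Fourth panel: bars of A against D, only for rows where A is positive.
positive = data[data["A"] > 0]
axes[1, 1].bar(positive["D"], positive["A"])
axes[1, 1].set_xlabel("D")
axes[1, 1].set_ylabel("A")
# In a plain script, call plt.show() to display the figure.
```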

Let’s look at another one:

import matplotlib.gridspec as gridspec

• Initialize the figure object

fig = plt.figure(tight_layout=True)

• Specifying the geometry of the grid in which to place the subplots (the number of rows and the number of columns of the grid need to be set)

gs = gridspec.GridSpec(2, 2)

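• Plotting in Subplot1

ax = fig.add_subplot(gs[0, :]) #assumed placement: top row, both columns
ax.scatter(x = data['A'], y = data['B'], c='r')
ax.set_ylabel('B')
ax.set_xlabel('A')

• Plotting in Subplot2

ax = fig.add_subplot(gs[1, 0]) #assumed placement: bottom row, left column
ax.plot(data['A'])
ax.set_ylabel('A')
ax.set_xlabel('Index')

• Plotting in Subplot3

ax = fig.add_subplot(gs[1, 1]) #assumed placement: bottom row, right column
ax.plot(data['B'])
ax.set_ylabel('B')
ax.set_xlabel('Index')
fig.align_labels() #align the axis labels across all subplots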
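A GridSpec is indexed like a 2D array, and a slice makes one subplot span several cells. The sketch below shows one possible layout, a wide plot over two smaller ones. The layout and the variable names are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

fig = plt.figure(tight_layout=True)
gs = gridspec.GridSpec(2, 2)            # 2 rows x 2 columns

ax_top = fig.add_subplot(gs[0, :])      # row 0, all columns
ax_left = fig.add_subplot(gs[1, 0])     # row 1, column 0
ax_right = fig.add_subplot(gs[1, 1])    # row 1, column 1

# colspan lists the grid columns that each Axes occupies.
top_cols = list(ax_top.get_subplotspec().colspan)
left_cols = list(ax_left.get_subplotspec().colspan)
print(top_cols, left_cols)
```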

Improving The Plots

Drawing Axis Lines

#Scatter plot for A vs B
plt.scatter(data['A'],data['B'],s = 200, c = '#DAFF33', marker = "*",alpha = 0.5, linewidths = 3, edgecolors='#0C96F0',zorder=1, label = 'A vs B')
#Scatter plot for A vs C
plt.scatter(data['A'],data['C'],s = 100, c = '#F03C0C', marker = "X",alpha = 1, edgecolors='#0CF05F',zorder=2, label = 'A vs C')
#Display legends (each scatter call needs a label to appear in the legend)
plt.legend(loc = 'lower right')

#Draw axis lines
plt.axhline(0.5,ls = '--' ) #horizontal line
plt.axvline(0,ls = '--') #vertical line

The above code draws two scatter plots on the same graph, differentiated by their marker symbols and colours. The axhline and axvline methods draw a horizontal line at the given y value and a vertical line at the given x value respectively, spanning the full width or height of the axes.

• ls = '--' sets the line style to dashed lines
• zorder sets the drawing order: plots with a lower zorder are drawn first, so the plot with zorder = 1 sits behind the plot with zorder = 2.
• loc = 'lower right' in the legend method places the legend in the lower right corner of the graph.
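The drawing order and line style can also be inspected on the objects that the plotting calls return. This is a minimal sketch using the object-based interface with a few hypothetical points, so the names and coordinates are illustrative only.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Hypothetical points: both layers share coordinates so the markers overlap.
back = ax.scatter([0, 1], [0, 1], s=400, c="#DAFF33", zorder=1, label="back")
front = ax.scatter([0, 1], [0, 1], s=100, c="#F03C0C", zorder=2, label="front")

hline = ax.axhline(0.5, ls="--")   # dashed horizontal line at y = 0.5
vline = ax.axvline(0, ls="--")     # dashed vertical line at x = 0
ax.legend(loc="lower right")

# Lower zorder is drawn first, so it ends up underneath.
draw_order = sorted([back, front], key=lambda artist: artist.get_zorder())
print([artist.get_label() for artist in draw_order])
```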

plt.axhspan(0.5, 2, alpha = 0.3, color = 'r')
plt.axvspan(0, 2, alpha = 0.2)

The axhspan and axvspan methods shade a horizontal band between two y values and a vertical band between two x values respectively. The alpha argument sets the opacity of the shading.

Appending the above code block to the code from the previous section adds the shaded bands to the scatter plot.

Closing Note

Matplotlib is an essential package that allows users to make visualisations with less effort. Many modern data visualisation libraries are built on top of Matplotlib and have similar methods and API calls for visualising with various kinds of plots.

A Computer Science Engineer turned Data Scientist who is passionate about AI and all related technologies. Contact: amal.nair@analyticsindiamag.com
