Beginner's Guide To Data Visualisation With Matplotlib

When it comes to Python and its visualisation capabilities, Matplotlib is undoubtedly the mother of all visualisation libraries. Matplotlib is a very popular library that has revolutionised the concept of making impressive plots with Python effortlessly.

In our previous articles, we introduced you to some of the most popular plotting libraries such as Pandas plots, Seaborn, Plotly and Cufflinks. Some of these, including Pandas plots and Seaborn, are built on top of Matplotlib, which makes it an important library to know about.

In this article, we will introduce you to Matplotlib and will take you through a hands-on session to plot beautiful visualisations.

Plotting With Matplotlib

Matplotlib supports a wide variety of plots from the basic line and scatter plots to advanced multi-dimensional plots. We will start with basic plots and will discuss some of the best practices to make an attractive and intuitive visualisation.

Installing Matplotlib

Use the pip installer to install Matplotlib into your working environment. Type and execute the following command in your terminal.

`pip install matplotlib`

If you are using the Anaconda distribution, use `conda install matplotlib` to install the library instead.

Let’s make some plots!

Importing the libraries

`import pandas as pd`
`import matplotlib.pyplot as plt`
`%matplotlib inline`

%matplotlib inline is an IPython magic command rather than a Python function. It makes the plots render directly below the code cell when using Jupyter Notebook.
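The magic command is only understood inside a notebook. As a minimal sketch of the alternative for a plain Python script, the block below selects the non-interactive Agg backend and writes the figure to a file. The file name `line_plot.png`, the temporary directory and the sample values are illustrative assumptions, not part of the original walkthrough.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, selected before any figure is created
import matplotlib.pyplot as plt

# Illustrative values, not the article's dataset
values = [1, 4, 2, 5, 3]

plt.plot(values)
plt.title("Saved from a script")

# Write the figure to a PNG file instead of rendering it inline
out_path = os.path.join(tempfile.gettempdir(), "line_plot.png")
plt.savefig(out_path)
plt.close()
```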

Importing the dataset

`data = pd.read_csv("sample_data.csv")`

Here we will use a simple data set made of random numbers. The examples below use its numeric columns A, B, C and D.
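If sample_data.csv is not at hand, a stand-in data set can be generated to follow along. The sketch below assumes four numeric columns named A, B, C and D (the names used in the examples). The row count, the normal distribution and the seed are arbitrary choices.

```python
import numpy as np
import pandas as pd

# Assumption: four numeric columns named A-D holding standard normal random numbers
rng = np.random.default_rng(seed=0)  # fixed seed so the numbers are reproducible
data = pd.DataFrame(rng.standard_normal((100, 4)), columns=["A", "B", "C", "D"])

# Peek at the first five rows
print(data.head())
```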

Quick Plots

Matplotlib enables us to make functional plots with ease. These are on-the-go plots that help us visualise data quickly by calling a plotting function such as plot and passing the data as arguments.

Simple Plot

Let us plot a simple line plot to depict how the value of A changes for each observation in the dataset.

`plt.plot(data['A'], c = 'b', label = 'A')`
`plt.title("Change In A")`
`plt.xlabel("Index")`
`plt.ylabel("A")`
`plt.legend()`

• In the above code block, we pass a single column of the data (data['A']) to the plot function of pyplot. Since only one series is passed, it is plotted against its index. The c argument sets the colour of the plot. In the above case c='b' sets the colour of the plot to blue.
• The title function sets the passed string as the main title of the plot. The xlabel and ylabel functions label the x-axis and y-axis respectively.
• The legend function displays the plot legend, which lists every plotted series that was given a label.
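To make the link between labels and the legend concrete, here is a small sketch that assumes a five-value stand-in for column A. The `label` argument and the `legend_labels` variable are additions for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative stand-in for the article's dataset
data = pd.DataFrame({"A": [0.2, -0.5, 1.1, 0.7, -0.3]})

plt.plot(data["A"], c="b", label="A")  # label is the text the legend will show
plt.title("Change In A")
plt.xlabel("Index")
plt.ylabel("A")
legend = plt.legend()

# Collect the text of every legend entry
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
```

A series plotted without a label does not get a legend entry.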

Scatter Plot

Let us make a scatter plot of the values of columns A and B against each other.

`plt.scatter(x = data['A'],y = data['B'],s = 500, c = 'r', marker = "*",alpha = 1, linewidths = 1, edgecolors='b')`
`plt.title("Scattered A vs B")`
`plt.xlabel("A")`
`plt.ylabel("B")`

The scatter function in the above code block draws one marker for each pair of A and B values.

• x = data['A'] sets the x-axis with values from the feature A of the dataset
• y = data['B'] sets the y-axis with values from the feature B of the dataset
• s = 500 sets the size of the marker symbol (its area, in points squared) to 500
• c = 'r' sets the fill colour of the markers to red
• marker = "*" sets the plot symbol to a star. The full list of available markers is in the matplotlib.markers documentation.
• alpha = 1 sets the opacity of the markers, where 1 is fully opaque and 0 is fully transparent
• linewidths = 1 sets the border width of the marker symbol used
• edgecolors='b' sets the colour of the marker symbol edge to blue

Matplotlib also accepts hex colour codes, the same #RRGGBB notation used in HTML, as colour arguments. Prefix the six-digit code of your choice with a '#' symbol and pass it as a string to the colour parameter, for example c = '#DAFF33'.
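As a quick check of how colour values are interpreted, the following sketch uses helper functions from the `matplotlib.colors` module. The variable name `red_hex` is illustrative.

```python
from matplotlib import colors

# Single-letter shorthand and hex codes both resolve to RGBA tuples in the 0-1 range
print(colors.to_rgba("r"))        # red
print(colors.to_rgba("#DAFF33"))  # hex code with the leading '#'

# Convert the shorthand back to a hex string
red_hex = colors.to_hex("r")
print(red_hex)

# is_color_like reports whether Matplotlib accepts a value as a colour
print(colors.is_color_like("#DAFF33"))  # hex code with '#'
print(colors.is_color_like("DAFF33"))   # same code without '#'
```

The last line shows why the '#' prefix matters: without it, the string is not recognised as a colour.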

Pie Chart

`plt.pie(x = [10,10,30,25,5,20], explode = [0.1,0.1,0.1,0.1,0.1,0.1], labels = ['A','B','C','D','E','F'], radius=2, autopct='%.01f%%',shadow=True, textprops={'size': 'smaller'})`

The above code block produces a pie chart for the values passed in as x.

• explode = [0.1,0.1,0.1,0.1,0.1,0.1] offsets each wedge from the centre by 0.1 times the radius.
• labels = ['A','B','C','D','E','F'] sets the label for each wedge.
• radius=2 sets the radius of the pie chart to 2.
• autopct='%.01f%%' labels each wedge with its percentage share of the total, formatted to one decimal place.
• shadow=True draws a shadow beneath the pie.
• textprops={'size': 'smaller'} reduces the size of the label text relative to the default.
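The values in the example above happen to add up to 100, so the percentages match the raw numbers. A minimal sketch with illustrative values that do not add up to 100 shows that autopct reports each wedge's share of the total:

```python
import matplotlib.pyplot as plt

# Illustrative values that do not add up to 100
values = [1, 1, 2]

# With autopct set, pie also returns the percentage text objects
wedges, texts, autotexts = plt.pie(x=values, labels=["A", "B", "C"], autopct="%.01f%%")

# Each text is the wedge's share of the total, not the raw value
percent_labels = [t.get_text() for t in autotexts]
print(percent_labels)
```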

Creating Sophisticated Plots With Objects

Matplotlib also has object-based plotting which adds flexibility to its plots. Let’s look at some examples.

Subplots

• Initializing a figure object with a 2×2 grid of Axes inside it (creating the figure separately with plt.figure beforehand would leave an extra, empty figure)

`fig, axes = plt.subplots(2, 2, tight_layout=True)`

• Setting a title

`fig.suptitle('Change In A, B & C')`

• Plotting Index vs A in subplot[0,0]: 0th row and 0th column

`axes[0,0].plot(data['A'])`

• Plotting Index vs B in subplot[0,1]: 0th row and 1st column

`axes[0,1].plot(data['B'])`

• Plotting Index vs C in subplot[1,0]: 1st row and 0th column

`axes[1,0].plot(data['C'])`

• Plotting a bar chart of A against D, for rows where A is greater than 0, in subplot[1,1]: 1st row and 1st column

`axes[1,1].bar(data[data['A']>0]['D'], data[data['A']>0]['A'])`
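The steps above can be collected into one runnable sketch. It assumes random stand-in data with columns A to D, passes tight_layout directly to plt.subplots so that only one figure is created, and stores the filtered rows in a hypothetical `positive` variable to avoid filtering twice.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative stand-in for the article's dataset
rng = np.random.default_rng(seed=0)
data = pd.DataFrame(rng.standard_normal((20, 4)), columns=["A", "B", "C", "D"])

# One call creates the figure and a 2x2 array of Axes
fig, axes = plt.subplots(2, 2, tight_layout=True)
fig.suptitle("Change In A, B & C")

axes[0, 0].plot(data["A"])  # 0th row, 0th column
axes[0, 1].plot(data["B"])  # 0th row, 1st column
axes[1, 0].plot(data["C"])  # 1st row, 0th column

# Filter once and reuse the result for both bar arguments
positive = data[data["A"] > 0]
axes[1, 1].bar(positive["D"], positive["A"])  # 1st row, 1st column

print(axes.shape)
```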

Let’s look at another one:

• Import additional library

`import matplotlib.gridspec as gridspec`

• Initialize the figure object

`fig = plt.figure(tight_layout=True)`

• Specify the geometry of the grid in which to place the subplots (the number of rows and the number of columns of the grid need to be set)

`gs = gridspec.GridSpec(2, 2)`

• Plotting in Subplot1

`ax = fig.add_subplot(gs[0, :])`
`ax.scatter(x = data['A'], y = data['B'], c='r')`
`ax.set_ylabel('B')`
`ax.set_xlabel('A')`

• Plotting in Subplot2

`ax = fig.add_subplot(gs[1, 0])`
`ax.plot(data['A'])`
`ax.set_ylabel('A')`
`ax.set_xlabel('Index')`
`fig.align_labels()`

• Plotting in Subplot3

`ax = fig.add_subplot(gs[1, 1])`
`ax.plot(data['B'])`
`ax.set_ylabel('B')`
`ax.set_xlabel('Index')`
`fig.align_labels()`
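The indexing on the GridSpec object decides which cells each subplot occupies: gs[0, :] takes the whole first row, while gs[1, 0] and gs[1, 1] take one cell each. A minimal sketch of just the layout, with hypothetical names `ax_top`, `ax_left` and `ax_right`, compares the resulting widths:

```python
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

fig = plt.figure()
gs = gridspec.GridSpec(2, 2)

ax_top = fig.add_subplot(gs[0, :])    # row 0, all columns
ax_left = fig.add_subplot(gs[1, 0])   # row 1, column 0
ax_right = fig.add_subplot(gs[1, 1])  # row 1, column 1

# Axes widths as fractions of the figure width
top_width = ax_top.get_position().width
left_width = ax_left.get_position().width
print(top_width, left_width)
```

The top Axes comes out wider than the two bottom ones because it spans both columns of the grid.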

Improving The Plots

Drawing Axis Lines

`#Scatter plot for A vs B`
`plt.scatter(data['A'],data['B'],s = 200, c = '#DAFF33', marker = "*",alpha = 0.5, linewidths = 3, edgecolors='#0C96F0',zorder=1, label = 'A vs B')`
`#Scatter plot for A vs C`
`plt.scatter(data['A'],data['C'],s = 100, c = '#F03C0C', marker = "X",alpha = 1, edgecolors='#0CF05F',zorder=2, label = 'A vs C')`
`#Display Legends`
`plt.legend(loc = 'lower right')`

`#Draw axis lines`
`plt.axhline(0.5, ls = '--') #horizontal line`
`plt.axvline(0, ls = '--') #vertical line`

The first code block draws two scatter plots on the same graph, differentiated by their marker symbols and colours. The axhline and axvline functions draw a horizontal line at the given y value and a vertical line at the given x value respectively, each spanning the whole plot.

• ls = '--' sets the line style to dashed lines
• zorder sets the drawing order of a layer. Layers with a lower zorder are drawn first, so zorder = 1 places that scatter plot behind the one with zorder = 2.
• loc = 'lower right' in the legend function places the legend in the lower right corner of the graph.
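As a small self-contained sketch of these options, the block below uses the object-based interface with illustrative points in place of the dataset. The names `back`, `front`, `hline` and `vline` are hypothetical.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Two overlapping point sets; the one with the higher zorder is drawn on top
back = ax.scatter([0, 1, 2], [0, 1, 2], s=200, zorder=1, label="background")
front = ax.scatter([0, 1, 2], [0, 1, 2], s=50, zorder=2, label="foreground")

# Dashed reference lines across the whole Axes
hline = ax.axhline(0.5, ls="--")  # horizontal line at y = 0.5
vline = ax.axvline(0, ls="--")    # vertical line at x = 0

ax.legend(loc="lower right")
print(back.get_zorder(), front.get_zorder(), hline.get_linestyle())
```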

Colour Shading

`#horizontal shading`
`plt.axhspan(0.5, 2, alpha = 0.3, color = 'r')`
`#vertical shading`
`plt.axvspan(0, 2, alpha = 0.2)`

The axhspan and axvspan functions shade a horizontal band between two y values and a vertical band between two x values respectively. The alpha argument controls how transparent the shading is.

Appending the above code block to the code from the previous section shades these two regions on top of the scatter plots.
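For a version that runs on its own, here is a minimal sketch with a few illustrative points in place of the dataset. The names `hspan` and `vspan` are hypothetical.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([-1, 0, 1, 2], [0, 1, 0.5, 1.5])

# Shade the horizontal band between y = 0.5 and y = 2 in red
hspan = ax.axhspan(0.5, 2, alpha=0.3, color="r")
# Shade the vertical band between x = 0 and x = 2 in the default colour
vspan = ax.axvspan(0, 2, alpha=0.2)

print(hspan.get_alpha(), vspan.get_alpha())
```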

Closing Note

Matplotlib is an essential package that allows users to make visualisations with less effort. Many modern data visualisation libraries are built on top of Matplotlib and have similar methods and API calls for visualising with various kinds of plots.

A Computer Science Engineer turned Data Scientist who is passionate about AI and all related technologies. Contact: amal.nair@analyticsindiamag.com
