Data visualization is the practice of representing data graphically in order to find anomalies, patterns, or trends in a dataset. It can be done using a variety of plots and graphs, each of which shows different properties of the dataset's attributes. Visualization is one of the easiest ways of understanding data, because a well-chosen plot lets us see at a glance what the data is trying to say.
Visualizations come in many types, such as bar charts, histograms, and scatter plots, and each type suits different kinds of data. Python has a large number of libraries that can be used for data visualization and for creating informative and attractive graphs and plots. HoloViews is one such library; it makes visualization easier by letting us create informative and insightful plots in a few lines of code.
HoloViews is an open-source Python library that makes data visualization easier. It focuses on conveying the message that the data is trying to tell rather than on the mechanics of plotting. HoloViews is built on NumPy and Param, and for rendering it supports ‘Bokeh’ and ‘Matplotlib’.
In this article, we will see how we can create different types of visualizations using HoloViews and how we can customize them according to our requirements.
Like any other Python library, HoloViews and its dependencies can be installed with pip install holoviews.
- Importing Required Libraries
In this article, we will use two different datasets for different visualizations. For loading the datasets we will import pandas and seaborn. For visualization purposes, we will import HoloViews.
import pandas as pd
import holoviews as hv
from holoviews import opts
import seaborn as sns
hv.extension('bokeh', 'matplotlib')  # load both plotting backends; the first one listed (Bokeh) is the default
- Loading the Dataset
For visualization using HoloViews, we will use an Advertising dataset of an MNC, which contains attributes like ‘Sales’, ‘Newspaper’, etc. The second dataset is the ‘Tips’ sample dataset that ships with seaborn, which contains attributes of restaurant billing history like ‘total_bill’, ‘tip’, etc.
df = pd.read_csv('Advertising.csv')
df1 = sns.load_dataset('tips')
- Creating Visualizations
We will start by creating visualizations for the advertising dataset, and after that we will create some more advanced visualizations using the tips dataset.
- Scatter Plot
We will create scatter plots between our target variable, i.e. ‘Sales’, and each of the other feature variables. To create the visualizations, we start by defining our feature variables and then create a dataset using HoloViews.
vdims = ['Newspaper', 'Radio', 'TV']
ds = hv.Dataset(df, ['Sales'], vdims)
Now we will use this dataset to create the visualization.
layout1 = (ds.to(hv.Scatter, 'Sales', 'Newspaper') + ds.to(hv.Scatter, 'Sales', 'TV') + ds.to(hv.Scatter, 'Sales', 'Radio')).cols(2)
layout1  # display the layout in a notebook cell
To arrange this visualization we used the “cols” method, which sets the maximum number of columns in the layout; with three plots and two columns, the graphs are laid out in two rows. The plots are rendered with Bokeh, as the Bokeh symbol in the top-right corner shows, so they are interactive and visually appealing.
- Bar Plot
We will create a Bar Plot of ‘Sales’ and ‘Radio’.
layout2 = ds.to(hv.Bars, 'Sales', 'Radio')
layout2  # display the bar plot in a notebook cell
- Distribution Plots
We will create distribution plots of all the attributes so that we can visualize how the values of each attribute are distributed.
distribution = (hv.Distribution(ds, ['Sales']) + hv.Distribution(ds, ['TV']) + hv.Distribution(ds, ['Newspaper']) + hv.Distribution(ds, ['Radio'])).cols(2)
# apply the size to each Distribution element in the layout
distribution.opts(opts.Distribution(width=400, height=250))
- Box Plot
Now we will use the second dataset, i.e. the tips dataset, and create some more advanced statistical charts like a box plot.
title = 'Total Bill according to Gender'
box = hv.BoxWhisker(df1, ['sex'], 'total_bill', label=title)
box.opts( width=600, cmap='Set1')
The opts method, as seen in the code above, sets plot options such as the width and height, and it can also be used to set style options such as the colormap.
- Violin Plots
We will create violin plots among different attributes of the tips dataset and visualize them.
violin = (hv.Violin(df1, ['day'], 'tip', label='Tip according to Day') + hv.Violin(df1, ['smoker'], 'tip', label='Tip according to Smokers')).cols(2)
violin  # display the layout in a notebook cell
In this article, we saw how easily we can create visualizations using HoloViews. We started with some basic visualizations and then created some more advanced ones. We also saw how we can change the size and color of the graphs using the opts method. Similarly, we can create other types of visualizations using HoloViews and explore different datasets with interactive and visually appealing graphs and plots.