Visualization is one of the best ways to identify patterns, anomalies, or trends in a dataset. In data visualization we present the data using different charts, bars, graphs, etc. in order to gain useful and actionable insights. It makes trends easier for the human eye to spot than they are in tabular data.
Python provides different modules/packages/libraries for data visualization. Altair is an open-source Python library for declarative statistical visualization and is based on Vega and Vega-Lite. Altair creates highly interactive and informative visualizations so that we can spend more time understanding the data we are using and its meaning.
Altair’s API is simple, easy to use, and consistent because it is built on top of the powerful Vega-Lite visualization grammar. It produces beautiful and effective visualizations with a minimal amount of code. The visualizations produced can be downloaded in different formats and can be customized using different parameters.
In this article, we will explore different types of visualizations that can be produced using Altair and how we can customize these visualizations according to our requirements.
Like any other Python library, Altair can be installed with pip install altair. The sample datasets that accompany Altair can be installed alongside it with pip install altair vega_datasets.
a. Importing required libraries
We will be using Pandas for storing and loading our dataset and Seaborn for downloading it. Other than these, we will be using the Altair library for creating the visualizations.
import pandas as pd
import seaborn as sns
import altair as alt
b. Loading the Dataset
We will be using a sample dataset named ‘Tips’, which we will download using the seaborn library and store in a dataframe. This dataset contains different attributes like ‘tip’, ‘total_bill’, etc. of different restaurant customers.
df = sns.load_dataset('tips')
c. Creating Visualization
Altair supports a large variety of visualizations that can be customized using different parameters. Let us create some of the most used statistical visualizations using different parameters.
- Bar Graph
A bar chart or graph represents categorical data with rectangular bars with heights or lengths proportional to the values that they represent. Altair supports different types of bar graphs.
# Simple Bar Graph
alt.Chart(df).mark_bar().encode(x='total_bill', y='day')
# Stacked Bar Graph
alt.Chart(df).mark_bar().encode(x='total_bill', y='day', color='sex')
- Scatter Plots
A scatter plot is a type of plot which displays the values for two variables for a set of data. It is generally used to visualize the relationship between two variables.
#Scatter Plot with tool tip
tooltip=['total_bill', 'tip', 'sex', 'day']
- Line Chart
A line chart is a graphical representation that connects a series of data points with a continuous line. Here I will plot it using the Tips data, but line charts are generally used to display time series data such as a historic stock price over a time period.
y = 'tip'
- Histogram
A histogram is a graphical display of data using bars of different heights. In a histogram, each bar groups numbers into ranges. Taller bars show that more data falls in that range.
- Box Plots
A box plot is a type of chart often used in exploratory data analysis to visually show the distribution of numerical data and its skewness by displaying the data quartiles (or percentiles) and the median.
y = 'tip'
- Binned Heatmaps
A heat map is a data visualization technique that shows magnitude of a phenomenon as color in two dimensions. Heat maps make it easy to visualize complex data and understand it at a glance.
These are some of the statistical charts available in Altair; the other chart types it supports are created in the same way. All the graphs and plots can be downloaded in several formats, which makes them easy to reuse and share.
Now let us look at some of the parameters which can be used to enhance the visualizations created by Altair.
d. Parameters/Function for Enhancing Visualization
There are certain parameters which when passed along with the graphs can enhance the visualization and make it more informative and insightful.
- Interactive Function
The interactive() method makes a chart interactive: it lets you zoom in and out of the graph and pan across it. It is called at the end of the chart definition.
alt.Chart(df).mark_bar().encode(x='total_bill', y='day').interactive()
- Scale Parameter
By default, Altair starts the axis at zero for numerical variables. We can change this through the scale parameter of an encoding, for example with alt.Scale(zero=False), according to our requirements.
- Color Schemes
Altair provides different color schemes which can be used for different types of visualization. We can easily use the color scheme we want for our visualization.
- Size of Visualization
Using the properties() method, we can set the height and width of the visualization we have created.
In this article, we learned how to use Altair to create different visualizations and saw how little code they require. All the visualizations we created can be easily downloaded in different formats and used accordingly. Altair also provides the option of editing the graphs in Vega-Lite. Using different parameters, we can customize our visualizations to make them more insightful and visually appealing.
An aspiring Data Scientist currently Pursuing MBA in Applied Data Science, with an Interest in the financial markets. I have experience in Data Analytics, Data Visualization, Machine Learning, Creating Dashboards and Writing articles related to Data Science.