Hands-On Tutorial On Lens: Python Tool For Swift Statistical Analysis

Lens is an open-source Python library for fast calculation of summary statistics and correlations in a dataset. It lets us explore the properties of a dataset's attributes in a single line of code.

Whenever we work with a dataset, the first step is generally to understand what the data is about. We do this through Exploratory Data Analysis (EDA): analyzing the data with statistical techniques and visualizations to get a clear idea of what we are dealing with. In EDA we examine the different attributes and their statistical properties, and we visualize the data using different graphs and plots.

EDA is a necessary step that we cannot skip, but it is generally time-consuming because we need to write separate code for each statistical property and for each type of visualization. Several Python libraries and modules reduce the time and effort EDA takes by offering simple, easy-to-use code. Lens is one such library.

Lens computes summary statistics and correlations quickly, so we can explore the properties of every attribute in a dataset with a single line of code. It also creates several types of visualizations for each attribute, and it works on both numerical and categorical data. It is fast and easy to use.



In this article, we will explore how we can perform EDA using Lens and save time and effort.


We will start by installing Lens using pip:

pip install lens

  1. Importing Required Libraries

We will load the dataset with pandas and use Lens for the data analysis and visualizations, so we import both.

import pandas as pd

import lens

  2. Loading the Dataset

The dataset we will use here is an advertising dataset from a multinational company (MNC), with attributes such as 'Sales' and 'TV'. We will load it using pandas.

df = pd.read_csv('Advertising.csv')


Dataset Used
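
The Advertising.csv file is not included with this article, so here is a minimal sketch for following along without it. It builds a small stand-in dataset in memory and loads it with the same read_csv call. Only 'Sales' and 'TV' are named above; the other column names and the handful of rows are assumptions for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for Advertising.csv. Column names other than
# TV and Sales, and the rows themselves, are illustrative assumptions.
csv_text = """TV,Radio,Newspaper,Sales
230.1,37.8,69.2,22.1
44.5,39.3,45.1,10.4
17.2,45.9,69.3,9.3
151.5,41.3,58.5,18.5
180.8,10.8,58.4,12.9
"""

# read_csv accepts a file-like object as well as a file path
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (rows, columns)
print(df.dtypes)  # one dtype per column
```

Checking the shape and dtypes right after loading is a quick way to catch a wrong delimiter or a numeric column that was parsed as text.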
  3. Statistical Analysis of Data

Now that we have loaded the dataset, we can display its statistical properties. We will use the summarise and explore functions: summarise computes the summary of the DataFrame, and explore takes that summary as its input.

data = lens.summarise(df)

exp = lens.explore(data)


Dataset Summary
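
To give a feel for what a dataset-level summary contains, the sketch below builds a comparable overview with plain pandas. It is not Lens's implementation or its output format, just an approximation of the same idea; the column names and values are assumed for illustration.

```python
import pandas as pd

# Small illustrative dataset (values are assumptions, with one missing entry)
df = pd.DataFrame({
    "TV": [230.1, 44.5, 17.2, 151.5, 180.8],
    "Radio": [37.8, None, 45.9, 41.3, 10.8],
    "Sales": [22.1, 10.4, 9.3, 18.5, 12.9],
})

# One row per column: data type, number of nulls, number of distinct values
overview = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "nulls": df.isna().sum(),
    "unique": df.nunique(),  # nunique ignores missing values by default
})
print(overview)

# Count, mean, standard deviation, min, quartiles and max per numeric column
print(df.describe())
```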

Similarly, we can use these functions to display the properties of a single column.


Column Summary
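
The article does not show the Lens call for a single column, so the sketch below uses pandas alone to compute the kind of figures a column summary typically reports. The helper column_summary is hypothetical and the values are illustrative.

```python
import pandas as pd

sales = pd.Series([22.1, 10.4, 9.3, 18.5, 12.9], name="Sales")


def column_summary(col: pd.Series) -> dict:
    """Hypothetical helper: basic statistics for one numeric column."""
    q = col.quantile([0.25, 0.5, 0.75])
    return {
        "name": col.name,
        "count": int(col.count()),  # non-null values only
        "min": float(col.min()),
        "max": float(col.max()),
        "mean": float(col.mean()),
        "median": float(q.loc[0.5]),
        "iqr": float(q.loc[0.75] - q.loc[0.25]),  # interquartile range
    }


summary = column_summary(sales)
for key, value in summary.items():
    print(f"{key:>6}: {value}")
```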
  4. Correlation in Dataset

Analyzing and visualizing correlations is easy in Lens: each takes just a single line of code.


Correlation Matrix


Correlation Plot
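
The article does not show the Lens call or say which correlation coefficient it reports. As a stand-in, the sketch below computes a rank (Spearman) correlation matrix with pandas' DataFrame.corr on a few illustrative rows. The choice of Spearman and the data are assumptions.

```python
import pandas as pd

# Illustrative values only
df = pd.DataFrame({
    "TV": [230.1, 44.5, 17.2, 151.5, 180.8],
    "Radio": [37.8, 39.3, 45.9, 41.3, 10.8],
    "Sales": [22.1, 10.4, 9.3, 18.5, 12.9],
})

# Square matrix of pairwise coefficients, with 1.0 on the diagonal
corr = df.corr(method="spearman")
print(corr.round(2))

# Which attribute moves most closely with Sales, in absolute terms?
strongest = corr["Sales"].drop("Sales").abs().idxmax()
print("Most correlated with Sales:", strongest)
```

A correlation plot is usually this same matrix drawn as a heatmap, which makes strong pairs easier to spot when there are many columns.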
  5. Visualization

We can easily visualize the attributes of the dataset using the plots that Lens already defines. Let us look at some of these visualizations.


Distribution Plot


CDF Plot
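
To make clear what these two plots show, the sketch below computes the numbers behind them with NumPy: a histogram for the distribution plot and an empirical CDF for the CDF plot. It illustrates the concepts, not how Lens computes them internally, and the values are made up.

```python
import numpy as np

# Illustrative values for a single numeric column
sales = np.array([22.1, 10.4, 9.3, 18.5, 12.9, 7.2, 11.8, 13.2, 4.8, 10.6])

# Distribution: how many values fall into each of 4 equal-width bins
counts, edges = np.histogram(sales, bins=4)
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:6.2f} to {right:6.2f} | {'#' * int(n)}")

# Empirical CDF: for each sorted value, the fraction of values <= it
x = np.sort(sales)
cdf = np.arange(1, len(x) + 1) / len(x)

# Reading the CDF at one point: fraction of values <= 12.9
frac = np.searchsorted(x, 12.9, side="right") / len(x)
print("Fraction of values <= 12.9:", frac)
```

The histogram depends on the number of bins chosen, while the empirical CDF does not, which is one reason to look at both.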

Lens has an attractive function named 'interactive' which creates a user interface where users can select different attributes and different types of plots. Let us visualize this interface.


Distribution Plot

Here you can see that we can select different attributes and view different types of plots and graphs for them. Let us look at some other plots as well.

Density Plot
CDF Plot
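
A density plot is typically a smoothed estimate of a column's distribution. The article does not describe how Lens produces it, so the sketch below is only one common approach: a Gaussian kernel density estimate written in NumPy. The function gaussian_kde_1d, the bandwidth rule and the data are assumptions.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Hypothetical helper: Gaussian kernel density estimate on a grid."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardised distance from every grid point to every sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    # Average the kernels and rescale so the estimate integrates to 1
    return kernels.sum(axis=1) / (len(samples) * bandwidth)


# Illustrative values for a single numeric column
sales = np.array([22.1, 10.4, 9.3, 18.5, 12.9, 7.2, 11.8, 13.2, 4.8, 10.6])

# Normal-reference rule of thumb for the bandwidth (an assumed choice)
bandwidth = 1.06 * sales.std(ddof=1) * len(sales) ** (-1 / 5)

# Evaluate a little beyond the data range so the tails are included
grid = np.linspace(sales.min() - 3 * bandwidth, sales.max() + 3 * bandwidth, 200)
density = gaussian_kde_1d(sales, grid, bandwidth)

# Sanity check: the area under a density curve should be close to 1
area = float(np.sum(density) * (grid[1] - grid[0]))
print("Approximate area under the curve:", round(area, 3))
```

A smaller bandwidth gives a bumpier curve that follows individual points, and a larger one gives a smoother curve that can hide real structure.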


In this article, we learned about Lens, which helps with fast calculation of summary statistics and correlations. We saw how to use Lens to analyze the statistical properties of a whole dataset as well as of single columns. We also saw the different types of visualization that Lens provides and created some of the plots. Finally, we saw the interactive function, which creates a user interface for selecting different graphs and plots for different attributes. Lens makes the process of data analysis and visualization simpler and less effortful.


Himanshu Sharma
An aspiring data scientist currently pursuing an MBA in Applied Data Science, with an interest in the financial markets. I have experience in data analytics, data visualization, machine learning, creating dashboards, and writing articles related to data science.
