Exploratory data analysis (EDA) is usually the first thing we do after acquiring a dataset. EDA helps us understand the data better, perform some preliminary operations on it, and get it ready to be fed into a model. Pandas is our go-to library for exploratory analysis of tabular or structured data in Python. It is widely preferred for its ease of use, readily available functions, and rich set of operations. Analysis and visualization of data go hand in hand.
PandasGUI is a graphical user interface for visualizing and analysing pandas DataFrames. It can perform numerous operations, such as computing statistics, applying filters, plotting graphs (scatter, box, histogram, etc.), reshaping the DataFrame, and many more. One of its key features is drag and drop, which is handy for importing data directly from any DataFrame. It can also handle multiple DataFrames at a time.
In this article, I’ll discuss the features of PandasGUI and demonstrate the operations that it can perform on Pandas DataFrames.
Installation
pip install pandasgui
Basic Operation
First, a DataFrame is loaded or initialized using pandas, then passed to PandasGUI:
import pandas as pd
from pandasgui import show

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=['a', 'b', 'c'])
show(df)
Calling show() on the DataFrame opens the GUI window with the DataFrame loaded.
The Statistics tab shows summary statistics for the DataFrame: the data type of each column, the count, the number of unique values (N Unique), the mean, the standard deviation, the min and the max.
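For comparison, the fields listed for the Statistics tab can be approximated in plain pandas. This is only a sketch of an equivalent summary, not how PandasGUI computes it internally; the `stats` name and its column labels are chosen here for illustration.

```python
import pandas as pd

# Same toy DataFrame as in the basic example.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=["a", "b", "c"])

# One row per column of df, one field per statistic.
# The field names are illustrative, not PandasGUI's own labels.
stats = pd.DataFrame({
    "type": df.dtypes.astype(str),        # data type of each column
    "count": df.count(),                  # number of non-null values
    "n_unique": df.nunique(),             # number of distinct values
    "mean": df.mean(numeric_only=True),
    "std": df.std(numeric_only=True),     # sample standard deviation
    "min": df.min(numeric_only=True),
    "max": df.max(numeric_only=True),
})

print(stats)
```

The GUI gives the same kind of overview without writing any of this, which is the main convenience of the tab.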
The Grapher tab has various options for graphs – histogram, scatter plot, line plot, bar chart, boxplot, heatmaps, pie charts, etc. These plots are produced using Plotly.
Line graph
Users can also customize a plot by supplying their own keyword arguments through the Custom Kwargs option.
Scatter plot
Various operations on the plot are also available from the toolbar in the right corner of the plot: downloading it in PNG format, zooming in and out, autoscaling, resetting the axes, etc.
Box Plot
Editing Data
One of the interesting features of PandasGUI is data editing. Values can be replaced or deleted, and they can also be copied and pasted into a text editor, a Word document or an Excel file.
Working with a dataset
The PandasGUI library has a datasets module containing sample datasets: iris, titanic, pokemon, car_crashes, mpg, stockdata, tips, mi_manufacturing and gapminder. These datasets are downloaded in CSV format the first time they are used. Custom datasets can be loaded in the same way as in the example above: read them into a pandas DataFrame and pass it to show().
For demonstration, I’ve shown the iris dataset:
from pandasgui import show
from pandasgui.datasets import iris

show(iris)
In the Grapher, drag variables from the variable list and drop them onto the fields in the right-hand column to set the names and values for a plot. Here, the pie chart shows how sepal width varies across species.
Filtering Data
The Filters tab accepts query expressions, and the data is filtered according to them.
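To give a feel for what a query expression looks like, here is a sketch using pandas' own DataFrame.query. It assumes the Filters tab accepts expressions in this same syntax; the small DataFrame and its column names are made up for illustration and are not the real iris data.

```python
import pandas as pd

# Hypothetical iris-like sample; the values are invented for this example.
df = pd.DataFrame({
    "species": ["setosa", "setosa", "versicolor", "virginica"],
    "sepal_width": [3.5, 3.0, 3.2, 2.7],
})

# The string below is the kind of expression one would type into a filter.
expression = "species == 'setosa' and sepal_width > 3.2"
filtered = df.query(expression)

print(filtered)
```

Only the rows for which the expression evaluates to true are kept, and the original DataFrame is left unchanged.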
Reshaper
Data can be reshaped using the pivot and melt features of the Reshaper tab.
Pivot
Melt
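Both reshaping operations have direct counterparts in pandas, so a short sketch may help show what they do to a table. The data and column names below are invented for illustration, and the code uses plain pandas rather than anything specific to PandasGUI.

```python
import pandas as pd

# Long format: one row per (id, variable) pair. Values are made up.
long_df = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "variable": ["height", "weight", "height", "weight"],
    "value": [170, 65, 180, 80],
})

# Pivot: spread the distinct entries of "variable" into separate columns.
wide_df = long_df.pivot(index="id", columns="variable", values="value")

# Melt: the reverse, gathering those columns back into name/value pairs.
melted_df = wide_df.reset_index().melt(
    id_vars="id", var_name="variable", value_name="value"
)

print(wide_df)
print(melted_df)
```

Pivoting turns the four long rows into two wide rows, one per id, and melting brings back four rows, though not necessarily in the original order.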
Conclusion
PandasGUI can be used as a graphical companion to pandas for exploring DataFrames. Its features are accessible to beginners and professional data scientists alike, and they are quick and handy enough to make it user-friendly even for non-technical users who do not code. It is under active development. PandasGUI is a super cool toolkit for data analysis and visualization work.
Have fun experimenting with it.