Creating reports in Python is straightforward because we can combine data exploration from several Python libraries with meaningful insights. Sharing such a report is harder: not every viewer or client is familiar with Python, so they may not be able to open your Jupyter notebook and follow what you are trying to convey.
Datapane is an open-source Python library/framework that makes it easy to turn scripts and notebooks into interactive reports. We can share these reports with viewers or clients so that they can easily understand what the data shows.
Datapane lets you build reports programmatically from the objects in your Python notebook, such as pandas DataFrames, plots from visualization libraries, and Markdown text. We can also publish Datapane reports online and choose the audience that can see them.
In this article, we will explore how we can create a data report using Datapane and publish it to an HTML file.
Implementation:
Like most Python libraries, Datapane can be installed with pip: pip install datapane.
- Importing Required Libraries
We will load our dataset with pandas, so we need to import pandas; to create the report, we will import datapane.
import pandas as pd
import datapane as dp
- Loading the required Dataset
We will load an advertisement dataset that contains attributes such as 'Sales' and 'TV' for an MNC (multinational corporation). We will use this data to create the report.
df = pd.read_csv('Advertisement.csv')
df.head()
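Before building the report, it can help to confirm that the columns used in the later plotting code are present. The following is a minimal sketch, assuming the CSV has the columns 'TV', 'Radio', 'Newspaper', and 'Sales' (the names used in this article's code). The small in-memory DataFrame and its values are invented for illustration and stand in for the real file, and `missing_columns` is a hypothetical helper, not part of pandas or Datapane.

```python
import pandas as pd

# Columns that the plotting code in this article refers to.
EXPECTED_COLUMNS = ["TV", "Radio", "Newspaper", "Sales"]


def missing_columns(frame, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from the frame."""
    return [name for name in expected if name not in frame.columns]


# Stand-in for pd.read_csv('Advertisement.csv'); the values are invented.
sample = pd.DataFrame({
    "TV": [100.0, 50.0, 20.0],
    "Radio": [30.0, 40.0, 45.0],
    "Newspaper": [60.0, 45.0, 70.0],
    "Sales": [20.0, 10.0, 9.0],
})

# All expected columns are present in the stand-in frame.
missing = missing_columns(sample)

# Dropping a column shows how the check reports a gap.
missing_after_drop = missing_columns(sample.drop(columns=["Radio"]))
```

With the real dataset, calling `missing_columns(df)` right after `pd.read_csv` would flag a renamed or absent column before any plot is created.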
- Creating Visualizations for report
We need to create the required visualizations in the Jupyter notebook so that we can pass them to the Datapane report. Here we will create a histogram, a box plot, and regression plots.
#histogram
histogram = df.hist()
#boxplot
boxplot = df.boxplot(column=['TV', 'Radio', 'Newspaper'])
# Regression plots (one variable per plot so each can be added to the report)
import seaborn as sns
scat1 = sns.regplot(x="Sales", y="TV", data=df)
scat2 = sns.regplot(x="Sales", y="Radio", data=df)
scat3 = sns.regplot(x="Sales", y="Newspaper", data=df)
We have created these plots and stored them in variables so that we can reference them in our report.
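One caveat: when `ax` is not given, `sns.regplot` draws onto the current matplotlib Axes, so several calls run in the same cell or script can end up layered on one plot. A minimal sketch of one way to keep them apart is to create a separate figure for each feature and pass its Axes explicitly. The data values below are invented stand-ins for the advertisement dataset, and the `regression_axes` dictionary is only an illustrative name.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; omit this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Invented values standing in for the advertisement dataset.
df = pd.DataFrame({
    "TV": [230.0, 44.5, 17.2, 151.5, 180.8, 8.7],
    "Radio": [37.8, 39.3, 45.9, 41.3, 10.8, 48.9],
    "Newspaper": [69.2, 45.1, 69.3, 58.5, 58.4, 75.0],
    "Sales": [22.1, 10.4, 9.3, 18.5, 12.9, 7.2],
})

regression_axes = {}
for feature in ["TV", "Radio", "Newspaper"]:
    # A fresh Figure/Axes pair per feature keeps the plots from overlapping.
    fig, ax = plt.subplots()
    sns.regplot(x="Sales", y=feature, data=df, ax=ax, ci=None)
    ax.set_title(f"Sales vs {feature}")
    regression_axes[feature] = ax

# Each Axes belongs to its own Figure.
distinct_figures = {id(ax.figure) for ax in regression_axes.values()}
```

Each entry of `regression_axes` can then be used wherever the article uses `scat1`, `scat2`, and `scat3`.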
- Creating the report
The next step is to create the report. We will use Markdown blocks to clearly separate the different sections of our report.
report = dp.Report(
    dp.Markdown("Advertisement Report With Sales Data"),
    dp.Table(df),
    dp.Markdown("Histogram Of all Attributes"),
    dp.Plot(histogram),
    dp.Markdown("Box Plot of the Feature Variable"),
    dp.Plot(boxplot),
    dp.Markdown("Regression Plots for all features against target variable"),
    dp.Plot(scat1),
    dp.Plot(scat2),
    dp.Plot(scat3)
)
Here we have created the report by passing in all the plots we created, along with the DataFrame as a table. Next, we will publish this report to an HTML file.
- Publishing Report
The final step is to publish the report we created so that we can share it with the relevant clients or users.
report.save(path='adver_data.html')
Now let us open the report we have created and see how it looks. Because the path is relative, the file is saved in the current working directory, which is typically the folder the notebook or script was run from. Go to that directory and open the file.
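A relative filename such as 'adver_data.html' is interpreted relative to the current working directory of the Python process. The sketch below uses only the standard library to show where that is and, optionally, to open the file in the default browser. The helper names `resolve_report_path` and `open_report` are hypothetical and not part of Datapane.

```python
from pathlib import Path
import webbrowser


def resolve_report_path(filename="adver_data.html"):
    """Return the absolute location that a relative filename refers to."""
    return Path.cwd() / filename


def open_report(filename="adver_data.html"):
    """Open the saved report in the default browser, if the file exists."""
    report_path = resolve_report_path(filename)
    if not report_path.exists():
        # Nothing to open yet; the report has not been saved here.
        return False
    return webbrowser.open(report_path.as_uri())


# Absolute path where the report is expected after saving.
report_path = resolve_report_path()
print(report_path)
```

Calling `open_report()` after saving the report should launch it in a browser; it returns False if no file is found at that location.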
This is the main page of our report. The data is presented as a table in which you can search for values in a particular column and sort the data in ascending or descending order. Let's look at the visualizations we created and how they appear in our report.
Here we can see that the data, the visualizations, and the Markdown text together make a highly informative report that can be shared with the relevant person.
Similarly, we can use different datasets to create different types of reports and visualizations and publish them as shareable HTML reports.
Conclusion:
In this article, we saw how to create a report from a dataset and include different types of charts and graphs in it. We also saw how to publish the report in order to make it shareable in HTML format. Datapane is easy to use and open-source, which makes it a good choice for creating shareable data reports.