Python has many visualization libraries (matplotlib, seaborn, bokeh, plotly, etc.) for exploratory data analysis (EDA), but building charts with them often takes many lines of code. Tableau is known for its interactive dashboards. Imagine how powerful it would be to combine Python and Tableau to build business insights.
TabPy (Tableau Python Server) is an external service that lets Tableau run Python scripts and use their results in a workbook. This makes EDA and visualization more effective and helps in building better dashboards and advanced analytics. Tableau handles the data exploration and visualization, while data developers can focus on the data science logic behind the business use case.
Data science use cases are iterative in nature. Understanding, analyzing and visualizing the data make up a major part of the work, and proper data cleaning and preparation are needed before modelling.
So let’s now look at how to set up TabPy.
TabPy can be installed either by cloning its GitHub repository or via pip (running it in a virtual environment is recommended). TabPy is compatible with Python 3.6 and above.
pip install tabpy==2.3.1
If you want, you can also install tabpy-server and tabpy-client separately.
After installation, run the command “tabpy” from the CLI to start the server.
Note that the default port number is 9004; it will be used later when connecting from Tableau.
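Before switching to Tableau, it can help to confirm that the server is actually listening. The following is a minimal sketch, assuming TabPy runs on localhost with the default port and that its REST endpoint /info is reachable without authentication; the helper names tabpy_info_url and is_tabpy_running are made up for this illustration.

```python
import json
import urllib.request


def tabpy_info_url(host="localhost", port=9004):
    """Build the URL of TabPy's /info endpoint (9004 is the default port)."""
    return f"http://{host}:{port}/info"


def is_tabpy_running(host="localhost", port=9004, timeout=2.0):
    """Return True if a server answers /info with valid JSON, else False."""
    try:
        with urllib.request.urlopen(tabpy_info_url(host, port), timeout=timeout) as resp:
            json.load(resp)  # a TabPy server replies with a JSON document
            return True
    except (OSError, ValueError):
        # OSError covers refused connections, HTTP errors and timeouts;
        # ValueError covers replies that are not valid JSON.
        return False


if __name__ == "__main__":
    print("TabPy reachable:", is_tabpy_running())
```

If this prints False, check that the “tabpy” command is still running and that no other port was configured.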
Open Tableau Desktop.
Go to Help -> Settings and Performance -> Manage External Service Connection.
Enter the server name (localhost when TabPy runs on the same machine), specify the port number as 9004, and sign in if the server requires credentials. Optionally, check the Require SSL box to encrypt the data sent over the network. Lastly, click Test Connection; a message confirms that the connection succeeded.
Now Tableau is linked to TabPy. It’s time to run some operations on our data with a Python script. Load the data into a new worksheet in Tableau and create a calculated field.
Passing Expressions to Python
- To let Tableau know that an expression needs to go to Python, it must be passed through one of the following four functions: SCRIPT_BOOL, SCRIPT_REAL, SCRIPT_INT, SCRIPT_STR. The function name determines the type of the returned values.
- Python functions are evaluated as table calculations in Tableau.
- In table calculations, all fields passed to Python must be aggregated, e.g. SUM([Profit]), MAX([Profit]), MIN([Profit]), ATTR([Category]).
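To make the rules above concrete, here is a rough sketch of the kind of request that reaches TabPy's /evaluate endpoint: the script text plus one list per aggregated field, keyed _arg1, _arg2, and so on. The helper build_evaluate_payload and the sample values are hypothetical, and the exact request Tableau sends may carry more detail than shown here.

```python
import json


def build_evaluate_payload(script, *columns):
    """Pack a script and its argument columns in the shape /evaluate accepts."""
    # Each aggregated field becomes one list, named by its position.
    data = {f"_arg{i}": list(col) for i, col in enumerate(columns, start=1)}
    return json.dumps({"data": data, "script": script})


# Made-up values standing in for SUM([Profit]) and ATTR([Category]) per mark.
profit = [120.5, -30.0, 0.0]
category = ["Office", "Tech", "Furniture"]

payload = build_evaluate_payload("return [p > 0 for p in _arg1]", profit, category)
print(payload)
```

Because every field arrives as a whole list, the script should return a list of the same length (or a single value), which is why the fields have to be aggregated first.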
Here is an example of sentiment analysis. The Python script is written inside the calculated field, and the right side of the worksheet shows the sentiment scores once the calculated field is added to the view.
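The original script is not reproduced here, so the following is only an illustrative stand-in for what the Python side of such a calculated field could do. It assumes the field passes a list of texts and expects one score per text; the word lists and the function score_sentiment are invented for this sketch, and a real analysis would rely on a proper sentiment library.

```python
import re

# Tiny hypothetical lexicons, for illustration only.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "slow"}


def score_sentiment(texts):
    """Return one score in [-1, 1] per text: (positives - negatives) / matches."""
    scores = []
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        pos = sum(w in POSITIVE for w in words)
        neg = sum(w in NEGATIVE for w in words)
        total = pos + neg
        # Texts without any lexicon word are treated as neutral.
        scores.append((pos - neg) / total if total else 0.0)
    return scores


reviews = ["Great product, love it", "Terrible and slow delivery", "It arrived on Tuesday"]
print(score_sentiment(reviews))  # [1.0, -1.0, 0.0]
```

In a calculated field, logic like this would sit inside a SCRIPT_REAL call, since the scores are real numbers.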
Another example flags the rows where Profit is greater than zero. SCRIPT_BOOL returns a boolean result from the specified expression, which is passed to the running external service instance (here, TabPy). In Python expressions, use _argn to reference the parameters (_arg1, _arg2, etc.) in the order they are passed.
- _arg1 is equal to SUM([Profit]).
- All fields passed to Python must be aggregated, e.g. SUM([Profit]), MIN([Profit]), MAX([Profit]), ATTR([Category]).
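The profit check can be tried outside Tableau with a small local emulation. This is a sketch under assumptions: run_script only imitates the idea of wrapping the script text in a function whose parameters are _arg1, _arg2, and so on, and it is not TabPy's own code. The calculated field it imitates would plausibly read SCRIPT_BOOL("return [p > 0 for p in _arg1]", SUM([Profit])).

```python
import textwrap


def run_script(body, *columns):
    """Emulate a SCRIPT_* call: wrap `body` in a function and call it."""
    names = [f"_arg{i}" for i in range(1, len(columns) + 1)]
    lines = textwrap.dedent(body).strip().splitlines()
    indented = "\n".join("    " + line for line in lines)
    source = f"def _user_script({', '.join(names)}):\n{indented}\n"
    namespace = {}
    exec(source, namespace)  # defines _user_script inside `namespace`
    return namespace["_user_script"](*columns)


# Made-up aggregated values standing in for SUM([Profit]).
sum_profit = [250.0, -40.0, 0.0, 15.75]
is_profitable = run_script("return [p > 0 for p in _arg1]", sum_profit)
print(is_profitable)  # [True, False, False, True]
```

Testing the script body this way before pasting it into a calculated field makes it easier to catch Python errors that would otherwise only surface as a failed table calculation.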
In this article, we have given an overview of how to set up and use TabPy. As a next step, connect your own data and use TabPy for visualization, advanced analytics, storytelling, interactive graphs and more. Tableau can be connected to multiple data sources to create real-time, dynamic dashboards.