Why PyWhy not DoWhy

Inspired by the work of Judea Pearl and others in causal inference, Microsoft introduced a software library called DoWhy in 2018.

Computer scientist Judea Pearl and science writer Dana Mackenzie wrote The Book of Why, a book on causal reasoning. Pearl famously said, “data are profoundly dumb”. While data can be leveraged to make accurate predictions, even the most sophisticated machine learning techniques fail to explain how they reached their conclusions.

Pearl started working in artificial intelligence in the 1970s. He argued that causation cannot be reduced to correlation. In short, you can never get causal information out of data without first putting causal hypotheses in.


Two decades before The Book of Why was published, Pearl developed do-calculus, which facilitates the identification of causal effects in non-parametric models. His research paper proposed a method to determine whether the available assumptions are sufficient to identify causal effects from non-experimental data.
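
As an illustration of the kind of result do-calculus yields (this formula is a standard textbook consequence, not something quoted in this article): when a set of observed variables Z satisfies Pearl's back-door criterion relative to a treatment X and an outcome Y, the effect of intervening on X can be written purely in terms of observational quantities.

```latex
P\bigl(y \mid do(x)\bigr) \;=\; \sum_{z} P(y \mid x, z)\, P(z)
```

The left-hand side describes what happens when X is set by intervention; the right-hand side uses only distributions that can be estimated from non-experimental data, provided the assumption about Z holds.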

Inspired by Pearl’s work on causal inference, Microsoft introduced the software library DoWhy in 2018. It highlights the often neglected yet important assumptions underlying causal inference analyses. The library offers a programmatic interface to popular causal inference methods. Over the last four years, the framework has become quite popular, with contributions from dozens of data scientists. In May 2022, Microsoft moved DoWhy to an independent open-source governance model under a new PyWhy GitHub organisation. To build this model, Microsoft is collaborating with AWS.

Microsoft, causal inference, and DoWhy

Microsoft uses causal inference to inform several important decisions, for example by estimating the impact of recommendation systems. The tech giant is working on fundamental advances that combine traditional machine learning with causal inference methods. Causal inference focuses on the effect of an action, unlike conventional machine learning, which is concerned only with predicting the final outcome.

There are a number of critical research challenges in evaluating causal machine learning models and in formalising and integrating domain expertise into machine learning pipelines. The standard procedure usually involves doing every step from scratch: finding the right identification strategy, devising an estimator, and conducting robustness checks. Understanding the assumptions and validating them is cumbersome.

To deal with these challenges, Microsoft has released several open-source tools and libraries, including DoWhy. The library uses a graphical model framework to represent assumptions formally: users specify what they know about the data-generating process as a graph. The open-source library then estimates causal effects from historical data alone, which is particularly useful when you can’t run experiments due to time or cost constraints.
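
To make the idea of writing down assumptions as a graph concrete, here is a minimal sketch in plain Python. It does not use DoWhy's API; the variable names (`season`, `user_activity`, `recommendation_shown`, `purchases`) and the helper functions are hypothetical, chosen only to echo the recommendation-system example above. It also simplifies matters by looking only at direct common causes.

```python
# Hypothetical causal assumptions: each key is assumed to cause
# every variable in its list (a directed acyclic graph).
causal_graph = {
    "season": ["recommendation_shown", "purchases"],
    "user_activity": ["recommendation_shown", "purchases"],
    "recommendation_shown": ["purchases"],
    "purchases": [],
}


def parents(graph, node):
    """Return the set of variables assumed to directly cause `node`."""
    return {cause for cause, effects in graph.items() if node in effects}


def direct_common_causes(graph, treatment, outcome):
    """Variables that directly cause both the treatment and the outcome."""
    return parents(graph, treatment) & parents(graph, outcome)


confounders = direct_common_causes(
    causal_graph, "recommendation_shown", "purchases"
)
print(sorted(confounders))  # ['season', 'user_activity']
```

Writing the graph down explicitly is what lets a later step decide which variables must be adjusted for before the effect of the recommendation can be estimated.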

DoWhy focuses on four steps of an end-to-end causal inference analysis:

Modeling: Causal reasoning starts with creating a clear model of the causal assumptions being made.

Identification: In this step, the model is used to find strategies for identifying the causal effect from the available data.

Estimation: Once the causal effect is identified, you can choose from a range of statistical and machine learning-based estimation methods to answer the causal question.

Refutation: In this step, the underlying assumptions are tested by checking how robust the estimate is.
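
The four steps above can be sketched on simulated data using only the Python standard library. This is an illustration of the workflow, not DoWhy's API: the data-generating process, the function names, and the choice of a placebo-treatment check as the refutation are all assumptions made for this example.

```python
import random
from statistics import mean


def simulate(n=20000, true_effect=2.0, seed=0):
    """Modeling: w is assumed to cause both treatment t and outcome y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = rng.random() < 0.5                  # binary confounder
        t = rng.random() < (0.8 if w else 0.2)  # w makes treatment likelier
        y = true_effect * t + 3.0 * w + rng.gauss(0, 1)
        rows.append((int(w), int(t), y))
    return rows


def naive_difference(rows):
    """Treated-minus-control mean, ignoring the confounder (biased here)."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


def backdoor_estimate(rows):
    """Identification + estimation: adjust for w by stratifying on it,
    then average the per-stratum differences weighted by stratum size."""
    n = len(rows)
    effect = 0.0
    for level in (0, 1):
        stratum = [r for r in rows if r[0] == level]
        treated = [y for _, t, y in stratum if t == 1]
        control = [y for _, t, y in stratum if t == 0]
        effect += (len(stratum) / n) * (mean(treated) - mean(control))
    return effect


def placebo_refutation(rows, seed=1):
    """Refutation: shuffle the treatment column; a sound estimator
    should then report an effect close to zero."""
    rng = random.Random(seed)
    shuffled = [t for _, t, _ in rows]
    rng.shuffle(shuffled)
    placebo_rows = [(w, t, y) for (w, _, y), t in zip(rows, shuffled)]
    return backdoor_estimate(placebo_rows)


rows = simulate()
print("naive estimate:   ", round(naive_difference(rows), 2))
print("adjusted estimate:", round(backdoor_estimate(rows), 2))
print("placebo estimate: ", round(placebo_refutation(rows), 2))
```

In this setup the naive comparison overstates the simulated effect of 2.0 because the confounder raises both the chance of treatment and the outcome, the adjusted estimate should land near 2.0, and the placebo estimate should land near zero.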

In PyWhy, you can build and host interoperable libraries, tools, and other resources for a range of causal tasks and applications. These are connected through a common API for foundational causal operations, with the focus on the end-to-end analysis process.

Similar libraries and frameworks from Microsoft

DoWhy is not the only causal inference library Microsoft has introduced. Microsoft’s ALICE team introduced a Python package called EconML, which applies machine learning techniques to estimate individualised causal responses from observational or experimental data. Incorporating individual machine learning steps into interpretable causal models improves the reliability of what-if predictions and makes causal analysis faster and easier.

Project Azua is another case in point. It helps in developing machine learning solutions for efficient decision making that show human expert-level performance across domains. The framework divides decisions into two types: best next question and best next action.

Microsoft continues to push the boundaries of causal learning through several new initiatives, approaches, statistical advances, and deep learning methods for end-to-end causal discovery and inference. Microsoft also recognises the importance of causal learning for fairness, explainability, and interoperability of machine learning models.

Shraddha Goled
I am a technology journalist with AIM. I write stories focused on the AI landscape in India and around the world with a special interest in analysing its long term impact on individuals and societies. Reach out to me at shraddha.goled@analyticsindiamag.com.
