Turing Award-winning scientist Yoshua Bengio and his team have demonstrated how the fundamentals of causal inference relate to crucial problems in machine learning and may even pave the way for transfer learning and generalisation.
Bengio is not alone: many researchers attach great importance to the potential of causal inference in machine learning. Causal inference can reduce adversarial vulnerabilities, strengthen reinforcement learning mechanisms, and bring machine learning closer to how the human mind works. A causal inference analysis estimates the causal effect of an intervention on an outcome using real-world, non-experimental observational data.
IBM has been one of the proponents of causal inference techniques. In 2019, its research arm developed the open-source Causal Inference 360 Toolkit, a one-of-its-kind toolkit that offers a comprehensive suite of methods under a unified API. IBM Research has now released a new version of the open-source Python library with additional features and functionalities.
Causal Inference Toolkit
Causal inference comprises a set of methods for estimating the effect of an intervention on outcomes observed in observational data. The IBM Causal Inference 360 Toolkit gives users access to several tools that move their decision-making from a ‘best guess’ to concrete, data-based answers.
The IBM Causal Inference 360 library uses machine learning models internally and allows users to seamlessly plug in almost any machine learning model of their choice. It also provides methods for selecting the best machine learning models and parameters, based on cross-validation and on both established and novel causal-specific metrics.
The package offers a suite of causal methods under a unified, scikit-learn-based API. This fit-and-predict-style API allows more honest effect estimation, since a model can be trained on one set of examples and the effect estimated on another. It implements meta-algorithms that allow complex machine learning models to be plugged in; this modular approach supports highly flexible causal modelling.
The package also includes an evaluation suite, which helps diagnose poorly performing models by reinterpreting familiar machine learning evaluations from a causal perspective.
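The plug-in pattern can be illustrated with a minimal, standard-library-only sketch. It is not the toolkit's actual API: the class names `GroupMeanLearner` and `OutcomeMetaEstimator`, the method `population_outcome`, and the toy numbers are all hypothetical. The point is only that the causal wrapper needs nothing from the learner beyond `fit` and `predict`, and that it can be trained on one set of examples and queried on another.

```python
class GroupMeanLearner:
    """Tiny stand-in for any model that exposes fit(X, y) and predict(X)."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for row, target in zip(X, y):
            key = tuple(row)
            sums[key] = sums.get(key, 0.0) + target
            counts[key] = counts.get(key, 0) + 1
        self.means_ = {key: sums[key] / counts[key] for key in sums}
        return self

    def predict(self, X):
        # Only works for feature combinations seen during fit.
        return [self.means_[tuple(row)] for row in X]


class OutcomeMetaEstimator:
    """Hypothetical meta-algorithm: wraps a plug-in learner for causal use."""

    def __init__(self, learner):
        self.learner = learner

    def fit(self, X, a, y):
        # The treatment assignment is appended as one more feature.
        self.learner.fit([list(x) + [t] for x, t in zip(X, a)], y)
        return self

    def population_outcome(self, X, treatment_value):
        # Predict everyone's outcome as if they all received treatment_value.
        rows = [list(x) + [treatment_value] for x in X]
        predictions = self.learner.predict(rows)
        return sum(predictions) / len(predictions)


# Train on one set of examples...
X_train = [[0], [0], [0], [1], [1], [1]]
a_train = [0, 0, 1, 0, 1, 1]
y_train = [1.0, 1.0, 3.0, 4.0, 6.0, 6.0]
# ...and estimate the effect on another.
X_test = [[0], [1], [1], [0]]

estimator = OutcomeMetaEstimator(GroupMeanLearner()).fit(X_train, a_train, y_train)
outcome_treated = estimator.population_outcome(X_test, 1)
outcome_control = estimator.population_outcome(X_test, 0)
effect = outcome_treated - outcome_control
print("estimated effect on held-out examples:", effect)
```

Swapping `GroupMeanLearner` for any other object with the same two methods would leave `OutcomeMetaEstimator` unchanged, which is the modularity the paragraph above describes.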
IBM’s Approach To Causal Inference
IBM considers the following points when addressing causal effect estimation:
Potential outcome prediction: Every causal effect is defined by two potential outcomes. IBM adopts a two-step approach that separates the potential-outcome-prediction step from the effect-estimation step. This has the advantage of supporting multi-treatment problems, where a single effect is not well defined.
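A rough sketch of why the separation helps with multiple treatments: once step one has produced a potential outcome per treatment, step two is just a contrast between any two of them. The treatment names, the outcome values, and the helper `contrast` below are hypothetical and are not taken from the toolkit.

```python
# Step 1 (assumed already done): estimated population outcome under each
# treatment. These numbers are made up for illustration.
potential_outcomes = {"placebo": 0.30, "drug_a": 0.24, "drug_b": 0.18}


def contrast(outcomes, treatment, reference, kind="diff"):
    """Step 2: compare two potential outcomes on the chosen scale."""
    y_treatment = outcomes[treatment]
    y_reference = outcomes[reference]
    if kind == "diff":
        return y_treatment - y_reference
    if kind == "ratio":
        return y_treatment / y_reference
    raise ValueError(f"unknown effect kind: {kind}")


# With three treatments there is no single "the effect"; the analyst picks
# which pair to compare and on which scale.
print(contrast(potential_outcomes, "drug_a", "placebo"))
print(contrast(potential_outcomes, "drug_b", "placebo", kind="ratio"))
print(contrast(potential_outcomes, "drug_b", "drug_a"))
```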
Average treatment effect: Special attention is devoted to the population on which the effect is estimated; for example, the average treatment effect on the entire sample and the average treatment effect on the treated.
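The difference between the two populations can be made concrete with a small sketch. It assumes that a model has already predicted both potential outcomes for every individual; the data and the function `average_effect` are hypothetical.

```python
# Hypothetical per-individual records:
# (received treatment?, predicted outcome if untreated, predicted outcome if treated)
individuals = [
    (1, 2.0, 5.0),
    (1, 1.0, 4.0),
    (0, 3.0, 4.0),
    (0, 2.0, 3.0),
    (0, 4.0, 5.0),
]


def average_effect(rows, population="all"):
    """Average the individual effects over the chosen target population."""
    if population == "treated":
        rows = [row for row in rows if row[0] == 1]
    elif population != "all":
        raise ValueError(f"unknown population: {population}")
    return sum(y1 - y0 for _, y0, y1 in rows) / len(rows)


ate = average_effect(individuals)             # entire sample
att = average_effect(individuals, "treated")  # only those actually treated
print("ATE:", ate, "ATT:", att)
```

In this toy data the treated individuals benefit more than the rest, so the two averages differ, which is why the target population has to be stated alongside any effect estimate.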
Causal inference models: There are two types of models:
- Weight models: These models weight the data to balance the treatment and control groups, then estimate the outcome as a weighted average of the observed outcomes.
- Direct outcome models: These models use the covariates (features) and the treatment assignment to build a model that predicts the outcome directly. Such a model can be used to predict the outcome under any treatment value.
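The two model types can be sketched side by side on a toy dataset with a single binary covariate. This is an illustration under stated assumptions, not the toolkit's implementation: the weight model here is inverse probability weighting with a per-stratum propensity estimate, the direct outcome model is a table of mean outcomes per covariate and treatment, and all names and numbers are hypothetical.

```python
from collections import defaultdict

# Toy observational data: (covariate x, treatment a, outcome y).
# x influences both who gets treated and the outcome.
data = (
    [(0, 1, 3.0)] * 2 + [(0, 0, 1.0)] * 6 +
    [(1, 1, 6.0)] * 6 + [(1, 0, 4.0)] * 2
)


def weight_model_outcomes(rows):
    """Weight each unit by 1 / P(its own treatment | x), then average."""
    n_by_x = defaultdict(int)
    treated_by_x = defaultdict(int)
    for x, a, _ in rows:
        n_by_x[x] += 1
        treated_by_x[x] += a
    # Assumes every stratum contains both treated and untreated units.
    propensity = {x: treated_by_x[x] / n_by_x[x] for x in n_by_x}

    weighted_sum = {0: 0.0, 1: 0.0}
    weight_total = {0: 0.0, 1: 0.0}
    for x, a, y in rows:
        p_received = propensity[x] if a == 1 else 1.0 - propensity[x]
        weight = 1.0 / p_received
        weighted_sum[a] += weight * y
        weight_total[a] += weight
    return {a: weighted_sum[a] / weight_total[a] for a in (0, 1)}


def direct_outcome_model_outcomes(rows):
    """Model y from (x, a), then predict every unit under each treatment."""
    total = defaultdict(float)
    count = defaultdict(int)
    for x, a, y in rows:
        total[(x, a)] += y
        count[(x, a)] += 1
    predicted = {key: total[key] / count[key] for key in total}
    return {
        a: sum(predicted[(x, a)] for x, _, _ in rows) / len(rows)
        for a in (0, 1)
    }


ipw = weight_model_outcomes(data)
std = direct_outcome_model_outcomes(data)
print("weight model effect:", ipw[1] - ipw[0])
print("direct outcome model effect:", std[1] - std[0])
```

On this small, fully stratified example both routes arrive at the same potential outcomes; with real data and fitted models they generally differ, which is one reason to have both available.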
Confounders and DAGs: It is imperative to properly select both dimensions of the data to avoid introducing bias. On the rows, one must thoughtfully choose the correct inclusion or exclusion criteria for individuals in the data; on the columns, care should be taken in choosing which covariates would act as confounders and should be included in the analysis.
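To see why the choice of columns matters, the sketch below compares an estimate that ignores a confounder with one that adjusts for it by stratification. The records, the `age_group` covariate, and the function names are hypothetical; stratification is used here only as the simplest possible adjustment.

```python
# Hypothetical records: (age_group, treated, outcome). Older patients are
# treated more often and also have higher outcomes regardless of treatment.
records = (
    [("young", 1, 3.0)] * 2 + [("young", 0, 1.0)] * 6 +
    [("old", 1, 6.0)] * 6 + [("old", 0, 4.0)] * 2
)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_difference(rows):
    """Treated mean minus control mean, ignoring every covariate."""
    treated = mean(y for _, a, y in rows if a == 1)
    control = mean(y for _, a, y in rows if a == 0)
    return treated - control


def stratified_difference(rows):
    """Compare within each age group, then average by group size."""
    total = 0.0
    for group in sorted({g for g, _, _ in rows}):
        subset = [row for row in rows if row[0] == group]
        total += naive_difference(subset) * len(subset) / len(rows)
    return total


naive = naive_difference(records)
adjusted = stratified_difference(records)
print("ignoring age_group:", naive)
print("adjusting for age_group:", adjusted)
```

Leaving `age_group` out of the analysis inflates the apparent effect here, because part of the gap between treated and untreated patients is simply the gap between older and younger patients.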
Real-world Application
The IBM Causal Inference Toolkit is being used at the company's research lab in Haifa, Israel, as part of its drug repurposing work, a method for finding new therapeutic uses for existing approved drugs. The lab has discovered two new potential treatments for the dementia that typically accompanies Parkinson’s disease.
The team also used the toolkit in collaboration with Assuta Health Services, one of Israel’s private hospital networks, to analyse the impact of COVID-19 on access to healthcare. The team analysed more than 300,000 invitations sent to women for breast screening exams to determine how many did not show up for their appointments. The causal inference analysis revealed that it was the number of newly infected people that influenced whether patients showed up for their appointments, rather than the government’s non-pharmaceutical interventions, as previously thought.