“But to measure cause and effect, you must ensure that simple correlation, however tempting it may be, is not mistaken for a cause. In the 1990s, the stork population in Germany increased and the German at-home birth rates rose as well. Shall we credit storks for airlifting the babies?”
Neil deGrasse Tyson, American astrophysicist
One of the basic tenets of statistics is: correlation is not causation. Correlation between variables shows a pattern in the data and that these variables tend to ‘move together’. It is pretty common to find reliable correlations for two variables, only to discover that they are not at all causally linked.
Take, for instance, the ice cream-homicide fallacy. It rests on the observation that ice cream sales and homicide rates tend to rise together. So do we blame the harmless ice cream for increased crime rates? The example shows that when two or more variables correlate, people are tempted to infer a causal relationship between them. In this instance, the correlation between ice cream sales and homicides does not mean that one causes the other.
Machine learning, too, hasn’t been spared from such fallacies. A significant difference between statistics and machine learning is that while the former focuses on the model’s parameters, machine learning concentrates less on parameters and more on the predictions. The parameters in machine learning are only as good as their ability to predict an outcome.
Statistically significant results from machine learning models are often read as evidence that one factor causes another, when in reality a whole assortment of other factors is involved. A spurious correlation arises when a lurking variable or confounding factor is ignored, and cognitive bias leads an individual to oversimplify the relationship between two causally unrelated events. In the case of the ice cream-homicide fallacy, warmer temperature is the confounding variable that is often ignored: in hot weather people consume more ice cream, but they also spend more time in public spaces, where more crimes can occur.
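To make the confounding argument concrete, here is a minimal simulation sketch of the ice cream-homicide example. All the numbers in it are made up for illustration: temperature drives both ice cream sales and homicides, and neither series has any effect on the other. The helper names (pearson, residuals) are our own and not from any library. The script compares the raw correlation with the correlation that remains once the straight-line effect of temperature is removed from both series.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a least-squares straight-line fit on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data: temperature drives both series; neither affects the other.
temperature = [rng.uniform(0, 35) for _ in range(5000)]
ice_cream_sales = [2.0 * t + rng.gauss(0, 10) for t in temperature]
homicides = [0.5 * t + rng.gauss(0, 5) for t in temperature]

# Correlation when the confounder is ignored.
raw_corr = pearson(ice_cream_sales, homicides)

# Correlation after the linear effect of temperature is removed from both.
partial_corr = pearson(residuals(ice_cream_sales, temperature),
                       residuals(homicides, temperature))

print(f"correlation ignoring temperature:       {raw_corr:.2f}")
print(f"correlation after removing temperature: {partial_corr:.2f}")
```

In this toy setup the raw correlation is clearly positive, while the correlation after accounting for temperature is close to zero. Real data are rarely this clean, and the confounder is often not measured at all.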
The faulty correlation-causation relationship is becoming a bigger problem as data sets grow. A study titled ‘The Deluge of Spurious Correlations in Big Data’ showed that the number of arbitrary correlations increases with the size of the data set. According to the study, such correlations appear because of the size of the data, not its nature. It noted that correlations can be found even in large, randomly generated databases, which implies that most correlations are spurious.
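The effect is easy to reproduce on a small scale. The sketch below is our own illustration, not the method used in the study: it generates independent random series, so any correlation between them is spurious by construction, and counts how many pairs still show a correlation stronger than 0.5. The series length (30), the threshold (0.5) and the function names are arbitrary assumptions.

```python
import itertools
import random


def standardize(xs):
    """Center a series and scale it to unit length, so that the dot product
    of two standardized series equals their Pearson correlation."""
    n = len(xs)
    mean = sum(xs) / n
    centered = [x - mean for x in xs]
    norm = sum(c * c for c in centered) ** 0.5
    return [c / norm for c in centered]


def count_strong_pairs(n_vars, n_obs=30, threshold=0.5, seed=0):
    """Count pairs of independent random series with |correlation| > threshold."""
    rng = random.Random(seed)
    series = [standardize([rng.gauss(0, 1) for _ in range(n_obs)])
              for _ in range(n_vars)]
    count = 0
    for a, b in itertools.combinations(series, 2):
        r = sum(x * y for x, y in zip(a, b))
        if abs(r) > threshold:
            count += 1
    return count


# More variables means more pairs, and therefore more chance "findings".
counts = {n_vars: count_strong_pairs(n_vars) for n_vars in (10, 50, 200)}
for n_vars, count in counts.items():
    n_pairs = n_vars * (n_vars - 1) // 2
    print(f"{n_vars:>3} variables, {n_pairs:>5} pairs: {count} strong correlations")
```

Because every series here is pure noise, each "strong" pair is a false discovery. The number of pairs grows roughly with the square of the number of variables, so the count of chance correlations climbs quickly as more columns are added.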
In ‘The Book of Why: The New Science of Cause and Effect’, authors Judea Pearl and Dana Mackenzie pointed out that machine learning struggles with causal inference. The book said deep learning is good at finding patterns but can’t explain the relationships behind them, which makes it a sort of black box. Big Data is often seen as the silver bullet for all data science problems. However, the authors posit that ‘data are profoundly dumb’ because they can only tell us that something occurred and not necessarily why it happened. Causal models, on the other hand, make up for the disadvantages that deep learning and data mining suffer from. Pearl, a Turing Award winner and the developer of Bayesian networks, thinks causal reasoning could help machines develop human-like intelligence by asking counterfactual questions.
In recent times, the concept of causal AI has gained much momentum. With AI being used in almost every field, including critical sectors such as healthcare and finance, relying solely on predictive models could lead to devastating results. Causal AI can help identify precise relationships between cause and effect. It seeks to model the impact of interventions and distribution changes using a combination of data-driven learning and assumptions that are not part of the statistical description of a system.
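The difference between predicting from observed data and modelling an intervention can be shown with a toy structural model. The sketch below is an assumption-laden illustration, not a causal AI library: a hidden factor z influences both x and y, and x has a true effect of 1.0 on y. All coefficients and names are hypothetical. The script compares the slope a purely predictive fit would learn from observed data with the change in y when x is set by hand, which is the kind of intervention causal models try to capture.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
N = 20000


def sample_y(x, z):
    """Assumed mechanism for y: true effect of x is 1.0, effect of z is 2.0."""
    return 1.0 * x + 2.0 * z + rng.gauss(0, 1)


# Observational data: x is not set by us, it depends on the hidden factor z.
zs = [rng.gauss(0, 1) for _ in range(N)]
xs = [z + rng.gauss(0, 1) for z in zs]
ys = [sample_y(x, z) for x, z in zip(xs, zs)]

# Least-squares slope of y on x, which is what a predictive model picks up.
mean_x, mean_y = sum(xs) / N, sum(ys) / N
observational_slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
                       / sum((x - mean_x) ** 2 for x in xs))


def mean_y_under_intervention(x_value):
    """Average y when x is forced to x_value, regardless of z."""
    total = 0.0
    for _ in range(N):
        z = rng.gauss(0, 1)  # z still varies naturally
        total += sample_y(x_value, z)
    return total / N


# Effect of raising x from 0 to 1 by intervention.
interventional_effect = (mean_y_under_intervention(1.0)
                         - mean_y_under_intervention(0.0))

print(f"slope learned from observed data: {observational_slope:.2f}")
print(f"effect of intervening on x:       {interventional_effect:.2f}")
```

In this setup the observed slope comes out near 2.0, about twice the true effect of 1.0, because the hidden factor moves x and y together. A model that is good at prediction would still be misleading about what happens when x is actually changed.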
Recently, researchers from the University of Montreal, the Max Planck Institute for Intelligent Systems, and Google Research showed that causal representations help make machine learning models more robust. The team noted that discovering causal relationships requires robust knowledge that holds beyond the observed data distribution and extends to situations involving reasoning.