Last updated February 1, 2018

Finally! Big Data 2.0 addresses causality…


Illustration: cars parked in a row outside.

Published on February 1, 2014

by Roland Hallebeek

One of the most famous examples in recent Big Data literature is the colour of second-hand cars. You want a second-hand car with the smallest chance of defects? Buy an orange one. The correlation between this colour and the percentage of defects is statistically very strong, and there are several possible explanations. It could be that orange is a colour for car lovers: a special colour for people who see their car as a very important part of their life and thus treat it accordingly. Maintenance, oil checks, Turtle Wax, the lot. Or it could be that orange is such a signal colour that other motorists notice these cars a fraction earlier, so on average there are fewer collisions with orange cars and they survive traffic with less damage than cars in other colours.

Big data WILL BE about causality

Whatever the reason, it is good news for prospective car buyers, because now they know how to react when their goal is a car with the least chance of defects. It is, however, bad news if you are a car dealer in the business of selling white second-hand cars. In other words, correlation is great for reacting, but it is causation you need for managing and controlling. If you know (not: assume) that orange cars are maintained so well by their owners, you, as a dealer who sells white second-hand cars, can put more emphasis on maintenance reports. If correlation shows you that teams with a higher average age underperform, the correlation reflex is to fire them and hire younger employees. The managers among us know that this is not possible, or only to a limited extent. So causality is needed in order to truly fix the problem.

Enter ‘soft controls’

Accountants talk about little else: soft controls are all those quantified variables that you cannot find in the company data warehouse. “Has our strategy landed among employees?”, “Are we innovative enough?”, “Are our processes working efficiently?”, “Where to improve our B2B sales force?”. The answers to critical issues like these aren’t in the corporate admin. You have to ask people. Ask people in a certain way, add some clever algorithms and you will get surprising new insights. Our company, Transparency Lab, specializes in quantifying any business issue the corporate data warehouse can’t help you with.

How to measure soft controls?

As we said before: for soft controls you have to ask people. Check any questionnaire in your organization and you will find that 95% of it asks for a respondent’s opinion.

We are an innovative organization
Disagree/somewhat disagree/neutral/somewhat agree/agree

How well do we serve clients? (give your rating)
0 – 1 – 2 – 3 – 4 – 5 – 6 – 7 – 8 – 9 – 10

Make a note now: opinions do not cut it. In hard controls, in the corporate data warehouse, a 1 is a 1 and 463 is 463. This is different with opinions. When asked for an opinion, my ‘2’ rating for how we serve clients might be a ‘4’ for you. We might both somewhat agree that we are an innovative organization and still disagree about why. So, for big data and causality, we need to ask for verifiable facts or verifiable behaviour. Ask people about verifiable facts and behaviour, add some clever algorithms and you will get surprising new insights. Not just analysis but actionable reports on status, ambition, maturity, alignment, best management style, knowledge sharing, waste and personal improvement. True ‘control’ when talking soft controls.

Soft controls are forward looking

Data in the corporate admin will not tell you what’s going to happen. Sure, you can extrapolate sales trends and do a forecast, but it’s rough at best. Soft controls, however, do have a future component. Look at the following multiple-choice question:

How has your team defined innovation objectives?

Not done yet
Busy doing so
We have a formal set of SMART objectives
We have a formal set of SMART objectives AND these have been signed off by management

Here you can ask for the actual situation (e.g. “Not done yet”) and for an ambition over, say, 6 months (e.g. “We have a formal set of SMART objectives”).

Soft controls have their own patterns

Allow us to share some of the patterns we found after 500 assessments with 50,000 respondents and 5 million data points. These projects were in various industries (Financial Services, Public Sector, Industry, Energy, Retail), about various topics (strategy, HR, marketing & sales and IT) and in various regions (Germany, Italy, Netherlands, Switzerland, USA). So the patterns we show here say something very general: how groups of people in organizations apparently do their work.

Pattern 1: 70-20-10
In each of the 500 projects we asked for two answers to each question: a score for the actual situation and a score for the future (usually the respondent’s ambition or expectation for 6 or 12 months ahead).

In 70% of the projects the respondents wanted to improve almost equally on each question in the assessment, usually by approx. 20% (so: no priority)
In 20% of the projects the respondents wanted to improve dramatically: on a scale from 0 to 10, on average from a 4.5 to a 9.5 (so: no realism)
In 10% of the projects the respondents didn’t plan any improvement (so: no ambition)
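
The three profiles above can be told apart mechanically once every question has an actual and a future score. The following is a minimal Python sketch, assuming per-question average scores on a 0 to 10 scale. The function name `classify_project` and the thresholds `flat_tol`, `jump` and `spread_tol` are illustrative assumptions, not the cut-offs used in the 500 projects.

```python
from statistics import mean, pstdev

def classify_project(actual, target, flat_tol=0.25, jump=4.0, spread_tol=0.5):
    """Label a project by how its respondents want to improve.

    actual, target: per-question average scores on a 0-10 scale.
    All thresholds are hypothetical and should be tuned on real data.
    """
    gaps = [t - a for a, t in zip(actual, target)]
    avg_gap = mean(gaps)
    if abs(avg_gap) <= flat_tol:
        return "no ambition"   # hardly any improvement planned
    if avg_gap >= jump:
        return "no realism"    # e.g. from 4.5 to 9.5 on average
    if pstdev(gaps) <= spread_tol:
        return "no priority"   # about the same improvement on every question
    return "prioritised"       # clear differences between questions

print(classify_project([5, 5, 5], [6, 6, 6]))  # same gap everywhere
```

A fourth label, "prioritised", is added here for groups that fit none of the three profiles; the source does not name such a category.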

Take away for managers
In none of the 500 projects did the overall group scores come close to what the management objective was. So managers should not (only) set objectives in hard controls (“Sell 10% more”) but also in soft controls (“Improve the sales process with items X, Y and Z”).

Pattern 2: in 1 out of 3 areas there is a harmful level of disagreement
Using a tweaked form of a dendrogram (search for ‘cluster analysis’), we can quantify the level of (dis)agreement in a group of respondents. On average, the 500 projects had 10 different groups of respondents, so we compared 5,000 dendrograms. It turned out that, irrespective of industry, topic of the analysis or region, in 1 out of 3 areas there is a harmful level of disagreement, harmful meaning enough to stall the working process. And this misalignment is not so much because people are fighting or do not want to change: respondents can move very enthusiastically in opposite directions, netting a no-change effect.
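
The "tweak" applied to the dendrogram is not described here, so the following is only a sketch of the plain technique: average-linkage hierarchical clustering written in standard-library Python. The distance measure, the function names and the idea of reading disagreement off the final merge height are all assumptions made for illustration. What counts as a "harmful" level would still need a threshold calibrated on real projects.

```python
from itertools import combinations

def distance(a, b):
    """Mean absolute difference between two respondents' answer vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def merge_heights(answers):
    """Average-linkage agglomerative clustering; returns the merge heights.

    answers: one list of scores per respondent, all of equal length.
    The heights are what a dendrogram draws on its vertical axis.
    """
    clusters = [[i] for i in range(len(answers))]
    heights = []
    while len(clusters) > 1:
        best = None
        for p, q in combinations(range(len(clusters)), 2):
            # Average pairwise distance between members of the two clusters.
            d = sum(distance(answers[i], answers[j])
                    for i in clusters[p] for j in clusters[q])
            d /= len(clusters[p]) * len(clusters[q])
            if best is None or d < best[0]:
                best = (d, p, q)
        d, p, q = best
        heights.append(d)
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return heights

def disagreement(answers):
    """Height of the final merge: how far apart the two main camps are."""
    heights = merge_heights(answers)
    return heights[-1] if heights else 0.0

# Two camps that each agree internally but sit far from each other.
print(disagreement([[2, 2], [2, 3], [8, 8], [9, 8]]))
```

For larger groups a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` would be the more practical choice; the hand-written loop is used here only to keep the sketch self-contained.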

Take away for managers
There is always resistance; you just don’t know where. Use dendrograms or a similar analysis to find out where there is resistance, and how heavy it is.

Pattern 3: 20% of the topics is 50% of the gap
We designed our own graph to compare the respondents’ actual situation score with the management target. In every group of respondents there was underscore (respondents not having achieved the management target) and overscore (respondents already achieving, or surpassing, the management target). But looking at the macro picture, it turned out that, on average, 20% of the assessment topics/soft controls accounted for 50% of the gap between actual situation and management target.
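
Finding the few topics that carry most of the gap is a simple ranking exercise. The sketch below, in Python, picks the smallest set of topics whose shortfall covers a given share of the total gap. The function name, the topic names and the choice to ignore overscore when summing the gap are assumptions for illustration, not the method behind the graph described above.

```python
def topics_covering_gap(gaps, share=0.5):
    """Return the smallest set of topics covering `share` of the total shortfall.

    gaps: dict mapping topic -> (management target - actual score).
    Negative values (overscore) are left out of the total; this is an assumption.
    """
    shortfalls = {topic: gap for topic, gap in gaps.items() if gap > 0}
    total = sum(shortfalls.values())
    chosen, running = [], 0.0
    ranked = sorted(shortfalls.items(), key=lambda item: item[1], reverse=True)
    for topic, gap in ranked:
        if running >= share * total:
            break
        chosen.append(topic)
        running += gap
    return chosen

# Hypothetical gaps for five sales topics; "events" is an overscore.
example = {"pricing": 5.0, "training": 3.0, "crm": 1.0, "leads": 1.0, "events": -2.0}
print(topics_covering_gap(example))
```

In this made-up example one topic out of five already covers half of the shortfall, which is the 20%/50% shape described above.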

Take away for managers
Only a limited set of soft controls helps to close the gap with the target quickly, which means managers can focus their resources where they achieve the most progress.

Causality between hard and soft controls is the new pinnacle in Big Data

We showed you how soft controls can be made sufficiently “hard” to calculate patterns. So, if soft controls turn out to be hard enough, the next step is to correlate hard and soft controls. We can ask owners of orange cars about their driving and maintenance habits and compare them with owners of cars in other colours. We have already correlated a whole set of innovation-related soft controls with the amount of revenue from innovations. In retail, we correlated in-store activities with revenue growth and developed an early warning system that flags stores that bet heavily (the “In 6 months” scores) on activities that proved to correlate negatively with revenue growth. Stay tuned for more.
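
As a rough illustration of the retail early warning idea, the Python sketch below correlates activity scores with revenue growth across stores and flags stores that plan a large increase in a negatively correlated activity. The function names, the data layout and the thresholds `max_r` and `min_increase` are hypothetical. Note that this only screens for correlation; establishing causality still requires asking the verifiable follow-up questions.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / (spread_x * spread_y)

def risky_activities(activity_scores, revenue_growth, max_r=-0.3):
    """Activities whose scores correlate negatively with revenue growth.

    activity_scores: activity -> list of scores, one per store.
    revenue_growth: list of growth figures in the same store order.
    """
    return [name for name, scores in activity_scores.items()
            if pearson(scores, revenue_growth) <= max_r]

def early_warnings(plans, risky, min_increase=2.0):
    """Stores whose "In 6 months" score bets heavily on a risky activity.

    plans: store -> {activity: (actual_score, in_6_months_score)}.
    """
    flagged = []
    for store, activities in plans.items():
        for name in risky:
            actual, planned = activities.get(name, (0, 0))
            if planned - actual >= min_increase:
                flagged.append(store)
                break
    return flagged

# Made-up data: four stores, two activities.
growth = [4.0, 3.0, 2.0, 1.0]
scores = {"discount_flyers": [1, 2, 3, 4], "staff_training": [4, 3, 2, 1]}
plans = {
    "store_a": {"discount_flyers": (3, 8)},
    "store_b": {"discount_flyers": (3, 4), "staff_training": (2, 9)},
}
print(early_warnings(plans, risky_activities(scores, growth)))
```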

Just correlation is not enough. Causality! Soft controls!
Ask people verifiable questions about actual situation and ambition
Pattern 1: 70-20-10
Pattern 2: harmful disagreement in 1 out of 3 areas
Pattern 3: 20% of the topics is responsible for 50% of the gap
The new pinnacle in Big Data: causality between hard and soft controls
