Consider this price and sales table for four cities:
Can you figure out how each city is performing? Notice that the average for each city is the same.
Now take a look at the same data, plotted.
The patterns are a lot clearer now, and you can quickly see that:
- The four cities behave completely differently and need different strategies for growth.
- Delhi is price sensitive, while Bangalore and Hyderabad are not.
- Hyderabad and Mumbai each have at least one data point that looks like an aberration.
This is an example of the issue the industry faces today: significantly larger quantities of data, still presented as plain tables.
This is an anonymised version of our very first visualization.
We were working with a leading mobile operator who provided us with minutes-of-usage data. We plotted this time-series on a calendar, creating the Calendar Map you see above. Red cells show days with lower usage, and green cells show days with higher usage.
This made it possible to spot a number of patterns that were relatively hidden until then. For example, on this calendar map, it's obvious that call volumes are lower on Sundays. But 31st July was a relatively good Sunday, with high call volumes. That's tough to spot on a line graph because it's not high in absolute terms -- just high for a Sunday.
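The "high for a Sunday" idea can be sketched in a few lines of Python by comparing each day against the average for its own weekday rather than against the overall series. This is only an illustration, not the method used for the client: the function name `weekday_relative`, the usage figures, and the year (chosen so that 31 July falls on a Sunday) are all assumptions made up for the example.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def weekday_relative(usage):
    """Return each day's usage divided by the mean usage for its weekday."""
    by_weekday = defaultdict(list)
    for day, value in usage.items():
        by_weekday[day.weekday()].append(value)
    baseline = {wd: mean(vals) for wd, vals in by_weekday.items()}
    return {day: value / baseline[day.weekday()] for day, value in usage.items()}


# Synthetic data, for illustration only: four weeks starting on a Monday,
# 100 units on Monday to Saturday, 60 on Sundays, and one busier Sunday at 90.
start = date(2011, 7, 4)  # a Monday; the year is an assumption
usage = {}
for offset in range(28):
    day = start + timedelta(days=offset)
    usage[day] = 60 if day.weekday() == 6 else 100
busy_sunday = date(2011, 7, 31)
usage[busy_sunday] = 90

relative = weekday_relative(usage)
standout = max(relative, key=relative.get)
print(standout, round(relative[standout], 2))
```

In absolute terms the busy Sunday is still below every weekday, so a line graph would not flag it. Relative to other Sundays it is the strongest day in the month, which is the comparison a calendar layout makes easy by lining up each weekday in its own column.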
With this visualization, our client discovered a number of insights into the calling patterns of their customer segments. For example, the share of rural traffic rises on Sundays, mainly because urban traffic falls while rural traffic is unaffected. It also made it possible for them to identify specific days on which their competitors' call volumes shot up, and helped them identify which competitors' campaigns were proving effective against them.
Here's a visualization of the social network of geeks across different cities in India. An interactive version of this is available at http://gramener.com/codersearch. We built this to identify who would be a good candidate to hire, as well as decide which city is the best hunting ground for geeks.
Each circle represents a developer. The size indicates the number of followers they have on GitHub. The color indicates the language they code in. Networks of followers are connected by lines and clustered together.
This is an instance of transforming relatively unstructured data into quantitative metrics (distance between a pair of people, density of a network, etc.) and displaying them purely visually, without any numbers. As a result, it conveys far more richness and meaning intuitively to the viewers.
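As a rough sketch of what such metrics look like before they are turned into a picture, the Python below computes the density of a small network and the distance (fewest hops) between two people. The names and follower links are made up, the helper functions `density` and `distance` are hypothetical, and treating follower links as undirected is a simplifying assumption, not a description of how the visualization above was built.

```python
from collections import deque


def density(nodes, edges):
    """Share of all possible undirected links that are actually present."""
    n = len(nodes)
    return 0.0 if n < 2 else 2 * len(edges) / (n * (n - 1))


def distance(edges, source, target):
    """Fewest hops between two people (breadth-first search), or None if unconnected."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        person, hops = queue.popleft()
        if person == target:
            return hops
        for nxt in neighbours.get(person, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None


# Hypothetical follower links between made-up developers.
people = ["asha", "bala", "chitra", "dev", "esha"]
links = [("asha", "bala"), ("bala", "chitra"), ("chitra", "dev")]

network_density = density(people, links)
hops_asha_dev = distance(links, "asha", "dev")
hops_asha_esha = distance(links, "asha", "esha")
print(network_density, hops_asha_dev, hops_asha_esha)
```

A layout algorithm can then place closely connected people near each other, so the viewer reads these numbers as proximity and clustering instead of as figures.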
Another instance is this visualization of the entire history of batting in Indian one-day cricket. The size of the box represents the number of runs scored by the player. The color indicates the speed at which they scored those runs (red is slow, green is fast).
It's evident that among the big scorers, Sehwag is India's fastest run-getter. Clicking on a player shows a second, drill-down view featuring every match they've played. An interactive version of this is available at http://gramener.com/cricket/batting-India-plain
This compresses over 150 pages of information into a single sheet without any loss. Part of the power of data visualization comes from this ability to compress information and convey insights compactly.
We're moving into visualizations of non-quantitative data. There's a lot more text out there than numbers, and it's possible to mine information from that. For example, even a pure-text corpus like the Mahabharata lends itself to social network analysis.
Anand has advised and designed IT systems for organisations such as the Aditya Birla Group, Citigroup, Honda, ICICI, IBM, Oracle, RBS & SAP, Steelcase, Tesco, etc. He has an MBA from IIM Bangalore with two gold medals and a B.Tech from IIT Madras. He has worked at IBM, Lehman Brothers, The Boston Consulting Group and Infosys Consulting, and co-founded Gramener, where he is Chief Data Scientist. He blogs at s-anand.net.