Correlation Analysis Can Be a First Step to Insights, but Proceed with Caution

The value of every information system is the opportunity for insight. Once an organization has insight, all things are possible. With insight comes new opportunity: to make money, to save money, to improve goods and services, and so forth.

But where does insight come from? Insight can come as easily as glancing out a window and seeing a rainbow, or it can be as difficult as investing money in a dot-com startup only to discover that there never was a business case for what was being built.

But one of the surest ways to spark the fire of insight is through correlation analysis. Correlation analysis is the analysis of events that occur together. Suppose a study is about event A, but in gathering information about event A it is noticed that another event, event B, frequently happens alongside it. From an insight perspective, the fact that two events occur in tandem is a great opportunity. Some typical questions are:

  • Does event A cause event B? If so, under what conditions?
  • Is there another cause of event A and event B? If so, what is the cause and under what conditions are the different events triggered?
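
Before asking questions like these, it helps to measure how strongly two events actually travel together. The following is a minimal sketch, assuming each record is simply the set of events observed in one encounter; the event names, the sample records, and the choice of the phi coefficient as the measure are illustrative assumptions, not something prescribed here.

```python
import math

def phi_coefficient(records, a, b):
    """Phi coefficient (correlation of two yes/no events) across records.

    Each record is a set of event labels. The result ranges from -1 to 1;
    0 means the events occur together about as often as chance predicts.
    """
    n = len(records)
    n_a = sum(1 for r in records if a in r)
    n_b = sum(1 for r in records if b in r)
    n_ab = sum(1 for r in records if a in r and b in r)
    denom = math.sqrt(n_a * (n - n_a) * n_b * (n - n_b))
    if denom == 0:
        # One event never happens or always happens: no correlation to measure.
        return 0.0
    return (n * n_ab - n_a * n_b) / denom

# Hypothetical records: which events were observed in each encounter.
records = [
    {"A", "B"}, {"A", "B"}, {"A"}, {"B"},
    set(), set(), {"A", "B"}, set(),
]
print(phi_coefficient(records, "A", "B"))
```

A value well above zero says only that A and B tend to occur together. It does not say whether A causes B or whether both share another cause.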

Caution must be exercised when making inferences based on the coincidence of events. Given enough factors, a correlation can be discovered that is nonsensical. Just because two factors move in concert with each other does not necessarily imply a real relationship.

A famous example was the correlation between the winner of the Super Bowl and whether the stock market would rise or fall that year. For many years, when a team with roots in the original NFL won the Super Bowl the stock market would rise, and in years when a team from the old AFL won the market would fall.

Of course, there is no actual relationship between professional football and the stock market. The alignment of the events was pure coincidence.
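
The way coincidences like this arise can be sketched in a few lines. The example below is only an illustration under stated assumptions: it invents a random "target" series of 20 yearly values and 1,000 equally random "factors" (the function name, sizes, and seed are all hypothetical), then reports the strongest correlation found among them. Nothing in the data is related to anything else.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def best_spurious_correlation(n_years=20, n_factors=1000, seed=0):
    """Strongest correlation between a random target and random factors."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_years)]
    best = 0.0
    for _ in range(n_factors):
        factor = [rng.gauss(0, 1) for _ in range(n_years)]
        r = pearson(target, factor)
        if abs(r) > abs(best):
            best = r
    return best

print(best_spurious_correlation())
```

With so few years and so many candidate factors, the strongest match will typically look impressive even though it is noise. The more factors that are searched, the more skeptical one should be of any single correlation that turns up.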

Having stated that, there are many cases where there is a real reason for the correlation of events. When there is a real relationship between events, correlation analysis is a powerful analytic tool.

There are plenty of other possibilities for insight when using correlation analysis.

Correlation analysis is never more powerful, and never more pregnant with possibilities, than when applied to the medical and healthcare environment. When a medical or healthcare organization gathers all of its episodes of care and other encounters between doctors and patients, and then integrates and assimilates the associated events, insightful conclusions can result. Often the results are very interesting and unexpected.

Looking at the natural correlations that have evolved over the years in the practice of medicine can be very useful. Often no major patterns are discerned by any one doctor (short of alerting his or her intuition) because the doctor sees only one patient at a time: the patient immediately in front of him or her.

But given many observations taken across many doctors and patients over a lengthy period of time, medical and health patterns start to emerge that have otherwise gone unnoticed. And sometimes these patterns have very profound implications for healthcare.

One of the challenges of doing meaningful correlation analysis is that many events must be recorded. It doesn’t do much good to examine 50 or even 500 instances of medical care. Instead, 500,000 or 5,000,000 instances of care are much better for spotting previously unseen correlations and patterns.
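
One way to see why volume matters is to ask how faint a correlation can be and still stand out from chance at a given sample size. The sketch below uses the standard Fisher z-transformation as a rough guide; the function name and the 95% threshold are assumptions chosen for illustration, and the approximation presumes roughly normally distributed measurements.

```python
import math

def smallest_detectable_r(n, z_crit=1.96):
    """Approximate smallest |r| distinguishable from zero with n records.

    Uses the Fisher z-transformation, whose standard error is
    1 / sqrt(n - 3); z_crit=1.96 corresponds to a two-sided 95% level.
    """
    return math.tanh(z_crit / math.sqrt(n - 3))

for n in (50, 500, 500_000, 5_000_000):
    print(f"{n:>9,} records: |r| must exceed about {smallest_detectable_r(n):.4f}")
```

With 50 records only a correlation of roughly 0.28 or stronger can be told apart from noise, while at 500,000 records correlations below 0.003 become distinguishable. Being detectable is not the same as being meaningful, so the earlier caution about coincidence still applies.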

Another challenge is dealing with the data. Medical and healthcare data is notoriously unstructured, and it is usually textual. In addition, the nomenclature for the same event or activity varies widely. Recently a knowledgeable physician told me that there were at least 15 ways to describe the same thing: a broken bone.
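
A common way to cope with varied nomenclature is to map every known variant onto one canonical term before counting events. The sketch below assumes a small hand-made synonym table; the entries and the name `normalize_term` are hypothetical, and a real table would come from a curated clinical vocabulary rather than a short dictionary.

```python
import re

# Hypothetical synonym table: variant wording -> canonical term.
CANONICAL_TERMS = {
    "broken bone": "fracture",
    "bone break": "fracture",
    "fractured bone": "fracture",
    "fx": "fracture",
}

def normalize_term(text):
    """Lowercase, collapse whitespace, then map known variants to one term."""
    key = re.sub(r"\s+", " ", text.strip().lower())
    # Unknown terms pass through unchanged (apart from the cleanup above).
    return CANONICAL_TERMS.get(key, key)

notes = ["Broken  bone", "Fx", "fracture", "Sprain"]
print([normalize_term(n) for n in notes])
```

Once the variants collapse to a single term, events that were scattered across many labels can be counted together.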

Yet another problem with meaningful correlation analysis is that in some cases the same term is spelled in different ways. To do meaningful correlation analysis, there needs to be a single consistent spelling of each term.
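
Spelling variants can often be pulled toward one accepted spelling with approximate string matching. The following is a minimal sketch using `difflib.get_close_matches` from the Python standard library; the vocabulary list, the function name, and the 0.8 similarity cutoff are assumptions chosen for illustration.

```python
import difflib

# Hypothetical list of accepted spellings.
VOCABULARY = ["hemorrhage", "fracture", "anesthesia"]

def canonical_spelling(term, vocabulary=VOCABULARY, cutoff=0.8):
    """Return the closest accepted spelling, or the term itself if none is close."""
    matches = difflib.get_close_matches(term.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else term

for word in ("haemorrhage", "anaesthesia", "sprain"):
    print(word, "->", canonical_spelling(word))
```

Fuzzy matching can also merge terms that are genuinely different, such as hypotension and hypertension, so candidate matches deserve human review before they feed into an analysis.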

Still another problem is that for correlation analysis to be effective, the results need to be visual. When correlation analysis is not visual, patterns that are unclear or faint tend to hide, lost in the massive amount of data and other more obvious patterns. But when visual techniques are used, the chances of spotting faint and unclear patterns are greatly enhanced.

The good news is that correlation analysis for the medical community is now a real possibility. Leading medical research firms are starting to use powerful new technology for very sophisticated correlation analysis.

It is reasonable to expect that today’s technology can spot important patterns that could not be found as little as a year ago.


Now you can hear and see Bill Inmon on the Internet. Take a look at his new videotape education series on safaribooksonline.com

Bill covers IT topics from A to Z.

William Inmon

William H. Inmon (born 1945) is an American computer scientist, recognized by many as the father of the data warehouse. Bill Inmon wrote the first book, held the first conference (with Arnie Barnett), wrote the first column in a magazine and was the first to offer classes in data warehousing. Bill Inmon created the accepted definition of what a data warehouse is - a subject oriented, nonvolatile, integrated, time variant collection of data in support of management's decisions.