Machine Learning in plain English
If someone asks you, “What is ML?”, what would your conceptual, non-technical answer be?
Mine is . . . ML is “cluster”, “classify” and “combine”.
I use these words in their English language sense and not as techniques. What do I mean by that?
Cluster: Structure in the data is information – find the structure.
Classify: Transform structure into a Mathematical form.
Combine: Convert into insight/ action.
Do this by Learning – meaning, use the ability to generalize from experience.
This captures the essence of ML for me. From my experience, I find that –
- Combine: best done by a “paired” Data Scientist – Domain Expert combo.
- Classify: there is a grab bag of tools and techniques that the Data Scientist can exploit on one’s own. You can see my attempt at unifying this bag of tricks here – “Unifying Machine Learning to create breakthrough perspectives”. http://pgmadblog.blogspot.com/2015/10/unifying-machine-learning.html
- Cluster: I am not referring to specific clustering *algorithms* here. This step is where the Data Scientist works to sense, identify and extract structure or patterns or features in the data which are the bearers of information!
“Cluster” is the hardest part – data do not tell you where they hide their structure. Finding patterns is an “art” in which inspiration, skill, experience, knowledge of inter-related theories, etc. play a major part. In an algorithm I am currently developing, it turned out (after *months* of slicing and dicing the data) that rendering the data as “phasors” (complex variables) revealed the hidden structure “by itself”!
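As a toy illustration of how a phasor (complex-variable) rendering can expose structure, here is a minimal sketch. It assumes a cyclic feature, hour of day, which is a hypothetical stand-in and not the data or the construction used in the algorithm mentioned above; the function name `to_phasor` is likewise made up for this example. On the raw scale 23:00 and 01:00 look far apart, while on the unit circle they are neighbours.

```python
import numpy as np

def to_phasor(hours, period=24.0):
    """Map a cyclic quantity onto the unit circle as complex numbers (phasors)."""
    angles = 2.0 * np.pi * np.asarray(hours, dtype=float) / period
    return np.exp(1j * angles)

# Hypothetical observations: two late-night events and one at noon.
hours = np.array([23.0, 1.0, 12.0])
phasors = to_phasor(hours)

# On the raw scale, 23:00 and 01:00 appear 22 hours apart.
raw_gap = abs(hours[0] - hours[1])

# In the complex plane, they are separated by a short chord (2 hours of arc).
phasor_gap = abs(phasors[0] - phasors[1])

# 23:00 and noon remain far apart, as they should.
noon_gap = abs(phasors[0] - phasors[2])

print(raw_gap, phasor_gap, noon_gap)
```

The point of the sketch is only that a change of representation, with no change in the data, can turn a hidden grouping into an obvious one.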
If you are able to get at the most descriptive and discriminative features at the “Cluster” stage, the remaining steps will (almost) fall into place and yield the most robust solution! If not, you may still succeed, but you will work many times harder to Classify and Combine and end up with sub-optimal answers.
It should be clear that my comments apply only to the first-time development of an algorithm for a new business problem; once an end-to-end algorithm is in place, of course, the Cluster-Classify-Combine steps can be automated for repeated application to similar data sets. But for first-time development of an ML solution, automation cannot replace art!
Why is Predictive Analytics important to business?
A prerequisite for performing at a high level in business is the ability to understand and manage complexity. Managing complex systems properly requires a ton of data at the right time. BIG Data gives us the data we need. To put these data to work, operating at the high levels of complexity required while still managing it, we have to anticipate what is about to happen and react when it happens, in a closed-loop manner. Predictive Analytics will allow us to push our “system” to the edge (without “falling over”) in a managed fashion. This is why businesses embrace Predictive Analytics – to manage businesses at a high level of performance at the edge of complexity overload.
Prediction – the other dismal science?
An insightful person once said, “Prediction is like driving your car forward by looking only at the rearview mirror!”. If the road is dead-straight, you are good . . . UNLESS there is a stalled vehicle ahead in the middle of the road.
We should consider short-term and long-term prediction separately. Long-term prediction is nearly a lost cause. In the 80’s and 90’s, chaos and complexity theorists showed us that things can spin out of control even when we have perfect past and present information (predicting weather beyond 3 weeks is a major challenge, if not impossible). Even earlier, stochastic process theory told us that “non-stationarity” where statistics evolve (slowly or fast) can render longer term predictions unreliable.
If the underlying systems do not evolve quickly or suddenly, there is some hope. For causal systems (in Systems Theory, this means that no future information of any kind is available in the current state of the system), where “the car is driven forward strictly by using the rearview mirror”, outcomes are predictable in the sense that, as long as the “road is straight” or “curves only gently”, we can be somewhat confident in predicting a few steps ahead. This may be quite useful in some Data Science applications (such as in Fintech).
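The contrast between short-term and long-term prediction can be seen in a textbook chaotic system. The sketch below uses the logistic map with r = 4, chosen here purely as a standard illustration (it is not a model taken from this post). Two trajectories start from states that differ by one part in ten billion, standing in for near-perfect knowledge of the present.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x) for a number of steps."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two almost identical starting states.
reference = logistic_trajectory(0.2, 100)
perturbed = logistic_trajectory(0.2 + 1e-10, 100)

# Gap between the two trajectories at every step.
gaps = [abs(p - q) for p, q in zip(reference, perturbed)]

short_term_error = max(gaps[:6])   # steps 0 to 5: still tiny
long_term_error = max(gaps[40:])   # steps 40 to 100: as large as the signal itself

print(short_term_error, long_term_error)
```

A few steps ahead, the tiny initial uncertainty stays tiny; dozens of steps ahead, the two "forecasts" have nothing to do with each other, even though the rule generating them is fully known.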
Another type of prediction involves not the actual path of future events (the “state space trajectories”, in the parlance) but the occurrence of a “black swan” or an “X-event” (for an elegant in-depth discussion, see John Casti, “X-Events: Complexity Overload and the Collapse of Everything”, 2013). For that matter, it is good to know about ANY unwanted event in advance. Consider unwanted destructive vibrations (called “chatter”) in machine tools as an example: early warning may be possible and very useful in saving expensive work pieces (“Instantaneous Scale of Fluctuation Using Kalman-TFD and Applications in Machine Tool Monitoring”). We find that the underlying system sometimes does undergo pre-event changes (such as approaching “complexity overload”, “state-space volume inflation”, “increase in degrees of freedom”, etc.) which may be detectable and trackable. However, there is NO escaping False Positives (with the associated waste of resources preparing for an event that does not come) or False Negatives (being blind-sided by an event we were told was not going to happen).
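The False Positive / False Negative trade-off can be made concrete with a small sketch. The warning scores and outcomes below are invented for illustration (they do not come from any machine tool or business data set), and `confusion_counts` is a hypothetical helper name. An alarm is raised whenever the score reaches a threshold.

```python
def confusion_counts(scores, event_followed, threshold):
    """Count false alarms and missed events for a given alarm threshold."""
    false_positives = sum(
        1 for s, e in zip(scores, event_followed) if s >= threshold and not e
    )
    false_negatives = sum(
        1 for s, e in zip(scores, event_followed) if s < threshold and e
    )
    return false_positives, false_negatives

# Hypothetical early-warning scores and whether the unwanted event followed.
scores = [0.1, 0.3, 0.35, 0.5, 0.6, 0.7, 0.8, 0.9]
event_followed = [False, False, True, False, True, False, True, True]

# A sensitive alarm misses nothing but cries wolf; a strict one does the opposite.
sensitive = confusion_counts(scores, event_followed, threshold=0.3)
strict = confusion_counts(scores, event_followed, threshold=0.75)

# Sweep every observed score as a candidate threshold.
sweep = {t: confusion_counts(scores, event_followed, t) for t in scores}

print(sensitive, strict)
```

Because the scores of events and non-events overlap, no threshold in the sweep drives both counts to zero; choosing the threshold is a business decision about which kind of error is cheaper.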
At Syzen Analytics, Inc., we use an explicit systems theory approach to Analytics. In our SYSTEMS Analytics formulation (“Future of Analytics – a definitive Roadmap”), the parameters of the system and their variation over time are tracked adaptively in real time, which tells us how far into the future we can safely predict – if the parameters evolve slowly or cyclically, we have higher confidence in our predictive analytics solutions.
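To give a flavour of what "tracking parameters adaptively" can look like, here is a minimal sketch using scalar recursive least squares with a forgetting factor. This is a generic textbook estimator, not the SYSTEMS Analytics formulation, and everything in it is an assumption made for illustration: the first-order system x[t+1] = a * x[t] + u[t], the known sinusoidal input, the noise-free measurements, and the sudden change of the parameter a from 0.9 to 0.5 halfway through.

```python
import math

def simulate_system(steps, change_at, a_before=0.9, a_after=0.5):
    """Generate x[t+1] = a * x[t] + u[t], with `a` changing at step `change_at`."""
    xs, us = [1.0], []
    for t in range(steps):
        a = a_before if t < change_at else a_after
        u = math.sin(0.3 * t)  # known input that keeps the system excited
        us.append(u)
        xs.append(a * xs[-1] + u)
    return xs, us

def track_parameter(xs, us, forgetting=0.95):
    """Scalar recursive least squares; older samples are discounted geometrically."""
    a_hat, p = 0.0, 1000.0  # initial guess and a large initial uncertainty
    estimates = []
    for t in range(len(us)):
        phi = xs[t]                  # regressor
        y = xs[t + 1] - us[t]        # the part of the next state explained by `a`
        gain = p * phi / (forgetting + phi * p * phi)
        a_hat += gain * (y - a_hat * phi)
        p = (p - gain * phi * p) / forgetting
        estimates.append(a_hat)
    return estimates

xs, us = simulate_system(steps=400, change_at=200)
estimates = track_parameter(xs, us)

# Estimate just before the change, and at the end of the run.
print(round(estimates[199], 3), round(estimates[399], 3))
```

While the estimate sits still, a few-steps-ahead forecast built on it deserves some confidence; when the estimate starts moving, as it does right after step 200, that is the signal to shorten the prediction horizon.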
Wanting to know the future has always been a human preoccupation – we see that you cannot truly know the future but in some cases, predictions to some extent are possible . . . surrounded by many caveats; more of “excuses” than definitive answers. Sounds a lot like a dismal science!
Future of Analytics – Spatio-temporal data:
As businesses push to higher levels of performance, higher fidelity models are going to be necessary to produce more accurate and hence valuable predictions and recommendations for business operations.
ALL data are spatio-temporal! From the simplest to more complex levels –
- Data can be considered isolated at the simplest level – a “snap shot”.
- Then we realize that data exist in a “social” network with mutual interactions.
- In reality, data exist in *embedded* forms in “influence” networks of one type or the other which are distributed in time and space – a “video”!
The spatial extent of data (distance) can be folded into time if we assume a certain information diffusion speed. Graph-theoretic methods do not account for the time dimension. For accurate analysis, there is no escaping Dynamics over Time, meaning the use of differential (or difference) equations . . . and Systems Theory!
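Folding distance into time can be sketched with a small difference equation on a network. In the hypothetical example below, three nodes sit on a line, influence travels along directed edges at an assumed diffusion speed, and each edge's delay is simply distance divided by speed, rounded to whole time steps. The update rule, the decay and weight values, and the function names are all illustrative assumptions, not a method described in this post.

```python
def simulate(positions, edges, speed, steps, decay=0.5, weight=0.8):
    """Run x_i[t] = decay * x_i[t-1] + weight * sum_j x_j[t - delay_ji]."""
    n = len(positions)
    # Fold space into time: delay (in steps) = distance / diffusion speed.
    delays = {
        (src, dst): max(1, round(abs(positions[dst] - positions[src]) / speed))
        for (src, dst) in edges
    }
    history = [[0.0] * n for _ in range(steps + 1)]
    history[0][0] = 1.0  # unit impulse injected at node 0 at time 0
    for t in range(1, steps + 1):
        for i in range(n):
            value = decay * history[t - 1][i]
            for (src, dst), d in delays.items():
                if dst == i and t - d >= 0:
                    value += weight * history[t - d][src]
            history[t][i] = value
    return history

def first_arrival(history, node):
    """Return the first time step at which a node's state becomes non-zero."""
    for t, state in enumerate(history):
        if state[node] != 0.0:
            return t
    return None

# Hypothetical layout: nodes at positions 0, 2 and 5, chained 0 -> 1 -> 2.
history = simulate(positions=[0.0, 2.0, 5.0], edges=[(0, 1), (1, 2)],
                   speed=1.0, steps=8)
arrival_1 = first_arrival(history, 1)
arrival_2 = first_arrival(history, 2)

print(arrival_1, arrival_2)
```

The graph alone says only that node 0 influences node 2 through node 1; the difference equation adds when that influence arrives and how strong it still is.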
Systems Theory + Analytics = “SYSTEMS Analytics”! A few example business applications are shown above. As you can see, it spans most of the current Analytics use cases and many more promising ones when network graphs and spatio-temporal nature of data are fully incorporated in the coming years – basic theories and some algorithms are already in hand. For specific technologies, see –
- “SYSTEMS Analytics – the future of Analytics today!” http://www.jininnovation.com/Systems-Analytics.html
- For a full 30-minute discourse, see the YouTube video “Future of Analytics – a definitive roadmap” at https://youtu.be/1TAYLQw3u9s
From the simple explanation of ML, the power of and caveats about prediction, and the promising Analytics technology roadmap ahead, it is clear that Data Science is indeed a rich area to mine, one that can create an even bigger impact on business performance in the coming years.