Can You Always Bet Big On Machine Learning?

Machine learning is certainly an umbrella term for many methodologies and tools, but one must be clear that it is not an umbrella term for all solutions. No one can deny that machine learning has revolutionised the way data can be mined for discoveries.

What one should care about is that the advancement of any technology also depends on a relentless, introspective effort to attack its shortcomings. The rise in popularity lures every amateur into believing that they have reached their destination. With tools and frameworks being open-sourced, everyone can play with data, experiment on the MNIST dataset and get really good accuracy scores. But one should always ask whether these results translate to larger problems. Do these accuracies replicate for complex human tasks like speech recognition and object detection?


If the ultimate aim of AI is to replicate human behaviour, then a few problems will lurk around for a while. AI might have managed to beat chess grandmasters, but can it stand a chance against the language-learning capabilities of a five-year-old? Can machine learning algorithms correctly predict the next economic shutdown?



Most of these questions fall on the ethical side of the spectrum. But the technical side, too, presents difficult obstacles on the road from today's AI to general AI.

The problem in the early ‘50s was largely computational. There were theories and mathematical proofs, but there weren’t many machines on which to test the algorithms.


Later, the bottleneck was the lack of data to work on. Collecting data manually was tedious enough, not to mention the questionable authenticity of the sources that generated it.

Skip to the ‘80s, and there was considerable advancement in computation, but what appeared out of the blue was our own lack of understanding of human intelligence.

Now we have the best hardware for accelerated computation, frameworks that collect data, and the cloud to store and access data in real time. But even 40 years after the predictions of pioneers like Marvin Minsky, we are still struggling to find solutions to the inherently mysterious problems of human understanding and consciousness.

Problems outside a few niches (vision, speech, NLP, robotics) aren’t clearly amenable to this approach. For example, action-recognition datasets generally include event videos without other objects appearing nearby unless the object is actually used (e.g. a chair, stool or bed), and as a consequence occlusion scenarios are rarely represented. The lack of occlusions in most existing datasets paints an unrealistic picture of virtually all indoor (i.e. home) environments. In the event of an occluded action, therefore, current algorithms are largely untested.

Let’s list a few of the shortcomings in the fundamental concepts, as observed by machine learning scientist John Langford:

Bayesian Learning

Explicitly specifying a reasonable prior is often hard and human-intensive. Partly because of that difficulty, and partly because “first specify a prior” is built into the framework itself, this approach is not very automatable.
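To see why the prior matters so much, here is a toy Beta-Bernoulli sketch (all numbers are invented for illustration): with only three observations, two “reasonable” priors give very different posterior means.

```python
# Toy Beta-Bernoulli example: with scarce data, the posterior is
# dominated by the prior, and two analysts with different "reasonable"
# priors reach different conclusions from the same observations.

def posterior_mean(alpha, beta, heads, tails):
    """Posterior mean of a Bernoulli parameter under a Beta(alpha, beta) prior."""
    return (alpha + heads) / (alpha + beta + heads + tails)

# Same 3 observations (2 successes, 1 failure), different priors:
uniform = posterior_mean(1, 1, heads=2, tails=1)    # Beta(1, 1): uninformative
sceptic = posterior_mean(1, 10, heads=2, tails=1)   # Beta(1, 10): strong scepticism

print(round(uniform, 3))  # 0.6
print(round(sceptic, 3))  # 0.214
```

With more data the two posteriors would converge, but in small-data regimes the “first specify a prior” step is exactly the hard, human-intensive part Langford points at.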

Convex Optimisation

Limited models. Although switching to a convex loss makes some optimisations convex, optimisation over representations that aren’t single-layer linear combinations is often difficult.
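A small sketch of the point (data and models are made up): the midpoint test `L((a+b)/2) <= (L(a)+L(b))/2` holds everywhere for a convex loss. Squared loss over a single-layer linear model passes it, while the same loss over a nonlinear representation can fail it.

```python
import math

# Two tiny training points (x, y); all values are illustrative.
data = [(1.0, 1.0), (2.0, -1.0)]

def linear_loss(w):
    # Squared loss of a single-layer linear model f(x) = w*x: convex in w.
    return sum((w * x - y) ** 2 for x, y in data)

def nonlinear_loss(w):
    # Same loss, but over a nonlinear representation f(x) = sin(w*x):
    # no longer convex in w.
    return sum((math.sin(w * x) - y) ** 2 for x, y in data)

a, b = 2.0, 6.0
mid = (a + b) / 2

# Convex loss passes the midpoint test; the nonlinear one fails it here.
print(linear_loss(mid) <= (linear_loss(a) + linear_loss(b)) / 2)        # True
print(nonlinear_loss(mid) <= (nonlinear_loss(a) + nonlinear_loss(b)) / 2)  # False
```

Failing the midpoint test at even one pair of points proves the landscape is non-convex, which is why convex-optimisation guarantees evaporate once the representation is richer than a linear combination.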

Gradient Descent

There are issues with parameter initialisation, step size, and representation. It helps a great deal to have accumulated experience using this sort of system and there is little theoretical guidance.
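The step-size issue in particular can be shown in a few lines (a deliberately minimal sketch on f(x) = x², not any particular library’s optimiser): one learning rate converges, a slightly larger one diverges, and no theory in the loop tells you which you picked.

```python
def run_gd(lr, steps=50, x0=1.0):
    # Plain gradient descent on f(x) = x^2, whose gradient is f'(x) = 2x.
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x
    return x

good = run_gd(lr=0.1)   # update factor 0.8 per step: shrinks towards 0
bad = run_gd(lr=1.1)    # update factor -1.2 per step: overshoots and blows up

print(abs(good) < 1e-3)  # True
print(abs(bad) > 1e3)    # True
```

On this one-dimensional quadratic the safe range is known analytically (0 < lr < 1), but on a real network no such formula exists, which is why practice leans so heavily on accumulated experience.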

Kernel-based learning

Specifying the kernel is not easy for some applications (this is another example of prior elicitation), and the O(n²) cost of kernel methods is not efficient enough when there is a lot of data.
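The O(n²) cost is easy to make concrete (a minimal sketch with an RBF kernel on scalar inputs; the data is arbitrary): kernel methods need the full matrix of pairwise similarities, so the work and memory grow with the square of the number of points.

```python
import math

def rbf(x, z, gamma=1.0):
    # Radial basis function kernel between two scalars.
    return math.exp(-gamma * (x - z) ** 2)

def gram(points):
    # The Gram matrix holds every pairwise similarity: n^2 entries.
    return [[rbf(a, b) for b in points] for a in points]

n = 1000
K = gram([i / n for i in range(n)])

print(len(K) * len(K[0]))  # 1000000 -- a million entries for just 1000 points
```

Doubling the dataset quadruples this matrix, which is exactly why naive kernel methods stop being practical at the dataset sizes modern applications demand.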

Boosting

The boosting framework tells you nothing about how to build the initial weak-learning algorithm, and the weak learning assumption is violated at some point in the iterative process.
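The second failure mode is visible directly in AdaBoost’s vote formula, α = ½·ln((1−ε)/ε), where ε is the weak learner’s weighted error (a sketch of just that formula, not a full boosting loop): once the weak learner stops beating chance, its vote drops to zero and the iteration makes no further progress.

```python
import math

def adaboost_weight(error):
    # AdaBoost's vote for a weak learner with weighted error `error`:
    # 0.5 * ln((1 - error) / error).
    return 0.5 * math.log((1 - error) / error)

print(round(adaboost_weight(0.1), 3))  # 1.099 -- well below chance error, strong vote
print(adaboost_weight(0.5))            # 0.0   -- no better than chance, zero vote
```

As boosting reweights the data towards the hard examples, the weak learner’s error creeps up towards 0.5, and this is the point at which the weak learning assumption breaks down.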

Decision tree learning

There are learning problems which cannot be solved by decision trees but which are solvable by other means, and it’s common to find that other approaches give you a bit more performance. A theoretical grounding for many of the choices in these algorithms is also lacking.
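One classic illustration (a toy sketch with made-up points): on XOR-style data, no single axis-aligned split — i.e. no depth-1 tree, and no informative first split for a greedy tree grower — does better than chance, even though the data is perfectly separable by a simple rule on the product of the coordinates.

```python
# XOR-style data: the label is 1 exactly when the two signs agree.
points = [(-1, -1, 1), (-1, 1, 0), (1, -1, 0), (1, 1, 1)]

def best_stump_accuracy(points):
    # Try every axis-aligned threshold (i.e. every depth-1 decision tree).
    best = 0.0
    for axis in (0, 1):
        for thr in (-2, 0, 2):
            for positive_side in (True, False):
                correct = sum(
                    ((p[axis] > thr) == positive_side) == bool(p[2])
                    for p in points
                )
                best = max(best, correct / len(points))
    return best

print(best_stump_accuracy(points))  # 0.5 -- no single split beats chance
print(all((x * y > 0) == bool(label) for x, y, label in points))  # True: separable
```

A deep enough tree can memorise this particular dataset, but the greedy split criterion sees zero gain on the first cut, which is the kind of gap between what is learnable in principle and what the algorithm actually finds.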

The current error-reducing, cost-cutting methodologies will flourish in finance, movie recommendations and other non-fatal avenues. In medical diagnosis or self-driving cars, however, there is no excuse for merely passable accuracy scores. So if AI is deemed to shoulder the future of our species, it is only reasonable to expose its flaws at this nascent stage.


Machine learning, at its core, is a set of statistical methods meant to find patterns of predictability in datasets. Is your problem the kind of problem where getting things right 80% of the time is enough? Can you deal with an error rate? Bad examples include predicting profits from the introduction of a completely new and revolutionary product line, or extrapolating next year’s sales from past data when an important new competitor has just entered the market.
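Whether an error rate is tolerable is really a question about costs, which a back-of-the-envelope calculation makes plain (every number below is invented purely for illustration): the same 80% accuracy is cheap in one domain and ruinous in another.

```python
# Hypothetical cost comparison -- all cost figures are made up.
accuracy = 0.8
error_rate = round(1 - accuracy, 2)

cost_bad_movie_rec = 1          # a mildly annoyed user
cost_missed_diagnosis = 10_000  # an arbitrary severity weight, vastly higher

# Expected cost per prediction = error rate * cost of one error.
print(error_rate * cost_bad_movie_rec)
print(error_rate * cost_missed_diagnosis)
```

The point is not the specific numbers but the asymmetry: when a single error is catastrophic, "80% right" is not a passing grade, no matter how good it looks on a leaderboard.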

Even in the fully supervised setting, a predictive model is only as good as the data on which it is trained. Current datasets are rather limited and unrepresentative in terms of variability in physical characteristics and patterns of behaviour, as well as because of issues around scene setup, occlusion, data adaptation and privacy, among others.

To approach general AI, one area that deserves more focus is the learning patterns found in nature. Such self-learning would outclass pre-constrained models and might lead the way to a more trustworthy AI.



Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
