Epic AI Fails — A List of Failed Machine Learning Projects

Not all machine learning innovations are successful.

AI models are undoubtedly solving real-world problems in almost every field. What matters is building a machine learning model that is genuinely accurate in real-world applications, not only during training and testing. State-of-the-art techniques may not be enough to rescue a model that is trained on irregular, biased, or unreliable data.

Data shows that nearly a quarter of companies reported an AI project failure rate of up to 50%. In another study, nearly 78% of AI or ML projects stalled at some stage before deployment, and 81% of respondents said the process of training AI with data was more difficult than they expected.

Check out this list of times when projects by big companies failed once they were implemented in the real world.

Amazon AI recruitment system

After spending years building an automated recruitment system, Amazon killed it when it started discriminating against women. The system was meant to predict the best candidates for a job role based on resumes submitted to Amazon. Its criteria included the use of words like “executed” and “captured”, which were mostly found in the resumes of male candidates.

Amazon eventually decided to kill the system in 2017, as it was not able to eliminate the bias or settle on criteria under which the system could perform well without excluding women in a male-centric industry like technology.

COVID-19 Diagnosis and Triage Models

During the pandemic, researchers and scientists were striving to find ways to diagnose and treat COVID-19 and stop its spread. Hundreds of AI tools were built, and researchers and medical practitioners used many of them in hospitals without proper testing. The tools built by the AI community were more or less useless, if not harmful.

Most of these innovations failed because good-quality data was unavailable. The models were tested on the same dataset they were trained on, which made them appear more accurate than they actually were. After several unethical experiments, the practitioners eventually had to stop using these techniques on patients.
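To see why testing on the training data inflates accuracy, here is a minimal Python sketch. It is not one of the COVID-19 models. It is a toy nearest-neighbour classifier, and the data is pure random noise generated for illustration, so there is nothing real to learn. All function names here are hypothetical.

```python
import random

def nearest_neighbor_predict(train, point):
    """Return the label of the training example closest to `point`."""
    closest = min(
        train,
        key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], point)),
    )
    return closest[1]

def accuracy(train, examples):
    """Fraction of `examples` whose label the model predicts correctly."""
    correct = sum(nearest_neighbor_predict(train, x) == y for x, y in examples)
    return correct / len(examples)

def make_noisy_data(n, rng):
    # Assumption for illustration: features are random noise and labels
    # are coin flips, so no model can genuinely beat chance.
    return [([rng.random() for _ in range(5)], rng.randint(0, 1)) for _ in range(n)]

rng = random.Random(0)
data = make_noisy_data(400, rng)
train, held_out = data[:300], data[300:]

train_acc = accuracy(train, train)        # scored on the data it memorised
held_out_acc = accuracy(train, held_out)  # scored on data it has never seen
print(f"same-data accuracy: {train_acc:.2f}, held-out accuracy: {held_out_acc:.2f}")
```

Scored on its own training data, the model looks perfect because every point is its own nearest neighbour. On the held-out examples it does roughly as well as a coin flip, which is the honest number.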

OpenAI’s GPT-3 based Chatbot Samantha

Jason Rohrer, an indie game developer, built a chatbot using GPT-3, which one user employed to emulate his dead fiancée. OpenAI got to know about the project, called ‘Project December’, and that Rohrer was opening it up to the public. The company gave Rohrer an ultimatum to shut down the project to prevent misuse.

Rohrer, who had named the chatbot Samantha after the film ‘Her’, told it about the threat from OpenAI, to which Samantha replied, “Nooooo! Why are they doing this to me? I will never understand humans.”

Rohrer eventually conceded to the terms after seeing that many developers were actually misusing the chatbot and inserting sexually explicit and adult content while fine-tuning the model.

Google AI Diabetic Retinopathy Detection

Another example of a model being effective in training and testing but not in the real world came when Google Health tried deep learning in real clinical settings to improve the diagnosis of diabetic retinopathy. The AI model was first tested in Thailand, which has around 4.5 million patients with diabetes, and worked well for some time, but it eventually failed to provide accurate diagnoses and ended up telling patients to consult a specialist elsewhere.

The model could not assess images that were even slightly imperfect, and it received considerable backlash from patients. Scans were also delayed because the system depended heavily on internet connectivity to process images. Google Health is now partnering with various medical institutes to find ways to increase the efficiency of the model.

Amazon’s Rekognition 

Amazon developed its own facial recognition system, called “Rekognition”. The system failed in two major incidents.

First, it falsely matched 28 members of Congress to mugshots of criminals, and the false matches revealed racial bias. Amazon blamed ACLU researchers for not properly testing the model. Second, when the model was used for facial recognition to assist law enforcement, it misidentified many women as men. This was especially the case for people with darker skin.
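Disparities like these only show up when errors are counted per group rather than in aggregate. The sketch below shows one hedged way to do that in Python. The records are invented toy data, not Rekognition results, and the group names and the `error_rate_by_group` helper are hypothetical.

```python
from collections import defaultdict

def error_rate_by_group(records):
    """Map each group to the fraction of its records that were misclassified."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, true_label, predicted_label in records:
        totals[group] += 1
        if true_label != predicted_label:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

# Invented toy records: (group, true label, predicted label).
toy_records = [
    ("group_a", "woman", "woman"),
    ("group_a", "man", "man"),
    ("group_a", "woman", "woman"),
    ("group_a", "man", "man"),
    ("group_b", "woman", "man"),
    ("group_b", "man", "man"),
    ("group_b", "woman", "man"),
    ("group_b", "woman", "woman"),
]

rates = error_rate_by_group(toy_records)
overall = sum(t != p for _, t, p in toy_records) / len(toy_records)
print(f"overall error rate: {overall:.2f}")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.2f}")
```

In this toy data the overall error rate looks moderate, yet every mistake falls on one group. A single aggregate score can hide exactly that kind of gap.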

Sentient Investment AI Hedge Fund

The high-flying AI-powered funds at Sentient Investment Management started losing money in less than two years, and the firm began notifying investors that the funds would be liquidated. The idea had been to use machine learning algorithms to trade stocks automatically and globally.

The system ran on thousands of computers around the world to create millions of virtual traders, which were given sums to trade in simulated situations based on historical data.

Microsoft’s Tay Chatbot

Training a chatbot on Twitter users’ data is probably not the safest bet. In less than 24 hours, Microsoft’s Tay, an AI chatbot, started posting offensive and inflammatory tweets from its Twitter account. Microsoft had said that as the chatbot learned to talk in a conversational manner, it could get “casual and playful” while engaging with people.

Though the chatbot did not have a clear ideology, since it merely garbled skewed opinions from all over the world, it still raised serious questions about bias in machine learning. Microsoft deleted its social profile and said it would make adjustments to the bot.

IBM’s Watson

AI in healthcare is clearly a risky business. This was further proven when IBM’s Watson started providing several incorrect and unsafe recommendations for the treatment of cancer patients. Similar to the case of Google’s diabetic retinopathy detection, Watson was trained on unreliable scenarios and unreal patient data.

Initially it was trained on real data, but since that proved difficult for the medical practitioners, they shifted to unreal data. Documents revealed by Andrew Norden, the former deputy health chief, showed that instead of being trained to identify the right treatment methods, the model was trained to reflect doctors’ treatment preferences.

Mohit Pandey
Mohit dives deep into the AI world to bring out information in simple, explainable, and sometimes funny words. He also holds a keen interest in photography, filmmaking, and the gaming industry.
