Why Deep Learning Is A Costly Affair

Deep learning models have brought great success to NLP applications, thanks to the untiring efforts of the ML community to improve their accuracy. These improvements, however, come at a cost: the computational resources and time required add up with every round of tweaking. Large NLP models in particular have become popular, with Microsoft, Google and NVIDIA all releasing them in the past couple of years. But how much it costs to train these models is rarely talked about. To investigate this, Israeli company AI21 Labs published a study detailing the costs of model training.

The researchers at AI21 Labs have quantitatively estimated the cost of training differently sized BERT models on the Wikipedia and Book corpora (15 GB). For each model size they report the cost of one training run (the lower figure) and a typical fully-loaded cost (the higher figure). The following figures are based on the experiments carried out at AI21 Labs:

  • $2.5k – $50k (110 million parameters)
  • $10k – $200k (340 million parameters)
  • $80k – $1.6m (1.5 billion parameters)

However, the authors admit that the actual cost can be lower than the figures above if preemptible versions of the system are used, though not by much. The figures also assume cloud solutions such as GCP or AWS; on-premise implementations are sometimes cheaper. Still, they provide a general sense of the costs.
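The ranges quoted above can be sanity-checked with a few lines of arithmetic. The sketch below assumes the lower figure in each range is the single-run cost and the upper figure is the fully-loaded cost; the dictionary and function names are illustrative and do not come from the AI21 Labs work.

```python
# Cost ranges quoted in the article, in US dollars, keyed by parameter count.
# Assumption: first value = one training run, second value = fully-loaded cost.
BERT_COSTS = {
    110_000_000: (2_500, 50_000),
    340_000_000: (10_000, 200_000),
    1_500_000_000: (80_000, 1_600_000),
}


def loaded_ratio(single_run, fully_loaded):
    """Return how many times larger the fully-loaded cost is than one run."""
    return fully_loaded / single_run


def cost_per_million_params(params, cost):
    """Return the cost in dollars per million model parameters."""
    return cost / (params / 1_000_000)


for params, (single, loaded) in BERT_COSTS.items():
    ratio = loaded_ratio(single, loaded)
    per_million = cost_per_million_params(params, single)
    print(f"{params:>13,} params: fully-loaded = {ratio:.0f}x one run, "
          f"one run = ${per_million:.2f} per million params")
```

On these figures the fully-loaded cost is 20 times the single-run cost at every model size, while the single-run cost per million parameters rises with model size (roughly $23, $29 and $53).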

Based on information released by Google, the researchers estimate that, at list price, training the 11B-parameter variant of T5 costs well above $1.3 million for a single run. Assuming 2-3 runs of the large model and hundreds of runs of the small ones, the list-price tag for the entire project may have been $10 million.
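To see how the large runs fit into the $10 million estimate, the arithmetic can be sketched as below. This is only a rough illustration: $1.3 million is treated as a per-run floor (the estimate says "well above"), and the constant and function names are hypothetical.

```python
# Figures quoted in the article, in US dollars.
LARGE_RUN_COST = 1_300_000      # one 11B-parameter T5 run at list price (a lower bound)
PROJECT_ESTIMATE = 10_000_000   # estimated list price of the entire project


def large_run_subtotal(num_runs, cost_per_run=LARGE_RUN_COST):
    """Return the minimum cost of the given number of large-model runs."""
    return num_runs * cost_per_run


def remainder_for_other_runs(num_large_runs, project_total=PROJECT_ESTIMATE):
    """Return what is left of the project estimate after the large runs."""
    return project_total - large_run_subtotal(num_large_runs)


for runs in (2, 3):
    subtotal = large_run_subtotal(runs)
    rest = remainder_for_other_runs(runs)
    print(f"{runs} large runs: at least ${subtotal:,}; "
          f"at most ${rest:,} left for everything else")
```

With 2-3 large runs, at least $2.6 million to $3.9 million of the estimate goes to the 11B-parameter model, which leaves at most $6.1 million to $7.4 million for the hundreds of smaller runs.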

A similar but more rigorous approach was taken by Emma Strubell and her colleagues in a work published last year. Their results table reports the estimated cost of training a model in terms of CO2 emissions (lbs) and cloud compute cost (USD).

Now that it has been established that training deep learning models is indeed an expensive affair, let’s take a look at the factors that contribute to such high costs (according to AI21 Labs):

  • size of the dataset
  • model size
  • training volume

The researchers say that an increase in any of the above factors results in an increase in the number of FLOPs, and that costs usually boil down to the number of FLOPs.

And since there isn’t a proper formula to quantify how many FLOPs a given NLP model needs, things get more complicated.
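Once a FLOP count has been estimated by some means, turning it into a dollar figure is simple arithmetic. The sketch below is a generic back-of-the-envelope conversion, not the method used by AI21 Labs; the function, its parameters and the example numbers are all hypothetical placeholders.

```python
def training_cost_usd(total_flops, device_flops_per_sec, utilization,
                      price_per_device_hour):
    """Convert an estimated FLOP count into a rough cloud cost in dollars.

    total_flops: estimated floating-point operations for the whole run
    device_flops_per_sec: peak throughput of one accelerator
    utilization: fraction of peak throughput actually achieved (0 to 1]
    price_per_device_hour: hourly rental price of one accelerator
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the interval (0, 1]")
    effective_flops_per_sec = device_flops_per_sec * utilization
    device_seconds = total_flops / effective_flops_per_sec
    device_hours = device_seconds / 3600
    return device_hours * price_per_device_hour


# Placeholder numbers chosen for easy arithmetic, not real hardware or prices.
cost = training_cost_usd(
    total_flops=3.6e18,
    device_flops_per_sec=1e15,
    utilization=0.5,
    price_per_device_hour=2.0,
)
print(f"Estimated cost: ${cost:.2f}")
```

The hard part is the first argument: dataset size, model size and training volume all feed into the FLOP count, and low hardware utilization raises the bill further.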


Based on their observations, the authors summarise their findings as follows:

  • Since AWS has reduced its prices more than 65 times since its launch in 2006, and by as much as 73% between 2014 and 2017, they expect the same trend for AI-oriented compute offerings
  • The state-of-the-art race needs to end: many top players pour all their resources into topping a leaderboard, producing models that are impractical for others to put to use
  • Useful as neural networks are, there is a school of thought that holds that statistical ML is necessary but insufficient, and will only get you so far

Smaller organisations do not have the resources to replicate the successes these leaderboard toppers flaunt. The authors therefore conclude that the Googles of the world should pre-train and publish the large language models, while the rest of the world sticks to fine-tuning them, which would be a more affordable approach.

Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
