It is a long-standing joke among industry experts that while AI may crunch massive amounts of data, write code that runs huge machinery, or even author a book, it would still fail at tasks a three-year-old child can accomplish. This is also why AI systems still have a long way to go before they can truly be called ‘intelligent’.
Hubert Dreyfus, a well-known philosopher, was one of the staunchest critics of overestimating the capabilities of computers and AI. He wrote three influential works – Alchemy and Artificial Intelligence, What Computers Can’t Do, and Mind over Machine – in which he critically assessed the progress of AI. One of his arguments was that humans draw on implicit, tacit knowledge, a capability that cannot be programmed into a machine.
Having said that, there have been tremendous advancements in the human endeavour to move from narrow AI towards the coveted general AI. Many new models – GPT-3, DALL·E, LaMDA, the Switch Transformer, etc. – are extremely powerful, scaling to billions and even trillions of parameters in order to multi-task. But the fact is, we are still far from reaching that goal.
The Quest For General AI
The powerful language models and the newer zero-shot text-to-image generation models are all advancing rapidly towards the goal of performing tasks they were never explicitly trained for, each outdoing its predecessor in the range of its applications.
DeepMind, one of the best-known AI research labs (owned by Alphabet), has made achieving AGI its ultimate goal. Interestingly, this year, the lab published a paper titled ‘Reward Is Enough’, in which the authors suggested that techniques like reward maximisation can help machines develop behaviour that exhibits abilities associated with intelligence. They further concluded that reward maximisation – and reinforcement learning, by extension – could help achieve artificial general intelligence.
Let’s take a closer look at GPT-3, built by DeepMind’s closest competitor, OpenAI. It generated a major buzz in the scientific community and was widely considered a massive breakthrough on the path to general AI.
GPT-3 leverages NLP to imitate human conversation. With 175 billion parameters, trained on one of the biggest text datasets ever assembled, it is powerful enough to complete a paragraph from just a few input words. Moreover, unlike typical narrow AI models, GPT-3 can perform tasks beyond generating human-like text: translating between languages, handling reading-comprehension tasks without additional input, and even writing code!
On the flip side, GPT-3 does not have ‘common sense’. It learns almost blindly from the material it has been trained on, scraped from countless pages on the internet. Because of this, GPT-3 can pick up biased, racist and sexist ideas from the internet and rehash them [Note: OpenAI is working to mitigate bias and toxicity in many ways, but it has not been eliminated yet]. Additionally, the transformer lacks causal reasoning and is unable to generalise correctly beyond its training set, leaving it far from general AI.
But, as we know, this quest will go on. The journey is fraught with billions of computational operations, enormous energy consumption, and billions of dollars of expense to build and train these massive models. Most ignore, or fail to comprehend, the consequences for the environment.
AI’s Carbon Footprint
Climate change has become a fundamental crisis of our time, and AI plays a dual role in it. On the one hand, it can help mitigate and control the climate crisis through solutions like smart grid design or low-emission infrastructure. On the other, it can undo sustainability efforts, and the extent of its carbon emissions is becoming hard to ignore.
Estimates suggest that training a single large AI model can generate close to 300 tonnes of carbon dioxide, equivalent to five times the lifetime emissions of an average car. AI requires ever more computational power, and the data centres that store and process AI data consume vast amounts of energy. The high-powered GPUs required to train advanced AI systems need to run for days at a time, consuming huge quantities of electricity and generating substantial carbon emissions. All of this raises the ethical price of running an AI model.
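To see where such estimates come from, here is a back-of-the-envelope sketch of the underlying arithmetic: emissions scale with GPU-hours, per-GPU power draw, data-centre overhead, and the carbon intensity of the local grid. All the numbers below are illustrative assumptions, not measured figures for any real model.

```python
def training_emissions_tonnes(gpu_hours, gpu_kw=0.3, pue=1.5,
                              grid_kg_per_kwh=0.475):
    """Rough CO2 estimate for a training run, in metric tonnes.

    gpu_kw          -- assumed average draw per GPU in kW (hypothetical)
    pue             -- power usage effectiveness (cooling/overhead factor)
    grid_kg_per_kwh -- assumed grid carbon intensity, kg CO2 per kWh
    """
    energy_kwh = gpu_hours * gpu_kw * pue
    return energy_kwh * grid_kg_per_kwh / 1000.0  # kg -> tonnes

# A hypothetical month-long run on 1,000 GPUs:
print(round(training_emissions_tonnes(1000 * 24 * 30), 1))
```

Even this modest hypothetical run lands in the range of 150-odd tonnes of CO2; the point is not the exact figure but that every parameter – run length, hardware efficiency, and grid mix – multiplies through.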
The Price Of GPT-3 And Its Successors
GPT-3 is one of the largest language models built to date. The neural architecture search process used in training large transformer models requires more than 270,000 hours of training and some 3,000 times the energy of a standard run – so much so that training has to be split across dozens of chips and spread over months. If the input is this massive, the output is worse: a 2019 study found that training an AI language-processing system generates anywhere between 1,400 and 78,000 pounds of carbon emissions, the upper end being equivalent to 125 round-trip flights between New York and Beijing.
Sure, GPT-3’s performance is better, but at what cost? Carbontracker estimates that training GPT-3 just once requires as much power as 126 homes in Denmark consume in a year. It is also equivalent to driving a car to the moon and back.
GPT-3 isn’t the only large language model in the market today. Microsoft, Google, and Facebook are working on – and have released papers on – more complex models involving images and powerful search that go far beyond language, aiming to create multi-tasking, multi-modal systems.
OpenAI has shown that since 2012, the amount of compute used to train the largest models has been increasing exponentially, with a doubling time of 3.4 months. If this trend holds, one can only imagine the energy consumption and carbon emissions we will rack up before we reach AGI.
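The implication of a 3.4-month doubling time is easy to understate, so a quick sketch of the compounding helps; this simply extrapolates the stated trend and makes no claim about any particular model.

```python
def compute_growth(months, doubling_months=3.4):
    """Factor by which training compute grows over `months`,
    given a fixed doubling time (OpenAI's observed 3.4 months)."""
    return 2 ** (months / doubling_months)

print(round(compute_growth(12)))  # growth factor over one year
print(round(compute_growth(24)))  # growth factor over two years
```

Under this trend, compute grows by roughly an order of magnitude every year and by over a hundredfold every two years – and, all else being equal, energy use and emissions scale with it.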
Rethinking Our Approach Towards AGI
AI could become one of the most significant contributors to climate change if this trend continues.
Avoiding that outcome entails employing efficient techniques for data processing and search, and training models on specialised hardware, such as AI accelerators, that is more efficient per watt than general-purpose chips. Google’s paper on Switch Transformers describes more efficient sparse neural networks that allow larger models to be built without a proportional increase in computational cost. Researchers such as Lasse Wolff Anthony, who has studied AI power usage, have also suggested that large companies train their models in countries with greener energy supplies, such as Sweden; with a cleaner grid, a model’s carbon footprint can be reduced by more than 60 times.
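The "more than 60 times" figure follows directly from differences in grid carbon intensity: the same training energy emits wildly different amounts of CO2 depending on where it is consumed. The intensities below are rough, hypothetical values in grams of CO2 per kWh, chosen only to illustrate the ratio – they are not official statistics for any country.

```python
# Assumed, illustrative grid carbon intensities (g CO2 per kWh).
GRID_G_PER_KWH = {
    "coal_heavy": 800,  # hypothetical fossil-dominated grid
    "hydro_heavy": 13,  # hypothetical near-renewable grid
}

def run_emissions_kg(energy_kwh, grid):
    """CO2 emitted (kg) by consuming `energy_kwh` on the given grid."""
    return energy_kwh * GRID_G_PER_KWH[grid] / 1000.0

energy = 1_000_000  # a hypothetical 1 GWh training run
dirty = run_emissions_kg(energy, "coal_heavy")
clean = run_emissions_kg(energy, "hydro_heavy")
print(round(dirty / clean, 1))  # ratio between the two grids
```

With these assumed intensities the ratio comes out at roughly 60x – the training run itself is unchanged; only its location differs.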
The solutions aren’t many as of now, but attempts are being made to devise them. It is important that we have conversations about it. While innovation is the basis on which a society moves forward, we must also be conscious of the cost such ‘innovation’ brings. The need of the hour is to strike a balance between the two.
This article is written by a member of the AIM Leaders Council. AIM Leaders Council is an invitation-only forum of senior executives in the Data Science and Analytics industry.