According to Google Flights’ estimate, a round trip of a fully loaded passenger jet between San Francisco and New York releases about 180 tonnes of carbon dioxide equivalent (CO2e). By comparison, the training emissions of Google’s 11-billion-parameter T5 language model and OpenAI’s GPT-3 (175 billion parameters) stand at ~26% and ~305% of that round trip, respectively. Deep learning models are getting larger by the day, and “state-of-the-art” models require a substantial amount of computational resources and energy, leading to high environmental costs. Such large models are routinely trained for thousands of hours on specialised hardware accelerators in data centers.
AI can enable smart, low-carbon cities through autonomous electric vehicles, optimised smart grids and the like. However, advanced AI research and product design leave a huge carbon footprint. According to a 2020 estimate, information and communications technologies (ICTs) could account for 20% of global electricity demand by 2030, up from around 1% in 2020. According to researchers, the amount of compute used in the largest ML training runs has been increasing exponentially, growing more than 300,000-fold between 2012 and 2018, which is equivalent to a 3.4-month doubling period. To put things in perspective, Moore’s law has a 2-year doubling period.
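The doubling-period figure can be checked with a few lines of arithmetic. The Python sketch below takes the 300,000-fold growth and the 3.4-month doubling period quoted above, derives the time span they imply, and computes what a 2-year doubling period would deliver over that same span. The function names are illustrative and do not come from the cited research.

```python
import math


def implied_months(growth_factor: float, doubling_months: float) -> float:
    """Months needed to reach growth_factor at the given doubling period."""
    return math.log2(growth_factor) * doubling_months


def growth_over(months: float, doubling_months: float) -> float:
    """Growth factor achieved after `months` at the given doubling period."""
    return 2 ** (months / doubling_months)


# Figures quoted in the text.
span = implied_months(300_000, 3.4)   # time span implied by the two figures
moore = growth_over(span, 24)         # 2-year doubling over the same span

print(f"Implied span: {span:.1f} months")
print(f"Growth at a 2-year doubling period: {moore:.1f}x")
```

Under these figures, the span that saw a 300,000-fold rise in training compute would yield only a single-digit multiple at a Moore’s-law pace.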
According to researchers at Google, the following factors influence AI/ML carbon footprint:
- Large, sparse deep neural networks (DNNs) can consume less than one-tenth of the energy of large, dense DNNs.
- Geographic location matters for machine learning workload scheduling, since the fraction of carbon-free energy and the resulting carbon emissions vary (by up to 10 times), even within the same country and the same organisation.
- Datacenter infrastructure matters, as cloud datacenters can be more energy efficient than typical data centers, and the ML-oriented accelerators inside them can be (up to five times) more effective than off-the-shelf systems.
According to the researchers, the electricity required to run an ML model is a function of the algorithm, the program that implements it, the number of processors that run the program, the speed and power of those processors, and a datacenter’s efficiency in delivering power and cooling the processors; the resulting emissions then depend on the energy supply mix (renewable, gas, coal, etc.). In simplified form, the carbon footprint of an ML model is the energy it consumes multiplied by the carbon intensity of the electricity that supplies it.
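The factors listed above can be combined into a back-of-the-envelope estimate. The following Python sketch multiplies training time, processor count, average power per processor and the datacenter’s power usage effectiveness (PUE) to get energy, then multiplies by the grid’s carbon intensity to get emissions. It is an illustration under stated assumptions rather than the researchers’ exact formula; the function name and every input value in the example are hypothetical placeholders.

```python
def training_footprint_tco2e(
    hours: float,
    num_processors: int,
    avg_power_watts: float,
    pue: float,
    kg_co2e_per_kwh: float,
) -> float:
    """Estimate training emissions in tonnes of CO2e.

    energy (kWh)  = hours * processors * average power (W) * PUE / 1000
    emissions (t) = energy (kWh) * carbon intensity (kg CO2e/kWh) / 1000
    """
    energy_kwh = hours * num_processors * avg_power_watts * pue / 1000.0
    return energy_kwh * kg_co2e_per_kwh / 1000.0


# Hypothetical run: the same job on two grids with different carbon intensity.
job = dict(hours=1_000, num_processors=64, avg_power_watts=300, pue=1.1)
low_carbon = training_footprint_tco2e(**job, kg_co2e_per_kwh=0.05)
high_carbon = training_footprint_tco2e(**job, kg_co2e_per_kwh=0.50)

print(f"Low-carbon grid:  {low_carbon:.2f} tCO2e")
print(f"High-carbon grid: {high_carbon:.2f} tCO2e")
```

Because every term is multiplicative, a tenfold difference in carbon intensity translates directly into a tenfold difference in emissions for the same job.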
According to a 2018 Nature report, data centers use 200 terawatt-hours (TWh) each year, more than the national electricity consumption of some countries. Because electricity-specific CO2e emission factors differ widely between countries, the carbon footprint of model training can depend heavily on where the hardware is located. Training in France, for example, produces comparatively low carbon emissions because the country relies on nuclear energy as its main source of electricity. To address such challenges, Abhishek Gupta, founder of the Montreal AI Ethics Institute and an ML engineer at Microsoft, has proposed a four-pillar plan. The SECure framework, as he calls it, provides actionable insights to practitioners and consumers of AI systems to trigger behaviour change, a major missing element in the current tooling that measures the carbon impact of AI systems.
Accountability With A SECure Framework
According to Gupta, the following initiatives can facilitate a more efficient framework to keep track of AI’s carbon footprint:
- Use compute-efficient machine learning methods. Quantisation methods, for instance, can help CPUs run models on par with GPUs and TPUs. Running models on resource-constrained devices, such as IoT devices at the edge, can be another solution.
- Federated learning (FL) methods can be beneficial in the context of carbon footprint. For instance, in a study by researchers at the University of Cambridge, FL was found to have an advantage because of the cooling needs of data centers. Though GPUs and even TPUs are getting more efficient in terms of the computational power delivered per unit of energy consumed, the need for strong, energy-consuming cooling remains, whereas FL can always benefit from hardware advancements. The deployed FL tasks (i.e., on device) will be lightweight and aim at lowering the number of communications.
- Online tools like the MLCO2 calculator ask developers to input parameters, including the specific hardware used, the cloud provider, the datacenter region and the number of hours trained, to get an estimate of the carbon emissions. Code-based tools like CodeCarbon and CarbonTracker avoid disrupting the natural workflow, and CodeCarbon now offers integration with CometML, which makes it easier to record such metrics alongside the rest of the data science and machine learning workflow and its artefacts.
- Initiatives like Green AI advocate for including efficiency alongside accuracy and other evaluation metrics when building AI systems.
- Strategies like real-time energy-use tracking can help mitigate carbon emissions.
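To make the first item in the list concrete, here is a minimal Python sketch of symmetric 8-bit quantisation, one common way to reduce the numeric precision of model weights. It illustrates the general idea only and is not a method prescribed by Gupta; the function names and sample weights are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


weights = [0.82, -1.27, 0.003, 0.45, -0.61]  # hypothetical sample weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(q)
print(f"scale={scale:.4f}, max round-trip error={max_error:.4f}")
```

Each weight now fits in one byte instead of the four used by a 32-bit float, at the cost of a rounding error bounded by half the scale.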
Though Gupta’s summary of the available tools and his proposals look promising, incentivising society to move towards accountability will still be a challenge. A few lingering questions also remain: Shouldn’t we concentrate more on clean energy generation instead of energy consumption? Is the AI carbon footprint a result of ill-informed training practices by researchers, or is it inherent to training in general? Who actually benefits from AI research, and why should the common man suffer from the carbon footprint generated in the interests of a few companies? Should the research be regulated?