Why Benchmarking TinyML Systems Is Challenging

Modern-day semiconductor devices can perform a million mathematical operations while occupying only a tiny amount of area (think of the tip of a pencil). The compactness of these chips has brought the power of machine learning to the edge and into our pockets. Machine learning on smartphones is already old news. So, what’s next?

According to Matthew Mattina of Arm, groundbreaking ML models like AlphaGo, Alexa, GPT-3, and AlphaFold, which require GPUs and massive power supplies, will soon be running on devices that consume less power than a single LED bulb!


The rising demand for smarter, more compact devices warrants expanding edge ML to microcontroller-class devices. This is where TinyML comes in. The objective of TinyML is to enable ultra-low-power devices that can run machine learning models.

Applications of TinyML

  • Privacy offered by TinyML platforms can make them an excellent alternative to edge and cloud for ML-based applications.
  • With the large scale adoption of 5G around the corner, TinyML can play a significant role in connecting millions of people.
  • Home pods that rely on real-time inference from ML models require TinyML frameworks.
  • Low-powered augmented reality glasses can make a comeback with TinyML (think Google Glass).
  • All IoT devices that run on microcontrollers can leverage machine learning with TinyML solutions.

More broadly, TinyML spans hardware (dedicated integrated circuits), algorithms and software capable of performing on-device analytics of sensor data (vision, audio, IMU, biomedical, etc.) at extremely low power, typically in the mW range and below. This enables a variety of always-on use cases and makes TinyML ideal for battery-operated devices.

With so many applications, why don’t we see more TinyML solutions? Because, to win backing for TinyML solutions, researchers need to convince developers, and that requires a benchmark that demonstrates consistent results. But benchmarking TinyML is not easy. The systems must be small enough to fit within the tight constraints of microcontroller unit (MCU) class devices, which have only a few hundred KB of memory and limited onboard compute (processor clock speed).
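To make the memory constraint concrete, here is a minimal sketch of a fit check against an MCU-class budget. The 256 KB budget, the parameter count and the activation buffer size are all illustrative assumptions, not figures from the report, and the single combined budget is a simplification (real devices usually split storage between flash and SRAM).

```python
# Hypothetical memory-fit check. Every number below is an illustrative
# assumption, not a measurement from the report.

def model_footprint_bytes(num_params: int, bytes_per_param: int,
                          activation_bytes: int) -> int:
    """Rough footprint: stored weights plus a peak activation buffer."""
    return num_params * bytes_per_param + activation_bytes


def fits(num_params: int, bytes_per_param: int,
         activation_bytes: int, budget_bytes: int) -> bool:
    """True if the rough footprint stays within the memory budget."""
    footprint = model_footprint_bytes(num_params, bytes_per_param,
                                      activation_bytes)
    return footprint <= budget_bytes


BUDGET_BYTES = 256 * 1024        # assumed "few hundred KB" class device
NUM_PARAMS = 200_000             # assumed small model
ACTIVATION_BYTES = 40 * 1024     # assumed peak activation buffer

# The same model stored as 32-bit floats versus 8-bit integers.
fits_float32 = fits(NUM_PARAMS, 4, ACTIVATION_BYTES, BUDGET_BYTES)
fits_int8 = fits(NUM_PARAMS, 1, ACTIVATION_BYTES, BUDGET_BYTES)

print("float32 fits:", fits_float32)
print("int8 fits:", fits_int8)
```

Under these assumed numbers, the float32 version needs about 821 KB and does not fit, while the int8 version needs about 235 KB and does. This is one reason low-precision models matter so much at this scale.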

Benchmarking Low-Powered ML Systems

(Source: Paper by Banbury et al.)

In a report published by a team from Harvard University and other collaborators, the experts investigated the pain points plaguing the TinyML ecosystem. “Today’s state-of-the-art benchmarks are not designed to handle the challenges readily. They need careful re-engineering to be flexible enough to handle the extent of hardware heterogeneity that is commonplace in the TinyML ecosystem,” observed the authors.

Inconsistent Power Consumption

TinyML is known for its ultra-low power consumption, often in the range of 1 mW and below. Since TinyML systems consume widely different amounts of power, maintaining measurement accuracy across the range of devices becomes difficult. The plot above is a logarithmic comparison of the active power consumption of TinyML systems and those supported by MLPerf, a benchmark suite developed by a consortium of AI experts to promote fair comparisons. As the plot shows, the power budget of TinyML systems can be up to four orders of magnitude smaller than that of state-of-the-art MLPerf systems.

Determining what falls within the scope of a power measurement is even more difficult, since data paths and pre-processing steps can vary significantly between devices. Further, factors like chip peripherals and underlying firmware can affect the measurements.
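As a rough illustration of what a gap of four orders of magnitude means, the short sketch below computes the base-10 logarithm of a power ratio. The 1 mW figure comes from the text above. The 10 W figure for a larger system is an assumption, chosen only so that the example reproduces a four-order gap.

```python
import math


def orders_of_magnitude(larger_watts: float, smaller_watts: float) -> float:
    """Base-10 logarithm of the ratio between two power budgets."""
    if larger_watts <= 0 or smaller_watts <= 0:
        raise ValueError("power values must be positive")
    return math.log10(larger_watts / smaller_watts)


TINYML_POWER_W = 1e-3       # 1 mW, the range quoted for TinyML systems
LARGER_SYSTEM_POWER_W = 10.0  # assumed value for a much larger system

gap = orders_of_magnitude(LARGER_SYSTEM_POWER_W, TINYML_POWER_W)
print(f"gap: {gap:.1f} orders of magnitude")
```

A measurement setup that is precise enough at the 10 W scale can be far too coarse at the 1 mW scale, which is part of why one benchmark methodology struggles to cover both.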

Limited Resources

The “tiny” in TinyML refers to both power and memory. On-device memory is cut to the bone. Popular image classification tasks with a large label space are well suited to low-power, always-on applications, but they are too computationally intensive and memory hungry for today’s TinyML hardware. The budget is tight, and there is little headroom: smartphones cope with resource constraints on the order of a few GB, while TinyML systems typically cope with resources that are two orders of magnitude smaller.

From a benchmarking perspective, the experts wrote, hand-coded submissions will likely produce the best numerical results, at the cost of reproducibility, comparability and time. If models can be made to fit comfortably within the tight on-chip memory constraints, the computational costs that usually hamper traditional machine learning platforms drop out of the equation. This breakthrough alone could lead to the widespread adoption and spread of TinyML platforms.

(Source: Paper by Banbury et al.)

To make benchmarking accessible to others, the experts have shortlisted a few datasets and models for each use case, as shown above. The datasets listed in the report help specify the use cases, are used to train the reference models, and are sampled to create the test sets used during on-device measurement. These well-known datasets, which are relevant to industry use cases, can also be used to train a new or modified model in the open division.

For TinyML to go mainstream, there needs to be a universally accepted standard: a benchmark that would allow one to assess TinyML as a service. The experts, however, are optimistic, as they are betting on the low-power, cheap 32-bit MCUs that have revolutionised computational capability at the edge. For instance, platforms like Arm’s Cortex-M, a widely used family of microcontroller cores, now regularly perform tasks that were previously thought to be infeasible.

Pocketing The Future

In an interview for Andrew Ng’s Batch newsletter, Mattina hoped that 2021 would be the year when TinyML would go mainstream. Talking about its implications, he cited the example of the pandemic and how neural networks were put to use. “There was research suggesting that a collection of neural networks trained on thousands of ‘forced cough’ audio clips may be able to detect whether the cougher has the illness, even when the individual is asymptomatic. The neural networks used are computationally expensive, requiring trillions of multiplication operations per second. TinyML could run such cough-analyzing neural networks,” he explained.

Neural networks rely heavily on multiplication, and emerging hardware implements multiplication using low-precision numbers. This lets chip designers fit more multipliers into a smaller area and greatly increase the number of operations per second. The success of TinyML will allow big, math-heavy neural networks to power sensors, wearables, and phones through ultra-efficient neural network inference.
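To show what multiplication with low-precision numbers can look like, here is a minimal sketch of a dot product computed on 8-bit integers. The symmetric linear quantization scheme, the function names and the sample values are assumptions made for illustration. They do not describe any particular chip or framework.

```python
# Illustrative 8-bit quantized dot product. The scheme (symmetric, one
# scale per vector) and all names are assumptions made for this sketch.

def quantize(values, num_bits=8):
    """Map floats to signed integers; returns (integers, scale)."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    ints = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return ints, scale


def quantized_dot(a, b):
    """Dot product using integer multiply-accumulate, rescaled at the end."""
    qa, scale_a = quantize(a)
    qb, scale_b = quantize(b)
    acc = sum(x * y for x, y in zip(qa, qb))  # integer-only arithmetic
    return acc * scale_a * scale_b            # one float rescale


weights = [0.5, -1.0, 0.25, 0.75]   # made-up example values
inputs = [1.0, 0.5, -0.5, 2.0]      # made-up example values

exact = sum(w * x for w, x in zip(weights, inputs))
approx = quantized_dot(weights, inputs)

print("float result:    ", exact)
print("quantized result:", approx)
```

The inner loop only multiplies and adds small integers, the kind of operation that compact hardware multipliers handle cheaply. The price is a small approximation error compared with the floating-point result.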
Find the full report here.

Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
