Open engineering consortium MLCommons has published the results of MLPerf Inference v1.0, its machine learning inference performance benchmark suite. The suite measures how quickly a trained neural network can process new data for a wide range of applications on various form factors, and this round adds a system power measurement methodology, the MLCommons statement said. The latest round includes 1,994 performance and 862 power efficiency results for leading ML inference systems.
The foundation of MLCommons was laid in 2018, when a group of researchers and engineers released MLPerf, a benchmark for measuring the speed of machine learning software and hardware. Companies including AMD, Baidu, Google and Intel, along with researchers from Harvard University, Stanford University, University of California Berkeley, University of Minnesota, and the University of Toronto, backed the initiative. MLCommons itself was launched last December.
Towards an efficient system
MLCommons brought together diverse companies and organisations to create vast public databases for AI training, allowing researchers from all over the world to collaborate at a higher level and advance the field as a whole. Submissions for this round came from Alibaba, Centaur, DellEMC, EdgeCortix, Fujitsu, Gigabyte, HPE, Inspur, Intel, Krai, Lenovo, Moblin, Neuchips, NVIDIA, Qualcomm, Supermicro and Xilinx.
The system power measurement methodology was developed in collaboration with Standard Performance Evaluation Corp. (SPEC), a non-profit corporation formed to create standardised benchmarks and tools to assess computing systems’ performance and energy efficiency.
“As we look at the accelerating adoption of machine learning, AI, and the anticipated scale of ML projects, the ability to measure power consumption in ML environments will be crucial for sustainability goals across the globe,” said Klaus-Dieter Lange, SPEC Power Committee Chair.
“MLCommons built MLPerf in the best tradition of vendor-neutral standardised benchmarks, and SPEC is excited to be a partner in their development process. We look forward to extensive adoption of this extremely valuable benchmark,” added Lange.
Apart from adding the power category, MLPerf also made two small rule changes. “One is that for data centre submissions, we require that any external memory be protected with ECC. And we also adjusted the minimum runtime to better capture sort of equilibrium behaviour,” said David Kanter, Executive Director of MLCommons.
The results are released every year. According to Kanter, the inference and training rounds are spaced six months apart. However, one cycle was missed last year due to the pandemic.
The different scenarios and MLPerf models used in the latest inference exercise:
The annual benchmark tests offer chip vendors and system developers an opportunity to showcase how well their products perform on a representative set of machine learning tasks. NVIDIA is the only company to have made submissions in all three rounds conducted so far. It reported record-setting results in every category last year and again this year, when its submissions included the newly launched A30 and A10 GPUs.
“As AI continues to transform every industry, MLPerf is becoming an even more important tool for companies to make informed decisions on their IT infrastructure investments,” said Ian Buck, General Manager and Vice President of Accelerated Computing at NVIDIA.
He added, “Now, with every major OEM submitting MLPerf results, NVIDIA and our partners are focusing not only on delivering world-leading performance for AI but on democratising AI with a coming wave of enterprise servers powered by our new A30 and A10 GPUs.”
The industry needs a mutually agreed-upon set of best practices and metrics, both to promote the ongoing growth, deployment, and sharing of machine learning and AI technologies and to assess their quality, speed, and reliability. Moreover, reproducibility is a challenge in ML. MLCommons benchmarks will help advance the field and save others from reinventing the wheel.