The Evolution of ImageNet for Deep Learning in Computer Vision


When working on computer vision problems such as object recognition or action detection, the first thing we think of is acquiring a suitable dataset to train our model on. In the early days of AI, most of the focus was on machine learning and deep learning algorithms, but there was a lack of proper datasets to run these algorithms on. As a result, AI was largely confined to researchers; the business world showed little interest in it back then.

In 2006, Fei-Fei Li came up with the idea of building a dataset that would let these algorithms be run on real-world data. Thus ImageNet began to take shape, built on the structure of WordNet. ImageNet is one of the biggest image datasets, containing more than 14 million images in more than 20000 categories, with 27 high-level categories whose subcategories contain at least 500 images each. All of these images are manually annotated, and over 1 million images also have bounding boxes around the object in the picture. For 1.2 million pictures, SIFT (Scale-Invariant Feature Transform) features are provided, which give a lot of information about the features in an image.

Launched in 2010, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a global annual contest in which software programs (mostly convolutional networks) compete on image classification and the detection of objects and scenes. The algorithm with the lowest top-5 error rate is selected as the winner.
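To make the top-5 error rate concrete, here is a minimal sketch in plain Python. A prediction counts as a miss only when the true label is not among the five classes with the highest scores. The function name `top_k_error` and the toy scores are illustrative assumptions, not part of any ILSVRC tooling.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k highest-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the highest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Toy example: 3 samples, 6 classes (hypothetical scores).
scores = [
    [0.10, 0.30, 0.20, 0.15, 0.05, 0.20],  # true class 4 has the lowest score
    [0.50, 0.10, 0.10, 0.10, 0.10, 0.10],
    [0.05, 0.05, 0.60, 0.10, 0.10, 0.10],
]
labels = [4, 0, 2]
print(top_k_error(scores, labels, k=5))  # one of the three samples is a miss
```

With 1000 classes, as in ILSVRC, the same function applies unchanged; only the length of each score row grows.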


Within six years, the top-5 error rate of the winning entry came down from 26% to 2.25%, which is a huge achievement.

2014: Inception V1 (GoogLeNet), with VGGNet as runner-up

It was a revolution in the world of AI, and people started taking an interest in it. Researchers report that humans have a top-5 error rate of about 5.1%, which is more than double that of the best-performing deep learning model trained on ImageNet.

In this article, we discuss the ImageNet database and its variants.


ImageNet

ImageNet was developed by many authors, led by Fei-Fei Li, who started building it. According to the 2015 ILSVRC paper, the other authors include Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein and Alexander C. Berg.

WordNet is a lexical database of English. Building on its semantic structure, Fei-Fei Li started building ImageNet around WordNet's synsets (most of which are nouns). The aim was to provide on average about 1000 images for each synset. The developers used Amazon Mechanical Turk to help them with labelling the images. Images are often downsampled to 256×256 before being fed to deep learning models.

Dataset size: 155.84 GiB

Data: 1281167 training images, 50000 validation images and 100000 test images.

Code Snippet:

With TensorFlow (the dataset has to be downloaded manually)

import tensorflow_datasets as tfds
# The test split has no public labels, so the validation split is used for evaluation here.
train, val = tfds.load('imagenet2012', split=['train', 'validation'])

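Using PyTorch (requires the SciPy library)

from torchvision import transforms, datasets
# torchvision does not download ImageNet; the root directory must already hold the ILSVRC2012 archives.
# transforms.ToTensor() is only a placeholder for the argument that was cut off in the original snippet.
train = datasets.ImageNet('', split='train', transform=transforms.ToTensor())
val = datasets.ImageNet('', split='val', transform=transforms.ToTensor())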

Mini ImageNet 

This dataset was created for few-shot learning and has been used, for example, with meta-transfer learning. It contains one hundred classes with 600 samples per class, and images are resized to 84×84. The dataset has to be downloaded separately.
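Few-shot benchmarks of this kind are usually evaluated on small "episodes". The sketch below shows one common way to draw an N-way K-shot episode, assuming the images are already grouped by class. The function `sample_episode`, the default values (5-way, 1-shot, 15 query images per class) and the stand-in image ids are illustrative assumptions, not part of the Mini ImageNet release.

```python
import random


def sample_episode(images_by_class, n_way=5, k_shot=1, n_query=15, seed=None):
    """Draw one N-way K-shot episode: a support set and a query set."""
    rng = random.Random(seed)
    classes = rng.sample(sorted(images_by_class), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        # Pick disjoint support and query images from the same class.
        picked = rng.sample(images_by_class[cls], k_shot + n_query)
        support += [(img, episode_label) for img in picked[:k_shot]]
        query += [(img, episode_label) for img in picked[k_shot:]]
    return support, query


# Stand-in data: 100 classes with 600 image ids each, mirroring the sizes quoted above.
images_by_class = {c: [f"class{c}_img{i}" for i in range(600)] for c in range(100)}
support, query = sample_episode(images_by_class, n_way=5, k_shot=1, n_query=15, seed=0)
print(len(support), len(query))  # 5 75
```

Labels are re-indexed from 0 to N-1 inside each episode, so a model cannot rely on memorising the global class ids.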

Figure: performance measures of Mini ImageNet.

A GitHub repository is available for generating Mini ImageNet from ImageNet.


ImageNet2012_Real

Developed in 2020 by Xiaohua Zhai, Aäron van den Oord, Alexander Kolesnikov, Lucas Beyer and Olivier J. Hénaff and presented in the paper “Are We Done With ImageNet?”. This dataset contains the 50000 validation images of the original ImageNet with reassessed (“ReaL”) labels. It provides multiple labels per image and more accurate annotations than the original ImageNet labels.
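Because an image can carry several valid labels here, accuracy is naturally measured by checking whether the model's top prediction is among them. The following is a minimal sketch of that idea; the function name `real_accuracy`, the toy data and the choice to skip images with an empty label list are assumptions made for illustration.

```python
def real_accuracy(predictions, real_labels):
    """Share of images whose top-1 prediction is in that image's set of valid labels."""
    correct = 0
    counted = 0
    for pred, labels in zip(predictions, real_labels):
        if not labels:
            # Images left without any valid label are skipped (assumption).
            continue
        counted += 1
        correct += pred in labels
    return correct / counted


# Toy data: one top-1 prediction and one list of valid labels per image.
predictions = [3, 7, 1, 9]
real_labels = [[3, 5], [2], [], [9]]
print(real_accuracy(predictions, real_labels))  # 2 of the 3 counted images are correct
```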

Dataset Size: 6.25 GiB

Code Snippet:

With TensorFlow (dataset requires to be downloaded manually)

import tensorflow_datasets as tfds
imreal = tfds.load('imagenet2012_real')

An implementation of this dataset is available in a GitHub repository.


ImageNet2012_Subset

This dataset was also developed in 2020, by Ting Chen, Simon Kornblith, Mohammad Norouzi and Geoffrey Hinton. As the name suggests, it is a subset of ImageNet2012, available in two configurations that contain 1% and 10% of the training set respectively. It is intended for semi-supervised learning algorithms.

1pct configuration (default):

Dataset size: 7.6 GiB

Data is split into 12811 training images and 50000 validation images.

10pct configuration:

Dataset size: 19.91 GiB

Data is split into 128116 training images and 50000 validation images.

Code Snippet:

With TensorFlow (dataset requires to be downloaded manually)

import tensorflow_datasets as tfds
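# Only 'train' and 'validation' splits exist; use 'imagenet2012_subset/10pct' for the 10% configuration.
train, val = tfds.load('imagenet2012_subset', split=['train', 'validation'])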

ImageNet_A and ImageNet_O

Developed in 2019 by Dan Hendrycks, Kevin Zhao, Steven Basart, Jacob Steinhardt and Dawn Song and described in their paper “Natural Adversarial Examples”. These datasets contain real-world, unmodified images that ResNet-50 failed to classify correctly. ImageNet-A contains images from the same classes as the original ImageNet, labelled with the original ImageNet labels, while ImageNet-O contains images from classes not seen in ImageNet.

Dataset Size: 650.87 MiB 

Data: 7500 testing images

In the results figure, black text shows the actual class and red text shows the class predicted by ResNet-50 along with its confidence score.
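The core selection idea, keeping only the images a classifier gets wrong, can be sketched in a few lines. This is only an illustration of the principle: the helper `keep_misclassified` and the dummy classifier are hypothetical, and the authors' actual collection and filtering pipeline is more involved than this.

```python
def keep_misclassified(samples, classify):
    """Keep only the (image, label) pairs that the classifier predicts wrongly."""
    return [(image, label) for image, label in samples if classify(image) != label]


# Hypothetical stand-in for a trained model: it always predicts "dog".
def always_dog(image):
    return "dog"


samples = [("img_0", "dog"), ("img_1", "cat"), ("img_2", "fox")]
hard_examples = keep_misclassified(samples, always_dog)
print(hard_examples)  # [('img_1', 'cat'), ('img_2', 'fox')]
```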

Code Snippet:

With TensorFlow 

import tensorflow_datasets as tfds
img_a = tfds.load('imagenet_a')


ImageNet_R

It was developed in 2020 by Dan Hendrycks, Steven Basart, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Norman Mu, Saurav Kadavath, Samyak Parajuli, Mike Guo, Dawn Song, Jacob Steinhardt and Justin Gilmer. Here ‘R’ stands for Rendition, as the dataset provides renditions of 200 ImageNet classes. It contains art, paintings, patterns, DeviantArt, graffiti, embroidery, sketches, tattoos, cartoons, graphics, origami, plastic objects, plush objects, sculptures, toys, and video game renditions of ImageNet classes.

Dataset Size: 2.03 GiB

Data: 30000 images

Code Snippet:

With TensorFlow 

import tensorflow_datasets as tfds
img_r = tfds.load('imagenet_r')

An implementation of the above dataset can be found in this GitHub repository.


ImageNet_Resized

Developed in 2017 by Patryk Chrabaszcz, Ilya Loshchilov and Frank Hutter. This dataset consists of downsampled versions of the original ImageNet images and was built as an alternative to the CIFAR datasets.

The data split is the same as in the original ImageNet.

8×8 downsampled images (default):

Dataset Size: 237.11 MiB

16×16 downsampled images:

Dataset Size: 932.34 MiB

32×32 downsampled images:

Dataset Size: 3.46 GiB

64×64 downsampled images:

Dataset Size: 13.13 GiB

Code Snippet:

With TensorFlow 

import tensorflow_datasets as tfds
img_resize = tfds.load('imagenet_resized')
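To illustrate what downsampling to a small fixed size involves, here is a minimal sketch that shrinks a square grayscale image by averaging equal-sized blocks of pixels. Block averaging is only one possible resizing filter and is an assumption here, not necessarily the one used to build this dataset; the function `box_downsample` is likewise illustrative.

```python
def box_downsample(image, out_size):
    """Downsample a square grayscale image (list of rows) by averaging equal blocks."""
    in_size = len(image)
    if in_size % out_size != 0:
        raise ValueError("out_size must divide the input size evenly")
    block = in_size // out_size
    result = []
    for r in range(out_size):
        row = []
        for c in range(out_size):
            # Collect every pixel that falls into this output cell.
            values = [
                image[r * block + i][c * block + j]
                for i in range(block)
                for j in range(block)
            ]
            row.append(sum(values) / len(values))
        result.append(row)
    return result


# A 4x4 toy image reduced to 2x2.
image = [
    [0, 0, 8, 8],
    [0, 0, 8, 8],
    [4, 4, 2, 2],
    [4, 4, 2, 2],
]
print(box_downsample(image, 2))  # [[0.0, 8.0], [4.0, 2.0]]
```

The same routine would take a 64×64 image down to 32×32, 16×16 or 8×8, applied per colour channel for RGB images.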


Some other datasets inspired by ImageNet are ImageNet-V2, Imagenette, Imagewoof and Imagewang. ImageNet has collaborated with PASCAL VOC. ImageNet is under constant development to serve the computer vision community. In 2019, a report found bias in most of its images, and ImageNet is working to overcome this bias and other shortcomings. The Tiny ImageNet Visual Recognition Challenge is a project by Stanford that is similar to ILSVRC. The annotation process of ImageNet relies on third parties and crowdsourcing.

Jayita Bhattacharyya
Machine learning and data science enthusiast. Eager to learn new technology advances. A self-taught techie who loves to do cool stuff using technology for fun and worthwhile.
