The Evolution of ImageNet for Deep Learning in Computer Vision

When working on computer vision problems such as object recognition or action detection, the first thing we think of is acquiring a suitable dataset to train our model on. In the early days of AI, more attention went to machine learning and deep learning algorithms, but there was a lack of proper datasets to run them on. As a result, the field was largely limited to researchers; the business world did not take much interest in AI back then.

In 2006, Fei-Fei Li came up with the idea of running these algorithms on real-world data. ImageNet thus took shape on top of the WordNet hierarchy. ImageNet is one of the biggest image datasets, containing more than 14 million images in more than 20,000 categories, organized under 27 high-level subcategories that contain at least 500 images each. All of these images are manually annotated, and over 1 million of them include bounding boxes around the object in the picture. For 1.2 million pictures, SIFT (Scale-Invariant Feature Transform) features are provided, which give a lot of information about the features in an image.

Starting in 2010, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) ran as a global annual contest in which software programs (mostly ConvNets) compete at image classification and at detecting objects and scenes. The algorithm with the lowest top-5 error rate (the fraction of images whose correct label is not among the model's five highest-scoring guesses) is selected as the winner.

Within six years, the error rate came down from 26% to 2.25%, which is a huge achievement. 

YEAR    WINNER                        TOP-5 ERROR RATE (%)
2012    AlexNet                       15.3
2013    ZFNet                         11.2
2014    Inception V1 (GoogLeNet)      6.67
2014    VGGNet (runner-up)            7.3
2015    ResNet                        3.57
2016    ResNeXt                       4.1
2017    SENet                         2.251
2018    PNASNet-5                     3.8

It was a revolution in the world of AI, and people started taking an interest in it. Researchers report that humans have a top-5 error rate of 5.1%, roughly double that of the best-performing deep learning model trained on ImageNet.
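
To make the metric concrete, here is a minimal sketch of how a top-5 error rate could be computed from per-class scores. The function name `top5_error` and the toy scores are illustrative assumptions, not part of any official ILSVRC tooling.

```python
def top5_error(scores, labels, k=5):
    """Fraction of examples whose true label is not among the k highest-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this example.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Toy example with 6 classes and 2 images (made-up scores).
scores = [
    [0.05, 0.10, 0.40, 0.20, 0.15, 0.10],  # true class 0 has the lowest score
    [0.30, 0.25, 0.20, 0.15, 0.07, 0.03],  # true class 1 is ranked second
]
labels = [0, 1]
error = top5_error(scores, labels)
print(error)
```

In this toy case the first image counts as a miss and the second as a hit, so half of the examples are errors.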

In this article, we discuss the ImageNet database and its variants.

ImageNet2012

It was developed by many authors, led by Fei-Fei Li, who started building it. The 2015 ILSVRC paper also lists Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein and Alexander C. Berg among its authors.

WordNet is a lexical database of English. Based on WordNet's semantics, Fei-Fei Li started building ImageNet around its synsets (most of which are nouns). At least 1000 images were provided for each synset. The developers used Amazon Mechanical Turk to help with image classification. Images have been subsampled to 256×256 to fit into deep learning models.

Dataset size: 155.84 GiB

Data: 1281167 training images, 50000 validation images and 100000 test images.

Code Snippet:

With TensorFlow (the dataset must be downloaded manually)

import tensorflow_datasets as tfds
# The test split has no public labels, so train and validation are loaded here.
train, val = tfds.load('imagenet2012', split=['train', 'validation'])

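Using PyTorch (requires the SciPy library)

from torchvision import transforms, datasets

# torchvision cannot download ImageNet itself: root must already contain the
# manually downloaded ILSVRC2012 archives (SciPy is used to read the devkit metadata).
root = ''
to_tensor = transforms.Compose([transforms.ToTensor()])
# The split is selected with split='train' or split='val', not train=True/False.
train = datasets.ImageNet(root, split='train', transform=to_tensor)
test = datasets.ImageNet(root, split='val', transform=to_tensor)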

Mini ImageNet 

This dataset was created for few-shot learning and is used, for example, by models trained through meta-transfer learning. It contains 100 classes with 600 samples per class, and images are resized to 84×84. The dataset is available for download.

[Figure: performance measures on mini-ImageNet]

A GitHub repository is available for generating mini-ImageNet from ImageNet.
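
Few-shot methods are usually trained and evaluated on small "episodes" drawn from such a dataset. The sketch below shows one way an N-way K-shot episode could be sampled. The function `sample_episode`, the 5-way 1-shot setting with 15 query images per class, and the stand-in file names are all assumptions for illustration, not part of the dataset's own tooling.

```python
import random


def sample_episode(samples_by_class, n_way=5, k_shot=1, n_query=15, seed=None):
    """Draw one N-way K-shot episode: a support set and a query set."""
    rng = random.Random(seed)
    classes = rng.sample(sorted(samples_by_class), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        # Pick disjoint support and query samples for this class.
        picked = rng.sample(samples_by_class[cls], k_shot + n_query)
        support += [(item, episode_label) for item in picked[:k_shot]]
        query += [(item, episode_label) for item in picked[k_shot:]]
    return support, query


# Stand-in for mini-ImageNet: 100 classes with 600 sample ids each.
dataset = {f"class_{c}": [f"class_{c}/img_{i}" for i in range(600)] for c in range(100)}
support, query = sample_episode(dataset, n_way=5, k_shot=1, n_query=15, seed=0)
print(len(support), len(query))
```

Labels are renumbered from 0 to N-1 inside each episode, so the model has to rely on the support set rather than on memorized class identities.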

ImageNet2012_real

Developed in 2020 by Xiaohua Zhai, Aäron van den Oord, Alexander Kolesnikov, Lucas Beyer and Olivier J. Hénaff, and presented in the paper “Are we done with ImageNet?”. This dataset contains the 50000 validation images of the original ImageNet with new “ReaL” (reassessed) labels. It provides multi-label annotations that are more accurate than the original ImageNet labels.

Dataset Size: 6.25 GiB

Code Snippet:

With TensorFlow (the dataset must be downloaded manually)

import tensorflow_datasets as tfds
imreal = tfds.load('imagenet2012_real')

An implementation of this dataset is given in this GitHub repository.
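
Because each image can carry several valid labels, accuracy has to be scored against a set of labels rather than a single one. The following is a minimal sketch of such a metric; the function name `real_accuracy` and the choice to skip images with an empty label set are assumptions made for this illustration.

```python
def real_accuracy(predictions, real_labels):
    """Accuracy where a prediction counts as correct if it is in the label set."""
    correct = 0
    counted = 0
    for pred, labels in zip(predictions, real_labels):
        if not labels:
            # Assumption: images left without any valid label are skipped.
            continue
        counted += 1
        if pred in labels:
            correct += 1
    return correct / counted if counted else 0.0


# Toy data: one predicted class id per image and a set of accepted labels.
predictions = [3, 7, 1, 4]
real_labels = [{3, 5}, {2}, set(), {4}]
acc = real_accuracy(predictions, real_labels)
print(acc)
```

Here three images are scored and two predictions fall inside their label sets.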

ImageNet2012_subset

This dataset was also developed in 2020, by Ting Chen, Simon Kornblith, Mohammad Norouzi and Geoffrey Hinton. As the name suggests, it is a subset of ImageNet2012, offered in two configurations that contain 1% and 10% of the training set. It is intended for semi-supervised learning algorithms.

1pct configuration (default):

Dataset size: 7.6 GiB

Data is split into 12811 training images and 50000 validation images.

10pct configuration:

Dataset size: 19.91 GiB

Data is split into 128116 training images and 50000 validation images.
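
As a quick sanity check, the training-set sizes of the two configurations match 1% and 10% of the full ImageNet2012 training set, rounded down. The snippet below only verifies the counts. It says nothing about how the images were selected, and the variable names are illustrative.

```python
TRAIN_IMAGES = 1281167  # ImageNet2012 training set size

subset_sizes = {
    "1pct": TRAIN_IMAGES // 100,  # 1% of the training images, rounded down
    "10pct": TRAIN_IMAGES // 10,  # 10% of the training images, rounded down
}
print(subset_sizes)  # {'1pct': 12811, '10pct': 128116}
```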

Code Snippet:

With TensorFlow (the dataset must be downloaded manually)

import tensorflow_datasets as tfds
# This dataset provides 'train' and 'validation' splits; there is no 'test' split.
train, val = tfds.load('imagenet2012_subset', split=['train', 'validation'])

ImageNet_A and ImageNet_O

Developed in 2019 by Dan Hendrycks, Kevin Zhao, Steven Basart, Jacob Steinhardt and Dawn Song, and described in their paper “Natural Adversarial Examples”. These datasets contain images labelled with the original ImageNet labels, drawn from its 1000 classes. They are real-world, unmodified images that ResNet-50 failed to classify correctly. ImageNet-A contains images from the same classes as the original ImageNet, while ImageNet-O contains images from classes not seen in it.

Dataset Size: 650.87 MiB 

Data: 7500 testing images

In the example results, black text shows the actual class and red text shows the class predicted by ResNet-50, together with its confidence score.

Code Snippet:

With TensorFlow 

import tensorflow_datasets as tfds
img_a = tfds.load('imagenet_a')
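
The selection idea behind these datasets, keeping only the images that a fixed classifier gets wrong, can be sketched in a few lines. This is only an illustration of the filtering step: `keep_misclassified` is a hypothetical helper, and the lookup table stands in for a real ResNet-50. It is not the authors' actual pipeline.

```python
def keep_misclassified(examples, predict):
    """Return only the (image, label) pairs that predict() gets wrong."""
    return [(image, label) for image, label in examples if predict(image) != label]


# Toy stand-ins: "images" are strings and the classifier is a lookup table.
fake_predictions = {"img_a": "cat", "img_b": "dog", "img_c": "fox"}
examples = [("img_a", "cat"), ("img_b", "wolf"), ("img_c", "dog")]
hard_examples = keep_misclassified(examples, fake_predictions.get)
print(hard_examples)
```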

ImageNet_R

It was developed in 2020 by Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, Dawn Song, Jacob Steinhardt and Justin Gilmer. Here ‘R’ stands for Rendition, as the dataset provides renditions of 200 ImageNet classes. It contains art, paintings, patterns, DeviantArt, graffiti, embroidery, sketches, tattoos, cartoons, graphics, origami, plastic objects, plush objects, sculptures, toys, and video game renditions of classes from the original ImageNet.

Dataset Size: 2.03 GiB

Data: 30000 images

Code Snippet:

With TensorFlow 

import tensorflow_datasets as tfds
img_r = tfds.load('imagenet_r')

An implementation of the above dataset can be found in this GitHub repository.

ImageNet_Resized

Developed in 2017 by Patryk Chrabaszcz, Ilya Loshchilov and Frank Hutter. This dataset provides downsampled versions of the original ImageNet images, as an alternative to the CIFAR datasets.

The data split is the same as in the original ImageNet.

8×8 downsampled images (default):

Dataset Size: 237.11 MiB

16×16 downsampled images:

Dataset Size: 932.34 MiB

32×32 downsampled images:

Dataset Size: 3.46 GiB

64×64 downsampled images:

Dataset Size: 13.13 GiB

Code Snippet:

With TensorFlow 

import tensorflow_datasets as tfds
img_resize = tfds.load('imagenet_resized')

Conclusion

Several other datasets have been inspired by ImageNet, including ImageNet-V2, Imagenette, Imagewoof and Imagewang. ImageNet has collaborated with PASCAL VOC. ImageNet is under constant development to serve the computer vision community. In 2019, a report identified bias in most images, and ImageNet is working to overcome this bias and other shortcomings. The Tiny ImageNet Visual Recognition Challenge is a project by Stanford that is similar to ILSVRC. The annotation process of ImageNet relies on third parties and crowdsourcing.

Jayita Bhattacharyya

Machine learning and data science enthusiast. Eager to learn new technology advances. A self-taught techie who loves to do cool stuff using technology for fun and worthwhile.