6 AI research papers you can’t afford to miss

Research lies at the heart of innovation. The survival of any domain is predicated on research, and this holds especially true for fledgling fields like artificial intelligence. Inarguably, research is the catalyst pushing the frontiers of complex fields like AI and ML.

To bring you up to speed on the critical ideas driving artificial intelligence, we have handpicked top-drawer research papers from Google Scholar based on their citation counts.

Adam: A Method for Stochastic Optimization (2015) – 100,651 citations

The paper from Diederik P. Kingma and Jimmy Lei Ba details Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is computationally efficient, has little memory requirement, and is invariant to diagonal rescaling of the gradients. Adam is built for problems that are large in terms of data and/or parameters, and is also well suited to non-stationary objectives and to problems with noisy gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Adam works well in practice and compares favourably with other stochastic optimization methods.
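The update rule (Algorithm 1 in the paper) can be sketched in a few lines of NumPy. This is a minimal illustration on a toy quadratic, not the paper's experiments; the learning rate of 0.01 here is chosen for the toy (the paper's suggested default is 0.001).

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m)
    and the squared gradient (v), with bias correction for the zero init."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)   # correct the bias toward zero at early steps
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# minimise f(x) = x^2 from x = 5 as a toy objective
theta, m, v = np.array([5.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
```

Because the effective step size is roughly bounded by the learning rate, the iterate walks steadily toward the minimum regardless of the raw gradient magnitude.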

ImageNet Classification with Deep Convolutional Neural Networks (2012) – 104,283 citations

The researchers Alex Krizhevsky, Ilya Sutskever and Geoffrey E. Hinton trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into 1,000 different classes. The neural network, with 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

DNN architecture from the paper

ImageNet is a dataset of over 15 million high-resolution images across 22,000 categories, collected from the web and labelled by humans using Amazon’s Mechanical Turk crowd-sourcing tool. The study was conducted on two GTX 580 GPUs with cross-GPU parallelisation.

The results showed that a large, deep convolutional neural network can achieve state-of-the-art results on a highly challenging dataset using purely supervised learning. Notably, the network’s performance degrades if even a single convolutional layer is removed.
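The two core building blocks the paper stacks, convolution (followed by a ReLU non-linearity) and max-pooling, can be sketched in plain NumPy. This is a minimal single-channel illustration, not the paper's architecture; real AlexNet layers use many channels and overlapping 3x3, stride-2 pooling rather than the non-overlapping 2x2 pooling shown here.

```python
import numpy as np

def conv2d(x, k):
    """Valid 2-D convolution (strictly, cross-correlation, as in most DL libraries)."""
    h, w = x.shape[0] - k.shape[0] + 1, x.shape[1] - k.shape[1] + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(x[i:i + k.shape[0], j:j + k.shape[1]] * k)
    return out

def max_pool(x, size=2):
    """Non-overlapping max-pooling: keep the largest value in each size x size tile."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

x = np.arange(36, dtype=float).reshape(6, 6)       # a toy 6x6 "image"
feat = np.maximum(conv2d(x, np.ones((3, 3))), 0)   # convolution + ReLU -> 4x4
pooled = max_pool(feat)                            # 4x4 -> 2x2
```

Each conv+pool stage shrinks the spatial map while the learned kernels (here just a box filter for illustration) extract progressively more abstract features.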

Distributed Representations of Words and Phrases and their Compositionality (Word2Vec) (2013) – 32,320 citations

Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado and Jeffrey Dean built on the Skip-gram model, an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper, the researchers presented several extensions that improve both the quality of the vectors and the training speed. By subsampling the frequent words, they obtained a significant speedup and also learned more regular word representations.

Skip-Gram architecture of the Word2Vec algorithm


The paper also proposed negative sampling, a simple alternative to the hierarchical softmax. The Skip-gram model was trained on an internal Google dataset of one billion words.
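Two of the paper's ingredients are easy to reproduce numerically: frequent words are discarded with probability 1 - sqrt(t/f(w)) (equivalently, kept with probability sqrt(t/f(w))), and negative samples are drawn from the unigram distribution raised to the 3/4 power. The word counts below are hypothetical, purely for illustration.

```python
import numpy as np

# hypothetical corpus word counts
counts = {"the": 5000, "learning": 120, "word2vec": 3}
total = sum(counts.values())
freq = {w: c / total for w, c in counts.items()}

t = 1e-3  # subsampling threshold used in the paper
keep_prob = {w: min(1.0, np.sqrt(t / f)) for w, f in freq.items()}
# very frequent words like "the" are aggressively discarded;
# rare words (f < t) are always kept

# negative-sampling distribution: unigram counts raised to the 3/4 power
pow_counts = np.array([c ** 0.75 for c in counts.values()])
neg_dist = pow_counts / pow_counts.sum()
```

Raising counts to the 3/4 power flattens the distribution, so rare words are sampled as negatives more often than their raw frequency would suggest.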

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (2015) – 34,857 citations

Sergey Ioffe and Christian Szegedy’s paper proposed a method to improve the speed and stability of neural-network training by normalising each layer’s inputs over the mini-batch. Deep networks trained with SGD (stochastic gradient descent) achieve state-of-the-art performance, but training is complicated because the distribution of each layer’s inputs shifts as the parameters of the preceding layers change, forcing layers to continuously adapt to a new distribution. The authors call this effect internal covariate shift.

Applied to a state-of-the-art ImageNet classification network, batch normalisation matches the performance of previous methods using only 7% of the training steps.

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (2015) – 40,122 citations

The study from Shaoqing Ren, Kaiming He, Ross Girshick and Jian Sun showed that Faster R-CNN makes region proposals nearly cost-free by sharing full-image convolutional features between the Region Proposal Network (RPN) and the detection network.

Region Proposal Network with examples


The paper presented a unified, deep-learning-based object detection system that runs at near-real-time frame rates. The methodology also improves region proposal quality, and thus the overall object detection accuracy.
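The RPN places a small set of reference boxes ("anchors") at every feature-map location, 3 scales times 3 aspect ratios in the paper, and labels them by intersection-over-union (IoU) with the ground truth. A sketch of anchor generation and IoU, with a hypothetical ground-truth box (the exact anchor width/height convention here is an approximation of the paper's):

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def anchors_at(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """The 9 anchors (3 scales x 3 aspect ratios) placed at one location."""
    boxes = []
    for s in scales:
        for r in ratios:
            w, h = s * np.sqrt(r), s / np.sqrt(r)   # box area stays s^2
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes

boxes = anchors_at(100, 100)
gt = (60.0, 60.0, 140.0, 140.0)          # a hypothetical 80x80 ground-truth box
best = max(boxes, key=lambda b: iou(b, gt))
```

Anchors with high IoU against a ground-truth box become positive training examples for the RPN's objectness classifier; the rest are negatives or ignored.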

Generative Adversarial Networks (GANs) (2014) – 41,545 citations

In this paper, Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville and Yoshua Bengio proposed a framework for estimating generative models by simultaneously training two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than from G. The authors demonstrated the potential of the framework through qualitative and quantitative evaluation of the generated samples.

Samples from generative adversarial nets


The paper showed that GANs are a viable framework for adversarial modelling. This powerful class of neural networks, used for unsupervised learning, takes a game-theoretic approach: the generator and the discriminator play a two-player minimax game, and at equilibrium the generator produces samples the discriminator cannot distinguish from real data.
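The game can be written down directly from the paper's value function: D maximises log D(x) + log(1 - D(G(z))), while G minimises log(1 - D(G(z))). A minimal sketch of the two losses, using hypothetical discriminator logits in place of real networks:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gan_losses(d_logits_real, d_logits_fake):
    """Losses derived from the GAN minimax value function:
    D maximises log D(x) + log(1 - D(G(z))); G minimises log(1 - D(G(z)))."""
    d_real = sigmoid(d_logits_real)   # D's probability that real samples are real
    d_fake = sigmoid(d_logits_fake)   # D's probability that generated samples are real
    d_loss = -np.mean(np.log(d_real) + np.log(1.0 - d_fake))  # D minimises this
    g_loss = np.mean(np.log(1.0 - d_fake))                    # G minimises this
    return d_loss, g_loss

# hypothetical discriminator outputs for a batch of real and generated samples
d_loss, g_loss = gan_losses(np.array([2.0, 1.5]), np.array([-1.5, -2.0]))
```

In practice the paper notes that training G to maximise log D(G(z)) instead gives stronger gradients early in training, but the formulation above is the minimax objective itself.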

Kartik Wali