Most Popular NLP Papers Of 2021

Natural Language Processing includes analysing data to extract and process meaningful information.

Natural Language Processing, or NLP, is the practice of teaching computers to process and comprehend human/natural languages. NLP is a part of data science and involves analysing data to extract, process, and output meaningful information. Some of the important applications of NLP include: 

  • Text mining 
  • Text and sentiment analysis 
  • Text classification 
  • Speech generation 
  • Speech classification 
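To make one of these applications concrete, here is a minimal, purely illustrative sketch of lexicon-based sentiment analysis. The word lists are made up for the example and are far smaller than any real sentiment lexicon:

```python
# Toy lexicon-based sentiment scorer: counts positive vs negative words.
# These word sets are illustrative assumptions, not a published lexicon.
POSITIVE = {"good", "great", "excellent", "happy", "love"}
NEGATIVE = {"bad", "terrible", "poor", "sad", "hate"}

def sentiment(text: str) -> str:
    """Label text as positive, negative, or neutral by word counts."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `sentiment("I love this great product")` returns `"positive"`. Real systems replace the hand-built lexicon with learned classifiers, but the input/output shape of the task is the same.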

In this article, Analytics India Magazine lists the top NLP papers of 2021 that one must read. These papers can help one stay at the top of their NLP game. 


(Note that the list is in no particular order.)

Dynabench: Rethinking Benchmarking in NLP 

This year, researchers from Facebook and Stanford University open-sourced Dynabench, a platform for model benchmarking and dynamic dataset creation. Dynabench runs on the web and supports human-and-model-in-the-loop dataset creation. It addresses the problem that contemporary models quickly achieve high performance on benchmark tasks yet fail on simple examples or in real-world scenarios. Dynabench helps in dataset creation, model development, and model assessment, leading to more robust and informative benchmarks.

Causal Effects of Linguistic Properties 

This paper deals with the problem of estimating the causal effects of linguistic properties from observational data. After addressing the identification challenges involved, it develops a practical method and introduces TextCause, an algorithm to estimate the causal effects of linguistic properties. TextCause leverages distant supervision to improve the quality of noisy proxies, and uses BERT, the pre-trained language model, to adjust for the text. Finally, the paper presents an applied case study to investigate these effects. The paper was presented at NAACL 2021. 

Transformer-based Binary Word Sense Disambiguation 

Released at the second International Conference on NLP and Big Data, this paper treats word sense disambiguation as a classification task and presents a transformer-based model for resolving text ambiguity. Transformers have recently driven improvements across NLP tasks; in this task, the goal is to find the correct meaning of every word in a given text. The paper further shows how using pre-trained transformer models improves the accuracy of the architecture, and its experiments demonstrate how NLP task performance can be improved with the help of data augmentation techniques. 
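To illustrate what the disambiguation task itself looks like (this is not the paper's transformer model, but a simplified Lesk-style gloss-overlap baseline with a made-up two-sense inventory), one can pick the sense whose dictionary gloss shares the most words with the surrounding context:

```python
# Simplified Lesk-style WSD: choose the sense whose gloss overlaps most
# with the target word's context. The sense inventory is an illustrative
# assumption, not a real dictionary.
SENSES = {
    "bank": {
        "financial": "institution that accepts deposits and lends money",
        "river": "sloping land beside a body of water",
    }
}

def disambiguate(word: str, context: str) -> str:
    """Return the sense label with maximal gloss/context word overlap."""
    ctx = set(context.lower().split())
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(ctx & set(gloss.split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense
```

For instance, `disambiguate("bank", "she deposits money at the bank")` picks the financial sense because "deposits" and "money" appear in that gloss. A transformer-based classifier replaces this bag-of-words overlap with contextual embeddings, but it is solving the same selection problem.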

Single Headed Attention RNN: Stop thinking with your head 

Published by Harvard University graduate Stephen Merity, the paper ‘Single Headed Attention RNN: Stop thinking with your head’ introduces a state-of-the-art NLP model called the Single Headed Attention RNN, or SHA-RNN. The author combines an LSTM model with single-headed attention to achieve state-of-the-art byte-level language model results on enwik8.

NLP applied on issue trackers 

The NLP applied on issue trackers paper discusses various NLP techniques, including topic analysis, similarity algorithms (N-grams, Jaccard, the LSI algorithm), descriptive statistics, and others, along with machine learning (ML) algorithms such as support vector machines (SVM) and decision trees. These techniques are typically used to better understand the characteristics, classification, and lexical relations of development tasks, and to predict duplicates among them. By tuning the different features used to predict duplicate development tasks with a fidelity loss function, a system can identify duplicate tasks with almost 100 percent accuracy. 
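As a minimal sketch of the similarity-based duplicate detection mentioned above, here is Jaccard similarity computed over character n-grams of two issue titles (a common lightweight variant; the paper's exact feature set may differ):

```python
def ngrams(text: str, n: int = 2) -> set:
    """Character-level n-grams of a lowercased string."""
    t = text.lower()
    return {t[i:i + n] for i in range(len(t) - n + 1)}

def jaccard(a: str, b: str, n: int = 2) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of the two texts' n-gram sets."""
    A, B = ngrams(a, n), ngrams(b, n)
    if not A and not B:
        return 1.0  # two empty strings are trivially identical
    return len(A & B) / len(A | B)
```

Two issue titles with a Jaccard score above some tuned threshold (say 0.7) would be flagged as likely duplicates; identical titles score 1.0 and completely disjoint ones score 0.0.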

Attention in Natural Language Processing

Attention is a popular mechanism in neural architectures and has been realised in various forms. However, owing to the fast-paced advances in this domain, a systematic overview of attention is still missing. This paper defines a unified model for attention architectures in NLP, focusing on those designed to work with vector representations of textual data. The authors propose a taxonomy of attention models along four dimensions: 

  • Representation of input 
  • Compatibility function 
  • Distribution function 
  • Multiplicity of the input and output 

Additionally, the paper shows how prior information can be exploited in attention models, discusses ongoing research efforts and open challenges, and provides an extensive categorisation of the large body of literature. 
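The four dimensions above map directly onto the pieces of a concrete attention computation. As a minimal sketch (scaled dot-product attention in plain Python, one of many instantiations the taxonomy covers): the query/key/value vectors are the input representation, the scaled dot product is the compatibility function, and softmax is the distribution function:

```python
import math

def softmax(xs):
    """Distribution function: turn compatibility scores into weights summing to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention: one query vector over key/value lists."""
    d = len(query)
    # Compatibility function: scaled dot product of the query with each key.
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # Output: weighted sum of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]
```

When the query closely matches one key, nearly all the weight falls on that key's value, so the output approximates it. Swapping the dot product for an additive or multiplicative score, or softmax for a sparse distribution, yields the other model families in the taxonomy.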

Debolina Biswas
