Top Papers Presented At EMNLP 2021

The Best Paper Awards are intended to recognise the best papers presented at EMNLP 2021.

The award-winning papers were presented at the Conference on Empirical Methods in Natural Language Processing (EMNLP 2021). For main conference papers, the organisers made awards in three categories: best long paper, best short paper, and outstanding papers. A best demo paper was also recognised.

Best Long Paper:

Visually Grounded Reasoning across Languages and Cultures

The researchers develop a novel annotation protocol in which the selection of images and descriptions is driven entirely by native speakers, and use it to create the MaRVL (Multicultural Reasoning over Vision and Language) dataset. They also build and evaluate a set of multilingual and multimodal baselines, including model-based and translation-based transfer. They find that these baselines sometimes perform only slightly above chance, and that performance is significantly harmed by the out-of-distribution nature of MaRVL's concepts, images, and languages relative to English datasets. This suggests that MaRVL gives a more accurate estimate of how suitable state-of-the-art models are for real-world applications beyond a narrow linguistic and cultural domain.


For further information, refer to the article.

Best Short Paper:  

CHoRaL: Collecting Humor Reaction Labels from Millions of Social Media Users 

The researchers propose the CHoRaL framework for collecting humour reaction labels automatically, along with a dataset of 785K posts with and without humour scores. They also analyse the humour expressions in their dataset and develop models that detect humour on par with human labellers. CHoRaL makes it possible to build humour detection models for any topic, and the dataset could support broader applications, such as separating malicious disinformation posts from benign humorous ones. CHoRaL can also be used to classify other human emotions, such as anger and melancholy.

For further information, refer to the article.

Outstanding Papers:

MindCraft: Theory of Mind Modeling for Situated Dialogue in Collaborative Tasks

This work simulates disparities between collaborators in a virtual environment by introducing a novel dataset and experimental framework for in-depth study of theory of mind modelling for situated dialogue in collaborative tasks. The initial experiments yield several intriguing results that should help in developing computational models for a variety of problems. The baseline findings highlight the importance of interaction discourse and visual experience in predicting mutual belief states, both about the task at hand and about a collaborative partner, in order to establish common ground. The researchers hope that their work will contribute to further advances in areas such as agent planning and decision-making.

For further information, refer to the article.

SituatedQA: Incorporating Extra-Linguistic Contexts into QA

The researchers present the first study to examine the effect of extra-linguistic contexts on open-retrieval question answering (QA). The study shows that existing systems fail to adapt to shifts in temporal or geographical context. To address this, the researchers define tasks and build a dataset for training and evaluating QA systems that model how facts vary across contexts. The dataset opens up ample opportunities for future research into models that can gracefully update their predictions as temporal and geographical contexts change. Future work may focus on incorporating temporally and geographically dependent source documents, such as news articles, or on other extra-linguistic contexts, such as who is asking the question, taking individual preferences into account.
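To make the idea of context-dependent answers concrete, here is a minimal sketch in Python. It is an illustration only, not the SituatedQA data format or the authors' code: the `Context` class, the `ANSWERS` table, the `answer` function, and the example questions are all hypothetical and are not drawn from the dataset.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Context:
    """Hypothetical extra-linguistic context attached to a question."""
    year: Optional[int] = None       # temporal context
    location: Optional[str] = None   # geographical context


# Toy lookup table (illustrative entries, not taken from SituatedQA):
# the same question maps to different answers under different contexts.
ANSWERS: Dict[Tuple[str, Context], str] = {
    ("Who is the president of the United States?", Context(year=2015)): "Barack Obama",
    ("Who is the president of the United States?", Context(year=2022)): "Joe Biden",
    ("What is the capital city?", Context(location="France")): "Paris",
    ("What is the capital city?", Context(location="Japan")): "Tokyo",
}


def answer(question: str, context: Context) -> Optional[str]:
    """Return the answer for this question under this context, if known."""
    return ANSWERS.get((question, context))
```

A system that ignores the context argument can return only one answer per question, which is the failure mode the study describes.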

For further information, refer to the article.

When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute

The paper describes a highly efficient architecture that combines fast recurrence with attention and evaluates it on several language modelling datasets. The researchers show that using fast RNNs with little attention not only achieves better results but also dramatically reduces training cost. The work is conceptually distinct from efforts to accelerate attention, and so offers a counter-intuitive route to improving state-of-the-art model design. The researchers believe the model can be improved further with stronger attention or recurrent implementations, as well as better normalisation or optimisation techniques.
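As a rough intuition for why recurrence can be fast, the sketch below shows a simplified elementwise gated recurrence in plain Python. It is an assumption-laden illustration, not the paper's exact formulation: the function names and the gating equation are chosen for this example. The point is that the expensive projections producing `candidates` and `gate_logits` can be computed for all time steps at once, so only a cheap per-dimension update has to run sequentially.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def elementwise_recurrence(
    candidates: List[List[float]],
    gate_logits: List[List[float]],
) -> List[List[float]]:
    """Apply c_t = f_t * c_{t-1} + (1 - f_t) * u_t independently per dimension.

    candidates[t] is u_t and gate_logits[t] gives f_t after a sigmoid.
    Both are assumed to be precomputed for every time step (hypothetical
    inputs for this sketch), so the loop contains no matrix products.
    """
    dim = len(candidates[0])
    state = [0.0] * dim  # c_0 starts at zero
    outputs = []
    for u_t, g_t in zip(candidates, gate_logits):
        forget = [sigmoid(g) for g in g_t]
        # Each dimension is updated on its own; no mixing across dimensions.
        state = [f * c + (1.0 - f) * u for f, c, u in zip(forget, state, u_t)]
        outputs.append(state)
    return outputs
```

For example, `elementwise_recurrence([[1.0], [1.0]], [[0.0], [0.0]])` moves the state halfway towards the candidate at each step.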

For further information, refer to the article.

Shortcutted Commonsense: Data Spuriousness in Deep Learning of Commonsense Reasoning

Commonsense is a quintessential human capacity and a long-standing challenge for AI, yet deep learning models have recently reported strong results on commonsense reasoning tasks. The purpose of this paper is to determine how much of these gains is due to generalisation beyond the training datasets used for these tasks, and whether the results are susceptible to the data spuriousness problem recently discovered in some other areas of natural language processing. Among the tasks studied, CommonsenseQA (CSQA) shows the best generalisation capacity, as it transfers far better to the other tasks in a zero-shot setting. These findings indicate the need for further research that thoroughly examines and compares methods and procedures.

For further information, refer to the article.

Best Demo Paper:

Datasets: A Community Library for Natural Language Processing 

Hugging Face Datasets is a community-driven open-source library that standardises the processing, distribution, and documentation of natural language processing datasets. The core library is designed to be simple to use, to be fast, and to support datasets of varied sizes through the same interface. With over 650 datasets contributed by over 250 contributors, it simplifies the use of standard datasets, enables new use cases for cross-dataset NLP, and includes advanced features for indexing and streaming large datasets.

For further information, refer to the article.

Dr. Nivash Jeevanandam
Nivash holds a doctorate in information technology and has been a research associate at a university and a development engineer in the IT industry. Data science and machine learning excite him.
