
Google Introduces Two New Datasets For Improved Conversational NLP

Google’s published study investigates pre-trained language models for their temporal reasoning capabilities in dialogs using TimeDial and Disfl-QA.


Conversational agents are dialogue systems that use NLP to respond to queries in human language. They leverage advanced deep learning and natural language understanding to move beyond simple chatbot responses toward genuinely contextual conversation. Conversational AI encompasses three main areas of artificial intelligence research — automatic speech recognition (ASR), natural language processing (NLP), and text-to-speech (TTS, or speech synthesis). These dialogue systems read from an input channel and reply with a relevant response via an output channel, whether as graphics, speech, or haptic-assisted physical gestures.

Modern conversational models often struggle when confronted with temporal relationships or disfluencies. The capability of temporal reasoning in dialogs in massive pre-trained language models like T5 and GPT-3 is still largely under-explored, and progress on improving their performance has been slow, in part because of the lack of datasets that capture these conversational and speech phenomena. To address this gap, Google has introduced two new datasets for conversational NLP.

Google’s solution

Google’s published study investigates pre-trained language models for their temporal reasoning capabilities in dialogs using TimeDial and Disfl-QA. These target temporal commonsense reasoning in dialogs and understanding contextual disfluencies, respectively. Both are benchmark datasets that demonstrate the gap between human performance and current state-of-the-art NLP models.

TimeDial dataset

TimeDial makes it easier for conversational agents to reason about temporal aspects of a dialog, such as duration, frequency, or the relative ordering of events. Current NLP models tend to choose poorly when tasked with fill-in-the-blank questions that demand basic temporal reasoning or an understanding of temporal concepts. TimeDial introduces a multiple-choice span-filling task targeted at temporal understanding.

Consider, for instance, the conversation shown on the Google AI Blog.

Credit: Google AI Blog

Answering correctly requires the model to understand the temporal relationships between events — for example, that half-past one comes before three o’clock, and half-past three comes after both. It also demands world knowledge to determine that the individual is not yet late for the meeting. Yet current models like T5 and BERT end up picking the wrong answers.

To address this problem, Google’s TimeDial is a benchmark dataset that measures a model’s temporal commonsense reasoning abilities within the context of dialogue via a four-option multiple-choice setup.
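The task format can be sketched in plain Python. This is a hypothetical instance: the dialogue text, the field names, and the assumption that more than one fill can be acceptable are all illustrative, not the official TimeDial schema.

```python
# A TimeDial-style instance: a dialogue with a masked temporal span and
# four candidate fills. Field names and contents are illustrative only,
# not the official schema; here we assume two of the four fills are
# acceptable, which may differ from the real annotation scheme.
instance = {
    "dialogue": [
        "A: May we see the wine list, please?",
        "B: Sure. I will need to see your ID first, though.",
        "A: Here you go. I turned <MASK> last month.",
    ],
    "options": ["21", "30", "4", "18 months"],
    "correct": {"21", "30"},  # temporally plausible fills (illustrative)
}

def evaluate(choice: str, instance: dict) -> bool:
    """A prediction counts as correct if it picks any acceptable option."""
    return choice in instance["correct"]
```

The point of the setup is that distractors like "4" or "18 months" are lexically valid span fills but temporally nonsensical in context, which is exactly what trips up models relying on surface cues.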

Google ran experiments across three modelling paradigms:

  • classification over the four provided options using BERT
  • mask filling for the masked span in the dialogue using BERT-MLM
  • generative answering using T5

A quantitative error analysis concluded that the pre-trained language models could not truly reason over the context. Instead, they often rely on shallow, spurious features such as text matching. This calls for new ways of representing temporal objects in general text representations.
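The kind of shallow cue the analysis points to can be illustrated with a toy scorer that ranks options purely by lexical overlap with the dialogue. This is a deliberately naive heuristic for illustration, not any of the models Google tested.

```python
def overlap_score(option: str, context: str) -> int:
    """Count how many of the option's tokens already appear in the context.
    A model leaning on this cue favours options that echo nearby text,
    regardless of whether they are temporally sensible."""
    context_tokens = set(context.lower().split())
    return sum(tok in context_tokens for tok in option.lower().split())

# Illustrative context: the meeting starts at three o'clock, so a sensible
# departure time must come well before that.
context = "the meeting starts at three o'clock so we should leave before then"
options = ["half past one", "three o'clock", "half past three", "ten hours"]

# Pure text matching picks "three o'clock" because it echoes the context,
# even though it is not a sensible answer for a departure time.
best = max(options, key=lambda o: overlap_score(o, context))
```

A model that has genuinely learned temporal ordering would instead prefer "half past one", which shares no tokens with the context at all — which is why text-matching cues and temporal reasoning pull in opposite directions here.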

The dataset is publicly available at: https://github.com/google-research-datasets/timedial.

Disfl-QA dataset

Disfluency occurs in the text output generated by speech recognition systems. Therefore, it is essential to study this disfluent text to build conversational agents that understand human speech. But research in NLP faces two hurdles: 

  • The lack of curated datasets containing such disfluencies obstructs deeper research and model innovation.
  • The available datasets are limited in scale and complexity.

These create a challenge for researchers to conduct a stress test of the NLP models. 

Google has claimed Disfl-QA to be the first dataset containing contextual disfluencies in an information-seeking setting. It is a targeted disfluency dataset comprising around 12,000 questions that contain these sentence complications.

In Disfl-QA, close to 90 percent of the disfluencies are corrections or restarts, which makes it a tough test for disfluency correction. In addition, it has a broader scope of semantic distractors — i.e., distractors that carry semantic meaning, rather than simpler speech disfluencies.

Google demonstrated this with the help of an example. 

Credit: Google AI Blog

In this example, Q1 asks about the location of Normandy. In the disfluent version (DQ1), however, ‘Norse’ is mentioned before the question is corrected. This correctional disfluency confuses the QA model because it relies on shallow textual cues to answer the question.
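A naive baseline for the correction pattern in DQ1 can be sketched as a regex that drops the false start before an explicit repair cue such as "no wait" or "I mean". This is a toy heuristic only: real disfluency correction is far harder, and many Disfl-QA distractors carry no such explicit cue.

```python
import re

# Repair cues that often introduce a correction, as in
# "Where is Norse, no wait, where is Normandy located?"
# (cue list is illustrative, not exhaustive).
REPAIR_CUES = r"(?:no wait|I mean|or rather|sorry)"

# Drop everything from the start of the utterance up to and including the
# cue, keeping only the repaired question. A toy heuristic, not a model.
_pattern = re.compile(rf"^.*?\b{REPAIR_CUES}\b[,\s]*", flags=re.IGNORECASE)

def naive_repair(question: str) -> str:
    """Return the question with any leading false start stripped."""
    return _pattern.sub("", question).strip()
```

For example, `naive_repair("Where is Norse, no wait, where is Normandy located?")` keeps only the repaired clause, while fluent questions pass through unchanged — but a semantic distractor with no cue word would defeat this heuristic entirely, which is the gap Disfl-QA is designed to expose.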

According to their experimental results, the performance of existing language models was unsatisfactory when tested on Disfl-QA. Data augmentation methods can partially recover this loss in performance. The researchers also found that large-scale disfluency datasets are needed for NLP models to become robust to disfluencies.
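The augmentation idea can be illustrated by heuristically injecting a correction-style disfluency into a fluent question. This is a simplistic sketch under my own assumptions; the study's actual augmentation methods are more involved.

```python
def inject_correction(question: str, distractor: str,
                      cue: str = "no wait,") -> str:
    """Build a disfluent variant of a fluent question by inserting a false
    start (the question's opening words plus a misleading `distractor`)
    followed by a repair cue. A toy augmentation heuristic, not the
    method used in the study."""
    words = question.split()
    false_start = " ".join(words[:2])  # e.g. "Where is"
    # Lower-case the original first word so the repaired clause reads naturally.
    rest = question[0].lower() + question[1:]
    return f"{false_start} {distractor}, {cue} {rest}"
```

For example, `inject_correction("Where is Normandy located?", "Norse")` yields a DQ1-style disfluent question, letting a QA model be fine-tuned on disfluent variants of fluent training questions.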

The dataset is publicly available at: https://github.com/google-research-datasets/disfl-qa.


Avi Gopani

Avi Gopani is a technology journalist that seeks to analyse industry trends and developments from an interdisciplinary perspective at Analytics India Magazine. Her articles chronicle cultural, political and social stories that are curated with a focus on the evolving technologies of artificial intelligence and data analytics.