
How To Establish Reasoning In NLP Models


Search engines usually employ one of two techniques to surface the most appropriate results: a retrieve-and-read question answering (QA) approach, or a knowledge-base QA (KB-QA) approach. Both, however, share a limitation: multi-hop question answering. In short, the challenge lies in answering questions where the system has to combine information from multiple documents. 

To address this limitation, researchers at Carnegie Mellon University have used a text corpus as a virtual knowledge base (KB). They have developed a neural module that traverses the textual data as if it were a KB.

For example, if we search the internet for the size of the COVID-19 virus, a system that tries to answer this question first needs to identify the virus responsible for COVID-19 and then look up the size of that virus.
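This two-hop lookup can be illustrated with a toy sketch. The facts, relation names, and `follow` helper below are invented purely for illustration; they are not part of the CMU system:

```python
# Toy two-hop question answering over a hand-written fact store.
facts = {
    ("COVID-19", "caused_by"): "SARS-CoV-2",
    ("SARS-CoV-2", "diameter"): "60-140 nm",
}

def follow(entity, relation):
    """Follow one relation edge from an entity (a single 'hop')."""
    return facts.get((entity, relation))

# "What is the size of the virus that causes COVID-19?"
# Hop 1: identify the virus; Hop 2: look up its size.
virus = follow("COVID-19", "caused_by")   # -> "SARS-CoV-2"
answer = follow(virus, "diameter")        # -> "60-140 nm"
print(answer)
```

The multi-hop challenge is that a real system has no such tidy fact store: each hop has to be answered from raw text scattered across documents.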

Overview Of QA Systems

Question answering (QA) falls into two categories:

  1. Retrieve + Read systems, which take the documents returned by a standard search engine and run a deep neural network over them to find text relevant to the question. 
  2. Knowledge-Based QA systems, which first convert the documents into a knowledge base (KB) by running information extraction pipelines. 

However, in the case of retrieve-and-read systems, the search engine may not retrieve all the documents relevant to a multi-hop question.

Motivated by these limitations, the researchers explored whether a text corpus can be treated directly as a knowledge base for answering multi-hop questions. A KB, by contrast, can answer multi-hop questions efficiently. However, constructing KBs is an expensive, time-consuming, and error-prone process. For instance, new information (such as about COVID-19) first appears on the web in textual form, and adding it to a KB takes time.  

And since not all information can be expressed using the predefined set of relations a KB allows, KBs are usually incomplete.

Establishing Reasoning

via CMU blog

First, the researchers converted the corpus into a graph structure similar to a KB while keeping the documents in their original form. They then defined a fast, differentiable operation for traversing the edges of the graph. When a question is asked, the system decomposes it into a sequence of relations that tell the traversal which edges to follow to reach the answer. 

This design, claim the researchers, ensures that the traversal operation is differentiable, thereby allowing end-to-end training of the entire system. 

The fundamental difference between this graph and a KB is that the documents are left in their original form instead of being encoded into a fixed set of relations.

Hence, this graph structure is cheaper to construct and remains as interpretable as the original text while still incorporating structure. The researchers have also introduced a procedure that uses this graph to answer questions by moving back and forth between entities and their mentions multiple times.

The key idea behind this approach is that QA can be formulated as a path-finding problem in the graph.
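One way to picture this path-finding is a single traversal "hop" expressed entirely as matrix operations, which is what makes it differentiable. The tiny entity/mention graph and the mention relevance scores below are invented for this sketch and do not come from the paper's data:

```python
import numpy as np

# Entities: 0 = COVID-19, 1 = SARS-CoV-2, 2 = "60-140 nm"
# Mentions: 0 = "SARS-CoV-2" appearing in a passage about COVID-19,
#           1 = "60-140 nm" appearing in a passage about SARS-CoV-2
A = np.array([[1., 0.],        # entity -> mentions co-occurring with it
              [0., 1.],
              [0., 0.]])
B = np.array([[0., 1., 0.],    # mention -> the entity it refers to
              [0., 0., 1.]])

def hop(x, mention_scores):
    """One hop: spread entity mass to co-occurring mentions, weight each
    mention by its relevance to the question's relation, map back to
    entities. Every step is a matrix product, so gradients flow through."""
    return ((x @ A) * mention_scores) @ B

x = np.array([1., 0., 0.])                   # start: all mass on COVID-19
x = hop(x, np.array([0.9, 0.1]))             # hop 1: "caused by"
x = hop(x / x.sum(), np.array([0.1, 0.9]))   # hop 2: "size of"
print(x.argmax())                            # index 2: the "60-140 nm" entity
```

Because the whole hop is soft (a weighted distribution over entities rather than a hard lookup), the traversal can be trained end to end with the rest of the network.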

via CMU blog

The mention embeddings are obtained by passing the entire passage through a BERT model fine-tuned on simple (single-hop) questions and extracting the representations corresponding to the mention's tokens.
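A minimal sketch of that extraction step follows. Random vectors stand in for BERT's contextual token representations, and mean-pooling over the span is used here for brevity; the paper's exact pooling of the mention's tokens may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
passage_tokens = ["SARS", "-", "CoV", "-", "2", "measures",
                  "60", "-", "140", "nm"]
# Stand-in for the (num_tokens, hidden_dim) output of a fine-tuned BERT.
token_reps = rng.standard_normal((len(passage_tokens), 8))

def mention_embedding(token_reps, start, end):
    """Pool the contextual representations of the mention's tokens
    into a single fixed-size vector for that mention."""
    return token_reps[start:end].mean(axis=0)

emb = mention_embedding(token_reps, 0, 5)   # span covering "SARS-CoV-2"
print(emb.shape)                            # (8,)
```

Each mention in the corpus gets one such vector, so traversal can score mentions against the question by simple inner products.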

The researchers have also introduced an efficient implementation of the traversal operation that scales to all of Wikipedia, answering in milliseconds.
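A key ingredient of that efficiency is keeping the per-hop vectors sparse: only the top-k highest-scoring mentions are retained at each hop, so the matrix operations never touch most of the corpus. The helper and numbers below are a toy sketch of that pruning idea, not the paper's implementation:

```python
import numpy as np

def top_k_sparsify(scores, k):
    """Zero out everything except the k largest entries, keeping the
    score vector sparse for the next traversal hop."""
    idx = np.argpartition(scores, -k)[-k:]   # indices of the k largest
    out = np.zeros_like(scores)
    out[idx] = scores[idx]
    return out

scores = np.array([0.05, 0.7, 0.01, 0.6, 0.2])
sparse = top_k_sparsify(scores, 2)
print(sparse)   # only the 0.7 and 0.6 entries survive
```

Combined with sparse matrix representations of the entity-mention graph, this keeps each hop's cost proportional to k rather than to the corpus size.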

Key Takeaways

In this work, the researchers have tried to bring human-like reasoning into language models, so that asking a machine a question yields an appropriate response no matter how layered the question is. The key takeaway from this work:

  • The approach delivers a 10-100x increase in queries per second over baseline approaches.

Know more about this work here.


Ram Sagar

I have a master's degree in Robotics and I write about machine learning advancements.