5 Research Papers on Computational Linguistics For Your Reading List

As one of the premier institutes for technology, the Massachusetts Institute of Technology (MIT) has produced prominent research that has resulted in many ground-breaking technological advancements.

In this article, we take a look at the top five recent research papers on Computational Linguistics from the institute.

1) Learning an Executable Neural Semantic Parser

Authors: Jianpeng Cheng, Siva Reddy, Vijay Saraswat, and Mirella Lapata

Abstract: This article describes a neural semantic parser that maps natural language utterances onto logical forms that can be executed against a task-specific environment, such as a knowledge base or a database, to produce a response. The parser generates tree-structured logical forms with a transition-based approach, combining a generic tree-generation algorithm with a domain-general grammar defined by the logical language.

Research methodology: To tackle mismatches between natural language and logical form tokens, various attention mechanisms were explored. The researchers also considered different training settings for the neural semantic parser: fully supervised training, where annotated logical forms were given; weakly supervised training, where denotations were provided; and distant supervision, where only unlabeled sentences and a knowledge base were available.
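To make the idea of transition-based generation of tree-structured logical forms more concrete, here is a minimal sketch. It is not the authors' parser: the action names (`NT`, `TER`, `RED`), the helper functions, and the toy logical form are assumptions chosen for illustration, and a real neural parser would predict each action with a learned model rather than read it from a list.

```python
def build_logical_form(actions):
    """Replay a sequence of transitions and return the resulting tree.

    Hypothetical action set (an assumption for this sketch):
      ("NT", name)   open a new subtree headed by a non-terminal
      ("TER", name)  attach a terminal to the currently open subtree
      ("RED", None)  close the current subtree and attach it to its parent
    """
    stack = [[]]  # stack[0] is a container that will hold the finished root
    for op, arg in actions:
        if op == "NT":
            stack.append([arg])
        elif op == "TER":
            stack[-1].append(arg)
        elif op == "RED":
            if len(stack) == 1:
                raise ValueError("RED with no open subtree")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            raise ValueError(f"unknown action: {op}")
    if len(stack) != 1 or len(stack[0]) != 1:
        raise ValueError("action sequence does not yield a single closed tree")
    return stack[0][0]


def to_string(tree):
    """Render a tree as head(arg1, arg2, ...)."""
    if isinstance(tree, str):
        return tree
    head, *children = tree
    return head + "(" + ", ".join(to_string(c) for c in children) + ")"


# Toy action sequence for an utterance such as "what is the capital of Texas?"
actions = [
    ("NT", "answer"),
    ("NT", "capital_of"),
    ("TER", "texas"),
    ("RED", None),
    ("RED", None),
]
logical_form = to_string(build_logical_form(actions))
print(logical_form)
```

Because every tree is produced by the same three generic actions, only the inventory of non-terminals and terminals needs to change from one logical language to another.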

2) Unsupervised Compositionality Prediction of Nominal Compounds

Authors: Silvio Cordeiro, Aline Villavicencio, Marco Idiart and Carlos Ramisch

Abstract: Nominal compounds such as red wine and nut case display a continuum of compositionality, with varying contributions from the components of the compound to its semantics. This article proposes a framework for compound compositionality prediction using distributional semantic models, evaluating to what extent they capture idiomaticity compared to human judgments.

Research methodology: For evaluation, the researchers introduced data sets containing human judgments in three languages: English, French, and Portuguese. The results obtained reveal a high agreement between the models and human predictions, suggesting that they were able to incorporate information about idiomaticity.
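One common way to turn a distributional semantic model into a compositionality score is to compare the vector of the whole compound with a vector composed from its parts. The sketch below assumes additive composition and cosine similarity, and uses invented three-dimensional toy vectors. None of the numbers or function names come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def normalize(u):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(a * a for a in u))
    return [a / norm for a in u]


def compositionality(compound_vec, head_vec, modifier_vec):
    """Score how close the compound is to the sum of its normalized parts."""
    composed = [h + m for h, m in zip(normalize(head_vec), normalize(modifier_vec))]
    return cosine(compound_vec, composed)


# Invented toy vectors (assumption): "red_wine" lies near its parts,
# while "nut_case" points in a mostly unrelated direction.
vectors = {
    "red": [1.0, 0.0, 0.0],
    "wine": [0.0, 1.0, 0.0],
    "red_wine": [0.6, 0.8, 0.0],
    "nut": [1.0, 0.0, 0.0],
    "case": [0.0, 1.0, 0.0],
    "nut_case": [0.1, 0.1, 1.0],
}

red_wine_score = compositionality(vectors["red_wine"], vectors["wine"], vectors["red"])
nut_case_score = compositionality(vectors["nut_case"], vectors["case"], vectors["nut"])
print(round(red_wine_score, 3), round(nut_case_score, 3))
```

In an evaluation like the one described above, scores of this kind would be computed for every compound in a data set and then compared against the human compositionality judgments, for example with a rank correlation.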

3) Automatic Inference of Sound Correspondence Patterns across Multiple Languages

Authors: Johann-Mattis List

Abstract: The researcher presented an automatic method for the inference of sound correspondence patterns across multiple languages based on a network approach. The core idea was to represent all columns in aligned cognate sets as nodes in a network with edges representing the degree of compatibility between the nodes.

Research methodology: The task of inferring all compatible correspondence sets can then be handled as the well-known minimum clique cover problem in graph theory, which seeks to split the graph into the smallest number of cliques such that each node belongs to exactly one clique. The resulting partitions represent all correspondence patterns that can be inferred for a given data set. By excluding those patterns that occur in only a few cognate sets, the core of regularly recurring sound correspondences can be inferred. Based on this idea, the article presents a method for automatic correspondence pattern recognition, implemented as part of a Python library that supplements the article.
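The clique-cover idea can be illustrated with a small greedy heuristic. This is a sketch under stated assumptions, not the library that accompanies the article: the compatibility rule (two columns are compatible when they never disagree in a language where both have a sound), the function names, and the toy columns are all chosen for illustration. Minimum clique cover is NP-hard, so a greedy pass like this yields a valid cover but not necessarily the smallest one.

```python
def compatible(a, b):
    """True if the two columns never disagree in a language both attest."""
    return all(a[lang] == b[lang] for lang in a.keys() & b.keys())


def greedy_clique_cover(columns):
    """Group alignment columns into mutually compatible patterns.

    `columns` maps a column id to {language: sound}; missing reflexes are
    simply left out. Returns a list of (merged_pattern, member_ids).
    """
    patterns = []
    # Process the most fully attested columns first.
    for col_id, col in sorted(columns.items(), key=lambda kv: -len(kv[1])):
        for merged, members in patterns:
            if compatible(merged, col):
                merged.update(col)       # fill gaps in the merged pattern
                members.append(col_id)
                break
        else:
            patterns.append((dict(col), [col_id]))
    return patterns


# Toy alignment columns (illustrative only).
columns = {
    "c1": {"German": "d", "English": "th", "Dutch": "d"},
    "c2": {"German": "d", "English": "th"},
    "c3": {"German": "t", "English": "d", "Dutch": "d"},
    "c4": {"English": "d", "Dutch": "d"},
}

patterns = greedy_clique_cover(columns)
groups = [members for _, members in patterns]

# Keep only patterns supported by at least `min_size` columns.
min_size = 2
regular = [(p, m) for p, m in patterns if len(m) >= min_size]
print(groups)
```

Checking each new column against the merged pattern, rather than against every member separately, is enough here: a column that agrees with the merged pattern cannot conflict with any column already folded into it.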

4) A Sequential Matching Framework for Multi-Turn Response Selection in Retrieval-Based Chatbots

Authors: Yu Wu, Wei Wu, Chen Xing, Can Xu, Zhoujun Li, and Ming Zhou

Abstract: The researchers studied the problem of response selection for multi-turn conversation in retrieval-based chatbots. The task involves matching a response candidate with a conversation context. Its challenges include recognizing the important parts of the context and modeling the relationships among the utterances in the context.

Research methodology: Using a new matching framework called the sequential matching framework (SMF), the researchers proposed a sequential convolutional network and a sequential attention network, and conducted experiments on two public data sets to test their performance. The experimental results show that both models significantly outperform state-of-the-art matching methods. The researchers also show that the models are interpretable, with visualisations that provide insights into how they capture and leverage important information in contexts for matching.

5) Parsing Chinese Sentences with Grammatical Relations

Authors: Weiwei Sun, Yufei Chen, Xiaojun Wan and Meichun Liu

Abstract: The research represents grammatical information using general directed dependency graphs. Both only-local and rich long-distance dependencies are explicitly represented.

Research methodology: To create high-quality annotations, the researchers took advantage of an existing treebank, the Chinese TreeBank (CTB), which is grounded in Government and Binding theory. Two key problems addressed by the researchers are (a) how to decompose a complex graph into simple subgraphs, and (b) how to combine subgraphs into a coherent complex graph. For transition-based parsing, the researchers introduced a neural parser based on a list-based transition system. They also discussed several other key problems, including dynamic oracles and beam search for neural transition-based parsing. The evaluation gauged how successful grammatical relation (GR) parsing for Chinese can be when data-driven models are applied. The empirical analysis suggests several directions for future study.


Akshaya Asokan
Akshaya Asokan works as a Technology Journalist at Analytics India Magazine. She has previously worked with IDG Media and The New Indian Express. When not writing, she can be seen either reading or staring at a flower.
