
Alibaba Launches LLM-R2 to Optimise SQL Query Efficiency

“It consistently outperforms baseline methods by reducing the execution time to 52.5%, 56.0%, and 39.8% on average across TPC-H, IMDB, and DSB datasets respectively, compared to the original queries”



Researchers at Nanyang Technological University, Singapore University of Technology and Design, and Alibaba's DAMO Academy recently introduced LLM-R2, a rule-based query rewrite system enhanced with an LLM, to significantly boost SQL query efficiency.

The core objective of query rewrite is to transform an SQL query into a new form that returns the same results while executing more efficiently. A valid rewrite must satisfy three criteria: executability, equivalence, and efficiency. Traditional query rewrite systems rely heavily on predefined rules and are often constrained by the computational limitations and inaccuracies of DBMS cost estimators.
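As a minimal illustration of what an equivalence-preserving rewrite looks like (a toy example, not one from the paper; the tables and the chosen rule are invented for the sketch), the script below rewrites an IN-subquery as a join and checks that both forms return identical results:

```python
import sqlite3

# Toy schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers(id INTEGER, region TEXT);
CREATE TABLE orders(id INTEGER, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'EU'), (2, 'US');
INSERT INTO orders VALUES (10, 1, 99.0), (11, 2, 45.0), (12, 1, 12.5);
""")

# Original query: membership test via an IN-subquery.
original = """
SELECT id, total FROM orders
WHERE customer_id IN (SELECT id FROM customers WHERE region = 'EU')
"""

# Rewritten query: the same predicate expressed as a join, a form
# many optimisers execute more efficiently.
rewritten = """
SELECT o.id, o.total FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE c.region = 'EU'
"""

# Equivalence check: both forms must produce the same rows.
assert sorted(conn.execute(original)) == sorted(conn.execute(rewritten))
```

The assertion is the "equivalence" criterion in miniature; a production rewriter proves it from the rule's semantics rather than by executing both queries.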

LLM-R2 addresses these challenges by integrating an LLM to suggest optimal rewrite rules for SQL queries, which are then applied by an existing database platform. This division of labour lets the rewritten queries retain their executability and accuracy while improving efficiency.
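That division of labour can be sketched as follows. Everything here is a hypothetical stand-in: the function names, the heuristic replacing the LLM call, and the rule names (which merely resemble those of a rule-based engine such as Apache Calcite) are invented for illustration, not the authors' actual interface:

```python
# Illustrative rule catalogue; names are invented for this sketch.
CANDIDATE_RULES = ["FILTER_INTO_JOIN", "SUB_QUERY_REMOVE", "PROJECT_MERGE"]

def llm_suggest_rules(query: str) -> list[str]:
    """Stand-in for prompting an LLM with the query plus demonstrations.
    A real call would return rule names chosen by the model."""
    q = query.upper()
    suggested = []
    if "IN (SELECT" in q or "EXISTS (SELECT" in q:
        suggested.append("SUB_QUERY_REMOVE")
    if "JOIN" in q and "WHERE" in q:
        suggested.append("FILTER_INTO_JOIN")
    return suggested

def rewrite(query: str, engine_apply) -> str:
    # The rewrite engine, not the LLM, performs each transformation;
    # that is what preserves executability and equivalence.
    for rule in llm_suggest_rules(query):
        query = engine_apply(query, rule)
    return query
```

Because the LLM only names rules and a conventional engine applies them, a poor suggestion can at worst yield a less efficient, but still correct, query.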

One of the key advancements of LLM-R2 is its use of contrastive learning, which refines the selection of rewrite rules by capturing the structure and context of each query. This allows LLM-R2 to adapt and apply the most appropriate optimisations for each query.
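A rough sketch of similarity-based selection follows, with a toy keyword-count featuriser standing in for the paper's learned contrastive query representation (the function names and feature set are invented for illustration):

```python
import math

def featurise(query: str) -> list[float]:
    """Toy stand-in for a learned query embedding: count a few
    structural keywords so similar queries land near each other."""
    q = query.upper()
    keywords = ["JOIN", "WHERE", "GROUP BY", "ORDER BY", "IN (SELECT", "EXISTS"]
    return [float(q.count(k)) for k in keywords]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

def pick_demonstration(query: str, demos: list[str]) -> str:
    # Choose the demonstration most similar to the input query,
    # so the LLM sees a structurally relevant rewrite example.
    qv = featurise(query)
    return max(demos, key=lambda d: cosine(qv, featurise(d)))

demos = [
    "SELECT * FROM t WHERE a IN (SELECT b FROM s)",
    "SELECT x, COUNT(*) FROM t GROUP BY x ORDER BY x",
]
best = pick_demonstration("SELECT id FROM orders WHERE cid IN (SELECT id FROM c)", demos)
```

The contrastive objective in the paper serves the same end as this similarity ranking: queries that benefit from the same rules should embed close together.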

The method has demonstrated significant cuts in query execution time across the TPC-H, IMDB, and DSB datasets, improving on both traditional rule-based methods and other LLM-based systems.

The results show that LLM-R2 reduces SQL query execution time to about 52.5% of that of the original queries on average, and to about 40.7% of that of state-of-the-art methods. The boost is especially pronounced on complex queries, where traditional methods often struggle to make effective improvements.

The researchers acknowledge that the main limitation is higher rewrite latency than DB-only methods: calling the LLM API and selecting demonstrations takes more time than traditional rule enumeration does.

Despite this added latency, the benefits are clear: LLM-R2 greatly reduces the time it takes to execute queries, making the system effective overall. This suggests that LLM-enhanced methods could be a practical solution for efficiency-oriented query rewrite.


Sukriti Gupta

Having done her undergrad in engineering and masters in journalism, Sukriti likes combining her technical know-how and storytelling to simplify seemingly complicated tech topics in a way everyone can understand.