
RAG With Microsoft Copilot

“There’s no question that RAG is core to any AI-powered application, especially in the enterprise today,” says Microsoft chief Satya Nadella.


At Microsoft Build 2024, the company announced several new tools and advancements, most notably the incorporation of retrieval-augmented generation (RAG) into the Copilot Library, making it easier to use on-device data in your applications. It provides the tools to build a vector store within the platform and enables semantic search, similar to Recall.

The RAG architecture offers an enterprise solution by allowing one to constrain generative AI to the organisation’s content. This content can come from vectorised documents, images, and other data formats, provided you have embedding models for them.
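In its simplest form, this means embedding the organisation's documents, retrieving the closest matches for a query, and constraining the model's answer to them. A minimal sketch of that pattern, assuming the sentence-transformers library for embeddings (the documents, prompt, and model choice here are illustrative, not tied to any Microsoft API):

```python
# Minimal RAG sketch: constrain generation to an organisation's own documents.
# Assumes the sentence-transformers package; documents and prompt are illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday to Friday, 9 am to 6 pm IST.",
    "Enterprise plans include a dedicated account manager.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_vectors = model.encode(documents, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (cosine similarity)."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ q
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

query = "How long do customers have to return a product?"
context = "\n".join(retrieve(query))

# The retrieved passages are injected into the prompt so the model answers
# from the organisation's content rather than its parametric memory.
prompt = (
    "Answer using ONLY the context below. If the answer is not there, say so.\n"
    f"Context:\n{context}\n\nQuestion: {query}"
)
print(prompt)  # pass this to any chat-completion endpoint
```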

“There’s no question that RAG is core to any AI-powered application, especially in the enterprise today. Azure AI Search makes it possible to run RAG at any scale, delivering highly accurate responses using state-of-the-art retrieval systems,” said Microsoft chief Satya Nadella, adding that ChatGPT and data assistants are all powered by Azure AI Search today.

Further, for efficient app development, one can combine the smart, human-like responses of Azure OpenAI with MySQL’s powerful database management and Azure AI Search’s advanced capabilities.

This integration enhances CMS, e-commerce, or gaming sites on Azure Database for MySQL by incorporating generative AI search and chat using LLMs from Azure OpenAI, along with vector storage and indexing from Azure AI Search, supported by RAG.
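One way this wiring might look in practice is to read rows out of Azure Database for MySQL and push them into an Azure AI Search index so they can later be retrieved for RAG. A rough sketch, where the server, credentials, table, and index names are all placeholders and the index schema is assumed to already exist:

```python
# Sketch: copy product data from Azure Database for MySQL into an Azure AI
# Search index for later retrieval. All names and credentials are placeholders.
import mysql.connector
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

conn = mysql.connector.connect(
    host="<server>.mysql.database.azure.com",
    user="<user>", password="<password>", database="shop",
)
cursor = conn.cursor(dictionary=True)
cursor.execute("SELECT id, name, description FROM products")
rows = cursor.fetchall()

search_client = SearchClient(
    endpoint="https://<service>.search.windows.net",
    index_name="products-index",
    credential=AzureKeyCredential("<admin-key>"),
)

# Document keys must be strings; the 'products-index' schema is assumed to exist.
docs = [
    {"id": str(r["id"]), "name": r["name"], "description": r["description"]}
    for r in rows
]
search_client.upload_documents(documents=docs)
```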

These Azure AI Search capabilities were introduced a month ago. A proven solution for information retrieval in a RAG architecture, Azure AI Search offers robust indexing and query functionality, and it leverages the infrastructure and security of the Azure cloud for reliable, secure performance.
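On the query side, a typical RAG flow retrieves matching documents from the index and grounds an Azure OpenAI completion in them. A hedged sketch of that flow, with endpoints, keys, and the deployment name as placeholders:

```python
# Sketch of the query side of RAG on Azure: retrieve passages from Azure AI
# Search, then ground an Azure OpenAI chat completion in them.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from openai import AzureOpenAI

search = SearchClient(
    endpoint="https://<service>.search.windows.net",
    index_name="products-index",
    credential=AzureKeyCredential("<query-key>"),
)
results = search.search(search_text="waterproof hiking boots", top=3)
context = "\n".join(doc["description"] for doc in results)

client = AzureOpenAI(
    azure_endpoint="https://<resource>.openai.azure.com",
    api_key="<key>",
    api_version="2024-02-01",
)
response = client.chat.completions.create(
    model="<chat-deployment>",  # Azure deployment name, not a model family
    messages=[
        {"role": "system", "content": "Answer only from the provided context."},
        {"role": "user",
         "content": f"Context:\n{context}\n\nQuestion: Which boots are waterproof?"},
    ],
)
print(response.choices[0].message.content)
```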

The Relevance of RAG

RAG was introduced to address LLM hallucinations by extending the model’s reach to external sources, vastly widening the information it can access. LLMs rely on statistical patterns without true comprehension, excelling at text generation but struggling with logical reasoning. They hallucinate because they are limited to their training data, no matter how big the model or how long the context length.

Additionally, RAG allows customers to add new datasets, providing fresh information for the LLM to generate accurate answers, enabling enterprises to derive insights from their own data.

LLM builders try to keep their models current either by expanding the size of the model or, as with tools like Perplexity, by pulling answers directly from the internet, which lets them serve up-to-date information instead of relying on outdated facts. RAG achieves the same end by ensuring the LLM always has the latest and most trustworthy information at its disposal.

To bring in domain-specific knowledge, Cohere and Anthropic let enterprises supply their own data through Oracle Cloud, extending the models with internal information. Combined with RAG, these LLMs deliver insights personalised to the company’s data.

Despite claims that RAG is obsolete, it is in fact evolving and increasingly being adopted by enterprises. Its versatility spans domains such as customer service, educational tools, and content creation, and the developer community is actively exploring new ways to enhance RAG, such as building applications with Llama-3 running locally.

Additionally, RAG is no longer limited to vector database matching. Many advanced RAG techniques are being introduced that significantly improve retrieval. For instance, integrating Knowledge Graphs (KGs) into RAG leverages structured, interlinked data, enhancing the system’s reasoning capabilities.
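A toy illustration of the idea: alongside ordinary passage retrieval, the system walks a small knowledge graph for facts linked to entities mentioned in the query, and those structured facts are appended to the retrieved context. The graph below is illustrative, built with networkx:

```python
# Toy graph-augmented retrieval: pull facts for entities that appear in the query.
import networkx as nx

kg = nx.DiGraph()
kg.add_edge("Azure AI Search", "Azure", relation="is a service of")
kg.add_edge("Copilot", "RAG", relation="uses")
kg.add_edge("RAG", "vector store", relation="retrieves from")

def graph_facts(query: str) -> list[str]:
    """Return a triple for every outgoing edge of each entity named in the query."""
    facts = []
    for entity in kg.nodes:
        if entity.lower() in query.lower():
            for _, target, data in kg.out_edges(entity, data=True):
                facts.append(f"{entity} {data['relation']} {target}")
    return facts

# These facts are added to the usual retrieved passages, giving the LLM
# interlinked context it can reason over.
print(graph_facts("How does Copilot use RAG?"))
# ['Copilot uses RAG', 'RAG retrieves from vector store']
```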

Going Beyond RAG

RAG reduces the rate of hallucinations by grounding generated responses in retrieved evidence, preventing the model from speculating blindly.

Meanwhile, there are other techniques to reduce hallucinations in LLMs, including Chain-of-Verification (CoVe) by Meta AI, which reduces hallucinations in LLMs by breaking fact-checking into manageable steps. It generates an initial response, plans verification questions, answers these independently, and produces a final verified response. Likewise, several other methods to reduce hallucinations have been pioneered, enabling the creation of more robust LLM systems.
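Sketched as code, the four CoVe stages look roughly like the following, where `ask` is a placeholder for any chat-completion call rather than a specific Meta AI or OpenAI API:

```python
# Chain-of-Verification (CoVe) as four prompting stages around a generic LLM call.
def ask(prompt: str) -> str:
    raise NotImplementedError("wire this up to your LLM of choice")

def chain_of_verification(question: str) -> str:
    # 1. Draft an initial (possibly hallucinated) answer.
    baseline = ask(f"Answer the question: {question}")

    # 2. Plan verification questions that probe individual facts in the draft.
    plan = ask(
        "List short fact-checking questions, one per line, for this answer:\n"
        f"{baseline}"
    )

    # 3. Answer each verification question independently, without showing the
    #    draft, so its errors cannot leak into the checks.
    checks = [f"Q: {q}\nA: {ask(q)}" for q in plan.splitlines() if q.strip()]

    # 4. Produce the final response, revising the draft against the checks.
    return ask(
        f"Question: {question}\nDraft answer: {baseline}\n"
        "Verification:\n" + "\n".join(checks) +
        "\nWrite a final answer consistent with the verification."
    )
```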

On the other hand, with the launch of GPT-4 Turbo and the Retrieval API, OpenAI has also tried its hand at fixing the hallucination problem. With a long context length and the option for enterprises to bring in new data, OpenAI has come close to solving one of the most important problems with LLMs, but at the cost of users’ data privacy.

For example, with slightly fancier prompt engineering, a user on X was able to download the original knowledge files from someone else’s GPT, an app built with the recently released GPT Builder, which relies on RAG. This poses a major security issue.

With Microsoft Copilot incorporating RAG into its library, it may also experiment with other ways to reduce hallucinations and better serve enterprises and their customers. But, as with OpenAI, this has to be taken with a pinch of salt, especially in terms of privacy risks.

Gopika Raj

With a Master's degree in Journalism & Mass Communication, Gopika Raj infuses her technical writing with a distinctive flair. Intrigued by advancements in AI technology and its future prospects, her writing offers a fresh perspective in the tech domain, captivating readers along the way.