
RAG With Microsoft Copilot

“There’s no question that RAG is core to any AI-powered application, especially in the enterprise today,” says Microsoft chief Satya Nadella.


At Microsoft Build 2024, the company announced several new tools and advancements, most notably the incorporation of retrieval-augmented generation (RAG) into the Copilot Library, making it easier to use on-device data in your applications. It provides the tools to build a vector store within the platform and enables semantic search, similar to Recall.

The RAG architecture offers an enterprise solution by allowing one to constrain generative AI to the organisation’s content. This content can come from vectorised documents, images, and other data formats, provided you have embedding models for them.
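In its simplest form, this means embedding the organisation's documents, retrieving the closest matches for a query, and constraining the model's answer to them. A minimal sketch of that pattern, assuming the sentence-transformers library for embeddings (the documents, prompt, and model choice here are illustrative, not tied to any Microsoft API):

```python
# Minimal RAG sketch: constrain generation to an organisation's own documents.
# Assumes the sentence-transformers package; documents and prompt are illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday to Friday, 9 am to 6 pm IST.",
    "Enterprise plans include a dedicated account manager.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_vectors = model.encode(documents, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (cosine similarity)."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ q
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

query = "How long do customers have to return a product?"
context = "\n".join(retrieve(query))

# The retrieved passages are injected into the prompt so the model answers
# from the organisation's content rather than its parametric memory.
prompt = (
    "Answer using ONLY the context below. If the answer is not there, say so.\n"
    f"Context:\n{context}\n\nQuestion: {query}"
)
print(prompt)  # pass this to any chat-completion endpoint
```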

“There’s no question that RAG is core to any AI-powered application, especially in the enterprise today. Azure AI Search makes it possible to run RAG at any scale, delivering highly accurate responses using state-of-the-art retrieval systems,” said Microsoft chief Satya Nadella, adding that ChatGPT and data assistants are all powered by Azure AI Search today.

Further, for efficient app development, one can combine the smart, human-like responses of Azure OpenAI with MySQL’s powerful database management and Azure AI Search’s advanced capabilities.

This integration enhances CMS, e-commerce, or gaming sites on Azure Database for MySQL by incorporating generative AI search and chat using LLMs from Azure OpenAI, along with vector storage and indexing from Azure AI Search, supported by RAG.
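One way this wiring might look in practice is to read rows out of Azure Database for MySQL and push them into an Azure AI Search index so they can later be retrieved for RAG. A rough sketch, where the server, credentials, table, and index names are all placeholders and the index schema is assumed to already exist:

```python
# Sketch: copy product data from Azure Database for MySQL into an Azure AI
# Search index for later retrieval. All names and credentials are placeholders.
import mysql.connector
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

conn = mysql.connector.connect(
    host="<server>.mysql.database.azure.com",
    user="<user>", password="<password>", database="shop",
)
cursor = conn.cursor(dictionary=True)
cursor.execute("SELECT id, name, description FROM products")
rows = cursor.fetchall()

search_client = SearchClient(
    endpoint="https://<service>.search.windows.net",
    index_name="products-index",
    credential=AzureKeyCredential("<admin-key>"),
)

# Document keys must be strings; the 'products-index' schema is assumed to exist.
docs = [
    {"id": str(r["id"]), "name": r["name"], "description": r["description"]}
    for r in rows
]
search_client.upload_documents(documents=docs)
```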

These Azure AI Search capabilities were introduced a month ago. A proven solution for information retrieval in a RAG architecture, Azure AI Search offers robust indexing and query functionality, and it leverages the infrastructure and security of the Azure cloud for reliable, secure performance.
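On the query side, a typical RAG flow retrieves matching documents from the index and grounds an Azure OpenAI completion in them. A hedged sketch of that flow, with endpoints, keys, and the deployment name as placeholders:

```python
# Sketch of the query side of RAG on Azure: retrieve passages from Azure AI
# Search, then ground an Azure OpenAI chat completion in them.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from openai import AzureOpenAI

search = SearchClient(
    endpoint="https://<service>.search.windows.net",
    index_name="products-index",
    credential=AzureKeyCredential("<query-key>"),
)
results = search.search(search_text="waterproof hiking boots", top=3)
context = "\n".join(doc["description"] for doc in results)

client = AzureOpenAI(
    azure_endpoint="https://<resource>.openai.azure.com",
    api_key="<key>",
    api_version="2024-02-01",
)
response = client.chat.completions.create(
    model="<chat-deployment>",  # Azure deployment name, not a model family
    messages=[
        {"role": "system", "content": "Answer only from the provided context."},
        {"role": "user",
         "content": f"Context:\n{context}\n\nQuestion: Which boots are waterproof?"},
    ],
)
print(response.choices[0].message.content)
```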

The Relevance of RAG

RAG was introduced to address LLM hallucinations by extending the model’s reach to external sources, vastly widening the information it can access. LLMs rely on statistical patterns without true comprehension, excelling at text generation but struggling with logical reasoning. They hallucinate because they are limited to their training data, no matter how big the model or how long the context length.

Additionally, RAG allows customers to add new datasets, providing fresh information for the LLM to generate accurate answers, enabling enterprises to derive insights from their own data.

LLM builders try to keep their models current either by expanding the size of the model or, as with tools like Perplexity, by pulling answers directly from the internet, which lets them serve up-to-date information instead of relying on outdated facts. RAG achieves the same end by ensuring the LLM always has the latest and most trustworthy information at its disposal.

To bring in domain-specific knowledge, Cohere and Anthropic let enterprises supply their own data through Oracle Cloud, extending the models with internal information. Combined with RAG, these LLMs deliver insights personalised to the company’s data.

Despite claims that RAG is obsolete, it is in fact evolving and increasingly being adopted by enterprises. Its versatility spans domains such as customer service, educational tools, and content creation, and the developer community is actively exploring new ways to enhance RAG, such as building applications with Llama-3 running locally.

Additionally, RAG is no longer limited to vector database matching. Many advanced RAG techniques are being introduced that significantly improve retrieval. For instance, integrating Knowledge Graphs (KGs) into RAG leverages structured, interlinked data, enhancing the system’s reasoning capabilities.
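A toy illustration of the idea: alongside ordinary passage retrieval, the system walks a small knowledge graph for facts linked to entities mentioned in the query, and those structured facts are appended to the retrieved context. The graph below is illustrative, built with networkx:

```python
# Toy graph-augmented retrieval: pull facts for entities that appear in the query.
import networkx as nx

kg = nx.DiGraph()
kg.add_edge("Azure AI Search", "Azure", relation="is a service of")
kg.add_edge("Copilot", "RAG", relation="uses")
kg.add_edge("RAG", "vector store", relation="retrieves from")

def graph_facts(query: str) -> list[str]:
    """Return a triple for every outgoing edge of each entity named in the query."""
    facts = []
    for entity in kg.nodes:
        if entity.lower() in query.lower():
            for _, target, data in kg.out_edges(entity, data=True):
                facts.append(f"{entity} {data['relation']} {target}")
    return facts

# These facts are added to the usual retrieved passages, giving the LLM
# interlinked context it can reason over.
print(graph_facts("How does Copilot use RAG?"))
# ['Copilot uses RAG', 'RAG retrieves from vector store']
```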

Going Beyond RAG

RAG reduces the rate of hallucinations by grounding generated responses in retrieved evidence, preventing the model from speculating blindly.

Meanwhile, there are other techniques to reduce hallucinations in LLMs, including Chain-of-Verification (CoVe) by Meta AI, which reduces hallucinations in LLMs by breaking fact-checking into manageable steps. It generates an initial response, plans verification questions, answers these independently, and produces a final verified response. Likewise, several other methods to reduce hallucinations have been pioneered, enabling the creation of more robust LLM systems.
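Sketched as code, the four CoVe stages look roughly like the following, where `ask` is a placeholder for any chat-completion call rather than a specific Meta AI or OpenAI API:

```python
# Chain-of-Verification (CoVe) as four prompting stages around a generic LLM call.
def ask(prompt: str) -> str:
    raise NotImplementedError("wire this up to your LLM of choice")

def chain_of_verification(question: str) -> str:
    # 1. Draft an initial (possibly hallucinated) answer.
    baseline = ask(f"Answer the question: {question}")

    # 2. Plan verification questions that probe individual facts in the draft.
    plan = ask(
        "List short fact-checking questions, one per line, for this answer:\n"
        f"{baseline}"
    )

    # 3. Answer each verification question independently, without showing the
    #    draft, so its errors cannot leak into the checks.
    checks = [f"Q: {q}\nA: {ask(q)}" for q in plan.splitlines() if q.strip()]

    # 4. Produce the final response, revising the draft against the checks.
    return ask(
        f"Question: {question}\nDraft answer: {baseline}\n"
        "Verification:\n" + "\n".join(checks) +
        "\nWrite a final answer consistent with the verification."
    )
```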

On the other hand, with the launch of GPT-4 Turbo and the Retrieval API, OpenAI has also tried its hand at fixing the hallucination problem. With a long context length and the option for enterprises to bring in new data, OpenAI has come close to solving one of the most important problems with LLMs, but at the cost of users’ data privacy.

For example, with slightly fancier prompt engineering, a user on X was able to download the original knowledge files from someone else’s GPT, an app built with the recently released GPT Builder, which relies on RAG. This poses a major security issue.

With Microsoft Copilot incorporating RAG into its library, it may also experiment with other ways to reduce hallucinations and better serve enterprises and their customers. But, as with OpenAI, this has to be taken with a pinch of salt, especially in terms of privacy risks.

Gopika Raj

With a Master's degree in Journalism & Mass Communication, Gopika Raj infuses her technical writing with a distinctive flair. Intrigued by advancements in AI technology and its future prospects, her writing offers a fresh perspective in the tech domain, captivating readers along the way.