Listen to this story
|
While generative AI has become the biggest buzzword of 2023, enterprises are still scrambling to find use cases for this technology. Moreover, the common limitations of LLMs — like hallucinations, inability to stay topical, and susceptibility to prompt injection, make them a hot potato for enterprises.
To solve some of these problems and remove the roadblocks stalling enterprise LLM adoption, companies have taken it upon themselves to create guardrails that will keep these stochastic parrots talking on the topic. For instance, OpenAI’s content moderation endpoint prevents its APIs from generating hurtful or hateful content, while others, like NVIDIA’s NeMo Guardrails SDK, protect against the shortcomings of LLMs more comprehensively.
Meanwhile, recent reports claimed that researchers had found shortcomings in NVIDIA’s NeMo Guardrails SDK. While the functionality of guardrails can be extended, it is clear that we are still on the path of making LLMs safe for enterprise use. Now, we are seeing the beginning of an ecosystem of tools making LLMs safe for enterprises. Will this be enough to spur LLM adoption in the enterprise?
Enter LangKit
LangKit was released earlier this week as a way to keep track of the outputs of LLMs. This monitoring toolkit provides a set of metrics that can gauge the quality of input or output texts from various LLM-powered chatbots like Bing Chat, GPT-4, Baidu, and Google Bard. It also plugs into other ecosystems like Hugging Face and LangChain.
LangKit was open-sourced by a company called WhyLabs, which is building an AI observability platform for enterprises. Through the WhyLabs platform, enterprises can detect ML issues faster and deploy models continuously. It also has features like outlier detection, data drift monitoring, and a strong focus on data privacy
This makes the LangKit toolkit the final piece of the puzzle when it comes to WhyLabs’ platform, which already takes on a host of the AI observability workload for enterprises. The toolkit was released as a way to offer greater visibility into what the LLMs are doing in deployment. It also includes a host of preventative measures to protect against malicious users.
LangKit can safeguard LLMs from hallucinations, toxicity, malicious prompts, and even jailbreak attempts. The toolkit does so by monitoring a variety of metrics such as text quality, text relevance, the overall security and privacy of the LLM, and the sentiment of the text.
Security is preserved by putting the user input through a similarity score algorithm, which compares the user input against known jailbreak attempts and prompt injection attacks. If the score is too high, the input is rejected. Similarly, the toolkit checks against a variety of metrics to ensure that the LLM works according to its specifications.
LangKit is just one of the many tools that companies can use to make their models safer for use. As enterprise adoption continues, we are seeing more products being released that mitigate some of the cons of deploying LLMs.
Solving for LLM dependability
LLMs are finicky beasts to tame. Due to their training process and datasets, aligning these glorified autocomplete algorithms is a difficult undertaking. Criticisms against LLMs include that they are ‘bullshit generators’ or ‘stochastic parrots’ which do not understand natural language but know how to manipulate it.
This set of vulnerabilities has resulted in LLMs being shut down in two days, a controversy around a deranged chatbot, and even stock price collapses. With the risks attached to deploying an LLM in an enterprise environment, it’s no surprise that decision-makers are dragging their feet.
Solutions like LangKit, NeMo Guardrails, and many more emerging companies for AI safety and observability are exactly what the enterprise needs to adopt LLMs. These tools, along with AI observability platforms, can help companies adopt LLMs without the risk of misaligned outputs.
Although few and far between, AI observability platforms are being offered by companies like IBM, WhyLabs, and Fiddler. These services can bridge the gap between processes and LLMs, providing greater transparency for companies when deploying these models. When combined with guardrails, this can be the beginning of a tech stack for enterprise LLMs.
The eventual growth of the LLM security ecosystem will become one of the biggest factors to urge enterprise leaders to stay ahead of the curve and adopt new technology as it emerges. However, companies must also ensure that they keep up with AI observability platforms and guardrails to deploy LLMs safely and responsibly.