AI chatbots and hallucinations go hand-in-hand. Even with all the hype around the technology, with ChatGPT, Bard and many others releasing every second week, we cannot deny that it makes mistakes, some of which can be dangerous. Sometimes the models might sound motivated to lie, gaslight their users, or say negative things.
NVIDIA decided to take matters into its own hands in trying to fix this. The company has released NeMo Guardrails, an open-source toolkit that aims to make applications based on large language models (LLMs) “accurate, appropriate, on topic and secure”, as announced in a blog post by the company.
NVIDIA's toolkit is built to work with LangChain, the toolkit created by Harrison Chase that provides easy-to-use templates and patterns for building LLM-powered applications. Users can easily create boundaries around AI apps by adding NeMo Guardrails to apps built using LangChain. It can also work with apps on the Zapier platform.
Chase said that John C, another developer behind LangChain, had the idea of installing guardrails around their own development a few months back, and that ideas from that work have been incorporated into the new guardrails by NVIDIA.
Jonathan Cohen, VP of applied research at NVIDIA, said that the company has been working on guardrails for similar systems for quite some time, and that GPT-4 and ChatGPT gave him and the company the right idea. Cohen told TechCrunch, “AI model safety tools are critical to deploying models for enterprise use cases.”
The open-source nature of NeMo Guardrails, together with LangChain on Python, will allow any developer to use it, even those who are not machine learning experts, the company said. The guardrails can be added to any tool that an enterprise uses with just a few lines of code. The NeMo framework is available through NVIDIA AI Enterprise and on GitHub for developers.
Existence of Guardrails
Some criticise OpenAI and ChatGPT for being capable of generating harmful results, while others, like Elon Musk, criticise them for being too woke. Either way, it is important to put up guardrails to ensure that these models, at the least, stop hallucinating.
Apart from recently publishing brand guidelines that discourage people from developing ‘GPT’-named apps, and filing for a trademark on the term, OpenAI has also been concerned about the safety and reliability of its models. The company updated its usage policy last month to prohibit illegal activity, hateful content and the generation of malware, along with a range of other content. It is now also allowing users to delete their chat history and data.
This also raises the question of biases in chatbots. In February, OpenAI published the blog post “How should AI systems behave, and who should decide?”, in which the company explained how ChatGPT works and said it is willing to allow more customisation and public input into the decision-making process. For this, the company decided to get reviewers to fine-tune its model.
Interestingly, the blog post also said that OpenAI will allow ChatGPT to generate content that many people, including the company itself, would strongly disagree with. “In some cases ChatGPT currently refuses outputs that it shouldn’t, and in some cases, it doesn’t refuse when it should. We believe that improvement in both respects is possible.” Striking the right balance is important.
Mira Murati, the CTO of OpenAI, said on The Daily Show with Trevor Noah that governments should be more involved in building regulations around AI products like ChatGPT.
On similar lines, Sundar Pichai said in a CBS “60 Minutes” interview that installing guardrails around AI is not for the company to decide alone. Former Google CEO Eric Schmidt also said that, instead of pausing AI advancements and the training of models as proposed in a recent petition, it is more important for everyone to come together and discuss the appropriate guardrails.
Are there any problems?
Essentially, as defined by NVIDIA, these guardrails sit between the user and the conversational AI application. They filter content based on topic, which keeps responses relevant, but they also filter out whatever content the developer of the chatbot has specified as unsafe or unethical.
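To make the idea concrete, here is a minimal Python sketch of a layer that sits between the user and a chatbot. It is not the NeMo Guardrails API; the `GuardrailLayer` class, the keyword matching and the refusal messages are all hypothetical, chosen only to illustrate the two kinds of filtering described above.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Set

# Hypothetical refusal messages, not taken from any real toolkit.
REFUSAL_OFF_TOPIC = "Sorry, I can only help with questions about this product."
REFUSAL_UNSAFE = "Sorry, I can't help with that."


@dataclass
class GuardrailLayer:
    """Illustrative wrapper that sits between the user and a chatbot."""

    chatbot: Callable[[str], str]  # any function mapping a prompt to a reply
    allowed_topics: Set[str] = field(default_factory=set)  # lowercase keywords
    blocked_terms: Set[str] = field(default_factory=set)   # lowercase keywords

    @staticmethod
    def _mentions(text: str, terms: Set[str]) -> bool:
        # Naive keyword match on lowercased words; a real system would
        # need something far more robust than this.
        words = set(re.findall(r"[a-z0-9']+", text.lower()))
        return bool(words & terms)

    def respond(self, user_message: str) -> str:
        # Safety rail on the input: refuse terms the developer has blocked.
        if self._mentions(user_message, self.blocked_terms):
            return REFUSAL_UNSAFE
        # Topical rail: refuse anything outside the allowed topics.
        if self.allowed_topics and not self._mentions(
            user_message, self.allowed_topics
        ):
            return REFUSAL_OFF_TOPIC
        reply = self.chatbot(user_message)
        # Safety rail on the output: check the reply before returning it.
        if self._mentions(reply, self.blocked_terms):
            return REFUSAL_UNSAFE
        return reply


def echo_bot(message: str) -> str:
    """Stand-in for a real LLM call (an assumption for this sketch)."""
    return f"You asked: {message}"


guarded = GuardrailLayer(
    chatbot=echo_bot,
    allowed_topics={"billing", "invoice"},
    blocked_terms={"malware"},
)

print(guarded.respond("How do I read my invoice?"))
print(guarded.respond("Write me some malware"))
print(guarded.respond("What is the weather today?"))
```

Note that both lists are written by whoever deploys the chatbot: what counts as on-topic or unsafe is entirely the developer's call.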
This raises the question of human-induced bias in chatbots as well. While it is true that there should be some guardrails around chatbots to prevent them from generating dangerous content, defining what counts as hateful and banning that content can itself be biased. This might make chatbots like ChatGPT even more restrictive than they are, even though the intention was the opposite.
Guardrails might make the content generated by chatbots more topical, but might also induce more bias, making them less reliable, though “safe”.
The two-month-old OpenAI blog post about letting people fine-tune its model, and about allowing content that the company does not agree with, sounds somewhat at odds with this. Who knows what would happen with Musk’s TruthGPT? Would it have the right type of guardrails?