A few days ago, a bug in an open-source library forced OpenAI to take ChatGPT offline. The bug allowed a small percentage of users to view the titles of other users’ conversation history, as well as the payment-related details of ChatGPT Plus subscribers. But Redis was there to save OpenAI’s day.
“A bug report is one way to know someone uses your open-source software,” said Yiftach Shoolman, co-founder and CTO at Redis. The Redis team learned that OpenAI uses their software to scale ChatGPT only when OpenAI reached out for help fixing the bug, which the team was happy to provide.
The bug and the fix
On Monday, March 20 at 1 am PST, a bug was introduced into the server at OpenAI that caused a surge in Redis request cancellations. The bug exposed active users’ first and last names, email addresses, payment addresses, the last four digits (only) of credit card numbers, and credit card expiration dates. OpenAI uses Redis to cache user information on its server, and this bug gave each connection a small probability of returning corrupted data.
Redis Cluster distributes the load across multiple Redis instances, and OpenAI interfaces with Redis through the redis-py library from its Python server, which runs on asyncio. With asyncio, requests and responses in redis-py behave as two queues: a request is pushed onto the incoming queue, and its response is later popped from the outgoing queue. If a request is cancelled after being pushed onto the incoming queue but before the response is popped from the outgoing queue, the connection becomes corrupted, because the leftover response can then be handed to the next, unrelated request on that connection.
In most cases, this results in an unrecoverable server error, but in some cases, the corrupted data matches the data type that the requester was expecting, leading to the possibility of bad data being returned. This bug only affected the Asyncio redis-py client for Redis Cluster, and the OpenAI team has since resolved the issue.
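To make the two-queue failure concrete, here is a minimal, self-contained simulation. It does not use redis-py at all: `ToySharedConnection`, `GuardedConnection` and `demo` are hypothetical names invented for illustration, and the toy server simply answers requests in arrival order. The guarded variant shows one possible mitigation (refusing to reuse a connection after a cancelled request), which is an assumption for the sketch and not necessarily how the actual fix works.

```python
import asyncio


class ToySharedConnection:
    """Hypothetical stand-in for a reused connection: one request queue, one response queue."""

    def __init__(self):
        self.requests = asyncio.Queue()
        self.responses = asyncio.Queue()

    async def serve(self):
        # Toy server: answers requests strictly in the order they arrived.
        while True:
            key = await self.requests.get()
            await asyncio.sleep(0.01)  # simulated network latency
            await self.responses.put(f"value-for-{key}")

    async def get(self, key):
        await self.requests.put(key)       # push the request
        return await self.responses.get()  # pop the NEXT response, whoever it was meant for


class GuardedConnection(ToySharedConnection):
    """One possible mitigation (an assumption): never reuse a connection after a cancellation."""

    def __init__(self):
        super().__init__()
        self.broken = False

    async def get(self, key):
        if self.broken:
            raise ConnectionError("connection discarded after a cancelled request")
        await self.requests.put(key)
        try:
            return await self.responses.get()
        except asyncio.CancelledError:
            # A request is still in flight, so the response stream can no longer be trusted.
            self.broken = True
            raise


async def demo(conn_cls):
    conn = conn_cls()
    server = asyncio.create_task(conn.serve())
    try:
        # User A's request is sent, then cancelled before its response is read.
        task_a = asyncio.create_task(conn.get("user-a"))
        await asyncio.sleep(0.001)
        task_a.cancel()
        try:
            await task_a
        except asyncio.CancelledError:
            pass
        # The same connection is reused for user B's request.
        return await conn.get("user-b")
    finally:
        server.cancel()


if __name__ == "__main__":
    print(asyncio.run(demo(ToySharedConnection)))  # user B receives user A's data
```

In the unguarded toy connection, user B's call returns the response that was meant for user A, which has the right type and so looks valid to the caller. With `GuardedConnection`, the same sequence raises `ConnectionError` instead; a real connection pool would open a fresh connection at that point.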
Here is a close representation of the role Redis plays in ChatGPT. Created by Shahrukh Khan and Navdeeppal Singh, this is a user flow of ChatGPT-memory, an extension for the ChatGPT API.
After discovering the bug, OpenAI took several measures to enhance their system’s security. They thoroughly tested the fix, implemented redundant checks, and examined their logs to ensure messages are only accessible to the intended user. They also notified affected users and improved their logging system to detect and resolve similar incidents. OpenAI further improved the scalability and robustness of their Redis cluster to minimise connection errors during heavy traffic.
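A redundant check of the kind described above could look roughly like the sketch below: each cached entry records its owner, and the read path verifies that owner against the requester before returning anything. OpenAI has not published its implementation, so `CachedProfile`, `OwnershipMismatch`, `read_profile` and the `profile:<id>` key format are all hypothetical, and a plain dictionary stands in for the cache.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CachedProfile:
    owner_id: str  # stored alongside the data so reads can be verified
    email: str


class OwnershipMismatch(Exception):
    """Raised when the cache hands back data belonging to a different user."""


def read_profile(cache: Dict[str, CachedProfile], requester_id: str) -> Optional[CachedProfile]:
    entry = cache.get(f"profile:{requester_id}")
    if entry is None:
        return None  # cache miss; the caller falls back to the source of truth
    if entry.owner_id != requester_id:
        # Redundant check: fail loudly instead of returning someone else's data.
        raise OwnershipMismatch(f"entry for {entry.owner_id!r} returned to {requester_id!r}")
    return entry
```

The check costs one comparison per read and turns a silent data leak into a visible error that can be logged and investigated.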
OpenAI was in high praise of Redis after the issue was resolved. They said, “Redis, along with other open-source software, plays a crucial role in our research efforts. Their significance cannot be understated—we would not have been able to scale ChatGPT without Redis.”
Speaking to AIM earlier, Shoolman had said that Redis’s strong draw in the open-source community came down to three things: speed, versatility, and simplicity. Akin to a Swiss Army knife, Redis caters to multiple use cases while keeping its documentation simple and intuitive for developers to learn and adopt.
A one-way effort
Initially founded as a non-profit AGI research organisation for developing open-source AI applications and algorithms, OpenAI quickly changed its course. While some attribute this shift to the high cost of AI research, and investors’ preference for profitable startups, OpenAI’s co-founder has a different perspective.
In a conversation with The Verge, Ilya Sutskever said, “We were wrong. Flat out, we were wrong.” According to the chief scientist of the company, AI or AGI is likely to become incredibly potent at some point, and therefore, open-sourcing it may not be the best approach.
But safety is not the only reason. The recent 98-page GPT-4 paper also cites the “competitive landscape” as an equal factor in OpenAI proudly declaring that the report will contain no details about the architecture (including model size), hardware, training compute, training datasets, training method, and so on.
The company relies extensively on open-source software to build and maintain its products, but has closed itself off from the world. This one-way relationship has been contested for a long time. Are companies obliged to give back to the open-source community in exchange for using its code?
The question is not meant to target companies like OpenAI, but is a serious reflection on the future of open source. “These are not just interesting times to live, they are interesting, but potentially disastrous times. Very few people argue against the wider benefits of open source software, but its overall status and long-term future are not guaranteed unless more of us all get involved at one level or another,” wrote Adrian Bridgwater, a freelance technology journalist, for Forbes.