Listen to this story
The issue of copyright and assigning credits to the original developer has been discussed extensively ever since the release of GitHub Copilot, which is powered by OpenAI Codex. Now, GitHub and OpenAI are taking steps to avoid any further legal tussle in this context with OpenAI announcing that they are going to discontinue support for Codex API starting from March 23, 2023.
To avoid much scrutiny about this heavy move, the company said that given the advancement of their GPT-3.5 model, supporting the older model would be no longer useful for them or the customers. Therefore, all customers are “encouraged to transition to GPT-3.5-Turbo”, which is a much more cost effective and performant model in the GPT-3.5 family, according to OpenAI.
But what about the apps that are currently built on top of Codex API?
What’s The Reason?
Codex has been trained on billions of publicly available lines of code, which includes the public repositories on GitHub. This eventually led to investigations on the possibility of filing a copyright claim against GitHub, which led to a class action lawsuit against Microsoft, GitHub, and OpenAI for scraping licensed code for building AI-powered Copilot tool. Now, with GPT-4 in the picture, there is a high possibility that GPT-4 might get integrated into Copilot. The legal issue might go away but the billions of parameters that GPT-4 is trained on—we don’t know how many yet—might still run into the same problems.
OpenAI and Microsoft are seemingly pushing their customers to use GPT-4. Microsoft also integrated GPT-4 to Office 365 recently.
This also means that all the technology that the two companies have been developing in partnership might converge into one large multimodal system. The capabilities of GPT-4 already indicate that. Given that people were already shifting towards GPT-3.5 from Codex for many tasks, the capabilities of GPT-4 on Copilot would make it the best choice for developers, even for those who do not know how to code, while avoiding all the copyright issues.
Seems like a well thought out and smart move by Microsoft and OpenAI.
But even then, just a three-day notice before shutting down the service looks like a rushed move from OpenAI’s end. For this, the company has been getting some criticism as well.
The models that would be affected by this move are code-cushman:001, code-cushman:002, code-davinci:001, and code-davinci:002. Thus, these models will also be discontinued. This can be quite cumbersome for a lot of apps that utilise the model for their workflow—all the data might have to be regenerated, thereby rendering the older ones useless.
Amid the discussion on HackerNews, users have been pointing out several aspects of this move. A user points out that the migration from one to another is easy and, thus, even though someone might question the legality of doing this, it is acceptable by the community. Moreover, the terms of services of Codex clearly state that it was released as “free limited Beta”. The companies or apps that are relying on it might have to choose a different path now.
Conversely, however, Beta implies that it is expected to improve and not vanish overnight. What this means is that all customers, including the ones who have already paid for the service and are using the API are expected to transition to a different model.
To this, the company says that they understand the inconvenience but the move is to increase investment in their latest and most capable models. This definitely signifies that the company is brewing up something new.
What if it is just a push towards GitHub Copilot? There is no doubt that the research behind GPT-4 was emphatically supported by Microsoft, which also owns GitHub. Copilot was being powered by OpenAI Codex for suggesting code. Now, they can just shift to GPT-4 and ditch Codex. This might actually be a win against the legal trouble that the company has been going through—using other developers’ code. That would probably be ideal for both the companies, presuming that GPT-4 does not go through the same legal issues and is not trained in a manner similar to Codex, which is highly unlikely.
What Can Researchers Do For Now?
Interestingly, this announcement is made by just OpenAI and not Microsoft. Codex might still be available via Azure services. But OpenAI’s recent pricing was almost ten times lower than what Azure offers. Moreover, Codex has been free all this while. This means that shifting to Azure might not be the first thought that comes to developers’ and researchers’ minds.
The news is perhaps the worst for research groups. A lot of papers that were based on Codex models are now going to be rendered completely irreproducible. A lot of researchers might not have the capability of shifting to Microsoft’s Azure services and even if they do, they might not want to pay for what they were accessing for free all this while.
Some users on Twitter pointed out that the release of GPT-3.5 had actually made them shift to it, completely moving away from Codex. Maybe the traffic for Codex was actually dropping as the other services were producing the same, if not better, output.