
What’s the whole buzz about Salesforce’s CodeGen?


Towards the end of last month, Salesforce released a large-scale language model called CodeGen, which many in the AI community anticipated to be a “GitHub Copilot killer.” We won’t know whether it lives up to that billing until the reviews for CodeGen come in, but it is safe to say that we are in the middle of a paradigm shift in how we interact with our computers. It was once hard to imagine that a user could develop an app just by telling the machine in simple language what the app should do; that is exactly what CodeGen sets out to do.

Coding knowledge 



CodeGen is a leap from code generators like GitHub Copilot because using it is even simpler than writing instructions: it is simply talking. GitHub Copilot, on the other hand, was launched as a tool that autocompletes snippets of code. Copilot also performs other functions, like analysing code that is already written, generating matching code and including specific functions that were called previously.

With CodeGen, Salesforce wanted to remove the constraints that come with learning a new programming language. These include:

  • The time that is required to learn a coding language properly and then apply it well. 
  • The level of difficulty involved might deter some people from the training process. 
  • The costs involved with learning coding from either schools or even online courses.

With CodeGen, an AI assistant translates English descriptions and instructions into executable Python code. This opens up the playing field and allows anyone to write code, even if they don’t know anything about programming.
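As a purely illustrative sketch of that translation, an English instruction and the kind of Python it might correspond to could look like the following. This example was written by hand for illustration and is not actual CodeGen output; the prompt wording and the function name `count_vowels` are assumptions.

```python
# English instruction (hypothetical prompt):
#   "Write a function that counts the vowels in a piece of text."

def count_vowels(text: str) -> int:
    """Return how many characters in text are vowels (a, e, i, o, u)."""
    return sum(1 for ch in text.lower() if ch in "aeiou")


if __name__ == "__main__":
    # Try the function on a sample string.
    print(count_vowels("Salesforce CodeGen"))
```

The point is the division of labour: the user supplies only the sentence in the comment, and the assistant is expected to produce everything below it.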

This comes with a caveat, though. Salesforce does clarify that knowing the basics of coding helps with asking appropriate follow-up questions. This can help steer the conversation while the user is still figuring out how to build the code, for example by deciding whether recursion should be used or not. This is even more important when it comes to more complex problems. Some coding knowledge may help the user suggest alternative approaches so that the software lands on a working solution.

Advantages of CodeGen

Released in June last year, GitHub Copilot was built on OpenAI’s Codex, a model descended from GPT-3, OpenAI’s flagship language-generation model. Salesforce, for its part, trained CodeGen, a 16-billion-parameter auto-regressive language model, on a large corpus of natural as well as programming languages.

Salesforce claims that CodeGen’s efficiency increases as the number of model parameters is scaled in proportion to the number of training samples. The output that CodeGen promises is executable code, meaning high-quality code that a programmer can run without making any revisions to it.
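One way to make “executable without revisions” concrete is to run the generated text unchanged and check it against a few test cases. The sketch below is an assumption about how such a check could look, not a description of Salesforce’s own evaluation; `runs_and_passes` and both sample strings are hypothetical stand-ins for model output.

```python
def runs_and_passes(generated_source: str, func_name: str, cases) -> bool:
    """Execute generated source unmodified and check it against test cases.

    cases is a list of (args_tuple, expected_result) pairs.
    """
    namespace = {}
    try:
        exec(generated_source, namespace)  # run the code exactly as generated
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in cases)
    except Exception:
        # Syntax errors, missing names and runtime failures all count as "not executable".
        return False


# Hypothetical stand-ins for model output.
generated = "def add(a, b):\n    return a + b\n"
broken = "def add(a, b)\n    return a + b\n"  # missing colon, so it will not run

if __name__ == "__main__":
    cases = [((1, 2), 3), ((0, 0), 0)]
    print(runs_and_passes(generated, "add", cases))
    print(runs_and_passes(broken, "add", cases))
```

Calling `exec` on untrusted text is unsafe outside a sandbox, so a check like this belongs in an isolated environment.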

While holding a conversation is one of the most trivial tasks for humans, holding a realistic conversation remains one of the biggest challenges in AI for a machine. According to Salesforce’s research, the model responded in a formal language in the first test, but eventually it will have to work with a mix of natural language and formal language. A conversation, in which the user asks several consecutive questions in natural language and the machine answers each one, is treated as a single long sequence. An auto-regressive decoder model then samples the next response based on the context of the previous conversation.
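The idea of treating a conversation as a single long sequence can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: `build_context`, `sample_reply`, `toy_next_token` and the `CANNED` table are hypothetical names, and the “model” is a lookup table standing in for a trained decoder. It only inspects the latest prompt, whereas a real model conditions on the entire sequence.

```python
# Canned replies keyed by a word in the prompt (stand-in for a trained model).
CANNED = {
    "create": "nums = [1, 2, 3]",
    "sum": "total = sum(nums)",
}


def build_context(turns):
    """Flatten (prompt, reply) turns into one long sequence of text."""
    lines = []
    for prompt, reply in turns:
        lines.append(f"# {prompt}")  # natural-language turn, written as a comment
        if reply is not None:
            lines.append(reply)      # formal-language turn (code)
    return "\n".join(lines) + "\n"


def toy_next_token(sequence: str) -> str:
    """Return the next character of the reply, given everything so far."""
    history, _, partial = sequence.rpartition("\n")
    last_prompt = history.rsplit("\n", 1)[-1]
    reply = next((code for key, code in CANNED.items() if key in last_prompt), "")
    return reply[len(partial)] if len(partial) < len(reply) else "<eos>"


def sample_reply(context, next_token_fn, max_tokens=64, stop="<eos>"):
    """Auto-regressive loop: each token is chosen from the context plus prior tokens."""
    tokens = []
    for _ in range(max_tokens):
        token = next_token_fn(context + "".join(tokens))
        if token == stop:
            break
        tokens.append(token)
    return "".join(tokens)


if __name__ == "__main__":
    turns = [("create a list of numbers", None)]
    first = sample_reply(build_context(turns), toy_next_token)
    # The first reply is appended to the history before the next question is asked.
    turns = [("create a list of numbers", first), ("sum those numbers", None)]
    print(build_context(turns) + sample_reply(build_context(turns), toy_next_token))
```

Each new question and answer is appended to the same growing text, so the sampler always receives the whole conversation rather than only the latest question.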

The big picture

CodeGen is part of the wider push towards no-code platforms. With the growing demand for coding and the increasing complexity of programming languages, no-code solutions like CodeGen are a potential blessing. They democratise coding and level the playing field for people with limited access. However, because some coding knowledge still helps with more complex problems, there is no danger of CodeGen putting developers out of work.

Also, in many ways, conversational AI seems to be at the forefront of the biggest innovations in AI. CodeGen’s use of conversational programming pushes this further, and the model has been open-sourced to speed up research in this area.


Poulomi Chatterjee
Poulomi is a Technology Journalist with Analytics India Magazine. Her fascination with tech and eagerness to dive into new areas led her to the dynamic world of AI and data analytics.
