CarperAI Unveils New Version of Code Synthesis Library OpenELM

OpenELM is based on OpenAI’s research paper "Evolution through Large Models" (ELM)

CarperAI, a research team within the EleutherAI research collective that works on democratising AI research, has introduced version 0.2 of OpenELM, an open-source library that combines large language models with evolutionary algorithms for code synthesis.

CarperAI has also unveiled a set of differential (diff) models that can predict changes in code. These models have been trained on millions of GitHub commits. The three models, namely diff-codegen-350m, diff-codegen-2b, and diff-codegen-6b, have been fine-tuned from Salesforce’s CodeGen code synthesis models.

Given a description of a change, such as a commit message, the models generate a diff that edits existing code, so complex code can be built up through a series of edits. This can also make the models better at correcting bugs, especially if the commit message describes the fix accurately.
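To make the idea of a diff concrete, here is a minimal sketch that uses Python's standard difflib module to produce the kind of output a diff model is trained to predict. The file name, code snippet and commit message are invented for illustration and do not come from CarperAI's training data.

```python
import difflib


def make_diff(before: str, after: str, filename: str) -> str:
    """Return a unified diff that turns `before` into `after`."""
    lines = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
    )
    return "".join(lines)


# Hypothetical example: a one-line bug fix and the message describing it.
before = "def area(w, h):\n    return w + h\n"
after = "def area(w, h):\n    return w * h\n"
commit_message = "Fix area() to multiply width by height"

diff_text = make_diff(before, after, "geometry.py")
print(commit_message)
print(diff_text)
```

A diff model sees the original code and the commit message and is asked to produce the diff itself, rather than computing it from a known "after" version as this sketch does.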


OpenELM is based on OpenAI’s research paper titled ‘Evolution through Large Models (ELM)’, which shows how large language models can act as intelligent mutation operators in an evolutionary algorithm, producing diverse, high-quality code even in domains that are not covered by the language model’s training set.

In addition to the features of the initial release, the latest version integrates with the Triton Inference Server, which can speed up inference with CodeGen models by a factor of ten. It also supports diff models, which allow code to be mutated inside a loop: the model is given a code segment and a commit message that describes the desired change.
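The loop described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not OpenELM's code: `stub_diff_model`, `score` and `evolve` are hypothetical names, and the stub ignores the commit message and simply nudges a number in the code, where a real diff model would read both inputs and propose an edit.

```python
import random
import re


def stub_diff_model(code: str, commit_message: str, rng: random.Random) -> str:
    """Hypothetical stand-in for a diff model.

    A real model would use the code and the commit message to emit a diff.
    This stub ignores the message and moves the first integer literal
    up or down by one.
    """
    match = re.search(r"-?\d+", code)
    if match is None:
        return code
    value = int(match.group()) + rng.choice([-1, 1])
    return code[:match.start()] + str(value) + code[match.end():]


def score(code: str) -> int:
    """Run the candidate and score how close f(3) is to 10 (0 is best)."""
    namespace = {}
    # Model-generated code is untrusted; a real system would run it in a sandbox.
    exec(code, namespace)
    return -abs(namespace["f"](3) - 10)


def evolve(seed: str, commit_message: str, steps: int, rng: random.Random) -> str:
    """Keep a single best program and accept only mutations that improve it."""
    best, best_score = seed, score(seed)
    for _ in range(steps):
        candidate = stub_diff_model(best, commit_message, rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best


seed_program = "def f(x):\n    return x + 1\n"
result = evolve(seed_program, "Make f(3) return 10", steps=200, rng=random.Random(0))
print(result)
```

Swapping the stub for a call to an actual diff model would leave the loop itself unchanged, which is the point of treating the model as a mutation operator.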

The initial release of OpenELM (version 0.1) included MAP-Elites for generated code, whether produced by a diff model or by prompt engineering an existing language model, along with the Sodarace 2D environment and a number of other baseline environments. It also included benchmarking of mutation LLMs using a play environment, and a sandbox that uses gVisor, a Docker container, and Flask to securely run code created by language models.
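For readers unfamiliar with MAP-Elites, the following is a minimal sketch of the algorithm on a toy problem, not OpenELM's implementation. The task, the `evaluate` and `mutate` functions and the choice of niche are all assumptions made for illustration, and `mutate` uses random edits where ELM would call a language model.

```python
import random


def evaluate(genome):
    """Toy task: fitness is highest when the values sum to 10.

    The niche (behaviour descriptor) is simply the genome length.
    """
    fitness = -abs(sum(genome) - 10)
    niche = len(genome)
    return fitness, niche


def mutate(genome, rng):
    """Placeholder mutation operator: append, delete or change one value.

    In ELM this is the step where a diff model or a prompted LLM
    would propose the edit.
    """
    child = list(genome)
    choice = rng.random()
    if choice < 0.3 and len(child) < 8:
        child.append(rng.randint(0, 5))
    elif choice < 0.5 and len(child) > 1:
        child.pop(rng.randrange(len(child)))
    else:
        child[rng.randrange(len(child))] = rng.randint(0, 5)
    return child


def map_elites(seed_genome, steps, rng):
    """Keep the best genome found so far in each niche."""
    archive = {}  # niche -> (fitness, genome)
    fitness, niche = evaluate(seed_genome)
    archive[niche] = (fitness, seed_genome)
    for _ in range(steps):
        # Pick a random elite, mutate it, and file the child under its niche.
        _, parent = rng.choice(list(archive.values()))
        child = mutate(parent, rng)
        fitness, niche = evaluate(child)
        if niche not in archive or fitness > archive[niche][0]:
            archive[niche] = (fitness, child)
    return archive


archive = map_elites([1, 2], steps=2000, rng=random.Random(0))
for niche in sorted(archive):
    print(niche, archive[niche])
```

Unlike a search that keeps only one best solution, the archive ends up holding the best candidate for each genome length, which is how MAP-Elites preserves diversity alongside quality.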

According to the OpenAI paper, LLMs trained on code, such as OpenAI’s Codex, have performed well at automated code generation. Evolutionary algorithms, on the other hand, offer a way to generate code by mutating well-known, or “seed”, programmes when the class of programmes of interest is rarely encountered in the training distribution. Genetic algorithms often need to be substantially customised with domain knowledge before they can make desirable changes. LLMs offer a way of encoding this domain knowledge and directing the genetic algorithm towards intelligent exploration of the search space: as the ELM method demonstrates, an LLM trained on code can suggest intelligent mutations for genetic programming (GP) algorithms. The fundamental process is generate, evaluate, and fine-tune. So far, everything except the conditional reinforcement learning part has been put into practice.

Shritama Saha
Shritama (she/her) is a technology journalist at AIM who is passionate about exploring the influence of AI on different domains, including fashion, healthcare and banking.
