CarperAI Unveils New Version of Code Synthesis Library OpenELM

OpenELM is based on OpenAI’s research paper "Evolution through Large Models" (ELM)

CarperAI, the democratised AI research team within the EleutherAI research collective, has introduced version 0.2 of OpenELM, an open-source library that combines large language models with evolutionary algorithms for code synthesis.

CarperAI has also unveiled a set of differential (diff) models that can predict changes in code. These models have been trained on millions of GitHub commits. The three models, namely diff-codegen-350m, diff-codegen-2b, and diff-codegen-6b, have been fine-tuned from Salesforce’s CodeGen code synthesis models.

To produce complex code, the models take a description of a change and generate a diff that edits existing code. This can make them better at fixing bugs, especially when the commit message is accurate.
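As a rough illustration of that input, the sketch below assembles a prompt from a file name, the current code, and a commit message. The helper `build_diff_prompt` is hypothetical, and the `<NME>`, `<BEF>`, `<MSG>` and `<DIFF>` delimiters are an assumed layout used for illustration; the exact format the released models expect should be checked against their model cards.

```python
def build_diff_prompt(filename: str, source: str, commit_message: str) -> str:
    """Assemble a prompt asking a diff model to edit `source`.

    The <NME>/<BEF>/<MSG>/<DIFF> delimiters are an assumed layout used
    for illustration, not a confirmed specification.
    """
    return (
        f"<NME> {filename}\n"
        f"<BEF> {source}\n"
        f"<MSG> {commit_message}\n"
        f"<DIFF>"
    )


# The model would be expected to continue the text after <DIFF> with a diff.
prompt = build_diff_prompt(
    filename="area.py",
    source="def area(w, h):\n    return w + h\n",
    commit_message="Fix area to multiply width by height",
)
```

The commit message is the only place where the intended change is stated, which is why an accurate message matters for the quality of the generated diff.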

OpenELM is based on OpenAI’s research paper titled ‘Evolution through Large Models (ELM)’, which shows how large language models can act as intelligent mutation operators in an evolutionary algorithm, enabling diverse, high-quality code output in domains that are not covered by the language model’s training set.

Besides the initial features, the latest version adds integration with the Triton Inference Server, which can speed up inference for CodeGen models tenfold. It also supports diff models, which allow code to be mutated within a loop by presenting a code segment together with a commit message that describes the change.

The initial release of OpenELM (version 1) included MAP-Elites for generated code, whether produced by a diff model or by prompt engineering an existing language model, along with the Sodarace 2D environment and a number of other baseline environments. It also included benchmarking of mutation LLMs using a play environment, and a sandbox that uses gVisor, a Docker container, and Flask to securely run code created by language models.
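For readers unfamiliar with MAP-Elites, the following is a minimal sketch of the general algorithm, not OpenELM’s implementation. It keeps the best candidate found so far in each niche of a behaviour grid. Every name here (`fitness`, `descriptor`, `gaussian_mutation`, `map_elites`) is hypothetical, the candidates are plain lists of numbers rather than programs, and the Gaussian mutation merely stands in for a diff model or prompted language model.

```python
import random
from typing import Callable, Dict, List, Tuple

Genome = List[float]
Niche = Tuple[int, int]


def fitness(genome: Genome) -> float:
    # Toy objective: higher is better, maximised at the origin.
    return -sum(x * x for x in genome)


def descriptor(genome: Genome, bins: int = 5) -> Niche:
    # Map the first two genes (clamped to [-1, 1]) onto a bins x bins grid.
    def to_bin(x: float) -> int:
        x = max(-1.0, min(1.0, x))
        return min(bins - 1, int((x + 1.0) / 2.0 * bins))

    return (to_bin(genome[0]), to_bin(genome[1]))


def gaussian_mutation(genome: Genome, rng: random.Random) -> Genome:
    # Stand-in for an LLM or diff-model mutation operator.
    return [x + rng.gauss(0.0, 0.2) for x in genome]


def map_elites(
    seed: Genome,
    mutate: Callable[[Genome, random.Random], Genome],
    iterations: int = 2000,
    rng_seed: int = 0,
) -> Dict[Niche, Tuple[float, Genome]]:
    rng = random.Random(rng_seed)
    archive: Dict[Niche, Tuple[float, Genome]] = {
        descriptor(seed): (fitness(seed), seed)
    }
    for _ in range(iterations):
        # Pick a random elite as the parent, then mutate it.
        _, parent = rng.choice(list(archive.values()))
        child = mutate(parent, rng)
        niche, score = descriptor(child), fitness(child)
        # Keep the child only if its niche is empty or it beats the incumbent.
        if niche not in archive or score > archive[niche][0]:
            archive[niche] = (score, child)
    return archive


archive = map_elites(seed=[0.5, -0.5, 0.0], mutate=gaussian_mutation)
```

Because the mutation operator is passed in as a function, swapping the random perturbation for a model-generated edit would not change the archive logic.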

According to the OpenAI paper, LLMs trained on code datasets, such as OpenAI’s Codex, have performed well at automated code generation. Evolutionary algorithms, on the other hand, offer a way to generate code by mutating well-known, or “seed”, programmes when the class of programmes of interest is rarely, if ever, encountered in the training distribution. Genetic algorithms often need to be substantially customised with domain knowledge before they make desirable changes. LLMs offer a way to encode this domain knowledge and direct the genetic algorithm towards intelligent exploration of the search space. As the ELM method demonstrates, an LLM trained on code can suggest intelligent mutations for genetic programming (GP) algorithms. The fundamental process is generate, evaluate, and fine-tune. So far, everything except the conditional reinforcement learning part has been put into practice.
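The generate and evaluate steps can be sketched on a toy problem as follows. This is an illustration under stated assumptions, not the ELM method itself: the hypothetical `mutate` function nudges an integer constant at random where ELM would ask a language model for an edit, the hypothetical `evaluate` function scores a candidate by running it, and the fine-tuning step is left out entirely.

```python
import random
import re


def evaluate(program: str) -> float:
    # Run the candidate and score how close f(3) is to the target value 30.
    namespace: dict = {}
    exec(program, namespace)  # trusted toy code only; real systems sandbox this step
    return -abs(namespace["f"](3) - 30)


def mutate(program: str, rng: random.Random) -> str:
    # Stand-in for an LLM proposing an edit: nudge one integer constant by 1.
    constants = list(re.finditer(r"\d+", program))
    target = rng.choice(constants)
    new_value = max(0, int(target.group()) + rng.choice([-1, 1]))
    return program[: target.start()] + str(new_value) + program[target.end():]


def evolve(seed: str, generations: int = 200, rng_seed: int = 0):
    rng = random.Random(rng_seed)
    best, best_score = seed, evaluate(seed)
    for _ in range(generations):
        child = mutate(best, rng)   # generate
        score = evaluate(child)     # evaluate
        if score >= best_score:     # keep the child if it is no worse
            best, best_score = child, score
    return best, best_score


# Seed programme: a known starting point that the loop mutates.
seed_program = "def f(x):\n    return 2 * x + 1\n"
best_program, best_score = evolve(seed_program)
```

The score can only stay the same or improve from one generation to the next, since a child replaces the current best only when it scores at least as well.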

Shritama Saha

Shritama (she/her) is a technology journalist at AIM who is passionate about exploring the influence of AI on different domains, including fashion, healthcare and banking.