DeepMind’s AlphaCode is Now Available on GitHub

AlphaCode is a transformer-based language model that consists of 41.4 billion parameters.

Recently, Google-backed DeepMind announced the release of its code-generation model, AlphaCode, on GitHub, where it has made the dataset and code available.

With this release, the company has also included extensive tests in the dataset to ensure that programmes which pass them are correct, a critical feature that current datasets lack.

Earlier this year, AlphaCode made waves for its potential to rival human programmers by analysing problems and generating complex programmes.

Simplifying computer programming 

The developers at DeepMind evaluated AlphaCode on competitive programming websites, where human developers are given programming problems and ranked on the basis of their results.

The evaluation used Codeforces, a popular platform for hosting coding competitions. AlphaCode was run on ten recent contests held there, covering a varied selection of problems.

The AI tool achieved an estimated rank within the top 54% of participants, suggesting that AlphaCode’s code generation system performs at a competitive level.

AlphaCode vs Codex 

AlphaCode is a transformer-based language model with 41.4 billion parameters, roughly 3.5 times the size of Codex, the 12-billion-parameter model behind GitHub Copilot. The AlphaCode system has three parts:

  • Data: The model is pre-trained on code from public GitHub repositories.
  • Learning: The model is then fine-tuned on a dataset suited to the task’s requirements (e.g., competitive programming at Codeforces).
  • Sampling and evaluation: Here, the AI tool generates a large number of candidate programmes for each problem. Then, through filtering and clustering, the candidates are narrowed down to a small subset of ten solutions that are submitted for external assessment.
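The sampling and evaluation step can be illustrated with a minimal Python sketch. It is not DeepMind's implementation: candidate programmes are modelled here as plain Python functions, and the names `safe_run`, `passes_examples`, `select_submissions`, `examples` and `probe_inputs` are hypothetical. The sketch assumes that filtering means discarding candidates that fail the example tests given in the problem statement, and that clustering means grouping the survivors by the outputs they produce on a set of extra inputs.

```python
from collections import defaultdict
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical stand-in: a "programme" is a function from int to int.
Program = Callable[[int], int]


def safe_run(program: Program, x: int) -> Optional[int]:
    """Run a candidate and treat any crash as 'no output'."""
    try:
        return program(x)
    except Exception:
        return None


def passes_examples(program: Program, examples: Sequence[Tuple[int, int]]) -> bool:
    """Filtering: keep only candidates that solve every example test."""
    return all(safe_run(program, x) == expected for x, expected in examples)


def select_submissions(
    candidates: Sequence[Program],
    examples: Sequence[Tuple[int, int]],
    probe_inputs: Sequence[int],
    limit: int = 10,
) -> List[Program]:
    """Filter candidates, cluster them by behaviour, pick one per cluster."""
    survivors = [p for p in candidates if passes_examples(p, examples)]

    # Clustering: candidates with identical outputs on the probe inputs
    # are assumed to be equivalent and land in the same group.
    clusters = defaultdict(list)
    for p in survivors:
        signature = tuple(safe_run(p, x) for x in probe_inputs)
        clusters[signature].append(p)

    # Largest clusters first; submit one representative from each.
    ranked = sorted(clusters.values(), key=len, reverse=True)
    return [cluster[0] for cluster in ranked[:limit]]


# Toy data: the "problem" is to double a number, with one example test.
candidates = [
    lambda x: x * 2,   # correct
    lambda x: x + x,   # correct, same behaviour as above
    lambda x: x ** 2,  # passes the example by coincidence
    lambda x: x * 3,   # fails the example and is filtered out
]
examples = [(2, 4)]
probe_inputs = [0, 1, 3]

picks = select_submissions(candidates, examples, probe_inputs, limit=2)
```

In this toy run the two doubling functions fall into one cluster and the squaring function into another, so the larger doubling cluster is ranked first. Picking one representative per cluster keeps the ten submissions from being spent on programmes that behave identically.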

AlphaCode is pre-trained on code in several programming languages, including C++, C#, Go, Java, JavaScript, Lua, PHP, TypeScript, Ruby, Scala, Rust and Python. This dataset consists of approximately 715 GB of code along with its descriptions.

With AlphaCode, DeepMind aims to fill a gap in AI models like Codex: problem-solving skills. AlphaCode has been trained not only to “understand” natural language but also to design complex programmes and algorithms and implement them in code.

AI expert Alberto Romero said in an article that the company created AlphaCode models in five sizes: 300M, 1B, 3B, 9B and 41B parameters. All of these are named AlphaCode, but the one the organisation refers to in its communications is an ensemble of the 9B and 41B models combined with clustering.

Romero further said that they built models of different sizes to compare the effects of scale, training times, and compute efficiency, among other factors. He also said that the model tends to program better in Python than in C++ and generates a similar amount of dead code to humans.

Aparna Iyer

Aparna Iyer has covered various sectors spanning education, wildlife, culture and law for close to a decade. She now writes on technology and is keen to unearth its capability for public good.