Copilot vs AlphaCode: The race for coding supremacy

DeepMind's AlphaCode made headlines by ranking within the top 54% of human coders. Can GitHub's Copilot keep up with AlphaCode's automated programming?

A study conducted by Cambridge University found that developers spend a majority of their time debugging, a task that costs the software industry around 300 billion USD every year. DeepMind’s latest AI-based code generation and analysis tool promises to reduce such costs by automating the routine, time-consuming tasks that developers face.

In contrast with GitHub Copilot, which suggests code, AlphaCode is capable of analysing a problem description and generating complete programs at a competitive level. These programs are meant to be both free of errors and faithful to the description they were written for.

The developers at DeepMind assessed AlphaCode on a competitive programming website, where human developers are given programming problems and ranked based on their results.

AlphaCode – The autonomous programmer

AlphaCode is a transformer-based language model with 41.4 billion parameters. That makes it roughly three and a half times the size of Codex, the 12-billion-parameter language model behind GitHub Copilot. The architecture of AlphaCode is based on three parts:

  1. Data – The AI tool is fed data from public GitHub repositories.
  2. Learning – The tool is trained on these datasets and then fine-tuned to the requirements of the task (e.g., competitive programming on Codeforces).
  3. Sampling and evaluation – Here, the AI tool samples a large number of program variations for each problem. The programs are then filtered and clustered down to a small subset of 10 solutions that is submitted for external assessment.
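
The sampling and evaluation step above can be sketched roughly as follows. This is a minimal illustration, not DeepMind's implementation: the candidates are plain Python callables standing in for generated programs, and the names `select_submissions`, `example_tests` and `probe_inputs` are hypothetical. It assumes that filtering means discarding candidates that fail the example tests, and that clustering groups candidates that behave identically on a few extra inputs.

```python
from collections import defaultdict


def _safe_run(program, value):
    """Run a candidate; a crash counts as a distinct (wrong) output."""
    try:
        return program(value)
    except Exception:
        return "<error>"


def select_submissions(candidates, example_tests, probe_inputs, limit=10):
    """Filter candidates on example tests, cluster them, pick up to `limit`."""
    # 1. Filter: keep only candidates that pass every example test.
    survivors = [
        p for p in candidates
        if all(_safe_run(p, x) == expected for x, expected in example_tests)
    ]

    # 2. Cluster: candidates with identical outputs on the probe inputs
    #    are treated as the same solution.
    clusters = defaultdict(list)
    for p in survivors:
        signature = tuple(_safe_run(p, x) for x in probe_inputs)
        clusters[signature].append(p)

    # 3. Rank: take one representative per cluster, largest cluster first.
    ranked = sorted(clusters.values(), key=len, reverse=True)
    return [group[0] for group in ranked[:limit]]


# Toy task (an assumption for the demo): "double the input".
candidates = [
    lambda x: x * 2,
    lambda x: x + x,
    lambda x: x ** 2,   # passes the single example test by coincidence
    lambda x: x + 2,    # same
    lambda x: 1 // 0,   # always crashes, so it is filtered out
]
picked = select_submissions(candidates, [(2, 4)], [3, 5], limit=2)
```

In this toy run the two doubling programs land in the same cluster, so only one of them is submitted, followed by one representative of a smaller cluster. The point of clustering is to avoid spending the 10 submissions on near-duplicates.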

Figure: Flowchart of the working of AlphaCode


Source: deepmind.com

AlphaCode’s AI system is pre-trained on code in a variety of programming languages: C++, C#, Go, Java, JavaScript, Lua, PHP, TypeScript, Ruby, Scala, Rust and Python. This dataset consists of approximately 715 GB of code along with its descriptions.

AlphaCode put to the test

The AI tool was entered into coding competitions on Codeforces, a popular platform for hosting them. The platform shares problems weekly and ranks participants with an algorithm along the lines of the Elo rating system used to rank chess players. AlphaCode was evaluated on a selection of 10 recent contests. It achieved an estimated rank within the top 54% of participants, which suggests that AlphaCode’s code generation system performs at a competitive level. An example of one of the Codeforces problems, given below, demonstrates AlphaCode’s capability to generate code:

Figure: The problem presented to AlphaCode is to work out whether one string can be turned into another by pressing backspace in place of typing some of its characters.

Figure: The solution AlphaCode generated after reading the problem, with code that meets the expected behaviour.
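
For readers who cannot see the figures, here is a minimal sketch of one well-known greedy approach to this kind of problem. It is not AlphaCode's output. It assumes the task as it is commonly stated on Codeforces: you type the string `s` one character at a time, and at any position you may press backspace instead, which deletes the last character typed so far (or does nothing if there is none). The function name `can_obtain` is made up for this example.

```python
def can_obtain(s: str, t: str) -> bool:
    """Return True if typing s with optional backspaces can produce t."""
    i = len(s) - 1  # position in s, scanning from the end
    j = len(t) - 1  # position in t still waiting to be matched

    while i >= 0:
        if j >= 0 and s[i] == t[j]:
            # Keep this character: it supplies t[j].
            i -= 1
            j -= 1
        else:
            # Press backspace in place of s[i]. That also erases the
            # character typed just before it, so two characters of s go.
            i -= 2

    # Success only if every character of t was matched.
    return j < 0


result_yes = can_obtain("ababa", "ba")
result_no = can_obtain("ababa", "bb")
```

Scanning from the end works because a backspace always removes characters in pairs (itself and the one before it), so whatever is left over at the front of `s` can only be discarded two characters at a time.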

Mike Mirzayanov, the founder of Codeforces, expressed his surprise: “I was sceptical because even in simple competitive problems, it is often required not only to implement the algorithm but also (and this is the most difficult part) to invent it. AlphaCode managed to perform at the level of a promising new competitor.” He added, “I can safely say the results of AlphaCode exceeded my expectations.”
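
As an aside on the Elo-style ranking mentioned in this section, the classic chess formulas can be sketched in a few lines. This shows the textbook Elo system only. Codeforces uses its own variation, which is not reproduced here, and the K-factor of 32 is simply a common chess default chosen for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (between 0 and 1) of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_rating(rating: float, expected: float, actual: float,
                  k: float = 32.0) -> float:
    """New rating after one game; `actual` is 1 for a win, 0.5 draw, 0 loss."""
    return rating + k * (actual - expected)


# Two equally rated players: each is expected to score 0.5.
even = expected_score(1500, 1500)
# The winner of that game gains half the K-factor.
after_win = update_rating(1500, even, 1.0)
```

A player who beats a stronger opponent gains more points than one who beats a weaker opponent, which is why a newcomer's rating can settle quickly after a handful of contests.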

Can Copilot keep up?

GitHub Copilot, the AI code suggestion tool built by GitHub together with OpenAI, runs on Codex, a descendant of the GPT-3 language model that has been fine-tuned on code. While it is built with goals similar to those of AlphaCode, Copilot seems to have a difficult road ahead. Here are some of the differences between the two code generation tools.

  • Training – Codex, the model behind GitHub Copilot, has 12 billion parameters, whereas AlphaCode’s code generation model has 41.4 billion, roughly three and a half times as many. The larger model gives AlphaCode more capacity, although parameter count alone does not translate into a proportional gain in performance.
  • Suggestion vs generation – While GitHub Copilot is built to assist programmers in writing the rudimentary sections of code, AlphaCode is capable of generating complete, complex programs.
  • Complexity – While both AI tools are in the early stages of development, GitHub Copilot suggests basic code involving simple logic, whereas AlphaCode has been tested on producing complex algorithms at a competitive level.

Kartik Wali
A writer by passion, Kartik strives to get a deep understanding of AI, Data analytics and its implementation on all walks of life. As a Senior Technology Journalist, Kartik looks forward to writing about the latest technological trends that transform the way of life!
