How Does an AI Code Autocompleter Work?

An average smartphone OS contains more than 10 million lines of code. Printing a million lines of code takes about 18,000 pages, roughly 14 times the length of Tolstoy’s War and Peace.

Though the number of lines of code is not a direct measure of a developer’s quality, it does indicate how much code has been written over the years.

There is always a simpler, shorter version of a piece of code, and also a longer, more exhaustive one. What if a tool used machine learning to pick the most suitable code and offer it in a drop-down menu? There is one now: Deep TabNine.

The developers behind TabNine have introduced Deep TabNine, a language-agnostic autocompleter.

The core idea here is to index the code and detect statistical patterns to make better suggestions while writing code.
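To make the idea of indexing code and detecting statistical patterns concrete, here is a minimal sketch. It is not TabNine's implementation, which the article does not describe. It assumes a simple bigram count, and the names `tokenize`, `build_index`, and `suggest` are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Identifiers as whole tokens, everything else as single characters.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\S")

def tokenize(source):
    """Split source text into tokens without parsing or compiling it."""
    return TOKEN_RE.findall(source)

def build_index(sources):
    """Count which token follows each token across all indexed sources."""
    index = defaultdict(Counter)
    for source in sources:
        tokens = tokenize(source)
        for prev, nxt in zip(tokens, tokens[1:]):
            index[prev][nxt] += 1
    return index

def suggest(index, prev_token, prefix, limit=3):
    """Return the most frequent followers of prev_token that start with prefix."""
    candidates = index.get(prev_token, Counter())
    matches = [(tok, n) for tok, n in candidates.items() if tok.startswith(prefix)]
    # Most frequent first; ties broken alphabetically for stable output.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [tok for tok, _ in matches[:limit]]

corpus = [
    "import os\nimport sys\nimport os\n",
    "from os import path\nimport json\n",
]
index = build_index(corpus)
print(suggest(index, "import", ""))   # all followers of "import", ranked
print(suggest(index, "import", "s"))  # only followers starting with "s"
```

Because the sketch only counts token sequences, it keeps working on code that does not compile, for instance a file with an unbalanced bracket.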

Because TabNine doesn’t need to compile the code, this pattern-based approach also brings gains in responsiveness, reliability, and ease of configuration.

GPT-2 Powered TabNine 

The picture above shows how typing ‘ex’ makes the IDE suggest related completions.

TabNine is an autocompleter that helps developers write code faster. To improve suggestion quality, the team behind TabNine added a deep learning model.

Deep TabNine is based on GPT-2, which uses the Transformer network architecture. GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset of 8 million web pages. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some text.
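The "predict the next word" objective can be written as an average negative log-likelihood over a sequence. The sketch below is a toy illustration only. `uniform_model` is a hypothetical stand-in for a real language model, and the five-token vocabulary is invented for the example.

```python
import math

def next_token_loss(tokens, predict):
    """Average negative log-likelihood of each token given all tokens before it."""
    total = 0.0
    for i in range(1, len(tokens)):
        probs = predict(tokens[:i])  # distribution over the next token
        # Tiny floor avoids log(0) for tokens the model gives no mass to.
        total += -math.log(probs.get(tokens[i], 1e-12))
    return total / (len(tokens) - 1)

def uniform_model(context):
    """Hypothetical model: ignores the context and spreads probability evenly."""
    vocab = ["def", "return", "x", "(", ")"]
    return {tok: 1.0 / len(vocab) for tok in vocab}

loss = next_token_loss(["def", "x", "(", ")"], uniform_model)
print(loss)
```

Training lowers this number by making the model assign more probability to the token that actually comes next. The uniform model scores log(5) here because it guesses evenly among five tokens.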

GPT-2 generates synthetic text samples in response to the model being primed with an arbitrary input. It adapts to the style and content of the conditioning text. This allows the user to generate realistic and coherent continuations about a topic of their choosing.
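Priming a model and letting it continue amounts to a loop that feeds each prediction back in as input. The following is a minimal sketch under strong assumptions. `toy_model` is a hypothetical lookup table that only looks at the last token, whereas GPT-2 conditions on the whole preceding text, and greedy selection is used here for simplicity where real systems often sample.

```python
def generate(prompt_tokens, predict, max_new_tokens=5, stop_token="<eos>"):
    """Extend the prompt one token at a time, always taking the likeliest token."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        probs = predict(tokens)
        next_token = max(probs, key=probs.get)
        if next_token == stop_token:
            break
        tokens.append(next_token)
    return tokens

# Hypothetical next-token probabilities, invented for illustration.
TOY_TABLE = {
    "for": {"i": 0.9, "<eos>": 0.1},
    "i": {"in": 0.8, "<eos>": 0.2},
    "in": {"range": 0.7, "<eos>": 0.3},
    "range": {"<eos>": 1.0},
}

def toy_model(context):
    """Hypothetical model that conditions only on the last token."""
    return TOY_TABLE.get(context[-1], {"<eos>": 1.0})

completion = generate(["for"], toy_model)
print(" ".join(completion))
```

Starting from the single token `for`, the loop keeps appending the likeliest next token until the toy model predicts the stop token.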

Semantic completion is provided by external software, which TabNine communicates with using the Language Server Protocol. TabNine comes with default install scripts for several common language servers. This is fully configurable, so one can use a different language server or add semantic completion for a new language.
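For readers unfamiliar with the Language Server Protocol, a completion request is a JSON-RPC message preceded by a `Content-Length` header. The sketch below builds one such message. How TabNine itself constructs or sends these messages is not covered in this article, and the helper name `completion_request` and the file URI are made up for the example.

```python
import json

def completion_request(request_id, uri, line, character):
    """Build a framed LSP textDocument/completion request as bytes."""
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/completion",
        "params": {
            "textDocument": {"uri": uri},
            # LSP positions are zero-based line and character offsets.
            "position": {"line": line, "character": character},
        },
    }
    payload = json.dumps(body).encode("utf-8")
    # The header carries the payload size in bytes, then a blank line.
    header = f"Content-Length: {len(payload)}\r\n\r\n".encode("ascii")
    return header + payload

message = completion_request(1, "file:///tmp/example.py", 4, 10)
print(message.decode("utf-8"))
```

A language server that receives this message replies with a list of completion items for the given cursor position.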

Deep TabNine uses subtle clues that are difficult for traditional tools to access. For example, the return type of app.get_user() is assumed to be an object with setter methods, while the return type of app.get_users() is assumed to be a list.
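The kind of clue involved can be illustrated with a deliberately crude hand-written rule. Deep TabNine is described as learning such associations from data, not from a rule like this one. `guess_return_kind` is a hypothetical helper written only for illustration.

```python
def guess_return_kind(call):
    """Toy rule: a plural getter name hints at a list, a singular one at an object."""
    # Keep only the method name: "app.get_users()" -> "get_users".
    name = call.rstrip("()").split(".")[-1]
    if name.startswith("get_") and name.endswith("s"):
        return "list"
    return "object"

print(guess_return_kind("app.get_user()"))
print(guess_return_kind("app.get_users()"))
```

A fixed rule like this misfires easily, for example on a name such as `get_status`. That weakness is one reason a model that learns from many real examples can do better than hand-coded heuristics.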

Although modeling code and modeling natural language might appear to be unrelated tasks, modeling code requires understanding English in some unexpected ways.

Instead of worrying about a trivial syntax slip or defining a class for task-specific functionality, developers can now work at a higher level with Deep TabNine, powered by OpenAI’s GPT-2.

Deep TabNine requires a lot of computing power, and running the model on a laptop would introduce noticeable latency. To address this challenge, the team now offers a service that lets developers use TabNine’s servers for GPU-accelerated autocompletion. It’s called TabNine Cloud.

Why Should One Opt For TabNine?

  1. TabNine works for all programming languages.
  2. TabNine does not require any configuration in order to work.
  3. TabNine does not require any external software (though it can integrate with it).
  4. Since TabNine does not parse the code, it will never stop working because of a mismatched bracket.
  5. If the language server is slow, TabNine will provide its own results while querying the language server in the background. TabNine typically returns its results in 20 milliseconds.
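The last point, returning fast results while a slower language server is still being queried, can be sketched with asyncio. This is an assumed design, not TabNine's actual code. Both suggestion functions are hypothetical stand-ins, and the 20 millisecond budget is borrowed from the figure quoted above purely for illustration.

```python
import asyncio

async def local_suggestions(prefix):
    """Hypothetical fast path: pattern-based results, available almost at once."""
    return [prefix + "it", prefix + "cept"]

async def language_server_suggestions(prefix, delay):
    """Hypothetical slow path: stands in for a semantic completion request."""
    await asyncio.sleep(delay)
    return [prefix + "ecute"]

async def complete(prefix, server_delay, budget=0.02):
    """Return results within the budget and keep the server query running."""
    server_task = asyncio.create_task(
        language_server_suggestions(prefix, server_delay)
    )
    results = await local_suggestions(prefix)
    try:
        # shield() stops the timeout from cancelling the server query itself.
        semantic = await asyncio.wait_for(asyncio.shield(server_task), timeout=budget)
        results = semantic + results
    except asyncio.TimeoutError:
        pass  # server too slow: answer with local results only
    return results, server_task

async def demo():
    fast, task = await complete("ex", server_delay=0.2)
    late = await task  # the background query still finishes later
    return fast, late

fast, late = asyncio.run(demo())
print(fast, late)
```

In this sketch the editor would show the local results immediately and could merge in the language server's answer once the background task completes.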

Supported languages:

Deep TabNine supports Python, JavaScript, Java, C++, C, PHP, Go, C#, Ruby, Objective-C, Rust, Swift, TypeScript, Haskell, OCaml, Scala, Kotlin, Perl, SQL, HTML, CSS, and Bash.

Get hands on with Deep TabNine here.


Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
