How Does an AI Code Autocompleter Work?

An average smartphone OS contains more than 10 million lines of code. Printing a million lines of code takes about 18,000 pages, which is roughly Tolstoy’s War and Peace 14 times over!

Though the number of lines of code is not a direct measure of a developer’s quality, it does indicate how much code has been written over the years.


There is always a simpler, shorter version of a piece of code, and a longer, more exhaustive one. What if a tool used machine learning to pick the most suitable code and offer it in a drop-down menu? There is one now: Deep TabNine.

The developers behind TabNine have introduced Deep TabNine, a language-agnostic autocompleter.

The core idea here is to index the code and detect statistical patterns to make better suggestions while writing code.

This brings additional gains in responsiveness, reliability, and ease of configuration because TabNine doesn’t need to compile the code.
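
The indexing idea can be illustrated with a toy sketch. This is not TabNine's actual implementation: the tokenizer, the `build_index` and `suggest` helpers, and the sample source are all assumptions made for illustration. It counts which token follows which in the indexed text and ranks completions by frequency, without ever parsing or compiling the code.

```python
import re
from collections import Counter, defaultdict

# Identifiers/keywords, or any single non-space character (hypothetical tokenizer).
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\S")


def tokenize(source: str) -> list[str]:
    return TOKEN_RE.findall(source)


def build_index(source: str) -> dict[str, Counter]:
    """Count which token follows each token in the indexed source."""
    index: dict[str, Counter] = defaultdict(Counter)
    tokens = tokenize(source)
    for prev, nxt in zip(tokens, tokens[1:]):
        index[prev][nxt] += 1
    return index


def suggest(index, prev_token: str, prefix: str, limit: int = 3) -> list[str]:
    """Rank tokens seen after prev_token that start with the typed prefix."""
    candidates = index.get(prev_token, Counter())
    matches = [(tok, n) for tok, n in candidates.items() if tok.startswith(prefix)]
    # Most frequent first; alphabetical order breaks ties.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [tok for tok, _ in matches[:limit]]


SAMPLE = """
import os
import sys
import os
from os import path
"""

index = build_index(SAMPLE)
print(suggest(index, "import", ""))   # ['os', 'path', 'sys']
print(suggest(index, "import", "s"))  # ['sys']
```

Because the sketch only looks at raw tokens, a mismatched bracket in the indexed text would not stop it from producing suggestions.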

GPT-2 Powered TabNine 

Typing ‘ex’, for example, makes the IDE prompt with a list of related completions.

TabNine is an autocompleter that helps developers write code faster. To improve suggestion quality, the team behind TabNine added a deep learning model.

Deep TabNine is based on GPT-2, which uses the Transformer network architecture. GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset of 8 million web pages. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some text.
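
The next-word objective can be written down compactly. The notation below is an assumption added for illustration: x_1, ..., x_n stands for the tokens of a text (GPT-2 works on sub-word tokens rather than whole words) and θ for the model parameters.

```latex
% The probability of a text factorizes into next-token predictions:
p_\theta(x_1, \dots, x_n) = \prod_{t=1}^{n} p_\theta(x_t \mid x_1, \dots, x_{t-1})

% Training minimizes the negative log-likelihood of the observed tokens:
\mathcal{L}(\theta) = -\sum_{t=1}^{n} \log p_\theta(x_t \mid x_1, \dots, x_{t-1})
```

Under this reading, an autocompleter simply asks the model which tokens are most likely to come next given everything typed so far.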

GPT-2 generates synthetic text samples in response to the model being primed with an arbitrary input. It adapts to the style and content of the conditioning text. This allows the user to generate realistic and coherent continuations about a topic of their choosing.

Semantic completion is provided by external software which TabNine communicates with using the Language Server Protocol. TabNine comes with default install scripts for several common language servers, and this is fully configurable, so one can use a different language server or add semantic completion for a new language. 
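
For a sense of what this communication looks like, here is a minimal sketch of a completion request as defined by the Language Server Protocol. The method name, parameter shape and `Content-Length` framing come from the protocol itself. The helper name `completion_request` and the file URI are hypothetical, and whether TabNine sends exactly this message is an assumption.

```python
import json


def completion_request(request_id: int, uri: str, line: int, character: int) -> bytes:
    """Frame a textDocument/completion request for an LSP server."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/completion",
        "params": {
            "textDocument": {"uri": uri},
            # LSP positions are zero-based.
            "position": {"line": line, "character": character},
        },
    }
    body = json.dumps(payload).encode("utf-8")
    # The base protocol prefixes every message with its byte length.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


# Hypothetical file and cursor position.
message = completion_request(1, "file:///project/app.py", line=10, character=4)
print(message.decode("utf-8"))
```

A client would write these bytes to the language server (typically over standard input) and read back a response carrying the same `id`.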

Deep TabNine uses subtle clues that are difficult for traditional tools to access. For example, the return type of app.get_user() is assumed to be an object with setter methods, while the return type of app.get_users() is assumed to be a list.

Although modeling code and modeling natural language might appear to be unrelated tasks, modeling code requires understanding English in some unexpected ways.

Instead of worrying about a trivial syntax slip or about defining a class for some task-specific functionality, developers can now work at a higher level with Deep TabNine, powered by OpenAI’s GPT-2.

Deep TabNine requires a lot of computing power, and running the model on a laptop would introduce noticeable latency. To address this, the team now offers a service that lets developers use TabNine’s servers for GPU-accelerated autocompletion. It’s called TabNine Cloud.

Why Should One Opt For TabNine?

  1. TabNine works for all programming languages.
  2. TabNine does not require any configuration in order to work.
  3. TabNine does not require any external software (though it can integrate with it).
  4. Since TabNine does not parse the code, it will never stop working because of a mismatched bracket.
  5. If the language server is slow, TabNine will provide its own results while querying the language server in the background. TabNine typically returns its results in 20 milliseconds.
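
The behaviour in the last point can be sketched as a query with a time budget. This is only an illustration under stated assumptions: `local_suggestions` and `language_server_suggestions` are hypothetical stand-ins, and the 20 millisecond budget is borrowed from the figure above rather than taken from TabNine's code.

```python
import concurrent.futures as cf
import time


def local_suggestions(prefix):
    # Stand-in for the fast built-in completer (hypothetical word list).
    return [w for w in ("export", "extends", "except") if w.startswith(prefix)]


def language_server_suggestions(prefix, delay):
    # Stand-in for a language server round trip; `delay` simulates slowness.
    time.sleep(delay)
    return [prefix + "_semantic"]


def complete(executor, prefix, server_delay, budget=0.02):
    """Return suggestions within `budget` seconds, plus the pending server query."""
    future = executor.submit(language_server_suggestions, prefix, server_delay)
    local = local_suggestions(prefix)
    try:
        semantic = future.result(timeout=budget)
    except cf.TimeoutError:
        # Server is too slow: answer now, let the query finish in the background.
        return local, future
    return semantic + local, future


with cf.ThreadPoolExecutor(max_workers=2) as pool:
    shown, pending = complete(pool, "ex", server_delay=0.3)
    print("shown immediately:", shown)
    print("arrives later:", pending.result())
```

Returning the future alongside the local results lets the caller refresh the menu once the slower semantic results arrive.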

Supported languages:

Deep TabNine supports Python, JavaScript, Java, C++, C, PHP, Go, C#, Ruby, Objective-C, Rust, Swift, TypeScript, Haskell, OCaml, Scala, Kotlin, Perl, SQL, HTML, CSS, and Bash.

Get hands on with Deep TabNine here.

Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
