Top 10 AI Innovations Of 2021 So Far

OpenAI and Microsoft’s GitHub Copilot is an AI-based tool for programmers to write better code

Published on August 6, 2021

by Avi Gopani

AI is a complex and ever-evolving field where organisations and individuals are constantly focused on finding novel solutions to pressing challenges. The year has been full of path-breaking innovations that have pushed the boundaries and made way for better outcomes. In this article, we list the top ten AI innovations of 2021 so far.

GitHub’s Copilot

OpenAI and Microsoft’s GitHub Copilot is an AI-based tool that helps programmers write better code. A programmer can describe a function in plain English as a comment, and Copilot will suggest the corresponding code. Copilot is built on OpenAI Codex, an AI system trained on a large dataset of public source code. It works with a broad set of frameworks and languages, and is best suited to Python, JavaScript, TypeScript, Go, and Ruby. The team claims Copilot is far more advanced than existing code assistants.
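To illustrate the comment-to-code workflow, the snippet below pairs a plain-English comment with the kind of function body such an assistant might suggest. It is a hand-written illustration, not actual Copilot output, and the function name `is_leap_year` is a hypothetical choice.

```python
# Return True if the given year is a leap year in the Gregorian calendar.
def is_leap_year(year: int) -> bool:
    # A year is a leap year if it is divisible by 4, except for century
    # years, which must also be divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


if __name__ == "__main__":
    for year in (1900, 2000, 2020, 2021):
        print(year, is_leap_year(year))
```

The programmer writes only the comment and, typically, the function signature; the tool proposes the body, which the programmer is still expected to review.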

Unified Transformer

Researchers from Facebook AI Research introduced a new Transformer model, Unified Transformer (UniT). UniT has an encoder-decoder architecture that handles multiple tasks and domains in a single model with fewer parameters. According to Facebook’s team, UniT is a step towards general intelligence.

OpenAI’s DALL·E & CLIP

DALL·E is OpenAI’s 12-billion parameter version of GPT-3. It is a transformer that generates images from text prompts. The model can work with multiple objects in an image, either rendering a new image or altering an existing one based on the prompt.

The OpenAI research team has also demonstrated a neural network called Contrastive Language-Image Pre-training, or CLIP. This neural network has been trained on 400 million pairs of images and text. Like the GPT family, CLIP can learn to perform a range of tasks, such as optical character recognition (OCR), geo-localisation, and action recognition.
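One way to picture how a model trained on image-text pairs can be applied to a new task is to score an image against several candidate captions and pick the closest match. The sketch below does this with cosine similarity and a softmax. The three-dimensional embeddings are made up for illustration, and the `temperature` value and the function names are assumptions; a real model would produce the embeddings with learned image and text encoders.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up embeddings standing in for encoder outputs (assumption).
image_embedding = [0.9, 0.1, 0.2]
text_embeddings = {
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a cat": [0.1, 0.9, 0.3],
    "a photo of a car": [0.2, 0.1, 0.9],
}


def zero_shot_classify(image_vec, captions, temperature=0.07):
    """Score the image against each caption and return caption probabilities."""
    labels = list(captions)
    scaled = [cosine(image_vec, captions[label]) / temperature for label in labels]
    return dict(zip(labels, softmax(scaled)))


probabilities = zero_shot_classify(image_embedding, text_embeddings)
best_caption = max(probabilities, key=probabilities.get)
print(best_caption)
```

Because the candidate captions are ordinary text, the set of classes can be changed without retraining anything in this sketch.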

BlenderBot 2

Facebook’s BlenderBot 2 is a first-of-its-kind open-source chatbot with long-term memory. Facebook has been working to make the AI more empathetic, knowledgeable and capable. BlenderBot 2.0 can build a long-term memory that it can access continuously, while simultaneously searching the internet for information and holding conversations on nearly any topic.

Google’s Translatotron 2

In 2019, Google released Translatotron, an end-to-end speech-to-speech translation model. At the time, it was the first end-to-end framework that could translate speech in one language directly into speech in another.

The system could synthesise translations that kept the sound of the original speaker’s voice intact. But this feature had the potential to be misused to generate speech in a different voice and create deepfake voices.

This year, Google released Translatotron 2, an updated version where the trained model is restricted to retain the source speaker’s voice. Unlike the previous version, it cannot generate speech in different voices, thereby mitigating potential misuse for creating spoofing audio artefacts.

Vertex AI

Google introduced Vertex AI, a managed machine learning platform for deploying and maintaining AI models, at this year’s Google I/O conference. The new platform brings AutoML and AI Platform together into a unified API, client library and user interface.

Previously, researchers had to run millions of test images to train algorithms; now, they can rely on the Vertex technology stack to do the heavy lifting.

FLAML

Microsoft’s FLAML is a Python package that finds the best-fit ML model at a low computational cost. It helps eliminate the manual process of choosing the best model and the best hyperparameters.

This AutoML system is mainly focused on model selection, hyperparameter tuning, feature engineering, neural architecture search, and model compression.
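To make "model selection and hyperparameter tuning" concrete, here is a minimal sketch of the general idea: try several candidate models and settings within a time budget and keep the one with the lowest validation error. It does not use FLAML's API; the toy data, the two candidate models, and every name in it are assumptions chosen for illustration.

```python
import time

# Toy 1-D regression data following y = 2x, split into train and validation.
train = [(x, 2.0 * x) for x in range(0, 20, 2)]
valid = [(x, 2.0 * x) for x in range(1, 20, 2)]


def mean_model(_params):
    """Baseline: always predict the average training target."""
    avg = sum(y for _, y in train) / len(train)
    return lambda x: avg


def knn_model(params):
    """k-nearest-neighbour regressor; k is the hyperparameter being tuned."""
    k = params["k"]

    def predict(x):
        nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
        return sum(y for _, y in nearest) / k

    return predict


# Each entry is one (model family, builder, hyperparameters) candidate.
search_space = [
    ("mean", mean_model, {}),
    ("knn", knn_model, {"k": 1}),
    ("knn", knn_model, {"k": 2}),
    ("knn", knn_model, {"k": 5}),
]


def validation_error(predict):
    """Mean squared error on the validation split."""
    return sum((predict(x) - y) ** 2 for x, y in valid) / len(valid)


def select_model(candidates, time_budget_s=1.0):
    """Evaluate candidates until the budget runs out; return the best one seen."""
    deadline = time.monotonic() + time_budget_s
    best = None
    for name, build, params in candidates:
        if time.monotonic() > deadline:
            break
        error = validation_error(build(params))
        if best is None or error < best[2]:
            best = (name, params, error)
    return best


best_name, best_params, best_error = select_model(search_space)
print(best_name, best_params, best_error)
```

An AutoML library automates this loop and chooses which candidates to try next far more cleverly than the fixed list used here.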

MusicBERT

MusicBERT is Microsoft’s large-scale pre-trained model for symbolic music understanding. It covers applications such as emotion classification, genre classification, and music piece matching. Microsoft created the model using the OctupleMIDI encoding method and a bar-level masking strategy, along with a large-scale symbolic music corpus containing more than 1 million music tracks.

Microsoft’s neural TTS

Microsoft’s neural text-to-speech (TTS) software enables developers to create custom synthetic voices. The system is structured in three layers: a text analyser, a neural acoustic model, and a neural vocoder.

The text analyser converts plain text to pronunciations, the acoustic model converts pronunciations to acoustic features and finally, the vocoder generates waveforms.
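The three stages described above can be sketched as a chain of functions, each feeding the next. Every stage here is a toy stand-in (letters as pronunciations, a pitch and duration as acoustic features, sine tones as the waveform), and the function names and `SAMPLE_RATE` are assumptions, so this shows only the data flow, not how Microsoft's neural models work internally.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed value for this toy)


def text_analyser(text):
    """Stage 1: plain text -> pronunciation tokens (here, lowercase letters)."""
    return [ch for ch in text.lower() if ch.isalpha()]


def acoustic_model(tokens):
    """Stage 2: tokens -> acoustic features (here, pitch in Hz and duration in s)."""
    return [(200.0 + 10.0 * (ord(token) - ord("a")), 0.05) for token in tokens]


def vocoder(features):
    """Stage 3: acoustic features -> waveform samples (here, plain sine tones)."""
    samples = []
    for freq, duration in features:
        count = round(SAMPLE_RATE * duration)
        samples.extend(
            math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(count)
        )
    return samples


def synthesise(text):
    """Run the full pipeline: text -> tokens -> features -> waveform."""
    return vocoder(acoustic_model(text_analyser(text)))


waveform = synthesise("Hi!")
print(len(waveform))
```

Keeping the stages separate means each one can be swapped out independently, for example replacing the toy vocoder without touching the text analyser.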

TensorFlow 3D

Google’s TensorFlow 3D is a highly modular library that brings 3D deep learning capabilities to TensorFlow. While TensorFlow on its own was not enough for 3D scene understanding, the 3D library provides a set of operations, loss functions, data processing tools, metrics, and models for developing, training, and deploying state-of-the-art 3D scene understanding models.

Avi Gopani

Avi Gopani is a technology journalist who seeks to analyse industry trends and developments from an interdisciplinary perspective at Analytics India Magazine. Her articles chronicle cultural, political and social stories that are curated with a focus on the evolving technologies of artificial intelligence and data analytics.