From Deep Nostalgia to major Python library releases, the first quarter of 2021 was chock-full of promising roll-outs. While hyperautomation, RPA deployments, the growth of MLOps tools, and machine learning as a service dominated the trends, the ML domain also witnessed breakthroughs in natural language processing, computer vision and conversational AI.
In this article, we have listed the major ML innovations in the first quarter of 2021.
Major Releases – JupyterLab 3.0, TensorFlow 3D, Java Version 16 & More
The year started with the release of JupyterLab 3.0, a popular web-based interface for Jupyter notebooks. JupyterLab offers many intuitive features that enable data scientists and developers to work with notebooks, code and data efficiently, including code consoles, kernel-backed documents and multiple views of documents.
Earlier this year, Google released TensorFlow 3D, which addresses significant challenges in computer vision. TensorFlow 3D is a highly modular library that brings 3D deep learning capabilities to TensorFlow. TF 3D can be leveraged across 3D deep learning research, from quick prototyping to deploying real-time inference systems.
Oracle’s Java Version 16 has been released with 17 new enhancements, including language improvements. The latest Java Development Kit (JDK) includes features like Pattern Matching, Records, packaging tools, elastic metaspace and more. The additions are aimed at helping machine learning developers to accelerate tasks.
Recently, PyTorch, the popular open-source machine learning library for Python, announced its new performance debugging tool, PyTorch Profiler, alongside its 1.8.1 release. Exposed under the new torch.profiler module namespace, PyTorch Profiler is the successor to the PyTorch autograd profiler. The tool uses a new GPU profiling engine, built on the NVIDIA CUPTI APIs, that can capture GPU kernel events with high fidelity, and it offers a simple profiler API that is useful for determining the most expensive operators in a model.
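As a minimal sketch of that API, the snippet below profiles a forward pass on a small stand-in model (the model and tensor shapes are illustrative, not from any PyTorch release notes) and prints the operators sorted by total CPU time:

```python
import torch
import torch.nn as nn
from torch.profiler import profile, record_function, ProfilerActivity

# A small stand-in model; any nn.Module is profiled the same way
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
inputs = torch.randn(32, 128)

with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    with record_function("model_inference"):  # label this region in the trace
        model(inputs)

# Summary table of operators, sorted to surface the most expensive ones
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```

On a CUDA machine, adding `ProfilerActivity.CUDA` to `activities` enables the GPU kernel capture mentioned above.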
On the first anniversary of TensorFlow Quantum (TFQ), a library for rapid prototyping of hybrid quantum-classical ML models, Google announced TensorFlow Quantum 0.5.0. The library integrates quantum computing with classical machine learning models to help researchers advance quantum research.
Low-Code Programming Language, Power Fx
At the recent Microsoft Ignite event, the company announced a new open-source low-code programming language, Power Fx. Based on Microsoft Excel formulas, Power Fx is a strongly-typed, general-purpose, declarative, functional language, and apps built with it can be edited directly in text editors such as Visual Studio Code or in an Excel-like formula bar.
H2O.ai introduced an open-source automatic ML package for wave apps that builds and integrates predictive AI/ML models into Wave apps. Wave ML lets users rapidly develop and deploy interactive, predictive and decision-support applications over the web. H2O’s Wave ML makes machine learning easy for developers to solve business problems with the power of AutoML.
AMPD Ventures announced a ‘Machine Learning Cloud’ initiative featuring AMD Instinct MI100 accelerators along with the AMD ROCm open software platform. The platform will initially be hosted at AMPD’s DC1 data centre in Vancouver, British Columbia, and might expand into other territories in the future. ROCm provides open compute languages, compilers, libraries and tools designed from the ground up.
Infosys launched Infosys Cobalt to democratise AI within the workforce and drive business transformation. Built on NVIDIA DGX A100 systems, it can offer massive compute density.
Deep Nostalgia uses deep learning algorithms to create animated videos from old photos. It applies pre-recorded driver videos of facial movements to a face in a still photo with high accuracy, making the picture smile, blink and move.
Geoff Hinton’s GLOM
Geoff Hinton’s GLOM proposes a way to represent part-whole hierarchies in neural networks. In the GLOM architecture, a scene-level top-down neural network converts the scene vector and an image location into an appropriate object vector for that location, including information about the 3D pose of the object relative to the camera. In GLOM, wrote Dr Hinton, a percept is a field, and the shared embedding vector that represents a whole is very different from the shared embedding vectors that represent its parts.
Hugging Face gained a lot of traction after it raised $40 million in Series B funding led by Addition. What started as a chatbot company soon transformed into an open-source provider of NLP technologies to companies such as Microsoft Bing. It extends access to conversational AI by creating abstraction layers for developers, helping users adopt technologies such as BERT, XLNet and GPT-2.
Google introduced the ToTTo dataset to address the hallucination problem in text generation, that is, generating text that is not faithful to the source. The dataset consists of 121,000 training examples, with 7,500 examples each for development and test. The team at Google claimed ToTTo is a suitable benchmark for research in high-precision text generation.
In most cases, hallucination occurs due to divergence between the source and the reference: the system latches on to spurious correlations between different parts of the training data.
ImageNet, one of the world’s most influential AI datasets, decided to blur people’s faces in its database to protect privacy. ImageNet was originally built on WordNet, a database of English words categorised by synonyms, and used Amazon Mechanical Turk workers to collect images of thousands of objects and people without their explicit consent. Since then, the database has expanded to over 1.5 million images categorised under 1,000 words.
Researchers from Facebook AI Research (FAIR) introduced a new Transformer model called Unified Transformer (UniT), which can learn tasks across multiple domains simultaneously. UniT takes both images and text as inputs and trains jointly on tasks ranging from visual perception and language understanding to joint vision-and-language reasoning.
Researchers at EleutherAI created GPT-Neo, an open-source alternative to GPT-3, the 175-billion-parameter language model released last year. Since GPT-3’s release, researchers have been looking to create an open-source version of it; the stated goal of the project is to replicate a GPT‑3 DaVinci-sized model and open-source it for free.
Google researchers showed that language models can be pre-trained with up to a trillion parameters. Switch Transformers, a technique for training language models, can increase the parameter count while keeping the floating-point operations (FLOPs) per input constant.
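The trick behind that constant-FLOPs scaling is sparse, top-1 expert routing: each token is sent to exactly one of many expert feed-forward networks, so parameters grow with the number of experts while per-token compute does not. The toy numpy sketch below illustrates the routing idea only; all shapes and names are illustrative, not the paper’s implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, n_tokens = 8, 4, 6

# Router: a linear layer producing one logit per expert for each token
router_w = rng.normal(size=(d_model, n_experts))
# Experts: independent feed-forward weights (parameters scale with n_experts)
experts = rng.normal(size=(n_experts, d_model, d_model))

tokens = rng.normal(size=(n_tokens, d_model))

logits = tokens @ router_w                              # (n_tokens, n_experts)
probs = np.exp(logits - logits.max(-1, keepdims=True))  # stable softmax
probs /= probs.sum(-1, keepdims=True)
chosen = probs.argmax(-1)                               # top-1: one expert per token

# Each token passes through exactly one expert, scaled by its gate value,
# so per-token compute is one matmul no matter how many experts exist.
out = np.stack([probs[i, chosen[i]] * (tokens[i] @ experts[chosen[i]])
                for i in range(n_tokens)])
print(out.shape)  # (6, 8)
```

Doubling `n_experts` here doubles the parameters in `experts` but leaves the per-token work unchanged, which is the property the Switch Transformer exploits.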
Decoding-enhanced BERT with disentangled attention, or DeBERTa, is a new Transformer-based neural language model that claims to improve on Google’s BERT and Facebook’s RoBERTa. Its core contribution is a disentangled self-attention mechanism, in which each token is represented by separate vectors for its content and its position. Microsoft’s DeBERTa model outperformed the 11-billion-parameter T5 on the SuperGLUE benchmark and surpassed the human baseline.
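A toy numpy sketch of that disentanglement, assuming the attention score is the sum of content-to-content, content-to-position and position-to-content terms (dimensions and variable names here are illustrative, not DeBERTa’s actual code):

```python
import numpy as np

rng = np.random.default_rng(1)
seq_len, d = 5, 16

Hc = rng.normal(size=(seq_len, d))            # content embeddings
P = rng.normal(size=(2 * seq_len - 1, d))     # relative-position embeddings

Wq, Wk = rng.normal(size=(d, d)), rng.normal(size=(d, d))
Wqr, Wkr = rng.normal(size=(d, d)), rng.normal(size=(d, d))

Qc, Kc = Hc @ Wq, Hc @ Wk                     # content projections
Qr, Kr = P @ Wqr, P @ Wkr                     # position projections

A = np.zeros((seq_len, seq_len))
for i in range(seq_len):
    for j in range(seq_len):
        rel = i - j + seq_len - 1             # index of relative offset (i - j)
        A[i, j] = (Qc[i] @ Kc[j]              # content-to-content
                   + Qc[i] @ Kr[rel]          # content-to-position
                   + Kc[j] @ Qr[rel])         # position-to-content

A /= np.sqrt(3 * d)                           # scale for the three summed terms
weights = np.exp(A - A.max(-1, keepdims=True))  # stable softmax over rows
weights /= weights.sum(-1, keepdims=True)
print(weights.shape)  # (5, 5)
```

Keeping the position terms separate, rather than adding position embeddings into the token vector up front, is what lets the model weigh content and relative position independently.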
Multimodal Neural Networks
Researchers at OpenAI discovered multimodal neurons in artificial neural networks that resemble neurons in the human brain. These neurons respond to clusters of abstract concepts rather than just specific visual features: the same neuron can fire for a range of emotions, or for photographs and drawings of the same animal. The underlying model learns visual concepts from natural language supervision, and this general-purpose vision system can match the performance of a ResNet-50 while outperforming existing vision systems on some of the most challenging datasets.
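Learning from natural language supervision works by embedding images and captions into a shared space and matching them by similarity. The numpy sketch below mimics that zero-shot matching step with random placeholder embeddings standing in for real image and text encoders; everything here is a hypothetical illustration, not OpenAI’s model.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 4, 32

# Placeholder embeddings a trained image tower and text tower would produce
img = rng.normal(size=(n, d))
txt = rng.normal(size=(n, d))

# L2-normalise so dot products become cosine similarities
img /= np.linalg.norm(img, axis=1, keepdims=True)
txt /= np.linalg.norm(txt, axis=1, keepdims=True)

sim = img @ txt.T                 # (n, n) image-caption similarity matrix
# Zero-shot matching: each image picks the caption with the highest similarity
best_caption = sim.argmax(axis=1)
print(best_caption.shape)  # (4,)
```

Training pushes matching image-caption pairs toward high similarity and mismatched pairs toward low similarity, which is what lets one neuron respond to a concept regardless of whether it appears as a photo, a drawing or text.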
New Benchmark For Meta Reinforcement Learning
DeepMind and University College London researchers released a principled benchmark for meta-reinforcement learning (meta-RL) research, known as Alchemy. Its combination of structural richness and structural transparency supports research into the flexibility and sample efficiency of reinforcement learning, and it addresses hurdles such as the scarcity of adequate benchmark tasks and the lack of support for principled analysis.
OpenAI released DALL·E, a 12-billion-parameter version of GPT-3 trained to generate images from text prompts. The name is a portmanteau of the painter Salvador Dalí and the Pixar movie WALL·E. DALL·E can render an image from scratch and alter aspects of an existing image based on text prompts, and it is also trained to work with multiple objects in an image. The OpenAI team has tested DALL·E’s capabilities in specific situations such as generating 3D imagery, cross-sectional views and images based on contextual text captions.
IBM’s Molecule Generation Experience (MolGX)
Molecule Generation Experience (MolGX) is a cloud-based, AI-driven platform for designing novel molecules. Discovering a new molecule usually takes over ten years and $10–100 million in funding; the MolGX platform aims to bring down the time required for the process. MolGX is based on the idea of inverse design: just as generative AI can draw realistic images of landscapes or portraits of people who don’t exist, MolGX generates candidate molecular structures with specified target properties.
AI Reviews Scientific Papers
The machine learning community churns out a large number of research papers for international conferences, but the review system for these papers has struggled to keep up. Researchers at Carnegie Mellon University used natural language processing to review them automatically.
Srishti currently works as Associate Editor at Analytics India Magazine. When not covering the analytics news, editing and writing articles, she could be found reading or capturing thoughts into pictures.