Uber’s centre for advanced artificial intelligence research and platforms powers applications in computer vision, natural language processing, deep learning, advanced optimisation methods, and intelligent location and sensor processing across the company. The organisation has been conducting research in these areas for quite some time now.
For instance: improving location accuracy with sensing and perception, leveraging computer vision to make Uber safer and more efficient, enhancing real-time forecasting, building conversational AI, and much more. These models, however, are data-hungry, and producing large amounts of human-labelled data is both time-consuming and expensive.
To mitigate this dilemma, AI researchers at Uber began investigating whether an algorithm could be used to generate large amounts of data. Recently, they created a learning algorithm that can automatically generate training data, learning environments and curricula to help AI agents learn rapidly.
The Learning Algorithm
Generative Teaching Networks (GTNs) comprise a meta-learning approach for creating synthetic data, focused mainly on supervised learning.
GTN training has two nested training loops:
- an inner loop that trains a learner network
- an outer loop that trains a generator network, which produces the synthetic training data for the learner network.
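The two nested loops can be illustrated with a deliberately tiny sketch (our own toy construction, not Uber's implementation): the learner is a one-parameter linear model, the generator is a single parameter that plays the role of a synthetic label, and the outer loop differentiates through the inner update analytically.

```python
# Toy sketch of GTN-style nested optimisation (illustrative only).
# Learner: y = w * x, trained by ONE inner SGD step on a synthetic sample.
# Generator: a single parameter g, the label of the synthetic sample at x_s = 1.
# Outer loop: adjust g so that the trained learner fits the real task y = 2x.

def inner_step(w0, g, x_s=1.0, lr_in=0.5):
    # One SGD step of the learner on the synthetic sample (x_s, g).
    grad_w = 2.0 * x_s * (w0 * x_s - g)
    return w0 - lr_in * grad_w

def outer_grad(g, x_real=1.0, y_real=2.0, x_s=1.0, lr_in=0.5):
    # Meta-gradient dL/dg, where L is the real-data loss of the trained learner.
    w1 = inner_step(0.0, g, x_s, lr_in)
    dL_dw1 = 2.0 * (w1 * x_real - y_real) * x_real
    dw1_dg = lr_in * 2.0 * x_s          # from the inner update rule
    return dL_dw1 * dw1_dg

g = 0.0                                  # generator starts uninformative
for _ in range(200):
    g -= 0.1 * outer_grad(g)             # outer loop: update the generator

w_trained = inner_step(0.0, g)
print(round(g, 3), round(w_trained, 3))  # prints "2.0 2.0"
```

The generator converges to the label 2.0, i.e. it learns to emit exactly the synthetic data that makes the learner succeed on the real task; in the actual paper both networks are deep and the meta-gradient is computed by backpropagating through many inner steps.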
How It Works
Fig: Overview of Generative Teaching Networks
GTNs are deep neural networks that generate data as well as training environments on which a learner, such as a newly initialised neural network, trains before being tested on a target task. The next step is to differentiate through the entire learning process via meta-gradients to update the GTN parameters and improve performance on the target task.
This approach differs from a Generative Adversarial Network (GAN): rather than competing, the two networks cooperate, since their interests are aligned towards having the learner perform well on the target task when trained on data produced by the GTN. The generator and learner networks are trained with meta-learning via nested optimisation, consisting of inner and outer training loops.
Some of the key benefits of using GTNs are mentioned below:
- GTNs can rapidly train new neural network models and are particularly useful when many independent models need to be trained.
- GTNs have the beneficial property that they can theoretically generate any type of data or training environment, making their potential impact large.
- The synthetic data in GTNs is agnostic not only to the weight initialisation of the learner network but also to the learner’s architecture.
- GTN-produced synthetic data can be used as a drop-in replacement for real data when evaluating a candidate architecture’s performance.
- GTNs can generate virtually any learning environment for the learning algorithm.
- GTNs can teach networks on demand to realise particular trade-offs between accuracy, inference time and memory requirements.
GTNs outperform both the real-data baseline (training learners with random mini-batches of real data) and the distillation baseline (training learners with distilled synthetic data), achieving state-of-the-art (SOTA) performance in rapidly training learners to higher accuracy. A further strong point is that, because GTNs can rapidly train new architectures, they could be used to create neural networks on demand that meet specific design constraints, such as a given balance of performance, speed and energy usage, or a particular subset of skills.
The researchers further used GTNs to accelerate Neural Architecture Search (NAS) on CIFAR10, finding a high-performing Convolutional Neural Network (CNN) architecture for the CIFAR10 image-classification task at limited compute cost. GTN-NAS improves the NAS state of the art by finding higher-performing architectures when controlling for the search proposal mechanism.
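The idea behind using synthetic data to speed up architecture search can be sketched as follows (a toy construction of ours, not the GTN-NAS code): candidate "architectures" are ranked by fitting them on a tiny synthetic set and scoring them on the real task, so each candidate evaluation is cheap.

```python
# Toy sketch of proxy architecture evaluation on a small synthetic set
# (illustrative only; the names and setup here are our own assumptions).
# Each "architecture" is a feature map; we fit one weight per candidate on
# a tiny synthetic set and rank candidates by error on the real task.

def fit_weight(feature, xs, ys):
    # Closed-form least squares for y ≈ w * feature(x).
    num = sum(feature(x) * y for x, y in zip(xs, ys))
    den = sum(feature(x) ** 2 for x in xs)
    return num / den

def real_task_error(feature, w, xs, ys):
    return sum((w * feature(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Target task: y = x^2 on a grid; the synthetic set is far smaller.
x_real = [i / 10 for i in range(-10, 11)]
y_real = [x * x for x in x_real]
x_syn = [-1.0, -0.5, 0.5, 1.0]
y_syn = [x * x for x in x_syn]

candidates = {"linear": lambda x: x, "abs": abs, "square": lambda x: x * x}
scores = {}
for name, feat in candidates.items():
    w = fit_weight(feat, x_syn, y_syn)
    scores[name] = real_task_error(feat, w, x_real, y_real)

best = min(scores, key=scores.get)
print(best)  # prints "square": the quadratic candidate matches the target task
```

In GTN-NAS the same principle applies at scale: a few SGD steps on GTN-generated data give a fast, reliable estimate of how a candidate CNN would rank after full training on CIFAR10.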