How To Make Meta-learning More Effective

Meta-learning was introduced to enable machine learning models to learn new skills and adapt to ever-changing environments from a finite number of training examples. The main objective of this approach is to find model-agnostic solutions.

One highly successful meta-learning algorithm has been Model-Agnostic Meta-Learning (MAML). With deep neural networks as the underlying model, this algorithm has been highly influential, inspiring significant follow-on work such as first-order variants, probabilistic extensions, and augmentation with generative modelling.

Model-Agnostic Meta-Learning (MAML) trains a model's parameters in such a way that a small number of gradient updates leads to fast learning on a new task.

MAML optimizes for a set of parameters θ such that, when a gradient step is taken for a specific task i, the updated parameters land close to the optimal parameters θ*(i) for that task.
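As a concrete illustration, here is a toy NumPy sketch of this update on one-dimensional linear regression tasks, using the first-order approximation (the second-order term of MAML's meta-gradient is dropped). The task distribution, learning rates, and names are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

def loss_and_grad(w, a, n=10):
    """MSE of the model y_hat = w * x on fresh data from the task
    y = a * x, together with the gradient dL/dw."""
    x = rng.uniform(-1.0, 1.0, n)
    err = w * x - a * x
    return np.mean(err ** 2), np.mean(2 * err * x)

w = 0.0                  # meta-initialization (theta)
alpha, beta = 0.1, 0.01  # inner- and outer-loop learning rates

for step in range(1000):
    a = rng.uniform(-2.0, 2.0)            # sample a task i (its slope)
    # Inner loop: one gradient step on the task's support data.
    _, g_inner = loss_and_grad(w, a)
    w_adapted = w - alpha * g_inner
    # Outer loop (first-order): evaluate the adapted parameter on fresh
    # query data and apply that gradient to the meta-initialization.
    _, g_outer = loss_and_grad(w_adapted, a)
    w -= beta * g_outer
```

After meta-training, w is a starting point from which a single inner step already fits a new slope well.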

Model-agnostic meta-learning, like any machine learning model, eventually runs into issues such as a shortage of labelled data: the model is starved of data and forced to learn from very few examples. In this scenario, two candidate mechanisms are widely considered:

  • Rapid learning
  • Feature reuse

Evaluating Rapid Learning And Feature Reuse

Rapid learning is the use of large, efficient changes in the representations, whereas feature reuse involves a meta-initialization with existing high-quality features.

In rapid learning, large representational and parameter changes occur during adaptation to each new task as a result of favorable weight conditioning from the meta-initialization. In feature reuse, the meta-initialization already contains highly useful features that can mostly be reused as is for new tasks, so little task-specific adaptation occurs.

In rapid learning, outer-loop training leads to a parameter setting that is well-conditioned for fast learning, and inner-loop updates result in significant task specialization. In feature reuse, the outer loop leads to parameter values corresponding to reusable features, from which the parameters do not move significantly in the inner loop.

To determine whether MAML systems benefit more from rapid learning or from feature reuse, researchers from MIT, Google, and Cornell University collaborated on an evaluation.

They performed two sets of experiments:

  1. They evaluate few-shot learning performance when freezing parameters after MAML training, without test-time inner-loop adaptation.
  2. They use representational similarity tools to directly analyze how much the network's features and representations change through the inner loop (see the sketch after this list).
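Representational similarity tools of this kind include CCA similarity and CKA. As an illustration only, here is a small NumPy sketch of linear CKA with made-up activation matrices; a score near 1 between pre- and post-adaptation features would point to feature reuse:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two activation matrices
    of shape (num_examples, num_features). Returns a value in [0, 1];
    values near 1 mean the representations are essentially unchanged."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    cross = np.linalg.norm(X.T @ Y, "fro") ** 2
    return cross / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

# Hypothetical comparison of one layer's activations before and after
# inner-loop adaptation; a high CKA score indicates feature reuse.
rng = np.random.default_rng(0)
before = rng.normal(size=(100, 64))
after = before + 0.01 * rng.normal(size=(100, 64))
print(linear_cka(before, after))  # close to 1.0
```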

The MiniImageNet dataset was used for the experiments. The results show feature reuse to be the dominant factor behind the effectiveness of the meta-learning algorithm.

In their paper, the authors state that, for all layers except the head of the neural network, the meta-initialization learned by the outer loop of MAML results in very good features that can be reused as-is on new tasks.

Moreover, inner-loop adaptation does not significantly change the representations of these layers, even from early on in training. They therefore suggest a simplification of MAML: the ANIL (Almost No Inner Loop) algorithm.
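In ANIL, the inner loop adapts only the network's head, while the body is updated only by the outer loop. Below is a minimal PyTorch sketch of one ANIL meta-update, assuming a small fully connected body and a 5-way classification head; the architecture, function name, and hyperparameters are illustrative and not taken from the authors' code:

```python
import torch
import torch.nn as nn

# Hypothetical few-shot model: a feature extractor ("body") and a linear "head".
body = nn.Sequential(nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU())
head = nn.Linear(64, 5)  # 5-way classification

alpha = 0.01             # inner-loop learning rate (illustrative)
meta_opt = torch.optim.Adam(list(body.parameters()) + list(head.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def anil_meta_step(support_x, support_y, query_x, query_y, inner_steps=5):
    """One ANIL meta-update: the inner loop adapts only the head."""
    features = body(support_x)  # body features are computed once, never adapted
    w, b = head.weight.clone(), head.bias.clone()
    for _ in range(inner_steps):
        # Functional forward pass so gradients flow back through the
        # adaptation into the original head parameters.
        loss = loss_fn(features @ w.t() + b, support_y)
        gw, gb = torch.autograd.grad(loss, (w, b), create_graph=True)
        w, b = w - alpha * gw, b - alpha * gb
    # Outer loop: the query loss through the adapted head updates
    # both the body and the (unadapted) head initialization.
    query_loss = loss_fn(body(query_x) @ w.t() + b, query_y)
    meta_opt.zero_grad()
    query_loss.backward()
    meta_opt.step()
    return query_loss.item()
```

Because the inner loop differentiates only the head's few parameters, each adaptation step is far cheaper than in full MAML.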

The researchers claim that the ANIL algorithm significantly speeds up both training and inference, since the inner loop now updates only the network's head.

Key Takeaways

  • Researchers find that feature reuse is the dominant component in MAML’s efficacy
  • They introduce the ANIL (Almost No Inner Loop) algorithm, a simplification of MAML with identical performance on standard image classification and reinforcement learning benchmarks
  • Results show that the features from a network’s lower layers are sufficient for few-shot classification at test time, without the final layer (a sketch of this idea follows below)
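As a hedged sketch of that last point, in the spirit of the paper's no-inner-loop test-time evaluation (the function name, prototype-style scoring, and shapes are our own choices, not the authors' exact method): classify queries purely from frozen body features, with no learned head and no adaptation.

```python
import torch
import torch.nn.functional as F

def nil_style_classify(body, support_x, support_y, query_x, num_classes=5):
    """Classify query examples with frozen body features only: no inner
    loop and no learned head. Each query gets the class whose support
    features it is most cosine-similar to, on average."""
    with torch.no_grad():
        s = F.normalize(body(support_x), dim=1)  # (n_support, d)
        q = F.normalize(body(query_x), dim=1)    # (n_query, d)
    sims = q @ s.t()                             # cosine similarities
    # Average similarity to each class's support examples.
    scores = torch.stack(
        [sims[:, support_y == c].mean(dim=1) for c in range(num_classes)],
        dim=1,
    )
    return scores.argmax(dim=1)
```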

The applications of meta-learning are not limited to semi-supervised tasks; it can also be leveraged for tasks such as item recommendation, density estimation, and reinforcement learning.
