
TSCL Is The Newest Addition To Deep Learning Algorithms


One of the newer developments in deep learning is curriculum learning, in which an algorithm is trained on examples presented in a meaningful order of increasing complexity rather than fed to it in arbitrary order.

A new study by researchers who worked with OpenAI digs deep into curriculum learning. Their method, called Teacher-Student Curriculum Learning (TSCL), aims to be a game changer in learning the subtasks that make up larger deep learning tasks. This article looks at the technical details of TSCL.

What Is TSCL?

In the context of curriculum learning, a Teacher algorithm assigns subtasks to a Student algorithm in increasing order of complexity; the Student trains on each subtask and returns a score. This cycle is repeated until the Student performs all the tasks successfully.

As the Student masters a particular task, the Teacher assigns more probability to the next task and focusses less on the current one, since it has already been learnt.

Repeating tasks in this way makes learning faster. One important point is that the Teacher algorithm also learns alongside the Student algorithm, updating its choices from the scores the Student returns. This forms the basis of the TSCL algorithm.
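The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ToyStudent`, `teacher_probs`, the 0.9 mastery threshold and the 0.9/0.1 probability split are all hypothetical choices made here, not details taken from the study.

```python
import random


class ToyStudent:
    """Hypothetical stand-in for a real learner: skill on task k only grows
    once task k-1 is mostly learnt, mimicking a curriculum dependency."""

    def __init__(self, num_tasks):
        self.skill = [0.0] * num_tasks

    def train(self, task):
        # Practice helps only if the previous task is already mostly learnt.
        ready = task == 0 or self.skill[task - 1] > 0.8
        if ready:
            self.skill[task] = min(1.0, self.skill[task] + 0.1)
        return self.skill[task]  # score reported back to the Teacher


def teacher_probs(scores, mastered=0.9):
    """Put most probability on the first task that is not yet mastered.
    Assumes at least two tasks."""
    n = len(scores)
    current = next((i for i, s in enumerate(scores) if s < mastered), n - 1)
    probs = [0.1 / (n - 1)] * n  # small chance of revisiting other tasks
    probs[current] = 0.9
    return probs


def run(num_tasks=4, steps=400, seed=0):
    rng = random.Random(seed)
    student = ToyStudent(num_tasks)
    scores = [0.0] * num_tasks  # the Teacher's latest view of each task
    for _ in range(steps):
        probs = teacher_probs(scores)
        task = rng.choices(range(num_tasks), weights=probs)[0]
        scores[task] = student.train(task)
    return scores
```

Calling `run()` returns the final per-task scores. In this toy setting the Teacher's probability mass moves from task to task as each one crosses the mastery threshold.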

In fact, the study frames TSCL as a Partially Observable Markov Decision Process (POMDP). Two POMDP formulations are laid out: a simple one for reinforcement learning and a batch one for supervised learning. The POMDP framing is chosen so that the Teacher algorithm's reward can be optimised in line with the Student algorithm's performance on the subtasks.

Matiisen et al., the creators of TSCL, say, “While an obvious choice for optimization criteria would have been the performance in the final task, initially the Student might not have any success in the final task and this does not provide any meaningful feedback signal to the Teacher. Therefore we choose to maximize the sum of performances in all tasks. The assumption here is that in curriculum learning the final task includes the elements of all previous tasks, therefore good performance in the intermediate tasks usually leads to good performance in the final task.”
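One way to write down the contrast drawn in the quote is shown below. The notation is an assumption introduced here for illustration: $N$ is the number of tasks, with task $N$ being the final one, $T$ is the number of training steps, and $x_T^{(i)}$ is the Student's score on task $i$ after step $T$.

```latex
% Obvious criterion: performance on the final task only
\max \; x_T^{(N)}

% Criterion described in the quote: sum of performances over all tasks
\max \; \sum_{i=1}^{N} x_T^{(i)}
```

Early in training $x_T^{(N)}$ tends to stay near zero and carries little signal, whereas the sum already changes as soon as the Student improves on any of the easier tasks.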

POMDPs are generally solved using RL, but that training is itself slow and iterative. The researchers therefore draw on the well-known non-stationary multi-armed bandit problem and propose the following new algorithms for use in TSCL.

  1. Online Algorithm
  2. Naive Algorithm
  3. Window Algorithm
  4. Sampling Algorithm

All of these algorithms are tuned to favour tasks on which the Student's score is improving, while keeping track of how many times each task has been performed in the Teacher-Student setup.
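To give a flavour of the bandit view, here is a minimal sketch loosely modelled on the Online variant. The specifics are assumptions for illustration and are not taken from this article: the reward is taken to be the change in a task's score, it is smoothed with an exponentially weighted moving average, and tasks are picked epsilon-greedily by absolute smoothed reward. The class and parameter names (`OnlineBanditTeacher`, `alpha`, `epsilon`) are hypothetical.

```python
import random


class OnlineBanditTeacher:
    """Each task is a bandit arm; the payoff is how fast its score changes."""

    def __init__(self, num_tasks, alpha=0.1, epsilon=0.1, seed=0):
        self.q = [0.0] * num_tasks           # smoothed progress estimate per task
        self.last_score = [0.0] * num_tasks  # most recent score per task
        self.counts = [0] * num_tasks        # how often each task was performed
        self.alpha = alpha                   # smoothing factor (assumed value)
        self.epsilon = epsilon               # exploration rate (assumed value)
        self.rng = random.Random(seed)

    def choose_task(self):
        # Explore occasionally so that stale estimates get refreshed.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        # Absolute value: a falling score (forgetting) also deserves attention.
        return max(range(len(self.q)), key=lambda i: abs(self.q[i]))

    def update(self, task, score):
        reward = score - self.last_score[task]  # change in score
        self.last_score[task] = score
        self.counts[task] += 1
        # Exponentially weighted moving average of the reward.
        self.q[task] += self.alpha * (reward - self.q[task])
```

In use, the Teacher would call `choose_task()`, let the Student train on that task, and pass the resulting score to `update()`. Because rewards are score changes rather than scores, a task that has plateaued stops attracting attention, which is what makes the bandit non-stationary.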

TSCL In Decimal Number Addition And Minecraft

Curriculum learning has many research-oriented applications. One notable example is decimal number addition with an LSTM, implemented as a sequence-to-sequence model. Although the technique has found success in supervised learning, it has suffered setbacks, either in learning performance or in using too much memory for addition.

Matiisen et al. therefore take up this problem for analysis. As mentioned earlier, the batch-training POMDP is the TSCL formulation used here. The addition task is studied with two kinds of curriculum: 1-dimensional and 2-dimensional. In the former, each task is defined by the maximum number of digits involved in the addition, while the latter adds a further criterion: the length of each number is chosen separately.
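As a rough illustration of how such subtasks could be generated, the sketch below assumes each task is identified by the digit lengths of the two operands: equal lengths for the 1-dimensional curriculum and independent lengths for the 2-dimensional one. The function names and the string encoding of a sample are hypothetical and are not the authors' code.

```python
import random


def random_number(num_digits, rng):
    """Uniform random integer with exactly num_digits decimal digits."""
    low = 0 if num_digits == 1 else 10 ** (num_digits - 1)
    return rng.randint(low, 10 ** num_digits - 1)


def make_sample(len_a, len_b, rng):
    """One sequence-to-sequence pair: input 'a+b', target 'a+b evaluated'."""
    a = random_number(len_a, rng)
    b = random_number(len_b, rng)
    return f"{a}+{b}", str(a + b)


def tasks_1d(max_digits):
    """1-dimensional curriculum: both operands share the same length."""
    return [(n, n) for n in range(1, max_digits + 1)]


def tasks_2d(max_digits):
    """2-dimensional curriculum: each operand's length is chosen separately."""
    return [(i, j)
            for i in range(1, max_digits + 1)
            for j in range(1, max_digits + 1)]
```

Under these assumptions a maximum of nine digits gives 9 tasks in the 1-dimensional case and 81 in the 2-dimensional case, so the Teacher has far more choices to weigh in the latter.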

Results show that TSCL fares better than a conventionally trained LSTM. In fact, it achieves a higher expected return and also learns faster.

The researchers also experimented with reinforcement learning strategies in the popular video game Minecraft. Using Microsoft’s Project Malmo with OpenAI Gym, Matiisen and team created a five-step curriculum. It generates random mazes in Minecraft, which the learning agent navigates in order to learn the maze environment. (A detailed account of the Minecraft training can be found here.)

In this case too, the Minecraft agent learns faster with every iteration. If the agent is trained on the final task alone, without working through each of the five steps, it fails badly at learning the environment, which supports the researchers’ critique of optimising for only the final task.

Comment

While this study has paved the way for using TSCL in a handful of applications, it has yet to be measured against standard reinforcement learning and supervised learning algorithms. Nonetheless, by dividing problems into subtasks at every stage, TSCL could reduce algorithmic complexity and thereby ease the burden on computing power and similar resources.


Abhishek Sharma

I research and cover latest happenings in data science. My fervent interests are in latest technology and humor/comedy (an odd combination!). When I'm not busy reading on these subjects, you'll find me watching movies or playing badminton.