There are several machine learning approaches for situations where data is insufficient. N-shot learning trains a model to classify an image from only a handful of labelled examples, often five or fewer. An N-shot task provides N labelled samples for each of K classes, so the entire support set S contains N*K samples in total. N-shot learning can be divided into three categories: zero-shot learning, one-shot learning and few-shot learning; the choice between them depends on how many training samples are available. N-shot learning is also used when a dataset is huge and labelling it would be costly, or when many samples exist but engineering task-specific features for each one is impractical.
Zero-shot learning is the challenge of learning a model without any labelled examples of the target classes. It involves little human intervention: the model relies on previously learned concepts and additional existing data, which reduces the time and effort that data labelling takes. Instead of training examples, zero-shot learning supplies a high-level description of each new category so that the machine can relate it to categories it has already learned about. Zero-shot learning methods can be used in computer vision, natural language processing and machine perception.
Zero-shot learning essentially consists of two stages: training and inference. During training, the model captures an intermediate layer of semantic attributes; during inference, it uses this knowledge to predict categories from a new set of classes. At this point, a second layer models the relationship between the attributes and the classes, assigning a category by matching against each class's initial attribute signature. For example, a child asked to recognise a Yorkshire terrier may already know it is a type of dog and combine that with a short description of the breed, say from Wikipedia.
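The inference stage described above can be sketched as nearest-neighbour matching between the attributes predicted for an image and each unseen class's attribute signature. A minimal sketch (the attribute vectors and class names are invented for illustration):

```python
import numpy as np

# Hypothetical attribute signatures for classes never seen during training.
# Each row: [has_fur, has_stripes, is_small, lives_in_water]
class_signatures = {
    "zebra":    np.array([1.0, 1.0, 0.0, 0.0]),
    "goldfish": np.array([0.0, 0.0, 1.0, 1.0]),
    "terrier":  np.array([1.0, 0.0, 1.0, 0.0]),
}

def zero_shot_predict(predicted_attributes):
    """Pick the unseen class whose attribute signature is closest
    (in Euclidean distance) to the attributes predicted from an image."""
    return min(class_signatures,
               key=lambda c: np.linalg.norm(class_signatures[c] - predicted_attributes))

# An attribute predictor trained on seen classes might output, for a new image:
attrs = np.array([0.9, 0.1, 0.8, 0.0])   # furry, no stripes, small, not aquatic
print(zero_shot_predict(attrs))          # → terrier
```

In a real system the attribute predictor would be a deep network trained on seen classes; only the matching step is shown here.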
With growing research into models that use as little data and as few annotations as possible, zero-shot learning has found applications in critical areas such as healthcare, for medical imaging and COVID-19 diagnosis from chest X-rays, as well as in unknown-object detection for autonomous vehicles. Zero-shot classification is reportedly used with more than 60 per cent of Hugging Face transformers.
Zero-shot learning approaches modelling problems in two different ways:
- Embedding-based approach: This maps the semantic attributes and the image features into a common embedding space via a projection function learned by a deep network. The model uses data from seen categories during training to learn a projection from visual space to semantic space; since neural networks are general function approximators, the projection function is learned as a deep neural network.
- Generative model-based approach: This approach aims to generate image features for unseen categories using semantic attributes. This approach tackles the bias and domain shift that the embedding-based approach has.
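As a toy illustration of the embedding-based approach, the sketch below learns a projection from visual features to attribute space on seen classes, then classifies unseen classes by nearest attribute signature. All data is synthetic, and a linear least-squares map stands in for the deep projection network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup: 5 seen classes and 2 unseen classes, each described
# by a 3-dimensional semantic attribute signature.
seen_attrs = rng.random((5, 3))
unseen_attrs = rng.random((2, 3))

# 40 training images per seen class: 10-dim visual features generated so
# they carry a linear trace of the class attributes, plus noise.
A = rng.random((3, 10))                  # hidden attribute-to-feature map
X, Y = [], []
for c in range(5):
    feats = seen_attrs[c] @ A + 0.05 * rng.standard_normal((40, 10))
    X.append(feats)
    Y.append(np.tile(seen_attrs[c], (40, 1)))
X, Y = np.vstack(X), np.vstack(Y)

# Learn the visual-to-semantic projection (least squares here; in practice
# a deep network plays this role).
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

def classify_unseen(visual_feature):
    """Project into attribute space, then pick the nearest unseen class."""
    proj = visual_feature @ W
    return int(np.argmin(np.linalg.norm(unseen_attrs - proj, axis=1)))

# A test image drawn from unseen class 1:
test_feat = unseen_attrs[1] @ A + 0.05 * rng.standard_normal(10)
print(classify_unseen(test_feat))
```

The bias and domain-shift problem mentioned above shows up here too: the projection is fitted only on seen-class data, so its errors grow on unseen classes.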
One-shot learning performs classification using a single past example per class. Facial recognition technology, including face verification and identification, usually uses one-shot learning: the system learns a face embedding, a rich low-dimensional feature representation, and compares embeddings rather than retraining for each new person. One-shot learning has long relied on the Siamese network approach, originally trained with a contrastive loss function; the triplet loss function was later shown to work better and was adopted by the FaceNet system. Contrastive loss and triplet loss functions are now used to learn high-quality face embeddings, which have become the foundation of modern facial recognition.
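The triplet loss mentioned above pulls an anchor embedding towards a positive example (same identity) and pushes it away from a negative example (different identity) by at least a margin. A minimal sketch with made-up embeddings:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(0, d(a, p)^2 - d(a, n)^2 + margin), the FaceNet-style triplet loss."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

# Toy 3-dim "face embeddings" (invented for illustration).
anchor   = np.array([0.1, 0.9, 0.2])
positive = np.array([0.2, 0.8, 0.2])   # same person, different photo
negative = np.array([0.9, 0.1, 0.7])   # different person

print(triplet_loss(anchor, positive, negative))   # → 0.0, negative already far enough
```

During training, the loss is only non-zero for "hard" triplets where the negative is not yet at least the margin further away than the positive, and minimising it shapes the embedding space accordingly.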
Few-shot learning, also known as low-shot learning, uses a small set of examples from new data to learn a new task. It deals with a machine learning problem specified by an experience E, consisting of a limited number of examples with supervised information, for a target task T. Few-shot learning is famously used by OpenAI, as GPT-3 is a few-shot learner.
A 2019 study titled ‘Meta-Transfer Learning for Few-Shot Learning’ addressed the challenges of few-shot settings; since then, few-shot learning has also commonly been framed as a meta-learning problem.
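One common few-shot baseline, in the spirit of prototypical networks (a choice made here for illustration, not something the text prescribes), averages the support embeddings of each class into a prototype and classifies queries by the nearest prototype. A sketch with synthetic 2-dim embeddings:

```python
import numpy as np

def prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype."""
    classes = sorted(set(support_labels))
    return {c: np.mean([e for e, l in zip(support_embeddings, support_labels) if l == c], axis=0)
            for c in classes}

def classify(query, protos):
    """Assign the query to the class with the nearest prototype."""
    return min(protos, key=lambda c: np.linalg.norm(protos[c] - query))

# A 2-way 3-shot episode with invented embeddings.
support = [np.array([0.1, 0.2]), np.array([0.2, 0.1]), np.array([0.15, 0.15]),  # "cat"
           np.array([0.9, 0.8]), np.array([0.8, 0.9]), np.array([0.85, 0.85])]  # "dog"
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

protos = prototypes(support, labels)
print(classify(np.array([0.2, 0.2]), protos))   # → cat
```

In a full system the embeddings would come from a network meta-trained across many such episodes; only the episode-level classification step is shown.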
There are two ways to approach few-shot learning:
- Data-level approach: If there is insufficient data to create a reliable model, this approach adds more data to avoid overfitting and underfitting, drawing on a large external base dataset for additional examples and features.
- Parameter-level approach: This method constrains the parameter space, using regularisation and suitable loss functions, to resolve the overfitting problem that few-shot learning is prone to, so the model generalises from the limited training samples. It can also improve performance by steering the model through an otherwise extensive parameter space. A parameter-level approach is useful because a small amount of training data alone will not give reliable results.
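The parameter-level idea of constraining the parameter space can be illustrated with L2 regularisation. In this sketch, ridge regression stands in for a deep model with weight decay, and the data is synthetic: with only 5 samples and 20 features, the problem is heavily under-determined, exactly the regime few-shot learning faces.

```python
import numpy as np

rng = np.random.default_rng(1)

# Tiny "few-shot" training set: 5 samples, 20 features.
X = rng.standard_normal((5, 20))
true_w = np.zeros(20)
true_w[:3] = [1.0, -2.0, 0.5]          # only the first 3 features matter
y = X @ true_w

def ridge_fit(X, y, l2):
    """Least squares with an L2 penalty, which constrains the parameter space."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + l2 * np.eye(d), X.T @ y)

w_free = np.linalg.lstsq(X, y, rcond=None)[0]   # unpenalised fit
w_reg = ridge_fit(X, y, l2=1.0)                 # penalised fit

# The penalty shrinks the weights, limiting how wildly the model can
# extrapolate from only 5 examples.
print(np.linalg.norm(w_reg) < np.linalg.norm(w_free))   # → True
```

In deep models the same effect is achieved with weight decay, dropout, or loss functions designed for the few-shot setting.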