Understanding Direct Domain Adaptation in Deep Learning

Machine learning has become a workhorse for computer vision problems such as image classification, segmentation and image processing. Most of these applications train a neural network in a supervised fashion, which requires labels. These labels are either known by construction, as with synthetically generated data, or assigned through human interpretation. The main challenge is that a neural network trained on synthetic data does not always generalize to real data, i.e. the target data.

A model trained on synthetic data can still perform well on real data, but that requires careful construction of the training set, including realistic noise and some features taken from the real dataset. In practice, synthetic and real data are drawn from different distributions. Synthetic data can, for example, be generated by a GAN as a function of a latent space. For a neural network to succeed, however, both datasets should be drawn from the same distribution.

On the other hand, you might think that training the neural network on real data would give a better model overall. However, such models are only as good as the accuracy of the labels mapped to the data, and that labelling is done manually by humans or by automated algorithms without supervision. Hence the problem persists.

The Domain Adaptation

Domain adaptation is used to bridge the gap between source data (training data) and target data (test data). It is the ability to apply an algorithm trained on one or more source domains to a different target domain. It is a subcategory of transfer learning. In domain adaptation, the source and target data share the same feature space but follow different distributions, while transfer learning also covers cases where the target feature space differs from the source feature space.

The change in distribution is usually referred to as domain shift or distributional shift. For example, suppose a model was trained to diagnose the diseases known at the time and is later applied to new, unlabeled data to detect COVID-19. That change in the data is a domain shift.
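To make the idea concrete, here is a minimal sketch of domain shift on a one-dimensional toy feature. The data, the `fit_threshold` and `accuracy` helpers, and the size of the shift are all hypothetical and chosen only for illustration.

```python
def fit_threshold(samples):
    """Place the decision threshold midway between the two class means."""
    zeros = [x for x, label in samples if label == 0]
    ones = [x for x, label in samples if label == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def accuracy(samples, threshold):
    """Fraction of samples whose thresholded feature matches the label."""
    correct = sum((x > threshold) == bool(label) for x, label in samples)
    return correct / len(samples)


# Hypothetical labelled source data: (feature value, class label).
source = [(-0.2, 0), (0.1, 0), (0.3, 0), (1.8, 1), (2.0, 1), (2.3, 1)]
# Same classes, but every feature value is shifted by +1.5 (the domain shift).
target = [(x + 1.5, label) for x, label in source]

threshold = fit_threshold(source)
source_accuracy = accuracy(source, threshold)
target_accuracy = accuracy(target, threshold)
print(source_accuracy, target_accuracy)
```

The threshold separates the source classes perfectly, yet on the shifted target data every class-0 sample lands on the wrong side of it. The classes themselves have not changed, only the distribution of the feature.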

Types of Domain Adaptation

Domain adaptation comes in several settings, which differ in the information available for the target task.

  • In unsupervised domain adaptation, the learning data contains a set of labelled source examples, a set of unlabeled source examples and a set of unlabeled target examples.
  • In semi-supervised domain adaptation, a small set of labelled target examples is available in addition to the unlabeled target examples.
  • In supervised domain adaptation, all the examples considered are assumed to be labelled.
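The three settings above differ only in which labels are available, which can be sketched as a small data description. The `AdaptationData` class and the `adaptation_setting` function are hypothetical names used for illustration, and the example counts are made up.

```python
from dataclasses import dataclass


@dataclass
class AdaptationData:
    """Number of examples available in each pool (hypothetical structure)."""
    labelled_source: int
    unlabelled_source: int = 0
    unlabelled_target: int = 0
    labelled_target: int = 0


def adaptation_setting(data):
    """Name the setting implied by which target labels are available."""
    if data.labelled_target == 0:
        return "unsupervised"
    if data.unlabelled_target > 0:
        return "semi-supervised"
    return "supervised"


no_target_labels = AdaptationData(labelled_source=1000, unlabelled_target=500)
few_target_labels = AdaptationData(1000, unlabelled_target=500, labelled_target=20)
all_labelled = AdaptationData(1000, labelled_target=500)

for data in (no_target_labels, few_target_labels, all_labelled):
    print(adaptation_setting(data))
```

A method that treats the target data as label-free, using target labels only to measure accuracy, falls under the first case.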

A trained neural network generalizes well when the target data is well represented by the source data. To accomplish this, researchers from King Abdullah University of Science and Technology (KAUST), Saudi Arabia, proposed an approach called ‘Direct Domain Adaptation’ (DDA).

The method mainly focuses on injecting the features of the real dataset into the synthetic training data. This is achieved with a combination of linear operations, including cross-correlation, auto-correlation and convolution, between the source and target input features.

These operations bring the distributions of the source and target input features closer together, which helps the trained model generalize well at the inference stage. The process is completely explicit, and neither the model being trained nor its architecture is affected. The method operates purely in the data domain.
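The three linear operations can be sketched in plain Python on tiny images stored as lists of rows. This is only an illustrative sketch, not the authors' implementation. The function names, the use of circular (wrap-around) boundaries, the choice of a same-domain sample as the correlation reference, and the exact ordering of the operations inside `dda_transform` are all assumptions. In practice these operations would normally be done with FFTs in a numerical library.

```python
def cross_correlate(x, ref):
    """Circular cross-correlation: out[a][b] = sum of x[i+a][j+b] * ref[i][j]."""
    h, w = len(x), len(x[0])
    return [[sum(x[(i + a) % h][(j + b) % w] * ref[i][j]
                 for i in range(h) for j in range(w))
             for b in range(w)] for a in range(h)]


def auto_correlate(x):
    """Cross-correlation of an image with itself."""
    return cross_correlate(x, x)


def convolve(x, kernel):
    """Circular convolution: out[a][b] = sum of x[i][j] * kernel[a-i][b-j]."""
    h, w = len(x), len(x[0])
    return [[sum(x[i][j] * kernel[(a - i) % h][(b - j) % w]
                 for i in range(h) for j in range(w))
             for b in range(w)] for a in range(h)]


def mean_image(images):
    """Pixel-wise mean over a list of equally sized images."""
    n, h, w = len(images), len(images[0]), len(images[0][0])
    return [[sum(img[i][j] for img in images) / n for j in range(w)]
            for i in range(h)]


def dda_transform(x, same_domain_ref, other_domain_images):
    """Assumed arrangement: correlate x with a reference from its own domain,
    then convolve with the mean auto-correlation of the other domain."""
    other_stats = mean_image([auto_correlate(img) for img in other_domain_images])
    return convolve(cross_correlate(x, same_domain_ref), other_stats)


# Made-up 3x3 "images" standing in for source (clean) and target (textured) data.
source = [[[0, 1, 0], [1, 1, 1], [0, 1, 0]],
          [[1, 0, 1], [0, 1, 0], [1, 0, 1]]]
target = [[[0.2, 0.9, 0.3], [0.8, 1.0, 0.7], [0.1, 0.9, 0.2]],
          [[0.9, 0.3, 0.8], [0.2, 0.7, 0.3], [1.0, 0.1, 0.9]]]

train_input = dda_transform(source[0], source[1], target)  # fed to training
test_input = dda_transform(target[0], target[1], source)   # fed to inference
```

Both transformed inputs now carry statistics from both domains. The network itself is untouched, and only the inputs it sees at training and inference time change.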

Architecture of DDA

This method achieved nearly 70% accuracy when a simple CNN model was trained on MNIST data and tested on the MNIST-M dataset.

In this workflow, the transformations Ts and Tt reduce the difference between the data distributions of the source and target datasets. They provide new input features to the neural network (NN), which is trained to reduce the loss L and is then applied to the real dataset.

Here, P denotes the probability distribution. It is shown schematically for the source and target data, which are represented by samples from the MNIST and MNIST-M datasets.

Methodology of DDA

The methodology can be explained effectively with the help of the following picture.

On the left, the proposed technique produces the training data, where the input features correspond to the original labelled MNIST data. On the right, the technique produces the target (application or test) data from the MNIST-M dataset. During this process the target data is assumed to be label-free, and its labels are used only for accuracy assessment.

As discussed earlier, the operations are linear: cross-correlation is denoted by the circled cross symbols and, as usual, the star symbols denote convolution.

Note that, after transformation, the resulting images of the same digit look similar for the two processes. The approach is direct in that it requires neither optimization nor eigenvalue processing, which is why it is referred to as Direct Domain Adaptation.

Adaptation Process

When DDA is applied to the MNIST and MNIST-M datasets, the picture shows its effect on example images of the digits 0 to 3. In each panel, the left column shows the transformation of the source samples and the right column shows the target samples at the inference stage.

For each digit, the top row shows the colour images and the bottom row shows the corresponding individual RGB channels, which constitute the actual input to the classification network. On close inspection, the final transformed input features for the source and target data look similar, and the similarity is clearest in the three-channel representation, which is the actual input.

Evaluation of DDA

The notebooks in the official DDA repository explore performance across several setups when transferring features from MNIST to the MNIST-M dataset. The results below show the accuracy on both datasets for a standard CNN and for a CNN with DDA.

Dataset    CNN      CNN+DDA
MNIST      0.996    0.990
MNIST-M    0.344    0.734
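To read these numbers as changes rather than raw accuracies, the short sketch below computes the absolute and relative difference that DDA makes on each dataset. The accuracy values are the ones reported above, while the `results` and `changes` names are only illustrative.

```python
# Accuracies reported above for the plain CNN and the CNN with DDA.
results = {
    "MNIST": {"CNN": 0.996, "CNN+DDA": 0.990},
    "MNIST-M": {"CNN": 0.344, "CNN+DDA": 0.734},
}

changes = {}
for dataset, acc in results.items():
    absolute = acc["CNN+DDA"] - acc["CNN"]   # difference in accuracy
    relative = acc["CNN+DDA"] / acc["CNN"]   # ratio of accuracies
    changes[dataset] = (round(absolute, 3), round(relative, 2))
    print(f"{dataset}: {absolute:+.3f} absolute, {relative:.2f}x relative")
```

On the target dataset (MNIST-M) accuracy rises by 0.390, roughly doubling, while accuracy on the source dataset (MNIST) drops by only 0.006.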

Conclusion

DDA is a direct and explicit method for preconditioning the input features for supervised neural network optimization, so that the trained model ultimately works better on the available label-free test data. In short, DDA incorporates the test data's input features into training without changing the source input features that are crucial for prediction.

In this article, we covered what domain adaptation is, its basic types and the intuition behind DDA. For a practical implementation of DDA, refer to the notebooks in the official repository.

Vijaysinh Lendave
Vijaysinh is an enthusiast in machine learning and deep learning. He is skilled in ML algorithms, data manipulation, handling and visualization, model building.
