
What is Face Identity Disentanglement and How it outperformed GANs?

Face Identity Disentanglement via Latent Space Mapping sets a new state of the art in face image generation, surpassing existing GANs

Face Identity Disentanglement via Latent Space Mapping sets a new state of the art in face image generation, surpassing existing Generative Adversarial Networks such as StyleGAN. Generative Adversarial Networks (GANs) now hold a prominent place in deep learning, with applications including high-resolution image synthesis, image-to-image translation, video-to-video translation, image inpainting, and video inpainting. StyleGAN and competing methods are well known for their face image generation abilities. However, they need extensive supervision and training, and their output quality is often compromised, which makes generalization difficult. Face Identity Disentanglement via Latent Space Mapping is a method that learns to represent image data in disentangled latent representations with minimal supervision, building on available pre-trained generative networks such as StyleGAN. By learning to map into the generator's latent space, it achieves both state-of-the-art quality and a rich, expressive latent space. Disentangled latent representations allow generative models to control and compose the disentangled factors during image generation.

Disentanglement is a generative model's ability to control a single feature in isolation, without affecting the other features. For instance, in face generation, disentanglement helps either generate faces of the same identity but with different attributes such as pose, expression and illumination, or generate faces with the same pose but different identities. Disentanglement is considered a non-trivial task in machine learning. The current framework demonstrates high-quality disentanglement of face identity from all other attributes and can generate high-resolution faces with different identities, different attributes, or both. The framework's key idea is to map the disentangled latent representation to the latent space of a pre-trained generator such as StyleGAN. The Face Identity Disentanglement framework was developed by Yotam Nitzan, Amit Bermano and Daniel Cohen-Or of Tel-Aviv University and Yangyan Li of Alibaba Cloud Intelligence Business Group.


Disentanglement framework with Latent Space Mapping

This disentanglement framework uses two encoders to generate the latent representation z, which consists of a description of the property of interest and a description of everything else. Here, the first encoder generates a latent representation of the identity of the face, and the second encoder generates a latent representation of facial attributes such as pose, expression and illumination. The latent representation is then mapped to the latent space W of the pre-trained generator G. This decouples the task of learning high-quality image generation from the task of disentanglement. Because of the disentanglement, the two parts of the latent representation are mutually exclusive and carry entirely different information. The mapping is therefore trained solely to disentangle the input information and extract useful representations that the generator can combine to synthesize high-quality target images.
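The dataflow described above can be sketched in a few lines of NumPy. This is only an illustration of the idea, not the authors' implementation: the random linear maps stand in for the trained encoders, the mapping network and the frozen generator, and every name and dimension below is a hypothetical choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen only for illustration.
IMAGE_DIM, ID_DIM, ATTR_DIM, W_DIM = 64, 16, 16, 32

# Random linear maps stand in for the trained networks (assumption).
E_id = rng.normal(size=(ID_DIM, IMAGE_DIM))      # identity encoder
E_attr = rng.normal(size=(ATTR_DIM, IMAGE_DIM))  # attribute encoder
M = rng.normal(size=(W_DIM, ID_DIM + ATTR_DIM))  # mapping from z to W
G = rng.normal(size=(IMAGE_DIM, W_DIM))          # stand-in for the pre-trained generator


def encode(image_id, image_attr):
    """Build z = [identity code, attribute code] from two flattened images."""
    return np.concatenate([E_id @ image_id, E_attr @ image_attr])


def synthesize(image_id, image_attr):
    """Map z into the W space and decode it with the (stand-in) generator."""
    z = encode(image_id, image_attr)
    w = M @ z
    return G @ w
```

Because the identity part of z depends only on the identity image, swapping the attribute image changes only the second half of z, which is the property the framework relies on.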

In this disentanglement framework, three input images are used to generate a 3-by-3 image matrix, with face identity preserved along columns and facial attributes preserved along rows.
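The layout of such a matrix can be sketched as follows. The `generate` argument is a hypothetical callable standing in for the full framework (identity image and attribute image in, output image out); here it only builds a label so the grid structure is easy to inspect.

```python
def build_grid(inputs, generate):
    """Return a square grid: each column shares an identity, each row shares attributes."""
    return [
        [generate(identity, attributes) for identity in inputs]  # one cell per column
        for attributes in inputs                                  # one row per attribute source
    ]


def describe(identity, attributes):
    """Placeholder for the real generator call (assumption): returns a text label."""
    return "id={},attr={}".format(identity, attributes)


grid = build_grid(["A", "B", "C"], describe)
for row in grid:
    print(row)
```

In this sketch the diagonal cells use the same image for identity and attributes, so they correspond to reconstructions of the inputs.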

Human faces possess many independent, high-dimensional features, and high photometric, geometric and kinematic complexities. This disentanglement framework concentrates on image synthesis with disentangled control over face identity while preserving the other facial attributes. This type of control is highly useful in applications such as reenactment and de-identification. The output quality is directly determined by the pre-trained generator employed. Hence this framework incorporates the state-of-the-art StyleGAN as the pre-trained generator. 

The above illustration depicts the dataflow and losses of the framework. Two input images, I_id for the face identity and I_attr for the facial attributes, are fed to their respective encoders. The resulting latent representations are mapped to the latent space W, and the mapped code is fed into the generator to produce the output image I_out. An adversarial loss L_adv ensures proper mapping to the W space. Identity preservation is encouraged using L_id, which penalizes differences in identity between I_id and I_out. Attribute preservation is encouraged using L_rec and L_lnd, which penalize pixel-level and facial-landmark differences, respectively, between I_attr and I_out.
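As a rough sketch of how these four terms might be combined, the snippet below computes each penalty on NumPy arrays. The distance functions (mean absolute difference for identity and pixels, mean squared difference for landmarks) and the equal weights are assumptions made for illustration; the paper's exact formulations and weights may differ. The adversarial term needs a discriminator, so it is passed in as a precomputed number.

```python
import numpy as np


def identity_loss(id_embed_src, id_embed_out):
    """L_id: difference between identity embeddings of I_id and I_out (L1 assumed)."""
    return float(np.mean(np.abs(id_embed_src - id_embed_out)))


def reconstruction_loss(img_attr, img_out):
    """L_rec: pixel-level difference between I_attr and I_out (L1 assumed)."""
    return float(np.mean(np.abs(img_attr - img_out)))


def landmark_loss(lnd_attr, lnd_out):
    """L_lnd: difference between facial landmarks of I_attr and I_out (L2 assumed)."""
    return float(np.mean((lnd_attr - lnd_out) ** 2))


def total_loss(l_adv, l_id, l_rec, l_lnd, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four terms; the default weights are hypothetical."""
    w_adv, w_id, w_rec, w_lnd = weights
    return w_adv * l_adv + w_id * l_id + w_rec * l_rec + w_lnd * l_lnd
```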

Python Implementation of Disentanglement Framework


The following command downloads the necessary source code, files and datasets from the official GitHub repository. Make sure that a CUDA GPU runtime is enabled on the local machine, Colab or Jupyter notebook.

!git clone


Confirm the proper file download using the command

!ls ID-disentanglement/



Activate the conda environment on the local machine. If Anaconda-3 is not installed on the machine or if the user uses Colab, the following command installs Anaconda-3 distribution.

For 64-bit machine,


For 32-bit machine,



Various pre-trained generators are available for training and inference. Users can opt for any of the available generators from the corresponding GitHub repository. Here, the FFHQ_StyleGAN_256x256 model is used in Colab. Since the models are stored in a shared directory in Google Drive, Colab must first be set up to download files from Google Drive using the following commands and code.

 !pip install -U -q PyDrive
 import os
 from pydrive.auth import GoogleAuth
 from pydrive.drive import GoogleDrive
 from google.colab import auth
 from oauth2client.client import GoogleCredentials 

Users must authenticate access to Google Drive. Running the following code prompts for a verification code.

 # Authenticate the Colab user, then build the PyDrive client.
 auth.authenticate_user()
 gauth = GoogleAuth()
 gauth.credentials = GoogleCredentials.get_application_default()
 drive = GoogleDrive(gauth) 

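Finally, the pre-trained StyleGAN can be downloaded using the following code.

 # Download into the home directory, creating it if it is missing.
 local_download_path = os.path.expanduser('~')
 os.makedirs(local_download_path, exist_ok=True)
 # List every file in the shared Google Drive folder.
 file_list = drive.ListFile(
     {'q': "'1OgLvUhd9FX9_mPXrfqAWaLZsceQzE9l4' in parents"}).GetList()
 for f in file_list:
   # Create a file handle by id and download its content.
   print('title: %s, id: %s' % (f['title'], f['id']))
   fname = os.path.join(local_download_path, f['title'])
   print('downloading to {}'.format(fname))
   f_ = drive.CreateFile({'id': f['id']})
   f_.GetContentFile(fname)
   # Confirm that the downloaded file opens as binary data.
   with open(fname, 'rb') as fh:
     print('first bytes readable: {}'.format(len(fh.read(16)) > 0))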


The Face Identity Disentanglement framework is designed to use TensorFlow 2.x on Python 3.7, with CUDA 10.1 and cuDNN 7.6.5. The following commands create a conda environment that has the needed dependencies.

!exec bash

Within the shell, run the following command

conda env create -f environment.yml
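Once the environment has been created and activated, a quick sanity check can confirm that the interpreter and TensorFlow match the versions stated above. The helper names below are hypothetical and the check is only a sketch; it uses `tf.config.list_physical_devices`, which is available in TensorFlow 2.1 and later, and it simply skips the TensorFlow part when the package cannot be imported.

```python
import sys


def python_matches(required=(3, 7), actual=None):
    """Return True when the interpreter's major.minor version equals `required`."""
    actual = tuple(sys.version_info[:2]) if actual is None else tuple(actual)
    return actual == tuple(required)


def is_tensorflow_2(version):
    """Return True when a version string such as '2.3.1' has major version 2."""
    return version.split(".")[0] == "2"


def environment_report():
    """Collect the Python check, the TensorFlow version and the visible GPU count."""
    report = {"python_ok": python_matches(), "tensorflow": None, "gpus": None}
    try:
        import tensorflow as tf
    except ImportError:
        return report  # TensorFlow is not installed in this environment
    report["tensorflow"] = tf.__version__
    report["gpus"] = len(tf.config.list_physical_devices("GPU"))
    return report


print(environment_report())
```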



A dataset for training and inference can be created locally using the commands

 cd ID-disentanglement/utils
 python \ 
     --resolution N \
     --batch_size BATCH_SIZE \
     --output_path OUTPUT_PATH \
     --pretrained_models_path PRETRAINED_MODELS_PATH \
     --num_images NUM_IMAGES \
     --gpu GPU 


The Face Identity Disentanglement framework can be trained using the following command

     --resolution N
     --pretrained_models_path PRETRAINED_MODELS_PATH
     --dataset BASE_DATASET_DIR
     --batch_size BATCH_SIZE
     --cross_frequency 3
     --train_data_size 70000
     --results_dir RESULTS_DIR        


Inference on the trained model with the downloaded test dataset can be performed using the following commands. 

     --pretrained_models_path PRETRAINED_MODELS_PATH \
     --load_checkpoint PATH_TO_WEIGHTS \
     --id_dir DIR_OF_IMAGES_FOR_ID \
     --attr_dir DIR_OF_IMAGES_FOR_ATTR \
     --output_dir DIR_FOR_OUTPUTS \
     --test_func infer_on_dirs 

The infer_on_dirs test function evaluates performance on two sets of images, one for preserving face identity and another for preserving facial attributes.

     --pretrained_models_path PRETRAINED_MODELS_PATH \
     --load_checkpoint PATH_TO_WEIGHTS \
     --input_dir PARENT_DIR \
     --output_dir DIR_FOR_OUTPUTS \
     --test_func interpolate 

The interpolate test function evaluates performance on three sets of images, one for preserving face identity and the other two for sequential interpolation of facial attributes.
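The idea behind attribute interpolation with a fixed identity can be sketched as below. This assumes a simple linear blend between two attribute codes, with the identity code held constant; the repository may interpolate differently or in a different space, and the function name and code sizes are hypothetical.

```python
import numpy as np


def interpolate_attributes(id_code, attr_start, attr_end, steps=5):
    """Return latent codes that keep `id_code` fixed while blending the attribute codes."""
    frames = []
    for t in np.linspace(0.0, 1.0, steps):
        attr = (1.0 - t) * attr_start + t * attr_end  # linear blend (assumption)
        frames.append(np.concatenate([id_code, attr]))
    return frames


# Toy codes: a 4-dimensional identity code and two 3-dimensional attribute codes.
identity = np.arange(4, dtype=float)
frames = interpolate_attributes(identity, np.zeros(3), np.ones(3), steps=5)
print(len(frames), frames[2])
```

Each resulting code would then be mapped to the W space and passed to the generator, so the identity stays the same across the sequence while the attributes move from the first image to the second.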

Performance Evaluation of Disentanglement Framework

The qualitative and quantitative performance of the disentanglement framework is evaluated using the Flickr-Faces-HQ (FFHQ) image dataset.

Input images from the FFHQ image dataset. Face identity is preserved along columns, while other facial attributes are preserved along rows.
Both input and output images are generated using the StyleGAN generator within the Face Identity Disentanglement framework.
Qualitative comparison of the disentanglement framework with the existing state-of-the-art methods FSGAN and FaceShifter.

The disentanglement approach outperforms state-of-the-art methods for identity-preserving face generation, such as FaceShifter, FSGAN, ALAE, and pSp.

Notable applications of the Disentanglement framework


Sequential interpolation of a given image between two input images with different attributes. The identity of the given image is maintained throughout the interpolation. Here, the input image used as the identity source is not shown.


Sequential interpolation between two input images with different identities and attributes. Both identity and attributes match the input images at the two ends of the interpolation.

Note: Images and illustrations other than the code outputs are obtained from the original research paper.



Rajkumar Lakshmanamoorthy
A geek in Machine Learning with a Master's degree in Engineering and a passion for writing and exploring new things. Loves reading novels, cooking, practicing martial arts, and occasionally writing novels and poems.
