
Guide to GANSpace: Discovering Interpretable GAN Control

Illustration by GANSpace

Over the years, the image quality produced by GAN models has improved at a tremendous rate, but the interpretability and editability of the generated images have not kept pace. We often struggle to exert fine-grained control over the output these models produce. A GAN draws its input as a random vector from a latent space, which is typically an isotropic Gaussian distribution; this means the shape of the input distribution tells us essentially nothing useful about the model. Here is an example of a GAN generating two completely different pictures from two different points in latent space, where the input is uninformative and the output is very complex.
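
As a minimal illustration (not the GANSpace code itself), sampling from this latent space looks something like the snippet below; generator is a hypothetical placeholder for any pre-trained GAN generator.

import torch

latent_dim = 512                  # a typical StyleGAN2 latent size
z1 = torch.randn(1, latent_dim)   # first random draw from the isotropic Gaussian
z2 = torch.randn(1, latent_dim)   # second random draw

# The latent vectors themselves say nothing about the images they will produce:
# img1 = generator(z1)            # generator is a placeholder for a pre-trained GAN
# img2 = generator(z2)            # the two outputs can be completely different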

It is also possible to identify directions in this latent space that modify the generated images in a meaningful way. For example, the two images below are nearly identical, differing only by a slight, meaningful change.
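
Such an edit amounts to nudging a latent vector along a fixed direction. The sketch below is only illustrative: generator is again a placeholder, and the direction here is random rather than one discovered by GANSpace.

import torch

latent_dim = 512
z = torch.randn(1, latent_dim)    # starting latent vector
v = torch.randn(1, latent_dim)    # an edit direction (GANSpace finds meaningful ones)
v = v / v.norm()                  # normalise the direction

alpha = 2.0                       # edit strength
z_edited = z + alpha * v          # same latent, moved slightly along the direction
# img      = generator(z)         # placeholder generator
# img_edit = generator(z_edited)  # nearly the same image, changed in one meaningful way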

In this write-up, we are going to discuss how to find these directions, which can produce useful changes in the output image. Existing methods for finding such directions are supervised. Not long ago, researchers from Aalto University, Adobe Research and NVIDIA submitted an unsupervised method for discovering interpretable edits to the 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada. The paper, GANSpace: Discovering Interpretable GAN Controls, was presented by Erik Härkönen, Aaron Hertzmann, Jaakko Lehtinen and Sylvain Paris.

The paper discusses using pre-trained GAN models to apply style edits to the output images. It also shows that, with some modification, BigGAN can produce style edits similar to those of StyleGAN. An example from GANSpace is shown below.

GANSpace uses Principal Component Analysis (PCA) to find these directions in the isotropic Gaussian latent space. More details on the method are discussed here.
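
Roughly, the procedure samples many latent vectors, maps them through the generator's early layers (for StyleGAN, the intermediate W space), and runs PCA on those samples; the principal components then serve as candidate edit directions. The sketch below only conveys this idea; it uses the raw latent samples in place of a real mapping network.

import torch
from sklearn.decomposition import PCA

latent_dim = 512
n_samples = 10000

z = torch.randn(n_samples, latent_dim)          # random Gaussian latents
# w = mapping_network(z)                        # placeholder for StyleGAN's mapping network
w = z                                           # stand-in so the sketch runs on its own

pca = PCA(n_components=80)                      # 80 components, matching the demo below
pca.fit(w.numpy())
directions = torch.from_numpy(pca.components_)  # each row is a candidate edit direction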

Installation & Requirements

Clone the repository to use the pre-trained models.

# Clone git
!git clone https://github.com/harskish/ganspace
%cd ganspace

Use TensorFlow 1.x.

%tensorflow_version 1.x

Now, install the remaining packages.

!pip install fbpca boto3
!git submodule update --init --recursive
!python -c "import nltk; nltk.download('wordnet')"

Demo of GANSpace using a Pre-trained Model

  1. The first step is to complete the installation and requirements above.
  2. If you are using a Colab notebook, mount your drive (see the snippet after this list).
  3. Download the pre-trained StyleGAN2 model for the FFHQ dataset.
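
For step 2, if you are working in Colab, the drive can be mounted with the standard helper; the FFHQ checkpoint for step 3 is then downloaded below.

from google.colab import drive
drive.mount('/content/drive') # your drive appears under /content/drive/My Drive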

!gdown --id 1UlDmJVLLnBD9SnLSMXeiZRO6g-OMQCA_ -O /content/ffhq.pkl

  4. Clone the StyleGAN2 repository.
%cd "/content"
!git clone https://github.com/skyflynil/stylegan2
%cd ganspace
  5. Convert the downloaded weights to PyTorch (for GANSpace).

!python /content/ganspace/models/stylegan2/stylegan2-pytorch/convert_weight.py --repo="/content/stylegan2/" "/content/ffhq.pkl" #convert weights

Here, --repo points to your TensorFlow StyleGAN2 repository, and the second argument is the path to the downloaded model.

  6. Copy the PyTorch model to your drive.

!cp "/content/ganspace/ffhq.pt" "/content/drive/My Drive/ML/stylegan_models" #copy pytorch model to your drive

  7. Perform Principal Component Analysis (PCA) and save the components to your Google Drive. But before that, define the model, dataset and number of components to use.
%cd ../ganspace/
model_name = 'StyleGAN2'
model_class = 'ffhq' # this is the name of your model in the configs
num_components = 80 # number of principal components to compute

    Check the layers available for PCA by running this command.

#Check layers available for analysis by passing dummy name
!python visualize.py --model $model_name --class $model_class --use_w --layer=dummy_name

Add the chosen layer from above.

!python visualize.py --model $model_name --class $model_class --use_w --layer=style -c $num_components

    Create a video of the generated components.

# Create videos of the StyleGAN2 ffhq components (saved to ./out)
!python visualize.py --model=$model_name --class=$model_class --use_w --layer="style" -b=500 --batch --video # add --video to generate videos

Now visualize the components and save them to your drive as .npz files.

## Visualize StyleGAN2 ffhq W principal components
!python visualize.py --model=StyleGAN2 --class=ffhq --use_w --layer=style -b=10000
!zip -r samples.zip "/content/ganspace/out/StyleGAN2-ffhq" #zip up samples for download
%cp -r "/content/ganspace/cache/components" "/content/drive/My Drive/ML/stylegan2/comps" #copying components over to google drive
  8. Now we will explore directions in the latent space. For this, we first have to import some modules (see the snippet below), and then specify the model as shown here.
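
The imports below cover the snippets in this step; the module paths (config, models, decomposition) are assumed from the ganspace repository layout, so adjust them if the repository has changed.

import torch
import numpy as np
from config import Config                   # experiment configuration object
from models import get_instrumented_model   # wraps a pre-trained generator for instrumentation
from decomposition import get_or_compute    # runs PCA or loads cached components
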
# Specify model to use
config = Config(
  model='StyleGAN2',
  layer='style',
  output_class='ffhq',
  components=80,
  use_w=True,
  batch_size=5_000, # style layer quite small
)

Next, compute the latent directions.

# Separate the parameters and pass them to the model
inst = get_instrumented_model(config.model, config.output_class,
                              config.layer, torch.device('cuda'), use_w=config.use_w)

## Return cached results or compute if needed
# Pass the existing InstrumentedModel instance to reuse it
path_to_components = get_or_compute(config, inst)

model = inst.model

named_directions = {} # dictionary to store names given to discovered directions

With the help of the UI, we can explore a latent direction and give it a name; for example, here we name one direction raise_eyebrows, and that name is added to named_directions. You can then generate animations for a given number of samples; all the image files and animations are saved to the ./out directory. Finally, you can use the UI in Colab to edit the images interactively. The code for this is available here.
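
Outside the UI, applying a discovered direction reduces to adding a scaled principal component to the latent code. The sketch below assumes the components file saved by get_or_compute stores the latent-space components under a key such as lat_comp, and that the model wrapper exposes sample_latent and sample_np as in the ganspace repo; treat these names as assumptions and check them against the notebook.

comp_idx = 3                                      # index of the component you named, e.g. raise_eyebrows
strength = 2.0                                    # how far to move along the direction

comps = np.load(path_to_components)               # components computed by get_or_compute above
direction = torch.from_numpy(comps['lat_comp'][comp_idx]).cuda()  # key name assumed

w = model.sample_latent(1, seed=0)                # a starting latent code (API assumed from the repo)
img = model.sample_np(w + strength * direction)   # image edited along the chosen direction
named_directions['raise_eyebrows'] = comp_idx     # record the name, as in the demo UI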

You can check the full demo here.

Conclusion

In this article, we covered what GANSpace is, why we need it, and showed a small demo of applying GANSpace to a pre-trained GAN model. GANSpace is an unsupervised method for learning latent directions (that produce meaningful changes), and the key technique behind it is PCA. Here is the link to the Colab notebook:

Note: Except for the output images from the code, all other images are sourced from the official documents.

Official Resources are available at:

You can check out other articles related to GANs here.
