
Guide to GANSpace: Discovering Interpretable GAN Control

Illustration by GANSpace

Over the years, the image quality produced by GAN models has improved at a tremendous rate, but the interpretability and editability of the generated images have not kept pace. We often struggle to exert fine-grained control over the output these models produce. A GAN draws its input as a random vector from a latent space, which is typically an isotropic Gaussian distribution; this means the shape of the input distribution tells us essentially nothing useful about the model. Here is an example of a GAN generating two completely different pictures from two different points in latent space, where the input is uninformative and the output is very complex.
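
As a minimal illustration (not the GANSpace code itself), sampling from this latent space looks something like the snippet below; generator is a hypothetical placeholder for any pre-trained GAN generator.

import torch

latent_dim = 512                  # a typical StyleGAN2 latent size
z1 = torch.randn(1, latent_dim)   # first random draw from the isotropic Gaussian
z2 = torch.randn(1, latent_dim)   # second random draw

# The latent vectors themselves say nothing about the images they will produce:
# img1 = generator(z1)            # generator is a placeholder for a pre-trained GAN
# img2 = generator(z2)            # the two outputs can be completely different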

It is also possible to identify directions in this latent space that modify the generated images in a meaningful way. For example, the two images below are nearly identical, differing only by a slight, meaningful change.
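
Such an edit amounts to nudging a latent vector along a fixed direction. The sketch below is only illustrative: generator is again a placeholder, and the direction here is random rather than one discovered by GANSpace.

import torch

latent_dim = 512
z = torch.randn(1, latent_dim)    # starting latent vector
v = torch.randn(1, latent_dim)    # an edit direction (GANSpace finds meaningful ones)
v = v / v.norm()                  # normalise the direction

alpha = 2.0                       # edit strength
z_edited = z + alpha * v          # same latent, moved slightly along the direction
# img      = generator(z)         # placeholder generator
# img_edit = generator(z_edited)  # nearly the same image, changed in one meaningful way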

In this write-up, we are going to discuss how to find these directions, which can produce useful changes in the output image. Existing methods for finding such directions are supervised. Not long ago, researchers from Aalto University, Adobe Research and NVIDIA submitted an unsupervised method for discovering interpretable edits to the 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada. The paper, GANSpace: Discovering Interpretable GAN Controls, was presented by Erik Härkönen, Aaron Hertzmann, Jaakko Lehtinen and Sylvain Paris.

The paper discusses using pre-trained GAN models to apply style edits to the output images. It also shows that, with some modification, BigGAN can produce style edits similar to those of StyleGAN. An example from GANSpace is shown below.

GANSpace uses Principal Component Analysis (PCA) to find these directions in the isotropic Gaussian latent space. More details on the method are discussed here.
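
Roughly, the procedure samples many latent vectors, maps them through the generator's early layers (for StyleGAN, the intermediate W space), and runs PCA on those samples; the principal components then serve as candidate edit directions. The sketch below only conveys this idea; it uses the raw latent samples in place of a real mapping network.

import torch
from sklearn.decomposition import PCA

latent_dim = 512
n_samples = 10000

z = torch.randn(n_samples, latent_dim)          # random Gaussian latents
# w = mapping_network(z)                        # placeholder for StyleGAN's mapping network
w = z                                           # stand-in so the sketch runs on its own

pca = PCA(n_components=80)                      # 80 components, matching the demo below
pca.fit(w.numpy())
directions = torch.from_numpy(pca.components_)  # each row is a candidate edit direction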

Installation & Requirements

Clone the repository to use the pre-trained models.

# Clone git
!git clone https://github.com/harskish/ganspace
%cd ganspace

Use TensorFlow 1.x.

%tensorflow_version 1.x

Now, install the remaining packages.

!pip install fbpca boto3
!git submodule update --init --recursive
!python -c "import nltk; nltk.download('wordnet')"

Demo of GANSpace using a Pre-trained Model

  1. The first step is to complete the installation and requirements above.
  2. If you are using a Colab notebook, mount your drive (see the snippet after this list).
  3. Download the pre-trained StyleGAN2 model for the FFHQ dataset.
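
For step 2, if you are working in Colab, the drive can be mounted with the standard helper; the FFHQ checkpoint for step 3 is then downloaded below.

from google.colab import drive
drive.mount('/content/drive') # your drive appears under /content/drive/My Drive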

!gdown --id 1UlDmJVLLnBD9SnLSMXeiZRO6g-OMQCA_ -O /content/ffhq.pkl

  4. Clone the StyleGAN2 repository.
%cd "/content"
!git clone https://github.com/skyflynil/stylegan2
%cd ganspace
  5. Convert the downloaded weights to PyTorch (for GANSpace).

!python /content/ganspace/models/stylegan2/stylegan2-pytorch/convert_weight.py --repo="/content/stylegan2/" "/content/ffhq.pkl" #convert weights

Here, --repo points to your TensorFlow StyleGAN2 repository, and the second argument is the path to the downloaded model.

  6. Copy the PyTorch model to your drive.

!cp "/content/ganspace/ffhq.pt" "/content/drive/My Drive/ML/stylegan_models" #copy pytorch model to your drive

  7. Perform Principal Component Analysis (PCA) and save the components to your Google Drive. But before that, define the model, dataset and number of components to use.
%cd ../ganspace/
model_name = 'StyleGAN2'
model_class = 'ffhq' # this is the name of your model in the configs
num_components = 80 # number of principal components to compute

    Check the layers available for PCA by running this command.

#Check layers available for analysis by passing dummy name
!python visualize.py --model $model_name --class $model_class --use_w --layer=dummy_name

Add the chosen layer from above.

!python visualize.py --model $model_name --class $model_class --use_w --layer=style -c $num_components

    Create a video of the generated components.

# Create videos of the StyleGAN2 ffhq components (saved to ./out)
!python visualize.py --model=$model_name --class=$model_class --use_w --layer="style" -b=500 --batch --video # add --video to generate videos

Now visualize the components and save them to your drive as .npz files.

## Visualize StyleGAN2 ffhq W principal components
!python visualize.py --model=StyleGAN2 --class=ffhq --use_w --layer=style -b=10000
!zip -r samples.zip "/content/ganspace/out/StyleGAN2-ffhq" #zip up samples for download
%cp -r "/content/ganspace/cache/components" "/content/drive/My Drive/ML/stylegan2/comps" #copying components over to google drive
  8. Now we will explore directions in the latent space. For this, we first have to import some modules (see the snippet below), and then specify the model as shown here.
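
The imports below cover the snippets in this step; the module paths (config, models, decomposition) are assumed from the ganspace repository layout, so adjust them if the repository has changed.

import torch
import numpy as np
from config import Config                   # experiment configuration object
from models import get_instrumented_model   # wraps a pre-trained generator for instrumentation
from decomposition import get_or_compute    # runs PCA or loads cached components
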
# Specify model to use
config = Config(
  model='StyleGAN2',
  layer='style',
  output_class='ffhq',
  components=80,
  use_w=True,
  batch_size=5_000, # style layer quite small
)

Next, compute the latent directions.

# Separate the parameters and pass them to the model
inst = get_instrumented_model(config.model, config.output_class,
                              config.layer, torch.device('cuda'), use_w=config.use_w)

## Return cached results or compute if needed
# Pass the existing InstrumentedModel instance to reuse it
path_to_components = get_or_compute(config, inst)

model = inst.model

named_directions = {} # dictionary to store names given to discovered directions

With the help of the UI, we can explore a latent direction and give it a name; for example, here we name one direction raise_eyebrows, and that name is added to named_directions. You can then generate animations for a given number of samples; all the image files and animations are saved to the ./out directory. Finally, you can use the UI in Colab to edit the images interactively. The code for this is available here.
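
Outside the UI, applying a discovered direction reduces to adding a scaled principal component to the latent code. The sketch below assumes the components file saved by get_or_compute stores the latent-space components under a key such as lat_comp, and that the model wrapper exposes sample_latent and sample_np as in the ganspace repo; treat these names as assumptions and check them against the notebook.

comp_idx = 3                                      # index of the component you named, e.g. raise_eyebrows
strength = 2.0                                    # how far to move along the direction

comps = np.load(path_to_components)               # components computed by get_or_compute above
direction = torch.from_numpy(comps['lat_comp'][comp_idx]).cuda()  # key name assumed

w = model.sample_latent(1, seed=0)                # a starting latent code (API assumed from the repo)
img = model.sample_np(w + strength * direction)   # image edited along the chosen direction
named_directions['raise_eyebrows'] = comp_idx     # record the name, as in the demo UI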

You can check the full demo here.

Conclusion

In this article, we covered what GANSpace is, why we need it, and showed a small demo of applying GANSpace to a pre-trained GAN model. GANSpace is an unsupervised method for learning latent directions (that produce meaningful changes), and the key technique behind it is PCA. Here is the link to the Colab notebook:

Note: Except for the output images from the code, all other images are sourced from the official documents.

Official Resources are available at:

You can check out other articles related to GANs here.
