Hands-on Python Guide to Style-based Age Manipulation (SAM) Technique



Style-based Age Manipulation (SAM) is a method for performing fine-grained age transformation using a single facial image as input. It was introduced by Yuval Alaluf, Or Patashnik and Daniel Cohen-Or of Tel-Aviv University in February 2021 (research paper: “Only a Matter of Style: Age Transformation Using a Style-Based Regression Model”). This article gives an overview of SAM along with a demonstration using Python code.

Before going into the details of SAM, let us first understand what is meant by the ‘age transformation’ task.


What is age transformation?

Age transformation is the process of representing the change in a person’s appearance across different ages while preserving their identity. To model such a process from a single input facial image, the change in head shape and texture must be captured while the identity and other key facial attributes of the input face are preserved. The task becomes harder when modelling lifelong ageing, where a significant age shift is desired (e.g. from age 10 to age 90).

To avoid explicit modelling of age transformation, Generative Adversarial Networks (GANs) are largely used for generating images in a data-driven manner, especially on facial images.

Overview of SAM

SAM is a method for learning a conditional image generation function that captures the desired change in age while preserving the facial identity. It is an image-to-image translation method, i.e. it translates an image from a source domain to a corresponding image in a target domain. It couples the expressiveness of a pre-trained, fixed StyleGAN generator with an encoder architecture. The encoder directly encodes the input facial image into a series of style vectors subject to the desired age shift. These style vectors are then fed into the unconditional StyleGAN generator, whose output represents the desired age transformation. Using StyleGAN lets SAM leverage its ability to achieve excellent image quality. A pre-trained, fixed age regression network guides the encoder in generating the latent codes that correspond to the desired age.

The continuous ageing process is formulated as a regression task between the input age and the desired target age, providing fine-grained control over the image generated by the GAN.
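
How these pieces fit together can be sketched in a few lines of PyTorch. The class below is a toy stand-in, not the real architecture: the layer sizes and the constant age channel are illustrative assumptions; only the overall flow (image + target age → encoder → style vectors → fixed generator) mirrors SAM.

```python
import torch
import torch.nn as nn

class ToyAgeEncoder(nn.Module):
    """Toy encoder: maps an image plus a target-age channel to style vectors."""
    def __init__(self, n_styles=18, style_dim=512):
        super().__init__()
        self.n_styles, self.style_dim = n_styles, style_dim
        self.net = nn.Sequential(
            nn.Conv2d(4, 16, 3, stride=2, padding=1), nn.ReLU(),  # 3 RGB + 1 age channel
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, n_styles * style_dim),
        )

    def forward(self, img, target_age):
        # Broadcast the normalized target age as an extra input channel
        age_channel = torch.full_like(img[:, :1], target_age)
        x = torch.cat([img, age_channel], dim=1)
        return self.net(x).view(-1, self.n_styles, self.style_dim)

encoder = ToyAgeEncoder()
image = torch.randn(1, 3, 256, 256)      # stand-in for an aligned face
styles = encoder(image, target_age=0.7)  # e.g. age 70, scaled to [0, 1]
print(styles.shape)                      # torch.Size([1, 18, 512])
# In SAM, these style vectors would now be fed to the fixed StyleGAN generator.
```

In the actual method the generator stays frozen and only the encoder is trained, so image quality comes “for free” from the pre-trained StyleGAN.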

Why the name ‘Style-based Age Manipulation’?

Since age transformation in SAM is controlled through the intermediate style representations learned by StyleGAN, the method is called “Style-based” Age Manipulation. In other words, it performs age transformation based on the style of the input facial image.

Practical implementation of SAM


Pre-trained model

The SAM model pre-trained on the FFHQ dataset can be downloaded from the official SAM GitHub repository.

If you wish to train a SAM model from scratch, the following auxiliary models can be used:

  • pSp Encoder: taken from pixel2style2pixel (an image-to-image translation framework), trained on the FFHQ dataset for StyleGAN inversion
  • FFHQ StyleGAN: StyleGAN model pre-trained on FFHQ, taken from rosinality’s repository, with 1024×1024 output resolution
  • IR-SE50 Model: pre-trained IR-SE50 model taken from TreB1eN (used for the identity loss)
  • VGG Age Classifier: VGG age classifier from DEX, fine-tuned on the FFHQ-Aging dataset (used for the aging loss)
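
To see why these auxiliary models are needed, here is a hedged sketch of how they might enter a SAM-style training objective. The weight values and the stand-in callables are my own illustrative assumptions, not the paper’s exact formulation:

```python
import torch

def total_loss(generated, source, target_age, id_embed, age_predict,
               w_l2=1.0, w_id=0.1, w_age=5.0):
    # Pixel-wise reconstruction term
    l2 = torch.mean((generated - source) ** 2)
    # Identity term: cosine distance between face embeddings (IR-SE50's role)
    ident = 1 - torch.nn.functional.cosine_similarity(
        id_embed(generated), id_embed(source)).mean()
    # Aging term: predicted age of the output vs. the target (age classifier's role)
    age = torch.mean((age_predict(generated) - target_age) ** 2)
    return w_l2 * l2 + w_id * ident + w_age * age

# Toy stand-ins for the pretrained auxiliary networks:
id_embed = lambda x: x.flatten(1)[:, :128]     # would be IR-SE50 embeddings
age_predict = lambda x: x.mean(dim=(1, 2, 3))  # would be the VGG age classifier

gen = torch.rand(2, 3, 64, 64)
src = torch.rand(2, 3, 64, 64)
loss = total_loss(gen, src, target_age=torch.tensor([0.3, 0.7]),
                  id_embed=id_embed, age_predict=age_predict)
print(loss.item() > 0.0)  # True
```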

Demo code

Import the os module to interact with the underlying operating system

import os

Define the code directory name

CODE_DIR = 'SAM'

Clone the GitHub repository

!git clone https://github.com/yuval-alaluf/SAM.git $CODE_DIR

Download Ninja (a small build system focussed on speed)

!wget https://github.com/ninja-build/ninja/releases/download/v1.8.2/ninja-linux.zip

Unzip the file

!sudo unzip ninja-linux.zip -d /usr/local/bin/

Register the Ninja binary with update-alternatives

!sudo update-alternatives --install /usr/bin/ninja ninja /usr/local/bin/ninja 1 --force

Change the current working directory to ‘SAM’

os.chdir(f'./{CODE_DIR}')

Import the required standard libraries

from argparse import Namespace
import sys
import pprint
import numpy as np
from PIL import Image
import torch
import torchvision.transforms as transforms

Modify sys.path so that the repository’s modules can be imported

sys.path.append(".")
sys.path.append("..")

Import the AgeTransformer class

from datasets.augmentations import AgeTransformer

Import tensor2im method for tensor-to-image conversion

from utils.common import tensor2im

Import the pSp class

from models.psp import pSp

Define experiment type

EXPERIMENT_TYPE = 'ffhq_aging'

Get wget download command for downloading the desired model and save to directory ../pretrained_models

def get_download_model_command(file_id, file_name):
    """Get wget download command for downloading the desired model and save to directory ../pretrained_models."""
    current_directory = os.getcwd()
    save_path = os.path.join(os.path.dirname(current_directory), "pretrained_models")
    if not os.path.exists(save_path):
        os.makedirs(save_path)
    url = r"""wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id={FILE_ID}' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id={FILE_ID}" -O {SAVE_PATH}/{FILE_NAME} && rm -rf /tmp/cookies.txt""".format(FILE_ID=file_id, FILE_NAME=file_name, SAVE_PATH=save_path)
    return url

Define the model path

MODEL_PATHS = {
    "ffhq_aging": {"id": "1XyumF6_fdAxFmxpFcmPf-q84LU_22EMC", "name": "sam_ffhq_aging.pt"}
}

Initialize the model path and download command, then download the model

path = MODEL_PATHS[EXPERIMENT_TYPE]
download_command = get_download_model_command(file_id=path["id"], file_name=path["name"])
!{download_command}

Define the experiment arguments

EXPERIMENT_DATA_ARGS = {
    "ffhq_aging": {
        "model_path": "../pretrained_models/sam_ffhq_aging.pt",
        "image_path": "notebooks/images/1287.jpg",
        "transform": transforms.Compose([
            transforms.Resize((256, 256)),
            transforms.ToTensor(),
            transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])
    }
}
EXPERIMENT_ARGS = EXPERIMENT_DATA_ARGS[EXPERIMENT_TYPE]

Initialize the model path

model_path = EXPERIMENT_ARGS['model_path']

Load the PyTorch model using torch.load()

ckpt = torch.load(model_path, map_location='cpu')

Have a look at the model training options

opts = ckpt['opts']
pprint.pprint(opts)

Update the training options

opts['checkpoint_path'] = model_path

Load the SAM model and move it to the GPU. CUDA tensors implement the same functions as CPU tensors, but run on GPUs.

opts = Namespace(**opts)
net = pSp(opts)
net.eval()
net.cuda()  # move the model to the GPU
print('Model successfully loaded')

Open the input image and resize it for display

image_path = EXPERIMENT_DATA_ARGS[EXPERIMENT_TYPE]["image_path"]
original_image = Image.open(image_path).convert("RGB")
original_image.resize((256, 256))

On executing the above line of code, you will see the input facial image which is as follows:

Download the Dlib model shape_predictor_68_face_landmarks.dat.bz2, which has been trained on the ibug 300-W dataset

!wget http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2

Use the bzip2 command to decompress the file

!bzip2 -dk shape_predictor_68_face_landmarks.dat.bz2

Define function for face alignment

def run_alignment(image_path):
    import dlib
    from scripts.align_all_parallel import align_face
    predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
    aligned_image = align_face(filepath=image_path, predictor=predictor)
    print("Aligned image has shape: {}".format(aligned_image.size))
    return aligned_image

Align the input face

aligned_image = run_alignment(image_path)

Resize the aligned face

aligned_image.resize((256, 256))

Initialize variables for image transformation

 img_transforms = EXPERIMENT_ARGS['transform']
 input_image = img_transforms(aligned_image) 

Run the image on multiple target ages

target_ages = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]  # age shifts from 0 to 100 years
age_transformers = [AgeTransformer(target_age=age) for age in target_ages]
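
Conceptually, each transformer conditions the input on a target age. A common way to do this — and, as the name AgeTransformer suggests, roughly what happens here, though the helper below is my own simplified stand-in, not the library’s code — is to append the normalized target age as an extra channel:

```python
import torch

def add_age_channel(img, target_age, max_age=100):
    # Append the target age, scaled to [0, 1], as a fourth channel
    # (a simplified stand-in, not the actual AgeTransformer implementation)
    age_channel = torch.full_like(img[:1], target_age / max_age)
    return torch.cat([img, age_channel], dim=0)

x = torch.randn(3, 256, 256)    # a transformed input image tensor
x_aged = add_age_channel(x, target_age=70)
print(x_aged.shape)             # torch.Size([4, 256, 256])
```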

Define a function to run the model on a batch of inputs

def run_on_batch(inputs, net):
    result_batch = net(inputs.to("cuda").float(), randomize_noise=False, resize=False)
    return result_batch

For each target age, concatenate the results to display them side by side

results = np.array(aligned_image.resize((1024, 1024)))
for age_transformer in age_transformers:
    print(f"Running on target age: {age_transformer.target_age}")
    with torch.no_grad():
        input_image_age = [age_transformer(input_image.cpu()).to('cuda')]
        input_image_age = torch.stack(input_image_age)  # stack into a batch
        result_tensor = run_on_batch(input_image_age, net)[0]
        result_image = tensor2im(result_tensor)
        results = np.concatenate([results, result_image], axis=1)

Construct an image from the numerical array representation of the output using Image.fromarray()

results = Image.fromarray(results)

Display the output (in a notebook, evaluating the PIL image renders it inline)

results



The above code can also be run on different images by changing the image path appropriately.
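
When comparing runs across several images, it can help to save each concatenated strip and crop individual panels back out. Here is a small self-contained example with stand-in sizes (the file name and panel width are my own choices, not from the original code):

```python
import numpy as np
from PIL import Image

# Stand-in for the concatenated results strip produced above:
# three 64x64 panels side by side (one per target age).
panel_w, n_panels = 64, 3
strip = Image.fromarray(np.zeros((64, panel_w * n_panels, 3), dtype=np.uint8))

strip.save("age_transformation_results.jpg")  # keep the full strip

# Crop each panel back out for per-age inspection
crops = [strip.crop((i * panel_w, 0, (i + 1) * panel_w, strip.height))
         for i in range(n_panels)]
print(len(crops), crops[0].size)  # 3 (64, 64)
```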

We have tried the code on two more images, the results of which are as follows:

Input image 2:


Input image 3:


Google Colab notebooks


For a deeper understanding of the SAM technique, refer to the original research paper and the official GitHub repository.

Nikita Shiledarbaxi
