Impersonator++ is a human motion imitation library that also delivers state-of-the-art image synthesis within a unified framework: once the model is trained, it can handle all of these tasks. Previous methods use 2D human pose keypoints to estimate body structure, but Impersonator++ uses a 3D body mesh recovery module to extract the shape and pose of the human, which further models joint locations and rotations and characterizes the personalized body shape.
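To make the mesh-recovery idea concrete, here is a minimal sketch of the parameters an SMPL-based recovery module typically estimates; the variable names below are illustrative placeholders, not iPERCore's actual API.

import numpy as np

# Illustrative only: a typical SMPL parameterization used by mesh-recovery modules.
theta = np.zeros(72)             # pose: 24 joints x 3 axis-angle rotation values (includes the global root)
beta = np.zeros(10)              # shape: PCA coefficients capturing the personalized body shape
cam = np.array([1.0, 0.0, 0.0])  # weak-perspective camera (scale, x/y translation), an assumed convention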
The Impersonator++ research paper, Liquid Warping GAN: A Unified Framework for Human Motion Imitation, Appearance Transfer, and Novel View Synthesis, was published by researchers at ShanghaiTech University (Wen Liu, Zhixin Piao, Jie Min, Wenhan Luo, Lin Ma, and Shenghua Gao) in October 2019. To preserve texture, style, color, and face identity, they propose a Liquid Warping GAN with a Liquid Warping Block (LWB) that propagates this information in both image and feature spaces and synthesizes an output image conditioned on the reference. The researchers also build a new dataset, the Impersonator (iPER) dataset, for accurately evaluating human motion imitation and image synthesis.
Before moving on to the implementation, let's look at some of the image synthesis techniques for different applications, each built on the combined use of a source image and a reference image.
a.) Human Motion Imitation
Motion imitation generates an image with the texture of the source human and the pose of the reference human. Put simply, it imitates the pose from the reference image and synthesizes the result onto the source identity.
b.) Novel View Synthesis
Human novel view synthesis is all about synthesizing new images of the human body as if captured from different viewpoints, as shown in the picture above.
c.) Human Appearance Transfer
Appearance transfer, as the image demonstration above makes clear, generates a human image while preserving the reference identity together with the clothes; the different parts of the resulting image may come from different people.
Liquid Warping Block (LWB)
With recent advances in GAN technology, and given the flaws of earlier methods such as concatenation and texture warping, Impersonator++ proposes a Liquid Warping Block (LWB) to preserve source information such as clothes and face identity. It brings three main improvements:
- A denoising convolutional auto-encoder for preserving information.
- LWB takes the features of each local part and blends them into a global feature stream to preserve the source details.
- LWB supports multi-source warping, as in appearance transfer: warping the head features from one source and the body features from another, then aggregating them into a global feature stream (a rough sketch of this feature-warping idea follows the list).
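Here is a minimal PyTorch-style sketch of that feature-warping idea, assuming a precomputed sampling flow per source; it is illustrative only and not the official iPERCore LWB implementation.

import torch
import torch.nn.functional as F

def warp_and_blend(global_feat, source_feats, flows):
    # global_feat:  (N, C, H, W) feature stream of the generator
    # source_feats: list of (N, C, H, W) local features, one per source
    # flows:        list of (N, H, W, 2) sampling grids in [-1, 1] mapping target pixels to source pixels
    out = global_feat
    for feat, flow in zip(source_feats, flows):
        warped = F.grid_sample(feat, flow, align_corners=True)  # warp source features into the target pose
        out = out + warped                                       # blend into the global feature stream
    return out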
Liquid Warping GAN
Liquid Warping GAN contains three stages:
- Body Mesh recovery module
- Flow composition module
- GAN module with Liquid Warping Block(LWB)
These stages synthesize high-fidelity human images under the desired conditions. More specifically, the framework performs these three tasks (a conceptual sketch of the full pipeline follows the list):
1) It synthesizes the background image;
2) It predicts the color of invisible parts based on the visible parts of the image;
3) It generates the pixels of the face, clothes, hair, and other regions from the reconstruction of SMPL (a parametric statistical human body model).
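As a rough orientation, here is how the three stages fit together in pseudocode; every helper name below is a hypothetical placeholder, not a real iPERCore function.

def liquid_warping_gan(source_img, reference_img):
    # 1) Body mesh recovery: estimate SMPL pose/shape for both images (hypothetical helpers)
    src_smpl = body_mesh_recovery(source_img)
    ref_smpl = body_mesh_recovery(reference_img)

    # 2) Flow composition: derive the transformation flow from the source pose to the reference pose
    flow = compose_flow(src_smpl, ref_smpl)

    # 3) GAN module with LWB: warp source features by the flow, blend them into the
    #    generator stream, and synthesize the background, invisible parts, and the final image
    return generator_with_lwb(source_img, flow)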
Impersonator (iPER) dataset
Finally, to let others reproduce the results of the proposed methods, the Impersonator++ researchers introduce the Impersonator (iPER) dataset, which contains 30 people of different shapes, sizes, sexes, and heights. Each person wears different clothes and performs a pose video of random actions such as exercising, jumping, squats, leg raises, and Tai Chi.
Some of the other features of the iPER dataset are:
- There are 103 outfits in total, since some actors wear multiple sets of clothes.
- It contains 206 video files with 241K frames.
- The data is split into train and test sets at a ratio of 8:2.
The iPER dataset visualization shown above gives several insights into the dataset:
(a) the classes of actions and their number of occurrences, e.g. there are 41 videos of people jumping;
(b) the different styles of clothes the actors are wearing;
(c) the weight distribution of all 30 actors;
(d) the height distribution of the 30 actors, with most heights falling between 165 and 175 cm.
This variety helps keep the data from being biased. You can download the dataset by clicking on the following links:
Installation
Before installing, let's look at the system dependencies on which Impersonator++ has been tested.
- Supported operating systems: tested on Ubuntu 16.04/18.04 and Windows 10.
- CUDA 10.1, 10.2, or 11.0 with an NVIDIA GPU.
- gcc (C++14) on Linux, or MSVC++ with Visual Studio 2019 (C++14) on Windows.
- ffmpeg (ffprobe), tested on 4.3.1+ (a few quick version-check commands follow the list).
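If you want to verify these dependencies quickly, the following shell commands (written here in Colab/Jupyter style and assuming a Linux environment) print the installed versions:

!nvcc --version     # CUDA toolkit version
!gcc --version      # compiler version (Linux)
!ffmpeg -version    # ffmpeg / ffprobe version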
If you would rather not install, train, and test Impersonator++ on your local machine, you can simply use Google Colab. We are going to use Google Colab for this tutorial, so let's jump straight into the coding implementation; for installation on your local machine, follow these guides:
Implementation
Before the implementation, let's first install all the dependencies needed to run Impersonator++ in your Google Colab environment.
Note: set your Runtime to GPU in Colab.
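To confirm the GPU runtime is actually attached, you can run nvidia-smi, which is available on Colab GPU instances:

!nvidia-smi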
Install ffmpeg (ffprobe) and set CUDA_HOME in the system environment:
import os

!apt-get install ffmpeg
os.environ["CUDA_HOME"] = "/usr/local/cuda-10.1"
!echo $CUDA_HOME
Clone the iPERCore GitHub Repository and Setup
!git clone https://github.com/iPERDance/iPERCore.git
%cd /content/iPERCore/
!python setup.py develop
Downloading all pretrained model checkpoints
If you are trying it on your local machine, download the checkpoints below.
- checkpoints: https://download.impersonator.org/iper_plus_plus_latest_checkpoints.zip
- samples: https://download.impersonator.org/iper_plus_plus_latest_samples.zip
!wget -O assets/checkpoints.zip "https://download.impersonator.org/iper_plus_plus_latest_checkpoints.zip"
!unzip -o assets/checkpoints.zip -d assets/
!rm assets/checkpoints.zip
Downloading Samples
!wget -O assets/samples.zip "https://download.impersonator.org/iper_plus_plus_latest_samples.zip"
!unzip -o assets/samples.zip -d assets
!rm assets/samples.zip
%cd /content/iPERCore/
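Optionally, you can list the extracted samples folder to check that the source images and reference videos used later are in place (the path is assumed from the cells below):

!ls assets/samples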
Import modules
import os.path as osp
import platform
import argparse
import time
import sys
import subprocess

from IPython.display import HTML
from base64 import b64encode
Run Scripts
# the gpu ids
gpu_ids = "0"

# the image size
image_size = 512

# the default number of source images, it will be updated if the actual number of sources <= num_source
num_source = 2

# the assets directory. This is very important, please download it from `one_drive_url` firstly.
assets_dir = "/content/iPERCore/assets"

# the output directory.
output_dir = "./results"

# the model id of this case. This is a random model name.
# model_id = "model_" + str(time.time())

# # This is a specific model name, and it will be used if you do not change it.
# model_id = "axing_1"

# symlink from the actual assets directory to this current directory
work_asserts_dir = os.path.join("./assets")
if not os.path.exists(work_asserts_dir):
    os.symlink(osp.abspath(assets_dir), osp.abspath(work_asserts_dir),
               target_is_directory=(platform.system() == "Windows"))

cfg_path = osp.join(work_asserts_dir, "configs", "deploy.toml")
Let’s Run the Trump Case
In this case, there is only a frontal body image as the source input. "donald_trump_2" is a specific model name, and it will be used if you do not change it. This is the `trump` case.
model_id = "donald_trump_2" # the source input information, here \" is escape character of double duote " src_path = "\"path?=/content/iPERCore/assets/samples/sources/donald_trump_2/00000.PNG,name?=donald_trump_2\"" ## the reference input information. There are three reference videos in this case. # here \" is escape character of double duote " # ref_path = "\"path?=/content/iPERCore/assets/samples/references/akun_1.mp4," \ # "name?=akun_2," \ # "pose_fc?=300\"" # ref_path = "\"path?=/content/iPERCore/assets/samples/references/mabaoguo_short.mp4," \ # "name?=mabaoguo_short," \ # "pose_fc?=400\"" ref_path = "\"path?=/content/iPERCore/assets/samples/references/akun_1.mp4," \ "name?=akun_2," \ "pose_fc?=300|" \ "path?=/content/iPERCore/assets/samples/references/mabaoguo_short.mp4," \ "name?=mabaoguo_short," \ "pose_fc?=400\"" print(ref_path) !python -m iPERCore.services.run_imitator \ --gpu_ids $gpu_ids \ --num_source $num_source \ --image_size $image_size \ --output_dir $output_dir \ --model_id $model_id \ --cfg_path $cfg_path \ --src_path $src_path \ --ref_path $ref_path
The result will be saved in ./results/primitives/donald_trump_2/synthesis/imitations/donald_trump_2-mabaoguo_short.mp4
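You can confirm that the run produced the video by listing the imitation output folder (the same path as above):

!ls ./results/primitives/donald_trump_2/synthesis/imitations/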
mp4 = open("./results/primitives/donald_trump_2/synthesis/imitations/donald_trump_2-mabaoguo_short.mp4", "rb").read() data_url = "data:video/mp4;base64," + b64encode(mp4).decode() HTML(f""" <video width="100%" height="100%" controls> <source src="{data_url}" type="video/mp4"> </video>""")
Run on Custom Inputs
Let's download an image of a single person, place it inside the working directory, and then synthesize it with a sample reference video. Remember to change all the parameters as shown in the code below.
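For example, one simple way to get an image into the working directory is wget; the URL below is only a placeholder and must be replaced with a direct link to your own single-person, front-facing image.

# Placeholder URL: replace it with a direct link to your own image
!wget -O /content/your_one_person_image.jpg "https://example.com/your_image.jpg"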
model_id = "yourmodel_name_any_name" # the source input information src_path = "\"path?=/content/your_one_person_image.jpg,name?=person1\"" # src_path = "\"YOU NEED TO REPLACE THIS. FOLLOW THE ABOVE EXAMPLE.\"" ## the reference input information. There are three reference videos in this case. ref_path = "\"path?=/content/iPERCore/assets/samples/references/akun_1.mp4," \ "name?=akun_2," \ "pose_fc?=300\"" # ref_path = "\"YOU NEED TO REPLACE THIS. FOLLOW THE ABOVE EXAMPLE.\"" !python -m iPERCore.services.run_imitator \ --gpu_ids $gpu_ids \ --num_source $num_source \ --image_size $image_size \ --output_dir $output_dir \ --model_id $model_id \ --cfg_path $cfg_path \ --src_path $src_path \ --ref_path $ref_path
Let's view our model's output
mp4 = open("./results/primitives/person1/synthesis/imitations/person1-akun_2.mp4", "rb").read() data_url = "data:video/mp4;base64," + b64encode(mp4).decode() HTML(f""" <video width="100%" height="100%" controls> <source src="{data_url}" type="video/mp4"> </video>""")
Conclusion
Impersonator++ is clearly a very easy-to-use framework. It is an extension of its authors' previous ICCV project, impersonator (https://github.com/svip-lab/impersonator), and with its new GAN-based approach and improved dataset it is gaining popularity. The impersonator community continues to work on new tools such as iPER-Dance (a video-editing tool for human motion imitation, appearance transfer, and novel view synthesis). To learn more, you can follow these resources:
- Impersonator++ Github Repository
- Impersonator for Windows
- Research Paper
- Project Page
- iPER Dataset
- Above Demonstration Colab Notebook