Meet Impersonator++: A GAN-Based Model That Can Make You A Dancer In Just Seconds

Recently, researchers from ShanghaiTech University introduced Impersonator++, a new GAN-based framework that performs human image synthesis with the help of a 3D body mesh recovery module. According to the researchers, Impersonator++ tackles several human image synthesis tasks, including human motion imitation, appearance transfer, and novel view synthesis.

With the growing prominence of adversarial methods, researchers around the globe have been working on human image synthesis, which aims to produce believable, photorealistic images of humans and covers tasks such as motion imitation, appearance transfer, and novel view synthesis. The researchers stated that human image synthesis has vast potential applications in character animation, reenactment, virtual try-on of clothes, and film or game production.

According to them, existing task-specific methods mainly use 2D keypoints or pose estimates to capture the human body structure. These methods account only for the layout locations of body parts and ignore the personalised shape and limb (joint) rotations, which are even more essential in human image synthesis. In other words, they can express the position of the body structure but cannot characterise a person's individual shape or model limb rotations.

To mitigate these issues, the researchers proposed a new GAN-based framework that uses a 3D body mesh recovery module to disentangle pose from shape. The framework is claimed to model not only joint locations and rotations but also the personalised body shape.

Behind Impersonator++

Impersonator++ is an Attentional Liquid Warping GAN built around an Attentional Liquid Warping Block (AttLWB) that preserves crucial source details such as texture, style, colour, and face identity. The researchers designed the framework to address the loss of source information through four key components:

  1. A denoising convolutional auto-encoder extracts features that preserve source information, including texture, colour, style, and face identity.
  2. The source features of each local part are blended into a global feature stream by the proposed LWB and AttLWB to preserve source details.
  3. The design supports multiple-source warping: in appearance transfer, for instance, it can warp the head features (local identity) from one source and the body features from another, then aggregate them into a single global feature stream.
  4. A one/few-shot learning strategy improves the generalisation of the network.
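The first component in the list, a denoising auto-encoder, can be illustrated with a toy example. The sketch below trains a linear encoder/decoder in NumPy to reconstruct clean inputs from noise-corrupted ones, purely to show the principle; the actual Impersonator++ module is convolutional, and every name and dimension here is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy denoising auto-encoder: corrupt the input with Gaussian noise, then
# train linear encode/decode weights to reconstruct the CLEAN input.
X = rng.normal(size=(200, 8))            # "clean" source features
W_enc = rng.normal(size=(8, 4)) * 0.1    # encoder weights (8 -> 4 bottleneck)
W_dec = rng.normal(size=(4, 8)) * 0.1    # decoder weights (4 -> 8)

lr = 0.01
for _ in range(500):
    noisy = X + 0.1 * rng.normal(size=X.shape)  # corrupt the inputs
    Z = noisy @ W_enc                           # encode
    X_hat = Z @ W_dec                           # decode
    err = X_hat - X                             # compare with the clean input
    # Gradient-descent updates for the mean-squared reconstruction error
    W_dec -= lr * Z.T @ err / len(X)
    W_enc -= lr * noisy.T @ (err @ W_dec.T) / len(X)

mse = float(np.mean((X @ W_enc @ W_dec - X) ** 2))
print(round(mse, 3))
```

Training on corrupted inputs while scoring against the clean ones is what makes the auto-encoder "denoising": it cannot simply copy its input, so it must keep the informative structure of the features.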

The whole approach comprises three main modules: a body mesh recovery module, a flow composition module, and a GAN module with the Liquid Warping Block (LWB) or the Attentional Liquid Warping Block (AttLWB).
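At a high level, the three modules can be sketched as a pipeline. The function names and return values below are hypothetical placeholders, not the authors' API; each stage is a neural module in the real system.

```python
# Hypothetical sketch of the three-stage pipeline: body mesh recovery ->
# flow composition -> GAN synthesis with LWB/AttLWB. Stubs stand in for
# the neural modules of the real framework.

def body_mesh_recovery(image):
    # Estimate SMPL parameters from an image: pose (joint rotations)
    # and shape coefficients.
    return {"pose": [0.0] * 72, "shape": [0.0] * 10}

def flow_composition(source_mesh, reference_mesh):
    # Compute a correspondence (transformation flow) between the source
    # and reference meshes, later used to warp source features.
    return {"flow": "source-to-reference correspondence"}

def gan_synthesis(source_image, flow, use_attention=True):
    # Warp source features through the LWB or AttLWB and synthesise
    # the output frame.
    return f"synthesised frame ({'AttLWB' if use_attention else 'LWB'})"

src = body_mesh_recovery("source.png")
ref = body_mesh_recovery("reference.png")
flow = flow_composition(src, ref)
result = gan_synthesis("source.png", flow)
print(result)
```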

During the process, the researchers first used a parametric statistical human body model, SMPL, to disentangle a human body into pose (joint rotations) and shape. SMPL is a 3D body model that outputs a 3D mesh (without clothes) rather than the layouts of joints and parts.
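The disentanglement SMPL provides can be made concrete: in the standard SMPL parameterisation, a body is described by two independent parameter blocks, 72 pose values (an axis-angle rotation per joint, including the global root orientation) and 10 shape coefficients. A minimal sketch, with the mesh decoder itself omitted:

```python
import numpy as np

# SMPL represents a body with separate pose and shape parameter blocks
# (dimensions follow the standard SMPL parameterisation).
NUM_JOINTS = 24            # 23 body joints + 1 global root orientation
POSE_DIM = NUM_JOINTS * 3  # one axis-angle rotation per joint -> 72 values
SHAPE_DIM = 10             # low-dimensional body-shape coefficients

rng = np.random.default_rng(0)
pose = rng.normal(size=POSE_DIM)    # limb rotations, independent of body shape
shape = rng.normal(size=SHAPE_DIM)  # personalised shape, independent of pose

# A mesh decoder M(pose, shape) would output a fixed-topology vertex mesh;
# the point here is only that pose and shape are separate, swappable blocks.
print(pose.shape, shape.shape)
```

Because the two blocks are independent, a target's pose can be combined with a source's shape, which is exactly the disentanglement the 2D-keypoint methods lack.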

They then applied the Attentional Liquid Warping Block (AttLWB), which learns the similarities between the global features and all the source features, and fuses the multiple source features through a linear combination weighted by those learned similarities in feature space.
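A simplified stand-in for this fusion step: compute a similarity between the global feature and each source feature, normalise the similarities with a softmax, and take the weighted linear combination. The real AttLWB learns its similarities with network layers; the dot-product version below is only for intuition.

```python
import numpy as np

def attentional_fusion(global_feat, source_feats):
    """Fuse multiple source features into one stream via a linear
    combination weighted by similarity to the global feature
    (dot product + softmax; a toy proxy for the learned AttLWB)."""
    sims = np.array([global_feat @ s for s in source_feats])
    weights = np.exp(sims - sims.max())   # numerically stable softmax
    weights /= weights.sum()
    fused = sum(w * s for w, s in zip(weights, source_feats))
    return fused, weights

g = np.array([1.0, 0.0])                  # global feature stream
sources = [np.array([1.0, 0.0]),          # e.g. head features, source A
           np.array([0.0, 1.0])]          # e.g. body features, source B
fused, w = attentional_fusion(g, sources)
print(w)  # the source more similar to the global feature gets more weight
```

This is also why the block naturally supports multiple sources: adding another source just adds one more term to the weighted sum.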

Lastly, inspired by SinGAN and Few-Shot Adversarial Learning, the researchers applied a one/few-shot adversarial learning strategy that pushes the network to focus on the individual input through several adaptation steps, a process they call personalisation.
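The personalisation idea, a few gradient-based adaptation steps on frames of one individual starting from pretrained weights, can be mimicked with a toy objective. The real method minimises adversarial and perceptual losses over network weights; the scalar quadratic below only illustrates the adaptation loop.

```python
# Toy sketch of one/few-shot personalisation: start from pretrained
# weights and run a few adaptation steps on frames of the target person.

def adaptation_steps(weights, personal_frames, lr=0.5, steps=5):
    # Toy "loss": squared distance to the mean statistic of the person's
    # frames; a stand-in for the GAN + perceptual losses of the paper.
    target = sum(personal_frames) / len(personal_frames)
    for _ in range(steps):
        grad = 2 * (weights - target)   # gradient of (weights - target)^2
        weights = weights - lr * grad   # one gradient-descent step
    return weights

pretrained = 0.0                # generic, person-agnostic starting point
frames = [1.0, 1.2, 0.8]        # a few frames of the new individual
personalised = adaptation_steps(pretrained, frames)
print(round(personalised, 4))
```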

According to the researchers, thanks to the SMPL model and the Liquid Warping Block (LWB), the method can be further extended to other tasks, including human appearance transfer and novel view synthesis, so that a single model handles all three tasks.

Dataset Used

The researchers built a new dataset, the Impersonator (iPER) dataset, for evaluating human motion imitation, appearance transfer, and novel view synthesis. Its videos cover diverse styles of clothing and feature a total of 30 subjects of varying shape, height, and gender. The whole dataset contains 206 video sequences with 241,564 frames.

Contributions Of This Research

The contributions made by the researchers are mentioned below:

  • The researchers proposed a Liquid Warping Block (LWB) and an Attentional Liquid Warping Block (AttLWB) that propagate source information, such as texture, style, colour, and face identity, and address its loss in both the image and the feature space.
  • According to them, by taking advantage of both the LWB (AttLWB) and the 3D parametric model, the method is a unified framework for human motion imitation, appearance transfer, and novel view synthesis.
  • Due to the limitations of previously available datasets, they built a new dataset for these tasks, especially human motion imitation in video, and released all code and data for the convenience of further research in the community.

Ambika Choudhury
A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.
