Guide to PyMAF: Pyramidal Mesh Alignment Feedback

PyMAF is a regression-based approach for human pose 3D mesh recovery. It introduces a new mesh alignment feedback loop that leverages different scales of spatial information obtained from a feature pyramid.


Generating 3D pose meshes from monocular images is a computer vision problem that aims to automate a tedious and time-consuming aspect of visual effects (VFX) work. Modelling objects with long and complex kinematic chains, such as the human body, is labour intensive, as the VFX artist has to go frame by frame to rotoscope different sections of the kinematic chain.

Existing approaches for automating these tasks fall under two broad paradigms: optimization-based and regression-based. Optimization-based approaches fit a parametric body model directly to 2D image evidence and produce accurate mesh-image alignments, but they are slow and sensitive to initialization. Regression-based approaches use neural networks to map raw pixels directly to the model parameters in a single feed-forward pass.

Such regressors are sensitive to minor parameter deviations, which often leads to misalignment between the generated meshes and the image evidence. In their paper, “PyMAF: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop”, Hongwen Zhang, Yating Tian, et al. proposed a new feedback loop that uses a feature pyramid to explicitly rectify the parameters based on the mesh-image alignment status.

Architecture & Approach

PyMAF architecture

Feature Pyramid for Human Mesh Regression 

The PyMAF image encoder produces a pyramid of spatial features that describe the human pose in the image at different scale levels, allowing the subsequent deep regressor to leverage multi-scale alignment contexts. The point-wise features extracted from these feature maps are passed through a multi-layer perceptron for dimensionality reduction and concatenated into a single feature vector. Because the pose parameters are represented as relative rotations along the kinematic chains, small parameter errors can accumulate into large misalignments. To deal with such misalignments, the parameter regressor uses 2D supervision on the keypoints projected from the estimated mesh, plus additional 3D supervision on the 3D joints and model parameters when ground-truth 3D labels are available.
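
To make the point-wise sampling concrete, below is a minimal PyTorch sketch of extracting features at given 2D locations from one pyramid level and compressing them with a small MLP. The class name PointwiseFeatureSampler, the layer sizes and the sampling points are illustrative assumptions, not the official PyMAF implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sketch: sample point-wise features from one level of the
# spatial feature pyramid and compress them with a small MLP.
# Layer sizes and point counts are assumptions, not PyMAF's configuration.
class PointwiseFeatureSampler(nn.Module):
    def __init__(self, in_channels=256, out_channels=5):
        super().__init__()
        # MLP for per-point dimensionality reduction
        self.mlp = nn.Sequential(
            nn.Linear(in_channels, 64),
            nn.ReLU(inplace=True),
            nn.Linear(64, out_channels),
        )

    def forward(self, feature_map, points_2d):
        # feature_map: (B, C, H, W) spatial features from the encoder
        # points_2d:   (B, N, 2) sampling locations normalised to [-1, 1]
        grid = points_2d.unsqueeze(2)                    # (B, N, 1, 2)
        sampled = F.grid_sample(feature_map, grid, align_corners=False)
        sampled = sampled.squeeze(-1).permute(0, 2, 1)   # (B, N, C)
        reduced = self.mlp(sampled)                      # (B, N, out_channels)
        # Concatenate the per-point features into one vector per sample
        return reduced.flatten(start_dim=1)              # (B, N * out_channels)

sampler = PointwiseFeatureSampler()
feats = torch.randn(1, 256, 56, 56)      # one pyramid level
points = torch.rand(1, 431, 2) * 2 - 1   # e.g. downsampled mesh vertices
print(sampler(feats, points).shape)      # torch.Size([1, 2155])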

Mesh Alignment Feedback Loop

Regressing mesh parameters in a single pass is challenging; to overcome this limitation, existing approaches employ an Iterative Error Feedback (IEF) loop that updates the parameters over several steps. Although this reduces parameter errors, the same global features are reused at every update. These global features lack fine-grained spatial information and do not reflect the current prediction. PyMAF instead introduces a Mesh Alignment Feedback (MAF) loop that leverages mesh-aligned features: in contrast to uniformly sampled grid features or global features, mesh-aligned features convey how well the current estimate aligns with the image, which is more useful for correcting the parameters.
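
The sketch below shows, in rough PyTorch-style code based on the description above, how such a feedback loop could iterate over the pyramid levels. The helpers decode_mesh and project_to_image are hypothetical placeholders for the body model and camera projection, and samplers/regressors stand for per-level modules like the one sketched earlier; this is an illustration of the idea, not PyMAF's actual code.

import torch

# Minimal sketch of a mesh-alignment feedback loop. `decode_mesh`,
# `project_to_image`, `samplers` and `regressors` are hypothetical
# placeholders, not part of the PyMAF repository.
def mesh_alignment_feedback(pyramid_feats, init_params, samplers, regressors,
                            decode_mesh, project_to_image):
    # pyramid_feats: list of (B, C, H, W) feature maps, coarse to fine
    # init_params:   (B, P) initial pose/shape/camera parameters
    params = init_params
    for level, feats in enumerate(pyramid_feats):
        # 1. Decode the current estimate into a mesh and project it onto the image
        vertices_3d = decode_mesh(params)                    # (B, N, 3)
        vertices_2d = project_to_image(vertices_3d, params)  # (B, N, 2) in [-1, 1]
        # 2. Sample features at the projected vertices: mesh-aligned features
        #    that reflect how well the current estimate matches the image
        aligned = samplers[level](feats, vertices_2d)        # (B, D)
        # 3. Predict a residual update conditioned on the current parameters
        params = params + regressors[level](torch.cat([aligned, params], dim=1))
    return params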

Auxiliary Pixel-wise Supervision

Spatial features are easily corrupted by noise in the images, such as occlusion and illumination differences. To counter this, PyMAF applies an auxiliary pixel-wise loss on the spatial features at the last level. This auxiliary supervision provides mesh-image association cues, encouraging the image encoder to preserve the most relevant information in the spatial feature maps.
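
As a rough illustration of what such a loss head might look like, the sketch below attaches a 1x1 convolution to the last-level feature map and supervises it per pixel with a cross-entropy loss against body-part labels. The class AuxiliaryPixelHead, the number of parts, and the use of part segmentation as the target are simplifying assumptions; in the paper the auxiliary target is a dense correspondence map.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch of an auxiliary pixel-wise supervision head. A 1x1 convolution
# predicts per-pixel body-part labels from the last-level spatial features,
# and a cross-entropy loss provides dense mesh-image association cues.
# The part-segmentation target is a simplified, hypothetical stand-in.
class AuxiliaryPixelHead(nn.Module):
    def __init__(self, in_channels=256, num_parts=25):
        super().__init__()
        self.classifier = nn.Conv2d(in_channels, num_parts, kernel_size=1)

    def forward(self, spatial_feats, gt_part_map):
        # spatial_feats: (B, C, H, W) last-level features from the encoder
        # gt_part_map:   (B, H, W) integer part labels (0 = background)
        logits = self.classifier(spatial_feats)      # (B, num_parts, H, W)
        return F.cross_entropy(logits, gt_part_map)

head = AuxiliaryPixelHead()
feats = torch.randn(2, 256, 56, 56)
labels = torch.randint(0, 25, (2, 56, 56))
print(head(feats, labels).item())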

Creating Human Pose Meshes From Monocular Images Using PyMAF

The following code has been taken from the official PyMAF demo Colab notebook.

  1. Clone the PyMAF GitHub repository and navigate into its root directory (the %cd magic is used so the directory change persists across notebook cells).
 !git clone https://github.com/HongwenZhang/PyMAF.git
 %cd PyMAF
  2. Install PyTorch, Torchvision and the other requirements.
 !pip3 install -U https://download.pytorch.org/whl/cu100/torch-1.1.0-cp37-cp37m-linux_x86_64.whl
 !pip3 install -U https://download.pytorch.org/whl/cu100/torchvision-0.3.0-cp37-cp37m-linux_x86_64.whl
 !pip install -r requirements.txt
  3. Run the demo.py script to generate the 3D mesh for your video; make sure to replace ./sample_video.mp4 with the path to your video file.
 !CUDA_VISIBLE_DEVICES=0 python3 demo.py --checkpoint=data/pretrained_model/PyMAF_model_checkpoint.pt --vid_file ./sample_video.mp4
3D human pose mesh created by PyMAF

Last Epoch

PyMAF has improved mesh-image alignment

This article went through PyMAF, a regression-based approach for human pose 3D mesh recovery. It introduces a new mesh alignment feedback loop that leverages different scales of spatial information obtained from a feature pyramid. Model parameters are optimized by the feedback loop based on the alignment status of the currently estimated meshes. In addition, an auxiliary supervision task is imposed on the spatial feature maps during training of the regressor. This pixel-wise supervision makes the regressor less susceptible to image noise and improves the reliability of the mesh-aligned features. PyMAF was evaluated on both indoor and in-the-wild datasets, and it consistently improved mesh-image alignment performance over previous regression-based methods.

References

All images, except the output, have been taken from the PyMAF paper.


Aditya Singh

A machine learning enthusiast with a knack for finding patterns. In my free time, I like to delve into the world of non-fiction books and video essays.