Machine Learning Is Getting Better At Video Editing: Making People Disappear

Researchers at Virginia Tech and Facebook AI have come up with an improved technique that allows machine learning to edit videos like never before. In their paper titled “Flow-edge Guided Video Completion”, they present a new flow-based video completion algorithm.

Video completion in this context refers to filling the missing regions of a pre-recorded video with newly synthesised content. The use cases for a successful video completion algorithm are plentiful: from automating VFX workflows to removing watermarks, such algorithms can be quite handy.

Previous video completion methods propagate colours along local flow connections between adjacent frames. However, because motion boundaries form impenetrable barriers, not all missing regions in a video can be reached in this way. The researchers address this problem by introducing non-local flow connections to temporally distant frames, which can propagate video content over motion boundaries. The method is validated on the DAVIS dataset.
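To make the difference between local and non-local connections concrete, here is a minimal sketch of how the source frames for a given target frame might be selected. The function name `candidate_frames` and the default choice of the first, middle and last frames as the temporally distant ones are assumptions made for illustration; the article does not state which distant frames the method uses.

```python
def candidate_frames(i, num_frames, non_local=None):
    """Return (local, distant) source frame indices for target frame i.

    Assumption: the temporally distant frames default to the first,
    middle and last frames of the video. This is an illustrative choice.
    """
    if non_local is None:
        non_local = (0, num_frames // 2, num_frames - 1)
    # Local connections: the adjacent frames, where they exist.
    local = [j for j in (i - 1, i + 1) if 0 <= j < num_frames]
    # Non-local connections: distant frames not already covered above.
    distant = sorted({j for j in non_local if j != i and j not in local})
    return local, distant
```

A pixel that is hidden in both adjacent frames may still be visible in one of the distant frames, which is why the extra connections help content cross motion boundaries.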



So far, ML techniques could not synthesise sharp flow edges, especially in complex situations, and it is challenging to keep the output temporally coherent when the camera itself is moving. In this work, the researchers tackle both problems by completing the flow edges first and using them to guide flow completion.

How It Works

The algorithm works as follows:


  • A binary mask is applied to the colour video to figure out which parts need to be synthesised.
  • Forward and backward flow are computed between adjacent and non-adjacent frames.
  • Flow edges are extracted and completed.
  • These completed edges act as a guide for piecewise-smooth flow completion.
  • For each missing pixel, candidate pixels are computed, each with an estimated confidence score and a binary validity indicator.
  • The frame with the most missing pixels is chosen and filled with image inpainting.
  • This is repeated until no missing pixels remain.
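The last three steps can be sketched as a simple loop. This is a toy version under strong assumptions, not the authors' implementation: it assumes zero flow (a static scene), so the candidates for a pixel are simply the pixels at the same location in the other frames, and it uses a mean fill as a stand-in for a real image inpainting model. The names `propagate`, `inpaint_frame` and `complete_video` are hypothetical.

```python
import numpy as np

def inpaint_frame(frame, missing):
    """Stand-in for image inpainting: fill holes with the mean known value."""
    filled = frame.copy()
    known = ~missing
    filled[missing] = frame[known].mean() if known.any() else 0.0
    return filled

def propagate(video, missing, confidence):
    """Fill missing pixels from valid candidates at the same location.

    Assumption: zero flow, so a candidate is the same pixel in another frame.
    """
    valid = ~missing                        # binary validity indicator
    weights = confidence * valid            # invalid candidates get weight 0
    weight_sum = weights.sum(axis=0)        # shape (H, W)
    fused = (weights * video).sum(axis=0) / np.maximum(weight_sum, 1e-8)
    fillable = missing & (weight_sum > 0)   # at least one valid candidate
    video = np.where(fillable, fused, video)
    return video, missing & ~fillable

def complete_video(video, missing, confidence=None):
    """video: (T, H, W) floats, missing: (T, H, W) booleans."""
    video = video.astype(float).copy()
    missing = missing.copy()
    if confidence is None:
        confidence = np.ones_like(video)    # assumption: uniform confidence
    while True:
        video, missing = propagate(video, missing, confidence)
        if not missing.any():
            return video
        # Pick the frame with the most missing pixels and inpaint it.
        counts = missing.reshape(len(missing), -1).sum(axis=1)
        t = int(counts.argmax())
        video[t] = inpaint_frame(video[t], missing[t])
        missing[t] = False
```

Pixels that are hidden in every frame cannot be reached by propagation, so one frame is inpainted and the next pass of the loop spreads that result to the remaining frames.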

Flow completion requires optical flow estimation on the input video, with the missing regions given zero value (shown in white). This is followed by edge extraction, and then by piecewise-smooth flow completion that uses the completed edges as guidance.
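The first two stages of that pipeline can be sketched in a few lines of NumPy. The sketch assumes the flow field is already available as an (H, W, 2) array and uses a plain gradient threshold as a stand-in edge detector; the article does not say which detector the authors use, and the function names and the threshold value are hypothetical.

```python
import numpy as np

def dilate(mask):
    """Grow a boolean mask by one pixel in the four axis directions."""
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    return (p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1]
            | p[1:-1, :-2] | p[1:-1, 2:])

def flow_edges(flow, missing, threshold=0.5):
    """Return (flow magnitude, edge map) outside the missing region.

    flow: (H, W, 2) array, missing: (H, W) boolean mask.
    """
    # Missing regions are given zero flow.
    flow = np.where(missing[..., None], 0.0, flow)
    magnitude = np.linalg.norm(flow, axis=-1)
    # Stand-in edge detector: threshold the gradient of the magnitude.
    gy, gx = np.gradient(magnitude)
    edges = np.hypot(gx, gy) > threshold
    # Drop responses caused only by the zeroed hole and its border.
    edges &= ~dilate(missing)
    return magnitude, edges
```

The edge map produced here still has a gap where the hole is; completing the edges across that gap is the job of the learned edge completion network described in the next section.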

For evaluation, the researchers used the DAVIS dataset, which contains a total of 150 video sequences. Following the evaluation protocol, 60 sequences in 2017-test-dev and 2017-test-challenge were used for training the flow edge completion network. 

Masks were adopted from the testing split of the NVIDIA Irregular Mask Dataset. During training, according to the authors, edge images and the corresponding flow magnitude images were first cropped to 256×256 patches and then corrupted with a randomly chosen mask, which was resized to 256×256. The Adam optimiser was used with a learning rate of 0.001, and training the network on a single NVIDIA P100 GPU took 12 hours.
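The data preparation described above could look roughly like the following. This is a sketch under assumptions rather than the authors' code: the crop position is drawn uniformly at random, the mask is resized with nearest-neighbour sampling, and corruption means setting the masked pixels to zero. The function names are hypothetical, and only the 256×256 patch size comes from the article.

```python
import numpy as np

PATCH = 256  # patch size reported in the article

def resize_nearest(mask, size):
    """Resize a 2-D mask to (size, size) with nearest-neighbour sampling."""
    h, w = mask.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return mask[rows[:, None], cols]

def make_training_sample(edge, flow_mag, mask, rng):
    """Crop aligned patches and corrupt them with a resized mask.

    edge, flow_mag: (H, W) arrays with H, W >= PATCH.
    mask: 2-D array of any size, non-zero where pixels are removed.
    """
    h, w = edge.shape
    top = rng.integers(0, h - PATCH + 1)
    left = rng.integers(0, w - PATCH + 1)
    window = (slice(top, top + PATCH), slice(left, left + PATCH))
    edge_patch, mag_patch = edge[window], flow_mag[window]
    hole = resize_nearest(mask, PATCH).astype(bool)
    # Assumption: corrupted pixels are set to zero inside the hole.
    corrupted_edge = np.where(hole, 0, edge_patch)
    corrupted_mag = np.where(hole, 0, mag_patch)
    return corrupted_edge, corrupted_mag, hole, edge_patch
```

The uncorrupted `edge_patch` is returned alongside the corrupted inputs so that it can serve as the training target for an edge completion network.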

Why Is This Work Important

Getting an algorithm to differentiate between two objects in a dynamic setting is tricky. Imagine a person walking; the background remains visible even through the sweeping motion of the feet. Video completion algorithms have to fill in the missing pieces of information. These techniques can also be applied to removing scratches from videos, video editing and special effects workflows (removing unwanted objects), watermark and logo removal, and video stabilisation (filling in the exterior after shake removal instead of cropping).

Though the work gives a new direction for automated VFX workflows, there are limitations such as frame rate and detection of objects in fast-paced environments (think: boxing match). 

This work by Chen Gao and his peers was also presented at the recently concluded 16th edition of the European Conference on Computer Vision (ECCV).

Know more about this work here.


Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
