“Anatomical actions such as heartbeat, blood flow, or breathing, create subtle changes that are not visible to the eye but still detectable computationally.”
The mere suspicion of a doctored video can destabilise a government. It happened in Gabon, when the President recorded an awkwardly shot video message. The inconsistencies were mistaken for signs of a deep fake, and the opposition's assumption that the President had died led to an attempted coup.
Though AI-generated imagery has great potential, its malicious use can be just as damaging. Researchers are now looking for ways to detect deep fakes in images and videos, given their significant implications for the upcoming US elections. There have been a few decent attempts to spot the fakes.
Researchers from Binghamton University and Intel Corporation have now come up with a novel solution. In a paper titled “How Do the Hearts of Deep Fakes Beat?”, they propose an approach that separates deep fakes from real videos and identifies the specific generative model behind a deep fake. According to the authors, the intuition is that the residuals left by a generative model carry more information, and that manipulation artifacts can be revealed by disentangling those residuals with biological signals.
Chasing Biological Cues
The procedure can be summarised as follows:
- From real videos, several generators create deep fakes with residuals specific to each model.
- The model extracts face ROIs and biological signals to create PPG cells.
- It trains on PPG cells and, by aggregating window predictions, classifies videos as fake or authentic.
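The last step, turning per-window predictions into one verdict for the whole video, can be sketched as below. This is only an illustration: the article does not say which aggregation rule the authors use, so the majority vote with a mean-probability tie-break, the function name `classify_video`, and the `threshold` parameter are all assumptions.

```python
from statistics import mean


def classify_video(window_fake_probs, threshold=0.5):
    """Aggregate per-window fake probabilities into one video-level label.

    Assumed rule (not taken from the paper): majority vote over windows,
    with the mean probability used to break an exact tie.
    """
    if not window_fake_probs:
        raise ValueError("need at least one window prediction")

    # Each window votes "fake" if its probability reaches the threshold.
    fake_votes = sum(p >= threshold for p in window_fake_probs)
    real_votes = len(window_fake_probs) - fake_votes

    if fake_votes == real_votes:
        # Exact tie: fall back on the average probability.
        return "fake" if mean(window_fake_probs) >= threshold else "real"
    return "fake" if fake_votes > real_votes else "real"


# Hypothetical classifier outputs for three windows of one video.
label = classify_video([0.9, 0.8, 0.2])
```

Voting over many short windows means a single noisy window is less likely to flip the decision for the whole video.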
So, what are these PPG cells?
Anatomical actions such as heartbeat, blood flow, or breathing, the paper states, cause subtle changes that are invisible to the naked eye. Run the frames through an algorithm, however, and these biological cues can be picked up. For example, as blood moves through the veins, the color of the skin and its reflectance change over time because of the hemoglobin content of the blood. These changes are measured as photoplethysmography (PPG) signals, and many techniques have been developed to track them.
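To make the idea concrete, here is a minimal sketch of one common way to recover a pulse-like signal from video: average the green channel of a skin patch in every frame, then look for the strongest frequency in a plausible heart-rate band. It is not the extraction method used in the paper; the function names, the 0.7 to 4 Hz band, and the synthetic frames are assumptions made for illustration.

```python
import numpy as np


def green_ppg(frames):
    """frames: (T, H, W, 3) RGB skin patches -> zero-mean green average per frame."""
    signal = frames[..., 1].reshape(frames.shape[0], -1).mean(axis=1)
    return signal - signal.mean()


def dominant_bpm(signal, fps, low_hz=0.7, high_hz=4.0):
    """Return the strongest frequency inside the assumed heart-rate band, in bpm."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    power = np.abs(np.fft.rfft(signal)) ** 2
    band = (freqs >= low_hz) & (freqs <= high_hz)
    return 60.0 * freqs[band][np.argmax(power[band])]


# Synthetic stand-in for a skin patch: constant color plus a faint
# 1.2 Hz oscillation in the green channel (a simulated pulse).
fps, seconds, pulse_hz = 30, 10, 1.2
t = np.arange(fps * seconds) / fps
frames = np.full((t.size, 8, 8, 3), 120.0)
frames[..., 1] += 0.5 * np.sin(2 * np.pi * pulse_hz * t)[:, None, None]

bpm = dominant_bpm(green_ppg(frames), fps)  # about 72 beats per minute
```

The oscillation here is half an intensity level on a scale of 255, far too small to see, yet it dominates the spectrum once the frames are averaged.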
For this experiment, the researchers extracted PPG cells from real and fake videos and fed them to a state-of-the-art classification network for detecting the deep fakes in a video.
“We can interpret these biological signals as fake heartbeats that contain a signature transformation.”
To capture the characteristics of biological signals consistently, the researchers introduced a novel spatiotemporal block called the PPG cell. Generating a PPG cell starts with finding the face in every frame using a face detector. Each PPG cell then combines several raw PPG signals and their power spectra, extracted from a fixed window of frames.
Facial movements, illumination variations, and facial occlusions can all interfere with the extraction. To extract the signals robustly, the researchers used the face region between the eyes and the mouth, maximising skin exposure.
In the paper's figure, the top row shows sample frames and the bottom row shows their PPG cells. Real videos are on the left, and their deep fakes, one per generative model, are on the right.
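A rough sketch of how such a cell could be assembled is shown below, assuming the face ROI has already been cropped from each frame in the window. The 4x8 grid of sub-regions, the green-channel average used as a stand-in for the raw PPG signal, and the function name `ppg_cell` are illustrative assumptions, not details confirmed by the article.

```python
import numpy as np


def ppg_cell(roi_frames, grid=(4, 8)):
    """Build a 2-D 'cell' from a fixed window of face-ROI frames.

    roi_frames: (T, H, W, 3) RGB crops of the face region.
    Returns an array of shape (2 * rows * cols, T): one raw signal per
    sub-region, stacked on top of the power spectrum of each signal.
    """
    n_frames, height, width, _ = roi_frames.shape
    rows, cols = grid
    h, w = height // rows, width // cols

    # Crop so the ROI divides evenly, then average green per sub-region.
    green = roi_frames[:, : h * rows, : w * cols, 1]
    blocks = green.reshape(n_frames, rows, h, cols, w).mean(axis=(2, 4))

    # One zero-mean time series per sub-region: shape (rows * cols, T).
    signals = blocks.reshape(n_frames, rows * cols).T
    signals = signals - signals.mean(axis=1, keepdims=True)

    # Power spectrum of each series, kept at length T so both halves stack.
    spectra = np.abs(np.fft.fft(signals, axis=1)) ** 2
    return np.vstack([signals, spectra])


# Random pixels standing in for a 64-frame window of 32x64 ROI crops.
window = np.random.default_rng(0).random((64, 32, 64, 3))
cell = ppg_cell(window)
```

Stacking the time-domain signals and their spectra into one 2-D array is what lets an ordinary image classifier consume the result.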
The model is a convolutional neural network built from VGG blocks. The system is implemented in Python, using the OpenFace library for face detection, OpenCV for image processing, and Keras for the neural network. The researchers state that most of the training and testing was performed on a desktop with a single NVIDIA GTX 1060 GPU.
The learning setting is built on the FaceForensics++ (FF) dataset with a 70%/30% split: 700 real videos and 2,800 deep fakes for training, and 300 real videos and 1,200 deep fakes for testing.
The authors report that their approach detects fake videos with 97.29% accuracy and identifies the source model with 93.39% accuracy.
The paper's key contributions are:
- A novel approach for deep fake source detection.
- The idea that projecting generative noise into biological signal space can create a unique signature per model.
- An advanced general deep fake detector that can outperform current approaches at finding fakes while also predicting the source generative model.