Deepfake technology is gaining prominence in the criminal underworld, Europol has warned. In the UAE, fraudsters used deepfake technology to clone a company director’s voice and steal USD 35 million. According to a Sensity report published in 2021, the number of non-consensual and harmful deepfake videos has been doubling roughly every six months; as of December 2020, 85,047 deepfake videos had been detected.
“We have reached a point where there is hardly any distinction between reality and augmentation, making it high time to take the necessary steps that ensure no harm is done via augmented reality in the coming years,” Aishwarya Srinivasan, an AI & ML Innovation Leader at IBM, said.
Therefore, it is becoming more vital by the day to have reliable techniques for deepfake detection. A research team from the University of Tokyo is doing exactly that.
The team has developed a method to detect deepfakes using Self-Blended Images (SBIs). This unique synthetic training data methodology has outperformed state-of-the-art techniques on unseen manipulations, the research paper said.
Through extensive experiments, the team found that the method improves model generalisation to unknown manipulations, particularly on DFDC and DFDCP, where existing methods suffer from the domain gap between the training and test sets. The novel approach outperformed the baseline by 4.90 and 11.78 percentage points on those two benchmarks in the cross-dataset evaluation.
Most existing methods perform well at detecting known manipulations but tend to be far less effective against unknown ones. One of the most effective ways to deal with this is to train models on synthetic data, which encourages them to learn generic features for face forgery detection.
Method
The study’s primary purpose is to detect statistical inconsistencies between altered faces and background images in deepfakes. To train their detectors, the team generated synthetic fake samples that contain common forgery traces yet are difficult to recognise as fake; these harder samples are then used to train more reliable detectors.
Deepfake generation techniques will continue to improve; hence, GAN-synthesised source images will come ever closer to pristine target images in properties such as facial landmarks and pixel statistics.
Research paper: Samples of pristine images (top row) and their SBIs (bottom row)
It involves the following three steps:
(1) A source-target generator produces pseudo source and target images from a single pristine face; these will later be blended.
(2) A mask generator then creates a grey-scale mask image with some deformations.
(3) Lastly, the source and target images are blended through the mask to obtain an SBI.
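The three steps above can be sketched roughly in code. The following is a simplified illustration with NumPy, not the authors’ implementation: the paper uses richer augmentations (colour jitter, blur, resizing) and deforms a mask built from facial landmarks, whereas this sketch stands in a brightness shift and a blurred rectangle. All function names here are illustrative.

```python
import numpy as np

def source_target_generator(image, rng):
    """Step 1: create a pseudo source/target pair from ONE pristine image.
    A brightness shift stands in for the paper's richer augmentations."""
    source = np.clip(image.astype(np.float32) + rng.uniform(-10, 10), 0, 255)
    target = image.astype(np.float32)
    return source, target

def mask_generator(height, width):
    """Step 2: create a soft grey-scale blending mask.
    A smoothed rectangle stands in for the paper's deformed
    convex hull of facial landmarks."""
    mask = np.zeros((height, width), dtype=np.float32)
    mask[height // 4: 3 * height // 4, width // 4: 3 * width // 4] = 1.0
    kernel = np.ones(9) / 9.0  # crude edge smoothing by local averaging
    for axis in (0, 1):
        mask = np.apply_along_axis(
            lambda m: np.convolve(m, kernel, mode="same"), axis, mask)
    return mask

def self_blend(image, rng):
    """Step 3: SBI = mask * source + (1 - mask) * target."""
    source, target = source_target_generator(image, rng)
    mask = mask_generator(*image.shape[:2])[..., None]
    sbi = mask * source + (1.0 - mask) * target
    return np.clip(sbi, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
pristine = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
fake = self_blend(pristine, rng)
print(fake.shape)  # (64, 64, 3)
```

Because source and target come from the same face, the resulting artefacts are generic blending traces rather than the fingerprints of any one forgery method, which is what lets the detector generalise.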
Research paper: Overview of generating an SBI
In their cross-dataset evaluation, the team trained the model on FF++ and evaluated it on CDF, DFD, DFDC, DFDCP, and FFIW under standard protocols, deliberately exposing the detector to unseen domains. “Our approach surpasses or is at least comparable to the state-of-the-art methods on all test sets despite its simplicity,” the paper said.
The model has also been tested against discriminative attention models (DAM) and fully temporal convolution networks (FTCN).
Limitations
Despite strong results in cross-dataset and cross-manipulation evaluations, the approach has some limitations. First, the model cannot capture temporal inconsistencies across video frames, so sophisticated deepfake generation techniques that leave few spatial artefacts may evade detection. Further, the model does not handle whole-image synthesis, because it defines a fake image as one whose face region or background has been manipulated. Evaluated on a set of 20k images sampled from the FFHQ dataset and StyleGAN synthesis, its AUC was only 69.11 per cent, the paper said.
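AUC, the metric quoted throughout these evaluations, is the area under the ROC curve: the probability that the detector scores a randomly chosen fake higher than a randomly chosen pristine image. A minimal, dependency-free way to compute it from detector scores (a generic illustration, not the authors’ evaluation code) uses the rank/Mann-Whitney formulation:

```python
def auc(labels, scores):
    """AUC via the Mann-Whitney formulation: fraction of
    (fake, pristine) pairs where the fake scores higher
    (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: scores for three fakes (label 1) and three pristine images (label 0).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(round(auc(labels, scores), 4))  # 0.8889
```

A 69.11 per cent AUC means the detector ranks a StyleGAN-synthesised image above a real one only about 69 per cent of the time, not far from the 50 per cent of random guessing, which is why whole-image synthesis is listed as a limitation.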
Research Paper: Typical artefacts on forged faces
Detection of deepfakes more crucial by the day
As AI keeps making groundbreaking advancements, the potential for criminal exploitation also increases. Currently, deepfakes are mostly used for entertainment, but there is a darker side to this evolving technology. According to a report, the cost associated with deepfake scams exceeded USD 250 million in 2020. Deepfake tech poses a serious threat in the political sphere and in cybersecurity, so considerable emphasis is being placed on developing detection methodologies.
Delving into the identification of deepfakes, Srinivasan said, “Considering the advanced technology needed to even classify them as fakes, laymen, if caught up in deep fake treachery, would not even be able to prove themselves easily.”
Last year, the Chinese government lost as much as USD 76 million to criminals who manipulated personal data and fed the facial recognition systems with deepfake videos. This was just one example of how a biometric hack could lead to catastrophic results.