Bengaluru Researcher’s Model Accurately Spots AI-generated Profile Pictures 

The research collaboration by UC Berkeley and LinkedIn used lightweight, low-dimensional models with relatively minimal training data


IIT-Kharagpur alumnus Shivansh Mundra, who currently works at LinkedIn, along with researchers from the University of California, Berkeley, recently came up with a new technique to accurately identify fake profile images generated by GANs (generative adversarial networks). Mundra told AIM, “Upon studying thousands of profile pictures, we found specific structural patterns in the ones generated by GANs. This information was really useful to us, and what we did is we created a very simple linear model.”

The research collaboration by UC Berkeley and LinkedIn used lightweight, low-dimensional models, trained on relatively little data, that mathematically capture the characteristic structural pattern of StyleGAN faces and separate them from real profile images.

The Rise of Fake Profile Images 

LinkedIn, which hosts more than 930 million profiles of professionals connecting with each other, is also home to hundreds of thousands of fake profiles masquerading as real ones. They prey on unsuspecting users, offering jobs that don’t exist, fake tech support in exchange for money, or just traditional phishing.

Falling for the profile picture is often the first step in a social media scam. Tools exist to check whether a picture is GAN-generated or real, but they are only about 60% accurate. The UC Berkeley-LinkedIn approach identifies artificially generated profile pictures 99.6% of the time while misidentifying genuine pictures as fake only 1% of the time.

How does it work? 

Most models so far tackle the problem of identifying fake images using convolutional neural networks (CNNs), a type of deep learning model that passes the image through learned filters to pick out distinct structures or patterns; stacking multiple filters lets the network pick out a variety of specific features. This is effective only when there is a large amount of training data from which the model can learn the features common to GAN images.
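To make the CNN idea concrete, here is a minimal sketch of such a detector in PyTorch: a couple of convolutional filter banks feeding a binary real-vs-GAN classifier. The architecture, layer sizes, and data are illustrative assumptions, not the models referenced in the article.

```python
# Minimal sketch of a CNN-based fake-image detector (illustrative only).
import torch
import torch.nn as nn

class TinyGanDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # learned filters pick out local patterns
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # more filters capture richer features
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, 2)  # logits for {real, GAN-generated}

    def forward(self, x):
        h = self.features(x).flatten(1)
        return self.classifier(h)

model = TinyGanDetector()
batch = torch.randn(4, 3, 128, 128)  # stand-in for a batch of face crops
print(model(batch).shape)            # torch.Size([4, 2])
```

A network like this needs many labelled examples to learn which features separate real from generated faces, which is exactly the data requirement the paragraph above points out.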

The downside is that a GAN, by its adversarial design, keeps improving on its own output, so detectors trained on older generations quickly fall behind. “The generated images have patterns which repeat themselves, and there is less diversity in the 100,000 images we studied. The synthetic images are very structural and this information was really useful to us. With this in mind, we created a very simple model, as simple as a linear model,” explained Mundra.
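A hypothetical toy illustration of exploiting that repeating structure: average a handful of aligned GAN-generated faces into a template, then score new faces by how strongly they correlate with it. The template-and-correlation feature below is a stand-in chosen for illustration, not the exact feature used by the researchers.

```python
# Toy structural-regularity feature (hypothetical, for illustration).
import numpy as np

def template_from_faces(faces):
    """faces: array of shape (n, H, W), grayscale, already aligned and cropped."""
    return faces.mean(axis=0)

def structure_score(face, template):
    """Pearson-style correlation between a face and the synthetic-face template."""
    f = (face - face.mean()) / (face.std() + 1e-8)
    t = (template - template.mean()) / (template.std() + 1e-8)
    return float((f * t).mean())

rng = np.random.default_rng(0)
# Fake "GAN" faces share a common structure (here a simple gradient); the real face does not.
gan_faces = rng.normal(size=(50, 64, 64)) + np.linspace(0, 1, 64)
real_face = rng.normal(size=(64, 64))

template = template_from_faces(gan_faces)
print(structure_score(gan_faces[0], template))  # noticeably higher
print(structure_score(real_face, template))     # near zero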

Other methods 

When it comes to addressing this issue, there are generally two forensic methods. The first is a hypothesis-based approach, which looks for irregularities in synthetically created faces; it struggles against advanced synthesis engines that can mimic genuine features. The second is data-driven, using machine learning to distinguish natural faces from computer-generated ones; such models, however, may struggle when confronted with unfamiliar images.

The approach proposed in this paper is a hybrid of the two. It starts by identifying a distinctive geometric attribute of computer-generated faces and then applies data-driven techniques to measure it and make the detection. It relies on a lightweight, easily trainable classifier that needs only a small set of synthetic faces for training.
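Under those stated assumptions, the hybrid idea can be sketched as: measure one structural feature per image (for instance the hypothetical structure_score from the earlier sketch), then fit a lightweight linear classifier on a small labelled set. The feature values and data below are placeholders, not the paper's pipeline.

```python
# Sketch of a lightweight, easily trainable classifier on a scalar structural feature.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
# Stand-in scalar features: synthetic faces cluster high, real faces low.
synthetic_scores = rng.normal(loc=0.7, scale=0.1, size=200)
real_scores = rng.normal(loc=0.1, scale=0.1, size=200)

X = np.concatenate([synthetic_scores, real_scores]).reshape(-1, 1)
y = np.concatenate([np.ones(200), np.zeros(200)])  # 1 = GAN-generated, 0 = real

clf = LogisticRegression().fit(X, y)      # the "easily trainable" part: a simple linear model
print(clf.predict_proba([[0.65]])[0, 1])  # estimated probability the image is synthetic
```

Because the heavy lifting is done by the hand-identified structural feature, the classifier itself stays small and needs far less training data than a CNN trained end to end.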

Mundra said that to accomplish this, the team created 41,500 synthetic faces using five different synthesis engines, alongside a dataset of 100,000 real LinkedIn profile pictures. The reported accuracy applies only to human faces produced by GAN generators, though the researchers are working on models that would also pick up fake images generated by Stable Diffusion and other tools.

What’s next? 

“This is just for the GAN-based images. In other methods like diffusion or these new transformer-based methods like Stable Diffusion, which have different types of structural patterns built in them, we weren’t able to detect fake images with these simple linear models. But there are other methods we are working on which we’ll probably put out soon,” Mundra explained.


K L Krithika

K L Krithika is a tech journalist at AIM. Apart from writing tech news, she enjoys reading sci-fi and pondering impossible technologies, trying not to confuse them with reality.