While computer vision has become one of the most widely used technologies across the globe, computer vision models are not immune to threats. One of the reasons for this is the underlying lack of robustness of the models. Indrajit Kar, Principal Solution Architect at Accenture, gave a talk at CVDC 2020 on how to make AI more resilient to attacks.
As Kar shared, AI has become a new target for attackers, and instances of manipulation and adversarial attacks have increased dramatically over the last few years. Everyone from companies such as Google and Tesla to startups has been affected by adversarial attacks.
“While we celebrate advancements in AI, deep neural networks (DNNs)—the algorithms intrinsic to much of AI—have recently been proven to be at risk from attack through seemingly benign inputs. It is possible to fool DNNs by making subtle alterations to input data that often either remain undetected or are overlooked if presented to a human,” he said.
Type Of Adversarial Attacks
Alterations to images that are too small to be noticed by humans can cause DNNs to misinterpret the image content. As many AI systems take their input from external sources, such as voice recognition devices or social media uploads, this ability to be tricked by adversarial input opens a new, often intriguing, security threat. This has prompted the cybersecurity and machine learning communities to come together to address these gaps in computer vision.
One of the ways to deal with this is to build not just AI but trustworthy AI. Many available datasets, such as medical images and X-rays, pose a security risk, and there are also instances of images being patched to obscure their actual content. Creating trustworthy AI therefore helps make models more robust and less prone to attack.
Some common forms of adversarial attack are:
- Circumventing web filters with perturbed images: This applies mainly to social media providers and online marketplaces whose business models depend on external data. Alterations such as adversarial patches applied to the edge of an image can camouflage faces from AI.
- Evading fake news detection: This involves attacking the machine learning models that find duplicates of debunked stories, covering both photo-based and article-based misinformation.
- Adversarial attacks on search: These can cause targets to be wrongly associated with something damaging.
- Camouflage from real-time surveillance: An adversary might aim to avoid real-time facial detection by security systems and cameras.
- Adversarial attacks on autonomous systems
- DDoS attacks, and more…
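The adversarial-patch idea from the list above can be illustrated with a toy sketch. Real adversarial patches are optimised against a specific model; the snippet below (all array sizes and values are illustrative assumptions, not from the talk) only shows the mechanics of pasting a patch into an image corner:

```python
import numpy as np

def apply_patch(img, patch, y=0, x=0):
    """Overwrite a region of img with patch at offset (y, x)."""
    out = img.copy()
    ph, pw = patch.shape
    out[y:y + ph, x:x + pw] = patch
    return out

# Toy example: a blank 16x16 image with a 4x4 high-contrast patch
# placed in its top-left corner (the image edge, as described above).
img = np.zeros((16, 16))
patch = np.ones((4, 4))
patched = apply_patch(img, patch)
```

In a real attack the patch contents would be learned by optimisation so that any image containing the patch is misclassified, regardless of where the patch lands.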
Attacks can be broadly classified as location-specific, knowledge-specific and intent-specific, which are further subdivided into:
- Training attack: Aims to increase the number of misclassified samples at test time by injecting a small fraction of carefully designed adversarial samples into training data.
- Inference attack: The attacker manipulates input samples to evade a trained classifier at test or inference time.
- White-box attack: It exploits model-internal information. It assumes complete knowledge of the targeted model, including its parameter values, architecture, training methods and more.
- Black-box attack: The attacker has no knowledge of the internals of the model under attack.
- Targeted attack: The adversary tries to produce inputs that force the output of the classification model to be a specific target class.
- Non-targeted attack: It causes misclassification as opposed to causing classification into a specific incorrect class.
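A classic instance of a non-targeted white-box attack is the fast gradient sign method (FGSM), which nudges the input in the direction that most increases the model's loss. Below is a minimal sketch on a toy linear classifier; the random weights and the epsilon value are illustrative assumptions, not details from the talk:

```python
import numpy as np

# Toy "model": a linear classifier with random (untrained) weights.
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 3))   # 10 input features, 3 classes
b = np.zeros(3)

def predict(x):
    """Return class scores (logits) for input x."""
    return x @ W + b

def fgsm(x, true_label, epsilon=0.1):
    """Non-targeted FGSM: step in the sign of the loss gradient w.r.t. x."""
    logits = predict(x)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # For a linear model with cross-entropy loss,
    # dL/dx = W @ (probs - one_hot(true_label)).
    grad = W @ (probs - np.eye(3)[true_label])
    return x + epsilon * np.sign(grad)

x = rng.normal(size=10)
label = int(np.argmax(predict(x)))        # model's original prediction
x_adv = fgsm(x, label, epsilon=0.5)       # bounded perturbation of x
```

Because the attack reads the model's gradient directly, it is white-box; a targeted variant would instead step so as to *decrease* the loss of a chosen target class.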
Other types of attacks include the auxiliary-model attack, training data extraction, model extraction, model inversion, the strategically-timed attack, the sparse evasion attack, the source-target misclassification attack and more. Attacks may also exploit input gradients, high-dimensional input spaces, decision boundaries and more.
How To Make Computer Vision Models More Robust Against Adversarial Inputs
Kar said that making a computer vision model robust is the best way to keep attacks at bay. Techniques such as model hardening (which includes steps such as dimensionality reduction, image denoising and dropout), image augmentation, and shifting and cropping can make a model more robust, but they are not always reliable on their own.
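The input-hardening transforms mentioned above can be sketched with plain numpy. The kernel size, shift range and image dimensions below are assumptions chosen for the sketch, not recommendations from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

def denoise(img, k=3):
    """Simple mean-filter denoising with a k x k box kernel."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

def random_shift(img, max_shift=2):
    """Randomly shift the image contents (an augmentation step)."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return np.roll(np.roll(img, dy, axis=0), dx, axis=1)

# Denoising averages away small pixel-level perturbations; augmenting
# with shifted copies teaches the model invariance to such changes.
noisy = rng.normal(0.5, 0.2, size=(8, 8))
clean = denoise(noisy)
augmented = random_shift(noisy)
```

The intuition is that adversarial perturbations are small and precisely placed, so smoothing and spatial jitter tend to destroy them while leaving the true image content largely intact.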
Some of the more effective ways are:
- Adversarial training with perturbation or noise: Training on perturbed examples reduces classification errors on adversarial inputs.
- Gradient masking: It denies the attacker access to useful gradients.
- Input regularisation: It can be used to avoid the large gradients on the inputs that make networks vulnerable to attacks.
- Defence distillation: It is an adversarial training technique where the target model is used to train a smaller model that exhibits a smoother output surface.
- Ensemble adversarial learning: Multiple classifiers are trained together and combined to improve robustness.
- Feature squeezing: This reduces the search space available to an adversary by coalescing samples that correspond to many different feature vectors in the original space into a single sample.
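Feature squeezing is simple to sketch: one common squeezer is bit-depth reduction, which collapses many nearby float inputs onto the same quantised sample. The bit depth and sample values below are illustrative assumptions:

```python
import numpy as np

def squeeze_bit_depth(x, bits=4):
    """Quantise values in [0, 1] down to 2**bits - 1 levels."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

# Nine distinct inputs collapse onto at most four quantised values,
# shrinking the space an adversary can search over.
x = np.linspace(0, 1, 9)
squeezed = squeeze_bit_depth(x, bits=2)
```

A detection variant of this defence compares the model's prediction on the original and squeezed inputs and flags large disagreement as a likely adversarial example.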
Kar shared that apart from this, one of the most effective approaches companies follow nowadays is to have a Blue team and a Red team. While the Red team creates adversarial attacks, trying various methods to find crevices in the model, the Blue team comes up with defence strategies to overcome these attacks. “It is a continuous process, and companies must adopt it to make their computer vision models more robust and safe from adversarial attacks,” he said on a concluding note.