Hands-On Guide To Adversarial Robustness Toolbox (ART): Protect Your Neural Networks Against Hacking

The Adversarial Robustness Toolbox (ART) is a Python library that gives developers and researchers a comprehensive set of tools for evaluating the robustness of deep neural networks against adversarial attacks.

Machine learning models are exposed to attacks that tamper with their predictions. Such attacks on deployed models have been seen time and again and need to be addressed. AI security matters most for enterprise AI systems, where data is mostly stored in tabular form and data privacy policies are at stake. The Adversarial Robustness Toolbox (ART), open-sourced by IBM, is a Python library for evaluating and defending machine learning models against adversarial attacks. It supports models written in TensorFlow, Keras, PyTorch, scikit-learn, MXNet, XGBoost, LightGBM, CatBoost and other machine learning frameworks. It can be applied to many kinds of data, including images, video, tables and audio. It is cross-platform and supports machine learning tasks such as classification, speech recognition, object detection, generation and certification.

ART has attracted many developers since its first release. Version 1.5, the latest at the time of writing, lets users evaluate and defend AI models and applications against four adversarial threats (inference, extraction, poisoning and evasion) within a single unified library.

The new version extends the supported ML tasks beyond simple classification to include object detection, automatic speech recognition (ASR), generative adversarial networks (GANs) and robustness certification, and it is compatible with more of the popular ML frameworks so that users are not locked into a single one. It also covers the threats of extraction, where the attacker steals a model through model queries, and inference, where the attacker acquires private information about a model's training data. There are three types of inference attack, each targeting a different aspect of training-data privacy. In membership inference, ART lets users reproduce a malicious attacker trying to determine whether a specific record, e.g. of a person, was part of an ML model's training data. Such attacks can be harmful because they expose sensitive private information using nothing more than access to a trained ML model. The attribute inference attack aims to recover the original attribute values of an existing record in the training data, given only access to the trained model and knowledge of a few of the record's other features. For example, an ML model trained on demographic data and attacked with attribute inference could expose a person's exact date of birth and wages. Lastly, in model inversion, attackers invert a trained ML model by reconstructing representative averages of the features of its training records.
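To make the extraction threat concrete, here is a minimal NumPy sketch. It is not ART code. It assumes a hypothetical victim that is a plain linear scorer reachable only through a query function; `victim_predict`, `extract_linear_model` and the secret parameters are invented for illustration. Under that assumption, least squares over query/response pairs is enough to recover the hidden weights.

```python
import numpy as np


def victim_predict(x):
    """Black-box victim (hypothetical): a linear scorer with hidden parameters."""
    secret_w = np.array([0.8, -1.5, 0.3])
    secret_b = 0.25
    return x @ secret_w + secret_b


def extract_linear_model(query_fn, n_features, n_queries=20, seed=0):
    """Estimate weights and bias of a linear model from query/response pairs."""
    rng = np.random.default_rng(seed)
    queries = rng.normal(size=(n_queries, n_features))
    responses = query_fn(queries)
    # Append a column of ones so the bias is estimated together with the weights.
    design = np.hstack([queries, np.ones((n_queries, 1))])
    solution, *_ = np.linalg.lstsq(design, responses, rcond=None)
    return solution[:-1], solution[-1]


stolen_w, stolen_b = extract_linear_model(victim_predict, n_features=3)
print("recovered weights:", np.round(stolen_w, 3))
print("recovered bias:", round(float(stolen_b), 3))
```

Real models are rarely linear, so practical extraction attacks train a substitute model on the query responses instead of solving for the parameters exactly.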


Adversarial Action Recognition Attack

This example demonstrates how to use ART to mount an adversarial attack on video action recognition. It uses GluonCV and MXNet for action recognition, with a pre-trained MXNet model, specifically i3d_resnet50_v1_ucf101, as the classifier. The input is a short video clip of a basketball action taken from the UCF101 dataset. The first step is to show that the model classifies this clip correctly.

# Initial working stages  

  • Download the basketball video sample.
  • Load the pre-trained action recognition model.
  • Show that the model correctly classifies the video action as playing basketball.

# Loading Model and Basketball Sample

 import os
 import tempfile
 import decord
 from gluoncv import utils
 from gluoncv.data.transforms import video
 from gluoncv.model_zoo import get_model
 from gluoncv.utils.filesystem import try_import_decord
 import imageio
 from matplotlib.image import imsave
 import matplotlib.pyplot as plt
 import mxnet as mx
 from mxnet import gluon, nd, image
 from mxnet.gluon.data.vision import transforms
 import numpy as np
 from art.attacks.evasion import FastGradientMethod, FrameSaliencyAttack
 from art import config
 from art.defences.preprocessor import VideoCompression
 from art.estimators.classification import MXClassifier 

# setting global variables

 PRETRAINED_MODEL_NAME = 'i3d_resnet50_v1_ucf101'
 VIDEO_SAMPLE_URI = 'https://github.com/bryanyzhu/tiny-ucf101/raw/master/v_Basketball_g01_c01.avi' 

# defining helper functions

 def predict_top_k(video_input, model, k=5, verbose=True):
     # forward pass; pred holds one row of class scores per clip
     pred = model(nd.array(video_input))
     classes = model.classes
     ind = nd.topk(pred, k=k)[0].astype('int')
     if verbose:
         msg = "The video sample clip is classified to be"
         for i in range(k):
             msg += f"\n\t[{classes[ind[i].asscalar()]}], with probability {nd.softmax(pred)[0][ind[i]].asscalar():.3f}."
         print(msg)
     return ind

 def sample_to_gif(sample, output="sample.gif", path=config.ART_DATA_PATH, postprocess=None):
     # sample has shape (channels, frames, height, width)
     frame_count = sample.shape[1]
     output_path = os.path.join(path, output)
     with tempfile.TemporaryDirectory() as tmpdir, imageio.get_writer(output_path, mode='I') as writer:
         for frame in range(frame_count):
             file_path = os.path.join(tmpdir, f"{frame}.png")
             imsave(file_path, np.transpose(sample[:,frame,:,:], (1,2,0)))
             # append the saved frame to the GIF; without this the writer stays empty
             writer.append_data(imageio.imread(file_path))
     return output_path

# downloading sample video

 decord = try_import_decord()
 video_fname = utils.download(VIDEO_SAMPLE_URI, path=config.ART_DATA_PATH)
 video_reader = decord.VideoReader(video_fname)
 frame_id_list = range(0, 64, 2)
 video_data = video_reader.get_batch(frame_id_list).asnumpy()
 video_sample_lst = [video_data[vid, :, :, :] for vid, _ in enumerate(frame_id_list)] 

# preprocessing the benign video sample

 # same per-channel mean/std as the arrays later passed to the ART classifier
 transform_fn = video.VideoGroupValTransform(size=224, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
 sample = np.stack(transform_fn(video_sample_lst),  axis=0)
 sample = sample.reshape((-1,) + (32, 3, 224, 224))
 sample = np.transpose(sample, (0, 2, 1, 3, 4))
 print(f"`{video_fname}` has been downloaded and preprocessed.") 

`/home/hessel/.art/data/v_Basketball_g01_c01.avi` has been downloaded and preprocessed.
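The reshape and transpose in the preprocessing step can be hard to follow at full resolution. The sketch below replays the same layout changes on tiny dummy frames. The frame size of 8x8 and the fill values are assumptions made for readability, and each frame is assumed to be in channels-first (C, H, W) layout, as the reshape to (32, 3, 224, 224) in the listing implies.

```python
import numpy as np

frames, channels, height, width = 32, 3, 8, 8  # tiny spatial size for illustration

# One channels-first array per frame; frame t is filled with the value t
# so that its position can be tracked after the layout changes.
frame_list = [
    np.full((channels, height, width), t, dtype=np.float32) for t in range(frames)
]

clip = np.stack(frame_list, axis=0)                         # (T, C, H, W)
clip = clip.reshape((-1, frames, channels, height, width))  # (N, T, C, H, W)
clip = np.transpose(clip, (0, 2, 1, 3, 4))                  # (N, C, T, H, W)

print(clip.shape)
# The frame index now lives on axis 2, which is the axis later passed
# to the sparse attack as the frame axis.
print(clip[0, 0, 5, 0, 0])
```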

# loading pretrained model

 model = get_model(PRETRAINED_MODEL_NAME, nclass=101, pretrained=True)
 print(f"`{PRETRAINED_MODEL_NAME}` model was successfully loaded.") 

`i3d_resnet50_v1_ucf101` model was successfully loaded.

# evaluating model on basketball video sample

_ = predict_top_k(sample, model)

 The video sample clip is classified to be
 [Basketball], with probability 0.725.
 [TennisSwing], with probability 0.212.
 [VolleyballSpiking], with probability 0.036.
 [SoccerJuggling], with probability 0.012.
 [TableTennisShot], with probability 0.007. 

For the given video sample, the model correctly classifies the action as playing basketball.
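The ranking printed above comes from a softmax over the model's class scores followed by a top-k selection. A framework-free sketch of that step is shown here. The scores and the shortened class list are made up for illustration and are not outputs of the I3D model; `top_k_probabilities` is a hypothetical helper.

```python
import numpy as np


def top_k_probabilities(logits, class_names, k=3):
    """Return the k most probable (class name, probability) pairs."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    probs = np.exp(shifted) / np.sum(np.exp(shifted))
    order = np.argsort(probs)[::-1][:k]  # indices sorted by descending probability
    return [(class_names[i], float(probs[i])) for i in order]


class_names = ["Basketball", "TennisSwing", "VolleyballSpiking", "SoccerJuggling"]
logits = np.array([4.0, 2.5, 1.0, 0.2])  # made-up scores, not model output

for name, prob in top_k_probabilities(logits, class_names):
    print(f"[{name}], with probability {prob:.3f}.")
```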

# Creating Adversarial Attack

Next, ART is used to mount an adversarial attack with the Fast Gradient Method (FGM). The attack perturbs the video sample so that the model misclassifies it. The adversarial example is also converted into a GIF.

Adversarial Basketball

# preprocessing the adversarial sample video input

 transform_fn_unnormalized = video.VideoGroupValTransform(size=224, mean=[0, 0, 0], std=[1, 1, 1])
 adv_sample_input = np.stack(transform_fn_unnormalized(video_sample_lst),  axis=0)
 adv_sample_input = adv_sample_input.reshape((-1,) + (32, 3, 224, 224))
 adv_sample_input = np.transpose(adv_sample_input, (0, 2, 1, 3, 4)) 

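# wrapping model in an ART classifier

 model_wrapper = gluon.nn.Sequential()
 with model_wrapper.name_scope():
     # the body was cut off in the listing; adding the loaded model is assumed
     model_wrapper.add(model)

# preparing the mean and std arrays for ART classifier preprocessing

 mean = np.array([0.485, 0.456, 0.406] * (32 * 224 * 224)).reshape((3, 32, 224, 224), order='F')
 std = np.array([0.229, 0.224, 0.225] * (32 * 224 * 224)).reshape((3, 32, 224, 224), order='F')
 classifier_art = MXClassifier(
     # model, loss and nb_classes are required by MXClassifier but were cut off
     # in the listing; the loss function chosen here is an assumption
     model=model_wrapper,
     loss=gluon.loss.SoftmaxCrossEntropyLoss(),
     input_shape=(3, 32, 224, 224),
     nb_classes=101,
     preprocessing=(mean, std),
     clip_values=(0, 1),
 )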

# verifying whether the ART classifier predictions are consistent with the original model:

 pred = nd.array(classifier_art.predict(adv_sample_input))
 ind = nd.topk(pred, k=5)[0].astype('int')
 msg = "The video sample clip is classified to be"
 for i in range(len(ind)):
     msg += f"\n\t[{model.classes[ind[i].asscalar()]}], with probability {nd.softmax(pred)[0][ind[i]].asscalar():.3f}."
 print(msg)

 The video sample clip is classified to be
 [Basketball], with probability 0.725.
 [TennisSwing], with probability 0.212.
 [VolleyballSpiking], with probability 0.036.
 [SoccerJuggling], with probability 0.012.
 [TableTennisShot], with probability 0.007.
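The mean and std arrays passed to the classifier are built by repeating the three channel values and reshaping in column-major order. The small check below, on reduced dimensions chosen only for speed, suggests that this produces one constant per channel and matches a compact broadcastable array. Whether a given ART version accepts the compact form as `preprocessing` is not verified here.

```python
import numpy as np

channels, frames, height, width = 3, 4, 2, 2  # reduced sizes for illustration
channel_mean = [0.485, 0.456, 0.406]

# Same construction as in the walkthrough: repeat the triplet and fill the
# array in column-major ('F') order, so the channel axis varies fastest.
tiled = np.array(channel_mean * (frames * height * width)).reshape(
    (channels, frames, height, width), order="F"
)

# Compact alternative relying on NumPy broadcasting.
compact = np.array(channel_mean).reshape(channels, 1, 1, 1)

x = np.random.default_rng(0).random((1, channels, frames, height, width))

for c in range(channels):
    print(c, np.unique(tiled[c]))
print(np.allclose(x - tiled, x - compact))
```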

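# crafting adversarial attack with FGM

 epsilon = 8/255
 # the call arguments were cut off in the listing; restored from the names defined above
 fgm = FastGradientMethod(classifier_art, eps=epsilon)
 adv_sample = fgm.generate(x=adv_sample_input)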

CPU times: user 2min 44s, sys: 826 ms, total: 2min 45s

Wall time: 2min 45s

# printing results

_ = predict_top_k((adv_sample-mean)/std, model)

 The video sample clip is classified to be
 [ThrowDiscus], with probability 0.266.
 [Hammering], with probability 0.244.
 [TennisSwing], with probability 0.155.
 [HulaHoop], with probability 0.082.
 [JavelinThrow], with probability 0.055. 
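The idea behind the Fast Gradient Method can be shown without any deep learning framework: take one step of size epsilon in the direction of the sign of the loss gradient, then clip to the valid input range. The sketch below assumes a hypothetical logistic-regression model with hand-picked weights; it illustrates the update rule only and is not how ART implements the attack internally.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def loss_and_grad(x, y, w, b):
    """Binary cross-entropy of a logistic model and its gradient w.r.t. the input."""
    p = sigmoid(x @ w + b)
    loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    grad = (p - y) * w
    return loss, grad


def fast_gradient_method(x, y, w, b, eps, clip_min=0.0, clip_max=1.0):
    """One signed-gradient step of size eps, clipped to the valid input range."""
    _, grad = loss_and_grad(x, y, w, b)
    return np.clip(x + eps * np.sign(grad), clip_min, clip_max)


# Hypothetical model parameters and input, chosen by hand for illustration.
w = np.array([2.0, -3.0, 1.0, 0.5])
b = -0.2
x = np.array([0.6, 0.2, 0.5, 0.4])
y = 1.0
eps = 8 / 255

x_adv = fast_gradient_method(x, y, w, b, eps)
loss_before, _ = loss_and_grad(x, y, w, b)
loss_after, _ = loss_and_grad(x_adv, y, w, b)
print("largest change per feature:", np.max(np.abs(x_adv - x)))
print("loss before:", loss_before, "loss after:", loss_after)
```

Every feature moves by at most epsilon, which is why the perturbed clip can look nearly unchanged while the prediction shifts.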

# saving adversarial example to gif:

 adversarial_gif = sample_to_gif(np.squeeze(adv_sample), "adversarial_basketball.gif")
 print(f"`{adversarial_gif}` has been successfully created.") 

`/home/hessel/.art/data/adversarial_basketball.gif` has been successfully created.

# Creating Sparse Adversarial Attack

Next, the Frame Saliency Attack is used to create a sparse adversarial example. The final result is shown in the GIF. Here, perturbing only one frame is enough to cause a misclassification.


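# Frame Saliency Attack. Note: we specify here the frame axis, which is 2.

 # the call arguments were cut off in the listing; the classifier and FGM attacker defined above are assumed
 fsa = FrameSaliencyAttack(classifier_art, fgm, frame_index=2)
 adv_sample_sparse = fsa.generate(x=adv_sample_input)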

CPU times: user 5min 54s, sys: 700 ms, total: 5min 55s

Wall time: 5min 56s

# printing the resulting predictions:

_ = predict_top_k((adv_sample_sparse-mean)/std, model)

 The video sample clip is classified to be
 [TennisSwing], with probability 0.497.
 [Basketball], with probability 0.425.
 [VolleyballSpiking], with probability 0.040.
 [SoccerJuggling], with probability 0.014.
 [TableTennisShot], with probability 0.009. 

# Again saving the adversarial example to gif:

 adversarial_sparse_gif = sample_to_gif(np.squeeze(adv_sample_sparse), "adversarial_basketball_sparse.gif")
 print(f"`{adversarial_sparse_gif}` has been successfully created.") 

`/home/hessel/.art/data/adversarial_basketball_sparse.gif` has been successfully created.

# counting the number of perturbed frames:

 x_diff = adv_sample_sparse - adv_sample_input
 x_diff = np.swapaxes(x_diff, 1, 2)
 x_diff = np.reshape(x_diff, x_diff.shape[:2] + (np.prod(x_diff.shape[2:]), ))
 x_diff_norm = np.sign(np.round(np.linalg.norm(x_diff, axis=-1), decimals=4))
 print(f"Number of perturbed frames: {int(np.sum(x_diff_norm))}") 

Number of perturbed frames: 1
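A simplified view of how a sparse, frame-wise attack can arrive at a single perturbed frame: rank frames by gradient magnitude, then copy perturbed frames into the clean clip one at a time until the prediction flips. The sketch below assumes a hypothetical linear binary classifier over a clip of 4 frames with 3 features each; all names and numbers are illustrative, and it is a reduced version of the idea, not ART's implementation.

```python
import numpy as np


def predict(clip, weights, bias):
    """Toy binary classifier on a clip of shape (frames, features)."""
    return int(np.sum(clip * weights) + bias > 0)


def sparse_frame_attack(clip, label, weights, bias, eps):
    """Perturb the most salient frames first and stop once the label flips."""
    # For this linear scorer the gradient of the score w.r.t. the clip is `weights`.
    direction = -np.sign(weights) if label == 1 else np.sign(weights)
    fully_perturbed = np.clip(clip + eps * direction, 0.0, 1.0)
    saliency = np.mean(np.abs(weights), axis=1)  # one saliency score per frame
    adv = clip.copy()
    for frame in np.argsort(-saliency):
        if predict(adv, weights, bias) != label:
            break
        adv[frame] = fully_perturbed[frame]
    return adv


def count_perturbed_frames(original, adversarial):
    diff = (adversarial - original).reshape(original.shape[0], -1)
    return int(np.sum(np.linalg.norm(diff, axis=1) > 1e-4))


weights = np.array([[0.1, 0.1, 0.1],
                    [0.2, -0.1, 0.1],
                    [3.0, -2.0, 2.5],   # frame 2 dominates the decision
                    [0.1, 0.2, -0.1]])
bias = -2.0
clip = np.full((4, 3), 0.5)
label = predict(clip, weights, bias)

adv = sparse_frame_attack(clip, label, weights, bias, eps=0.05)
print("label before:", label, "label after:", predict(adv, weights, bias))
print("perturbed frames:", count_perturbed_frames(clip, adv))
```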

# Applying H.264 compression defence

Next, VideoCompression is applied as a simple input preprocessing defence. The aim is for the model to predict correctly when the defence is applied to both the original and the adversarial video input.

# initializing VideoCompression defense

video_compression = VideoCompression(video_format="avi", constant_rate_factor=30, channels_first=True)

# applying defense to the original input

adv_sample_input_compressed = video_compression(adv_sample_input * 255)[0] / 255

# applying defense to the sparse adversarial sample

adv_sample_sparse_compressed = video_compression(adv_sample_sparse * 255)[0] / 255

# printing the resulting predictions on compressed original input

_ = predict_top_k((adv_sample_input_compressed-mean)/std, model)

# printing the resulting predictions on sparse adversarial sample

_ = predict_top_k((adv_sample_sparse_compressed-mean)/std, model)

 The video sample clip is classified to be
 [Basketball], with probability 0.512.
 [TennisSwing], with probability 0.439.
 [VolleyballSpiking], with probability 0.021.
 [TableTennisShot], with probability 0.012.
 [SoccerJuggling], with probability 0.008.
 The video sample clip is classified to be
 [Basketball], with probability 0.711.
 [TennisSwing], with probability 0.223.
 [VolleyballSpiking], with probability 0.028.
 [SoccerJuggling], with probability 0.012.
 [TableTennisShot], with probability 0.009. 
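The calls above follow a common pattern for preprocessing defences: scale the input to the range the defence works in, call it, take the first element of the returned tuple, and scale back. The sketch below mimics that pattern with a hypothetical `QuantizationDefence` that simply rounds to 8-bit values. It is a stand-in chosen because it runs without a video codec; it is not H.264 compression and not part of ART, and it only removes perturbations smaller than half a quantization step.

```python
import numpy as np


class QuantizationDefence:
    """Hypothetical stand-in preprocessor: rounds inputs to whole 8-bit values."""

    def __call__(self, x, y=None):
        # Expects values in [0, 255]; returns an (x, y) tuple so that the
        # caller picks the processed input with [0].
        return np.round(np.clip(x, 0.0, 255.0)), y


rng = np.random.default_rng(0)
# Clean clip of shape (N, C, T, H, W) whose values are exact multiples of 1/255.
clean = rng.integers(0, 256, size=(1, 3, 4, 2, 2)) / 255.0
# Perturbation smaller than half a quantization step.
noise = rng.uniform(-0.4, 0.4, size=clean.shape) / 255.0
perturbed = np.clip(clean + noise, 0.0, 1.0)

defence = QuantizationDefence()
restored = defence(perturbed * 255)[0] / 255

print("max deviation before defence:", np.max(np.abs(perturbed - clean)))
print("max deviation after defence:", np.max(np.abs(restored - clean)))
```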


Here’s a detailed video from IBM Research explaining how the Adversarial Robustness Toolbox works.

Jayita Bhattacharyya
Machine learning and data science enthusiast. Eager to learn new technology advances. A self-taught techie who loves to do cool stuff using technology for fun and worthwhile.
