Machine learning models are exposed to attacks that tamper with their predictions. Such attacks on deployed models have been observed repeatedly and need to be addressed carefully. AI security matters most for enterprise AI systems, where data is stored mostly in tabular form and data privacy policies are at stake. The Adversarial Robustness Toolbox (ART) is a Python library that gives developers and researchers one of the most complete sets of tools for evaluating the robustness of machine learning models, including deep neural networks, against adversarial attacks. Open-sourced by IBM, ART supports techniques for defending against adversarial attacks on models written in TensorFlow, Keras, PyTorch, scikit-learn, MXNet, XGBoost, LightGBM, CatBoost and other machine learning frameworks. It can be applied to many kinds of data, including images, video, tables and audio. It is cross-platform and supports machine learning tasks such as classification, speech recognition, object detection, generation and certification.
ART has attracted many developers since its first release. Its latest version, v1.5, makes it possible to evaluate and defend AI models and applications against the four adversarial threats of inference, extraction, poisoning and evasion with a single unified library.
The new version extends the supported ML tasks beyond simple classification models to include object detection, automatic speech recognition (ASR), generative adversarial networks (GAN) and robustness certification, and it is compatible with more popular ML frameworks so that users are not locked into a single one. It also covers the threats of extraction, where an attacker steals a model by querying it, and inference, where an attacker acquires private information about a model’s training data. There are three types of inference attack, each targeting a different aspect of the privacy of the training data. In membership inference, ART lets you reproduce a malicious attacker trying to determine whether a specific record, e.g. that of a person, was part of an ML model’s training data. Such attacks can be harmful because they expose sensitive private information to anyone who merely has access to the trained ML model. The attribute inference attack aims to recover the original attribute values of an existing training record, given only access to the trained model and knowledge of a few of the record’s other features. For example, an ML model trained on demographic data that is attacked with attribute inference could expose a person’s exact date of birth and wages. Lastly, in model inversion, attackers invert a trained ML model by reconstructing representative averages of the features of the training records.
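To make the membership inference idea concrete, here is a minimal sketch using only the standard library. It does not use ART's API: the `NearestNeighbourModel` class, the `infer_membership` helper and the toy (age, income) records are all hypothetical, and the model is deliberately overfit so that the effect is easy to see.

```python
import math
import random


class NearestNeighbourModel:
    """Hypothetical overfit model: confidence depends on distance to training data."""

    def __init__(self, training_records):
        self._train = list(training_records)

    def confidence(self, record):
        # Distance 0 (a memorized training record) gives confidence 1.0;
        # anything further away gives a strictly lower confidence.
        nearest = min(math.dist(record, seen) for seen in self._train)
        return 1.0 / (1.0 + nearest)


def infer_membership(model, record, threshold=0.9):
    """Threshold attack: guess 'member' when the model is very confident."""
    return model.confidence(record) >= threshold


random.seed(0)
# Toy population of (age, income) records; assumed data for illustration only.
population = [(age, income) for age in range(20, 60) for income in (30, 50, 70)]
train = random.sample(population, 40)
train_set = set(train)
model = NearestNeighbourModel(train)

# The attacker only calls model.confidence(), never sees `train` directly.
correct = sum(infer_membership(model, rec) == (rec in train_set) for rec in population)
accuracy = correct / len(population)
print(f"membership attack accuracy: {accuracy:.2f}")
```

Real models leak far less cleanly than this toy, which is why ART's membership inference attacks estimate the decision rule rather than hard-coding a threshold.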
Adversarial Action Recognition Attack
This example demonstrates how to use the ART library to mount an adversarial attack on video action recognition. It uses GluonCV and MXNet for video action recognition, with a pre-trained MXNet model for classification, specifically i3d_resnet50_v1_ucf101. The input is a short video clip of a basketball action taken from the UCF101 dataset, and the first step is to show that the model classifies this clip correctly.
# Initial working stages
- Download the sample basketball video.
- Load the pre-trained action recognition model.
- Show that the model correctly classifies the video action as playing basketball.
# Loading Model and Basketball Sample
import os
import tempfile

import decord
from gluoncv import utils
from gluoncv.data.transforms import video
from gluoncv.model_zoo import get_model
from gluoncv.utils.filesystem import try_import_decord
import imageio
from matplotlib.image import imsave
import matplotlib.pyplot as plt
import mxnet as mx
from mxnet import gluon, nd, image
from mxnet.gluon.data.vision import transforms
import numpy as np

from art.attacks.evasion import FastGradientMethod, FrameSaliencyAttack
from art import config
from art.defences.preprocessor import VideoCompression
from art.estimators.classification import MXClassifier
# setting global variables
PRETRAINED_MODEL_NAME = 'i3d_resnet50_v1_ucf101'
VIDEO_SAMPLE_URI = 'https://github.com/bryanyzhu/tiny-ucf101/raw/master/v_Basketball_g01_c01.avi'
# setting seed
np.random.seed(123)


def predict_top_k(video_input, model, k=5, verbose=True):
    pred = model(nd.array(video_input))
    classes = model.classes
    ind = nd.topk(pred, k=k)[0].astype('int')
    if verbose:
        msg = "The sample video clip is"
        for i in range(k):
            msg += f"\n\t[{classes[ind[i].asscalar()]}], with probability {nd.softmax(pred)[0][ind[i]].asscalar():.3f}."
        print(msg)
    return ind


def sample_to_gif(sample, output="sample.gif", path=config.ART_DATA_PATH, postprocess=None):
    frame_count = sample.shape[1]
    output_path = os.path.join(path, output)
    with tempfile.TemporaryDirectory() as tmpdir, imageio.get_writer(output_path, mode='I') as writer:
        for frame in range(frame_count):
            file_path = os.path.join(tmpdir, f"{frame}.png")
            imsave(file_path, np.transpose(sample[:,frame,:,:], (1,2,0)))
            writer.append_data(imageio.imread(file_path))
    return output_path
# downloading sample video
decord = try_import_decord()
video_fname = utils.download(VIDEO_SAMPLE_URI, path=config.ART_DATA_PATH)
video_reader = decord.VideoReader(video_fname)
frame_id_list = range(0, 64, 2)
video_data = video_reader.get_batch(frame_id_list).asnumpy()
video_sample_lst = [video_data[vid, :, :, :] for vid, _ in enumerate(frame_id_list)]
# preprocessing the benign video sample
# per-channel mean/std must match the values passed to the ART classifier further down
transform_fn = video.VideoGroupValTransform(size=224, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
sample = np.stack(transform_fn(video_sample_lst), axis=0)
sample = sample.reshape((-1,) + (32, 3, 224, 224))
sample = np.transpose(sample, (0, 2, 1, 3, 4))
print(f"`{video_fname}` has been downloaded and preprocessed.")
`/home/hessel/.art/data/v_Basketball_g01_c01.avi` has been downloaded and preprocessed.
# loading pretrained model
model = get_model(PRETRAINED_MODEL_NAME, nclass=101, pretrained=True)
print(f"`{PRETRAINED_MODEL_NAME}` model was successfully loaded.")
`i3d_resnet50_v1_ucf101` model was successfully loaded.
# evaluating model on basketball video sample
_ = predict_top_k(sample, model)
The video sample clip is
    [Basketball], with probability 0.725.
    [TennisSwing], with probability 0.212.
    [VolleyballSpiking], with probability 0.036.
    [SoccerJuggling], with probability 0.012.
    [TableTennisShot], with probability 0.007.
The model correctly classifies the given video sample as playing basketball.
# Creating Adversarial Attack
Now we can use the ART library to mount an adversarial attack with the Fast Gradient Method. The attack corrupts the video sample so that it is misclassified. The adversarial example is also converted into a GIF.
Adversarial Basketball
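The idea behind the Fast Gradient Method is to take one step of size eps in the direction of the sign of the loss gradient with respect to the input. The following is a minimal sketch of that step on a toy logistic-regression model; it is not ART's implementation, and the weights `w`, the input `x` and the helper names are assumptions made up for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def loss_and_grad(x, y, w, b):
    """Binary cross-entropy of a logistic model and its gradient w.r.t. the input x."""
    p = sigmoid(w @ x + b)
    loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    grad_x = (p - y) * w  # d(loss)/dx for logistic regression
    return loss, grad_x


def fgm_linf(x, y, w, b, eps, clip_values=(0.0, 1.0)):
    """One-step L-infinity attack: x_adv = clip(x + eps * sign(grad_x loss))."""
    _, grad_x = loss_and_grad(x, y, w, b)
    x_adv = x + eps * np.sign(grad_x)
    return np.clip(x_adv, *clip_values)


rng = np.random.default_rng(123)
w = rng.normal(size=16)        # hypothetical model weights
b = 0.0
x = rng.uniform(0.2, 0.8, 16)  # benign input, kept away from the clip bounds
y = 1.0 if sigmoid(w @ x + b) >= 0.5 else 0.0  # label = the model's own prediction

eps = 8 / 255
x_adv = fgm_linf(x, y, w, b, eps)

loss_clean, _ = loss_and_grad(x, y, w, b)
loss_adv, _ = loss_and_grad(x_adv, y, w, b)
print(f"max |perturbation| = {np.abs(x_adv - x).max():.4f}")
print(f"loss before = {loss_clean:.4f}, loss after = {loss_adv:.4f}")
```

The perturbation never exceeds eps in any coordinate, yet the loss on the original label goes up; on a deep video model the same step is what turns the basketball clip into a misclassified one.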
# preprocessing the adversarial sample video input
transform_fn_unnormalized = video.VideoGroupValTransform(size=224, mean=[0, 0, 0], std=[1, 1, 1])
adv_sample_input = np.stack(transform_fn_unnormalized(video_sample_lst), axis=0)
adv_sample_input = adv_sample_input.reshape((-1,) + (32, 3, 224, 224))
adv_sample_input = np.transpose(adv_sample_input, (0, 2, 1, 3, 4))
# wrapping model in a ART classifier
model_wrapper = gluon.nn.Sequential()
with model_wrapper.name_scope():
    model_wrapper.add(model)
# preparing the mean and std arrays for ART classifier preprocessing
mean = np.array([0.485, 0.456, 0.406] * (32 * 224 * 224)).reshape((3, 32, 224, 224), order='F')
std = np.array([0.229, 0.224, 0.225] * (32 * 224 * 224)).reshape((3, 32, 224, 224), order='F')

classifier_art = MXClassifier(
    model=model_wrapper,
    loss=gluon.loss.SoftmaxCrossEntropyLoss(),
    input_shape=(3, 32, 224, 224),
    nb_classes=101,
    preprocessing=(mean, std),
    clip_values=(0, 1),
    channels_first=True,
)
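The `order='F'` reshape used for `mean` and `std` can look opaque. As a small check, with the frame and image sizes reduced (an assumption made only to keep it fast), the sketch below shows that the construction simply places one value per channel along the first axis, which is the same as broadcasting a `(3, 1, 1, 1)` array.

```python
import numpy as np

channel_mean = [0.485, 0.456, 0.406]
frames, height, width = 4, 5, 6  # reduced from (32, 224, 224) for a quick check

# Construction from the walkthrough: repeat the triple, then fill column-major,
# so the first (channel) index cycles fastest through the repeated values.
tiled = np.array(channel_mean * (frames * height * width)).reshape(
    (3, frames, height, width), order="F"
)

# Equivalent view: one value per channel, broadcast over frames, height and width.
broadcast = np.broadcast_to(
    np.array(channel_mean).reshape(3, 1, 1, 1), (3, frames, height, width)
)

same = np.array_equal(tiled, broadcast)
print(same)
```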
# verifying whether the ART classifier predictions are consistent with the original model:
pred = nd.array(classifier_art.predict(adv_sample_input))
ind = nd.topk(pred, k=5)[0].astype('int')
msg = "The video sample clip is classified"
for i in range(len(ind)):
    msg += f"\n\t[{model.classes[ind[i].asscalar()]}], with probability {nd.softmax(pred)[0][ind[i]].asscalar():.3f}."
print(msg)
The video sample clip is classified to be
    [Basketball], with probability 0.725.
    [TennisSwing], with probability 0.212.
    [VolleyballSpiking], with probability 0.036.
    [SoccerJuggling], with probability 0.012.
    [TableTennisShot], with probability 0.007.
# crafting adversarial attack with FGM
epsilon = 8/255
fgm = FastGradientMethod(
    classifier_art,
    eps=epsilon,
)

%%time
adv_sample = fgm.generate(
    x=adv_sample_input
)
CPU times: user 2min 44s, sys: 826 ms, total: 2min 45s
Wall time: 2min 45s
# printing results
_ = predict_top_k((adv_sample-mean)/std, model)
The video sample clip is classified to be
    [ThrowDiscus], with probability 0.266.
    [Hammering], with probability 0.244.
    [TennisSwing], with probability 0.155.
    [HulaHoop], with probability 0.082.
    [JavelinThrow], with probability 0.055.
# saving adversarial example to gif:
adversarial_gif = sample_to_gif(np.squeeze(adv_sample), "adversarial_basketball.gif")
print(f"`{adversarial_gif}` has been successfully created.")
`/home/hessel/.art/data/adversarial_basketball.gif` has been successfully created.
# Creating Sparse Adversarial Attack
Next, the Frame Saliency Attack is used to create a sparse adversarial example. The final result is shown in the GIF. Here, perturbing a single frame is enough to cause a misclassification.
adversarial_basketball_sparse.gif
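The general idea of a sparse, frame-wise attack can be sketched as follows: compute a fully perturbed clip once, rank the frames by the magnitude of the loss gradient, and copy perturbed frames into the clean clip one at a time until the prediction changes. This is a simplified toy, not ART's `FrameSaliencyAttack`; the linear `predict` model, the margin-style gradient and the weights `w` are hypothetical and constructed so that one frame dominates. The toy clip has layout (channels, frames, height, width) without a batch axis, so the frame axis is 1 here rather than 2.

```python
import numpy as np


def predict(video, w):
    """Toy linear classifier: class 1 if the weighted sum is positive, else 0."""
    return int(np.sum(w * video) > 0)


def input_gradient(y, w):
    """Gradient of a margin loss w.r.t. the input; it pushes the score away from y."""
    return -w if y == 1 else w


def sparse_frame_attack(video, y, w, eps):
    """Copy adversarial frames into the clip, most salient first, until misclassified."""
    grad = input_gradient(y, w)
    fully_perturbed = np.clip(video + eps * np.sign(grad), 0.0, 1.0)  # one FGM step

    # Saliency of a frame: mean absolute gradient over channels, height and width.
    saliency = np.abs(grad).mean(axis=(0, 2, 3))

    adv = video.copy()
    perturbed_frames = []
    for frame in np.argsort(-saliency):
        adv[:, frame] = fully_perturbed[:, frame]
        perturbed_frames.append(int(frame))
        if predict(adv, w) != y:
            break
    return adv, perturbed_frames


# Toy clip: 3 channels, 8 frames, 4x4 pixels, all mid-grey.
video = np.full((3, 8, 4, 4), 0.5)
# Hypothetical weights: tiny everywhere except frame 5, which dominates the score.
w = np.full(video.shape, 0.001)
checkerboard = np.indices((3, 4, 4)).sum(axis=0) % 2 == 0
w[:, 5] = np.where(checkerboard, 1.0, -1.0)

y = predict(video, w)
adv, perturbed_frames = sparse_frame_attack(video, y, w, eps=8 / 255)
changed = int(np.any(adv != video, axis=(0, 2, 3)).sum())
print(f"perturbed frames: {perturbed_frames}, prediction {y} -> {predict(adv, w)}")
```

In this constructed case the most salient frame alone is enough to flip the prediction, which mirrors the single perturbed frame reported for the basketball clip.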
# Frame Saliency Attack. Note: we specify here the frame axis, which is 2.
fsa = FrameSaliencyAttack(
    classifier_art,
    fgm,
    "iterative_saliency",
    frame_index = 2
)

%%time
adv_sample_sparse = fsa.generate(
    x=adv_sample_input
)
CPU times: user 5min 54s, sys: 700 ms, total: 5min 55s
Wall time: 5min 56s
# printing the resulting predictions:
_ = predict_top_k((adv_sample_sparse-mean)/std, model)
The video sample clip is classified to be
    [TennisSwing], with probability 0.497.
    [Basketball], with probability 0.425.
    [VolleyballSpiking], with probability 0.040.
    [SoccerJuggling], with probability 0.014.
    [TableTennisShot], with probability 0.009.
# Again saving the adversarial example to gif:
adversarial_sparse_gif = sample_to_gif(np.squeeze(adv_sample_sparse), "adversarial_basketball_sparse.gif")
print(f"`{adversarial_sparse_gif}` has been successfully created.")
`/home/hessel/.art/data/adversarial_basketball_sparse.gif` has been successfully created.
# counting the number of perturbed frames:
x_diff = adv_sample_sparse - adv_sample_input
x_diff = np.swapaxes(x_diff, 1, 2)
x_diff = np.reshape(x_diff, x_diff.shape[:2] + (np.prod(x_diff.shape[2:]), ))
x_diff_norm = np.sign(np.round(np.linalg.norm(x_diff, axis=-1), decimals=4))
print(f"Number of perturbed frames: {int(np.sum(x_diff_norm))}")
Number of perturbed frames: 1
# Applying H.264 compression defence
Next, VideoCompression is applied as a simple input preprocessing defence. The defence is intended to yield correct predictions when applied to either the original or the adversarial video input.
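The intuition for why lossy compression can help is that it discards small variations in pixel values, and adversarial perturbations are small by design. The sketch below uses plain value quantization as a stand-in for H.264; it is not what `VideoCompression` does internally. The `quantize` helper and the random clip are hypothetical, and the clean clip is placed exactly on the quantization grid so that the effect is clear-cut.

```python
import numpy as np


def quantize(x, levels=8):
    """Lossy stand-in for compression: snap each value in [0, 1] to one of `levels` values."""
    return np.round(x * (levels - 1)) / (levels - 1)


rng = np.random.default_rng(0)
# Clean toy clip (channels, frames, height, width), already on the quantization grid.
clean = quantize(rng.uniform(0, 1, size=(3, 4, 8, 8)))

# Perturbation of magnitude eps in every pixel, smaller than half a quantization step.
eps = 8 / 255
perturbed = np.clip(clean + eps * rng.choice([-1.0, 1.0], size=clean.shape), 0, 1)

restored = quantize(perturbed)
print(f"max deviation before defence: {np.abs(perturbed - clean).max():.4f}")
print(f"max deviation after defence:  {np.abs(restored - clean).max():.4f}")
```

A real codec offers no such guarantee, since natural video does not sit on a grid and larger perturbations survive compression, so the defence should be read as raising the cost of an attack rather than removing it.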
# initializing VideoCompression defense
video_compression = VideoCompression(video_format="avi", constant_rate_factor=30, channels_first=True)
# applying defense to the original input
adv_sample_input_compressed = video_compression(adv_sample_input * 255)[0] / 255
# applying defense to the sparse adversarial sample
adv_sample_sparse_compressed = video_compression(adv_sample_sparse * 255)[0] / 255
# printing the resulting predictions on compressed original input
_ = predict_top_k((adv_sample_input_compressed-mean)/std, model)
# printing the resulting predictions on sparse adversarial sample
_ = predict_top_k((adv_sample_sparse_compressed-mean)/std, model)
The video sample clip is classified to be
    [Basketball], with probability 0.512.
    [TennisSwing], with probability 0.439.
    [VolleyballSpiking], with probability 0.021.
    [TableTennisShot], with probability 0.012.
    [SoccerJuggling], with probability 0.008.

The video sample clip is classified to be
    [Basketball], with probability 0.711.
    [TennisSwing], with probability 0.223.
    [VolleyballSpiking], with probability 0.028.
    [SoccerJuggling], with probability 0.012.
    [TableTennisShot], with probability 0.009.
EndNotes
Here’s a detailed video from IBM Research explaining how the Adversarial Robustness Toolbox works.