Computer vision systems are often inconsistent in practice, and this raises safety concerns in real-world deployments. For instance, an autonomous car's signalling system may be disrupted during adverse weather conditions such as fog or heavy snow. Previously applied techniques have improved the performance of such systems, but they still fail in many instances, such as on corrupted or unforeseen inputs encountered during deployment. In a recent work, Microsoft Research introduced a new framework that addresses these problems by creating "unadversarial objects": inputs optimized specifically for more robust model performance. This newly proposed approach helps image recognition/classification models predict better in the face of unforeseen corruptions or distribution shifts.
Shown below is a video of how an unadversarial patch helps the model make better predictions.
How to design these robust objects?
Modern computer vision models fail to account for perturbations of their inputs, and in particular for adversarial examples. Some of these models have surpassed human-level performance on benchmarks, yet they show significant degradation when placed in unfamiliar situations. Adversarial examples are inputs deliberately perturbed to cause failures, and researchers exploit this fact by training their systems on such examples to make them more robust to these attacks. The perturbations are designed by solving an optimization problem that maximizes the model's loss with respect to the input. The unadversarial approach flips this idea: instead of crafting inputs that mislead the model, it optimizes inputs that boost performance, yielding unadversarial examples, or, better, robust objects. To achieve this, essentially the same optimization problem is solved, but in the opposite direction.
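As a minimal sketch of the two objectives in symbols (generic notation, not copied verbatim from the paper: L is the training loss, θ the model weights, (x, y) an input-label pair, and δ a perturbation restricted to some allowed set Δ; the exact constraint used in the paper may differ):

\delta_{\mathrm{adv}} = \arg\max_{\delta \in \Delta} \mathcal{L}(\theta, x + \delta, y)
\qquad
\delta_{\mathrm{unadv}} = \arg\min_{\delta \in \Delta} \mathcal{L}(\theta, x + \delta, y)

The only change is the direction of the optimization: an adversarial example maximizes the loss to hurt the model, while an unadversarial example minimizes it to help the model.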
There are two ways of creating unadversarial examples:
- By adding an unadversarial patch to the object. The left picture shows a grey toy jet with an unadversarial patch, rendered in bright colours, towards the back of the body. Here the model is trained together with an unadversarial patch: in every epoch, image-label pairs (x, y) are sampled from the training set and the patch is placed on the image with an arbitrary orientation and position (see the sketch after this list).
- By unadversarially altering the texture of the object. The right picture shows a 3D rendering of a toy jet covered in an unadversarial texture, white with bright colours along the body. For an unadversarial texture, in addition to the training images we separately require a 3D mesh of the object along with a set of background images. In every epoch, a renderer (such as Mitsuba) maps the texture onto the object, and the rendering is cast onto an arbitrary background image.
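As a rough illustration of the patch-placement step described above, here is a minimal PyTorch sketch. The helper name place_patch, the 90-degree rotations, and the uniform position sampling are illustrative assumptions rather than the exact augmentation used in the Microsoft code:

import torch
import torch.nn.functional as F

def place_patch(image, patch):
    # image: C x H x W tensor in [0, 1]; patch: C x ph x pw tensor in [0, 1]
    _, h, w = image.shape
    # Arbitrary orientation: rotate the patch by a random multiple of 90 degrees
    patch = torch.rot90(patch, int(torch.randint(0, 4, (1,))), dims=(1, 2))
    ph, pw = patch.shape[1:]
    # Arbitrary position: sample where the patch's top-left corner lands
    top = int(torch.randint(0, h - ph + 1, (1,)))
    left = int(torch.randint(0, w - pw + 1, (1,)))
    pad = (left, w - left - pw, top, h - top - ph)   # (left, right, top, bottom)
    mask = F.pad(torch.ones(1, ph, pw), pad)         # 1 where the patch sits
    return image * (1 - mask) + F.pad(patch, pad)    # differentiable w.r.t. the patch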
In both techniques, the output image is then passed through a computer vision model, and projected gradient descent (PGD) is run to solve the optimization problem that makes the texture or the patch unadversarial. The resulting patch or texture has a unique form that is then associated with the object class.
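A minimal sketch of one such PGD update, assuming the place_patch helper above and any pretrained PyTorch classifier; the signed-gradient step and the step size are illustrative choices, not the exact hyperparameters of the released code:

import torch
import torch.nn.functional as F

def unadversarial_step(model, patch, images, labels, step_size=0.01):
    # One PGD-style step that MINIMIZES the loss with respect to the patch
    # (an adversarial attack would maximize it instead).
    patch = patch.detach().requires_grad_(True)
    boosted = torch.stack([place_patch(img, patch) for img in images])
    loss = F.cross_entropy(model(boosted), labels)
    loss.backward()
    with torch.no_grad():
        patch = patch - step_size * patch.grad.sign()   # descend the loss
        patch.clamp_(0, 1)                              # project back onto valid pixels
    return patch.detach()

In the released training script shown below, jointly optimizing the model weights and the patch (or texture) is handled through the BoostedModel wrapper and the --training-mode flag.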
The method is evaluated on the standard benchmark datasets ImageNet and CIFAR-10 and on the robustness benchmarks ImageNet-C and CIFAR-10-C, where it shows improved performance. A performance analysis follows.
Code Snippet
GitHub repo – https://github.com/microsoft/unadversarial/
import pathlib
import sys

from torchvision.utils import save_image

curr_path = pathlib.Path(__file__).parent.absolute()
sys.path.insert(0, str(curr_path / 'better_corruptions'))

import argparse
import os
from pathlib import Path

import cox.store
import cox.utils
import dill
import json
import numpy as np
import torch as ch
from robustness import datasets, defaults, loaders, model_utils, train
from robustness.tools import breeds_helpers
from torch import nn
from torchvision import models
from torchvision.datasets import CIFAR10

from . import boosters, constants
from .utils import custom_datasets, LinearModel
from uuid import uuid4

BOOSTING_FP = 'boosting.ch'
parser = argparse.ArgumentParser()  # (the full script also registers the robustness library's default arguments on this parser)

# Custom arguments
parser.add_argument('--boosting', choices=['none', 'class_consistent', '3d'],
                    default='class_consistent', help='Type of unadversarial boosting')
parser.add_argument('--augmentations', type=str, default=None,
                    help='e.g. fog,gaussian_noise')
parser.add_argument('--dataset', choices=['cifar', 'imagenet', 'entity13', 'living17', 'solids', 'city'],
                    default='imagenet')
parser.add_argument('--training-mode', type=str, choices=['joint', 'model', 'booster'])
parser.add_argument('--arch', type=str, default='resnet18')
parser.add_argument('--lr', type=float, default=0.005)
parser.add_argument('--patch-lr', type=float, default=0.005)
parser.add_argument('--pytorch-pretrained', action='store_true')
# Lighting
parser.add_argument('--min-light', type=float, default=0.5,
                    help="Minimum lighting (darkest)")
parser.add_argument('--max-light', type=float, default=0.5,
                    help="Maximum lighting (lightest)")
def get_dataset_and_loaders(args):
    if args.dataset == 'solids':
        ds = datasets.ImageNet(args.data,
                               custom_class=custom_datasets.SolidColors,
                               custom_class_args={'image_size': constants.DS_TO_DIM[args.dataset]})
    elif args.dataset == 'city':
        ds = datasets.ImageNet(args.data)
    elif args.dataset == 'cifar':
        ds = datasets.CIFAR('/tmp')
    elif args.dataset == 'imagenet':
        ds = datasets.ImageNet(args.data)
    else:
        raise NotImplementedError
    train_loader, val_loader = ds.make_loaders(batch_size=args.batch_size,
                                               val_batch_size=args.batch_size,
                                               workers=args.workers,
                                               data_aug=True)
    return ds, (train_loader, val_loader)

def get_boosted_model(args, ds):
    # (Setup of `arch`, `num_classes` and `is_pt_model` is omitted in this excerpt.)
    if arch == 'linear':
        arch = LinearModel(num_classes, constants.DS_TO_DIM[args.dataset])
    kwargs = {'arch': arch, 'dataset': ds, 'resume_path': args.model_path,
              'add_custom_forward': is_pt_model or args.arch == 'linear',
              'pytorch_pretrained': args.pytorch_pretrained}
    model, _ = model_utils.make_and_restore_model(**kwargs)
    # Wrapping the model with DataAugmentedModel even when no corruptions are used,
    # for consistency when loading from checkpoints
    model = boosters.DataAugmentedModel(model, ds.ds_name,
                                        args.augmentations.split(',') if args.augmentations else [])
    # The checkpoint is not passed to train_model here; resuming the epoch,
    # optimizer, and other parameters is also avoided
    if args.boosting == 'class_consistent':
        boosting_path = Path(args.out_dir) / BOOSTING_FP
        if boosting_path.exists():
            booster = ch.load(boosting_path)
        else:
            dim = constants.DS_TO_DIM[args.dataset]
            # (Construction of the class-consistent booster is omitted in this excerpt.)
        model = boosters.BoostedModel(model, booster, args.training_mode)
    elif args.boosting == '3d':
        boosting_path = Path(args.out_dir) / BOOSTING_FP
        if boosting_path.exists():
            booster = ch.load(boosting_path)
        else:
            dim = constants.DS_TO_DIM[args.dataset]
            render_options = {
                'min_zoom': args.min_zoom,
                'max_zoom': args.max_zoom,
                'min_light': args.min_light,
                'max_light': args.max_light,
                'samples': args.render_samples
            }
            corruptions = constants.THREE_D_CORRUPTIONS if args.add_corruptions else None
            booster = boosters.ThreeDBooster(num_classes=num_classes,
                                             tex_size=args.patch_size,
                                             image_size=dim,
                                             batch_size=args.batch_size,
                                             render_options=render_options,
                                             num_texcoords=args.num_texcoord_renderers,
                                             num_gpus=ch.cuda.device_count(),
                                             debug=args.debug,
                                             forward_render=args.forward_render,
                                             custom_file=args.custom_file,
                                             corruptions=corruptions)
        model = boosters.BoostedModel(model, booster, args.training_mode)
    elif args.boosting == 'none':
        model = boosters.BoostedModel(model, None, args.training_mode)
    else:
        raise ValueError(f'boosting not found: {args.boosting}')
    return model.cuda()

def main_trainer(args, store):
    ds, (train_loader, val_loader) = get_dataset_and_loaders(args)
    if args.single_class is not None:
        print(f"Boosting towards a single class {args.single_class}")
        # Transforming to the same label
        class_x = lambda t, y: (t, ch.ones_like(y) * args.single_class)
        train_loader = loaders.LambdaLoader(train_loader, class_x)
        val_loader = loaders.LambdaLoader(val_loader, class_x)

    model = get_boosted_model(args, ds)
    # Resuming training from a checkpoint of the boosted model
    resume_path = os.path.join(args.out_dir, args.exp_name, 'checkpoint.pt.latest')
    checkpoint = None
    if args.resume and os.path.isfile(resume_path):
        print('[Resuming training BoostedModel from a checkpoint...]')
        checkpoint = ch.load(resume_path, pickle_module=dill)
        sd = checkpoint['model']
        model.load_state_dict(sd)

    print(f"Dataset: {args.dataset} | Model: {args.arch}")
    if args.eval_only:
        print('==>[Evaluating the model]')
        return train.eval_model(args, model, val_loader, store=store)

    parameters = [model.dummy]
    if args.training_mode in ['joint', 'model']:
        parameters = model.boosted_model.parameters()

    def iteration_hook(model, i, loop_type, inp, target):
        if loop_type == 'val' or model.module.booster is None:
            return
        if args.training_mode in ['booster', 'joint']:
            model.module.booster.step_booster(lr=args.patch_lr)
        if i % args.save_freq == 0:
            save_dir = Path(store.save_dir)
            # This part is kept inside the 2D booster branch because the corrupted
            # boosted images need to be saved
            if args.boosting != '3d':
                inp, target = inp.cuda(), target.cuda()
                example_boosted = model.module.booster(inp, target)
                bs_path = save_dir / f'boosted_{i}.jpg'
                save_image(example_boosted[:4], bs_path)
                example_adversaried = model.module.boosted_model.apply(example_boosted)
                inp_path = save_dir / f'inp_{i}.jpg'
                adv_path = save_dir / f'adv_{i}.jpg'
                save_image(inp[:4], inp_path)
                save_image(example_adversaried[:4], adv_path)
            else:
                if not args.save_only_last:
                    save_dir = save_dir / f'iteration_{i}'
                    os.makedirs(save_dir)
                with ch.no_grad():
                    model(inp, target, save_dir=save_dir)
        if i == 0:
            print(f'Saved in {store.save_dir}')

    args.iteration_hook = iteration_hook
    return train.train_model(args, model, (train_loader, val_loader), store=store,
                             checkpoint=checkpoint, update_params=parameters)
if __name__ == "__main__": args = parser.parse_args() if args.json_config is not None: print("Overriding args with JSON...") new_args = json.load(open(args.json_config))
    # Use a random experiment name so jobs that are automatically stopped on the
    # cluster due to preemption can be restarted cleanly
    if args.exp_name == 'random':
        args.exp_name = str(uuid4())
        print(f"Experiment name: {args.exp_name}")
    assert args.exp_name is not None
    # Preprocessing arguments
else "imagenet" args = defaults.check_and_fill_args( args, defaults.CONFIG_ARGS, datasets.DATASETS[default_ds]) if not args.eval_only: args = defaults.check_and_fill_args( args, defaults.TRAINING_ARGS, datasets.DATASETS[default_ds]) if False and (args.adv_train or args.adv_eval): args = defaults.check_and_fill_args( args, defaults.PGD_ARGS, datasets.DATASETS[default_ds]) args = defaults.check_and_fill_args( args, defaults.MODEL_LOADER_ARGS, datasets.DATASETS[default_ds]) store = cox.store.Store(args.out_dir, args.exp_name) if 'metadata' not in store.keys: args_dict = args.__dict__ schema = cox.store.schema_from_dict(args_dict) store.add_table('metadata', schema) store['metadata'].append_row(args_dict) else: print('[Found existing metadata in store. Skipping this part.]') print(args) main_trainer(args, store)
EndNotes
Unadversarial objects have already been applied in several settings, e.g. drone landing pads, QR codes, and high-fidelity 3D simulators, where they have delivered better performance. The results show that unadversarial examples significantly improve the reliability and out-of-distribution robustness of computer vision models.