TorchIO is a PyTorch-based Python library for deep learning with medical images. It is used for loading, preprocessing, augmenting, and sampling 3D medical images. The project is supported by the School of Biomedical Engineering & Imaging Sciences (BMEIS) at King's College London and the Wellcome / EPSRC Centre for Interventional and Surgical Sciences (WEISS) at University College London. Its design was inspired by NiftyNet, a TensorFlow library that is no longer maintained, its development having largely shifted to a new project named MONAI.
It was released by the researchers Fernando Pérez-García, Rachel Sparks and Sébastien Ourselin in their paper "TorchIO: a Python library for efficient loading, preprocessing, augmentation and patch-based sampling of medical images in deep learning".
Paper link – https://arxiv.org/pdf/2003.04696.pdf
GitHub repo – https://github.com/fepegar/torchio
Documentation – https://torchio.readthedocs.io/
To efficiently manage large 3D images, TorchIO builds on the popular medical image processing libraries SimpleITK and NiBabel. It helps in creating complete medical imaging pipelines by providing intensity and spatial transforms, both generic and magnetic-resonance-imaging-specific operations, random affine transformations, and domain-specific simulation of intensity artefacts such as MRI magnetic field inhomogeneity and k-space motion artefacts.
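For instance, the MRI-specific artefact transforms can be applied in a couple of lines. A minimal sketch, where the file name is a placeholder:

import torchio as tio

image = tio.ScalarImage('t1.nii.gz')  # placeholder path to a T1-weighted volume
add_motion = tio.RandomMotion()       # simulates k-space motion artefacts
add_bias = tio.RandomBiasField()      # simulates magnetic field inhomogeneity
transformed = add_bias(add_motion(image))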
A medical image is represented as a 3D tensor containing the voxel data together with a matrix encoding its spatial information. These datasets are stored in the Neuroimaging Informatics Technology Initiative (NIfTI) or Digital Imaging and Communications in Medicine (DICOM) formats, and are generally read and processed by medical imaging frameworks such as SimpleITK or NiBabel. Deep learning methods typically require large amounts of annotated data, which is hard to gather in the case of clinical data due to patient privacy and the financial and time costs of collecting and annotating data. Data augmentation can be used to virtually increase the size of the training dataset by applying different transformations to each training instance while preserving the annotations, but most general-purpose augmentation libraries support only 2D images, not 3D volumes. Moreover, medical images are mostly grayscale, so applying colourspace transforms is not an option. Other options are cropping and scaling, but applying these can be tricky and may destroy important spatial relationships.
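To make this representation concrete, the snippet below loads a volume and inspects its voxel tensor and spatial metadata; a minimal sketch, again with a placeholder file name:

import torchio as tio

image = tio.ScalarImage('subject_t1.nii.gz')  # placeholder path to a NIfTI file
print(image.data.shape)  # 4D voxel tensor: (channels, x, y, z)
print(image.affine)      # 4x4 matrix mapping voxel indices to physical coordinates
print(image.spacing)     # voxel spacing in mm, derived from the affine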
Installation
The latest version is pip installable: pip install torchio
To get plots, install Matplotlib along with TorchIO: pip install torchio[plot]
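To quickly check that the installation worked, you can import the library and print its version:

import torchio as tio

print(tio.__version__)  # prints the installed version string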
The following example trains a 3D U-Net to perform brain segmentation from T1-weighted MRI using the Information eXtraction from Images (IXI) dataset, a publicly available dataset with around 600 subjects.
# importing libraries
import torch
import torch.nn.functional as F
from torchvision.utils import make_grid, save_image

seed = 42  # any fixed value makes the runs reproducible
torch.manual_seed(seed)

import torchio as tio
from torchio import AFFINE, DATA
import numpy as np
import nibabel as nib
from unet import UNet
from scipy import stats
import SimpleITK as sitk
import matplotlib.pyplot as plt
from pathlib import Path  # needed for the dataset paths below
from IPython import display
from tqdm.notebook import tqdm
# Load Dataset
dataset_url = 'https://www.dropbox.com/s/ogxjwjxdv5mieah/ixi_tiny.zip?dl=0'
dataset_path = 'ixi_tiny.zip'
dataset_dir_name = 'ixi_tiny'
dataset_dir = Path(dataset_dir_name)
histogram_landmarks_path = 'landmarks.npy'
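The snippet above only defines the paths; a minimal sketch to actually download and extract the archive, using the standard library:

import urllib.request
import zipfile

if not dataset_dir.is_dir():
    # Dropbox serves the raw file when dl=1 is used instead of dl=0
    urllib.request.urlretrieve(dataset_url.replace('dl=0', 'dl=1'), dataset_path)
    with zipfile.ZipFile(dataset_path) as f:
        f.extractall()  # creates the ixi_tiny directory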
A subject is a data structure used to store images and associated metadata.
images_dir = dataset_dir / 'image'
labels_dir = dataset_dir / 'label'
image_paths = sorted(images_dir.glob('*.nii.gz'))
label_paths = sorted(labels_dir.glob('*.nii.gz'))
assert len(image_paths) == len(label_paths)

subjects = []
for image_path, label_path in zip(image_paths, label_paths):
    subject = tio.Subject(
        mri=tio.ScalarImage(image_path),
        brain=tio.LabelMap(label_path),
    )
    subjects.append(subject)

dataset = tio.SubjectsDataset(subjects)
print('Dataset size:', len(dataset), 'subjects')

one_subject = dataset[0]
print(one_subject)
print(one_subject.mri)
show_subject(tio.ToCanonical()(one_subject), 'mri', label_name='brain')  # visualization helper defined in the notebook

Dataset size: 566 subjects
Subject(Keys: ('mri', 'brain'); images: 2)
ScalarImage(shape: (1, 83, 44, 55); spacing: (2.18, 4.13, 3.95); orientation: SRA+; memory: 784.6 KiB; type: intensity)
# transformations
# spatial transforms
fpg = tio.datasets.FPG()
print('Sample subject:', fpg)
show_fpg(fpg)  # visualization helper defined in the notebook
For complete implementation of transformations and augmentations visit this notebook.
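As a flavour of what the notebook covers, a minimal sketch composing a few spatial and intensity augmentations and applying them to a subject; spatial transforms are applied consistently to the image and its label map, so the annotations stay aligned:

training_transform = tio.Compose([
    tio.RandomFlip(axes=('LR',)),                     # random left-right flip
    tio.RandomAffine(scales=(0.9, 1.1), degrees=10),  # random scaling and rotation
    tio.RandomNoise(std=0.01),                        # light Gaussian noise (intensity only)
])
augmented = training_transform(one_subject)  # 'mri' and 'brain' are transformed together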
# normalization – using transforms to normalize the intensity of our images
compute_histograms = False  # set to True to recompute the histograms (slow)

paths = image_paths
if compute_histograms:
    fig, ax = plt.subplots(dpi=100)
    for path in tqdm(paths):
        tensor = tio.ScalarImage(path).data
        if 'HH' in path.name:
            color = 'red'
        elif 'Guys' in path.name:
            color = 'green'
        elif 'IOP' in path.name:
            color = 'blue'
        plot_histogram(ax, tensor, color=color)  # helper defined in the notebook
    ax.set_xlim(-100, 2000)
    ax.set_ylim(0, 0.004)
    ax.set_title('Original histograms of all samples')
    ax.set_xlabel('Intensity')
    ax.grid()
    graph = None
else:
    graph = display.Image(url='https://www.dropbox.com/s/daqsg3udk61v65i/hist_original.png?dl=1')
graph  # displayed in the notebook
# HistogramStandardization
landmarks = tio.HistogramStandardization.train(
    image_paths,
    output_path=histogram_landmarks_path,
)
np.set_printoptions(suppress=True, precision=3)
print('\nTrained landmarks:', landmarks)
100%|██████████| 566/566 [00:05<00:00, 100.76it/s]
Trained landmarks: [  0.      0.002   0.108   0.227   0.467   2.014  15.205  34.297  49.664  55.569  61.178  74.005 100.   ]
landmarks_dict = {'mri': landmarks}
histogram_transform = tio.HistogramStandardization(landmarks_dict)

if compute_histograms:
    fig, ax = plt.subplots(dpi=100)
    for sample in tqdm(dataset):
        standard = histogram_transform(sample)
        tensor = standard.mri.data
        path = str(sample.mri.path)
        if 'HH' in path:
            color = 'red'
        elif 'Guys' in path:
            color = 'green'
        elif 'IOP' in path:
            color = 'blue'
        plot_histogram(ax, tensor, color=color)  # helper defined in the notebook
    ax.set_xlim(0, 150)
    ax.set_ylim(0, 0.02)
    ax.set_title('Intensity values of all samples after histogram standardization')
    ax.set_xlabel('Intensity')
    ax.grid()
    graph = None
else:
    graph = display.Image(url='https://www.dropbox.com/s/dqqaf78c86mrsgn/hist_standard.png?dl=1')
graph  # displayed in the notebook
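The trained landmarks can then be plugged into a preprocessing pipeline. A minimal sketch combining histogram standardization with other common TorchIO preprocessing steps (the exact pipeline in the notebook may differ):

preprocess = tio.Compose([
    tio.ToCanonical(),                                 # reorient to canonical RAS+ orientation
    tio.HistogramStandardization(landmarks_dict),      # map intensities to the learned scale
    tio.ZNormalization(masking_method=tio.ZNormalization.mean),  # zero mean, unit variance
])
training_set = tio.SubjectsDataset(subjects, transform=preprocess)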
Training with whole volumes
For complete implementation visit here.
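As a rough idea of what that implementation does, here is a minimal sketch of one training epoch on whole volumes, assuming the training_set built in the preprocessing sketch above and the UNet imported earlier (the constructor arguments are a hypothetical configuration, not necessarily the notebook's):

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = UNet(in_channels=1, out_classes=2, dimensions=3, padding=True).to(device)  # hypothetical configuration
loader = torch.utils.data.DataLoader(training_set, batch_size=2, num_workers=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

model.train()
for batch in loader:
    inputs = batch['mri'][tio.DATA].to(device)     # (B, 1, X, Y, Z) intensities
    targets = batch['brain'][tio.DATA].to(device)  # (B, 1, X, Y, Z) integer labels
    optimizer.zero_grad()
    logits = model(inputs)
    loss = F.cross_entropy(logits, targets.squeeze(1).long())
    loss.backward()
    optimizer.step()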
Brain parcellation with TorchIO and HighRes3DNet
Patch-based sampling
# patch-based training uses image patches randomly extracted from the volumes
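A minimal sketch of TorchIO's patch-based pipeline, using a uniform sampler and the Queue class to feed random patches to a standard PyTorch DataLoader (again assuming the training_set from the preprocessing sketch):

patch_size = 64
sampler = tio.data.UniformSampler(patch_size)
patches_queue = tio.Queue(
    subjects_dataset=training_set,
    max_length=300,          # maximum number of patches stored in the queue
    samples_per_volume=10,   # patches extracted from each volume
    sampler=sampler,
    num_workers=2,
)
patches_loader = torch.utils.data.DataLoader(patches_queue, batch_size=16, num_workers=0)  # must be 0 when using a Queue
for patches_batch in patches_loader:
    inputs = patches_batch['mri'][tio.DATA]    # (B, 1, 64, 64, 64)
    targets = patches_batch['brain'][tio.DATA]
    # ...training step as in the whole-volume sketch...
    break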
# pretrained model
repo = 'fepegar/highresnet'
model_name = 'highres3dnet'
model = torch.hub.load(repo, model_name, pretrained=True)
device = torch.device('cuda') if torch.cuda.is_available() else 'cpu'
print('Device:', device)
model.to(device)

# inference: `preprocessed` is the preprocessed T1 subject from the earlier steps
patch_overlap = 4
patch_size = 128
grid_sampler = tio.inference.GridSampler(
    preprocessed,
    patch_size,
    patch_overlap,
)
patch_loader = torch.utils.data.DataLoader(grid_sampler)
aggregator = tio.inference.GridAggregator(grid_sampler)
model.eval()
with torch.no_grad():
    for patches_batch in tqdm(patch_loader):
        input_tensor = patches_batch['t1'][tio.DATA].to(device)
        locations = patches_batch[tio.LOCATION]
        logits = model(input_tensor)
        labels = logits.argmax(dim=tio.CHANNELS_DIMENSION, keepdim=True)
        aggregator.add_batch(labels, locations)
output_tensor = aggregator.get_output_tensor()
plot_volume(  # visualization helper defined in the notebook
    output_tensor.numpy().squeeze(),
    enhance=False,
    colors_path='GIFNiftyNet.ctbl',
)
The plotted output shows three orthogonal views of the parcellation: sagittal (L-R, slice 80), coronal (P-A, slice 128) and axial (I-S, slice 128).
For complete implementation visit this notebook.
3D Slicer
Apart from command-line tools, TorchIO provides a 3D Slicer extension, a no-code interface for quick experimentation with and visualization of its transforms.
End Notes
TorchIO helps in training convolutional neural networks on medical images by providing domain-specific transformations and patch-based sampling, which requires far less memory than training on whole volumes. As of now, the library's domain-specific transforms only cover MRI; in the future, the authors plan to extend it to computed tomography (CT) and ultrasound (US).