Exploring ArtLine – To Create Line Art Portraits, Movie Posters & Cartoonize Images in Python

ArtLine uses deep learning to produce fine-quality line art portraits, movie posters and cartoonized images.

Deep learning and computer vision have evolved and done wonders time and again. Today we look at one such recent project, ‘ArtLine’, which uses deep learning to produce fine-quality line art portraits. It can also be used to generate movie posters and cartoonize images. At the time of writing, it is among the most trending projects on both GitHub and paperswithcode. It was created by Vijish Madhavan, a deep learning researcher.

The model was trained on the APDrawing dataset and Anime line art pairs, combining several techniques drawn from research papers: self-attention, progressive resizing and generator loss. It shows how stacking these methods can produce high-quality results. It is built primarily with the PyTorch and Fastai libraries. It generates fine lines and edges in the sketch image, which is better than most existing methods. You can try the demo in the Colab notebook with any portrait picture; the notebook expects an image URL, which it downloads and converts to an image. You can also clone the repository or tweak the code to use a local image file, and in under 2 minutes (running on a GPU) have a look at the results.

Get a cartoon version of Tom Hanks


Here’s a movie poster generated by ArtLine

As of now, the movie poster and cartoon generating models have not been released; they can be expected in the near future. Only the pre-trained line art portrait model is available.

Let’s explore how the model training takes place.

Importing necessary libraries

 import torch
 import torch.nn as nn
 import fastai
 from fastai.vision import *
 from fastai.callbacks import *
 from fastai.vision.gan import *
 from torchvision.models import vgg16_bn
 from fastai.utils.mem import *
 from PIL import Image
 import numpy as np
 from torch.autograd import Variable
 import torchvision.transforms as transforms 
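Edge detection – this function convolves the image with the two 3×3 Sobel kernels to detect horizontal and vertical edges, then combines the two responses into a gradient-magnitude image that is used as an extra transform.
 def _gradient_img(img):
     img = img.squeeze(0)
     # Use the first channel, shaped (1, 1, H, W) as nn.Conv2d expects (assumes a (C, H, W) tensor)
     x = img[0].unsqueeze(0).unsqueeze(0)
     # Building the convolutions and assigning the Sobel kernels as weights
     a=np.array([[1, 0, -1],[2,0,-2],[1,0,-1]])
     conv1=nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=1, bias=False)
     conv1.weight=nn.Parameter(torch.from_numpy(a).float().unsqueeze(0).unsqueeze(0))
     G_x=conv1(x).data.view(1,x.shape[2],x.shape[3])
     b=np.array([[1, 2, 1],[0,0,0],[-1,-2,-1]])
     conv2=nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=1, bias=False)
     conv2.weight=nn.Parameter(torch.from_numpy(b).float().unsqueeze(0).unsqueeze(0))
     G_y=conv2(x).data.view(1,x.shape[2],x.shape[3])
     # Gradient magnitude from the horizontal and vertical responses
     G=torch.sqrt(torch.pow(G_x,2)+ torch.pow(G_y,2))
     return G
 gradient = TfmPixel(_gradient_img)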
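
To make the arithmetic behind the edge detector visible, here is a minimal, dependency-free sketch of the same idea: the two kernels `a` and `b` from `_gradient_img` are the standard Sobel kernels, and the edge map is the magnitude of the two responses. The helper names (`correlate3x3`, `gradient_magnitude`) and the tiny test image are illustrative assumptions, not part of the ArtLine code.

```python
import math

# Standard 3x3 Sobel kernels (same values as `a` and `b` in the article).
SOBEL_X = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
SOBEL_Y = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]


def correlate3x3(img, kernel):
    """Slide a 3x3 kernel over `img` with zero padding (stride 1, padding 1).

    Like nn.Conv2d, this is cross-correlation: the kernel is not flipped.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for kr in range(3):
                for kc in range(3):
                    rr, cc = r + kr - 1, c + kc - 1
                    if 0 <= rr < h and 0 <= cc < w:  # outside the image counts as 0
                        total += kernel[kr][kc] * img[rr][cc]
            out[r][c] = total
    return out


def gradient_magnitude(img):
    """Combine horizontal and vertical responses: sqrt(Gx^2 + Gy^2)."""
    gx = correlate3x3(img, SOBEL_X)
    gy = correlate3x3(img, SOBEL_Y)
    return [[math.sqrt(x * x + y * y) for x, y in zip(rx, ry)]
            for rx, ry in zip(gx, gy)]


# Hypothetical 4x6 image: dark left half, bright right half (one vertical edge).
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]
edges = gradient_magnitude(image)
print(edges[1])
```

In an interior row the response is non-zero only around the dark-to-bright transition. The last column also responds, because zero padding makes the image border look like an edge.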

PATH – pointing to the saved APDrawing dataset and a selection from the Anime sketch colourization pair dataset.

 path = Path('/content/gdrive/My Drive/Apdrawing')
 # Blended facial features
 path_hr = Path('/content/gdrive/My Drive/Apdrawing/draw tiny')
 path_lr = Path('/content/gdrive/My Drive/Apdrawing/Tiny Real')
 # Portrait pair
 path_hr3 = Path('/content/gdrive/My Drive/Apdrawing/drawing')
 path_lr3= Path('/content/gdrive/My Drive/Apdrawing/Real')
 # Architecture: a pretrained resnet34 model is used
 arch = models.resnet34
Detecting Facial Features
 src = ImageImageList.from_folder(path_lr).split_by_rand_pct(0.3, seed=42)
 def get_data(bs,size):
     data = (src.label_from_func(lambda x: path_hr/x.name)
            .transform(get_transforms(xtra_tfms=[gradient()]), size=size, tfm_y=True)
            .databunch(bs=bs,num_workers = 0).normalize(imagenet_stats, do_y=True))
     data.c = 3
     return data 

Progressive resizing, supported by the Fastai library, gradually increases the image size while the learning rates are adjusted, so the model generalizes better as it goes through the different stages.
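
The staging can be sketched as a simple loop. The (batch size, image size) pairs below are the ones passed to `get_data(...)` for the facial-features data in this article; `make_data` and `fit` are hypothetical callbacks standing in for `get_data(...)` and `do_fit(...)`, stubbed out here so the sketch runs without fastai or a GPU.

```python
# (batch size, image size) stages taken from the article's get_data(...) calls.
STAGES = [(20, 64), (8, 128), (5, 192)]


def progressive_resize(stages, make_data, fit):
    """Train the same model on progressively larger images.

    `make_data(bs, size)` and `fit(data)` are hypothetical callbacks; in the
    article their roles are played by get_data(...) and do_fit(...).
    """
    history = []
    for bs, size in stages:
        data = make_data(bs, size)  # rebuild the data at the new resolution
        fit(data)                   # keep training the same weights
        history.append((bs, size))
    return history


# Stub callbacks that only record what would happen at each stage.
log = []
history = progressive_resize(
    STAGES,
    make_data=lambda bs, size: {"bs": bs, "size": size},
    fit=lambda data: log.append(f"fit at {data['size']}px, bs={data['bs']}"),
)
print("\n".join(log))
```

The batch size shrinks as the image size grows, since larger images take more GPU memory per sample.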


 bs,size=20, 64
 data = get_data(bs,size)
 data.show_batch(ds_type=DatasetType.Valid, rows=2, figsize=(9,9))
 t = data.valid_ds[0][1].data
 t = torch.stack([t,t])
 def gram_matrix(x):
     n,c,h,w = x.size()
     x = x.view(n, c, -1)
     return (x @ x.transpose(1,2))/(c*h*w)
 base_loss = F.l1_loss
 vgg_m = vgg16_bn(True).features.cuda().eval()
 requires_grad(vgg_m, False)
 blocks = [j-1 for j,o in enumerate(children(vgg_m)) if isinstance(o,nn.MaxPool2d)]
 blocks, [vgg_m[i] for i in blocks] 

Perceptual loss for image transformation is calculated using the VGG-16 model. It speeds up training. This approach combines a per-pixel loss between the output and ground-truth images with perceptual loss functions based on high-level features extracted from a pre-trained network. The combined loss is then used to train a feed-forward network.
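
As a rough illustration of how the three kinds of terms add up, the sketch below recomputes them on tiny hand-made lists instead of real VGG activations. It follows the structure of the `FeatureLoss` class in this article (pixel L1, weighted feature L1, weighted Gram-matrix L1 with the `5e3` factor); the plain-list data layout, the helper names and the example values are assumptions made for readability.

```python
def l1_loss(a, b):
    """Mean absolute difference between two equally sized flat lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def flatten(rows):
    return [v for row in rows for v in row]


def gram_matrix(feat):
    """Gram matrix of one feature map given as c channels of h*w values each.

    Mirrors (x @ x.transpose(1, 2)) / (c*h*w) from the article for one image.
    """
    c, hw = len(feat), len(feat[0])
    return [[sum(p * q for p, q in zip(fi, fj)) / (c * hw) for fj in feat]
            for fi in feat]


def feature_loss(pred_px, target_px, pred_feats, target_feats, weights):
    """Return [pixel term, feature terms..., Gram terms...]."""
    terms = [l1_loss(pred_px, target_px)]
    terms += [l1_loss(flatten(p), flatten(t)) * w
              for p, t, w in zip(pred_feats, target_feats, weights)]
    terms += [l1_loss(flatten(gram_matrix(p)), flatten(gram_matrix(t))) * w ** 2 * 5e3
              for p, t, w in zip(pred_feats, target_feats, weights)]
    return terms


# Hypothetical data: 2 pixels and a single "layer" with 2 channels of 2 values.
pred_feats = [[[1.0, 0.0], [0.0, 1.0]]]
target_feats = [[[1.0, 0.0], [0.0, 0.0]]]
terms = feature_loss([0.0, 1.0], [0.0, 0.0], pred_feats, target_feats, weights=[2])
total = sum(terms)
print(terms, total)
```

The pixel term compares images directly, the feature terms compare what the network "sees", and the Gram terms compare channel correlations, which capture style and texture rather than exact positions.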

 class FeatureLoss(nn.Module):
     def __init__(self, m_feat, layer_ids, layer_wgts):
         super().__init__()
         self.m_feat = m_feat
         # Layers whose activations are compared between prediction and target
         self.loss_features = [self.m_feat[i] for i in layer_ids]
         self.hooks = hook_outputs(self.loss_features, detach=False)
         self.wgts = layer_wgts
         self.metric_names = ['pixel',] + [f'feat_{i}' for i in range(len(layer_ids))] + [f'gram_{i}' for i in range(len(layer_ids))]
     def make_features(self, x, clone=False):
         # Run x through the feature model so the hooks store fresh activations
         self.m_feat(x)
         return [(p.clone() if clone else p) for p in self.hooks.stored]
     def forward(self, input, target):
         out_feat = self.make_features(target, clone=True)
         in_feat = self.make_features(input)
         # Pixel loss, then feature losses, then Gram-matrix (style) losses
         self.feat_losses = [base_loss(input,target)]
         self.feat_losses += [base_loss(f_in, f_out)*w
                              for f_in, f_out, w in zip(in_feat, out_feat, self.wgts)]
         self.feat_losses += [base_loss(gram_matrix(f_in), gram_matrix(f_out))*w**2 * 5e3
                              for f_in, f_out, w in zip(in_feat, out_feat, self.wgts)]
         self.metrics = dict(zip(self.metric_names, self.feat_losses))
         return sum(self.feat_losses)
     def __del__(self): self.hooks.remove()
 feat_loss = FeatureLoss(vgg_m, blocks[2:5], [5,15,2])
 wd = 1e-3
 y_range = (-3.,3.) 

This function builds the generator: a U-Net with self-attention and spectral normalization. Training follows the NoGAN approach, which was introduced in another project named DeOldify to stabilize colour images. Here minimal time is spent on direct GAN training; instead, the generator and critic are pretrained separately. It helps considerably in getting accurate facial features.

 def create_gen_learner():
     return unet_learner(data, arch, wd=wd, blur=True, norm_type=NormType.Spectral,self_attention=True, y_range=(-3.0, 3.0),loss_func=feat_loss, callback_fns=LossMetrics)
 learn_gen = create_gen_learner()
 lr = 1e-01
 epoch = 5 
Fitting the model
 def do_fit(save_name, lrs=slice(lr), pct_start=0.9):
     learn_gen.fit_one_cycle(epoch, lrs, pct_start=pct_start)
     learn_gen.save(save_name)  # save a checkpoint under the given name
     learn_gen.show_results(rows=1, imgsize=5)
 do_fit('da', slice(lr))
 epoch = 5
 do_fit('db', slice(1E-2))
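
The `pct_start` argument passed to `fit_one_cycle` is the fraction of training spent raising the learning rate before it is annealed back down. The sketch below is a simplified cosine one-cycle schedule written from scratch to show that effect; the function names and the default divisors (`div_factor=25`, `final_div=div_factor*1e4`) are assumptions modelled on fastai v1 and should be checked against the library before relying on them.

```python
import math


def annealing_cos(start, end, pct):
    """Cosine interpolation from `start` (pct=0) to `end` (pct=1)."""
    return end + (start - end) / 2 * (math.cos(math.pi * pct) + 1)


def one_cycle_lrs(lr_max, n_steps, pct_start=0.3, div_factor=25.0, final_div=None):
    """Per-step learning rates: warm up to lr_max, then anneal towards zero."""
    if final_div is None:
        final_div = div_factor * 1e4  # assumed default, see lead-in
    warm = int(n_steps * pct_start)   # steps spent increasing the rate
    cool = n_steps - warm             # steps spent decreasing it
    lrs = [annealing_cos(lr_max / div_factor, lr_max, i / warm) for i in range(warm)]
    lrs += [annealing_cos(lr_max, lr_max / final_div, i / cool) for i in range(cool)]
    return lrs


# pct_start=0.9 is the default of do_fit in the article; 0.3 is used in later stages.
late_peak = one_cycle_lrs(0.1, 100, pct_start=0.9)
early_peak = one_cycle_lrs(0.1, 100, pct_start=0.3)
print(late_peak.index(max(late_peak)), early_peak.index(max(early_peak)))
```

With `pct_start=0.9` the rate keeps climbing for 90% of the steps and drops only at the very end, whereas `pct_start=0.3` peaks early and spends most of the run annealing.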

Results at different image sizes


 data = get_data(8,128)
 learn_gen.data = data
 lr = 1E-03
 epoch = 5
 do_fit('db3', slice(1e-02,1e-5), pct_start=0.3)


 data = get_data(5,192)
 learn_gen.data = data
 lr = 1E-06
 epoch = 5
 do_fit('db5', slice(1e-06,1e-4), pct_start=0.3)
Acquiring data for portrait images
 src = ImageImageList.from_folder(path_lr3).split_by_rand_pct(0.2, seed=42)
 def get_data(bs,size):
     data = (src.label_from_func(lambda x: path_hr3/x.name)
            .transform(get_transforms(max_zoom=2.), size=size, tfm_y=True)
            .databunch(bs=bs,num_workers = 0).normalize(imagenet_stats, do_y=True))
     data.c = 3
     return data 


 data = get_data(8,128)
 learn_gen.data = data
 data.show_batch(ds_type=DatasetType.Valid, rows=2, figsize=(9,9))
 lr = 1e-03
 epoch = 5
 do_fit('db7', slice(6.31E-07,1e-5), pct_start=0.3)


 data = get_data(4,192)
 learn_gen.data = data
 lr = 4.37E-05
 epoch = 5
 do_fit('db9', slice(1.00E-05,1e-3), pct_start=0.3)


Limitations – ArtLine needs smooth or plain backgrounds and handles lighting variations and shadows poorly. It also works poorly on low-quality images.

Nevertheless, ArtLine achieves impressive results, and the project is under constant development.

Jayita Bhattacharyya
Machine learning and data science enthusiast. Eager to learn new technology advances. A self-taught techie who loves to do cool stuff using technology for fun and worthwhile.
