Top books on Transformers in 2022

The Transformer architecture implements an encoder-decoder structure without recurrence and convolutions.

In 2017, the Google Brain team introduced the Transformer in a paper titled "Attention Is All You Need". They proposed a simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Today, Transformers have become the model of choice for NLP and computer vision.

Transformers underpin pretrained systems such as BERT and GPT. Given the importance of Transformers in machine learning and AI, we have listed five books to help you understand the sequence transduction model better.
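The attention mechanism at the core of the architecture is scaled dot-product attention: each query is compared with every key, the similarities are normalised into weights with a softmax, and the output is a weighted mix of the values. Below is a minimal NumPy sketch of that idea; the function name and toy shapes are illustrative, not taken from any of the books listed here.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V, weights                      # weighted mix of values

# Toy example: 3 queries attending over 4 key/value vectors of width 8.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape, w.shape)  # one output vector and one weight row per query
```

In the full architecture this operation is run in parallel across several "heads" and stacked in both the encoder and the decoder, which is what lets the model relate every position in a sequence to every other position without recurrence.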

Transformers for Natural Language Processing

Author: Denis Rothman

The book is divided into three parts:

  1. An introduction to Transformer architectures, from the original Transformer to the RoBERTa, BERT, and DistilBERT models, including training methods for smaller Transformers that can outperform GPT-3 in some cases.
  2. Applications of Transformers for Natural Language Generation (NLG) and Natural Language Understanding (NLU).
  3. Language understanding techniques for optimising social network datasets and identifying fake news.

The book helps readers understand Transformers from a cognitive science perspective. It also teaches how to apply pretrained Transformer models to various datasets and how to build innovative deep neural network architectures for NLP with Python, PyTorch, TensorFlow, BERT, RoBERTa, and more.

Buy here

Mastering Transformers: Build state-of-the-art models from scratch with advanced natural language processing techniques

Authors: Savaş Yıldırım and Meysam Asgari-Chenaghlu

The book explores NLP solutions with the Transformers library and shows how to train a language model in any language using any Transformer architecture. It also explains how to fine-tune pretrained language models to perform several downstream tasks, and how to pick the right framework for the training, evaluation, and production of an end-to-end solution. The book takes a problem-solving approach to learning about Transformers and implementing their methodologies.

Buy here

Natural Language Processing with Transformers: Building Language Applications with Hugging Face 

Authors: Lewis Tunstall, Leandro von Werra, and Thomas Wolf

The authors employ a hands-on approach to teach how Transformers work and how to integrate them into applications. The book helps readers build, debug, and optimise Transformer models for basic NLP tasks like text classification, named entity recognition, and question answering, and covers how Transformers can be used for cross-lingual transfer learning. It is ideal for developers who want to apply Transformers in projects where labelled data is scarce, and it explains how to train Transformers from scratch and scale them across distributed environments with multiple GPUs.

Buy here

Learning Deep Learning: Theory and Practice of Neural Networks, Computer Vision, Natural Language Processing, and Transformers Using TensorFlow

Author: Magnus Ekman

In the book, the author demonstrates how to build advanced architectures, including the Transformer. It details the processes for building modern networks for computer vision and NLP, including Mask R-CNN, GPT, and BERT. Ekman provides well-annotated code examples using TensorFlow with Keras and covers the Python libraries for deep learning used in industry and academia. The book also helps readers master core concepts like perceptrons, sigmoid neurons, gradient-based learning, and backpropagation.

Buy here

Transformers for Machine Learning: A Deep Dive

Authors: Uday Kamath, Kenneth L Graham and Wael Emara

The book comprehensively covers 60-plus Transformer architectures and teaches how to apply Transformer techniques to speech, text, computer vision, and time series. The authors provide practical tips and tricks for each architecture and explain how to deploy them in the real world. The book serves as a single entry point for Transformers and is ideal for postgraduate students and researchers. It also includes hands-on case studies and code.

Buy here

Meeta Ramnani