
Top books on Transformers in 2022

The Transformer architecture implements an encoder-decoder structure without recurrence or convolutions.


In 2017, the Google Brain team introduced the Transformer in a paper titled Attention Is All You Need. They proposed a new, simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Today, Transformers have become the model of choice for NLP and computer vision.

Transformers underpin pretrained systems such as BERT and GPT. The Transformer architecture implements an encoder-decoder structure without recurrence or convolutions. Given the importance of Transformers in machine learning and AI, we have listed five books to help you understand the sequence transduction model better.
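
For context, the core operation the paper builds on is scaled dot-product attention. The NumPy sketch below is our own minimal illustration (not taken from any of the books listed): queries are compared against keys, and the resulting softmax weights mix the values.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, as in the 2017 paper."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                            # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ V                                         # weighted sum of values

# Toy example: a sequence of 3 tokens with 4-dimensional representations
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)             # (3, 4)
```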

Transformers for Natural Language Processing

Author: Denis Rothman

The book is divided into three parts:

  1. An introduction to Transformer architectures, from the original Transformer to RoBERTa, BERT, and DistilBERT. The book covers training methods for smaller Transformers that can, in some cases, outperform GPT-3.
  2. Applications of Transformers for Natural Language Generation (NLG) and Natural Language Understanding (NLU).
  3. Language understanding techniques for optimising social network datasets and identifying fake news.

The book helps readers understand Transformers from a cognitive science perspective. It also teaches how to apply pretrained Transformer models to various datasets and how to build innovative deep neural network architectures for NLP with Python, PyTorch, TensorFlow, BERT, RoBERTa, and more.
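
As a taste of that workflow, here is a minimal sketch (ours, not code from the book) that applies a pretrained BERT checkpoint with the Hugging Face transformers library and PyTorch; the model name and sentence are illustrative choices.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load a pretrained Transformer and its matching tokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Transformers dispense with recurrence entirely.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One contextual embedding per input token
print(outputs.last_hidden_state.shape)  # e.g. torch.Size([1, 9, 768])
```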

Buy here

Mastering Transformers: Build state-of-the-art models from scratch with advanced natural language processing techniques

Authors: Savaş Yıldırım and Meysam Asgari-Chenaghlu

The book explores NLP solutions with the Transformers library and shows how to train a language model in any language using any Transformer architecture. It also explains how to fine-tune pre-trained language models to perform several downstream tasks. You will also learn how to pick the right framework for the training, evaluation, and production of an end-to-end solution. The book takes a problem-solving approach to learning about Transformers and implementing the methodologies.
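
To illustrate the fine-tuning idea, the sketch below (a toy example of ours, not from the book) runs a single gradient step of a DistilBERT sequence classifier; the checkpoint, sentences, and labels are assumptions made for illustration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

texts = ["A clear, practical introduction.", "Confusing and poorly organised."]
labels = torch.tensor([1, 0])                        # toy sentiment labels
batch = tokenizer(texts, padding=True, return_tensors="pt")

model.train()
outputs = model(**batch, labels=labels)              # loss computed internally
outputs.loss.backward()                              # backpropagate
optimizer.step()                                     # one fine-tuning step
```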

Buy here

Natural Language Processing with Transformers: Building Language Applications with Hugging Face 

Authors: Lewis Tunstall, Leandro von Werra and Thomas Wolf

The authors take a hands-on approach to teaching how Transformers work and how to integrate them into applications. The book helps readers build, debug, and optimise Transformer models for core NLP tasks such as text classification, named entity recognition, and question answering. It also covers how Transformers can be used for cross-lingual transfer learning, making it ideal for developers who want to apply Transformers in projects where labelled data is scarce. The book further explains how to train Transformers from scratch and scale them across distributed environments with multiple GPUs.
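
The three tasks mentioned above map directly onto Hugging Face pipelines. The sketch below is a minimal illustration of ours, not an excerpt from the book; default checkpoints are downloaded automatically and the input sentences are made up.

```python
from transformers import pipeline

# Text classification (sentiment by default)
classifier = pipeline("text-classification")
print(classifier("This chapter on question answering is excellent."))

# Named entity recognition, with token predictions grouped into entity spans
ner = pipeline("ner", aggregation_strategy="simple")
print(ner("Hugging Face has offices in New York City."))

# Extractive question answering
qa = pipeline("question-answering")
print(qa(question="Who introduced the Transformer?",
         context="The Google Brain team introduced the Transformer in 2017."))
```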

Buy here

Learning Deep Learning: Theory and Practice of Neural Networks, Computer Vision, Natural Language Processing, and Transformers Using TensorFlow

Author: Magnus Ekman

In the book, the author demonstrates how to build advanced architectures, including the Transformer. It details the process of building modern networks for computer vision and NLP, including Mask R-CNN, GPT, and BERT. Ekman provides well-annotated code examples using TensorFlow with Keras and covers the Python libraries used for deep learning in industry and academia. The book also helps readers master core concepts such as perceptrons, sigmoid neurons, gradient-based learning, and backpropagation.
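
As a flavour of the TensorFlow/Keras style the book works in, here is a toy example of ours (not from the book) showing gradient-based learning with backpropagation on the classic XOR problem, which a single perceptron cannot solve.

```python
import numpy as np
from tensorflow import keras

# XOR: not linearly separable, so a hidden layer is required
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype="float32")
y = np.array([0, 1, 1, 0], dtype="float32")

model = keras.Sequential([
    keras.Input(shape=(2,)),
    keras.layers.Dense(8, activation="tanh"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=keras.optimizers.SGD(learning_rate=1.0),
              loss="binary_crossentropy")
model.fit(X, y, epochs=2000, verbose=0)   # weights updated via backpropagation
print(model.predict(X, verbose=0).round().flatten())
```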

Buy here

Transformers for Machine Learning: A Deep Dive

Authors: Uday Kamath, Kenneth L Graham and Wael Emara

The book comprehensively covers more than 60 Transformer architectures. It teaches how to apply Transformer techniques to speech, text, computer vision, and time series data. The authors provide practical tips and tricks for each architecture and explain how to deploy the models in the real world. The book offers a single entry point to Transformers and is ideal for postgraduate students and researchers. It also includes hands-on case studies and code.

Buy here
