In 2017, the Google Brain team introduced the Transformer in a paper titled "Attention Is All You Need". They proposed a simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Today, Transformers have become the model of choice for NLP and computer vision.
Transformers underpin pretrained systems such as BERT and GPT. The Transformer architecture implements an encoder-decoder structure without recurrence or convolutions. Given the importance of Transformers in machine learning and AI, we have listed five books to help you understand the sequence transduction model better.
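The attention mechanism the paper is built on can be sketched in a few lines. Below is a minimal NumPy illustration of scaled dot-product attention, softmax(QK^T / sqrt(d_k))V; the toy shapes and random inputs are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core Transformer operation: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Row-wise softmax (shifted by the max for numerical stability)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Toy example: 3 tokens with 4-dimensional query/key/value vectors
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out, w = scaled_dot_product_attention(Q, K, V)
```

Each output row is a weighted mix of the value vectors, with the weights in each row of `w` summing to one.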
Transformers for Natural Language Processing
Author: Denis Rothman
The book is divided into three parts:
- Introduction to Transformer architectures, from the original Transformer to the RoBERTa, BERT, and DistilBERT models, covering training methods for smaller Transformers that can outperform GPT-3 in some cases.
- Application of Transformers for Natural Language Generation (NLG) and Natural Language Understanding (NLU).
- Language understanding techniques for optimising social network datasets and identifying fake news.
The book helps readers understand Transformers from a cognitive science perspective. It also teaches how to apply pretrained Transformer models to various datasets and how to build innovative deep neural network architectures for NLP with Python, PyTorch, TensorFlow, BERT, RoBERTa, and more.
Mastering Transformers: Build state-of-the-art models from scratch with advanced natural language processing techniques
Authors: Savaş Yıldırım and Meysam Asgari-Chenaghlu
The book explores NLP solutions with the Transformers library and shows how to train a language model in any language using any Transformer architecture. It also explains how to fine-tune pretrained language models for several downstream tasks, and how to pick the right framework for the training, evaluation, and production of an end-to-end solution. The book takes a problem-solving approach to learning about Transformers and implementing the relevant methodologies.
Natural Language Processing with Transformers: Building Language Applications with Hugging Face
Authors: Lewis Tunstall, Leandro von Werra and Thomas Wolf
The authors employ a hands-on approach to teach how Transformers work and how to integrate them into applications. The book shows how to build, debug, and optimise Transformer models for basic NLP tasks such as text classification, named entity recognition, and question answering. It also covers how Transformers can be used for cross-lingual transfer learning, making it ideal for developers who want to apply Transformers in projects where labelled data is scarce. Finally, the book explains how to train Transformers from scratch and scale them across distributed environments with multiple GPUs.
Learning Deep Learning: Theory and Practice of Neural Networks, Computer Vision, Natural Language Processing, and Transformers Using TensorFlow
Author: Magnus Ekman
In the book, the author demonstrates how to build advanced architectures, including the Transformer. It details how to build modern networks for computer vision and NLP, including Mask R-CNN, GPT, and BERT. Ekman provides well-annotated code examples using TensorFlow with Keras, and covers the Python libraries used for deep learning in industry and academia. The book also helps readers master core concepts such as perceptrons, sigmoid neurons, gradient-based learning, and backpropagation.
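The core concepts the book covers can be illustrated compactly. Below is a hedged NumPy sketch, not taken from the book: a single sigmoid neuron trained by gradient descent to approximate logical OR, showing the forward pass and the gradient update in their simplest form (the toy dataset and learning rate are assumptions for illustration).

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy dataset: inputs and targets for logical OR
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 1.0])

w, b, lr = np.zeros(2), 0.0, 1.0
for _ in range(2000):
    p = sigmoid(X @ w + b)           # forward pass
    grad = p - y                     # gradient of cross-entropy w.r.t. the logit
    w -= lr * (X.T @ grad) / len(X)  # gradient step on the weights
    b -= lr * grad.mean()            # gradient step on the bias
pred = sigmoid(X @ w + b)
```

After training, the neuron's predictions round to the correct OR outputs, which is the essence of gradient-based learning before it is scaled up to deep networks.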
Transformers for Machine Learning: A Deep Dive
Authors: Uday Kamath, Kenneth L Graham and Wael Emara
The book comprehensively covers more than 60 Transformer architectures. It teaches how to apply Transformer techniques to speech, text, computer vision, and time series. The authors provide practical tips and tricks for each architecture and show how to deploy them in the real world. The book serves as a single entry point to Transformers and is ideal for postgraduate students and researchers. It also includes hands-on case studies and code.