How To Get Started With OpenAI’s GPT-2 For Text Generation

OpenAI’s GPT-2, or Generative Pre-trained Transformer 2, is a large language model that can generate human-like text. It is a general-purpose model, yet OpenAI reported that it outperformed models trained on specific tasks on several language-modelling benchmarks.

Recently, OpenAI open-sourced the complete model with about 1.5 billion parameters, after initially holding it back over concerns about misuse of the technology. In this article, we will get a glimpse of GPT-2’s capability to generate text.

In this tutorial, we will set up a Docker container with GPT-2 and will generate sample texts from the pre-trained model.

Setting Up Docker

Installing Docker is pretty straightforward. Head to https://hub.docker.com/ and sign up with a Docker ID.

Once you are signed in, click on the Get started with Docker Desktop button.

Click to download the right version for your operating system.

Once the file is downloaded, open it to install Docker Desktop. Follow the standard installation procedure for your operating system and preferences. After a successful installation, you will see the Docker icon on your taskbar.

You can click on the icon to set your Docker preferences and to update it.

If you see a green dot that says Docker Desktop is running, we are all set to fire up containers.

Also, execute the following command in the terminal or command prompt to confirm that the installation works:

docker --version

If everything is fine, it should return the installed version of Docker.

Output:

Docker version 19.03.4, build 9013bf5
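The version check only proves that the Docker client is installed. As a small sketch, the helper below (docker_ready is a name made up for this example) also asks the daemon for a reply using docker info, which fails when Docker Desktop is not running.

```shell
# Hypothetical helper: succeeds only if the docker CLI exists and the daemon answers.
docker_ready() {
    command -v docker >/dev/null 2>&1 || { echo "docker CLI not found on PATH"; return 1; }
    docker info >/dev/null 2>&1 || { echo "docker CLI found, but the daemon is not responding"; return 1; }
    echo "docker is ready"
}

# Report the state without aborting the shell session.
docker_ready || true
```

If the second message appears, start Docker Desktop and wait for the green dot before continuing.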

Generating Text With GPT-2

We will follow the steps below to generate text using GPT-2:

  1. Building a Docker image
  2. Downloading the pre-trained models
  3. Running the container
  4. Running the text generator

Building A Docker Image

We will be using a neatly packaged repository for GPT-2: https://github.com/nshepperd/gpt-2.

Open your terminal and clone or download the above repository into a directory on your local system, say /Users/user_name/Documents/GPT-2.

Create an empty file called dockerfile.gpt in the repository directory, copy the following commands into it and save it. Alternatively, edit the existing Dockerfile.cpu to have the same content.
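As a rough sketch of what such a file can look like, the snippet below writes a dockerfile.gpt that installs only the Python dependencies and leaves the repository and models to be mounted at run time. The base image tag and the individual instructions are assumptions modelled on the repository's Dockerfile.cpu with the model download steps left out, so compare them with that file before building.

```shell
# Run from the root of the cloned repository, next to requirements.txt.
cat > dockerfile.gpt <<'EOF'
# Assumed base image: TensorFlow 1.x with Python 3, as in Dockerfile.cpu
FROM tensorflow/tensorflow:1.12.0-py3

ENV LANG=C.UTF-8

# The repository is mounted at /gpt-2 later, so only the dependencies are baked in
WORKDIR /gpt-2
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install -r /tmp/requirements.txt
EOF
```

Because nothing is downloaded during the build, the image stays small and builds quickly; the trade-off is that the container is only useful with the repository mounted.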

Note: If you want to build a Docker image that includes the GPT-2 models by default, you can use the default Dockerfile.cpu or Dockerfile.gpu file.

We can now build a Docker image from this Dockerfile. Execute the following command to create a Docker image with all the dependencies for the GPT-2 model.

docker build --tag gpt-2:1.0 -f dockerfile.gpt .

Or 

docker build --tag gpt-2:1.0 -f Dockerfile.cpu .

Let’s check the list of images for our gpt-2:1.0 image:

docker images

Voila! We have our image in place. 

Downloading the Pre-Trained Models

Open the terminal and cd to the cloned repository on your local machine. In the repository, you will find a file called download_model.py. Use one of the following commands to download the required GPT-2 model. The script imports the requests and tqdm Python packages, so install them first if they are missing.

python3 download_model.py 124M
python3 download_model.py 355M
python3 download_model.py 774M
python3 download_model.py 1558M

After the download is completed, you will be able to find a directory called models consisting of all the downloaded models.
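A quick way to confirm that a download finished is to look for the files that download_model.py fetches for each model. The sketch below assumes the default models directory; check_model is a helper name invented for this example.

```shell
# Hypothetical helper: check_model MODEL_NAME [MODELS_DIR]
# Prints every expected file that is missing and returns 1 if any is.
check_model() {
    model_dir="${2:-models}/$1"
    missing=0
    for f in checkpoint encoder.json hparams.json \
             model.ckpt.data-00000-of-00001 model.ckpt.index \
             model.ckpt.meta vocab.bpe; do
        if [ ! -f "$model_dir/$f" ]; then
            echo "missing: $model_dir/$f"
            missing=1
        fi
    done
    return "$missing"
}

check_model 124M || echo "124M is incomplete; rerun download_model.py 124M"
```

An interrupted download usually shows up here as a missing file, which is easier to read than a TensorFlow error inside the container later on.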

Running The Container

Let’s fire up a container from the image. Execute the following command to start the container with all the downloaded models. Here we mount the cloned repository on our local machine into the container. Mounting lets us share folders between the local machine and the Docker container.

Note: Alternatively, you can use the default Dockerfile.cpu to build a Docker image that contains the repository and the models by default. Since that process is a bit time-consuming, I have chosen to mount the models and repository instead.

docker run -it -v /Users/user_name/Documents/GPT-2/gpt-2/:/gpt-2 gpt-2:1.0 bash
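The -v flag takes the form HOST_PATH:CONTAINER_PATH. As a sketch, assuming the current directory is the cloned repository and the gpt-2:1.0 image has been built, the mount can be checked non-interactively before opening a shell (mount_spec is just a variable name chosen here):

```shell
# Host directory to share; assumes we are in the root of the cloned repository.
repo_dir="$(pwd)"
# Use an absolute host path; Docker reads a bare name as a volume name instead.
mount_spec="$repo_dir:/gpt-2"
echo "mounting $mount_spec"

# List the models directory through the mount, then discard the container (--rm).
if command -v docker >/dev/null 2>&1; then
    docker run --rm -v "$mount_spec" gpt-2:1.0 ls /gpt-2/models \
        || echo "check failed: is the image built and a model downloaded?"
fi
```

If the model folders are listed, the same mount will work with the interactive command.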

The above command will take us to a bash shell inside the container.

Running The Text Generator

Inside the Docker container’s working directory, in the src folder, we will find two Python files called generate_unconditional_samples.py and interactive_conditional_samples.py. The former generates random samples using the trained model, whereas the latter waits for user input and tries to complete the user’s text sequence.

To generate unconditional samples execute the following command:

python3 src/generate_unconditional_samples.py --model_name=MODEL_NAME

All the available flags are given below:

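  • --model_name : The model used for text generation. It can be one of the following :
    • 124M
    • 355M
    • 774M
    • 1558M
  • --seed : Integer seed for the random number generators. Fix the seed to reproduce results.
  • --nsamples : Number of samples to return. If not specified, samples are generated indefinitely.
  • --batch_size : Number of samples generated per batch (only affects speed/memory).
  • --length : Number of tokens in the generated text. If None (default), it is determined by the model hyperparameters.
  • --temperature : Float value controlling randomness in the Boltzmann distribution. Lower values give less random completions, higher values give more random ones.
  • --top_k : Integer value controlling diversity. Only the k most likely tokens are considered at each step; 0 means no restriction.
  • --top_p : Float value controlling diversity. Only the most likely tokens whose cumulative probability reaches top_p are considered at each step (nucleus sampling).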

For a detailed description of the flags, enter the following command:

python3 src/generate_unconditional_samples.py --help
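To show how the flags combine, here is a sketch of one full invocation. The values are illustrative picks rather than recommendations, and the guard assumes the command is run from the repository root inside the container, where the models folder is also found.

```shell
script="src/generate_unconditional_samples.py"

if [ -f "$script" ]; then
    # Two samples of 100 tokens each from the smallest model, with a fixed
    # seed so that a rerun with the same flags should reproduce the text.
    python3 "$script" \
        --model_name=124M \
        --seed=42 \
        --nsamples=2 \
        --length=100 \
        --temperature=0.8 \
        --top_k=40
else
    echo "run this from the repository root inside the container"
fi
```

Lowering --temperature or --top_k makes the samples more conservative; raising them makes the text more varied but also more likely to drift.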

Example Output :

To generate conditional samples execute the following command:

python3 src/interactive_conditional_samples.py --model_name=MODEL_NAME

The flags are similar to those of the unconditional sample generator.

Example Output :

The above examples were generated using the 124M and 355M models respectively. If your machine is capable, you may try the larger models for higher-quality text generation.

Amal Nair

A Computer Science Engineer turned Data Scientist who is passionate about AI and all related technologies. Contact: amal.nair@analyticsindiamag.com
