OpenAI’s GPT-2 (Generative Pre-trained Transformer 2) is a large language model that can generate remarkably human-like text. It is a general-purpose model, yet OpenAI reported that it matched or outperformed models trained for specific tasks on several language-modelling benchmarks without any task-specific training.
Recently, OpenAI open-sourced the complete model, with about 1.5 billion parameters, after creating a buzz over concerns that the technology could be misused. In this article, we will get a glimpse of GPT-2’s ability to generate text.
In this tutorial, we will set up a Docker container with GPT-2 and generate sample texts from the pre-trained model.
Setting Up Docker
Installing Docker is pretty straightforward. Head to https://hub.docker.com/, sign up with a Docker ID, and sign in.
Click on the Get started with Docker Desktop button.
Click to download the right version for your operating system.
Once the file is downloaded, open it to install Docker Desktop. Follow the standard installation procedure for your operating system and preferences. After a successful installation, the Docker icon will appear on your taskbar.
You can click on the icon to set your Docker preferences and to update it.
If you see a green dot that says Docker Desktop is running, we are all set to fire up containers.
Also, execute the following command in the terminal or command prompt to confirm that Docker is installed correctly:
docker --version
If everything is fine, it should return the installed version of Docker.
Docker version 19.03.4, build 9013bf5
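If you would rather run this check from a script, the version line can be parsed with a few lines of Python. This is only a minimal sketch: the helper names are made up for illustration, and it assumes the output keeps the "Docker version X, build Y" shape shown above.

```python
import re
import subprocess


def parse_docker_version(output):
    """Extract (version, build) from the text printed by `docker --version`."""
    match = re.search(r"Docker version ([\w.\-+]+), build (\w+)", output)
    if match is None:
        raise ValueError("unexpected output: %r" % output)
    return match.group(1), match.group(2)


def installed_docker_version():
    """Run `docker --version` and parse it (requires Docker on the PATH)."""
    result = subprocess.run(
        ["docker", "--version"], capture_output=True, text=True, check=True
    )
    return parse_docker_version(result.stdout)


if __name__ == "__main__":
    # Parse the sample line from this article instead of calling Docker.
    print(parse_docker_version("Docker version 19.03.4, build 9013bf5"))
```

Calling installed_docker_version() raises an error if Docker is missing or the command fails, which is usually what you want before trying to build images.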
Generating Text With GPT-2
We will follow the steps below to generate text using GPT-2:
- Building a Docker image
- Downloading the pre-trained models
- Running the container
- Running the text generator
Building A Docker Image
We will use a neatly packaged repository for GPT-2: https://github.com/nshepperd/gpt-2.
Open your terminal and clone or download the above repository into a directory on your local system, say /Users/user_name/Documents/GPT-2.
Create an empty file called dockerfile.gpt in that directory, copy the following commands into it, and save the file. Alternatively, edit the existing Dockerfile.cpu to have the following content.
Note: If you want to build a Docker image that includes the GPT-2 models by default, you can use the default Dockerfile.cpu or Dockerfile.gpu file.
We can now build a Docker image using the above Dockerfile. Execute the following command to create a Docker image with all the dependencies for the GPT-2 model.
docker build --tag gpt-2:1.0 -f dockerfile.gpt .
If you edited Dockerfile.cpu instead, run:
docker build --tag gpt-2:1.0 -f Dockerfile.cpu .
Let’s check the list of images for our gpt-2:1.0 image:
docker images
Voila! We have our image in place.
Downloading the Pre-Trained Models
Open the terminal and cd to the cloned repository on your local machine. In the repository, you will find a file called download_model.py. Use one of the following commands to download the GPT-2 model you need; each command fetches a different model size.
python3 download_model.py 124M
python3 download_model.py 355M
python3 download_model.py 774M
python3 download_model.py 1558M
After the download completes, you will find a directory called models containing a sub-directory for each downloaded model.
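Before starting the container, it can be worth checking that a download finished cleanly. The sketch below looks for the files that download_model.py fetches for each model. The file list is written from memory of that script, so treat it as an assumption and compare it with your copy; the helper name is hypothetical.

```python
import os

# Files download_model.py is expected to fetch for each model size.
# Assumption: verify this list against the script in your clone.
EXPECTED_FILES = [
    "checkpoint",
    "encoder.json",
    "hparams.json",
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
    "vocab.bpe",
]


def missing_model_files(model_name, models_dir="models"):
    """Return the expected files that are absent from models_dir/model_name."""
    model_dir = os.path.join(models_dir, model_name)
    return [
        name
        for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(model_dir, name))
    ]


if __name__ == "__main__":
    for size in ["124M", "355M", "774M", "1558M"]:
        missing = missing_model_files(size)
        status = "complete" if not missing else "missing: " + ", ".join(missing)
        print(size, status)
```

Run it from the root of the cloned repository. A model you never downloaded will simply show every file as missing.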
Running The Container
Let’s fire up a container from the image. Execute the following command to start the container with all the downloaded models. Here we mount the cloned repository on our local machine into the container; mounting lets us share folders between the local machine and the Docker container.
Note: Alternatively, you can use the default Dockerfile.cpu to build a Docker image that contains the repository and the models by default. Since that process is a bit time-consuming, I have chosen to mount the models and repository instead.
docker run -it -v /Users/user_name/Documents/GPT-2/gpt-2/:/gpt-2 gpt-2:1.0 bash
The above command will take us inside the container, at a bash prompt.
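The host side of a -v mount needs to be an absolute path, which is easy to get wrong when copying the command between machines. As a rough sketch, the command can be assembled in Python so the path is always expanded first. The helper name and its defaults are illustrative and simply mirror the command above.

```python
import os
import shlex


def docker_run_command(host_repo, image="gpt-2:1.0", container_dir="/gpt-2"):
    """Build the `docker run` argument list with an absolute host path."""
    host_repo = os.path.abspath(os.path.expanduser(host_repo))
    return [
        "docker", "run", "-it",
        "-v", "{}:{}".format(host_repo, container_dir),  # host:container mount
        image, "bash",
    ]


if __name__ == "__main__":
    cmd = docker_run_command("/Users/user_name/Documents/GPT-2/gpt-2")
    # Print a copy-pasteable command line; pass cmd to subprocess.run to execute it.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Files written under /gpt-2 inside the container land in the cloned repository on the host, so the downloaded models only need to be fetched once.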
Running The Text Generator
Inside the Docker container’s working directory, in the src folder, we will find two Python files called generate_unconditional_samples.py and interactive_conditional_samples.py. The first generates random samples from the trained model, whereas the second waits for user input and tries to complete the user’s text.
To generate unconditional samples, execute the following command from the working directory, replacing MODEL_NAME with the name of a downloaded model:
python3 src/generate_unconditional_samples.py --model_name=MODEL_NAME
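All the available flags are given below:
- --model_name : The model used for text generation. It must be one of the models you downloaded: 124M, 355M, 774M or 1558M.
- --seed : Integer seed for the random number generators; fix the seed to reproduce results.
- --nsamples : Number of samples to return. If not specified, samples are generated indefinitely.
- --batch_size : Number of samples generated per batch (only affects speed and memory use).
- --length : Number of tokens in the generated text. If None (the default), it is determined by the model hyperparameters.
- --temperature : Float value controlling randomness in the Boltzmann distribution. Lower values give less random output; higher values give more random output.
- --top_k : Integer value controlling diversity. Only the k most likely tokens are considered at each step, so 1 makes the output deterministic.
- --top_p : Float value controlling diversity. Only the most likely tokens whose cumulative probability reaches p are considered at each step (nucleus sampling).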
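To get a feel for what temperature, top_k and top_p do, here is a small pure-Python sketch of the usual technique. It is not the repository's TensorFlow code. It assumes that top_p takes precedence over top_k when both are set, and that 0 means "no restriction" for both; the function names are illustrative.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def filtered_probs(logits, temperature=1.0, top_k=0, top_p=0.0):
    """Token distribution after temperature scaling and top-k / top-p filtering."""
    scaled = [x / temperature for x in logits]
    # Token indices ordered from most to least likely.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    keep = set(order)
    if top_p > 0.0:
        # Nucleus sampling: keep the smallest prefix whose mass reaches top_p.
        probs = softmax(scaled)
        keep, cumulative = set(), 0.0
        for i in order:
            keep.add(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
    elif top_k > 0:
        # Keep only the k most likely tokens.
        keep = set(order[:top_k])
    masked = [x if i in keep else -math.inf for i, x in enumerate(scaled)]
    return softmax(masked)


def sample_token(logits, rng=random, **options):
    """Draw one token index from the filtered distribution."""
    probs = filtered_probs(logits, **options)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0, -1.0]  # made-up scores for four tokens
    print(filtered_probs(toy_logits, temperature=0.7, top_k=2))
    print(sample_token(toy_logits, top_p=0.9))
```

With top_k=1 the most likely token is always chosen, while lowering the temperature shifts probability towards the top tokens without removing any of them.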
For a detailed description of the flags, enter the following command:
python3 src/generate_unconditional_samples.py --help
Example Output :
To generate conditional samples, execute the following command:
python3 src/interactive_conditional_samples.py --model_name=MODEL_NAME
The flags are similar to those of the unconditional sample generator.
Example Output :
The above examples were generated using the 124M and 355M models respectively. If your machine is capable, you may try the larger models for higher-quality text generation.
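When trying several models and flag combinations, it can be handy to build the generator command in a script rather than retyping it. The helper below is a hypothetical convenience, not part of the repository. It assumes the scripts live under src/ and accept flags in --name=value form, as in the commands above.

```python
def generator_command(script, model_name, **flags):
    """Build the argument list for one of the two sample generators."""
    cmd = ["python3", script, "--model_name={}".format(model_name)]
    for name, value in flags.items():
        if value is not None:  # skip flags left at the script's default
            cmd.append("--{}={}".format(name, value))
    return cmd


if __name__ == "__main__":
    cmd = generator_command(
        "src/generate_unconditional_samples.py",
        "124M",
        nsamples=2,
        top_k=40,
        temperature=0.8,
    )
    # Inside the container, pass cmd to subprocess.run(cmd, check=True).
    print(" ".join(cmd))
```

Fixing seed alongside the other flags makes it easier to compare runs across model sizes.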