Hands-On Guide To Markov Chain For Text Generation

In this article, we will learn about What are Markov chains? The model Implementation of a predictive text generator using Markov chains.


Markov chains are a very simple and easy way to create statistical models on a random process. They have been used for quite some time now and mostly find applications in the financial industry and for predictive text generation. Markov chains became popular due to the fact that it does not require complex mathematical concepts or advanced statistics to build it. They are a great way to start learning about probabilistic modelling and data science implementations. 

In this article, we will learn about


Sign up for your weekly dose of what's up in emerging technology.
  • What are Markov chains?
  • The model
  • Implementation of a predictive text generator using Markov chains. 

What are Markov chains?

Markov chains are random determined processes with a finite set of states that move from one state to another. These sets of transitions from state to state are determined by some probability distribution. 

markov chain


Consider the scenario of performing three activities: sleeping, running and eating ice cream. Each node contains the labels and the arrows determine the probability of that event occurring. In the above example, the probability of running after sleeping is 60% whereas sleeping after running is just 10%.

The important feature to keep in mind here is that the next state is entirely dependent on the previous state. The next state is determined on a probabilistic basis. Hence Markov chains are called memoryless. Since they are memoryless these chains are unable to generate sequences that contain some underlying trend. They simply lack the ability to produce content that depends on the context since they cannot take into account the full chain of prior states.

The model 

A Markov chain typically consists of two entities: A transition matrix and an initial state vector. 

As we saw above, the next state in the chain depends on the probability distribution of the previous state. These probabilities are represented in the form of a transition matrix. The transition matrix for the earlier example would look like this.

markov chain

Where S is for sleep, R is for run and I stands for ice cream.

 If the Markov chain has M possible states, the transition matrix would be M x M, such that entry (I, J) is the probability of transitioning from the state I to state J.The rows of the transition matrix should add up to 1 because they are probability distribution and each state will have its own probability. 

The second entity is an initial state vector which is an Mx1 matrix. This matrix describes the probability distribution of M possible values. The entry I mean the probability beginning at the state I. 

We know how to obtain the transitions from one state to another, but we need to be able to find the chances of that transition occurring over multiple steps. To do this, we need to determine the probability of moving from the state I to J over N iterations. Since the transition matrix is given, this can be calculated by raising N to the power of M. For small values of N, this can easily be done with repeated multiplication. 

Implementation of a text generator with Markov chain

Upon understanding the working of the Markov chain, we know that this is a random distribution model. We will use this concept to generate text. I will implement it both using Python code and built-in functions. Let’s get started. 

Coding from scratch

This model is a very simple single-function model. The dataset used for this can be download from this link. Once we have downloaded the data be sure to read the content of the entire dataset once. 

Now we will write a function that performs the text generations. 

import random
def markov_text():
    data_sample = "text.txt"
    text_data = open(data_sample, 'r').read()
    text_data = ''.join([i for i in text_data if not i.isdigit()]).replace("\n", " ").split(' ')
    index = 1
    markov_gen = {}
    word_count = int(input('select the number of words you want to generate'))

Here we have opened our file and written all the sentences into new lines. We will create a dictionary of words in the markov_gen variable based on the number of words you want to generate. 

for character in text_data[index:]:
        key = text_data[index-1]
        if key in markov_gen:
            markov_gen[key] = [character]
        index += 1

Next, we analyse each word in the data file and generate key-value pairs. 

    character1 = random.choice(list(markov_gen.keys()))
    message = character1.capitalize()
    while len(message.split(' ')) < word_count:
        character2 = random.choice(markov_gen[character1])
        character1 = character2
        message += ' ' + character2
    return message

Finally, we will create a range of random choice of words from our dictionary and display the output on the screen. I will give the word count to be 20. 

text generation, Markov Chain

At first glance, this may look like something an actual human being says or types. But looking closely you will notice that it is just a random set of words together. Also, note that this sentence does not appear in the original text file and is generated by our model. 

Using the built-in method

To make the implementation of Markov chains easy, you can make use of the built-in package known as markovify. 

To install this use the following command 

pip install markovify

We will implement this for the same dataset used above. 

import markovify
with open("/content/gdrive/My Drive/text.txt") as f:
    data = f.read()

Next, you can choose how many sentences you want to generate by assigning the sentence count in the for-loop. I have generated 3 sentences here. 

data_model = markovify.Text(data)
for i in range(3):

The Text method is for the generation of random sentences from our data. 

Markov Chain

Again, these sentences are only random. Another option with this package is to choose how many characters should be in the sentences. 

for i in range(3):

Here, it prints 3 sentences with a maximum of 280 characters.

text generation, Markov Chain

Every time the program is run a new output is generated because Markov models are memoryless. We have successfully built a Markov chain text generator using custom and built-in codes. 


Markov chains are a very simple and easy way to generate text that mimics humans to some extent. But, for effectively generate text, the text corpus needs to be filled with documents that are similar. Simple Markov chains are the building blocks of other, more sophisticated, modelling techniques. These models can be powerful tools for NLP and deep learning as well. 

More Great AIM Stories

Bhoomika Madhukar
I am an aspiring data scientist with a passion for teaching. I am a computer science graduate from Dayananda Sagar Institute. I have experience in building models in deep learning and reinforcement learning. My goal is to use AI in the field of education to make learning meaningful for everyone.

Our Upcoming Events

Conference, in-person (Bangalore)
MachineCon 2022
24th Jun

Conference, Virtual
Deep Learning DevCon 2022
30th Jul

Conference, in-person (Bangalore)
Cypher 2022
21-23rd Sep

3 Ways to Join our Community

Discord Server

Stay Connected with a larger ecosystem of data science and ML Professionals

Telegram Channel

Discover special offers, top stories, upcoming events, and more.

Subscribe to our newsletter

Get the latest updates from AIM