Hands-On Guide To Markov Chain For Text Generation

In this article, we will learn about What are Markov chains? The model Implementation of a predictive text generator using Markov chains.

Markov chains are a very simple and easy way to create statistical models on a random process. They have been used for quite some time now and mostly find applications in the financial industry and for predictive text generation. Markov chains became popular due to the fact that it does not require complex mathematical concepts or advanced statistics to build it. They are a great way to start learning about probabilistic modelling and data science implementations. 

In this article, we will learn about

  • What are Markov chains?
  • The model
  • Implementation of a predictive text generator using Markov chains. 

What are Markov chains?

Markov chains are random determined processes with a finite set of states that move from one state to another. These sets of transitions from state to state are determined by some probability distribution. 

AIM Daily XO

Join our editors every weekday evening as they steer you through the most significant news of the day, introduce you to fresh perspectives, and provide unexpected moments of joy
Your newsletter subscriptions are subject to AIM Privacy Policy and Terms and Conditions.
markov chain


Consider the scenario of performing three activities: sleeping, running and eating ice cream. Each node contains the labels and the arrows determine the probability of that event occurring. In the above example, the probability of running after sleeping is 60% whereas sleeping after running is just 10%.

Download our Mobile App

The important feature to keep in mind here is that the next state is entirely dependent on the previous state. The next state is determined on a probabilistic basis. Hence Markov chains are called memoryless. Since they are memoryless these chains are unable to generate sequences that contain some underlying trend. They simply lack the ability to produce content that depends on the context since they cannot take into account the full chain of prior states.

The model 

A Markov chain typically consists of two entities: A transition matrix and an initial state vector. 

As we saw above, the next state in the chain depends on the probability distribution of the previous state. These probabilities are represented in the form of a transition matrix. The transition matrix for the earlier example would look like this.

markov chain

Where S is for sleep, R is for run and I stands for ice cream.

 If the Markov chain has M possible states, the transition matrix would be M x M, such that entry (I, J) is the probability of transitioning from the state I to state J.The rows of the transition matrix should add up to 1 because they are probability distribution and each state will have its own probability. 

The second entity is an initial state vector which is an Mx1 matrix. This matrix describes the probability distribution of M possible values. The entry I mean the probability beginning at the state I. 

We know how to obtain the transitions from one state to another, but we need to be able to find the chances of that transition occurring over multiple steps. To do this, we need to determine the probability of moving from the state I to J over N iterations. Since the transition matrix is given, this can be calculated by raising N to the power of M. For small values of N, this can easily be done with repeated multiplication. 

Implementation of a text generator with Markov chain

Upon understanding the working of the Markov chain, we know that this is a random distribution model. We will use this concept to generate text. I will implement it both using Python code and built-in functions. Let’s get started. 

Coding from scratch

This model is a very simple single-function model. The dataset used for this can be download from this link. Once we have downloaded the data be sure to read the content of the entire dataset once. 

Now we will write a function that performs the text generations. 

import random
def markov_text():
    data_sample = "text.txt"
    text_data = open(data_sample, 'r').read()
    text_data = ''.join([i for i in text_data if not i.isdigit()]).replace("\n", " ").split(' ')
    index = 1
    markov_gen = {}
    word_count = int(input('select the number of words you want to generate'))

Here we have opened our file and written all the sentences into new lines. We will create a dictionary of words in the markov_gen variable based on the number of words you want to generate. 

for character in text_data[index:]:
        key = text_data[index-1]
        if key in markov_gen:
            markov_gen[key] = [character]
        index += 1

Next, we analyse each word in the data file and generate key-value pairs. 

    character1 = random.choice(list(markov_gen.keys()))
    message = character1.capitalize()
    while len(message.split(' ')) < word_count:
        character2 = random.choice(markov_gen[character1])
        character1 = character2
        message += ' ' + character2
    return message

Finally, we will create a range of random choice of words from our dictionary and display the output on the screen. I will give the word count to be 20. 

text generation, Markov Chain

At first glance, this may look like something an actual human being says or types. But looking closely you will notice that it is just a random set of words together. Also, note that this sentence does not appear in the original text file and is generated by our model. 

Using the built-in method

To make the implementation of Markov chains easy, you can make use of the built-in package known as markovify. 

To install this use the following command 

pip install markovify

We will implement this for the same dataset used above. 

import markovify
with open("/content/gdrive/My Drive/text.txt") as f:
    data = f.read()

Next, you can choose how many sentences you want to generate by assigning the sentence count in the for-loop. I have generated 3 sentences here. 

data_model = markovify.Text(data)
for i in range(3):

The Text method is for the generation of random sentences from our data. 

Markov Chain

Again, these sentences are only random. Another option with this package is to choose how many characters should be in the sentences. 

for i in range(3):

Here, it prints 3 sentences with a maximum of 280 characters.

text generation, Markov Chain

Every time the program is run a new output is generated because Markov models are memoryless. We have successfully built a Markov chain text generator using custom and built-in codes. 


Markov chains are a very simple and easy way to generate text that mimics humans to some extent. But, for effectively generate text, the text corpus needs to be filled with documents that are similar. Simple Markov chains are the building blocks of other, more sophisticated, modelling techniques. These models can be powerful tools for NLP and deep learning as well. 

Sign up for The Deep Learning Podcast

by Vijayalakshmi Anandan

The Deep Learning Curve is a technology-based podcast hosted by Vijayalakshmi Anandan - Video Presenter and Podcaster at Analytics India Magazine. This podcast is the narrator's journey of curiosity and discovery in the world of technology.

Bhoomika Madhukar
I am an aspiring data scientist with a passion for teaching. I am a computer science graduate from Dayananda Sagar Institute. I have experience in building models in deep learning and reinforcement learning. My goal is to use AI in the field of education to make learning meaningful for everyone.

Our Upcoming Events

24th Mar, 2023 | Webinar
Women-in-Tech: Are you ready for the Techade

27-28th Apr, 2023 I Bangalore
Data Engineering Summit (DES) 2023

23 Jun, 2023 | Bangalore
MachineCon India 2023 [AI100 Awards]

21 Jul, 2023 | New York
MachineCon USA 2023 [AI100 Awards]

3 Ways to Join our Community

Telegram group

Discover special offers, top stories, upcoming events, and more.

Discord Server

Stay Connected with a larger ecosystem of data science and ML Professionals

Subscribe to our Daily newsletter

Get our daily awesome stories & videos in your inbox

Council Post: Evolution of Data Science: Skillset, Toolset, and Mindset

In my opinion, there will be considerable disorder and disarray in the near future concerning the emerging fields of data and analytics. The proliferation of platforms such as ChatGPT or Bard has generated a lot of buzz. While some users are enthusiastic about the potential benefits of generative AI and its extensive use in business and daily life, others have raised concerns regarding the accuracy, ethics, and related issues.