Listen to this story

In our life, we all have been in queues either for buying groceries from your nearest superstore or withdrawing money from an ATM. If you are a rational person then a question must have arrived: how do these organizations calculate how many cash counters / ATM machines should be sufficient for efficient processing of customers? The answer comes from the famous mathematical concept “The Queuing Theory”. In this article, we will be discussing the mathematical formulation of queuing theory with indepth functioning and the application of this concept in real life. Along with these, we will try to understand its applicability to machine learning. Following are the topics to be covered in this article.
Table of contents
 A gentle introduction to Queuing theory
 The mathematical functioning of Queuing theory
 Queuing models
 Applications of Queuing theory
Let’s start with an introduction to the concept of queuing theory which will give a highlevel overview of this concept.
A gentle introduction to Queuing theory
The Queuing Theory focuses on understanding how lines, or queues, work and how to increase their efficiency. There are certain aspects of this study, including how people behave when they have to wait to make a purchase or receive a service, what kind of queue organization moves people through lines effectively, and also determining how many people can be processed through the queue within a given period.
A queuing theory is essentially a tool for cost analysis. Most businesses could not operate so that none of their customers ever had to wait in line, either because it would be prohibitively expensive or because they do not have that many clients.
(GIF link)
The above gif is the perfect example of a failure of the system. As a result, businesses use queuing theory to set up their operations in a way that strikes a balance between the cost of serving customers and the inconvenience of having to wait in line. The queuing theory has four major factors that help to derive the results:
 The method by which customers arrive at the queue.
 The nature of the queue means in which way the queue is operating.
 The method of providing the service.
 The method of departing from the queue location.
Let’s understand the mathematical formulation needed to get the above answers.
Are you looking for a complete repository of Python libraries used in data science, check out here.
The mathematical functioning of Queuing theory
The queuing theory uses the knowledge of probability theory to calculate the different stages of the process. It uses mainly the Exponential and Poisson probability distribution. Let’s have a look at how these probabilities are utilized by queuing theory.
Exponential and Poisson Probability Distributions
In queuing theory, the exponential distribution with a rate parameter and exponential random variable is calculated for an interarrival time in a Poisson point process. This distribution can be used to model customer arrival times or service times for several reasons.
 The exponential function is a strictly decreasing function of the random variable, which means the time between arrivals is more likely to be small than large after an arrival occurs.
 There is a property known as the “nomemory” in exponential distribution, meaning that the time until the next arrival does not depend on how much time has already passed. Since the actions of the customers are independent, this makes intuitive sense for a model that measures customer arrivals.
Poisson distribution relates to the Exponential distribution because it is used to determine the probability of a certain number of arrivals occurring in a given time frame.
When there are no arrivals, the Poisson distribution gives an exponential value which is equal to the probability of interarrival time greater than the random variable from the exponential distribution. The odds of zero arrivals in a period correlate with the odds of an interarrival time of a certain length. The interarrival time refers to the period between the arrival of customers.
Sometimes Exponential and Poisson probability distributions do not clearly explain the distribution of variables in that case Erlang distribution is used.
When the exponential distribution is generalized it forms the Erlang distribution. A random variable, referred to as the Erlang random variable, is a measure of the time interval between a specific event and an independent exponential random variable for the following event. This generalization is achieved by introducing two new parameters to the exponential distribution.
 The shape parameter generalizes the shape of the distribution denoted by ‘k’
 The scale parameter is the reciprocal of the rate parameter of the exponential distribution.
But where are we going to use these probability distributions? The answer is in calculating the input and output process for a queue.
The Input process
The Input Process calculates the customers’ arrival by initializing the time for every customer. This time is then used to further calculate the random independent variable for the exponential function. It is assumed that no more than one arrival can occur at a given instant. If more than one arrival can occur at a given instant, bulk arrivals are allowed.
A finite source model draws its arrivals from the population and a bulking model is one that has customers arrive but not enters. The nonmemory function of exponential distribution comes into play as the action of each customer is independent of the other.
The output process
The Service Process describes the output process of a queuing system. To determine a customer’s service time, we specify a probability distribution. There are two arrangements of servers, parallel and series.
 In parallel, if all servers provide the same type of service, then a customer needs only pass through one server to receive service.
 In series, a customer must pass through several servers to receive service.
In this process, the exponential distribution does not explain the service times accurately as a service that requires many different phases of service, it requires a much more flexible probability distribution that’s the reason Erlang distribution is used in this process.
Queuing disciplines
The queue discipline describes how arrivals are taken for service. It is common for a new arrival to be placed at the end of the queue and are not served until all of the arrivals have been dealt with. There are three basic queuing disciplines:
 FirstComeFirstServe discipline(FCFS) discipline
 LastComeFirstServed (LCFS) discipline
 Service in random order (SIRO) discipline
Depending on the chosen discipline, customers’ waiting times are likely to vary significantly. For instance, no one wants to arrive early at an LCFS discipline. In general, the discipline has no impact on the queue itself, since arrivals are constantly receiving service regardless of the discipline. The queuing depends on two factors.
 The traffic intensity is expressed as the mean number of arrivals per unit time taken as the mean service time. This offered load is a measure of what the customers want.
 Based on traffic intensity, the number of servers per load is computed which is called as utilization factor
KendallLee notation
As describing all of the queue characteristics becomes very wordy, a much simpler notation is used called KendallLee notation.
KendallLee notation gives a total of six abbreviations for characteristics written with a slash.
 Based on the probability distributions of the first and second characteristics, the first and second characteristics describe arrival and service processes. For the first and second characteristics,
 M represents an exponential distribution
 E represents an Erlang distribution
 G represents a general distribution
 The third characteristic specifies the number of servers working simultaneously, also referred to as parallel servers.
 The fourth describes the queue discipline according to its given acronym.
 The fifth describes the maximum number of customers that can be accommodated in the system.
 The sixth is the number of customers from which the system can draw.
For example, in a bank, queuing model is represented as M/M/5/F CF S/20/, which can be decoded as exponential arrival times, exponential service times, an FCFS queue discipline and a total capacity of 20 customers, and an infinite population pool to draw from.
Let’s talk about different Queuing models that have been utilized.
Queuing models
We can now analyze different models based on prior knowledge of the functionality of queuing theory.
The M/M/1/GD/∞/∞ Queuing System
An M/M/1/GD/∞/∞ system has exponential interarrival times, exponential service times, and one server. This system can be modelled as a birthdeath process.
A process wherein the system’s state at any interval time is a positive integer. There are two important probabilities in this process, birth probability and death probability.
 The birth probability is the probability of an arrival occurring over a while
 Similarly, the probability that completion of service occurs over a while is the death probability.
Therefore, births and deaths have similar meanings with arrivals and completion of service respectively. A birth increases the state by one, whereas a death decreases the state by one. Therefore, these two probabilities must be independent of each other.
The M/M/1/GD/c/∞ Queuing System
This queuing system has exponential interarrival and service times, with two different exponential rates λ and µ respectively. The system works very similarly to the birthdeath system, except that when a ‘c’ customer is present in the system, additional arrivals are excluded from entering, and are thereafter no longer counted.
As an example, a typical customer would not be bothered waiting in line at a fastfood restaurant if the lines are too long. Instead, the customer would visit another restaurant.
The M/M/s/GD/∞/∞ Queuing System
This queuing system has exponential interarrival and service times, with two different exponential rates λ and µ respectively. In this system, there are ‘s’ servers willing to serve a single line of customers, like perhaps one would find in a bank. If the number of customers present in the system is less than the number of servers, then every customer is being served. If the number of customers is greater than the number of servers in the system, then ‘s’ customers are being served and the remaining customers are waiting in line.
The M/G/∞/GD/∞/∞ and GI/G/∞/GD/∞/∞ Queuing Systems
These systems are unique in that they have an infinite number of servers, which means that a customer will never have to wait in a queue for their service to begin. Think of this as selfservice, similar to online shopping.
The M/G/s/GD/s/∞ Queuing System
In this system where a customer arrives and sees that all of the servers are busy, as a result, the customer leaves the system completely without receiving any services. In this case, no queue is created, and we say that the blocked customers have been cleared.
Applications of Queuing theory
Queuing theory is utilized in various industries which we are part of in a day to day life. Let’s have a look at a few popular use cases through which we can understand its applicability in machine learning.
A. Queuing theory applications in waiting time prediction
 Queuing theory in Customer service
In Customer service, the M/M/S queuing model is utilized. In this model, there are interarrival times, exponential service times and servers. Here the customers are referred to as callers, servers are telephone operators. These queues consist of callers that are waiting for the state to be served by system resources.
 Queuing theory at ATM
In an ATM, M/M/1 queuing model is used in which interarrival times, exponential service and a single server are present to serve the customers. Here the service time and arrival time are exponentially distributed. Here the single queue and single server are used in the model.
 Queuing theory in library management
The purpose of queuing in libraries is to facilitate the circulation of books, counter services and allied services. The basic tasks in the library are stack maintenance, membership management, selection of library materials, and planning the acquisition of materials.
B. Queuing theory applications in dynamic scheduling
 Queuing theory in Processors
The processor schedules the task based on their prioritization. For example, 10 different applications are running on your smartphone and they are all queued accordingly but the processor will decide which app to run in the background so that your usage of RAM does not get affected. In this, the processor is using the prioritybased queuing discipline to queue the background applications.
 Queuing theory in the Computer system
The arrival of jobs in a computer system behaves as the arrival process and the execution time is the Poisson process. Jobs are executed in the order they arrive, therefore FCFS is used as a queueing discipline of arrival. If the computer is busy when the job arrives, it is placed in a queue.
Along with the abovementioned usecases, this theory also has applications in the following tasks:
 Improving Employee Efficiency
 Improving Service Quality
 Improving Storewide Sales
 Increasing Customer Loyalty
 Streamlining Communication
 Reducing Operational Costs
 Scheduling in Hospitals
In the domain of machine learning, the queueing theory is popularly used in waiting time prediction. The above applications can also be considered and modelled with machine learning principles in order to achieve better results.
Final verdict
There is often a form of queue whenever the current demand for a service exceeds the current capacity of that service. This article provides some fundamental concepts of queuing theory and its applications. We also covered the important use cases of queuing theory and understood its applicability to the machine learning domain.