How To Run Python Code Concurrently Using Multithreading

Multithreading in Python enables CPUs to run different parts (threads) of a process concurrently to maximize CPU utilization.

Multithreading enables CPUs to run different parts (threads) of a process concurrently. But what does that mean? Processes can be divided into different parts; take the example of an online multiplayer game. One thread of the game could be responsible for communicating with the servers and another for rendering the graphics. The communication thread requires minimal computation and involves some wait time; the render thread, on the other hand, is computationally intensive with minimal wait time. Multithreading enables the CPU to run the render thread while the communication thread is waiting for a response from the server, increasing CPU utilization.

Note that multithreading is not to be confused with multiprocessing. Modern CPUs have multiple cores; multiprocessing uses these cores to run processes in parallel. Multithreading, however, aims to maximize the utilization of each of these cores by running multiple threads concurrently. Multithreading is useful when the task has IO or network operations that involve waiting; multiprocessing makes the computation-intensive tasks of a process faster. Continuing the online game example, the render threads of most games run in parallel on a GPU with thousands of cores, each thread rendering different aspects of the game, while the communication and IO threads run concurrently on the CPU.
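To make the point about waiting concrete, here is a minimal sketch that simulates an IO wait with time.sleep(). The names fake_io_task, run_sequential and run_threaded, as well as the 0.2-second delay, are made up for illustration; no real network or disk access is involved. Because a sleeping thread does not occupy the CPU, the waits overlap in the threaded version, so its total time should be close to one wait rather than the sum of all of them.

```python
import threading
import time

def fake_io_task(delay, results, index):
    # Hypothetical stand-in for a network or disk wait.
    time.sleep(delay)
    results[index] = delay

def run_sequential(delays):
    # Run the tasks one after another and return (elapsed seconds, results).
    results = [None] * len(delays)
    start = time.perf_counter()
    for index, delay in enumerate(delays):
        fake_io_task(delay, results, index)
    return time.perf_counter() - start, results

def run_threaded(delays):
    # Run each task in its own thread and return (elapsed seconds, results).
    results = [None] * len(delays)
    start = time.perf_counter()
    threads = [
        threading.Thread(target=fake_io_task, args=(delay, results, index))
        for index, delay in enumerate(delays)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start, results

if __name__ == "__main__":
    delays = [0.2] * 4
    seq_time, _ = run_sequential(delays)
    thr_time, _ = run_threaded(delays)
    print(f'Sequential: {seq_time:.2f} s, threaded: {thr_time:.2f} s')
```

The exact timings depend on the machine, so no output is shown here; the sequential run should take roughly four times as long as the threaded one.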

Multithreading in Python

The threading module is part of the Python standard library, so there is nothing to install. By default, your Python programs have a single thread, called the main thread. You can create threads by passing a function to the Thread() constructor or by subclassing Thread and overriding the run() method.

Using the Thread() Constructor

 import threading
 import time

 def useless_function(seconds):
     print(f'Waiting for {seconds} second(s)', end = "\n")
     time.sleep(seconds)
     print(f'Done Waiting {seconds}  second(s)')

 start = time.perf_counter()
 t = threading.Thread(target=useless_function, args=[1])
 t.start()
 print(f'Active Threads: {threading.active_count()}')
 t.join()
 end = time.perf_counter()
 print(f'Finished in {round(end-start, 2)} second(s)') 
-----------------------------Output-----------------------------
 Waiting for 1 second(s)
 Active Threads: 2
 Done Waiting 1  second(s)
 Finished in 1.0 second(s) 

The two active threads are the main thread and the useless_function thread that you just created. The join() method blocks the execution flow until the thread t terminates. If you were to remove the join() call, the main thread would run to its last statement without waiting for t, and the output would be something like:

-----------------------------Output-----------------------------
 Waiting for 1 second(s)
 Active Threads: 2
 Finished in 0.0 second(s) 
 Done Waiting 1  second(s)

Although active_count() is called after the thread t is started, it returns before t finishes. This happens because the processor runs the main thread while the thread t is sleeping. If you were to add a delay of 1 second before the call, it would typically run after t has printed its final message. Since both waits last one second, t may not have fully terminated at that instant and can still be counted as active.

 t = threading.Thread(target=useless_function, args=[1])
 start = time.perf_counter()
 t.start()
 time.sleep(1)
 print(f'Active Threads: {threading.active_count()}')
 end = time.perf_counter()
 print(f'Finished in {round(end-start, 2)} second(s)') 
-----------------------------Output-----------------------------
 Waiting for 1 second(s)
 Done Waiting 1  second(s)
 Active Threads: 2
 Finished in 1.0 second(s) 
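The behaviour of join() described above can also be observed without relying on timing. The following is a minimal sketch, not part of the original example: the function wait_for_signal and the variable names are hypothetical, and a threading.Event is used so that the thread stays alive until the main thread explicitly releases it.

```python
import threading

def wait_for_signal(signal):
    # Block this thread until the main thread sets the event.
    signal.wait()

signal = threading.Event()
t = threading.Thread(target=wait_for_signal, args=(signal,))
t.start()

# The thread is blocked on signal.wait(), so it is still alive here.
alive_before = t.is_alive()

# join() with a timeout returns once the timeout expires, even if t is still running.
t.join(timeout=0.1)
alive_after_timeout = t.is_alive()

# Release the thread, then wait for it to terminate.
signal.set()
t.join()
alive_after = t.is_alive()

print(alive_before, alive_after_timeout, alive_after)  # True True False
```

Calling join() without a timeout only returns once the thread has finished, which is why is_alive() is False on the last line.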

Creating Your Thread Class

A Thread subclass should override only the run() method and, if needed, the __init__() constructor. If the constructor is overridden, the base class constructor, Thread.__init__(self), should be invoked before doing anything else with the thread.

 import time
 from threading import Thread

 def countdown(name, delay, count):
     # Print the thread name, a timestamp and the remaining count once per delay.
     while count:
         time.sleep(delay)
         print(f'{name, time.ctime(time.time()), count}')
         count -= 1

 class newThread(Thread):
     def __init__(self, name, count):
         # Initialise the base class before setting any attributes.
         Thread.__init__(self)
         self.name = name
         self.count = count
     def run(self):
         print("Starting: " + self.name + "\n")
         countdown(self.name, 1, self.count)
         print("Exiting: " + self.name + "\n")

 t = newThread("Thread 1", 5)
 t.start()
 t.join()
 print("Exiting Main Thread") 
-----------------------------Output-----------------------------
 Starting: Thread 1
 ('Thread 1', 'Thu Apr 22 06:29:57 2021', 5) 
 ('Thread 1', 'Thu Apr 22 06:29:58 2021', 4) 
 ('Thread 1', 'Thu Apr 22 06:29:59 2021', 3) 
 ('Thread 1', 'Thu Apr 22 06:30:00 2021', 2) 
 ('Thread 1', 'Thu Apr 22 06:30:01 2021', 1) 
 Exiting: Thread 1
 Exiting Main Thread 
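One practical reason to subclass Thread is that run() cannot return a value to the caller, but it can store one on the instance. The sketch below illustrates this under stated assumptions: the class SquareThread and its result attribute are hypothetical names invented for this example, not part of the threading module.

```python
import threading

class SquareThread(threading.Thread):
    """Hypothetical Thread subclass that keeps its result on the instance."""

    def __init__(self, number):
        # Initialise the base class before setting any attributes.
        super().__init__()
        self.number = number
        self.result = None

    def run(self):
        # run() executes in the new thread; its return value is discarded,
        # so the result is stored as an attribute instead.
        self.result = self.number ** 2

threads = [SquareThread(n) for n in range(5)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# Reading result is safe here because join() has returned for every thread.
results = [thread.result for thread in threads]
print(results)  # [0, 1, 4, 9, 16]
```

The results are read only after join(), so each thread is guaranteed to have finished writing its attribute.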

Now you know how to run code concurrently using multithreading in Python, but why would you want to do so? Let’s illustrate the concurrency aspect of multithreading and the increased CPU utilization with the help of an example. 

Unoptimized Code

 import requests
 import time

 urls = [
     'https://images.pexels.com/photos/305821/pexels-photo-305821.jpeg',
     'https://images.pexels.com/photos/509922/pexels-photo-509922.jpeg',
     'https://images.pexels.com/photos/325812/pexels-photo-325812.jpeg',
     'https://images.pexels.com/photos/1252814/pexels-photo-1252814.jpeg',
     'https://images.pexels.com/photos/1420709/pexels-photo-1420709.jpeg',
     'https://images.pexels.com/photos/963486/pexels-photo-963486.jpeg',
     'https://images.pexels.com/photos/1557183/pexels-photo-1557183.jpeg',
     'https://images.pexels.com/photos/3023211/pexels-photo-3023211.jpeg',
     'https://images.pexels.com/photos/1031641/pexels-photo-1031641.jpeg',
     'https://images.pexels.com/photos/439227/pexels-photo-439227.jpeg',
     'https://images.pexels.com/photos/696644/pexels-photo-696644.jpeg',
     'https://images.pexels.com/photos/911254/pexels-photo-911254.jpeg',
     'https://images.pexels.com/photos/1001990/pexels-photo-1001990.jpeg',
     'https://images.pexels.com/photos/3518623/pexels-photo-3518623.jpeg',
     'https://images.pexels.com/photos/916044/pexels-photo-916044.jpeg'
 ]

 def download(url):
     img_data = requests.get(url).content
     img_name = url.split('/')[4]
     img_name = f'{img_name}.jpg'
     with open(img_name, 'wb') as img_file:
         img_file.write(img_data)
         print(f'downloading {img_name}')

 t1 = time.perf_counter()
 for i in urls:
     download(i)
 t2 = time.perf_counter()
 print(f'Finished in {round(t2-t1, 2)} seconds') 
-----------------------------Output-----------------------------
 downloading 305821.jpg
 downloading 509922.jpg
 downloading 325812.jpg
 downloading 1252814.jpg
 downloading 1420709.jpg
 downloading 963486.jpg
 downloading 1557183.jpg
 downloading 3023211.jpg
 downloading 1031641.jpg
 downloading 439227.jpg
 downloading 696644.jpg
 downloading 911254.jpg
 downloading 1001990.jpg
 downloading 3518623.jpg
 downloading 916044.jpg
 Finished in 5.95 seconds 

Multithreaded Code

 import threading

 start = time.perf_counter()
 threads = []
 for i in urls:
     t = threading.Thread(target=download, args=[i])
     t.start()
     threads.append(t)
 for thread in threads:
     thread.join()
 finish = time.perf_counter()
 print(f'Finished in {round(finish-start, 2)} seconds') 
-----------------------------Output-----------------------------
 downloading 509922.jpg
 downloading 963486.jpg
 downloading 305821.jpg
 downloading 3023211.jpg
 downloading 325812.jpg
 downloading 696644.jpg
 downloading 1557183.jpg
 downloading 1420709.jpg
 downloading 1252814.jpg
 downloading 1001990.jpg
 downloading 911254.jpg
 downloading 1031641.jpg
 downloading 916044.jpg
 downloading 3518623.jpg
 downloading 439227.jpg
 Finished in 1.41 seconds 

The two loops can be replaced with a ThreadPoolExecutor object from concurrent.futures:

 import concurrent.futures
 start = time.perf_counter()
 with concurrent.futures.ThreadPoolExecutor() as executor:
     executor.map(download, urls)
 finish = time.perf_counter()
 print(f'Finished in {round(finish-start, 2)} seconds') 

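The ThreadPoolExecutor object does not create a new thread for every function call. It maintains a pool of worker threads, capped by its max_workers argument, and hands each call to the next free worker. Leaving the with block blocks the main thread's execution until all of the submitted calls have finished.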
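As a minimal sketch of how the executor hands back results, the example below replaces the real download() with a hypothetical fake_download() that only sleeps, so it runs without network access. The function name, the item ids and the delays are assumptions made for illustration. It shows that map() yields results in the order of its inputs, while submit() combined with as_completed() yields them as they finish.

```python
import concurrent.futures
import time

def fake_download(item_id):
    # Hypothetical stand-in for download(): larger ids finish sooner,
    # so completion order differs from submission order.
    time.sleep(0.05 * (3 - item_id))
    return f'{item_id}.jpg'

item_ids = [0, 1, 2]

with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    # map() returns results in the order of item_ids, not completion order.
    mapped = list(executor.map(fake_download, item_ids))

with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(fake_download, item_id) for item_id in item_ids]
    # as_completed() yields each future as soon as it has finished.
    completed = [future.result() for future in concurrent.futures.as_completed(futures)]

print(mapped)  # ['0.jpg', '1.jpg', '2.jpg']
print(sorted(completed))  # ['0.jpg', '1.jpg', '2.jpg']
```

Consuming the results, as list() does here, is also how an exception raised inside a worker surfaces in the main thread; if the iterator returned by map() is never read, such an exception goes unnoticed.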

Figure: Sequential execution versus multithreading in Python

In the unoptimized code, the GET requests happen sequentially, and the CPU is idle between the requests. When each GET request happens in its own thread, all of them are executed concurrently, and the CPU alternates between them instead of sitting idle.



Aditya Singh

A machine learning enthusiast with a knack for finding patterns. In my free time, I like to delve into the world of non-fiction books and video essays.