Multithreading in Python [With Coding Examples]
Updated on Dec 20, 2023 | 9 min read | 9.4k views
Once you have acquired a basic knowledge of Python, the next step is to make your code faster and more efficient. Multithreading is one way to achieve that optimization, using "threads". What are these threads, and how do they differ from processes? Let's find out.
import threading

def cuber(n):
    print("Cube: {}".format(n * n * n))

def squarer(n):
    print("Square: {}".format(n * n))

if __name__ == "__main__":
    # create the threads
    t1 = threading.Thread(target=squarer, args=(5,))
    t2 = threading.Thread(target=cuber, args=(5,))

    # start thread t1
    t1.start()
    # start thread t2
    t2.start()

    # wait until t1 has completed
    t1.join()
    # wait until t2 has completed
    t2.join()

    # both threads have completed
    print("Done!")
#Output:
Square: 25
Cube: 125
Done!
Now let’s try to understand the code.
First, we import the threading module, which is responsible for all the thread-related tasks. Inside the main block, we create two threads by creating instances of the Thread class. We pass the target, which is the function to be executed in that thread, and the arguments that need to be passed into that function.
Once the threads are created, we need to start them. That is done by calling the start method on each thread. After that, the main program needs to wait for the threads to finish their processing. We call the join method to make the main program pause until threads t1 and t2 finish their execution.
As we discussed above, threads do not execute in parallel; instead, Python switches from one to another. So there is a critical need for correct synchronization between threads to avoid any unexpected behavior.
Race Condition
Threads within the same process share data and files, which can lead to a "race" for that data between multiple threads. If a piece of data is accessed by multiple threads, each may modify it, and the result won't be what we expect. This is called a race condition.

If two threads have access to the same data, each can read and modify it whenever it is executing. Suppose T1 starts executing and modifies some data while T2 is in sleep/wait mode. Then T1 stops, goes into sleep mode, and hands control over to T2, which has access to the same data. T2 now modifies and overwrites that data, which leads to problems when T1 resumes.

The aim of thread synchronization is to make sure this race condition never occurs, and that the critical section of code is accessed by only one thread at a time.
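The race described above can be reproduced with a minimal sketch. The shared variable `counter` and the function `increment` below are illustrative names, not from the article; the point is that `counter += 1` is a read-modify-write sequence, so updates from one thread can be lost when the interpreter switches to the other mid-operation.

```python
import threading

counter = 0  # shared data, accessed by both threads

def increment(times):
    global counter
    for _ in range(times):
        # Not atomic: read counter, add 1, write it back.
        # A thread switch between the read and the write loses an update.
        counter += 1

t1 = threading.Thread(target=increment, args=(100_000,))
t2 = threading.Thread(target=increment, args=(100_000,))
t1.start()
t2.start()
t1.join()
t2.join()

# Without synchronization, the result may be less than the expected 200000.
print(counter)
```

Depending on the Python version and how the interpreter happens to schedule the threads, you may or may not observe lost updates on a given run, which is exactly what makes race conditions hard to debug.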
To prevent the race condition and its consequences, the threading module offers a Lock class, which behaves like a binary semaphore and helps threads synchronize. A binary semaphore is simply a two-state flag. Think of it as the "Engaged" sign on a telephone booth, which is either "Engaged" (equivalent to 1) or "Not Engaged" (equivalent to 0). Every time a thread reaches a segment of code guarded by a lock, it checks whether the lock is already in the 1 state. If it is, the thread has to wait until the lock returns to 0 before it can proceed.
The Lock class has two primary methods:

acquire(blocking=True): By default, when a thread calls acquire, it blocks until the lock is free, then locks it and returns True. On the other hand, if thread T1 calls acquire with the blocking parameter set to False, T1 won't wait or remain blocked if the critical section is already locked by thread T2. If it finds the lock held, it immediately returns False and moves on. However, if the lock was not held by another thread, it acquires the lock and returns True.
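The difference between the two return values of a non-blocking acquire can be seen directly. This is a minimal sketch using only the standard threading.Lock API:

```python
import threading

lock = threading.Lock()

# The lock is free, so a non-blocking acquire succeeds and returns True.
first = lock.acquire(blocking=False)

# The lock is now held, so a second non-blocking acquire does not wait;
# it fails immediately and returns False.
second = lock.acquire(blocking=False)

print(first, second)  # True False

lock.release()
```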
release(): When the release method is called on a locked lock, it unlocks it. It also checks whether any threads are waiting for the lock to be released; if there are, it allows exactly one of them to acquire it.

However, if the lock is already unlocked, calling release raises a RuntimeError.
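Putting acquire and release together fixes the lost-update race described earlier. The names `counter` and `safe_increment` are illustrative; the try/finally ensures the lock is released even if the critical section raises:

```python
import threading

counter = 0
lock = threading.Lock()

def safe_increment(times):
    global counter
    for _ in range(times):
        lock.acquire()        # block until the lock is free
        try:
            counter += 1      # critical section: one thread at a time
        finally:
            lock.release()    # always release, even on error

threads = [threading.Thread(target=safe_increment, args=(100_000,))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 200000 -- no lost updates
```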
Deadlocks
Another issue that arises when working with locks is the deadlock. Deadlocks occur when locks are never released by threads, for various reasons. Let's consider a simple example where we do the following:
import threading
l = threading.Lock()
# Before the 1st acquire
l.acquire()
# Before the 2nd acquire
l.acquire()
# Now acquired the lock twice
In the above code, we call the acquire method twice but never release the lock after the first acquire. Hence, when Python reaches the second acquire call, it will wait indefinitely, since the previous lock was never released.
These deadlock conditions might creep into your code without you realizing it. Even if you include a release call, your code may fail midway, so the release is never reached and the lock stays locked. One way to overcome this is the with statement, which uses the lock as a context manager. With it, the lock is automatically released once the block finishes, whether processing completed or failed for any reason.
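A short sketch of this guarantee, raising a hypothetical error inside the locked block to show that the lock is still released:

```python
import threading

lock = threading.Lock()

try:
    with lock:
        # The with statement acquired the lock above and guarantees
        # release on exit, even though this block raises.
        raise ValueError("something failed midway")
except ValueError:
    pass

# The lock was released despite the exception, so a fresh non-blocking
# acquire succeeds immediately instead of deadlocking.
reacquired = lock.acquire(blocking=False)
print(reacquired)  # True

lock.release()
```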
As we discussed earlier, multithreading doesn't make code run truly in parallel, so it isn't useful in every application. Its main application is in I/O-bound tasks, where the CPU would otherwise sit idle while waiting for data. Multithreading plays a crucial role here: the CPU's idle time is used for other tasks, which makes it ideal for this kind of optimization.
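The I/O benefit can be demonstrated with a sketch that simulates I/O waits using time.sleep, which, like real I/O, releases the interpreter so other threads can run. The function name `fake_io` and the timings are illustrative:

```python
import threading
import time

def fake_io(seconds):
    # time.sleep stands in for a real I/O wait (network, disk, ...)
    # during which the thread does not hold the interpreter.
    time.sleep(seconds)

start = time.perf_counter()
threads = [threading.Thread(target=fake_io, args=(0.2,)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The five 0.2-second waits overlap, so the total is close to 0.2s
# rather than the 1s that running them one after another would take.
print(f"{elapsed:.2f}s")
```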
If you are curious to learn about data science, check out IIIT-B & upGrad’s Executive PG Program in Data Science which is created for working professionals and offers 10+ case studies & projects, practical hands-on workshops, mentorship with industry experts, 1-on-1 with industry mentors, 400+ hours of learning and job assistance with top firms.