Multiprocessing vs Multithreading in Python đ
1 July, 2021
To first understand the difference between these two libraries we need to understand the differences between processes and threads and how they interact with a CPU.
Lets take the first principles approachâŠ
How does a CPU work?
On a very fundamental level the CPU takes instructions, in the form of OP code(portion of a machine language instruction that specifies the operation to be performed), depending on the architecture this OP code is managed by the Control Unit.At a higher level of abstraction processes are controlled by the Process Control Block(PCB) its a data structure implemented in the operating system.
CPUs can only execute one command at a time, therefore one process at a time. So don't confuse the term "multiprocessing" in the Multiprocessing Python library with actually running multiple processes on the same core in your CPU...
Processes have their own allocation of memory in main memory(RAM). Threads use the same memory space.
What are processes?
A process is a piece of software running as part of an application.What are threads?
Threads in the operating system level of abstraction aren't to be confused with threads of a processor. A thread is a path of execution within a process. A process can contain multiple threads. For example a dual core CPU can run 4 threads.Whatâs Multithreading?
The multithreading library is lightweight, shares memory, responsible for responsive UI and is used well for I/O bound applications. However, the module isnât killable and is subject to the GIL Threading library in Python Multiple threads live in the same process in the same space, each thread will do a specific task, have its own code, own stack memory, instruction pointer, and share heap memory. If a thread has a memory leak it can damage the other threads and parent process.import threading
def calc_square(number):
print('Square:' , number * number)
def calc_quad(number):
print('Quad:' , number * number * number * number)
if __name__ == "__main__":
number = 7
thread1 = threading.Thread(target=calc_square, args=(number,))
thread2 = threading.Thread(target=calc_quad, args=(number,))
# Will execute both in parallel
thread1.start()
thread2.start()
# Joins threads back to the parent process, which is this
# program
thread1.join()
thread2.join()
# This program reduces the time of execution by running tasks in parallel
Whatâs multiprocessing? The multiprocessing library uses separate memory space, multiple CPU cores, bypasses GIL limitations in CPython, child processes are killable(ex. function calls in program) and is much easier to use. Some caveats of the module are a larger memory footprint and IPCâs a little more complicated with more overhead. Checkout Multiprocessing library in the Python docs
import multiprocessing
def calc_square(number):
print('Square:' , number * number)
result = number * number
print(result)
def calc_quad(number):
print('Quad:' , number * number * number * number)
if __name__ == "__main__":
number = 7
result = None
p1 = multiprocessing.Process(target=calc_square, args=(number,))
p2 = multiprocessing.Process(target=calc_quad, args=(number,))
p1.start()
p2.start()
p1.join()
p2.join()
# Wont print because processes run using their own memory location
print(result)
Executive Summary
The Python threading module uses threads instead of processes. Threads run in the same unique memory heap. Whereas Processes run in separate memory heaps. This, makes sharing information harder with processes and object instances. One problem arises because threads use the same memory heap, multiple threads can write to the same location in the memory heap which is why the default Python interpreter has a thread-safe mechanism, the âGILâ (Global Interpreter Lock). This prevent conflicts between threads, by executing only one statement at a time (serial processing, or single-threading).The Global Interpretor Lock (GIL) in CPython prevents parallel threads of execution on multiple cores, thus the threading implementation on python is useful mostly for concurrent thread implementation in web-servers.
â