While adding multithreading support to a Python script, I found myself thinking again about the difference between multithreading and multiprocessing in the context of Python.
For the uninitiated, Python multithreading uses threads to do parallel processing. This is the most common way to do parallel work in many programming languages. But CPython has the Global Interpreter Lock (GIL), which means that no two Python statements (bytecodes, strictly speaking) can execute at the same time. So this form of parallelization is only helpful if most of your threads are either not actively doing anything (for example, waiting for input) or doing something that happens outside the GIL (for example, launching a subprocess or doing a numpy calculation). Using threads is very lightweight; for example, the threads share memory space.
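To make that concrete, here is a minimal sketch (not code from my script) using the standard concurrent.futures module; the wait_for_io function is just a placeholder for any blocking call that releases the GIL:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def wait_for_io(n):
    # time.sleep releases the GIL, so the five threads overlap.
    time.sleep(1)
    return n

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(wait_for_io, range(5)))
# Roughly 1 second total, not 5, because the threads spend their time waiting.
print(f"{len(results)} tasks in {time.perf_counter() - start:.2f}s")
```

Swap the sleep for a pure-Python loop and the speedup disappears, because only one thread can hold the GIL at a time.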
Python multiprocessing, on the other hand, uses multiple system-level processes; that is, it starts up multiple instances of the Python interpreter. This gets around the GIL limitation, but obviously has more overhead. In addition, communicating between processes is not as easy as reading and writing shared memory.
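A comparable sketch with processes, again not the code from my script; count_to is just a stand-in for any pure-Python CPU-bound function:

```python
import time
from concurrent.futures import ProcessPoolExecutor

def count_to(n):
    # A pure-Python loop holds the GIL, so threads could not run several
    # of these at once, but separate interpreter processes can.
    total = 0
    for i in range(n):
        total += i
    return total

if __name__ == "__main__":          # needed: child processes re-import this module
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=5) as pool:
        # Arguments and results cross the process boundary by pickling,
        # which is part of the extra overhead mentioned above.
        results = list(pool.map(count_to, [5_000_000] * 5))
    print(f"{len(results)} sums in {time.perf_counter() - start:.2f}s")
```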
To illustrate the difference, I wrote two functions. The first is called idle and simply sleeps for two seconds. The second is called busy and computes a large sum. I ran each 15 times using 5 workers, once using threads and once using processes. Then I used matplotlib to visualize the results.
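In outline, the experiment looks something like this (the exact size of the sum inside busy and the bar-chart plotting are a simplified sketch here, not the notebook code):

```python
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

import matplotlib.pyplot as plt

def idle(_):
    start = time.time()
    time.sleep(2)                      # waiting releases the GIL
    return start, time.time()

def busy(_):
    start = time.time()
    sum(range(10_000_000))             # pure-Python CPU work; size chosen arbitrarily here
    return start, time.time()

def run(task, executor_cls, n_tasks=15, n_workers=5):
    """Run task n_tasks times on n_workers and collect (start, end) wall times."""
    with executor_cls(max_workers=n_workers) as pool:
        return list(pool.map(task, range(n_tasks)))

def plot(spans, title):
    """Draw one horizontal bar per task, offset from the earliest start time."""
    t0 = min(start for start, _ in spans)
    plt.figure()
    for i, (start, end) in enumerate(spans):
        plt.barh(i, end - start, left=start - t0)
    plt.xlabel("seconds")
    plt.ylabel("task")
    plt.title(title)
    plt.show()

if __name__ == "__main__":
    for task in (idle, busy):
        for executor in (ThreadPoolExecutor, ProcessPoolExecutor):
            plot(run(task, executor), f"{task.__name__} / {executor.__name__}")
```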
Here are the two idle graphs, which look essentially identical. (Although if you look closely, you can see that the multiprocess version is slightly slower.)
And here are the two busy graphs. The threads are clearly not helping anything.
As is my custom these days, I did the computations in an IPython notebook.