Threading vs thread mo

There were several questions on this topic, but I couldn't find answers to mine. Even the Python docs aren't that descriptive.
My problem is simple: I want to break up a huge list into pieces and process each piece in parallel.
So my question is whether the interpreter waits until all threads have finished before it runs the downstream lines of the program (in my case, consolidating the processed list), or whether I have to define the downstream step as a separate thread and use join.
Although I read the post on the topic (Thread vs. Threading), I still couldn't understand the difference between thread and threading.
Please direct me to a good text on the topic. The docs are not very informative.
PS (@zzk)
So even if I use multiprocessing, how do I execute common code after all processes end? For example, 5 processes produce 5 lists, and I then have to merge these lists, sort the result, and write it to a file.
[The code is not exact; it is just for explaining the situation.]
def fun(x,y):
    y=someprocessing(x) #type(y)=List
if __name__ == '__main__':
    for i in listofprocesses:
        p = Process(target=fun, args=(i,y))
        p.start()
    # DOWNSTREAM CODE#
    yy=y1+y2+y3+y4+y5;
    yy.sort()
    for j in yy:
        outfile.write(j)
I want to merge the y lists produced by the different processes.
There are two doubts here:
Since the variable name is the same in every process, do I have to pass the output list (y) as an argument?
Assuming so, and that all the processed lists are saved as y1, y2, y3, y4 and y5, will the downstream code be executed? How can I make sure that all the processes have ended?

Neither threading nor thread will speed up CPU-bound processing, because of the GIL.
In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython's memory management is not thread-safe.
You may need multiprocessing instead.
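To make the multiprocessing suggestion concrete, here is a minimal sketch, assuming the work can be expressed as a function applied to each chunk of the list. The names split_into_chunks, process_chunk and process_in_parallel are hypothetical, and process_chunk merely stands in for someprocessing from the question.

```python
from multiprocessing import Pool


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous pieces of near-equal size."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_chunk(chunk):
    """Hypothetical stand-in for someprocessing(x); returns a new list."""
    return [value * 2 for value in chunk]


def process_in_parallel(items, n_workers=5):
    chunks = split_into_chunks(items, n_workers)
    with Pool(processes=n_workers) as pool:
        # map() blocks until every chunk has been processed and returns the
        # per-chunk results in order, so no explicit join() is needed here.
        results = pool.map(process_chunk, chunks)
    # Downstream code: runs only after all workers have delivered results.
    merged = [value for part in results for value in part]
    merged.sort()
    return merged


if __name__ == "__main__":
    data = list(range(20, 0, -1))
    print(process_in_parallel(data))
```

Pool.map only returns once every chunk has been processed, so the merge and sort that follow it always see all the results. Child processes do not share variables with the parent, which is why each result is returned rather than assigned to a shared y. With raw Process objects you would instead keep them in a list, call join() on each one before the downstream code, and send results back through something like a multiprocessing.Queue.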

Related

How to read a variable without lock in Python threads?

I am using Python threading to do some jobs at the same time. I leave the main thread to perform task_A, and create one thread to perform task_B at the same time. Below is the simplified version of the code I am working on:
import threading
import numpy as np
def task_B(inc):
    for elem in array:
        value = elem + inc
if __name__ == '__main__':
    array = np.random.rand(10)
    # args must be a tuple, hence the trailing comma
    t1 = threading.Thread(target=task_B, args=(1,))
    t1.start()
    # task_A
    array_copy = list()
    for elem in array:
        array_copy.append(elem)
    t1.join()
I know the above code doesn't do anything meaningful; please think of it as a simplified example. As you can see, the variable array is only read in both the main thread and the newly created thread t1. Therefore, there should be no need to lock array in either thread, since neither of them modifies (or writes to) the variable. However, when I timed the code, it seemed that Python threading automatically locks variables that are shared between threads, even when they are read-only. Is there a way to make the threads run simultaneously without locking the read-only variables? I've found this code, but cannot figure out how to apply it to my situation.

You are correct that in this case there is no need for a lock, but the CPython interpreter (which I guess you are using to run your Python code) is not that smart.
Python code always executes while holding the GIL, so the two threads execute exclusively of one another rather than concurrently, although in an interleaved manner (without threads, the execution would be purely sequential).
That's why performance-critical code is often offloaded to other processes (using the multiprocessing library) or written in Cython (here is an example solving a problem similar to yours).
See this question for a few more details on why the GIL is there: Is there a way to release the GIL for pure functions using pure python?
There is hope that in the future (2022+) the GIL may be relaxed, but for now you are stuck with it, so work around it.

Execute Python threads in small groups

I am trying to insert a number (100) of data sets into SQL Server using Python. I am using multithreading to create 100 threads in a loop. All of them start at the same time, and this is bogging down the database. I want to group my threads into sets of 5; once one group is done, I would like to start the next group of threads, and so on. As I am new to Python and multithreading, any help would be highly appreciated. Please find my code below.
for row in datasets:
    argument1=row[0]
    argument2=row[1]
    jobs=[]
    t = Thread(target=insertDataIntoSQLServer, args=(argument1,argument2,))
    jobs.append(t)
    t.start()
for t in jobs:
    t.join()

On Python 2 and 3 you could use a multiprocessing.pool.ThreadPool. This is like a multiprocessing.Pool, but it uses threads instead of processes.
from multiprocessing.pool import ThreadPool
datasets = [(1,2,3), (4,5,6)] # Iterable of datasets.
def insertfn(data):
    pass # shove data to SQL server
pool = ThreadPool()
pool.map(insertfn, datasets)
By default, a Pool will create as many worker threads as your CPU has cores. Using more threads will probably not help, because they will be fighting for CPU time.
Note that I've grouped data into tuples. That is one way to get around the one argument restriction for pool workers.
On Python 3 you can also use a ThreadPoolExecutor.
Note however that on Python implementations (like the "standard" CPython) that have a Global Interpreter Lock, only one thread at a time can be executing Python bytecode. So using large numbers of threads will not automatically increase performance. Threads might help with operations that are I/O bound. If one thread is waiting for I/O, another thread can run.

First note that your code doesn't work as you intended: it sets jobs to an empty list every time through the loop, so after the loop is over you only join() the last thread created.
So repair that, by moving jobs=[] out of the loop. After that, you can get exactly what you asked for by adding this after t.start():
    if len(jobs) == 5:
        for t in jobs:
            t.join()
        jobs = []
I'd personally use some kind of pool (as other answers suggest), but it's easy to directly get what you had in mind.
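Putting both fixes together, a complete version might look roughly like the sketch below. It assumes each row carries two arguments, as in the question; insert_data is a hypothetical stand-in for insertDataIntoSQLServer, and run_in_batches is a name made up for this example.

```python
import threading
import time


def insert_data(arg1, arg2):
    """Hypothetical stand-in for insertDataIntoSQLServer."""
    time.sleep(0.01)  # pretend to talk to the database


def run_in_batches(datasets, worker, batch_size=5):
    jobs = []  # created once, outside the loop
    for row in datasets:
        t = threading.Thread(target=worker, args=(row[0], row[1]))
        jobs.append(t)
        t.start()
        if len(jobs) == batch_size:
            # Wait for the whole group before starting the next one.
            for job in jobs:
                job.join()
            jobs = []
    # Join the last, possibly partial, group.
    for job in jobs:
        job.join()


if __name__ == "__main__":
    run_in_batches([(i, i * 2) for i in range(12)], insert_data)
```

One consequence of this design is that a single slow insert holds up its whole group, which is part of why a pool with a fixed number of workers tends to keep the database busier.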

You can create a ThreadPoolExecutor and specify max_workers=5.
See here.
And you can use functools.partial to turn your functions into the required 0-argument functions.
EDIT: You can pass the args in with the function name when you submit to the executor. Thanks, Roland Smith, for reminding me that partial is a bad idea. There was a better way.
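For illustration, a minimal sketch of that approach could look like this. insert_data is a hypothetical stand-in for insertDataIntoSQLServer (here it just returns its arguments), and insert_all is a helper name invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def insert_data(arg1, arg2):
    """Hypothetical stand-in for insertDataIntoSQLServer."""
    return (arg1, arg2)


def insert_all(datasets, max_workers=5):
    # At most max_workers inserts run at any one time; as soon as one
    # finishes, the executor starts the next queued call.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Extra positional arguments to submit() are passed to the function.
        futures = [executor.submit(insert_data, row[0], row[1]) for row in datasets]
        # result() waits for the call and re-raises any exception it raised.
        return [future.result() for future in futures]


if __name__ == "__main__":
    rows = [(i, i * i) for i in range(100)]
    print(len(insert_all(rows)))
```

Unlike strict groups of 5, the executor keeps 5 inserts in flight continuously, so one slow insert does not stall the other four slots.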

Why do we need locks for threads, if we have GIL?

I believe this is a stupid question, but I still can't find an answer. Actually, it's better to split it into two questions:
1) Am I right that we can have a lot of threads, but because of the GIL only one thread is executing at any given moment?
2) If so, why do we still need locks? We use locks to avoid the case where two threads try to read/write some shared object, but because of the GIL two threads can't execute at the same moment, can they?

The GIL protects the Python internals. That means:
you don't have to worry about something in the interpreter going wrong because of multithreading
most things do not really run in parallel, because Python code is executed sequentially due to the GIL
But the GIL does not protect your own code. For example, if you have this code:
self.some_number += 1
That is going to read the value of self.some_number, calculate some_number+1, and then write the result back to self.some_number.
If you do that in two threads, the operations (read, add, write) of one thread and the other may be mixed, so that the result is wrong.
This could be the order of execution:
thread1 reads self.some_number (0)
thread2 reads self.some_number (0)
thread1 calculates some_number+1 (1)
thread2 calculates some_number+1 (1)
thread1 writes 1 to self.some_number
thread2 writes 1 to self.some_number
You use locks to enforce this order of execution:
thread1 reads self.some_number (0)
thread1 calculates some_number+1 (1)
thread1 writes 1 to self.some_number
thread2 reads self.some_number (1)
thread2 calculates some_number+1 (2)
thread2 writes 2 to self.some_number
EDIT: Let's complete this answer with some code which shows the explained behaviour:
import threading
import time
total = 0
lock = threading.Lock()
def increment_n_times(n):
    global total
    for i in range(n):
        total += 1
def safe_increment_n_times(n):
    global total
    for i in range(n):
        lock.acquire()
        total += 1
        lock.release()
def increment_in_x_threads(x, func, n):
    threads = [threading.Thread(target=func, args=(n,)) for i in range(x)]
    global total
    total = 0
    begin = time.time()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    print('finished in {}s.\ntotal: {}\nexpected: {}\ndifference: {} ({} %)'
          .format(time.time()-begin, total, n*x, n*x-total, 100-total/n/x*100))
There are two functions which implement increment. One uses locks and the other does not.
Function increment_in_x_threads implements parallel execution of the incrementing function in many threads.
Now running this with a big enough number of threads makes it almost certain that an error will occur:
print('unsafe:')
increment_in_x_threads(70, increment_n_times, 100000)
print('\nwith locks:')
increment_in_x_threads(70, safe_increment_n_times, 100000)
In my case, it printed:
unsafe:
finished in 0.9840562343597412s.
total: 4654584
expected: 7000000
difference: 2345416 (33.505942857142855 %)
with locks:
finished in 20.564176082611084s.
total: 7000000
expected: 7000000
difference: 0 (0.0 %)
So without locks, there were many errors (33% of increments failed). On the other hand, with locks it was 20 times slower.
Of course, both numbers are blown up because I used 70 threads, but this shows the general idea.
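As a side note, the same protection is often written with the lock as a context manager, which releases it even if the protected code raises. The sketch below is only an illustration: Counter and hammer are hypothetical names, with some_number borrowed from the example above.

```python
import threading


class Counter:
    def __init__(self):
        self.some_number = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read, add and write all happen while the lock is held,
        # so no other thread can interleave between them.
        with self._lock:
            self.some_number += 1


def hammer(counter, n_threads=8, n_increments=10000):
    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.some_number


if __name__ == "__main__":
    print(hammer(Counter()))
```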

At any moment, yes, only one thread is executing Python code (other threads may be executing some IO, NumPy, whatever). That is mostly true. However, this is trivially true on any single-processor system, and yet people still need locks on single-processor systems.
Take a look at the following code:
queue = []
def do_work():
    while queue:
        item = queue.pop(0)
        process(item)
With one thread, everything is fine. With two threads, you might get an exception from queue.pop() because the other thread called queue.pop() on the last item first. So you would need to handle that somehow. Using a lock is a simple solution. You can also use a proper concurrent queue (like in the queue module)--but if you look inside the queue module, you'll find that the Queue object has a threading.Lock() inside it. So either way you are using locks.
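A rough sketch of the concurrent-queue variant, assuming all items are queued before the workers start. process here is a hypothetical stand-in for the real work, and run is a helper name made up for the example.

```python
import queue
import threading


def process(item):
    """Hypothetical stand-in for the real work."""
    return item * item


def do_work(work_queue, results):
    while True:
        try:
            # Checking for an item and removing it is a single atomic step,
            # so two threads can never both claim the last item.
            item = work_queue.get_nowait()
        except queue.Empty:
            return
        results.put(process(item))


def run(items, n_threads=2):
    items = list(items)
    work_queue = queue.Queue()
    results = queue.Queue()
    for item in items:
        work_queue.put(item)
    threads = [threading.Thread(target=do_work, args=(work_queue, results))
               for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Every item produced exactly one result, so collect that many.
    return sorted(results.get() for _ in items)


if __name__ == "__main__":
    print(run(range(10)))
```

The difference from the list version is that there is no gap between the "is there work?" test and the pop for another thread to slip into.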
It is a common newbie mistake to write multithreaded code without the necessary locks. You look at code and think, "this will work just fine" and then find out many hours later that something truly bizarre has happened because threads weren't synchronized properly.
Or in short, there are many places in a multithreaded program where you need to prevent another thread from modifying a structure until you're done applying some changes. This allows you to maintain the invariants on your data, and if you can't maintain invariants, then it's basically impossible to write code that is correct.
Or put in the shortest way possible, "You don't need locks if you don't care if your code is correct."

The GIL prevents simultaneous execution of multiple threads, but not in all situations.
The GIL is temporarily released during I/O operations executed by threads. That means, multiple threads can run at the same time. That's one reason you still need locks.
I don't know where I found this reference.... in a video or something - hard to look it up, but you can investigate further yourself
UPDATE:
The few thumbs down I got signal to me that people think memory is not a good enough reference, and Google not a good enough database. While I'd disagree with that, let me provide one of the first URLs I looked up (and checked!), so the people who disliked my answer can live happily from now on:
https://wiki.python.org/moin/GlobalInterpreterLock

The GIL does not protect you from modification of the internal state of objects that you access concurrently from different threads, meaning that you can still mess things up if you don't take measures.
So, despite the fact that two threads may not be running at the same exact time, they can still be trying to manipulate the internal state of an object (one at a time, intermittently), and if you don't prevent that from happening (with some locking mechanism) your code could/will eventually fail.
Regards.

Python threading and GIL

I was reading about the GIL, and it never really specified whether it includes the main thread or not (I assume so). The reason I ask is that I have a program with threads set up that modify a dictionary. The main thread adds and deletes entries based on player input, while another thread loops over the data, updating and changing it.
However, in some cases a thread may iterate over the dictionary keys while another could delete them. If there is a so-called GIL and the threads run sequentially, why am I getting dict changed errors? If only one is supposed to run at a time, then technically this should not happen.
Can anyone shed some light on such a thing? Thank you.

They are running at the same time; they just don't execute at the same time. The iterations might be interleaved. Quoting the Python docs:
The mechanism used by the CPython interpreter to assure that only one thread executes Python bytecode at a time.
So two for loops might run at the same time, there will just be no (for example) two del dict[index]'s at the same time.

The GIL locks at a Python byte-code level, and applies to all threads, even the main thread. If you have one thread modifying a dictionary, and another iterating keys, they will interfere with each other.
"Only one runs at a time" is true, but you have to understand the unit of granularity. In the case of CPython's GIL, the granularity is a bytecode instruction, so execution can switch between threads at any bytecode.

The GIL prevents two threads from modifying the interpreter state simultaneously. It doesn't provide any thread consistency constraints, or any kind of mutex at all on a granularity smaller than the whole process. If you need to read and modify a dict in two threads, you should be using a mutex.

Python switches threads more often than you seem to think it does. You say "only one" is supposed to run at a time, and technically that's true, but it depends on your definition of "one." Python's atomic operations are very small. For example: adding a single item to a dictionary. Iteration over an entire dictionary can be interrupted.
You should use a lock object from the threading library to isolate your program's atomic operations.
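As a sketch of what that could look like for the dictionary in the question: every access, including the whole iteration, goes through one lock. PlayerTable and its methods are hypothetical names invented for this example.

```python
import threading


class PlayerTable:
    def __init__(self, names):
        self._lock = threading.Lock()
        self._scores = {name: 0 for name in names}

    def update_all(self):
        # Hold the lock for the entire loop so no key can be added or
        # removed while the dictionary is being iterated.
        with self._lock:
            for name in self._scores:
                self._scores[name] += 1

    def remove(self, name):
        with self._lock:
            self._scores.pop(name, None)

    def snapshot(self):
        # Copy under the lock; the caller can iterate the copy freely.
        with self._lock:
            return dict(self._scores)


if __name__ == "__main__":
    table = PlayerTable(["ann", "bob", "cy"])
    updater = threading.Thread(target=table.update_all)
    updater.start()
    table.remove("bob")
    updater.join()
    print(sorted(table.snapshot()))
```

If holding the lock for a long loop is too costly, iterating over a snapshot taken under the lock is a common compromise, at the price of possibly seeing slightly stale data.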

Multiprocessing in Python with more than 2 levels

I want to write a program that spawns processes like this: process -> n processes -> n processes.
Can the second level spawn processes with multiprocessing? I am using the multiprocessing module of Python 2.6.
Thanks.

@vilalian's answer is correct, but terse. Of course, it's hard to supply more information when your original question was vague.
To expand a little, you'd have your original program spawn its n processes, but they'd be slightly different from the original in that you'd want each of them (if I understand your question) to spawn n more processes. You could accomplish this either by having them run code similar to your original process, but which spawns new sets of programs that perform the task at hand without further processing, or by using the same code/entry point and just providing different arguments, something like:
def main(level):
    if level == 0:
        do_work
    else:
        for i in range(n):
            spawn_process_that_runs_main(level-1)
and start it off with level == 2
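A runnable sketch of that outline, written for Python 3 with multiprocessing.Process, might look like the following. do_work, the fanout parameter and the path tuple are assumptions added for illustration; they are not part of the outline above.

```python
import multiprocessing
import os


def do_work(path):
    """Hypothetical leaf task; reports which branch and process it ran in."""
    return "leaf {} in pid {}".format(path, os.getpid())


def main(level, fanout, path=()):
    if level == 0:
        print(do_work(path))
        return
    # Each child runs the same entry point one level further down.
    children = [
        multiprocessing.Process(target=main, args=(level - 1, fanout, path + (i,)))
        for i in range(fanout)
    ]
    for child in children:
        child.start()
    for child in children:
        child.join()


if __name__ == "__main__":
    main(level=2, fanout=2)
```

Plain Process objects are used here rather than a Pool because Pool workers are daemonic, and daemonic processes are not allowed to start children of their own.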

You can structure your app as a series of process pools communicating via Queues at any nested depth, though it can get hairy pretty quickly (probably due to the required context switching).
It's not Erlang, though, that's for sure.
The docs on multiprocessing are extremely useful.
Here (a little too much to drop in a comment) is some code I use to increase throughput in a program that updates my feeds. One process polls for feeds that need to be fetched and puts its results in a queue. A process pool of 4 workers picks up those results and fetches the feeds; their results (if any) are then put in a queue for another process pool to parse and put into a queue to be written back to the database. Done sequentially, this would be really slow because some sites take their own sweet time to respond, so most of the time the process was waiting on data from the internet and would only use one core. Under this process-based model, it seems I'm actually waiting on the database the most; my NIC is saturated most of the time, and all 4 cores are actually doing something. Your mileage may vary.

Yes - but, you might run into an issue which would require the fix I committed to python trunk yesterday. See bug http://bugs.python.org/issue5313

Sure you can. Especially if you are using fork to spawn child processes, they work as perfectly normal processes (like the parent). Thread management is quite different, but you can also use "second-level" sub-threading.
Take care not to over-complicate your program; for example, programs with two levels of threads are rarely used.
