What is and isn't automatically thread-local in Python Threading?

I'm having a hard time wrapping my head around Python threading, especially since the documentation explicitly tells you to RTFS at some points, instead of kindly including the relevant info. I'll admit I don't feel qualified to read the threading module. I've seen lots of dirt-simple examples, but they all use global variables, which is offensive and makes me wonder if anyone really knows when or where it's required to use them as opposed to just convenient.
In particular, I'd like to know:
In threading.Thread(target=x), is x shared or private? Does each thread have its own stack, or are all threads using the same context simultaneously?
What is the preferred way to pass mutable variables to threads? Immutable ones are obviously through Thread(args=[],kwargs={}) and that's what all the examples cover. If it's global, I'll have to hold my nose and use it, but it seems like there has to be a better way. I suppose I could wrap everything in a class and just pass the instance in, but it'd be nice to point at regular variables, too.
When do I need threading.local()? In the x above?
Do I have to subclass Thread to update data, as many examples show?
I'm used to Win32 threads and pthreads, where it's explicitly laid out in docs what is and isn't shared with different uses of threads. Those are pretty low-level, and I'd like to avoid _thread if possible to be pythonic.
I'm not sure if it's relevant, but I'm trying to thread OpenMP-style to get the hang of it - make a for loop run concurrently using a queue and some threads. It was easy with globals and locks, but now I'd like to nail down scopes for better lock use.

In threading.Thread(target=x), is x shared or private?
It is effectively private: the function object x itself is shared (it's just an ordinary object), but each thread runs its own invocation of x, on its own stack, with its own local variables.
This is similar to recursion (quite apart from multithreading): if x calls itself, each invocation of x gets its own "private" frame, with its own private local variables.
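As a small sketch of that point (the function and thread names here are just illustrative): both threads call the same function object, but the locals inside each call are private to that thread.
import threading

def x(name):
    # 'count' is a local variable: each thread's call to x gets its own copy
    count = 0
    for _ in range(3):
        count += 1
    print(name, count)   # each thread prints 3; they never see each other's count

threads = [threading.Thread(target=x, args=("worker-%d" % i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()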
What is the preferred way to pass mutable variables to threads? Do I have to subclass Thread to update data?
I view the target argument as a quick shortcut, good for quick experiments but not much else. Using it where it ought not to be used leads to all the limitations you describe in your question (and to the workarounds you contemplate as possible solutions).
Most of the time, you'd want to subclass threading.Thread. The code that creates and manages the threads passes all shared mutable objects to your thread subclass's __init__; the subclass keeps them as attributes and accesses them while running (within its run method).
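A hedged sketch of that pattern (the class, attribute, and variable names are my own, not from the question): the shared mutable objects are handed to __init__, stored as attributes, and used inside run, with a lock guarding access to them.
import threading

class Worker(threading.Thread):
    def __init__(self, jobs, results, lock):
        super().__init__()
        self.jobs = jobs        # shared mutable list of work items
        self.results = results  # shared mutable dict for output
        self.lock = lock        # lock protecting both shared objects

    def run(self):
        while True:
            with self.lock:
                if not self.jobs:
                    return
                item = self.jobs.pop()
            value = item * item          # stand-in for real work
            with self.lock:
                self.results[item] = value

jobs, results, lock = list(range(10)), {}, threading.Lock()
workers = [Worker(jobs, results, lock) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()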
When do I need threading.local()?
You rarely do, so you probably don't.
I'd like to avoid _thread if possible to be pythonic
Without a doubt, avoid it.

Related

Why is using global in python threading bad practice?

I read all over various websites how using global is bad. I have an application where I am storing say, 300 objects, in an array. I want to have 8 threads running through these 300 objects. These objects are different sizes, say between 10 and 50,000 integers and randomly distributed (think worst case scenario here).
Basically, I want to start up 8 threads, do a process on an object, report or store the results, and pick up a new object, 300 times.
The solution I can think of is to set a global lock and a global counter, lock the array, get the current object, increment the counter, release the lock.
There is 1 lock for 8 threads. There is 1 counter for 8 threads. I have 2 global objects. I store results in a dictionary, possibly also global so it's visible to all threads, which also needs to be threadsafe. I am not bothering to do something stupid like subclassing Thread and passing along 300/8 objects to each thread, because multiprocessing.pool does that for me. So how would you do it? Also, convince me that using global in this situation is bad.
Classifying approaches as either "good" or "bad" is a bit simplistic -- in practice, if a design makes sense to you and accomplishes the goals you set out to accomplish, then it doesn't matter whether other people (except possibly your boss) think it's "good" or not; it either works or it doesn't. On the other hand, if your design causes you a lot of pain and suffering, that's a sign that you might not be using the most suitable design for the task at hand.
That said, there are some valid reasons why a lot of people think that global variables are problematic, particularly when combined with multithreading.
The general problem with global variables (with or without multithreading) is that as your program grows larger, it becomes increasingly difficult to mentally keep track of which parts of your program might be reading and/or updating the global variables' values at which times -- since they are global, by definition all parts of your program have access to them, so when you're trying to trace through your program to figure out who it was who set a global variable to some unexpected value, the list of suspects can become unmanageably large. (This isn't much of a problem for small programs, but the larger your program grows, the worse this problem becomes -- and a lot of programmers have learned, through painful experience, that it's better to nip the problem in the bud by avoiding globals wherever possible in the first place, than to have to go back and rewrite a big, complicated, buggy program later on.)
In the specific use-case of a multithreaded program, the anybody-could-be-accessing-my-global-variable-at-any-time property becomes even more fraught with peril, since in a multithreaded scenario, any (non-immutable) data that is shared between threads can only be safely accessed with proper serialization (e.g. by locking a mutex before reading/writing the shared data, and unlocking it afterwards). Ideally programmers would never accidentally read or write any shared+mutable data without locking the mutex -- but programmers are human and will inevitably make mistakes; if given the ability to do so, sooner or later you (or someone else) will forget that access to a particular global variable needs to be serialized, and will just go ahead and read/write it, and then you're in for a lot of pain, because the symptoms will be rare and random, and the cause of the fault won't be obvious.
So smart programmers try to make it impossible to fall into that sort of trap, usually by limiting access to the shared-state to a specific, small, carefully-written set of functions (a.k.a. an API) that implement the serialization correctly so that no other code has to. When doing that, you want to make sure that only the code in this particular API has access to the shared data, and that nobody else does -- something that is impossible to do with a global variable, as by definition everyone has direct access to it.
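For illustration, here is one way such a narrow API might look in Python (the class and method names are hypothetical): every access to the shared value goes through methods that take the lock, so callers can't forget to serialize.
import threading

class SharedCounter:
    # All access to the count goes through these methods, which handle the locking.
    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        with self._lock:
            self._value += 1
            return self._value

    def value(self):
        with self._lock:
            return self._value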
There is also one performance-related reason why people prefer not to mix global variables and multithreading: the more serialization you have to do, the less your program can exploit the power of multiple CPU cores. In particular, it does you no good to have an 8-core CPU if 7 of your 8 threads are spending most of their time blocked, waiting for a mutex to become available.
So how does that relate to globals? It's related in that in most cases it's difficult or impossible to prove that a global variable won't ever be accessed by another thread, which means all accesses to that global variable need to be serialized. With a non-global variable, on the other hand, you can make sure to give a reference to that variable to only a single thread -- at which point you have effectively guaranteed that only that one thread will ever access the variable (since the other threads have no references to it, you know they can't access it), and because you have that guarantee, you no longer need to serialize access to that data, and now your thread can run more efficiently because it never has to block waiting for a mutex.
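A small sketch of that idea, under my own assumptions about the data layout: each thread gets a reference to its own disjoint slice of the work and its own output slot, so no other thread touches them and no lock is needed for those writes.
import threading

data = list(range(300))
num_threads = 4
chunks = [data[i::num_threads] for i in range(num_threads)]  # disjoint slices
partial_sums = [0] * num_threads

def work(index, chunk):
    # only this thread ever writes partial_sums[index], so no lock is needed
    partial_sums[index] = sum(chunk)

threads = [threading.Thread(target=work, args=(i, c)) for i, c in enumerate(chunks)]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = sum(partial_sums)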
(Btw note that CPython in particular suffers from a severe form of implicit serialization caused by Python's Global Interpreter Lock, which means that even the best multithreaded, CPU-bound Python code will be unlikely to use more than a single CPU core at a time. The only way to get around that is to use multiprocessing instead, or do the bulk of your program's computations in a lower-level language such as C, so that it can execute without holding the GIL.)
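If the workload really is CPU-bound, a multiprocessing.Pool is the usual way around the GIL. A minimal sketch, where the crunch function is just a placeholder for real work:
from multiprocessing import Pool

def crunch(n):
    # placeholder for a CPU-bound computation
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        results = pool.map(crunch, [10000, 20000, 30000])
    print(results)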

Unclear documentation: Sharing state between processes

There is a part in the Python documentation that is unclear to me:
https://docs.python.org/3.4/library/multiprocessing.html#sharing-state-between-processes
"As mentioned above, when doing concurrent programming it is usually best to avoid using shared state as far as possible."
But I cannot find any description above 17.2.1.5 that describes why it is best to avoid using shared state. Any ideas?
Shared state is like a global variable, but… more global.
Not only do you have to consider what parts of your code are reading and modifying the state, but also which running copy of your code is accessing it, and how. This gets even trickier when the state is mutable, i.e. can be changed.
To make sure one thread doesn't stomp on what another thread is doing you have to coordinate access to the state. That could be done using semaphores, message-passing, software transactional memory, etc.
See also https://softwareengineering.stackexchange.com/questions/148108/why-is-global-state-so-evil.
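As one concrete example of the message-passing option mentioned above, threads can hand work and results through a queue.Queue instead of touching shared state directly (a sketch with made-up names):
import threading, queue

tasks = queue.Queue()
results = queue.Queue()

def worker():
    while True:
        item = tasks.get()
        if item is None:        # sentinel value: no more work
            break
        results.put(item * 2)   # hand results back through the queue, not shared variables

t = threading.Thread(target=worker)
t.start()
for i in range(5):
    tasks.put(i)
tasks.put(None)
t.join()
print([results.get() for _ in range(5)])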

How to use global variable in python, in a threadsafe way

I want to use a global variable:
Initialize it once.
Have thread-safe access to it.
Can someone share an example please?
If you need read-only access and the value is initialized before the threads are spawned, you don't need to worry about thread safety.
If that is not the case, the Python threading library is probably what you need -- more precisely, locks. A really good read on the subject, with quite a lot of examples: http://effbot.org/zone/thread-synchronization.htm
You do have a problem if you are using multiprocessing.Process, though. In that case you should take a look at Managers and Queues in the multiprocessing module.
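A minimal sketch of the Manager approach mentioned above (the worker function and keys are just illustrative): a Manager-backed dict can be shared between processes, with the proxy object handling the inter-process communication.
from multiprocessing import Manager, Process

def worker(shared, key):
    shared[key] = key * key   # writes go through the manager proxy

if __name__ == "__main__":
    with Manager() as manager:
        shared = manager.dict()
        procs = [Process(target=worker, args=(shared, k)) for k in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(dict(shared))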
The threading library is what you want:
import threading
mydata = threading.local()
mydata.x = 1   # x is visible only to the thread that assigned it; other threads get their own mydata.x
If you initialize it once, and you do so when the module is loaded (that is, before it can be accessed from other threads), you will have no problems with thread safety at all. No synchronization is needed.
But if you mean a more complex scenario, you have to explain it further to get a reasonable code example.
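For the slightly more complex case where the global is also written after threads have started, one common sketch (the names here are hypothetical) is to guard every read and write with a module-level lock:
import threading

_lock = threading.Lock()
_config = None   # the "global" we want to share safely

def set_config(value):
    global _config
    with _lock:
        _config = value

def get_config():
    with _lock:
        return _config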

Is CCKeyDerivationPBKDF thread safe?

I'm using CCKeyDerivationPBKDF to generate and verify password hashes in a concurrent environment and I'd like to know whether it is thread-safe. The documentation of the function doesn't mention thread safety at all, so I'm currently using a lock to be on the safe side, but I'd prefer not to use a lock if I don't have to.
After going through the source code of CCKeyDerivationPBKDF(), I find it to be thread-unsafe. While the code for CCKeyDerivationPBKDF() uses many library functions which are thread-safe (e.g. bzero), most of the user-defined functions (e.g. PRF), and the underlying functions called from them, are potentially thread-unsafe (for example, due to the use of several pointers and unsafe casting of memory, e.g. in CCHMac). I would suggest that unless they make all the underlying functions thread-safe, or at least add some mechanism to make it conditionally thread-safe, you stick with your approach, or modify the CommonCrypto code to make it thread-safe and use that code.
Hope it helps.
Lacking documentation or source code, one option is to build a test app with say 10 threads looping on calls to CCKeyDerivationPBKDF with a random selection from say 10 different sets of arguments with 10 known results.
Each thread checks the result of a call to make sure it is what is expected. Each thread should also have a usleep() call for some random amount of time (bell curve sitting on say 10% of the time each call to CCKeyDerivationPBKDF takes) in this loop in order to attempt to interleave operations as much as possible.
You'll probably want to instrument it with debugging that keeps track of how much concurrency you are able to generate. With a 10% sleep time and 10 threads, you should be able to keep 9 threads concurrent.
If it makes it through an aggregate of say 100,000,000 calls without an error, I'd assume it was thread safe. Of course you could run it for much longer than that to get greater assurances.
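The same testing idea expressed as a Python sketch, since this page is otherwise about Python (derive and known are stand-ins for the real call and its known-good outputs): many threads hammer the function with random known inputs and sleep briefly to encourage interleaving.
import threading, random, time

# hypothetical stand-ins: 'derive' is the function under test,
# 'known' maps each test input to its expected output
known = {n: n * n for n in range(10)}
def derive(n):
    return n * n

errors = []

def hammer(iterations):
    for _ in range(iterations):
        n = random.choice(list(known))
        if derive(n) != known[n]:
            errors.append(n)                   # record any mismatch
        time.sleep(random.uniform(0, 0.0001))  # jitter to interleave threads

threads = [threading.Thread(target=hammer, args=(10000,)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("mismatches:", len(errors))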

Does thread-local mean thread safe?

Specifically I'm talking about Python. I'm trying to hack something (just a little) by seeing an object's value without ever passing it in, and I'm wondering if it is thread safe to use thread local to do that. Also, how do you even go about doing such a thing?
No -- thread-local means that each thread gets its own copy of that variable. Using it is (at least normally) thread-safe, simply because each thread uses its own variable, separate from any variable of the same name accessible to other threads. On the other hand, thread-local variables are not (normally) useful for communication between threads.
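A quick illustration of that separation (the thread names and values are arbitrary): two threads assign to the same threading.local attribute and each sees only its own value.
import threading

local = threading.local()

def work(value):
    local.x = value                # each thread gets its own 'x'
    print(threading.current_thread().name, local.x)

a = threading.Thread(target=work, args=(1,), name="A")
b = threading.Thread(target=work, args=(2,), name="B")
a.start()
b.start()
a.join()
b.join()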
