Is the following code thread-safe in Python?

I have code similar to the following in a Python module:
_x = None

def set_x():
    global _x
    _x = ...  # some value (will always be the same)

def worker():
    if _x is None:
        set_x()
Can threads created in a client of this module safely call worker()?
_x is used in a read-only mode elsewhere in the module; once set, it is never changed.
Yes, I realize that x could be encapsulated in a class and the threads could create their own instances of that class on the stack and thus obtain thread-safety, but assume for the purposes of this question that a class-based implementation is not feasible or desirable (e.g., the module represents a Singleton where the cost of setting x is relatively high and we want to set it as infrequently as is practical).

Assuming you are using a global variable in set_x:
x = None

def set_x():
    global x
    x = "foo"

def worker():
    if not x:
        set_x()
There is a race condition between the if not x check and actually setting x, so multiple threads may assign a value to x. That's mostly harmless for immutable objects, but problematic for mutable objects such as lists, where reassignment loses the values already in the existing x. And by "mostly": even objects whose values compare equal may have different ids, so == may work where is fails:
>>> a = "foo"
>>> b = "bar"
>>> c = a + b
>>> d = a + b
>>> c == d
True
>>> c is d
False
And of course code paths that don't play the set_x game may find that x changes from None to your expected value while they execute.
Especially because x is expensive to calculate, a good approach is to hide it behind a getter. Now everyone has to play the same game and only one of the threads will actually do the work.
import threading

_x = None
_x_lock = threading.Lock()

def get_x():
    global _x
    if _x is None:
        with _x_lock:
            if _x is None:
                _x = "foobar"
    return _x


How to catch a particular name assignment?

(Based on this question):
One can override the __setattr__ magic method for an object to run additional instructions when an attribute of the object is set. As in:
class MyClass(object):
    def __init__(self, attribute=None):
        object.__init__(self)
        self.attribute = attribute

    def __setattr__(self, name, value):
        self.__dict__[name] = value
        if name == 'attribute':
            print("attribute's value is modified to {}.".format(
                self.attribute))

if __name__ == '__main__':
    my_obj = MyClass(True)
    while True:
        my_obj.attribute = input()
How can I catch a particular name assignment in the current script without using classes (specifically, to call a method with more instructions)?
def b_is_modified():
    print("b is modified!")

if __name__ == '__main__':
    a = 3
    b = 4
    b = 5
How to call b_is_modified when b is assigned a value?
I think the other answer by Nae sums it up; I'm not aware of any built-in mechanism in the Python language to detect assignments, so if you want an interrupt-like event system that triggers upon assignment, I don't know whether it's feasible.
However, you seem quite determined to find a way to "detect" assignment, so I want to describe an approach that might get you closer than nothing.
There are the built-in functions globals() and locals(), which create dictionaries of the variables in the global and local scope, respectively. (They, in addition to vars(), are further explained here.)
A noteworthy point is that locals() will behave differently if called from inside a function:
If locals() is called inside a function it constructs a dictionary of the function namespace as of that moment and returns it -- any further name assignments are not reflected in the returned dictionary, and any assignments to the dictionary are not reflected in the actual local namespace
If locals() is called outside a function it returns the actual dictionary that is the current namespace. Further changes to the namespace are reflected in the dictionary, and changes to the dictionary are reflected in the namespace:
Here is a "hacky" way to detect changes to variables:
def b_is_modified():
    print("b is modified!")

if __name__ == '__main__':
    old = locals().get('b')
    a = 3
    b = 4
    b = 5
    new = locals().get('b')
    if id(new) != id(old) and new is not None:
        b_is_modified()
This is nothing else but an (obfuscated?) way of checking if the value of b has changed from one point in execution to another, and there is no callback event or trigger action that detects it. However, if you want to expand on this approach continue reading.
The rest of the answer explains how to check for changes in b by rewriting it to something like:
if __name__ == '__main__':
    monitor = ScopeVariableMonitor(locals())
    a = 3
    b = 4
    monitor.compare_and_update()  # Detects creation of a and b
    b = 5
    monitor.compare_and_update()  # Detects changes to b
The following will "detect" any changes to the variables, and I've also included an example where it's used inside a function, to reiterate that the dictionary returned from locals() does not update there.
The ScopeVariableMonitor-class is just an example, and combines the code in one place. In essence, it's comparing changes to the existence and values of variables between update()s.
class ScopeVariableMonitor:
    def __init__(self, scope_vars):
        self.scope_vars = scope_vars  # Save a locals()-dictionary instance
        self.old = self.scope_vars.copy()  # Make a shallow copy for later comparison

    def update(self, scope_vars=None):
        scope_vars = scope_vars or self.scope_vars
        self.old = scope_vars.copy()  # Make a new shallow copy for next time

    def has_changed(self, var_name):
        old, new = self.old.get(var_name), self.scope_vars.get(var_name)
        print('{} has changed: {}'.format(var_name, id(old) != id(new)))

    def compare_and_update(self, var_list=None, scope_vars=None):
        scope_vars = scope_vars or self.scope_vars

        # Find new keys in the locals()-dictionary
        new_variables = set(scope_vars.keys()).difference(set(self.old.keys()))
        if var_list:
            new_variables = [v for v in new_variables if v in var_list]
        if new_variables:
            print('\nNew variables:')
            for new_variable in new_variables:
                print('  {} = {}'.format(new_variable, scope_vars[new_variable]))

        # Find changes of values in the locals()-dictionary (does not handle deleted vars)
        changed_variables = [var_name for (var_name, value) in self.old.items() if
                             id(value) != id(scope_vars[var_name])]
        if var_list:
            changed_variables = [v for v in changed_variables if v in var_list]
        if changed_variables:
            print('\nChanged variables:')
            for var in changed_variables:
                print('  Before:  {} = {}'.format(var, self.old[var]))
                print('  Current: {} = {}\n'.format(var, scope_vars[var]))

        self.update()
The "interesting" part is the compare_and_update() method: if provided with a list of variable names, e.g. ['a', 'b'], it will only look for changes to those variables. The scope_vars parameter is required inside a function scope, but not in the global scope, for the reasons explained above.
def some_function_scope():
    print('\n --- Now inside function scope --- \n')
    monitor = ScopeVariableMonitor(locals())
    a = 'foo'
    b = 42
    monitor.compare_and_update(['a', 'b'], scope_vars=locals())
    b = 'bar'
    monitor.compare_and_update(scope_vars=locals())

if __name__ == '__main__':
    monitor = ScopeVariableMonitor(locals())
    var_list = ['a', 'b']
    a = 5
    b = 10
    c = 15
    monitor.compare_and_update(var_list=var_list)

    print('\n *** *** *** \n')  # Separator for print output

    a = 10
    b = 42
    c = 100
    d = 1000
    monitor.has_changed('b')
    monitor.compare_and_update()

    some_function_scope()
Output:
New variables:
  a = 5
  b = 10

 *** *** ***

b has changed: True

New variables:
  d = 1000

Changed variables:
  Before:  b = 10
  Current: b = 42
  Before:  a = 5
  Current: a = 10
  Before:  c = 15
  Current: c = 100

 --- Now inside function scope ---

New variables:
  a = foo
  b = 42

Changed variables:
  Before:  b = 42
  Current: b = bar
Conclusion
My answer is just a more general way of doing:
b = 1
old_b = b
# ...
if b != old_b:
    print('b has been assigned to')
The dictionary from locals() will hold everything that is a variable, including functions and classes; not just "simple" variables like your a, b and c.
In the implementation above, checks between "old" and "new" values are done by comparing the id() of the shallow-copied value from before with the id() of the current value. This approach allows comparison of ANY value, because id() returns the object's (virtual memory) address in CPython, but it is presumably far from a good, general comparison scheme.
I'm curious about what you want to achieve and why you want to detect assignments: if you share your goal, perhaps I can think of another way to reach it.
Based on this answer:
It can't be caught (at least at the Python level).
Simple name assignment (b = 4), as opposed to object attribute assignment (object.b = 5), is a fundamental operation of the language itself. It's not implemented in terms of a lower-level operation that one can override. Assignment just is.

How exactly does the caller see a change in the object?

From Chapter "Classes" of the official Python tutorial:
[...] if a function modifies an object passed as an argument, the caller will see the change — this eliminates the need for two different argument passing mechanisms as in Pascal.
What would be an example of how exactly the caller will see a change? Or how could it be (not in Python but in general) that the caller doesn't see the change?
It basically means that if a mutable object is changed, the change is visible through every reference to that object.
For an example of passing by reference (which is, in effect, how Python passes objects):
x = []

def foo_adder(y):
    y.append('foo')

foo_adder(x)
print(x)  # ['foo']
vs something like Pascal, where you can pass copies of an object as a parameter, instead of the object itself:
# Pretend this is Pascal code.
x = []

def foo_adder(y):
    y.append('foo')

foo_adder(x)
print(x)  # []
You can get the behavior of the second example in Python if you pass a copy of the object. For lists, you use [:].
x = []

def foo_adder(y):
    y.append('foo')

foo_adder(x[:])
print(x)  # []
For your second question about how the caller might not see the change, let's take that same foo_adder function and change it a little so that it doesn't modify the object, but instead replaces it.
x = []

def foo_adder(y):
    y = y + ['foo']

foo_adder(x)
print(x)  # []
What would be an example of how exactly the caller will see a change?
>>> def modify(x):
...     x.append(1)
...
>>> seq = []
>>> print(seq)
[]
>>> modify(seq)
>>> print(seq)
[1]
Or how could it be (not in Python but in general) that the caller doesn't see the change?
Hypothetically, a language could exist where a deep copy of seq is created and assigned to x, and any change made to x has no effect on seq, in which case print(seq) would display [] both times. But this isn't what happens in Python.
Edit: note that assigning a new value to an old variable name typically doesn't count as "modification".
>>> def f(x):
...     x = x + 1
...
>>> y = 23
>>> f(y)
>>> print(y)
23

Argument passing by reference to a class in Python (à la C++), to modify it with the class methods

In this case, I want the program to print "X = changed".
class Clase:
    def __init__(self, variable):
        self.var = variable

    def set_var(self):
        self.var = 'changed'

X = 'unchanged'
V = Clase(X)
V.set_var()
print "X = ", X
All values are objects and are passed by reference in Python, and assignment changes the reference.
def myfunc(y):
    y = 13

x = 42     # x now points at the integer object 42
myfunc(x)  # inside myfunc, y initially points to 42,
           # but myfunc rebinds its y to point to a
           # different object, 13
print(x)   # prints 42, since rebinding y inside myfunc
           # does not change any other variable
It's important to note here that there are no "simple types" as there are in other languages. In Python, integers are objects. Floats are objects. Bools are objects. And assignment is always changing a pointer to refer to a different object, whatever the type of that object.
Thus, it's not possible to "assign through" a reference and change someone else's variable. You can, however, simulate this by passing a mutable container (e.g. a list or a dictionary) and changing the contents of the container, as others have shown.
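For instance, a one-element list can stand in for a C-style output parameter (increment_in_place is a made-up name for illustration):

```python
def increment_in_place(box):
    # Mutating the container's contents is visible to the caller,
    # even though the name 'box' itself is local to this function.
    box[0] += 1

counter = [41]
increment_in_place(counter)
print(counter[0])  # 42 -- the caller sees the mutation
```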
This kind of mutation of arguments through pointers is common in C/C++ and is generally used to work around the fact that a function can only have a single return value. Python will happily create tuples for you in the return statement and unpack them to multiple variables on the other side, making it easy to return multiple values, so this isn't an issue. Just return all the values you want to return. Here is a trivial example:
def myfunc(x, y, z):
    return x * 2, y + 5, z - 3
On the other side:
a, b, c = myfunc(4, 5, 6)
In practice, then, there is rarely any reason to need to do what you're trying to do in Python.
In Python, lists and dicts are mutable and are passed around by reference, so if you change the type of your variable X to one of those, you will get the desired results.
[EDIT: Added use case that op needed]
class Clase:
    def __init__(self, variable):
        self.var = variable

    def set_var(self):
        self.var.test = 'changed'

class ComplicatedClass():
    def __init__(self, test):
        self.test = test

X = ComplicatedClass('unchanged')
print('Before:', X.test)
V = Clase(X)
V.set_var()
print("After:", X.test)
Before: unchanged
After: changed
Strings are immutable, so you could not change X in this way.
An alternative might be reassigning X in the global namespace... this will obviously fail in many, many scenarios (i.e., when X is not a global):
class Clase:
    def __init__(self, variable):
        self.var = variable

    def set_var(self):
        globals()[self.var] = 'changed'

X = 'unchanged'
V = Clase('X')
V.set_var()
print "X = ", X
The other alternative is to use a mutable data type, as suggested by Ashwin.
Or, the best option: this is probably not a good idea, and you should likely not do it...

Generate consecutive numbers in Python while input doesn't change

I need to get consecutive numbers while an input number doesn't change.
So I get give(5) -> 1, give(5) -> 2, and so on, but then: give(6) -> 1 again, restarting the count.
So far I solved it with an iterator function count() and a function give(num) like this:
def count(start=1):
    n = start
    while True:
        yield n
        n += 1

def give(num):
    global last
    global a
    if num == last:
        ret = a.next()
    else:
        a = count()
        ret = a.next()
    last = num
    return ret
It works, but it's ugly: I have two globals and have to set them before I call give(num). I'd like to be able to call give(num) without previously setting the a = count() and last = 999 variables. I'm positive there's a better way to do this...
Edit: thank you all for the incredibly fast and varied responses; I've got a lot to study here.
The obvious thing to do is to make give into an object rather than a function.* Any object can be made callable by defining a __call__ method.
While we're at it, your code can be simplified quite a bit, so let's do that.
class Giver(object):
    def __init__(self):
        self.last, self.a = object(), count()

    def __call__(self, num):
        if num != self.last:
            self.a = count(1)
            self.last = num
        return self.a.next()

give = Giver()
So:
>>> give(5)
1
>>> give(5)
2
>>> give(6)
1
>>> give(5)
1
This also lets you create multiple separate givers, each with its own, separate current state, if you have any need to do that.
If you want to expand it with more state, the state just goes into the instance variables. For example, you can replace last and a with a dictionary mapping previously-seen values to counters:
class Giver(object):
    def __init__(self):
        self.counters = defaultdict(count)

    def __call__(self, num):
        return next(self.counters[num])
And now:
>>> give(5)
1
>>> give(5)
2
>>> give(6)
1
>>> give(5)
3
* I sort of skipped a step here. You can always remove globals by putting the variables and everything that uses them (which may just be one function) inside a function or other scope, so they end up as free variables in the function's closure. But in your case, I think this would just make your code look "uglier" (in the same sense you thought it was ugly). But remember that objects and closures are effectively equivalent in what they can do, but different in what they look like—so when one looks horribly ugly, try the other.
Just keep track of the last returned value for each input. You can do this with an ordinary dict:
_counter = {}

def give(n):
    _counter[n] = _counter.get(n, 0) + 1
    return _counter[n]
The standard library has a Counter class that makes things a bit easier:
import collections

_counter = collections.Counter()

def give(n):
    _counter[n] += 1
    return _counter[n]
collections.defaultdict(int) works too.
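For completeness, the defaultdict(int) variant mentioned above might look like this:

```python
from collections import defaultdict

_counter = defaultdict(int)  # missing keys start at 0

def give(n):
    _counter[n] += 1
    return _counter[n]

print(give(5))  # 1
print(give(5))  # 2
print(give(6))  # 1
```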
You can achieve this with something like this:
def count(start=1):
    n = start
    while True:
        yield n
        n += 1

def give(num):
    if num not in give.memo:
        give.memo[num] = count()
    return next(give.memo[num])

give.memo = {}
Which produces:
>>> give(5)
1
>>> give(5)
2
>>> give(5)
3
>>> give(6)
1
>>> give(5)
4
>>>
The two key points are using a dict to keep track of multiple iterators simultaneously, and setting a variable on the function itself. You can do this because functions are themselves objects in Python. This is the equivalent of a static local variable in C.
You can basically get what you want via a combination of defaultdict and itertools.count:
from collections import defaultdict
from itertools import count
_counters = defaultdict(count)
next(_counters[5])
Out[116]: 0
next(_counters[5])
Out[117]: 1
next(_counters[5])
Out[118]: 2
next(_counters[5])
Out[119]: 3
next(_counters[6])
Out[120]: 0
next(_counters[6])
Out[121]: 1
next(_counters[6])
Out[122]: 2
If you need the counter to start at one, you can get that via functools.partial:
from functools import partial
_counters = defaultdict(partial(count,1))
next(_counters[5])
Out[125]: 1
next(_counters[5])
Out[126]: 2
next(_counters[5])
Out[127]: 3
next(_counters[6])
Out[128]: 1
Adding a second answer because this is rather radically different from my first.
What you are basically trying to accomplish is a coroutine: a generator that preserves state and into which values can be sent at arbitrary times. PEP 342 gives us a way to do that with the "yield expression". I'll jump right into how it looks:
from collections import defaultdict
from itertools import count
from functools import partial
def gen(x):
    _counters = defaultdict(partial(count, 1))
    while True:
        out = next(_counters[x])
        sent = yield out
        if sent:
            x = sent
If the _counters line is confusing, see my other answer.
With a coroutine, you can send data into the generator. So you can do something like the following:
g = gen(5)
next(g)
Out[159]: 1
next(g)
Out[160]: 2
g.send(6)
Out[161]: 1
next(g)
Out[162]: 2
next(g)
Out[163]: 3
next(g)
Out[164]: 4
g.send(5)
Out[165]: 3
Notice how the generator preserves state and can switch between counters at will.
In my first answer, I suggested that one solution was to transform the closure into an object. But I skipped a step—you're using global variables, not a closure, and that's part of what you didn't like about it.
Here's a simple way to transform any global state into encapsulated state:
def make_give():
    last, a = None, None

    def give(num):
        nonlocal last
        nonlocal a
        if num != last:
            a = count()
            last = num
        return next(a)

    return give

give = make_give()
Or, adapting my final version of Giver:
def make_giver():
    counters = defaultdict(count)

    def give(num):
        return next(counters[num])

    return give
If you're curious how this works:
>>> give.__closure__
(<cell at 0x10f0e2398: NoneType object at 0x10b40fc50>, <cell at 0x10f0e23d0: NoneType object at 0x10b40fc50>)
>>> give.__code__.co_freevars
('a', 'last')
Those cell objects are essentially references into the stack frame of the make_give call that created the give function.
This doesn't always work quite as well in Python 2.x as in 3.x. While closure cells work the same way, if you assign to a variable inside the function body and there's no global or nonlocal statement, it automatically becomes local, and Python 2 has no nonlocal statement. So the second version works fine, but for the first version you'd have to do something like state = {'a': None, 'last': None} and then write state['a'] = count() instead of a = count().
This trick—creating a closure just to hide local variables—is very common in a few other languages, like JavaScript. In Python (partly because of the long history without the nonlocal statement, and partly because Python has alternatives that other languages don't), it's less common. It's usually more idiomatic to stash the state in a mutable default parameter value, or an attribute on the function—or, if there's a reasonable class to make the function a method of, as an attribute on the class instances. There are plenty of cases where a closure is pythonic, this just isn't usually one of them.
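For comparison, the "state in a mutable default parameter value" idiom mentioned above could look like this (a sketch; the default dict is created once, at function definition time, and shared across calls):

```python
from itertools import count

def give(num, _counters={}):
    # _counters is evaluated once, when the function is defined, so it
    # persists between calls -- a deliberate use of the usual "mutable
    # default argument" gotcha.
    if num not in _counters:
        _counters[num] = count(1)
    return next(_counters[num])

print(give(5))  # 1
print(give(5))  # 2
print(give(6))  # 1
```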

How to test whether x is a member of a universal set?

I have a list L, and x in L evaluates to True if x is a member of L. What can I use instead of L so that x in smth evaluates to True independently of the value of x?
So I need something that contains all objects, including itself, because x can also be this "smth".
class Universe:
    def __contains__(self, x):
        return True
You can inherit from the built-in list class and redefine the __contains__ method that is called when you do tests like item in list:
>>> class my_list(list):
...     def __contains__(self, item):
...         return True
...
>>> L = my_list()
>>> L
[]
>>> x = 2
>>> x
2
>>> x in L
True
Theorem: there is no universal set.
Proof: suppose X were a set containing every possible element in the domain. The question arises: is X ∈ X? Most sets are not defined that way, so define a new set Y = {A ∈ X : A ∉ A}, i.e., Y is the set of all sets that do not belong to themselves.
Now, is Y ∈ Y? If Y ∈ Y, then by Y's definition Y does not belong to itself, so Y ∉ Y, which contradicts our assumption.
So assume Y ∉ Y. Then Y satisfies the defining condition of Y, so Y ∈ Y, and we again contradict our own definition.
Thus, there is no set of all sets. This is known as Russell's Paradox.
So why programmatically try to create an object that violates a result proved and tested by set theorists far more intelligent than I am? If that were my interview, this would be my answer, and if they insisted it was possible, I'd suggest asking what the problem domain is, since conceptually Russell fundamentally proved it is impossible.
If you want a user-friendly problem usually posed for people studying introductory set theory, try the Barber Paradox.
Edit: Python lets you implement an object that contains itself. See this:
class Universal(object):
    def __init__(self):
        self.contents = []

    def add(self, x):
        self.contents.append(x)

    def remove(self, x):
        self.contents.remove(x)

    def __contains__(self, x):
        return x in self.contents
However, this is not a strict set theoretic object, since the contents actually contains a reference to the parent object. If you require that objects be distinct as per the proof above, this cannot happen.
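Usage looks like this (the class is repeated, minus remove(), so the snippet is self-contained): once the object is added to its own contents, u in u holds, because list membership falls back to an identity check.

```python
class Universal(object):
    def __init__(self):
        self.contents = []

    def add(self, x):
        self.contents.append(x)

    def __contains__(self, x):
        return x in self.contents

u = Universal()
u.add(42)
u.add(u)        # the container now holds a reference to itself
print(42 in u)  # True
print(u in u)   # True
print(7 in u)   # False -- only explicitly added objects are members
```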
