Python, counter atomic increment

How can I translate the following code from Java to Python?
AtomicInteger cont = new AtomicInteger(0);
int value = cont.getAndIncrement();

Most likely with a threading.Lock around any usage of that value. There's no atomic modification in Python unless you use PyPy (if you do, have a look at __pypy__.thread.atomic in the STM version).
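A minimal sketch of that lock-based approach, mirroring the Java snippet above (the class and method names here are illustrative, not a standard API):

import threading

class AtomicCounter:
    """Lock-protected counter; get_and_increment mirrors Java's getAndIncrement()."""
    def __init__(self, start=0):
        self._value = start
        self._lock = threading.Lock()

    def get_and_increment(self):
        with self._lock:
            value = self._value   # read the current value
            self._value += 1      # and bump it, all under one lock
            return value

cont = AtomicCounter(0)
value = cont.get_and_increment()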

itertools.count returns an iterator that performs the equivalent of getAndIncrement() on each iteration.
Example:
import itertools
cont = itertools.count()
value = next(cont)

This will perform the same function, although it's not lock-free as the name 'AtomicInteger' would imply.
Note that other approaches are also not strictly lock-free -- they rely on the GIL and are not portable between Python interpreters.
import threading

class AtomicInteger():
    def __init__(self, value=0):
        self._value = int(value)
        self._lock = threading.Lock()

    def inc(self, d=1):
        with self._lock:
            self._value += int(d)
            return self._value

    def dec(self, d=1):
        return self.inc(-d)

    @property
    def value(self):
        with self._lock:
            return self._value

    @value.setter
    def value(self, v):
        with self._lock:
            self._value = int(v)
            return self._value

Using the atomics library, the same code would be written in Python as:
import atomics
a = atomics.atomic(width=4, atype=atomics.INT)
value = a.fetch_inc()
This method is strictly lock-free.
Note: I am the author of this library
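For completeness, a short sketch of using that counter from several threads; fetch_inc() comes from the snippet above, while load() is assumed from the same library and worth checking against its documentation:

import threading
import atomics

a = atomics.atomic(width=4, atype=atomics.INT)

def worker():
    for _ in range(100000):
        a.fetch_inc()  # lock-free atomic increment

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(a.load())  # expected: 400000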

8 years and still no full example code for the threading.Lock option without using any external library... Here it comes:
import threading

i = 0
lock = threading.Lock()

# Worker thread for increasing the counter
class CounterThread(threading.Thread):
    def __init__(self):
        super(CounterThread, self).__init__()

    def run(self):
        global i
        lock.acquire()
        i = i + 1
        lock.release()

threads = []
for a in range(0, 10000):
    th = CounterThread()
    th.start()
    threads.append(th)

for thread in threads:
    thread.join()

print(i)
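As a design note, subclassing Thread isn't required here; a plain function target with the lock used as a context manager does the same job. A self-contained sketch of that variant:

import threading

i = 0
lock = threading.Lock()

def increment():
    global i
    with lock:          # acquires and releases the lock automatically
        i = i + 1

threads = [threading.Thread(target=increment) for _ in range(10000)]
for th in threads:
    th.start()
for th in threads:
    th.join()

print(i)  # 10000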

Python atomic for shared data types.
https://sharedatomic.top
The module can be used for atomic operations under multiple-process and multiple-thread conditions, with a focus on high concurrency and high performance.
Example of the atomic API with multiprocessing and multiple threads:
You need the following steps to utilize the module:
Create the function used by the child processes (refer to UIntAPIs, IntAPIs, BytearrayAPIs, StringAPIs, SetAPIs, ListAPIs); in each process you can create multiple threads.
from threading import Thread

def process_run(a):
    def subthread_run(a):
        a.array_sub_and_fetch(b'\x0F')

    threadlist = []
    for t in range(5000):
        threadlist.append(Thread(target=subthread_run, args=(a,)))

    for t in range(5000):
        threadlist[t].start()

    for t in range(5000):
        threadlist[t].join()
Create the shared bytearray:
a = atomic_bytearray(b'ab', length=7, paddingdirection='r', paddingbytes=b'012', mode='m')
Start processes/threads to utilize the shared bytearray:
from multiprocessing import Process

processlist = []
for p in range(2):
    processlist.append(Process(target=process_run, args=(a,)))

for p in range(2):
    processlist[p].start()

for p in range(2):
    processlist[p].join()

assert a.value == int.to_bytes(27411031864108609, length=8, byteorder='big')

Related

Effectively save instance attribute with nested multiprocessing Pools/Processes

I have two custom Python classes. The first has a method that makes some calculations (using Pool) and creates a new instance attribute; the second aggregates two objects of the first class and has a method with which I want to run those calculations (also in parallel) on the two first-class objects and correctly save their new instance attributes.
Dummy code:
from multiprocessing import Pool, Process

class State:
    def __init__(self, data):
        self.data = data

    def calculate(self):
        with Pool() as p:
            p.map(function, args)
        new_attribute = ...  # some code that reads the files generated with the Pool
        self.new_attribute = new_attribute
        return

class Pair:
    def __init__(self, state1: State, state2: State):
        self.state1 = state1
        self.state2 = state2

    def calculate_states(self):
        for state in [self.state1, self.state2]:
            p = Process(target=state.calculate, args=args)
            p.start()
        return

state1 = State(data1)
state2 = State(data2)
pair = Pair(state1, state2)
pair.calculate_states()
The problem is that, as I have found out during my research, multiprocessing.Process creates copies of the namespace in which the processes work, and the values aren't returned to the main namespace. Setting process.daemon to True produces an error, because "daemonic processes aren't allowed to have children", which is the same thing that happens if I replace the Processes with an additional Pool. Using multiprocess (instead of multiprocessing) or concurrent.futures doesn't seem to work either. Additionally, I don't understand how multiprocessing.Queue works and I'm not sure if it could be applied here (I have read somewhere that it could be used).
I would like to do what I am trying to do without having to pass a shared-memory object to the Processes (to write the new_attribute into it and then apply it to the States in the main namespace). I hope someone can point me towards the solution even if I have not provided a working code/reproducible example.
Your problem arises from invoking the calculate method as a new subprocess. You can still compute the new attributes in parallel without doing that by using map_async with a callback argument; the callback runs in the parent process, so the new attribute ends up on your original objects.
I have taken your code and provided missing function implementations to demonstrate:
from multiprocessing import Pool, cpu_count

def some_code(data):
    if data == 1:
        return 1032
    if data == 2:
        return 9874
    raise ValueError('Invalid data value:', data)

def function(val):
    ...
    # return value is not of interest

class State:
    def __init__(self, data):
        self.data = data

    def calculate(self, pool, args):
        pool.map_async(function, args, callback=self.callback)

    def callback(self, result):
        """
        Called when map_async completes
        """
        new_attribute = some_code(self.data)
        self.new_attribute = new_attribute

class Pair:
    def __init__(self, state1: State, state2: State):
        self.state1 = state1
        self.state2 = state2

    def calculate_states(self):
        args = (6, 9, 18)
        # Assumption is computation is VERY CPU-intensive.
        # If there is quite a bit of I/O involved then: pool_size = 2 * len(args)
        # If it's mostly I/O you should have been using multithreading to begin with
        pool_size = min(2 * len(args), cpu_count())
        with Pool(pool_size) as pool:
            for state in [self.state1, self.state2]:
                state.calculate(pool, args)
            # wait for tasks to complete
            pool.close()
            pool.join()

# Required for Windows:
if __name__ == '__main__':
    data1 = 1
    data2 = 2
    state1 = State(data1)
    state2 = State(data2)
    pair = Pair(state1, state2)
    pair.calculate_states()
    print(state1.new_attribute, state2.new_attribute)
Prints:
1032 9874

Python sharing a deque between multiprocessing processes

I've been looking at the following questions for the past hour without any luck:
Python sharing a dictionary between parallel processes
multiprocessing: sharing a large read-only object between processes?
multiprocessing in python - sharing large object (e.g. pandas dataframe) between multiple processes
I've written a very basic test file to illustrate what I'm trying to do:
from collections import deque
from multiprocessing import Process
import numpy as np

class TestClass:
    def __init__(self):
        self.mem = deque(maxlen=4)
        self.process = Process(target=self.run)

    def run(self):
        while True:
            self.mem.append(np.array([0, 1, 2, 3, 4]))

def print_values(x):
    while True:
        print(x)

test = TestClass()
process = Process(target=print_values(test.mem))
test.process.start()
process.start()
Currently this outputs the following:
deque([], maxlen=4)
How can I access the mem values from the main code or the process that runs print_values?
Unfortunately multiprocessing.Manager() doesn't support deque, but it does work with list, dict, Queue, Value and Array. A list is fairly close, so I've used it in the example below.
from multiprocessing import Process, Manager, Lock
import numpy as np

class TestClass:
    def __init__(self):
        self.maxlen = 4
        self.manager = Manager()
        self.mem = self.manager.list()
        self.lock = self.manager.Lock()
        self.process = Process(target=self.run, args=(self.mem, self.lock))

    def run(self, mem, lock):
        while True:
            array = np.random.randint(0, high=10, size=5)
            with lock:
                if len(mem) >= self.maxlen:
                    mem.pop(0)
                mem.append(array)

def print_values(mem, lock):
    while True:
        with lock:
            print(mem)

test = TestClass()
print_process = Process(target=print_values, args=(test.mem, test.lock))
test.process.start()
print_process.start()
test.process.join()
print_process.join()
You have to be a little careful using manager objects. You can use them much like the objects they reference, but you can't do something like... mem = mem[-4:] to truncate the values, because that rebinds the name to a new, unshared list instead of modifying the managed one.
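To illustrate the difference, a quick self-contained sketch (the in-place calls mirror the pop(0) used in run above):

from multiprocessing import Manager

if __name__ == "__main__":
    manager = Manager()
    mem = manager.list(range(10))

    # Slicing returns a plain, unshared local list; rebinding 'mem' to it would
    # leave the managed list (and every other process holding the proxy) untouched
    local_copy = mem[-4:]

    # Mutating the managed list in place is visible to every process
    while len(mem) > 4:
        mem.pop(0)

    print(local_copy)  # [6, 7, 8, 9] -- a detached copy
    print(list(mem))   # [6, 7, 8, 9] -- the shared list, trimmed in place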
As for coding style, I might move the Manager objects outside the class or move the print_values function inside it but for an example, this works. If you move things around, just note that you can't use self.mem directly in the run method. You need to pass it in when you start the process or the fork that python does in the background will create a new instance and it won't be shared.
Hopefully this works for your situation, if not, we can try to adapt it a bit.
So by combining the code provided by @bivouac0 and the comment @Marijn Pieters posted, I came up with the following solution:
from multiprocessing import Process, Manager, Queue

class testClass:
    def __init__(self, maxlen=4):
        self.mem = Queue(maxsize=maxlen)
        self.process = Process(target=self.run)

    def run(self):
        i = 0
        while True:
            self.mem.empty()
            while not self.mem.full():
                self.mem.put(i)
                i += 1

def print_values(queue):
    while True:
        values = queue.get()
        print(values)

if __name__ == "__main__":
    test = testClass()
    print_process = Process(target=print_values, args=(test.mem,))
    test.process.start()
    print_process.start()
    test.process.join()
    print_process.join()

Apply a method to a list of objects in parallel using multi-processing

I have created a class with a number of methods. One of the methods, my_process, is very time-consuming, and I'd like to run that method in parallel. I came across Python Multiprocessing - apply class method to a list of objects but I'm not sure how to apply it to my problem, or what effect it will have on the other methods of my class.
class MyClass():
    def __init__(self, input):
        self.input = input
        self.result = int

    def my_process(self, multiply_by, add_to):
        self.result = self.input * multiply_by
        self._my_sub_process(add_to)
        return self.result

    def _my_sub_process(self, add_to):
        self.result += add_to

list_of_numbers = range(0, 5)
list_of_objects = [MyClass(i) for i in list_of_numbers]
list_of_results = [obj.my_process(100, 1) for obj in list_of_objects]  # multi-process this for-loop
print list_of_numbers
print list_of_results
[0, 1, 2, 3, 4]
[1, 101, 201, 301, 401]
I'm going to go against the grain here, and suggest sticking to the simplest thing that could possibly work ;-) That is, Pool.map()-like functions are ideal for this, but are restricted to passing a single argument. Rather than make heroic efforts to worm around that, simply write a helper function that only needs a single argument: a tuple. Then it's all easy and clear.
Here's a complete program taking that approach, which prints what you want under Python 2, and regardless of OS:
import multiprocessing as mp

NUM_CORE = 4  # set to the number of cores you want to use

class MyClass():
    def __init__(self, input):
        self.input = input
        self.result = int

    def my_process(self, multiply_by, add_to):
        self.result = self.input * multiply_by
        self._my_sub_process(add_to)
        return self.result

    def _my_sub_process(self, add_to):
        self.result += add_to

def worker(arg):
    obj, m, a = arg
    return obj.my_process(m, a)

if __name__ == "__main__":
    list_of_numbers = range(0, 5)
    list_of_objects = [MyClass(i) for i in list_of_numbers]
    pool = mp.Pool(NUM_CORE)
    list_of_results = pool.map(worker, ((obj, 100, 1) for obj in list_of_objects))
    pool.close()
    pool.join()
    print list_of_numbers
    print list_of_results
A bit of magic
I should note there are many advantages to taking the very simple approach I suggest. Beyond that it "just works" on Pythons 2 and 3, requires no changes to your classes, and is easy to understand, it also plays nice with all of the Pool methods.
However, if you have multiple methods you want to run in parallel, it can get a bit annoying to write a tiny worker function for each. So here's a tiny bit of "magic" to worm around that. Change worker() like so:
def worker(arg):
    obj, methname = arg[:2]
    return getattr(obj, methname)(*arg[2:])
Now a single worker function suffices for any number of methods, with any number of arguments. In your specific case, just change one line to match:
list_of_results = pool.map(worker, ((obj, "my_process", 100, 1) for obj in list_of_objects))
More-or-less obvious generalizations can also cater to methods with keyword arguments. But, in real life, I usually stick to the original suggestion. At some point catering to generalizations does more harm than good. Then again, I like obvious things ;-)
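One possible shape for that generalization, sketched here purely as an illustration (the tuple layout is my own choice, not part of the original answer), packs positional and keyword arguments explicitly:

def worker(arg):
    # tuple layout: (object, method name, positional args, keyword args)
    obj, methname, args, kwargs = arg
    return getattr(obj, methname)(*args, **kwargs)

# usage with the earlier example:
# pool.map(worker, ((obj, "my_process", (100,), {"add_to": 1}) for obj in list_of_objects))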
If your class is not "huge", I think a process-oriented approach is better.
Pool in multiprocessing is suggested.
This is the tutorial -> https://docs.python.org/2/library/multiprocessing.html#using-a-pool-of-workers
Then separate add_to from my_process, since they are quick and you can wait until the end of the last process.
from multiprocessing import Pool

def my_process(input, multiply_by):
    return xxxx

def add_to(result, a_list):
    xxx

p = Pool(5)
res = []
for i in range(10):
    res.append(p.apply_async(my_process, (i, 5)))
p.close()
p.join()  # wait for the end of the last process
for i in range(10):
    print res[i].get()
Generally the easiest way to run the same calculation in parallel is the map method of a multiprocessing.Pool (or the as_completed function from concurrent.futures in Python 3).
However, the map method applies a function that only takes one argument to an iterable of data using multiple processes.
So this function cannot be a normal method, because a normal method needs at least two arguments: it must also receive self. It could be a staticmethod, however. See also this answer for a more in-depth explanation.
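A minimal sketch of that staticmethod route, with illustrative names and numbers chosen to reproduce the output the question expects:

import multiprocessing as mp

class MyClass:
    def __init__(self, input):
        self.input = input

    @staticmethod
    def process_one(value):
        # takes exactly one argument, so Pool.map can call it directly
        return value * 100 + 1

if __name__ == "__main__":
    list_of_objects = [MyClass(i) for i in range(5)]
    with mp.Pool(4) as pool:
        list_of_results = pool.map(MyClass.process_one, [obj.input for obj in list_of_objects])
    print(list_of_results)  # [1, 101, 201, 301, 401]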
Based on the answer of Python Multiprocessing - apply class method to a list of objects and your code:
Add the MyClass object into a simulation object:
import multiprocessing
import os
import sys

class simulation(multiprocessing.Process):
    def __init__(self, id, worker, *args, **kwargs):
        # must call this before anything else
        multiprocessing.Process.__init__(self)
        self.id = id
        self.worker = worker
        self.args = args
        self.kwargs = kwargs
        sys.stdout.write('[%d] created\n' % (self.id))
Run what you want in the run function:
    def run(self):
        sys.stdout.write('[%d] running ... process id: %s\n' % (self.id, os.getpid()))
        self.worker.my_process(*self.args, **self.kwargs)
        sys.stdout.write('[%d] completed\n' % (self.id))
Try this:
list_of_numbers = range(0, 5)
list_of_objects = [MyClass(i) for i in list_of_numbers]
list_of_sim = [simulation(id=k, worker=obj, multiply_by=100*k, add_to=10*k)
               for k, obj in enumerate(list_of_objects)]
for sim in list_of_sim:
    sim.start()
If you don't absolutely need to stick with the multiprocessing module, this can easily be achieved using the concurrent.futures library.
Here's the example code:
from concurrent.futures.thread import ThreadPoolExecutor, wait

MAX_WORKERS = 20

class MyClass():
    def __init__(self, input):
        self.input = input
        self.result = int

    def my_process(self, multiply_by, add_to):
        self.result = self.input * multiply_by
        self._my_sub_process(add_to)
        return self.result

    def _my_sub_process(self, add_to):
        self.result += add_to

def on_finish(future):
    result = future.result()  # do stuff with your result

list_of_numbers = range(0, 5)
list_of_objects = [MyClass(i) for i in list_of_numbers]

with ThreadPoolExecutor(MAX_WORKERS) as executor:
    for obj in list_of_objects:
        executor.submit(obj.my_process, 100, 1).add_done_callback(on_finish)
Here the executor returns a future for every task it submits. Keep in mind that if you use add_done_callback(), the finished task returns to the main thread (which would block your main thread); if you really want true parallelism, you should wait for the future objects separately. Here's the code snippet for that.
futures = []
with ThreadPoolExecutor(MAX_WORKERS) as executor:
    for obj in list_of_objects:
        futures.append(executor.submit(obj.my_process, 100, 1))

done, not_done = wait(futures)
for future in done:
    # work with your result here
    if future.exception() is None:
        print(future.result())
    else:
        print(future.exception())
hope this helps.

Printing an update line whenever a subprocess finishes in Python 3's multiprocessing Pool

I'm using Python's multiprocessing library to process a list of inputs with the built-in map() method. Here's the relevant code segment:
subp_pool = Pool(self.subprocesses)
cases = subp_pool.map(self.get_case, input_list)
return cases
The function to be run in parallel is self.get_case(), and the list of inputs is input_list.
I wish to print a progress prompt to the standard output in the following format:
Working (25/100 cases processed)
How can I update a local variable inside the class that contains the Pool, so that whenever a subprocess finishes, the variable is incremented by 1 (and then printed to the standard output)?
There's no way to do this using multiprocessing.map, because it doesn't alert the main process about anything until it's completed all its tasks. However, you can get similar behavior by using apply_async in tandem with the callback keyword argument:
from multiprocessing.dummy import Pool
from functools import partial
import time

class Test(object):
    def __init__(self):
        self.count = 0
        self.threads = 4

    def get_case(self, x):
        time.sleep(x)

    def callback(self, total, x):
        self.count += 1
        print("Working ({}/{}) cases processed.".format(self.count, total))

    def do_async(self):
        thread_pool = Pool(self.threads)
        input_list = range(5)
        callback = partial(self.callback, len(input_list))
        tasks = [thread_pool.apply_async(self.get_case, (x,),
                                         callback=callback) for x in input_list]
        return [task.get() for task in tasks]

if __name__ == "__main__":
    t = Test()
    t.do_async()
Call the print_data() from the get_case() method and you are done.
from threading import Lock

class A(object):
    def __init__(self):
        self.mutex = Lock()
        self.count = 0

    def print_data(self):
        self.mutex.acquire()
        try:
            self.count += 1
            print('Working (' + str(self.count) + ' cases processed)')
        finally:
            self.mutex.release()

Is modifying a class variable in python threadsafe?

I was reading this question (which you do not have to read because I will copy what is there... I just wanted to show you my inspiration)...
So, if I have a class that counts how many instances were created:
class Foo(object):
    instance_count = 0
    def __init__(self):
        Foo.instance_count += 1
My question is, if I create Foo objects in multiple threads, is instance_count going to be correct? Are class variables safe to modify from multiple threads?
It's not threadsafe even on CPython. Try this to see for yourself:
import threading

class Foo(object):
    instance_count = 0

def inc_by(n):
    for i in xrange(n):
        Foo.instance_count += 1

threads = [threading.Thread(target=inc_by, args=(100000,)) for thread_nr in xrange(100)]
for thread in threads: thread.start()
for thread in threads: thread.join()
print(Foo.instance_count)  # Expected 10M for threadsafe ops, I get around 5M
The reason is that while INPLACE_ADD is atomic under the GIL, the attribute is still loaded and stored (see dis.dis(Foo.__init__)). Use a lock to serialize access to the class variable:
Foo.lock = threading.Lock()

def interlocked_inc(n):
    for i in xrange(n):
        with Foo.lock:
            Foo.instance_count += 1

threads = [threading.Thread(target=interlocked_inc, args=(100000,)) for thread_nr in xrange(100)]
for thread in threads: thread.start()
for thread in threads: thread.join()
print(Foo.instance_count)
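To see the separate load and store steps this answer points at, a quick sketch you can run yourself (the exact opcode names vary between CPython versions):

import dis

class Foo(object):
    instance_count = 0
    def __init__(self):
        Foo.instance_count += 1

# Look for the LOAD_ATTR ... INPLACE_ADD/BINARY_OP ... STORE_ATTR sequence:
# another thread can be scheduled between the load and the store, losing an update.
dis.dis(Foo.__init__)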
No, it is not thread safe. I faced a similar problem a few days ago, and I chose to implement the lock thanks to a decorator. The benefit is that it makes the code readable:
import threading

def threadsafe_function(fn):
    """decorator making sure that the decorated function is thread safe"""
    lock = threading.Lock()
    def new(*args, **kwargs):
        lock.acquire()
        try:
            r = fn(*args, **kwargs)
        except Exception as e:
            raise e
        finally:
            lock.release()
        return r
    return new

class X:
    var = 0

    @threadsafe_function
    def inc_var(self):
        X.var += 1
        return X.var
Following on from luc's answer, here's a simplified decorator using a with context manager and a little __main__ code to spin up the test. Try it with and without the @synchronized decorator to see the difference.
import concurrent.futures
import functools
import threading

def synchronized(function):
    lock = threading.Lock()

    @functools.wraps(function)
    def wrapper(self, *args, **kwargs):
        with lock:
            return function(self, *args, **kwargs)
    return wrapper

class Foo:
    counter = 0

    @synchronized
    def increase(self):
        Foo.counter += 1

if __name__ == "__main__":
    foo = Foo()
    print(f"Start value is {foo.counter}")
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        for index in range(200000):
            executor.submit(foo.increase)
    print(f"End value is {foo.counter}")
Without @synchronized
End value is 198124
End value is 196827
End value is 197968
With @synchronized
End value is 200000
End value is 200000
End value is 200000
Is modifying a class variable in python threadsafe?
It depends on the operation.
While the Python GIL (Global Interpreter Lock) only allows one thread to run at a time, per atomic operation, some operations are not atomic; that is, they are implemented with more than one operation. For example, given that L, L1, L2 are lists, D, D1, D2 are dicts, x, y are objects, and i, j are ints:
i = i+1
L.append(L[-1])
L[i] = L[j]
D[x] = D[x] + 1
See What kinds of global value mutation are thread-safe?
Your example is included in the non-safe operations, as += is shorthand for i = i + 1.
Other posters have shown how to make the operation thread-safe. An alternative thread-safe way to implement your operation, without using a thread locking mechanism would be to reference a different variable, only set via an atomic operation. For example
max_reached = False

# in one thread
count = 0
maximum = 100

count += 1
if count >= maximum:
    max_reached = True

# in another thread
while not max_reached:
    time.sleep(1)
# do something
This would be thread safe, as long as only one thread increments the count.
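A slightly more idiomatic way to express the same single-writer signal, offered only as a sketch of an alternative, is threading.Event:

import threading
import time

max_reached = threading.Event()
count = 0
maximum = 100

def producer():
    global count
    while count < maximum:
        count += 1           # only this thread writes count
    max_reached.set()        # the Event flag is safe to observe from any thread

def consumer():
    while not max_reached.is_set():
        time.sleep(1)
    # do something

threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()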
I would say it is thread-safe, at least on the CPython implementation. The GIL will make all your "threads" run sequentially, so they will not be able to mess with your reference count.
