I have a class with a method which modifies its internal state, for instance:
class Example:
    def __init__(self, value):
        self.param = value

    def example_method(self, m):
        self.param = self.param * m
        # By convention, these methods in my implementation return the object itself
        return self
I want to run example_method in parallel for many instances of Example (I am using the mpire library, but other options are welcome as well) and have the changes to internal state kept in those instances. Something like:
import mpire

list_of_instances = [Example(i) for i in range(1, 6)]

def run_method(ex):
    ex.example_method(10)

print("Before parallel calls, this should print <1>")
print(f"<{list_of_instances[0].param}>")

with mpire.WorkerPool(n_jobs=3) as pool:
    pool.map_unordered(run_method, [(example,) for example in list_of_instances])

print("After parallel calls, this should print <10>")
print(f"<{list_of_instances[0].param}>")
However, because of the way mpire works, what gets modified are copies of each example, not the objects inside list_of_instances, so any changes to internal state are lost after the parallel processing. The second print therefore shows <1> instead of <10>: the original object's internal state was never changed, only that of a copy.
I am wondering if there are any solutions to have the internal state changes be applied to the original objects in list_of_instances.
The only solution I can think of is to replace list_of_instances with the result of pool.map_unordered (or pool.map if order is important), as sketched below, since in any other case (even when using shared_objects) a copy of the original objects is made and the state changes are lost.
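A minimal sketch of that first option (assuming mpire's WorkerPool.map collects the workers' return values in input order, which is my understanding of its API):

import mpire

def run_method(ex):
    # example_method returns self, so the modified copy comes back as the result
    return ex.example_method(10)

list_of_instances = [Example(i) for i in range(1, 6)]
with mpire.WorkerPool(n_jobs=3) as pool:
    # rebind the list to the returned (modified) copies
    list_of_instances = pool.map(run_method, [(ex,) for ex in list_of_instances])

print(list_of_instances[0].param)  # 10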
Is there any way to solve this with parallel processing? I also accept answers using other libs.
Related
Consider the following snippet:
import concurrent.futures
import time
from random import random

class Test(object):
    def __init__(self):
        self.my_set = set()

    def worker(self, name):
        temp_set = set()
        temp_set.add(name)
        temp_set.add(name * 10)
        time.sleep(random() * 5)
        temp_set.add(name * 10 + 1)
        self.my_set = self.my_set.union(temp_set)  # question 1
        return name

    def start(self):
        result = []
        names = [1, 2, 3, 4, 5, 6, 7]
        with concurrent.futures.ThreadPoolExecutor(max_workers=len(names)) as executor:
            futures = [executor.submit(self.worker, x) for x in names]
            for future in concurrent.futures.as_completed(futures):
                result.append(future.result())  # question 2
Is there a chance self.my_set can become corrupted via the line marked "question 1"? I believe union is atomic, but couldn't the assignment be a problem?
Is there a problem on the line marked "question 2"? I believe the list append is atomic, so perhaps this is ok.
I've read these docs:
https://docs.python.org/3/library/stdtypes.html#set
https://web.archive.org/web/20201101025814id_/http://effbot.org/zone/thread-synchronization.htm
Is Python variable assignment atomic?
https://docs.python.org/3/glossary.html#term-global-interpreter-lock
and executed the code snippet provided in that question, but I can't find a definitive answer to how concurrency should work in this case.
Regarding question 1: Think about what's going on here:
self.my_set = self.my_set.union(temp_set)
There's a sequence of at least three distinct steps:
The worker reads self.my_set (i.e., it grabs a reference to the existing set object).
The union function constructs a new set.
The worker assigns self.my_set to refer to the newly constructed set.
So what happens if two or more workers concurrently try to do the same thing? (note: it's not guaranteed to happen this way, but it could happen this way.)
Each of them could grab a reference to the original my_set.
Each of them could compute a new set, consisting only of the original members of my_set plus its own contribution.
Each of them could assign its new set to the my_set variable.
The problem is in step three. If it happened this way, then each of those new sets would only contain the contribution from the one worker that created it. There would be no single set containing the new contributions from all of the workers. When it's all over, my_set would only refer to one of those new sets (whichever thread was the last to perform the assignment would "win"), and the other new sets would all be thrown away.
One way to prevent that would be to use mutual exclusion to keep other threads from trying to compute their new sets and update the shared variable at the same time:
import threading

class Test(object):
    def __init__(self):
        self.my_set = set()
        self.my_set_mutex = threading.Lock()

    def worker(self, name):
        ...
        with self.my_set_mutex:
            self.my_set = self.my_set.union(temp_set)
        return name
Regarding question 2: It doesn't matter whether or not appending to a list is "atomic." The result variable is local to the start method. In the code that you've shown, the list to which result refers is inaccessible to any other thread than the one that created it. There can't be any interference between threads unless you share the list with other threads.
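Following the same reasoning, another way to avoid the race entirely is to keep all writes to my_set in the one thread that owns it: have each worker return its temp_set and let start() do the union. A rough sketch of your example rearranged that way (no lock needed):

def worker(self, name):
    temp_set = {name, name * 10}
    time.sleep(random() * 5)
    temp_set.add(name * 10 + 1)
    return name, temp_set  # hand the data back instead of touching self.my_set

def start(self):
    result = []
    names = [1, 2, 3, 4, 5, 6, 7]
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(names)) as executor:
        futures = [executor.submit(self.worker, x) for x in names]
        for future in concurrent.futures.as_completed(futures):
            name, temp_set = future.result()
            self.my_set |= temp_set  # only the main thread ever writes my_set
            result.append(name)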
Issue: I have 2 functions that both require the same nested functions to operate so they're currently copy-pasted into each function. These functions cannot be combined as the second function relies on calling the first function twice. Unnesting the functions would result in the addition of too many parameters.
Question: Is it better to run the nested functions in the first function and append their values to an object to be fed into the 2nd function, or is it better to copy and paste the nested functions?
Example:
def func_A(thing):
    def sub_func_A(thing):
        thing += 1
        return thing
    return sub_func_A(thing)

def func_B(thing):
    def sub_func_B(thing):
        thing += 1
        return thing
    val_A, val_B = func_A(5), func_A(5)
    return sub_func_B(val_A), sub_func_B(val_B)
Imagine these functions couldn't be combined and that the nested function relied on so many parameters that moving it outside and calling it would be too cluttered.
The "better option" depends on a few factors:
The type of optimization you want to achieve.
The time taken by the functions to execute.
If the optimization to be achieved is based on the time taken to execute the second function in the two cases, then it depends on how long the nested function takes to fully execute. If that time is less than the time taken to store its output when it is first called by the first function, then it is better to copy-paste the nested functions.
If, on the other hand, the nested function takes longer to execute than storing its output does, then it is better to execute it once and store its output for future use (see the sketch below).
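If you do go the "store the output" route, one lightweight way to do it (a sketch using functools, not code from the question) is to memoize the shared helper so repeated calls with the same argument reuse the stored result:

from functools import lru_cache

@lru_cache(maxsize=None)
def shared_helper(thing):
    # stands in for the (expensive) nested function
    return thing + 1

def func_A(thing):
    return shared_helper(thing)

def func_B(thing):
    val_A, val_B = func_A(5), func_A(5)  # the second call hits the cache
    return shared_helper(val_A), shared_helper(val_B)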
Further, as mentioned by @DarylG in the comments, a class-based approach can also be used, wherein the nested function (subfunction) becomes a private method (only accessible from the class's own code), while the two functions (func_A and func_B) remain public and can be used and accessed from the outside as well. Implemented in code it might look something like this:
class MyClass:
    def __init__(self):
        # ... any initialization goes here ...
        pass

    def __subfunc(self, thing):
        # PRIVATE SUBFUNC
        thing += 1
        return thing

    def func_A(self, thing):
        # PUBLIC FUNC A
        return self.__subfunc(thing)

    def func_B(self, thing):
        # PUBLIC FUNC B
        val_A, val_B = self.func_A(5), self.func_A(5)
        return self.__subfunc(val_A), self.__subfunc(val_B)
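For illustration, usage might look like this (assuming the constructor takes no extra arguments beyond what is shown above):

obj = MyClass()
print(obj.func_A(5))     # 6
print(obj.func_B(None))  # (7, 7)
# Name mangling keeps the helper private: obj.__subfunc(5) raises AttributeError.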
I am using the Pool class from Python's multiprocessing library to write a program that will run on an HPC cluster.
Here is an abstraction of what I am trying to do:
from multiprocessing import Pool

def myFunction(x):
    # myObject is a global variable in this case
    return myFunction2(x, myObject)

def myFunction2(x, myObject):
    myObject.modify()  # here I am calling some method that changes myObject
    return myObject.f(x)

poolVar = Pool()
argsArray = [ARGS ARRAY GOES HERE]
output = poolVar.map(myFunction, argsArray)
The function f(x) is contained in a *.so file, i.e., it is calling a C function.
The problem I am having is that the value of the output variable is different each time I run my program (even though the function myObject.f() is a deterministic function). (If I only have one process then the output variable is the same each time I run the program.)
I have tried creating the object rather than storing it as a global variable:
def myFunction(x):
    myObject = createObject()
    return myFunction2(x, myObject)
However, in my program the object creation is expensive, and thus, it is a lot easier to create myObject once and then modify it each time I call myFunction2(). Thus, I would like to not have to create the object each time.
Do you have any tips? I am very new to parallel programming so I could be going about this all wrong. I decided to use the Pool class since I wanted to start with something simple. But I am willing to try a better way of doing it.
I am using the Pool class from python's multiprocessing library to do
some shared memory processing on an HPC cluster.
Processes are not threads! You cannot simply replace Thread with Process and expect all to work the same. Processes do not share memory, which means that the global variables are copied, hence their value in the original process doesn't change.
If you want to use shared memory between processes then you must use the multiprocessing's data types, such as Value, Array, or use the Manager to create shared lists etc.
In particular you might be interested in the Manager.register method, which allows the Manager to create shared custom objects (although they must be picklable).
However, I'm not sure whether this will improve the performance, since any communication between processes requires pickling, and pickling usually takes more time than simply instantiating the object.
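For illustration, a rough sketch of the Manager.register route (MyObject here is a placeholder standing in for your real class):

from multiprocessing.managers import BaseManager

class MyObject:
    def __init__(self):
        self.state = 0
    def modify(self):
        self.state += 1
    def f(self, x):
        return self.state * x

class MyManager(BaseManager):
    pass

MyManager.register('MyObject', MyObject)

if __name__ == '__main__':
    manager = MyManager()
    manager.start()
    shared = manager.MyObject()  # a proxy; method calls are forwarded to the manager process
    shared.modify()
    print(shared.f(3))           # 3
    manager.shutdown()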
Note that you can do some initialization of the worker processes by passing the initializer and initargs arguments when creating the Pool.
For example, in its simplest form, to create a global variable in the worker process:
def initializer():
    global data
    data = createObject()
Used as:
pool = Pool(4, initializer, ())
Then the worker functions can use the data global variable without worries.
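Putting that together with the code from the question, a minimal sketch might look like this (createObject, modify, f and argsArray are the placeholders from the question):

from multiprocessing import Pool

def initializer():
    global data
    data = createObject()  # built once per worker process

def myFunction(x):
    # each worker reuses its own copy of data; any modifications stay local to that worker
    data.modify()
    return data.f(x)

if __name__ == '__main__':
    pool = Pool(4, initializer, ())
    output = pool.map(myFunction, argsArray)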
Style note: Never use the name of a built-in for your variables/modules. In your case object is a built-in. Otherwise you'll end up with unexpected errors which may be obscure and hard to track down.
The global keyword works within the same module (file) only. Another way is to set the value dynamically in the pool process initializer; somefile.py can just be an empty file:
import importlib

def pool_process_init():
    m = importlib.import_module("somefile")  # module name, without the .py extension
    m.my_global_var = "some value"

pool = Pool(4, initializer=pool_process_init)
To use the variable in a task:
def my_coroutine():
    m = importlib.import_module("somefile")
    print(m.my_global_var)
I have a class called Experiment and another called Case. One Experiment is made up of many individual Cases. See the class definitions below:
from multiprocessing import Process

class Experiment(object):
    def __init__(self, name):
        self.name = name
        self.cases = []
        self.cases.append(Case('a'))
        self.cases.append(Case('b'))
        self.cases.append(Case('c'))

    def sr_execute(self):
        for c in self.cases:
            c.setVars(6)

class Case(object):
    def __init__(self, name):
        self.name = name

    def setVars(self, var):
        self.var = var
In my Experiment class, I have a function called sr_execute. This function shows the desired behavior: I am interested in iterating through all cases and setting an attribute on each of them. When I run the following code,
if __name__ == '__main__':
    #multiprocessing.freeze_support()
    e = Experiment('exp')
    e.sr_execute()
    for c in e.cases: print c.name, c.var
I get,
a 6
b 6
c 6
This is the desired behavior.
However, I would like to do this in parallel using multiprocessing. To do this, I add a mp_execute() function to the Experiment Class,
def mp_execute(self):
    processes = []
    for c in self.cases:
        processes.append(Process(target=c.setVars, args=(6,)))
    [p.start() for p in processes]
    [p.join() for p in processes]
However, this does not work. When I execute the following,
if __name__ == '__main__':
    #multiprocessing.freeze_support()
    e = Experiment('exp')
    e.mp_execute()
    for c in e.cases: print c.name, c.var
I get an error,
AttributeError: 'Case' object has no attribute 'var'
Apparently, I am unable to set an instance attribute using multiprocessing.
Any clues as to what is going on?
When you call:
def mp_execute(self):
    processes = []
    for c in self.cases:
        processes.append(Process(target=c.setVars, args=(6,)))
    [p.start() for p in processes]
    [p.join() for p in processes]
when you create the Process it will use a copy of your object, and the modifications made to that copy are not passed back to the main program, because different processes have different address spaces. It would work if you used threads instead, since in that case no copy is created.
Also note that your code will probably fail on Windows, because you are passing a method as the target and Windows requires the target to be picklable (and instance methods are not picklable).
The target should be a function defined at the top level of a module in order to work on all OSes.
If you want to communicate to the main process the changes you could:
Use a Queue to pass the result
Use a Manager to build a shared object
Anyway you must handle the communication "explicitly", either by setting up a "channel" (like a Queue) or by setting up shared state (see the sketch below).
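For example, a rough sketch of the Queue option applied to your Experiment/Case code (set_vars is a new top-level helper so it stays picklable on Windows; mp_execute is the method on Experiment):

from multiprocessing import Process, Queue

def set_vars(case, var, queue):
    case.setVars(var)                 # this modifies the copy living in the child process
    queue.put((case.name, case.var))  # send the result back explicitly

def mp_execute(self):
    queue = Queue()
    processes = [Process(target=set_vars, args=(c, 6, queue)) for c in self.cases]
    for p in processes:
        p.start()
    # drain the queue before joining, so large results cannot block the children
    results = dict(queue.get() for _ in processes)
    for p in processes:
        p.join()
    for c in self.cases:
        c.var = results[c.name]       # apply the changes to the originals in the parent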
Style note: Do not use list-comprehensions in this way:
[p.join() for p in processes]
it's simply wrong. You are only wasting space creating a list of Nones, and it is also probably slower compared to the right way:
for p in processes:
    p.join()
since the list-comprehension additionally has to append each element to the list it builds.
Some say that list-comprehensions are slightly faster than for loops, however:
The difference in performance is so small that it generally doesn't matter
They are faster if and only if you consider this kind of loop:
a = []
for element in something:
    a.append(element)
If the loop, like in this case, does not create a list, then the for loop will be faster.
By the way: some use map in the same way to perform side effects. This again is wrong: you won't gain much in speed for the same reason as before, and it fails completely in Python 3, where map returns an iterator and hence will not execute the functions at all, making the code less portable.
@Bakuriu's answer offers good styling and efficiency suggestions. And it is true that each process gets a copy of the master process's memory, hence the changes made by forked processes will not be reflected in the address space of the master process unless you utilize some form of IPC (e.g. a Queue, a Pipe, or a Manager).
But the particular AttributeError: 'Case' object has no attribute 'var' error that you are getting has an additional reason, namely that your Case objects do not yet have the var attribute at the time you launch your processes. Instead, the var attribute is created in the setVars() method.
Your forked processes do indeed create the variable when they call setVars() (and actually even set it to 6), but alas, this change is only in the copies of Case objects, i.e. not reflected in the master process's memory space (where the variable still does not exist).
To see what I mean, change your Case class to this:
class Case(object):
    def __init__(self, name):
        self.name = name
        self.var = 7  # Create var in the constructor.

    def setVars(self, var):
        self.var = var
By adding the var member variable in the constructor, your master process will have access to it. Of course, the changes in the forked processes will still not be reflected in the master process, but at least you don't get the error:
a 7
b 7
c 7
Hope this sheds light on what's going on. =)
SOLUTION:
The least intrusive thing to do (relative to the original code) is to use a ctypes object backed by shared memory:
from multiprocessing import Value

class Case(object):
    def __init__(self, name):
        self.name = name
        self.var = Value('i', 7)  # Use a ctypes "int" from shared memory.

    def setVars(self, var):
        self.var.value = var  # Set the variable's "value" attribute.
and change your main() to print c.var.value:
for c in e.cases: print c.name, c.var.value # Print the "value" attribute.
Now you have the desired output:
a 6
b 6
c 6
I'm writing a program that uses genetic techniques to evolve equations.
I want to be able to submit the function 'mainfunc' to the Parallel Python 'submit' function.
The function 'mainfunc' calls two or three methods defined in the Utility class.
They instantiate other classes and call various methods.
I think what I want is all of it in one NAMESPACE.
So I've instantiated some (maybe it should be all) of the classes inside the function 'mainfunc'.
I call the Utility method 'generate()'. If we were to follow its chain of execution, it would involve all of the classes and methods in the code.
Now, the equations are stored in a tree. Each time a tree is generated, mutated or cross
bred, the nodes need to be given a new key so they can be accessed from a dictionary attribute of the tree. The class 'KeySeq' generates these keys.
In Parallel Python, I'm going to send multiple instances of 'mainfunc' to the 'submit' function of PP. Each has to be able to access 'KeySeq'. It would be nice if they all accessed the same instance of KeySeq so that none of the nodes on the returned trees had the same key, but I could get around that if necessary.
So: my question is about stuffing EVERYTHING into mainfunc.
Thanks
(Edit) If I don't include everything in mainfunc, I have to try to tell PP about dependent functions, etc., by passing various arguments in various places. I'm trying to avoid that.
(late Edit) If ks.next() is called inside the 'generate()' function, it raises the error: NameError: global name 'ks' is not defined
class KeySeq:
    "Iterator to produce sequential integers for keys in dict"
    def __init__(self, data=0):
        self.data = data
    def __iter__(self):
        return self
    def next(self):
        self.data = self.data + 1
        return self.data

class One:
    'some code'

class Two:
    'some code'

class Three:
    'some code'

class Utilities:
    def generate(self, x):
        '___________'
    def obfiscate(self, y):
        '___________'
    def ruminate(self, z):
        '__________'

def mainfunc(z):
    ks = KeySeq()
    one = One()
    two = Two()
    three = Three()
    utilities = Utilities()
    list_of_interest = utilities.generate(5)
    return list_of_interest

result = mainfunc(params)
It's fine to structure your program that way. A lot of command line utilities follow the same pattern:
# imports, utilities, other functions

def main(arg):
    # ...
    pass

if __name__ == '__main__':
    import sys
    main(sys.argv[1])
That way you can call the main function from another module by importing it, or you can run it from the command line.
If you want all of the instances of mainfunc to use the same KeySeq object, you can use the default parameter value trick:
def mainfunc(ks=KeySeq()):
    key = ks.next()
As long as you don't actually pass in a value of ks, all calls to mainfunc will use the instance of KeySeq that was created when the function was defined.
Here's why, in case you don't know: A function is an object. It has attributes. One of its attributes is named func_defaults; it's a tuple containing the default values of all of the arguments in its signature that have defaults. When you call a function and don't provide a value for an argument that has a default, the function retrieves the value from func_defaults. So when you call mainfunc without providing a value for ks, it gets the KeySeq() instance out of the func_defaults tuple. Which, for that instance of mainfunc, is always the same KeySeq instance.
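To see this for yourself, here is a small variant of the function above that returns the key (func_defaults is the Python 2 spelling used in this answer; Python 3 renames it to __defaults__):

def mainfunc(ks=KeySeq()):
    return ks.next()

print mainfunc.func_defaults   # a 1-tuple holding the single shared KeySeq instance
print mainfunc(), mainfunc()   # 1 2 -- every call keeps counting on that same instance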
Now, you say that you're going to send "multiple instances of mainfunc to the submit function of PP." Do you really mean multiple instances? If so, the mechanism I'm describing won't work.
But it's tricky to create multiple instances of a function (and the code you've posted doesn't do so). For example, this function does return a new instance of g every time it's called:
>>> def f():
...     def g(x=[]):
...         return x
...     return g
...
>>> g1 = f()
>>> g2 = f()
>>> g1().append('a')
>>> g2().append('b')
>>> g1()
['a']
>>> g2()
['b']
If I call g() with no argument, it returns the default value (initially an empty list) from its func_defaults tuple. Since g1 and g2 are different instances of the g function, their default value for the x argument is also a different instance, which the above demonstrates.
If you'd like to make this more explicit than using a tricky side-effect of default values, here's another way to do it:
def mainfunc():
    if not hasattr(mainfunc, "ks"):
        setattr(mainfunc, "ks", KeySeq())
    key = mainfunc.ks.next()
Finally, a super important point that the code you've posted overlooks: If you're going to be doing parallel processing on shared data, the code that touches that data needs to implement locking. Look at the callback.py example in the Parallel Python documentation and see how locking is used in the Sum class, and why.
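As a generic illustration (not the callback.py code itself): if several jobs or callbacks can touch the same KeySeq, wrap its next() in a lock, for example:

import threading

class LockedKeySeq(KeySeq):
    def __init__(self, data=0):
        KeySeq.__init__(self, data)
        self.lock = threading.Lock()

    def next(self):
        with self.lock:
            return KeySeq.next(self)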
Your concept of classes in Python is not sound, I think. Perhaps it would be a good idea to review the basics. This link will help:
Python Basics - Classes