Python multiprocessing global variable access issue

I am a beginner at multiprocessing in Python. I have the following code, but I cannot understand why it is not working. Could someone explain why?
from multiprocessing import Pool

db_conn = ""
db_sel = ""

def new_func(i):
    global db_conn, db_sel
    print(db_conn, db_sel)

if __name__ == "__main__":
    db_conn = "56"
    db_sel = "64"
    with Pool() as p:
        p.map(new_func, range(5))
I am expecting the output to print 56 64 five times, but the program only prints empty strings. Does multiprocessing not read global variables?
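The explanations further down this page cover the cause: each Pool worker is a separate process, and with the default start method on Windows and macOS it re-imports the module, so it only ever sees the initial empty strings; the reassignments under the if __name__ guard run in the parent only. A minimal sketch of one common fix, handing the values to each worker through a Pool initializer (the helper name init_worker is illustrative):

from multiprocessing import Pool

db_conn = ""
db_sel = ""

def init_worker(conn, sel):
    # Runs once in each worker process and copies the values in.
    global db_conn, db_sel
    db_conn, db_sel = conn, sel

def new_func(i):
    print(db_conn, db_sel)

if __name__ == "__main__":
    with Pool(initializer=init_worker, initargs=("56", "64")) as p:
        p.map(new_func, range(5))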


Change global variable in threading

I am working with threading and multiprocessing and have encountered an issue. Here is my code:
import threading
import multiprocessing
from functools import partial

xxx = []

def func1(*args):
    # do something 1
    global xxx
    xxx.append('x')

def func2(*args2):
    func1x = partial(func1, ...)
    with multiprocessing.Pool(3) as pool:
        pool.map(func1x, arg)

def func3(arg3):
    while True:
        try:
            # do something 2
            global xxx
            print(xxx)
        except Exception:
            continue

def main():
    t1 = threading.Thread(target=func2, args=args2)
    t2 = threading.Thread(target=func3, args=(args3,))
    t1.start()
    t2.start()

main()
My code runs smoothly, without any error, with this nested multiprocessing and threading.
The problem is that even though func1() appends to the global xxx, the print call in func3() still prints [] instead of the ['x'] I expected.
I use the while loop to wait for func1() to update xxx, but it still does not work.
How can I see the new value of a global variable every time it is changed from a running thread?
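As the answers further down this page explain, each Pool worker is a separate process with its own copy of xxx, so appends made in func1 never reach the thread running func3 in the parent. A minimal sketch of one workaround, assuming a shared list from multiprocessing.Manager (the names here are illustrative):

import multiprocessing
import threading
import time

def func1(shared, item):
    # Runs in a worker process; the manager proxy forwards the
    # append back to the manager's server process.
    shared.append(item)

def func3(shared):
    for _ in range(3):
        print(list(shared))  # sees the appends made by the workers
        time.sleep(0.5)

if __name__ == "__main__":
    manager = multiprocessing.Manager()
    shared = manager.list()
    watcher = threading.Thread(target=func3, args=(shared,))
    watcher.start()
    with multiprocessing.Pool(3) as pool:
        pool.starmap(func1, [(shared, 'x')] * 5)
    watcher.join()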

Multiprocessing gets stuck after .get()

I'm trying to understand how multiprocessing works in Python and I'm having some issues.
This is the example:
import multiprocessing

def func():
    return 1

p = multiprocessing.Pool()
result = p.apply_async(func).get()
When the .get() function is called, the code just hangs. What am I doing wrong?
You need to put these two lines inside if __name__ == "__main__":, so your code should look like this:
import multiprocessing

def func():
    return 1

if __name__ == "__main__":
    p = multiprocessing.Pool()
    result = p.apply_async(func).get()
If the module is imported, as happens in each child process, creating the Pool at the top level causes an endless sequence of new processes. Putting those lines inside the if block works because the block does not execute during import.
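For background: with the spawn start method (the default on Windows, and on macOS since Python 3.8), every worker process re-imports the main module on startup, and that re-import is exactly what the if __name__ == "__main__": guard shields the Pool creation from.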
I do not have enough details to know exactly what the problem is, but I have a strong guess that putting these lines:
p = multiprocessing.Pool()
result = p.apply_async(func).get()
inside a function will fix your problem. Try this:
import multiprocessing

def func():
    return 1

def main():
    p = multiprocessing.Pool()
    result = p.apply_async(func).get()
    print(result)

if __name__ == '__main__':
    main()
Tell me if it worked :)

Python - multiprocessing max # of processes

I would like to create and run at most N processes at once. As soon as a process finishes, a new one should take its place. The following code works (assuming Dostuff is the function to execute), but the problem is that I am using a loop and need time.sleep to allow the processes to do their work. This is rather inefficient. What's the best method for this task?
import time, multiprocessing

if __name__ == "__main__":
    Jobs = []
    for i in range(10):
        while len(Jobs) >= 4:
            NotDead = []
            for Job in Jobs:
                if Job.is_alive():
                    NotDead.append(Job)
            Jobs = NotDead
            time.sleep(0.05)
        NewJob = multiprocessing.Process(target=Dostuff)
        Jobs.append(NewJob)
        NewJob.start()
After a bit of tinkering, I thought about creating new threads and then launching my processes from those threads, like so:
import threading, multiprocessing, time

def processf(num):
    print("in process:", num)
    now = time.clock()
    while time.clock() - now < 2:
        pass  # ..Intensive processing..

def main():
    z = [0]
    lock = threading.Lock()

    def threadf():
        while z[0] < 20:
            lock.acquire()
            work = multiprocessing.Process(target=processf, args=(z[0],))
            z[0] = z[0] + 1
            lock.release()
            work.start()
            work.join()

    activet = []
    for i in range(2):
        newt = threading.Thread(target=threadf)
        activet.append(newt)
        newt.start()
    for i in activet:
        i.join()

if __name__ == "__main__":
    main()
This solution is better (it doesn't slow down the launched processes); however, I wouldn't really trust code that I wrote in a field I don't know. I've had to use a list (z = [0]) since an integer is immutable. Is there a way to embed processf into main()? I'd prefer not to need an additional global variable. If I simply copy/paste the function inside, I get a nasty error: AttributeError: can't pickle local object 'main.<locals>.processf'.
Why not use concurrent.futures.ThreadPoolExecutor?
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=20)
res = executor.submit(any_def)
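Since the question is about processes rather than threads, a ProcessPoolExecutor does the same job while capping the number of worker processes; a minimal sketch, with Dostuff standing in for the real work function:

from concurrent.futures import ProcessPoolExecutor

def Dostuff(i):
    return i * i  # placeholder for the real work

if __name__ == "__main__":
    # At most 4 worker processes; as soon as one finishes a task,
    # the pool hands it the next one, with no polling loop needed.
    with ProcessPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(Dostuff, range(10)))
    print(results)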

Python multiprocessing silent failure with class

The following does not work using Python 2.7.9, but it also does not throw any error or exception. Is this a bug, or can multiprocessing not be used in a class?
from multiprocessing import Pool

def testNonClass(arg):
    print "running %s" % arg
    return arg

def nonClassCallback(result):
    print "Got result %s" % result

class Foo:
    def __init__(self):
        po = Pool()
        for i in xrange(1, 3):
            po.apply_async(self.det, (i,), callback=self.cb)
        po.close()
        po.join()
        print "done with class"

        po = Pool()
        for i in xrange(1, 3):
            po.apply_async(testNonClass, (i,), callback=nonClassCallback)
        po.close()
        po.join()

    def cb(self, r):
        print "callback with %s" % r

    def det(self, M):
        print "method"
        return M+2

if __name__ == "__main__":
    Foo()
Running it prints this:
done with class
running 1
running 2
Got result 1
Got result 2
EDIT: This seems related, but it uses .map, while I specifically need to use apply_async, which seems to matter in how multiprocessing works with class instances (e.g. I don't get a pickling error, unlike many other questions related to this): Python how to do multiprocessing inside of a class?
Processes don't share state or memory by default; each process is an independent program. You need to either 1) use threading, 2) use specific types capable of sharing state, or 3) design your program to avoid shared state and rely on return values instead.
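A minimal sketch of option 3, relying on return values instead of shared state (the worker is a plain module-level function, so it pickles cleanly, and the parent collects whatever the workers return):

from multiprocessing import Pool

def square(x):
    # Pure function: no shared state; the result travels back
    # to the parent through the pool's result pickling.
    return x * x

if __name__ == "__main__":
    with Pool() as pool:
        results = pool.map(square, range(5))
    print(results)  # [0, 1, 4, 9, 16]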
Update
You have two issues in your code, and one is masking the other.
1) You don't do anything with the result of apply_async. You are using callbacks, but you still need to collect the results and handle them; because you aren't doing that, you never see the error caused by the second problem.
2) Methods of an object cannot be passed to other processes (bound methods don't pickle in Python 2). I was really annoyed when I first discovered this, but there is an easy workaround. Try this:
from multiprocessing import Pool

def _remote_det(foo, m):
    # Module-level shim: picklable, and simply forwards to the method.
    return foo.det(m)

class Foo:
    def __init__(self):
        po = Pool()
        results = []
        for i in xrange(1, 3):
            r = po.apply_async(_remote_det, (self, i), callback=self.cb)
            results.append(r)
        po.close()
        for r in results:
            r.wait()
            if not r.successful():
                # Raises an error when not successful
                r.get()
        po.join()
        print "done with class"

    def cb(self, r):
        print "callback with %s" % r

    def det(self, M):
        print "method"
        return M+2

if __name__ == "__main__":
    Foo()
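On recent Python 3, bound methods can be pickled (the instance and the method name are what actually get pickled), so self.det can be handed to apply_async directly; the _remote_det shim is mainly a Python 2 workaround.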
I'm pretty sure it can be used in a class, but you need to protect the call to Foo inside a clause like:
if __name__ == "__main__":
so that it only runs when the module is executed as the main program. You may also have to alter the __init__ function of the class so that it accepts a pool as an argument instead of creating one itself.
I just tried this:
from multiprocessing import Pool

#global counter
#counter = 0

class Foo:
    def __init__(self, po):
        for i in xrange(1, 300):
            po.apply_async(self.det, (i,), callback=self.cb)
        po.close()
        po.join()
        print( "foo" )
        #print counter

    def cb(self, r):
        #global counter
        #print counter, r
        counter += 1

    def det(self, M):
        return M+2

if __name__ == "__main__":
    po = Pool()
    Foo(po)
and I think I know what the problem is now. Python isn't multi-threaded here; the global interpreter lock prevents that. Python is using multiple processes instead, so the sub-processes in the Pool don't have access to the standard output of the main process.
The subprocesses are also unable to modify the variable counter, because it exists in a different process (I tried running with the counter lines commented out and uncommented). Now, I do recall seeing cases where global state variables get altered by processes in a pool, so I don't know all of the minutiae. I do know that it is, in general, a bad idea to have global state variables like that, if for no other reason than that they can lead to race conditions and/or wasted time with locks and waiting for access to the global variable.
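If a counter shared across processes is genuinely needed, a minimal sketch using multiprocessing.Value, which places the integer in shared memory and guards it with a lock (the names init_worker and work are illustrative):

from multiprocessing import Pool, Value

counter = None

def init_worker(shared):
    # Give every worker process a handle to the shared counter.
    global counter
    counter = shared

def work(i):
    with counter.get_lock():  # serialize updates to avoid lost increments
        counter.value += 1
    return i + 2

if __name__ == "__main__":
    shared = Value('i', 0)
    with Pool(initializer=init_worker, initargs=(shared,)) as po:
        po.map(work, range(299))
    print(shared.value)  # 299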

Process containing object method doesn't recognize edit to object

I have the following situation: process = Process(target=sample_object.run). I would then like to edit a property of sample_object: sample_object.edit_property(some_other_object).
class sample_object:
    def __init__(self):
        self.storage = []

    def edit_property(self, some_other_object):
        self.storage.append(some_other_object)

    def run(self):
        while True:
            if len(self.storage) != 0:
                print "1"
                # I know it's an infinite loop. It's just an example.
And in a separate driver module:
from multiprocessing import Process
from sample import sample_object
from sample2 import some_other_object

class driver:
    if __name__ == "__main__":
        samp = sample_object()
        proc = Process(target=samp.run)
        proc.start()
        while True:
            some = some_other_object()
            samp.edit_property(some)
            # I know it's an infinite loop
The previous code never prints "1". How would I connect the Process to sample_object so that an edit made to the object whose method the Process is running is recognized by the process? In other words, is there a way to get run to see the change to sample_object? Thank you.
You can use multiprocessing.Manager to share Python data structures between processes.
from multiprocessing import Process, Manager

class A(object):
    def __init__(self, storage):
        self.storage = storage

    def add(self, item):
        self.storage.append(item)

    def run(self):
        while True:
            if self.storage:
                print 1

if __name__ == '__main__':
    manager = Manager()
    storage = manager.list()
    a = A(storage)
    p = Process(target=a.run)
    p.start()
    for i in range(10):
        a.add({'id': i})
    p.join()
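The key detail is that storage is a manager proxy rather than a plain list: appends made in the parent go to the manager's server process, and the child's reads go through the same proxy, so both sides observe a single shared list.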
