Using Python's server process (Manager()) with Value()

I'm using a Server Process to handle shared memory in my program.
manager = multiprocessing.Manager()
tasksRemaining = manager.list()
sampleFileList = manager.list()
sortedSamples = manager.Value(c_int)
I get the following error when trying to declare sortedSamples:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/managers.py", line 207, in handle_request
result = func(c, *args, **kwds)
File "/usr/lib/python2.7/multiprocessing/managers.py", line 386, in create
obj = callable(*args, **kwds)
TypeError: __init__() takes at least 3 arguments (2 given)
According to the documentation at https://docs.python.org/2/library/multiprocessing.html#sharing-state-between-processes, Manager() supports list, dict, Namespace, Lock, RLock, Semaphore, BoundedSemaphore, Condition, Event, Queue, Value and Array.
Whenever I do this outside of a manager, it works fine, such as:
sortedSamples = multiprocessing.Value(c_int)
What seems to be the problem?

It seems that when you use a manager, you need to provide the initial value as well:
Value(typecode, value): Create an object with a writable value attribute and return a proxy for it.
(See the documentation for SyncManager.)
Try, for example:
value = manager.Value('i', 0)
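For completeness, here is a minimal runnable sketch of sharing a manager Value (and a manager list) with a child process; the worker function and variable names are illustrative rather than taken from the question:
import multiprocessing

def worker(counter, items):
    # Manager proxies can be handed to child processes and used like the real objects.
    items.append('done')
    counter.value += 1

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    counter = manager.Value('i', 0)   # typecode and initial value are both required
    items = manager.list()

    p = multiprocessing.Process(target=worker, args=(counter, items))
    p.start()
    p.join()

    print(counter.value)   # 1
    print(list(items))     # ['done']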

Related

How do I use Python's multiprocessing.Value with .map()?

I'm testing some code using multiprocessing to try to understand it, but I'm struggling to get the Value to work. What am I doing wrong? It says p doesn't exist.
Here's my code:
from multiprocessing import Pool, Value
from ctypes import c_int

if __name__ == "__main__":
    p = Value(c_int, 1)

def addone(a):
    print(a)
    with p.get_lock():
        print(p.value)
        p.value += 1

if __name__ == "__main__":
    with Pool() as po:
        po.map(addone, range(19))
    print(p.value)
And I get this error:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Users\(I removed this)\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "C:\Users\(I removed this)\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 48, in mapstar
return list(map(*args))
File "C:\Users\(I removed this)\Desktop\Python\make\multiTest.py", line 10, in addone
with p.get_lock():
NameError: name 'p' is not defined
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File (I removed this), line 15, in <module>
po.map(addone,range(19))
File "C:\Users\(I removed this)\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 364, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "C:\Users\(I removed this)\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 771, in get
raise self._value
NameError: name 'p' is not defined
What should I do?
You want only a single instance of the shared value, p, to be operated on by the pool processes executing your worker function, addone. The way to do this is to use the initializer and initargs arguments of the multiprocessing.pool.Pool class to initialize a global variable, p, in each pool process with the shared value created by the main process. Explicitly passing p to a pool worker function as an additional argument will not work; it results in the rather cryptic error: RuntimeError: Synchronized objects should only be shared between processes through inheritance.
def init_pool_processes(shared_value):
    # Runs once in each pool worker; makes the inherited Value available as a global.
    global p
    p = shared_value

def addone(a):
    print(a)
    with p.get_lock():
        print(p.value)
        p.value += 1

if __name__ == "__main__":
    from multiprocessing import Pool, Value
    from ctypes import c_int

    p = Value(c_int, 1)
    with Pool(initializer=init_pool_processes, initargs=(p,)) as po:
        po.map(addone, range(19))
    print(p.value)

Error using pexpect and multiprocessing: "TypeError: cannot serialize '_io.TextIOWrapper' object"

I have a Python 3.7 script on a Linux machine where I am trying to run a function in multiple threads, but when I try I receive the following error:
Traceback (most recent call last):
File "./test2.py", line 43, in <module>
pt.ping_scanx()
File "./test2.py", line 39, in ping_scanx
par = Parallel(function=self.pingx, parameter_list=list, thread_limit=10)
File "./test2.py", line 19, in __init__
self._x = self._pool.starmap(function, parameter_list, chunksize=1)
File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 276, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 431, in _handle_tasks
put(task)
File "/usr/local/lib/python3.7/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/usr/local/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
TypeError: cannot serialize '_io.TextIOWrapper' object
This is the sample code that I am using to demonstrate the issue:
#!/usr/local/bin/python3.7
from multiprocessing import Pool
import pexpect  # Used to run SSH sessions

class Parallel:
    def __init__(self, function, parameter_list, thread_limit=4):
        # Create a new pool to hold our jobs
        self._pool = Pool(processes=thread_limit)
        self._x = self._pool.starmap(function, parameter_list, chunksize=1)

class PingTest():
    def __init__(self):
        self._pex = None

    def connect(self):
        self._pex = pexpect.spawn("ssh snorton@127.0.0.1")

    def pingx(self, target_ip, source_ip):
        print("PING {} {}".format(target_ip, source_ip))

    def ping_scanx(self):
        self.connect()
        list = [['8.8.8.8', '96.53.16.93'],
                ['8.8.8.8', '96.53.16.93']]
        par = Parallel(function=self.pingx, parameter_list=list, thread_limit=10)

pt = PingTest()
pt.ping_scanx()
If I don't include the line with pexpect.spawn, the error doesn't happen. Can someone explain why I am getting the error, and suggest a way to fix it?
With multiprocessing.Pool you're actually calling the function in separate processes, not threads. Processes cannot share Python objects unless they are serialized first and sent over inter-process communication channels, which is what multiprocessing.Pool does for you behind the scenes, using pickle as the serializer. pexpect.spawn opens a terminal device as a file-like TextIOWrapper object, and because you store the returned object on the PingTest instance and then pass the bound method self.pingx to Pool.starmap, the pool has to serialize self, whose _pex attribute holds that pexpect.spawn object, and a TextIOWrapper does not support serialization.
Since your function is I/O-bound, you should use threads instead, via the multiprocessing.dummy module, for more efficient parallelization and, more importantly in this case, to allow the pexpect.spawn object to be shared across the threads with no need for serialization.
Change:
from multiprocessing import Pool
to:
from multiprocessing.dummy import Pool
Demo: https://repl.it/#blhsing/WiseYoungExperiments
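For reference, a condensed sketch of the question's code with only that import changed; the SSH target is the placeholder from the question and assumes a reachable host:
#!/usr/local/bin/python3.7
from multiprocessing.dummy import Pool   # thread-backed Pool: same API, no pickling
import pexpect

class PingTest:
    def __init__(self):
        self._pex = None

    def connect(self):
        # Placeholder target from the question; assumes an SSH host is reachable.
        self._pex = pexpect.spawn("ssh snorton@127.0.0.1")

    def pingx(self, target_ip, source_ip):
        print("PING {} {}".format(target_ip, source_ip))

    def ping_scanx(self):
        self.connect()
        targets = [['8.8.8.8', '96.53.16.93'],
                   ['8.8.8.8', '96.53.16.93']]
        # Threads share self._pex directly, so nothing needs to be serialized.
        with Pool(processes=10) as pool:
            pool.starmap(self.pingx, targets, chunksize=1)

pt = PingTest()
pt.ping_scanx()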

Python multiprocessing pool apply_async error

I'm trying to run a number of jobs in a multiprocessing pool but keep running into errors, and I can't work out why. A simplified version of the code is below:
class Object_1():
    def add_good_spd_column(self):
        def calculate_correlations(arg1, arg2, arg3):
            return {'a': 1}

        processes = {}
        pool = Pool(processes=6)
        for i in range(1, 10):
            processes[i] = pool.apply_async(calculate_correlations,
                                            args=(arg1, arg2, arg3,))

        correlations = {}
        for i in range(0, 10):
            correlations[i] = processes[i].get()
This returns the following error:
Traceback (most recent call last):
File "./02_results.py", line 116, in <module>
correlations[0] = processes[0].get()
File "/opt/anaconda3/lib/python3.5/multiprocessing/pool.py", line 608, in get
raise self._value
File "/opt/anaconda3/lib/python3.5/multiprocessing/pool.py", line 385, in
_handle_tasks
put(task)
File "/opt/anaconda3/lib/python3.5/multiprocessing/connection.py", line 206, in send
self._send_bytes(ForkingPickler.dumps(obj))
File "/opt/anaconda3/lib/python3.5/multiprocessing/reduction.py", line 50, in dumps
cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'SCADA.add_good_spd_column.<locals>.calculate_correlations'
When I call the following:
correlations[0].successful()
I get the following error:
Traceback (most recent call last):
File "./02_results.py", line 116, in <module>
print(processes[0].successful())
File "/opt/anaconda3/lib/python3.5/multiprocessing/pool.py", line 595, in
successful
assert self.ready()
AssertionError
Is this because the process isn't actually finished before the .get() is called? The function being evaluated just returns a dictionary which should definitely be pickle-able...
Cheers,
The error is occurring because pickling a function nested in another function is not supported, and multiprocessing.Pool needs to pickle the function you pass as an argument to apply_async in order to execute it in a worker process. You have to move the function to the top level of the module, or make it an instance method of the class. Keep in mind that if you make it an instance method, the instance of the class itself must also be picklable.
And yes, the assertion error when calling successful() occurs because you're calling it before a result is ready. From the docs:
successful()
Return whether the call completed without raising an exception. Will raise AssertionError if the result is not ready.
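As a hedged sketch of the first option, here is the simplified code with calculate_correlations moved to module level; the argument values passed to apply_async are placeholders, since arg1-arg3 are not shown in the question:
from multiprocessing import Pool

def calculate_correlations(arg1, arg2, arg3):
    # A module-level function can be pickled by reference and sent to worker processes.
    return {'a': 1}

if __name__ == '__main__':
    results = {}
    with Pool(processes=6) as pool:
        for i in range(1, 10):
            results[i] = pool.apply_async(calculate_correlations,
                                          args=(i, 'b', 'c'))
        correlations = {i: r.get() for i, r in results.items()}
    print(correlations)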

Python Manager access dict through registry

I have a manager with one dict proxy. I want to access it from a different process. For brevity, the sample code uses only one process, but assume we don't have a direct reference to the dict (as if the other process had been created from the first one).
I register the dict proxy dict1 under the typeid 'd1' so I can access it remotely. I check the manager's registry and see the dict proxy there.
But when I try to access it by calling d1, I get a KeyError from the remote side. I know I could create a method to access my dict, but according to the documentation (Python 2, chapter 16.6) there should be a direct way to access the dict proxy.
from multiprocessing.managers import SyncManager
from sys import stderr
proxy = SyncManager()
proxy.start()
dict1 = proxy.dict({'k1': 'blah'})
proxy.register('d1', dict1)
# next line shows that d1 is bound method of manager
print >>stderr, 'd1 is', proxy.d1
# next line produces a KeyError
print 'The value of d1 is', proxy.d1()
Run output
d1 is <bound method SyncManager.d1 of <multiprocessing.managers.SyncManager object at 0x10925dd10>>
Traceback (most recent call last):
File "/Users/adriancepleanu/PycharmProjects/AFX/UIproxy.py", line 32, in <module>
print 'The value of d1 is', proxy.d1()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/managers.py", line 667, in temp
token, exp = self._create(typeid, *args, **kwds)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/managers.py", line 567, in _create
id, exposed = dispatch(conn, None, 'create', (typeid,)+args, kwds)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/managers.py", line 105, in dispatch
raise convert_to_error(kind, result)
multiprocessing.managers.RemoteError:
---------------------------------------------------------------------------
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/managers.py", line 207, in handle_request
result = func(c, *args, **kwds)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/managers.py", line 380, in create
self.registry[typeid]
KeyError: 'd1'
---------------------------------------------------------------------------
I found that the manager's server must be started after the call to register the new typeid; registrations made after the server has been started will not work.
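One workable pattern is sketched below: the dict lives in the manager process and is exposed through a callable registered before start(). The class name MyManager, the helper get_shared and the use of DictProxy are illustrative choices, not something prescribed by the question:
from multiprocessing.managers import SyncManager, DictProxy

shared = {'k1': 'blah'}

def get_shared():
    # Runs inside the manager process; returns the object to expose under 'd1'.
    return shared

class MyManager(SyncManager):
    pass

# Register before start(); registrations made afterwards are not seen by the server.
MyManager.register('d1', callable=get_shared, proxytype=DictProxy)

if __name__ == '__main__':
    manager = MyManager()
    manager.start()
    d1 = manager.d1()                                  # DictProxy for the shared dict
    print('The value of d1 is {}'.format(d1['k1']))    # blah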

How to do multiprocessing with class instances

The program is designed to set up a process-creation listener on various IPs on the network. The code is:
import multiprocessing
from wmi import WMI

dynaIP = ['192.168.165.1', '192.168.165.2', '192.168.165.3', '192.168.165.4']

class WindowsMachine:
    def __init__(self, ip):
        self.ip = ip
        self.connection = WMI(self.ip)
        self.created_process = multiprocessing.Process(target=self.monitor_created_process, args=(self.connection,))
        self.created_process.start()

    def monitor_created_process(self, remote_pc):
        while True:
            created_process = remote_pc.Win32_Process.watch_for("creation")
            print('Creation:', created_process.Caption, created_process.ProcessId, created_process.CreationDate)
        return created_process

if __name__ == '__main__':
    for ip in dynaIP:
        print('Running', ip)
        WindowsMachine(ip)
When running the code I get the following error:
Traceback (most recent call last):
File "U:/rmarshall/Work For Staff/ROB/_Python/__Python Projects Code/multipro_instance_stack_question.py", line 26, in <module>
WindowsMachine(ip)
File "U:/rmarshall/Work For Staff/ROB/_Python/__Python Projects Code/multipro_instance_stack_question.py", line 14, in __init__
self.created_process.start()
File "C:\Python33\lib\multiprocessing\process.py", line 111, in start
self._popen = Popen(self)
File "C:\Python33\lib\multiprocessing\forking.py", line 248, in __init__
dump(process_obj, to_child, HIGHEST_PROTOCOL)
File "C:\Python33\lib\multiprocessing\forking.py", line 166, in dump
ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <class 'PyIID'>: attribute lookup builtins.PyIID failed
I have looked at other questions surrounding this issue, but I feel none of them clearly explains the workaround for pickling class instances.
Is anyone able to demonstrate this?
The problem here is that multiprocessing pickles the arguments to the process in order to pass them around. A WMI connection object is not picklable, so it cannot be passed as an argument to multiprocessing.Process.
If you want this to work, you can either:
switch to using threads instead of processes (see the threading module)
create the WMI object in monitor_created_process
I'd recommend the former, because there doesn't seem to be much use in creating full-blown processes here; see the sketch below.
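A hedged sketch combining both suggestions, using threads and creating the WMI connection inside the thread's target; the wmi calls mirror the question and assume the target machines are reachable, and the per-thread pythoncom.CoInitialize() call is an extra detail that is usually needed when the wmi module is used from a worker thread:
import threading
import pythoncom          # part of pywin32, which the wmi module builds on
from wmi import WMI

dynaIP = ['192.168.165.1', '192.168.165.2', '192.168.165.3', '192.168.165.4']

class WindowsMachine:
    def __init__(self, ip):
        self.ip = ip
        # Threads share the interpreter, so nothing has to be pickled.
        self.watcher = threading.Thread(target=self.monitor_created_process)
        self.watcher.start()

    def monitor_created_process(self):
        pythoncom.CoInitialize()      # COM generally needs per-thread initialization
        connection = WMI(self.ip)     # create the unpicklable object inside the thread
        while True:
            created = connection.Win32_Process.watch_for("creation")
            print('Creation:', created.Caption, created.ProcessId, created.CreationDate)

if __name__ == '__main__':
    for ip in dynaIP:
        print('Running', ip)
        WindowsMachine(ip)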
