Sending socket through multiprocessing Queue Python 3.6

I am trying to implement a generic "timeout" function which allows me to send a function to be run, and if it doesn't complete after a certain amount of time, kill it. Here is the current implementation:
from multiprocessing import Process, Queue
import multiprocessing as mp
mp.allow_connection_pickling()
from fn.monad import Full, Empty
import traceback
def timeout(timeout, func, args=()):
    '''
    Calls function, and if it times out returns an Empty()
    :param timeout: int | The amount of time to wait for the function
    :param func: () => Any | The function to call (must take no arguments)
    :param queue: Queue | The multiprocessing queue to put the result into
    :return Option | Full(Result) if we get the result, Empty() if it times out
    '''
    queue = Queue()
    p = Process(target=_helper_func, args=(func, queue, args,))
    p.daemon = True
    p.start()
    p.join(timeout)
    if p.is_alive() or queue.empty():
        p.terminate()
        return Empty()
    else:
        out = queue.get()
        # if 'rebuild_handle' in dir(out):
        #     out.rebuild_handle()
        return Full(out)

def _helper_func(func, queue, args):
    try:
        func(*args, queue)
    except Exception as e:
        pass
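For reference, intended usage looks roughly like the sketch below (the fetch helper and the address are made up for illustration):

import socket

def fetch(host, port, queue):
    # The wrapped function receives the queue as its final argument and must
    # put its "return value" into it.
    queue.put(socket.create_connection((host, port)))

result = timeout(5, fetch, args=("example.com", 80))  # Full(socket) or Empty()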
The function must put the "return value" into the multiprocessing queue. However, when timeout is run, and a socket is put into the queue, I get the following error.
Traceback (most recent call last):
File "socket_test.py", line 27, in <module>
print(q.get())
File "/usr/lib/python3.6/multiprocessing/queues.py", line 113, in get
return _ForkingPickler.loads(res)
File "/usr/lib/python3.6/multiprocessing/reduction.py", line 239, in _rebuild_socket
fd = df.detach()
File "/usr/lib/python3.6/multiprocessing/resource_sharer.py", line 57, in detach
with _resource_sharer.get_connection(self._id) as conn:
File "/usr/lib/python3.6/multiprocessing/resource_sharer.py", line 87, in get_connection
c = Client(address, authkey=process.current_process().authkey)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 487, in Client
c = SocketClient(address)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused
I have tried the approaches from various previous Stack Overflow posts, such as the following: Python3 Windows multiprocessing passing socket to process
Please let me know if you know of a solution to this issue, as it is throwing a giant wrench into my code. Thanks!
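One possible workaround (a sketch, not part of the original post): avoid pickling the socket through the Queue altogether and instead hand the raw file descriptor to the parent with multiprocessing.reduction, rebuilding the socket while the child is still alive:

import os
import socket
from multiprocessing import Pipe, Process
from multiprocessing.reduction import recv_handle, send_handle

def _child(conn, parent_pid):
    # The connected endpoint is hypothetical.
    s = socket.create_connection(("example.com", 80))
    send_handle(conn, s.fileno(), parent_pid)  # duplicate the fd for the parent
    conn.recv()                                # block until the parent confirms receipt
    s.close()

if __name__ == "__main__":
    parent_conn, child_conn = Pipe()
    p = Process(target=_child, args=(child_conn, os.getpid()))
    p.start()
    fd = recv_handle(parent_conn)              # fetch the fd while the child is alive
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, fileno=fd)
    parent_conn.send("ok")                     # let the child exit
    p.join()
    print(sock)

The key point is that the child blocks until the parent has the descriptor, so the fd-sharing machinery is still running when it is needed.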

Related

Python Multiprocessing remote manager: failure to connect when using BaseManager.start() instead of .server().serve_forever()

I'm following an example from the official Python documentation.
I'm trying to make it so that I spin up a BaseManager at localhost:50000 which registers a queue, then a bunch of workers that read from that queue. I can get it to work if I use the method in the official Python docs that has three files (one server, one put client, one get client), but I can't get it to work all in one file where I spawn the clients via multiprocessing.Process(target=...).
Here is my full code. The issue is that when the clients attempt to connect, they get a ConnectionRefusedError (stack trace below):
from typing import Dict, Optional, Any, List
from multiprocessing.managers import BaseManager, SyncManager
import time
import multiprocessing as mp
import argparse
import queue

q = queue.Queue()

def parse_args() -> argparse.Namespace:
    a = argparse.ArgumentParser()
    a.add_argument("--n-workers", type=int, default=2)
    return a.parse_args()

def run_queue_server(args: argparse.Namespace) -> None:
    class QueueManager(BaseManager): pass
    QueueManager.register("get_queue", lambda: q)
    m = QueueManager(address=('', 50000), authkey=b'abracadabra')
    m.start()

def _worker_process(worker_uid: str) -> None:
    class QueueManager(BaseManager): pass
    QueueManager.register("get_queue")
    m = QueueManager(address=('', 50000), authkey=b'abracadabra')
    # <-- This line fails with ConnectionRefused -->
    m.connect()
    queue: queue.Queue = m.get_queue()

def spawn_workers(args: argparse.Namespace) -> None:
    time.sleep(2)
    worker_procs = dict()
    for i in range(args.n_workers):
        print(f"Spawning worker process {i}..")
        p = mp.Process(target=_worker_process, args=[str(i)])
        p.start()
        worker_procs[str(i)] = p

def main():
    args = parse_args()
    run_queue_server(args)
    spawn_workers(args)
    while True:
        time.sleep(1)

if __name__ == '__main__':
    main()
The error is here
$ python minimal.py
Spawning worker process 0..
Spawning worker process 1..
Process Process-2:
Process Process-3:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 313, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "minimal.py", line 26, in _worker_process
m.connect()
File "/usr/lib/python3.8/multiprocessing/managers.py", line 548, in connect
conn = Client(self._address, authkey=self._authkey)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 502, in Client
c = SocketClient(address)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 629, in SocketClient
s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 313, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "minimal.py", line 26, in _worker_process
m.connect()
File "/usr/lib/python3.8/multiprocessing/managers.py", line 548, in connect
conn = Client(self._address, authkey=self._authkey)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 502, in Client
c = SocketClient(address)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 629, in SocketClient
s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused
However, if I spawn another process that targets the manager creation step and runs m.get_server().serve_forever(), then I do not get the connection-refused error. See the code below, which works:
from typing import Dict, Optional, Any, List
from multiprocessing.managers import BaseManager, SyncManager
import time
import multiprocessing as mp
import argparse
import queue

q = queue.Queue()

def parse_args() -> argparse.Namespace:
    a = argparse.ArgumentParser()
    a.add_argument("--n-workers", type=int, default=2)
    return a.parse_args()

def run_queue_server(args: argparse.Namespace) -> None:
    class QueueManager(BaseManager): pass
    QueueManager.register("get_queue", lambda: q)
    m = QueueManager(address=('', 50000), authkey=b'abracadabra')
    #m.start()
    # This works!!
    m.get_server().serve_forever()

def _worker_process(worker_uid: str) -> None:
    class QueueManager(BaseManager): pass
    QueueManager.register("get_queue")
    m = QueueManager(address=('', 50000), authkey=b'abracadabra')
    m.connect()
    queue: queue.Queue = m.get_queue()
    print(f"Gotten queue: {queue}")

def spawn_workers(args: argparse.Namespace) -> None:
    time.sleep(2)
    worker_procs = dict()
    for i in range(args.n_workers):
        print(f"Spawning worker process {i}..")
        p = mp.Process(target=_worker_process, args=[str(i)])
        p.start()
        worker_procs[str(i)] = p

def main():
    args = parse_args()
    #run_queue_server(args)
    # I don't want to run this in another process?
    mp.Process(target=run_queue_server, args=(args,)).start()
    spawn_workers(args)
    while True:
        time.sleep(1)

if __name__ == '__main__':
    main()
The thing is, I don't want to have to start another process to be the manager. Why can't it just be this process?
Edit - I'm an idiot who was programming too late into the night. The issue is that run_queue_server called m.start() and then returned, which lost the reference to the QueueManager; I'm sure that caused it to be garbage collected.
All I did was change
def run_queue_server(args: argparse.Namespace) -> None:
    class QueueManager(BaseManager): pass
    QueueManager.register("get_queue", lambda: q)
    m = QueueManager(address=('', 50000), authkey=b'abracadabra')
    m.start()
    return m
and change the caller to accept the return value, and everything works.
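For completeness, the corrected caller then looks roughly like this (a sketch based on the edit above):

def main():
    args = parse_args()
    # Keep the returned manager alive so it isn't garbage collected.
    manager = run_queue_server(args)
    spawn_workers(args)
    while True:
        time.sleep(1)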

How to terminate loop.run_in_executor with ProcessPoolExecutor gracefully?

How to terminate loop.run_in_executor with ProcessPoolExecutor gracefully? Shortly after starting the program, SIGINT (ctrl + c) is sent.
import asyncio
import concurrent.futures
from time import sleep

def blocking_task():
    sleep(3)

async def main():
    exe = concurrent.futures.ProcessPoolExecutor(max_workers=4)
    loop = asyncio.get_event_loop()
    tasks = [loop.run_in_executor(exe, blocking_task) for i in range(3)]
    await asyncio.gather(*tasks)

if __name__ == "__main__":
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        print('ctrl + c')
With max_workers equal to or less than the number of tasks, everything works. But if max_workers is greater, the output of the above code is as follows:
Process ForkProcess-4:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.8/concurrent/futures/process.py", line 233, in _process_worker
call_item = call_queue.get(block=True)
File "/usr/lib/python3.8/multiprocessing/queues.py", line 97, in get
res = self._recv_bytes()
File "/usr/lib/python3.8/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
buf = self._recv(4)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
KeyboardInterrupt
ctrl + c
I would like to catch the exception (KeyboardInterrupt) only once and ignore or mute the other exception(s) in the process pool, but how?
Update extra credit:
Can you explain (the reason for) the multi exception?
Does adding a signal handler work on Windows?
If not, is there a solution that works without a signal handler?
You can use the initializer parameter of ProcessPoolExecutor to install a handler for SIGINT in each process.
Update:
On Unix, when the process is created, it becomes a member of the process group of its parent. If you are generating the SIGINT with Ctrl+C, then the signal is being sent to the entire process group.
import asyncio
import concurrent.futures
import os
import signal
import sys
from time import sleep

def handler(signum, frame):
    print('SIGINT for PID=', os.getpid())
    sys.exit(0)

def init():
    signal.signal(signal.SIGINT, handler)

def blocking_task():
    sleep(15)

async def main():
    exe = concurrent.futures.ProcessPoolExecutor(max_workers=5, initializer=init)
    loop = asyncio.get_event_loop()
    tasks = [loop.run_in_executor(exe, blocking_task) for i in range(2)]
    await asyncio.gather(*tasks)

if __name__ == "__main__":
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        print('ctrl + c')
Ctrl-C shortly after start:
^CSIGINT for PID= 59942
SIGINT for PID= 59943
SIGINT for PID= 59941
SIGINT for PID= 59945
SIGINT for PID= 59944
ctrl + c
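If the goal is instead to have the workers ignore Ctrl+C entirely and finish their current task, a minimal variant (a sketch, not part of the original answer) is to install signal.SIG_IGN in the initializer:

import signal

def init():
    # Workers ignore SIGINT; only the parent process sees KeyboardInterrupt.
    signal.signal(signal.SIGINT, signal.SIG_IGN)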

How to construct proxy objects from multiprocessing.managers.SyncManager?

I have long-running file I/O tasks which I'd like to be able to move into a daemon/server process. A CLI tool would be used to queue new jobs to run, query the status of running jobs, and wait for individual jobs. Python's multiprocessing.managers looks like a nice, simple way to handle the IPC. I'd like to be able to construct a SyncManager.Event for the client to wait on without blocking the server, but attempting to do so triggers a "server not yet started" assertion. Ironically, this assertion gets sent from the server to the client, so obviously the server is started, somewhere.
Here's the minimal example:
#!/usr/bin/env python3
import time
import sys
import concurrent.futures
from multiprocessing.managers import SyncManager

def do_work(files):
    """Simulate doing some work on a set of files."""
    print(f"Starting work for {files}.")
    time.sleep(2)
    print(f"Finished work for {files}.")

# Thread pool to do work in.
pool = concurrent.futures.ProcessPoolExecutor(max_workers=1)

class Job:
    job_counter = 1

    def __init__(self, files):
        """Setup a job and queue work for files on our thread pool."""
        self._job_number = self.job_counter
        Job.job_counter += 1
        print(f"manager._state.value = {manager._state.value}")
        self._finished_event = manager.Event()
        print(f"Queued job {self.number()}.")
        future = pool.submit(do_work, files)
        future.add_done_callback(lambda f: self._finished_event.set())

    def number(self):
        return self._job_number

    def event(self):
        """Get an event which can be waited on for the job to complete."""
        return self._finished_event

class MyManager(SyncManager):
    pass

MyManager.register("Job", Job)
manager = MyManager(address=("localhost", 16000), authkey=b"qca-authkey")

if len(sys.argv) > 1 and sys.argv[1] == "server":
    manager.start()
    print(f"Manager listening at {manager.address}.")
    while True:
        time.sleep(1)
else:
    manager.connect()
    print(f"Connected to {manager.address}.")
    job = manager.Job(["a", "b", "c"])
    job.event().wait()
    print("Done")
If I run the client I see:
$ ./mp-manager.py
Connected to ('localhost', 16000).
Traceback (most recent call last):
File "./mp-manager.py", line 54, in <module>
job = manager.Job(["a", "b", "c"])
File "/usr/lib/python3.8/multiprocessing/managers.py", line 740, in temp
token, exp = self._create(typeid, *args, **kwds)
File "/usr/lib/python3.8/multiprocessing/managers.py", line 625, in _create
id, exposed = dispatch(conn, None, 'create', (typeid,)+args, kwds)
File "/usr/lib/python3.8/multiprocessing/managers.py", line 91, in dispatch
raise convert_to_error(kind, result)
multiprocessing.managers.RemoteError:
---------------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/managers.py", line 210, in handle_request
result = func(c, *args, **kwds)
File "/usr/lib/python3.8/multiprocessing/managers.py", line 403, in create
obj = callable(*args, **kwds)
File "./mp-manager.py", line 24, in __init__
self._finished_event = manager.Event()
File "/usr/lib/python3.8/multiprocessing/managers.py", line 740, in temp
token, exp = self._create(typeid, *args, **kwds)
File "/usr/lib/python3.8/multiprocessing/managers.py", line 622, in _create
assert self._state.value == State.STARTED, 'server not yet started'
AssertionError: server not yet started
---------------------------------------------------------------------------
The server output is:
$ ./mp-manager.py server
Manager listening at ('127.0.0.1', 16000).
manager._state.value = 0

How do I prevent Pyro4 from closing the connection after COMMTIMEOUT

I have the following situation: my Pyro4 project has a server and a client. The server contains a method which needs to call 2 callbacks on the same callback object. So class Callback has two callback methods: Callback() and SecondCallback(). There is some delay between the calls of those callback methods. I've simulated this delay in my example by calling time.sleep.
I need to set a timeout on Pyro4 (Pyro4.config.COMMTIMEOUT), because without one the Pyro4 daemon will never break out of the requestLoop method. This works perfectly when calling just one callback method, but when you have to call a second callback method, the Pyro4 callback daemon closes the connection once the timeout has elapsed after the first callback method was called.
I've tried setting the timeout to a larger value, but this timeout is also the time the requestLoop method blocks until it processes the loopCondition.
An example script which demonstrates my issue is included below. You need to start the Pyro4 nameserver first, then start a server:
python -m Pyro4.naming
python test.py -s
And afterwards start a client in a new cmd window:
python test.py
Test.py
import Pyro4, time
from argparse import ArgumentParser

ip = "127.0.0.1"

class Server:
    def __init__(self):
        pass

    def ActionOne(self):
        return "Foo"

    def ActionTwo(self):
        return "Bar"

    @Pyro4.oneway
    def ActionThree(self, callback):
        time.sleep(4)
        callback.Callback()
        time.sleep(3)
        callback.SecondCallback()

class Callback:
    def __init__(self):
        self.Executed = False

    def Callback(self):
        print "FooBar"

    def SecondCallback(self):
        print "raBooF"
        self.Executed = True

def loopWhile(condition):
    while condition:
        time.sleep(.1)

if __name__ == "__main__":
    parser = ArgumentParser()
    parser.add_argument("--server", "-s", action="store_true")
    args = parser.parse_args()
    if args.server:
        print "Server"
        daemon = Pyro4.core.Daemon(host=ip)
        uri = daemon.register(Server())
        ns = Pyro4.naming.locateNS(host=ip)
        ns.register("server", uri)
        daemon.requestLoop()
    else:
        print "Client"
        Pyro4.config.COMMTIMEOUT = .5
        ns = Pyro4.naming.locateNS(host=ip)
        serverUri = ns.lookup("server")
        proxy = Pyro4.core.Proxy(serverUri)
        print proxy.ActionOne()
        print proxy.ActionTwo()
        daemon = Pyro4.core.Daemon(host=ip)
        callback = Callback()
        daemon.register(callback)
        proxy.ActionThree(callback)
        daemon.requestLoop(lambda: not callback.Executed)
        print "FINISHED"
The result of this script:
Server:
Server
Exception in thread Thread-17:
Traceback (most recent call last):
File "C:\Program Files (x86)\IronPython 2.7\Lib\threading.py", line 552, in _T
hread__bootstrap_inner
self.run()
File "C:\Program Files (x86)\IronPython 2.7\Lib\threading.py", line 505, in ru
n
self.__target(*self.__args, **self.__kwargs)
File "test.py", line 22, in ActionThree
callback.SecondCallback()
File "C:\Program Files (x86)\IronPython 2.7\lib\site-packages\Pyro4\core.py",
line 171, in __call__
return self.__send(self.__name, args, kwargs)
File "C:\Program Files (x86)\IronPython 2.7\lib\site-packages\Pyro4\core.py",
line 410, in _pyroInvoke
msg = message.Message.recv(self._pyroConnection, [message.MSG_RESULT], hmac_
key=self._pyroHmacKey)
File "C:\Program Files (x86)\IronPython 2.7\lib\site-packages\Pyro4\message.py
", line 168, in recv
msg = cls.from_header(connection.recv(cls.header_size))
File "C:\Program Files (x86)\IronPython 2.7\lib\site-packages\Pyro4\socketutil
.py", line 448, in recv
return receiveData(self.sock, size)
File "C:\Program Files (x86)\IronPython 2.7\lib\site-packages\Pyro4\socketutil
.py", line 190, in receiveData
raise ConnectionClosedError("receiving: connection lost: " + str(x))
ConnectionClosedError: receiving: connection lost: [Errno 10022] A request to se
nd or receive data was disallowed because the socket is not connected and (when
sending on a datagram socket using a sendto call) no address was supplied
Client:
Client
Foo
Bar
FooBar
My final question is: how do I prevent Pyro4 from closing the connection after COMMTIMEOUT expires when a second callback is called?
I hope all this information is clear enough to understand.
Thank you for your help.
For future reference:
I was able to restart the connection to the callback by calling:
callback._pyroReconnect()
just before calling the second callback method.
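Applied to the example above, that looks roughly like this (sketch of the server-side method):

    @Pyro4.oneway
    def ActionThree(self, callback):
        time.sleep(4)
        callback.Callback()
        time.sleep(3)
        # Re-establish the callback connection that was closed after COMMTIMEOUT expired.
        callback._pyroReconnect()
        callback.SecondCallback()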
Your question is a bit strange, to be honest.
On the one hand you're configuring COMMTIMEOUT to a (very low) value of 0.5 seconds, thereby enabling the concept of timeouts. On the other hand you're asking to not get a timeout that closes the connection on the server. What is it you want?
But yeah, you can use _pyroReconnect to reconnect a proxy that has been disconnected. Also see the readmes and code of the autoreconnect and disconnects examples that come with Pyro4.

build a simple remote dispatcher using multiprocessing.Managers

Consider the following code:
Server:
import sys
from multiprocessing.managers import BaseManager, BaseProxy, Process

def baz(aa):
    l = []
    for i in range(3):
        l.append(aa)
    return l

class SolverManager(BaseManager): pass
class MyProxy(BaseProxy): pass

manager = SolverManager(address=('127.0.0.1', 50000), authkey='mpm')
manager.register('solver', callable=baz, proxytype=MyProxy)

def serve_forever(server):
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass

def runpool(n):
    server = manager.get_server()
    workers = []
    for i in range(int(n)):
        Process(target=serve_forever, args=(server,)).start()

if __name__ == '__main__':
    runpool(sys.argv[1])
Client:
import sys
from multiprocessing.managers import BaseManager, BaseProxy
import multiprocessing, logging

class SolverManager(BaseManager): pass
class MyProxy(BaseProxy): pass

def main(args):
    SolverManager.register('solver')
    m = SolverManager(address=('127.0.0.1', 50000), authkey='mpm')
    m.connect()
    print m.solver(args[1])._getvalue()

if __name__ == '__main__':
    sys.exit(main(sys.argv))
If I run the server using only one process (python server.py 1), the client works as expected. But if I spawn two processes (python server.py 2) listening for connections, I get a nasty error:
$python client.py ping
Traceback (most recent call last):
File "client.py", line 24, in <module>
sys.exit(main(sys.argv))
File "client.py", line 21, in main
print m.solver(args[1])._getvalue()
File "/usr/lib/python2.6/multiprocessing/managers.py", line 637, in temp
authkey=self._authkey, exposed=exp
File "/usr/lib/python2.6/multiprocessing/managers.py", line 894, in AutoProxy
incref=incref)
File "/usr/lib/python2.6/multiprocessing/managers.py", line 700, in __init__
self._incref()
File "/usr/lib/python2.6/multiprocessing/managers.py", line 750, in _incref
dispatch(conn, None, 'incref', (self._id,))
File "/usr/lib/python2.6/multiprocessing/managers.py", line 79, in dispatch
raise convert_to_error(kind, result)
multiprocessing.managers.RemoteError:
---------------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python2.6/multiprocessing/managers.py", line 181, in handle_request
result = func(c, *args, **kwds)
File "/usr/lib/python2.6/multiprocessing/managers.py", line 402, in incref
self.id_to_refcount[ident] += 1
KeyError: '7fb51084c518'
---------------------------------------------------------------------------
My idea is pretty simple. I want to create a server that will spawn a number of workers that will share the same socket and handle requests independently. Maybe I'm using the wrong tool here?
The goal is to build a 3-tier structure where all requests are handled via an HTTP server, then dispatched to nodes sitting in a cluster, and from the nodes to workers via the multiprocessing managers...
There is one public server, one node per machine, and x workers on each machine depending on the number of cores... I know I could use a more sophisticated library, but for such a simple task (I'm just prototyping here) I would rather just use the multiprocessing library... Is this possible, or should I explore other solutions directly? I feel I'm very close to having something working here... Thanks.
You're trying to reinvent a wheel that many have invented before.
It sounds to me like you're looking for a task queue, where your server dispatches tasks and your workers execute them.
I would recommend you have a look at Celery.
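For illustration, a minimal Celery setup mirroring the baz example above might look like the sketch below (the module name, broker URL, and result backend are assumptions; any supported broker works):

# tasks.py -- hypothetical minimal Celery app mirroring baz() from the question
from celery import Celery

# Assumes a local Redis instance as broker and result backend; RabbitMQ etc. also works.
app = Celery("tasks", broker="redis://localhost:6379/0", backend="redis://localhost:6379/0")

@app.task
def baz(aa):
    return [aa for _ in range(3)]

Run one worker process per machine and let Celery handle per-core concurrency, e.g. celery -A tasks worker --concurrency=4; a client then calls baz.delay("ping").get().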
