Errno 9 using the multiprocessing module with Tornado in Python

For operations in my Tornado server that are expected to block (and can't be easily modified to use things like Tornado's asynchronous HTTP request client), I have been offloading the work to separate worker processes using the multiprocessing module. Specifically, I was using a multiprocessing Pool because it offers a method called apply_async, which works very well with Tornado since it takes a callback as one of its arguments.
I recently realized that a pool preallocates a fixed number of processes, so if all of them are busy with blocking work, any further task has to wait for a free worker. I do realize that the server can still take connections, since apply_async only adds the task to a queue and returns almost immediately, but I'm looking to spawn n processes for the n blocking tasks I need to perform.
I figured that I could use the add_handler method for my Tornado server's IOLoop to add a handler for each new PID that I create to that IOLoop. I've done something similar before, but it was using popen and an arbitrary command. An example of such use of this method is here. I wanted to pass arguments into an arbitrary target Python function within my scope, though, so I wanted to stick with multiprocessing.
However, it seems that something doesn't like the PIDs that my multiprocessing.Process objects have. I get IOError: [Errno 9] Bad file descriptor. Are these processes restricted somehow? I know that the PID isn't available until I actually start the process, but I do start the process. Here's the source code of an example I've made that demonstrates this issue:
#!/usr/bin/env python
"""Creates a small Tornado program to demonstrate asynchronous programming.
Specifically, this demonstrates using the multiprocessing module."""
import tornado.httpserver
import tornado.ioloop
import tornado.web
import multiprocessing as mp
import random
import time

__author__ = 'Brian McFadden'
__email__ = 'brimcfadden@gmail.com'


def sleepy(queue):
    """Pushes a string to the queue after sleeping for 5 seconds.
    This sleeping can be thought of as a blocking operation."""
    time.sleep(5)
    queue.put("Now I'm awake.")
    return


def random_num():
    """Returns a string containing a random number.
    This function can be used by handlers to receive text for writing which
    facilitates noticing change on the webpage when it is refreshed."""
    n = random.random()
    return "<br />Here is a random number to show change: {0}".format(n)


class SyncHandler(tornado.web.RequestHandler):
    """Demonstrates handing a request synchronously.
    It executes sleepy() before writing some more text and a random number to
    the webpage. While the process is sleeping, the Tornado server cannot
    handle any requests at all."""

    def get(self):
        q = mp.Queue()
        sleepy(q)
        val = q.get()
        self.write(val)
        self.write('<br />Brought to you by SyncHandler.')
        self.write('<br />Try refreshing me and then the main page.')
        self.write(random_num())


class AsyncHandler(tornado.web.RequestHandler):
    """Demonstrates handing a request asynchronously.
    It executes sleepy() before writing some more text and a random number to
    the webpage. It passes the sleeping function off to another process using
    the multiprocessing module in order to handle more requests concurrently to
    the sleeping, which is like a blocking operation."""

    @tornado.web.asynchronous
    def get(self):
        """Handles the original GET request (normal function delegation).
        Instead of directly invoking sleepy(), it passes a reference to the
        function to the multiprocessing pool."""
        # Create an interprocess data structure, a queue.
        q = mp.Queue()
        # Create a process for the sleepy function. Provide the queue.
        p = mp.Process(target=sleepy, args=(q,))
        # Start it, but don't use p.join(); that would block us.
        p.start()
        # Add our callback function to the IOLoop. The async_callback wrapper
        # makes sure that Tornado sends an HTTP 500 error to the client if an
        # uncaught exception occurs in the callback.
        iol = tornado.ioloop.IOLoop.instance()
        print "p.pid:", p.pid
        iol.add_handler(p.pid, self.async_callback(self._finish, q), iol.READ)

    def _finish(self, q):
        """This is the callback for post-sleepy() request handling.
        Operation of this function occurs in the original process."""
        val = q.get()
        self.write(val)
        self.write('<br />Brought to you by AsyncHandler.')
        self.write('<br />Try refreshing me and then the main page.')
        self.write(random_num())
        # Asynchronous handling must be manually finished.
        self.finish()


class MainHandler(tornado.web.RequestHandler):
    """Returns a string and a random number.
    Try to access this page in one window immediately after (<5 seconds of)
    accessing /async or /sync in another window to see the difference between
    them. Asynchronously performing the sleepy() function won't make the client
    wait for data from this handler, but synchronously doing so will!"""

    def get(self):
        self.write('This is just responding to a simple request.')
        self.write('<br />Try refreshing me after one of the other pages.')
        self.write(random_num())


if __name__ == '__main__':
    # Create an application using the above handlers.
    application = tornado.web.Application([
        (r"/", MainHandler),
        (r"/sync", SyncHandler),
        (r"/async", AsyncHandler),
    ])
    # Create a single-process Tornado server from the application.
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(8888)
    print 'The HTTP server is listening on port 8888.'
    tornado.ioloop.IOLoop.instance().start()
Here is the traceback:
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/tornado/web.py", line 810, in _stack_context
    yield
  File "/usr/local/lib/python2.6/dist-packages/tornado/stack_context.py", line 77, in StackContext
    yield
  File "/usr/local/lib/python2.6/dist-packages/tornado/web.py", line 827, in _execute
    getattr(self, self.request.method.lower())(*args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/tornado/web.py", line 909, in wrapper
    return method(self, *args, **kwargs)
  File "./process_async.py", line 73, in get
    iol.add_handler(p.pid, self.async_callback(self._finish, q), iol.READ)
  File "/usr/local/lib/python2.6/dist-packages/tornado/ioloop.py", line 151, in add_handler
    self._impl.register(fd, events | self.ERROR)
IOError: [Errno 9] Bad file descriptor
The above code is actually modified from an older example that used process pools. I've had it saved for reference for my coworkers and myself (hence the heavy amount of comments) for quite a while. I constructed it in such a way so that I could open two small browser windows side-by-side to demonstrate to my boss that the /sync URI blocks connections while /async allows more connections. For the purposes of this question, all you need to do to reproduce it is try to access the /async handler. It errors immediately.
What should I do about this? How can the PID be "bad"? If you run the program, you can see the PID printed to stdout.
For the record, I'm using Python 2.6.5 on Ubuntu 10.04. Tornado is 1.1.

add_handler takes a valid file descriptor, not a PID. As an example of what's expected, tornado itself uses add_handler normally by passing in a socket object's fileno(), which returns the object's file descriptor. PID is irrelevant in this case.
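
To make the file descriptor point concrete, here is a minimal sketch rather than the asker's Tornado code. It assumes a POSIX system and Python 3, and it uses the standard library's selectors module as a stand-in for Tornado's IOLoop so that it runs without Tornado installed. The names wait_for_child_output and CHILD_CODE are hypothetical. The thing registered for read events is the pipe connected to the child's stdout, which has a real file descriptor, not the child's PID.

```python
import selectors
import subprocess
import sys

# Hypothetical child program: sleeps briefly, then prints one line.
CHILD_CODE = "import time; time.sleep(0.2); print(\"Now I'm awake.\")"


def wait_for_child_output(timeout=10.0):
    """Start a child process and wait for its stdout pipe to become readable."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD_CODE],
                            stdout=subprocess.PIPE)
    sel = selectors.DefaultSelector()
    # proc.stdout wraps a pipe; its fileno() is a valid file descriptor,
    # which is what an event loop can watch. proc.pid is only a process ID.
    sel.register(proc.stdout, selectors.EVENT_READ)
    try:
        if not sel.select(timeout):
            proc.kill()
            raise TimeoutError("child produced no output in time")
        # Read until the child closes its end of the pipe.
        data = proc.stdout.read()
    finally:
        sel.close()
        proc.stdout.close()
        proc.wait()
    return data.decode().strip()


if __name__ == "__main__":
    print(wait_for_child_output())
```

With Tornado, the analogous step would be to pass a descriptor such as a pipe's or socket's fileno() to add_handler instead of p.pid.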

Check out this project:
https://github.com/vukasin/tornado-subprocess
it allows you to start arbitrary processes from tornado and get a callback when they finish (with access to their status, stdout and stderr).

Related

Python - improving logging to file and console using multiprocessing

I am trying to download a file on CAN bus using python-can. It involves sending data very quickly (in the order of 2-3 messages per millisecond). I am trying to log to file these messages without impacting the speed of sending. Doing the file I/O slows down the sending due to the logging overhead. I tried various methods to improve this (including using queues and reading the queue from another thread but this was not much better - possibly due to GIL). Most of these tests started with using the Python logging module and trying various handlers (QueueHandler/QueueListener, MemoryHandler, etc).
I've managed to make some significant improvements by moving the file I/O into a separate process. I initially ran into an issue with the overhead of sending data from one process to another - so I now buffer it. Now, instead of taking 150% longer with direct file I/O in the main process, I see ~20% increase in time.
I thought that, since this is running in another process, I could also print() the data to the console (which I know is relatively expensive), but I see a huge increase in the file download time.
What is happening that means the print() affects the main process even though it is running in a child process?
Code below:
file_logger_mp() is called from the main process and it starts the child process that does the logging. The main process then uses the log_hdl function to add a message to the buffer. When the buffer reaches a certain size (100) it is sent to the child process for logging to file or printing to console.
Device: rpi4. And the main process uses asyncio, in case that affects it.
import atexit
import multiprocessing


def file_logger_mp(logger_name: str, log_file_pth: str):
    conn_rec, conn_send = multiprocessing.Pipe()
    log_hdl_c = MyLogger(conn_send)
    log_hdl = log_hdl_c.log_hdl # This is used by main code to provide log messages to child process
    listener = MyProcess(conn_rec, log_file_pth)
    atexit.register(log_hdl_c.final_flush, listener)
    listener.start() # Start the child process
    return log_hdl, listener


class MyLogger():
    def __init__(self, conn_send) -> None:
        self.buffer = []
        self.conn_send = conn_send

    def log_hdl(self, msg):
        self.buffer.append(msg)
        if len(self.buffer) > 100:
            self.conn_send.send(self.buffer)
            self.buffer.clear()

    def final_flush(self, listener):
        self.conn_send.send(self.buffer)
        listener.terminate()


class MyProcess(multiprocessing.Process):
    def __init__(self, queue, f_hdl):
        multiprocessing.Process.__init__(self)
        self.exit = multiprocessing.Event()
        self.queue = queue
        self.f_hdl = f_hdl

    def run(self):
        f = open(self.f_hdl, "w+")
        while not self.exit.is_set():
            try:
                record = self.queue.recv()
                for msg in record:
                    output = str(msg)
                    f.write(output+'\n')
                    print(output) # This `print()` causes large delays to main process?!
                record.clear()
            except Exception:
                import sys, traceback
                print('Whoops! Problem:', file=sys.stderr)
                traceback.print_exc(file=sys.stderr)
        for msg in record: # Flush any pending records before finishing
            f.write(str(msg)+'\n')
        f.close()

    def terminate(self):
        self.exit.set()

Sharing asyncio objects between processes

I am working with both the asyncio and the multiprocessing library to run two processes, each with one server instance listening on different ports for incoming messages.
To identify each client, I want to share a dict between the two processes to update the list of known clients. To achieve this, I decided to use a Tuple[StreamReader, StreamWriter] lookup key which is assigned a Client object for this connection.
However, as soon as I insert or simply access the shared dict, the program crashes with the following error message:
Task exception was never retrieved
future: <Task finished name='Task-5' coro=<GossipServer.handle_client() done, defined at /home/croemheld/Documents/network/server.py:119> exception=AttributeError("Can't pickle local object 'WeakSet.__init__.<locals>._remove'")>
Traceback (most recent call last):
  File "/home/croemheld/Documents/network/server.py", line 128, in handle_client
    if not await self.handle_message(reader, writer, buffer):
  File "/home/croemheld/Documents/network/server.py", line 160, in handle_message
    client = self.syncmanager.get_api_client((reader, writer))
  File "<string>", line 2, in get_api_client
  File "/usr/lib/python3.9/multiprocessing/managers.py", line 808, in _callmethod
    conn.send((self._id, methodname, args, kwds))
  File "/usr/lib/python3.9/multiprocessing/connection.py", line 211, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/lib/python3.9/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'WeakSet.__init__.<locals>._remove'
Naturally I looked up the error message and found this question, but I don't really understand what the reason is here. As far as I understand, the reason for this crash is that StreamReader and StreamWriter cannot be pickled/serialized in order to be shared between processes. If that is in fact the reason, is there a way to pickle them, maybe by patching the reducer function to instead use a different pickler?

You might be interested in using SyncManager instead. Just be sure to close the manager by calling shutdown at the end so that no zombie process is left.
from multiprocessing.managers import SyncManager
from multiprocessing import Process
import signal

my_manager = SyncManager()

# to avoid closing the manager by ctrl+C. be sure to handle KeyboardInterrupt errors and close the manager accordingly
def manager_init():
    signal.signal(signal.SIGINT, signal.SIG_IGN)

my_manager.start(manager_init)
my_dict = my_manager.dict()
my_dict["clients"] = my_manager.list()

def my_process(my_id, the_dict):
    for i in range(3):
        the_dict["clients"].append(f"{my_id}_{i}")

processes = []
for j in range(4):
    processes.append(Process(target=my_process, args=(j,my_dict)))
for p in processes:
    p.start()
for p in processes:
    p.join()
print(my_dict["clients"])
# ['0_0', '2_0', '0_1', '3_0', '1_0', '0_2', '1_1', '2_1', '3_1', '1_2', '2_2', '3_2']
my_manager.shutdown()

I managed to find a workaround that keeps the asyncio and multiprocessing libraries and needs no other libraries.
First, since the StreamReader and StreamWriter objects are not picklable, I am forced to use a socket. This is easily achievable with a simple function:
import socket
from asyncio import StreamWriter

def get_socket(writer: StreamWriter):
    fileno = writer.get_extra_info('socket').fileno()
    return socket.fromfd(fileno, socket.AF_INET, socket.SOCK_STREAM)
The socket is inserted into the shared object (e.g. Manager().dict() or even a custom class, which you have to register via a custom BaseManager instance). Now, since the application is built on asyncio and makes use of the streams provided by the library, we can easily convert the socket back to a pair of StreamReader and StreamWriter via:
node_reader, node_writer = await asyncio.open_connection(sock=self.node_sock)
node_writer.write(mesg_text)
await node_writer.drain()
Where self.node_sock is the socket instance that was passed through the shared object.
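
The socket-to-streams step can be tried in isolation. The following is a minimal sketch, assuming Python 3.7+ on a POSIX system; it uses socket.socketpair() in place of a real client connection, and the names roundtrip_over_socket and demo are hypothetical. It shows only that asyncio.open_connection(sock=...) wraps an already connected socket in a StreamReader and StreamWriter pair; passing the socket through a shared manager object is not covered here.

```python
import asyncio
import socket


async def roundtrip_over_socket(message: bytes) -> bytes:
    """Wrap both ends of a connected socket pair in asyncio streams."""
    # socketpair() stands in for the socket recovered from the shared object.
    left, right = socket.socketpair()
    reader, writer = await asyncio.open_connection(sock=left)
    peer_reader, peer_writer = await asyncio.open_connection(sock=right)

    # Write on one side and read the same bytes on the other side.
    writer.write(message)
    await writer.drain()
    data = await peer_reader.readexactly(len(message))

    # Close both stream writers so the transports are released.
    for w in (writer, peer_writer):
        w.close()
        await w.wait_closed()
    return data


def demo() -> bytes:
    return asyncio.run(roundtrip_over_socket(b"hello"))


if __name__ == "__main__":
    print(demo())
```

The streams created this way belong to the event loop that is running when open_connection is called, so each process would build its own pair from the socket it receives.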

Win32com events not raising inside thread?

I am new to both COM and Python, so I'm not very familiar with the exact terminology. Apologies for using inexact terms.
I am trying to connect to a desktop application via a proprietary COM interface using pywin32.
I created a PoC and it runs fine. The COM function call is processed and I get the expected event.
import time
import win32com.client

class MyEvents:
    def __init__(self):
        print("Callback class initialized")

    def OnMyEvent(self, data):
        print('MyEvent raised')

class ComUser:
    comObj = None

    def __init__(self):
        comObj = win32com.client.DispatchWithEvents("ProprietaryInterface.InterfaceClass",
                                                    MyEvents)
        comObj.Register()
        comObj.DoSomething(data)
        time.sleep(120)

userObj = ComUser()
So far so good. I get the event on the screen:
Callback class initialized
MyEvent raised
Next I tried to put it into my application where I have multiple threads. To explain it in simple terms:
Main creates an object of Class X which initializes an XMLRPC Server thread.
The XMLRPC handler simply takes incoming info and puts it into a queue
The queue is from multiprocessing lib.
Another thread waits on this queue for an incoming message
def __startPollingThread(self):
    pythoncom.CoInitialize()
    pollingThread = Thread(target=self.__checkQueue)
    pollingThread.start()
    pythoncom.CoUninitialize()
This is the polling thread method:
def __checkQueue(self):
    try:
        pythoncom.CoInitialize()
        while True:
            currMessage = self.__messageQueue.get()
            self.__processMessage(currMessage)
    except:
        pass  # Log message
    finally:
        pythoncom.CoUninitialize()
The __processMessage call passes through multiple classes (something like a strategy pattern plus a state pattern) before it hits the class that handles the COM interface.
In the ComUser class, I have a method which registers with the client application's COM interface:
def initSystem(self):
    import pythoncom
    try:
        pythoncom.CoInitialize()
        self.ComConnector = win32com.client.DispatchWithEvents("ProprietaryInterface.InterfaceClass",
                                                               MyEvents)
        self.ComConnector.Register()
    except:
        pass  # (handling omitted in the original)
    finally:
        pythoncom.CoUninitialize()
Another method handles the specific requests as they arrive and makes the corresponding COM calls.
def handleMessage(self, message):
    #if message = this then
    comObj.DoSomething(data)
Both methods are called from the __processMessage method. All my classes reside in separate .py files, except ComUser and MyEvents, which are in the same module.
I can call the COM interface and see the application reacting to the COM method calls, but I can't see any events being raised. I have tried a whole lot of combinations of CoInitialize, CoUninitialize and "import pythoncom" statements to ensure that it is not a problem with the threading. I also tried setting sys.coinit_flags = 0. It seems to make no difference. I just don't see any events.
Is it a problem that I call DispatchWithEvents in a child thread instead of the main thread (the calls seem to work fine)? Or is it that the main thread (i.e. the main method of the program) dies out? I tried putting a long sleep there too. I even tried a separate thread with a PumpWaitingMessages loop, but it made no difference. I can't think of any solutions.

explicit switch() with gevent

I have a primitive producer/consumer script running in gevent. It starts a few producer functions that put things into a gevent.queue.Queue, and one consumer function that fetches them out of the queue again:
from __future__ import print_function
import time
import gevent
import gevent.queue
import gevent.monkey

q = gevent.queue.Queue()

# define and spawn a consumer
def consumer():
    while True:
        item = q.get(block=True)
        print('consumer got {}'.format(item))

consumer_greenlet = gevent.spawn(consumer)

# define and spawn a few producers
def producer(ID):
    while True:
        print("producer {} about to put".format(ID))
        q.put('something from {}'.format(ID))
        time.sleep(0.1)
        # consumer_greenlet.switch()

producer_greenlets = [gevent.spawn(producer, i) for i in range(5)]

# wait indefinitely
gevent.monkey.patch_all()
print("about to join")
consumer_greenlet.join()
It works fine if I let gevent handle the scheduling implicitly (e.g. by calling time.sleep or some other gevent.monkey.patch()ed function). However, when I switch to the consumer explicitly (replacing time.sleep with the commented-out switch call), gevent raises an AssertionError:
Traceback (most recent call last):
  File "/my/virtualenvs/venv/local/lib/python2.7/site-packages/gevent/greenlet.py", line 327, in run
    result = self._run(*self.args, **self.kwargs)
  File "switch_test.py", line 14, in consumer
    item = q.get(block=True)
  File "/my/virtualenvs/venv/lib/python2.7/site-packages/gevent/queue.py", line 201, in get
    assert result is waiter, 'Invalid switch into Queue.get: %r' % (result, )
AssertionError: Invalid switch into Queue.get: ()
<Greenlet at 0x7fde6fa6c870: consumer> failed with AssertionError
I would like to employ explicit switching because in production I have a lot of producers, gevent's scheduling does not allocate nearly enough runtime to the consumer, and the queue gets longer and longer (which is bad). Alternatively, any insights into how to configure or modify gevent's scheduler are greatly appreciated.
This is on Python 2.7.2, gevent 1.0.1 and greenlet 0.4.5.

It seems to me that explicit switching doesn't really play well with implicit switching.
You already have implicit switching happening, either because of monkey-patched I/O or because of gevent.queue.Queue().
The gevent documentation discourages usage of the raw greenlet methods:
Being a greenlet subclass, Greenlet also has switch() and throw()
methods. However, these should not be used at the application level as
they can very easily lead to greenlets that are forever unscheduled.
Prefer higher-level safe classes, like Event and Queue, instead.
Iterating over a gevent.queue.Queue() or calling the queue's get method switches implicitly; interestingly, put does not. So you have to trigger an implicit switch yourself. The easiest way is to call gevent.sleep(0) (you don't have to actually wait for a specific time).
In conclusion, you don't even have to monkey-patch things, provided that your code does not have blocking IO operations.
I would rewrite your code like this:
import gevent
import gevent.queue

q = gevent.queue.Queue()

# define and spawn a consumer
def consumer():
    for item in q:
        print('consumer got {}'.format(item))

consumer_greenlet = gevent.spawn(consumer)

# define and spawn a few producers
def producer(ID):
    print('producer started', ID)
    while True:
        print("producer {} about to put".format(ID))
        q.put('something from {}'.format(ID))
        gevent.sleep(0)

producer_greenlets = [gevent.spawn(producer, i) for i in range(5)]

# wait indefinitely
print("about to join")
consumer_greenlet.join()

Stopping task.LoopingCall if exception occurs

I'm new to Twisted and after finally figuring out how the deferreds work I'm struggling with the tasks. What I want to achieve is to have a script that sends a REST request in a loop, however if at some point it fails I want to stop the loop. Since I'm using callbacks I can't easily catch exceptions and because I don't know how to stop the looping from an errback I'm stuck.
This is the simplified version of my code:
def send_request():
    agent = Agent(reactor)
    req_result = agent.request('GET', some_rest_link)
    req_result.addCallbacks(cp_process_request, cb_process_error)

if __name__ == "__main__":
    list_call = task.LoopingCall(send_request)
    list_call.start(2)
    reactor.run()

To end a task.LoopingCall, all you need to do is call stop on the returned object (list_call in your case).
Somehow you need to make that variable available to your errback (cb_process_error), either by pushing it into a class that cb_process_error is in, via some other class used as a pseudo-global, or by literally using a global. Then you simply call list_call.stop() inside the errback.
BTW you said:
Since I'm using callbacks I can't easily catch exceptions
That's not really true. The point of an errback is to deal with exceptions; that's one of the things that literally causes it to be called! Check out my previous deferred answer and see if it makes errbacks any clearer.
The following is a runnable example (... I'm not saying this is the best way to do it, just that it is a way...)
#!/usr/bin/python
from twisted.internet import task
from twisted.internet import reactor
from twisted.internet.defer import Deferred
from twisted.web.client import Agent
from pprint import pprint

class LoopingStuff (object):
    def cp_process_request(self, return_obj):
        print "In callback"
        pprint (return_obj)

    def cb_process_error(self, return_obj):
        print "In Errorback"
        pprint(return_obj)
        self.loopstopper()

    def send_request(self):
        agent = Agent(reactor)
        req_result = agent.request('GET', 'http://google.com')
        req_result.addCallbacks(self.cp_process_request, self.cb_process_error)

def main():
    looping_stuff_holder = LoopingStuff()
    list_call = task.LoopingCall(looping_stuff_holder.send_request)
    looping_stuff_holder.loopstopper = list_call.stop
    list_call.start(2)
    reactor.callLater(10, reactor.stop)
    reactor.run()

if __name__ == '__main__':
    main()
Assuming you can reach google.com, this will fetch pages for 10 seconds. If you change the second argument of agent.request to something like http://127.0.0.1:12999 (assuming that port 12999 will give a connection refused), you'll see one errback printout (which will also have shut down the LoopingCall) and then a 10 second wait until the reactor shuts down.
