Running several ApplicationSessions non-blockingly using autobahn.asyncio.wamp - python

I'm trying to run two autobahn.asyncio.wamp.ApplicationSessions in Python at the same time. Previously, I did this using a modification of the autobahn library, as suggested in this post's answer. I now require a somewhat more professional solution.
After googling for a while, this post appeared quite promising, but it uses the twisted library instead of asyncio. I wasn't able to identify a similar solution for the asyncio branch of the autobahn library, since it doesn't appear to use reactors.
The main problem I have is that ApplicationRunner.run() is blocking (which is why I previously outsourced it to a thread), so I can't just run a second ApplicationRunner after it.
I do need to access two websocket channels at the same time, which I don't appear to be able to do with a single ApplicationSession.
My Code so far:
from autobahn.asyncio.wamp import ApplicationSession
from autobahn.asyncio.wamp import ApplicationRunner
from asyncio import coroutine
import time

channel1 = 'BTC_LTC'
channel2 = 'BTC_XMR'


class LTCComponent(ApplicationSession):
    def onConnect(self):
        self.join(self.config.realm)

    @coroutine
    def onJoin(self, details):
        def onTicker(*args, **kwargs):
            print('LTCComponent', args, kwargs)

        try:
            yield from self.subscribe(onTicker, channel1)
        except Exception as e:
            print("Could not subscribe to topic:", e)


class XMRComponent(ApplicationSession):
    def onConnect(self):
        self.join(self.config.realm)

    @coroutine
    def onJoin(self, details):
        def onTicker(*args, **kwargs):
            print('XMRComponent', args, kwargs)

        try:
            yield from self.subscribe(onTicker, channel2)
        except Exception as e:
            print("Could not subscribe to topic:", e)


def main():
    runner = ApplicationRunner("wss://api.poloniex.com:443", "realm1", extra={})
    runner.run(LTCComponent)
    runner.run(XMRComponent)  # <- is not being called


if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        quit()
    except Exception as e:
        print(time.time(), e)
My knowledge of the autobahn library is limited, and I'm afraid the documentation isn't improving my situation much. Am I overlooking something here? A function or a parameter that would enable me to either combine my components or run them both at once?
Perhaps a solution similar to the one provided here, which implements an alternative ApplicationRunner?
Related Topics
Running two ApplicationSessions in twisted
Running Autobahn ApplicationRunner in Thread
Autobahn.wamp.ApplicationSession Source
Autobahn.wamp.Applicationrunner Source
As requested, the traceback from @stovfl's answer using the multithreading code:
Exception in thread Thread-2:
Traceback (most recent call last):
File "/home/nils/anaconda3/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/home/nils/git/tools/gemini_wss/t2.py", line 27, in run
self.appRunner.run(self.__ApplicationSession)
File "/home/nils/anaconda3/lib/python3.5/site-packages/autobahn- 0.14.1-py3.5.egg/autobahn/asyncio/wamp.py", line 143, in run
transport_factory = WampWebSocketClientFactory(create, url=self.url, serializers=self.serializers)
File "/home/nils/anaconda3/lib/python3.5/site-packages/autobahn- 0.14.1-py3.5.egg/autobahn/asyncio/websocket.py", line 319, in __init__
WebSocketClientFactory.__init__(self, *args, **kwargs)
File "/home/nils/anaconda3/lib/python3.5/site-packages/autobahn- 0.14.1-py3.5.egg/autobahn/asyncio/websocket.py", line 268, in __init__
self.loop = loop or asyncio.get_event_loop()
File "/home/nils/anaconda3/lib/python3.5/asyncio/events.py", line 626, in get_event_loop
return get_event_loop_policy().get_event_loop()
File "/home/nils/anaconda3/lib/python3.5/asyncio/events.py", line 572, in get_event_loop
% threading.current_thread().name)
RuntimeError: There is no current event loop in thread 'Thread-2'.
Exception in thread Thread-1:
**Same as in Thread-2**
...
RuntimeError: There is no current event loop in thread 'Thread-1'.

As I see from the traceback, we only reach Step 2 of 4
From the asyncio docs:
This module provides infrastructure for writing single-threaded concurrent code using coroutines, multiplexing I/O access over sockets and other resources
So I drop my first proposal using multithreading.
I could imagine the following three options:
1. Do it with multiprocessing instead of multithreading
2. Do it with coroutines inside the asyncio loop
3. Switch between channels in def onJoin(self, details) (a sketch of this option follows the multiprocessing example below)
Second proposal: the first option, using multiprocessing.
Each process can start its own asyncio loop, so appRunner.run(...) should work.
You can use a single ApplicationSession class if the channel is the only difference.
If you need to pass different ApplicationSession classes, add them to args=.
class __ApplicationSession(ApplicationSession):
    # ...
        try:
            yield from self.subscribe(onTicker, self.config.extra['channel'])
        except Exception as e:
            # ...


import multiprocessing as mp
import time


def ApplicationRunner_process(realm, channel):
    appRunner = ApplicationRunner("wss://api.poloniex.com:443", realm, extra={'channel': channel})
    appRunner.run(__ApplicationSession)


if __name__ == "__main__":
    AppRun = [{'process': None, 'channel': 'BTC_LTC'},
              {'process': None, 'channel': 'BTC_XMR'}]

    for app in AppRun:
        app['process'] = mp.Process(target=ApplicationRunner_process, args=('realm1', app['channel']))
        app['process'].start()
        time.sleep(0.1)

    AppRun[0]['process'].join()
    AppRun[1]['process'].join()
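For completeness, a minimal sketch of the third option, subscribing to both channels from one session inside onJoin; it assumes the router permits multiple subscriptions per session and reuses channel1, channel2 and the imports from the question:

class TickerComponent(ApplicationSession):
    def onConnect(self):
        self.join(self.config.realm)

    @coroutine
    def onJoin(self, details):
        def make_handler(channel):
            # each channel gets its own print handler
            def onTicker(*args, **kwargs):
                print(channel, args, kwargs)
            return onTicker

        # subscribe to both channels from the same session
        for channel in (channel1, channel2):
            try:
                yield from self.subscribe(make_handler(channel), channel)
            except Exception as e:
                print("Could not subscribe to topic:", e)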

Following the approach you linked for twisted, I managed to get the same behaviour with asyncio by setting start_loop=False:
import asyncio
from autobahn.asyncio.wamp import ApplicationSession, ApplicationRunner


class MyApplicationSession(ApplicationSession):
    def __init__(self, cfg):
        super().__init__(cfg)
        self.cli_id = cfg.extra['cli_id']

    def onJoin(self, details):
        print("session attached", self.cli_id)


# url and realm as appropriate for your router
runner1 = ApplicationRunner(url, realm, extra={'cli_id': 1})
coro1 = runner1.run(MyApplicationSession, start_loop=False)

runner2 = ApplicationRunner(url, realm, extra={'cli_id': 2})
coro2 = runner2.run(MyApplicationSession, start_loop=False)

asyncio.get_event_loop().run_until_complete(coro1)
asyncio.get_event_loop().run_until_complete(coro2)
asyncio.get_event_loop().run_forever()

Related

Python multiprocessing.Queue.get throws OSError (Handle is closed)

I have a Python Singleton class which exposes an API put_msg_to_queue to users. This API puts a string message on a queue. The Singleton Tester class creates a thread which gets the message and just prints it.
The complete code used is given below. This code was working fine with Python 3.9.12, but seems broken with Python 3.9.14. The queue.get API throws OSError when the process exits.
Other than handling this exception (the commented-out code given below), please suggest how to adapt this code to the new Python version.
The change below, mentioned in the changelog, probably caused this change in behavior.
Always close the read end of the pipe used by multiprocessing.Queue
after the last write of buffered data to the write end of the pipe to
avoid BrokenPipeError at garbage collection and at
multiprocessing.Queue.close() calls. Patch by Géry Ogam.
# python -V
Python 3.9.12
#
# python sample.py
Closing..
Received msg: sample msg
cleaning
#
# python -V
Python 3.9.14
#
# python sample.py
Closing..
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python3.9/threading.py", line 980, in _bootstrap_inner
self.run()
File "/usr/lib/python3.9/threading.py", line 917, in run
self._target(*self._args, **self._kwargs)
File "/root/sample.py", line 32, in print_data
record = self.myqueue.get(timeout=0.3)
File "/usr/lib/python3.9/multiprocessing/queues.py", line 117, in get
res = self._recv_bytes()
File "/usr/lib/python3.9/multiprocessing/connection.py", line 217, in recv_bytes
self._check_closed()
File "/usr/lib/python3.9/multiprocessing/connection.py", line 141, in _check_closed
raise OSError("handle is closed")
OSError: handle is closed
#
# cat sample.py
#!/usr/bin/python
import queue
import multiprocessing
import time
import threading
import atexit
class Singleton(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
            return cls._instances[cls]


class Tester(metaclass=Singleton):
    def __init__(self):
        self._is_close = False
        atexit.register(self.close)
        self.myqueue = multiprocessing.Queue(-1)
        self.reader_thread = threading.Thread(target=self.print_data)
        self.reader_thread.daemon = True
        self.reader_thread.start()

    def put_msg_to_queue(self, msg):
        self.myqueue.put(msg)

    def print_data(self):
        while (not self._is_close):
            try:
                record = self.myqueue.get(timeout=0.3)
                print("Received msg: " + str(record))
            except (KeyboardInterrupt, SystemExit):
                raise
            except EOFError:
                break
            except queue.Empty:
                pass
            #except OSError as ex:
            #    if str(ex) == "handle is closed":
            #        print("Handle is closed, breaking")
            #        break
        print("cleaning")
        self.myqueue.close()
        self.myqueue.join_thread()

    def close(self):
        print("Closing..")
        self._is_close = True
        self.reader_thread.join(5.0)


tester = Tester()
tester.put_msg_to_queue("sample msg")
Your problem can be resolved if you do not rely on the atexit.register call to close the queue but instead call close explicitly after all messages have been added:
... # code omitted for brevity
tester = Tester()
tester.put_msg_to_queue("sample msg")
tester.close()
But you have many timing dependencies and are using a multiprocessing.Queue instance with threading when all you need is a queue.Queue instance. So if I may suggest some changes:
First, you do have an error in your Singleton class: if the singleton has already been created (i.e. is already in the _instances dictionary), your __call__ method returns None.
Second, as I have already mentioned, you only need to use a queue.Queue instance since you are using multithreading.
Third, since your print_data method is the worker function for a daemon thread, it can do simple blocking calls in an infinite loop. To be sure that all messages placed on the queue have been read by this thread before exiting, you can call join on the queue; the thread must then call task_done on the queue after it finishes processing each message it retrieves. In the following code all timing dependencies have been removed. The main thread can place as many messages on the queue as it wants, and the reader thread can take as much time as it needs to process all messages. The program will not terminate until all messages have been successfully processed by the reader thread.
import queue
import threading
import atexit


class Singleton(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
        # Following did not belong with the if block:
        return cls._instances[cls]


class Tester(metaclass=Singleton):
    def __init__(self):
        atexit.register(self.close)
        # Only need to use a queue.Queue:
        self.myqueue = queue.Queue(-1)
        self.reader_thread = threading.Thread(target=self.print_data)
        self.reader_thread.daemon = True
        self.reader_thread.start()

    def put_msg_to_queue(self, msg):
        self.myqueue.put(msg)

    def print_data(self):
        # Simplified:
        while True:
            record = self.myqueue.get()
            print("Received msg: ", record)
            # Show message is processed:
            self.myqueue.task_done()

    def close(self):
        print("Closing..")
        # Wait for all messages to be processed:
        self.myqueue.join()


tester = Tester()
tester.put_msg_to_queue("sample msg 1")
tester.put_msg_to_queue("sample msg 2")
tester.put_msg_to_queue("sample msg 3")
Prints:
Closing..
Received msg: sample msg 1
Received msg: sample msg 2
Received msg: sample msg 3

How to find the cause of CancelledError in asyncio?

I have a big project which depends on some third-party libraries, and sometimes its execution gets interrupted by a CancelledError.
To demonstrate the issue, let's look at a small example:
import asyncio


async def main():
    task = asyncio.create_task(foo())

    # Cancel the task in 1 second.
    loop = asyncio.get_event_loop()
    loop.call_later(1.0, lambda: task.cancel())

    await task


async def foo():
    await asyncio.sleep(999)


if __name__ == '__main__':
    asyncio.run(main())
Traceback:
Traceback (most recent call last):
File "/Users/ss/Library/Application Support/JetBrains/PyCharm2021.2/scratches/async.py", line 19, in <module>
asyncio.run(main())
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/runners.py", line 43, in run
return loop.run_until_complete(main)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 579, in run_until_complete
return future.result()
concurrent.futures._base.CancelledError
As you can see, there's no information about the place the CancelledError originates from. How do I find out the exact cause of it?
One approach that I came up with is to place a lot of try/except blocks which would catch the CancelledError and narrow down the place where it comes from. But that's quite tedious.
I've solved it by applying a decorator to every async function in the project. The decorator's job is simple: log a message when a CancelledError is raised from the function. This way we will see which functions (and, more importantly, in which order) get cancelled.
Here's the decorator code:
import asyncio


def log_cancellation(f):
    async def wrapper(*args, **kwargs):
        try:
            return await f(*args, **kwargs)
        except asyncio.CancelledError:
            print(f"Cancelled {f}")
            raise
    return wrapper
In order to add this decorator everywhere I used regex. Find: (.*)(async def). Replace with: $1@log_cancellation\n$1$2.
Also, to avoid importing log_cancellation in every file, I modified the builtins:
builtins.log_cancellation = log_cancellation
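A minimal usage sketch (the worker name here is illustrative, not from the original project): once a coroutine is decorated, cancelling it prints which function was cancelled before the error propagates:

import asyncio

@log_cancellation
async def worker():
    await asyncio.sleep(999)

async def main():
    task = asyncio.create_task(worker())
    await asyncio.sleep(0.1)
    task.cancel()  # the decorator prints "Cancelled <function worker ...>"
    try:
        await task
    except asyncio.CancelledError:
        pass

asyncio.run(main())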
The rich package has helped us to identify the cause of CancelledError, without much code change required.
import asyncio

from rich.console import Console

console = Console()

if __name__ == "__main__":
    try:
        asyncio.run(main())  # replace main() with your entrypoint
    except BaseException as e:
        console.print_exception(show_locals=True)

AWS lambda, scrapy and catching exceptions

I'm running scrapy as an AWS Lambda function. Inside my function I need a timer to see whether it's been running longer than 1 minute, and if so, I need to run some logic. Here is my code:
def handler():
    x = 60
    watchdog = Watchdog(x)
    try:
        runner = CrawlerRunner()
        runner.crawl(MySpider1)
        runner.crawl(MySpider2)
        d = runner.join()
        d.addBoth(lambda _: reactor.stop())
        reactor.run()
    except Watchdog:
        print('Timeout error: process takes longer than %s seconds.' % x)
        # some other logic here
    watchdog.stop()
I took the Watchdog timer class from this answer. The problem is that the code never hits the except Watchdog block, but rather throws an exception outside it:
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/usr/lib/python3.6/threading.py", line 1182, in run
self.function(*self.args, **self.kwargs)
File "./functions/python/my_scrapy/index.py", line 174, in defaultHandler
raise self
functions.python.my_scrapy.index.Watchdog: 1
I need to catch the exception in the function. How would I go about that?
PS: I'm very new to Python.
Alright, this question had me going a little crazy; here is why that doesn't work:
The Watchdog object creates another thread where the exception is raised but not handled (the exception is only handled in the main thread). Luckily, twisted has some neat features.
You can do it by running the reactor in another thread:
import time
from threading import Thread

from twisted.internet import reactor

runner = CrawlerRunner()
runner.crawl(MySpider1)
runner.crawl(MySpider2)
d = runner.join()
d.addBoth(lambda _: reactor.stop())

# The reactor runs in a different thread so it doesn't block the script here.
Thread(target=reactor.run, args=(False,)).start()

time.sleep(60)  # block the script here

# Now check if it's still scraping
if reactor.running:
    pass  # do something
else:
    pass  # do something else
I'm using python 3.7.0
Twisted has scheduling primitives. For example, this program runs for about 60 seconds:
from twisted.internet import reactor
reactor.callLater(60, reactor.stop)
reactor.run()
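Putting the question's crawl and this scheduling primitive together, a minimal sketch (it assumes the CrawlerRunner, MySpider1 and MySpider2 from the question) that runs the timeout logic if the crawl is still going after 60 seconds:

from twisted.internet import reactor

runner = CrawlerRunner()
runner.crawl(MySpider1)
runner.crawl(MySpider2)
d = runner.join()
d.addBoth(lambda _: reactor.stop())

def on_timeout():
    # Only reached if the crawl has not already stopped the reactor.
    print('Timeout error: process takes longer than 60 seconds.')
    # some other logic here
    reactor.stop()

# If the crawl finishes first, reactor.stop() above ends the reactor
# and this delayed call never fires.
reactor.callLater(60, on_timeout)
reactor.run()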

Multiple thread with Autobahn, ApplicationRunner and ApplicationSession

python-running-autobahnpython-asyncio-websocket-server-in-a-separate-subproce
can-an-asyncio-event-loop-run-in-the-background-without-suspending-the-python-in
I was trying to solve my issue with the two links above, but I have not succeeded.
I get the following error: RuntimeError: There is no current event loop in thread 'Thread-1'.
Here is the code sample (Python 3):
from autobahn.asyncio.wamp import ApplicationSession
from autobahn.asyncio.wamp import ApplicationRunner
from asyncio import coroutine
import time
import threading


class PoloniexWebsocket(ApplicationSession):
    def onConnect(self):
        self.join(self.config.realm)

    @coroutine
    def onJoin(self, details):
        def on_ticker(*args):
            print(args)

        try:
            yield from self.subscribe(on_ticker, 'ticker')
        except Exception as e:
            print("Could not subscribe to topic:", e)


def poloniex_worker():
    runner = ApplicationRunner("wss://api.poloniex.com:443", "realm1")
    runner.run(PoloniexWebsocket)


def other_worker():
    while True:
        print('Thank you')
        time.sleep(2)


if __name__ == "__main__":
    polo_worker = threading.Thread(None, poloniex_worker, None, (), {})
    thank_worker = threading.Thread(None, other_worker, None, (), {})

    polo_worker.start()
    thank_worker.start()

    polo_worker.join()
    thank_worker.join()
So, my final goal is to have two threads launched at the start. Only one needs to use ApplicationSession and ApplicationRunner. Thank you.
A separate thread must have its own event loop. So if poloniex_worker needs to listen to a websocket, it needs its own event loop:
import asyncio


def poloniex_worker():
    asyncio.set_event_loop(asyncio.new_event_loop())
    runner = ApplicationRunner("wss://api.poloniex.com:443", "realm1")
    runner.run(PoloniexWebsocket)
But if you're on a Unix machine, you will face another error if you try to do this. Autobahn asyncio uses Unix signals, but those Unix signals only work in the main thread. You can simply turn off Unix signals if you don't plan on using them. To do that, you have to go to the file where ApplicationRunner is defined. That is wamp.py in python3.5 > site-packages > autobahn > asyncio on my machine. You can comment out the signal handling section of the code like so:
# try:
#     loop.add_signal_handler(signal.SIGTERM, loop.stop)
# except NotImplementedError:
#     # signals are not available on Windows
#     pass
All this is a lot of work. If you don't absolutely need to run your ApplicationSession in a separate thread from the main thread, it's better to just run the ApplicationSession in the main thread.
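If that fits your case, a minimal sketch reusing the question's PoloniexWebsocket and other_worker unchanged: run only the helper in a background thread and keep the WAMP session (and its event loop) in the main thread:

if __name__ == "__main__":
    # The non-asyncio worker goes to a background daemon thread ...
    thank_worker = threading.Thread(target=other_worker, daemon=True)
    thank_worker.start()

    # ... while ApplicationRunner blocks the main thread, as it is designed to.
    runner = ApplicationRunner("wss://api.poloniex.com:443", "realm1")
    runner.run(PoloniexWebsocket)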

How to interrupt Tornado coroutine

Suppose I have two functions that work like this:
import tornado.gen
import tornado.ioloop


@tornado.gen.coroutine
def f():
    for i in range(4):
        print("f", i)
        yield tornado.gen.sleep(0.5)


@tornado.gen.coroutine
def g():
    yield tornado.gen.sleep(1)
    print("Let's raise RuntimeError")
    raise RuntimeError
In general, function f might contain an endless loop and never return (e.g. it could process some queue).
What I want to do is to be able to interrupt it, at any point where it yields.
The most obvious way doesn't work: the exception is only raised after function f exits (and if it's endless, that obviously never happens).
@tornado.gen.coroutine
def main():
    try:
        yield [f(), g()]
    except Exception as e:
        print("Caught", repr(e))

    while True:
        yield tornado.gen.sleep(10)


if __name__ == "__main__":
    tornado.ioloop.IOLoop.instance().run_sync(main)
Output:
f 0
f 1
Let's raise RuntimeError
f 2
f 3
Traceback (most recent call last):
File "/tmp/test/lib/python3.4/site-packages/tornado/gen.py", line 812, in run
yielded = self.gen.send(value)
StopIteration
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
<...>
File "test.py", line 16, in g
raise RuntimeError
RuntimeError
That is, the exception is only raised when both of the coroutines return (both futures resolve).
This is partially solved by tornado.gen.WaitIterator, but it's buggy (unless I'm mistaken). But that's not the point.
It still doesn't solve the problem of interrupting existing coroutines. A coroutine continues to run even though the function that started it has exited.
EDIT: it seems that coroutine cancellation isn't really supported in Tornado, unlike in Python's asyncio, where you can easily throw CancelledError at every yield point.
If you use WaitIterator according to the instructions, and use a toro.Event to signal between coroutines, it works as expected:
from datetime import timedelta

import tornado.gen
import tornado.ioloop
import toro

stop = toro.Event()


@tornado.gen.coroutine
def f():
    for i in range(4):
        print("f", i)
        # wait raises Timeout if not set before the deadline.
        try:
            yield stop.wait(timedelta(seconds=0.5))
            print("f done")
            return
        except toro.Timeout:
            print("f continuing")


@tornado.gen.coroutine
def g():
    yield tornado.gen.sleep(1)
    print("Let's raise RuntimeError")
    raise RuntimeError


@tornado.gen.coroutine
def main():
    wait_iterator = tornado.gen.WaitIterator(f(), g())
    while not wait_iterator.done():
        try:
            result = yield wait_iterator.next()
        except Exception as e:
            print("Error {} from {}".format(e, wait_iterator.current_future))
            stop.set()
        else:
            print("Result {} received from {} at {}".format(
                result, wait_iterator.current_future,
                wait_iterator.current_index))


if __name__ == "__main__":
    tornado.ioloop.IOLoop.instance().run_sync(main)
For now, pip install toro to get the Event class. Tornado 4.2 will include Event, see the changelog.
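On Tornado 4.2 or newer you should be able to drop toro and use the built-in class instead; a minimal sketch of the changed pieces, assuming the rest of the example stays the same (tornado.locks.Event.wait raises a timeout error rather than toro.Timeout):

from datetime import timedelta

import tornado.gen
from tornado.locks import Event
from tornado.util import TimeoutError  # tornado.gen.TimeoutError on older releases

stop = Event()


@tornado.gen.coroutine
def f():
    for i in range(4):
        print("f", i)
        try:
            # wait() raises TimeoutError if the event is not set before the deadline.
            yield stop.wait(timedelta(seconds=0.5))
            print("f done")
            return
        except TimeoutError:
            print("f continuing")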
Since version 5, Tornado runs on the asyncio event loop.
On Python 3, the IOLoop is always a wrapper around the asyncio event loop, and asyncio.Future and asyncio.Task are used instead of their Tornado counterparts.
Hence you can use asyncio Task cancellation, i.e. asyncio.Task.cancel.
Your example, with a queue-reading while-True loop, might look like this.
import logging
from asyncio import CancelledError

from tornado import ioloop, gen


async def read_off_a_queue():
    while True:
        try:
            await gen.sleep(1)
        except CancelledError:
            logging.debug('Reader cancelled')
            break
        else:
            logging.debug('Pretend a task is consumed')


async def do_some_work():
    await gen.sleep(5)
    logging.debug('do_some_work is raising')
    raise RuntimeError


async def main():
    logging.debug('Starting queue reader in background')
    reader_task = gen.convert_yielded(read_off_a_queue())

    try:
        await do_some_work()
    except RuntimeError:
        logging.debug('do_some_work failed, cancelling reader')
        reader_task.cancel()
        # give the task a chance to clean up, in case it
        # catches CancelledError and awaits something
        try:
            await reader_task
        except CancelledError:
            pass


if __name__ == '__main__':
    logging.basicConfig(level='DEBUG')
    ioloop.IOLoop.instance().run_sync(main)
If you run it, you should see:
DEBUG:asyncio:Using selector: EpollSelector
DEBUG:root:Starting queue reader in background
DEBUG:root:Pretend a task is consumed
DEBUG:root:Pretend a task is consumed
DEBUG:root:Pretend a task is consumed
DEBUG:root:Pretend a task is consumed
DEBUG:root:do_some_work is raising
DEBUG:root:do_some_work failed, cancelling reader
DEBUG:root:Reader cancelled
Warning: this is not a working solution. See the comments. Still, if you're new (as I am), this example can show the logical flow. Thanks @nathaniel-j-smith and @wgh.
What is the difference compared to using something more primitive, like a global variable, for instance?
import asyncio

event = asyncio.Event()
aflag = False


async def short():
    while not aflag:
        print('short repeat')
        await asyncio.sleep(1)
    print('short end')


async def long():
    global aflag
    print('LONG START')
    await asyncio.sleep(3)
    aflag = True
    print('LONG END')


async def main():
    await asyncio.gather(long(), short())


if __name__ == '__main__':
    asyncio.run(main())
It is for asyncio, but I guess the idea stays the same. This is a semi-question (why would Event be better?). Yet the solution yields the exact result the author needs:
LONG START
short repeat
short repeat
short repeat
LONG END
short end
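To answer the semi-question: using the asyncio.Event that is already created above avoids the global statement, and a waiter could also block on await event.wait() instead of polling. A minimal sketch of the same flow, still polling once per second so the output matches:

import asyncio

event = asyncio.Event()


async def short():
    while not event.is_set():  # replaces the global flag check
        print('short repeat')
        await asyncio.sleep(1)
    print('short end')


async def long():
    print('LONG START')
    await asyncio.sleep(3)
    event.set()  # replaces "aflag = True"
    print('LONG END')


async def main():
    await asyncio.gather(long(), short())


if __name__ == '__main__':
    asyncio.run(main())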
UPDATE:
these slides may be really helpful in understanding the core of the problem.
