How to debug a stuck asyncio coroutine in Python?

My production code has many coroutines that get stuck at an unknown position while processing a request. I attached gdb with the Python extensions to the process, but it doesn't show the exact line in the coroutine where the process is stuck, only the primary stack trace. Here is a minimal example:
import asyncio

async def hello():
    await asyncio.sleep(30)
    print('hello world')

asyncio.run(hello())
(gdb) py-bt
Traceback (most recent call first):
File "/usr/lib/python3.8/selectors.py", line 468, in select
fd_event_list = self._selector.poll(timeout, max_ev)
File "/usr/lib/python3.8/asyncio/base_events.py", line 2335, in _run_once
File "/usr/lib/python3.8/asyncio/base_events.py", line 826, in run_forever
None, getaddr_func, host, port, family, type, proto, flags)
File "/usr/lib/python3.8/asyncio/base_events.py", line 603, in run_until_complete
self.run_forever()
File "/usr/lib/python3.8/asyncio/runners.py", line 299, in run
File "main.py", line 7, in <module>
GDB shows a trace that ends on line 7, but the code is obviously stuck on line 4. How can I make it show a more complete trace that includes nested coroutines?

You can use aiodebug: aiodebug.log_slow_callbacks.enable(0.05) logs any event loop callback that runs for longer than 0.05 seconds.
For more details, see https://pypi.org/project/aiodebug/
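
As a standard-library complement to the suggestion above (an addition here, not part of the original answer), the frame where each task is suspended can be printed from inside the process with asyncio.all_tasks() and Task.print_stack(). The sketch below reuses hello() from the question; dump_task_stacks and main are hypothetical helper names, and in a real service something would still have to trigger the dump, for example a signal handler or a debug endpoint.

```python
import asyncio
import io


async def hello():
    await asyncio.sleep(30)
    print('hello world')


def dump_task_stacks():
    """Return a text report of where each unfinished task currently is."""
    out = io.StringIO()
    # all_tasks() with no argument uses the running loop, so this must be
    # called from code that runs inside the event loop.
    for task in asyncio.all_tasks():
        task.print_stack(file=out)
    return out.getvalue()


async def main():
    task = asyncio.create_task(hello())
    await asyncio.sleep(0.1)   # give hello() time to reach its await
    report = dump_task_stacks()
    task.cancel()              # do not wait the full 30 seconds
    return report


if __name__ == "__main__":
    print(asyncio.run(main()))
```

For a suspended coroutine only one frame is reported, so the output points at the await line inside hello() but not at what that await is waiting on further down.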

Related

RuntimeError: File descriptor 8 is used by transport

Minimal demonstration example:
import asyncio

async def main():
    c1_reader, c1_writer = await asyncio.open_connection(host='google.com', port=80)
    c1_socket = c1_writer.get_extra_info('socket')
    c1_socket.close()
    c2_reader, c2_writer = await asyncio.open_connection(host='google.com', port=80)

asyncio.run(main())
Running this program gives this error:
$ python3 asyncio_fd_used.py
Traceback (most recent call last):
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/selector_events.py", line 469, in _sock_connect
sock.connect(address)
BlockingIOError: [Errno 36] Operation now in progress
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "asyncio_fd_used.py", line 11, in <module>
asyncio.run(main())
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/runners.py", line 43, in run
return loop.run_until_complete(main)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 579, in run_until_complete
return future.result()
File "asyncio_fd_used.py", line 9, in main
c2_reader, c2_writer = await asyncio.open_connection(host='google.com', port=80)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/streams.py", line 77, in open_connection
lambda: protocol, host, port, **kwds)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 941, in create_connection
await self.sock_connect(sock, address)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/selector_events.py", line 463, in sock_connect
self._sock_connect(fut, sock, address)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/selector_events.py", line 477, in _sock_connect
self.add_writer(fd, self._sock_connect_cb, fut, sock, address)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/selector_events.py", line 333, in add_writer
self._ensure_fd_no_transport(fd)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/selector_events.py", line 244, in _ensure_fd_no_transport
f'File descriptor {fd!r} is used by transport '
RuntimeError: File descriptor 8 is used by transport <_SelectorSocketTransport fd=8 read=polling write=<idle, bufsize=0>>
To explain why I am calling the low-level socket.close() rather than the asyncio-level writer.close(): I was experimenting with code that sends an RST packet. But I can imagine other reasons why people would call socket.close(), maybe even unintentionally.

The problem is that the low-level socket is closed, but asyncio doesn't know that and thinks it is still open. For some reason (performance?) asyncio remembers the socket's file descriptor (fileno).
When a new connection is opened, the operating system gives it the same file descriptor number, and asyncio raises an error because that exact fd number is still associated with the previous connection's transport.
Solution: tell asyncio the socket is closed :)
import asyncio

async def main():
    c1_reader, c1_writer = await asyncio.open_connection(host='google.com', port=80)
    c1_socket = c1_writer.get_extra_info('socket')
    c1_socket.close()
    c1_writer.close()  # <<< here
    c2_reader, c2_writer = await asyncio.open_connection(host='google.com', port=80)

asyncio.run(main())
This code runs without raising an error.
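
The file descriptor reuse that this answer relies on can be observed without asyncio or a network connection. The following is a minimal sketch, assuming a POSIX system (where a new descriptor gets the lowest free number) and a single-threaded process; fd_numbers_after_reopen is a hypothetical helper name, and on Windows the two numbers may well differ.

```python
import socket


def fd_numbers_after_reopen():
    """Open a socket, close it, open another, and return both fd numbers."""
    first = socket.socket()
    first_fd = first.fileno()
    first.close()              # the number first_fd is now free again

    second = socket.socket()   # on POSIX this typically receives first_fd
    try:
        return first_fd, second.fileno()
    finally:
        second.close()


if __name__ == "__main__":
    old_fd, new_fd = fd_numbers_after_reopen()
    print("first socket fd:", old_fd, "second socket fd:", new_fd)
```

If anything still keeps bookkeeping keyed by the old number, as the transport does in the traceback above, the second socket collides with it.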

MemoryError with Discord selfbot during 'bot.run'

Before you tell me, yes I am aware that selfbots can get you banned. My selfbot is for work purposes in a server with me and three others. I'm doing nothing shady or weird over here.
I'm using the following selfbot code: https://github.com/Supersebi3/Selfbot
Upon logging in, being that I'm in about 50 servers, I experience the following:
This carries on for several minutes, until I eventually get a MemoryError:
File "main.py", line 96, in <module>
bot.run(token, bot=False)
File "D:\Python\Python36-32\lib\site-packages\discord\client.py", line 519, in run
self.loop.run_until_complete(self.start(*args, **kwargs))
File "D:\Python\Python36-32\lib\asyncio\base_events.py", line 468, in run_until_complete
return future.result()
File "D:\Python\Python36-32\lib\site-packages\discord\client.py", line 491, in start
yield from self.connect()
File "D:\Python\Python36-32\lib\site-packages\discord\client.py", line 448, in connect
yield from self.ws.poll_event()
File "D:\Python\Python36-32\lib\site-packages\discord\gateway.py", line 431, in poll_event
yield from self.received_message(msg)
File "D:\Python\Python36-32\lib\site-packages\discord\gateway.py", line 327, in received_message
log.debug('WebSocket Event: {}'.format(msg))
MemoryError
Can anyone explain why this is happening and how I can fix it? Is there any way I can skip the chunk processing for the members of every server my selfbot account is in?

asyncio: unable to create new event loop

I am using Python 3.6.2, on Fedora 26 Workstation.
Below is some scrapbook code which demonstrates my issue:
EDIT: added Sam Hartman's suggestion to code.
import asyncio, json
from autobahn.asyncio.websocket import WebSocketClientProtocol, WebSocketClientFactory

class MyClientProtocol(WebSocketClientProtocol):
    def onConnect(self, response):
        print(response.peer)

    def onOpen(self):
        print("open")
        self.sendMessage(json.dumps({'command': 'subscribe', 'channel': "1010"}).encode("utf8"))

    def onMessage(self, payload, isBinary):
        print("message")
        print(json.loads(payload))

factory1 = WebSocketClientFactory("wss://api2.poloniex.com:443")
factory1.protocol = MyClientProtocol
loop1 = asyncio.get_event_loop()
loop1.run_until_complete(loop1.create_connection(factory1, "api2.poloniex.com", 443, ssl=True))
try:
    loop1.run_forever()
except KeyboardInterrupt:
    pass
loop1.close()

asyncio.set_event_loop(asyncio.new_event_loop())
factory2 = WebSocketClientFactory("wss://api2.poloniex.com:443")
factory2.protocol = MyClientProtocol
loop2 = asyncio.get_event_loop()
loop2.run_until_complete(loop2.create_connection(factory2, "api2.poloniex.com", 443, ssl=True))
try:
    loop2.run_forever()
except KeyboardInterrupt:
    pass
loop2.close()
After closing the initial asyncio event loop, creating another, and setting it as the global event loop, any attempt to use the new event loop yields the following errors:
Fatal write error on socket transport
protocol: <asyncio.sslproto.SSLProtocol object at 0x7f8a84ed4748>
transport: <_SelectorSocketTransport fd=6>
Traceback (most recent call last):
File "/usr/lib64/python3.6/asyncio/selector_events.py", line 762, in write
n = self._sock.send(data)
OSError: [Errno 9] Bad file descriptor
Fatal error on SSL transport
protocol: <asyncio.sslproto.SSLProtocol object at 0x7f8a84ed4748>
transport: <_SelectorSocketTransport closing fd=6>
Traceback (most recent call last):
File "/usr/lib64/python3.6/asyncio/selector_events.py", line 762, in write
n = self._sock.send(data)
OSError: [Errno 9] Bad file descriptor
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib64/python3.6/asyncio/sslproto.py", line 648, in _process_write_backlog
self._transport.write(chunk)
File "/usr/lib64/python3.6/asyncio/selector_events.py", line 766, in write
self._fatal_error(exc, 'Fatal write error on socket transport')
File "/usr/lib64/python3.6/asyncio/selector_events.py", line 646, in _fatal_error
self._force_close(exc)
File "/usr/lib64/python3.6/asyncio/selector_events.py", line 658, in _force_close
self._loop.call_soon(self._call_connection_lost, exc)
File "/usr/lib64/python3.6/asyncio/base_events.py", line 574, in call_soon
self._check_closed()
File "/usr/lib64/python3.6/asyncio/base_events.py", line 357, in _check_closed
raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
Exception in callback _SelectorSocketTransport._read_ready()
handle: <Handle _SelectorSocketTransport._read_ready()>
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/txaio/_common.py", line 63, in call_later
self._buckets[real_time][1].append(call)
KeyError: 412835000
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib64/python3.6/asyncio/events.py", line 127, in _run
self._callback(*self._args)
File "/usr/lib64/python3.6/asyncio/selector_events.py", line 731, in _read_ready
self._protocol.data_received(data)
File "/usr/lib64/python3.6/asyncio/sslproto.py", line 503, in data_received
ssldata, appdata = self._sslpipe.feed_ssldata(data)
File "/usr/lib64/python3.6/asyncio/sslproto.py", line 204, in feed_ssldata
self._handshake_cb(None)
File "/usr/lib64/python3.6/asyncio/sslproto.py", line 619, in _on_handshake_complete
self._app_protocol.connection_made(self._app_transport)
File "/usr/lib/python3.6/site-packages/autobahn/asyncio/websocket.py", line 97, in connection_made
self._connectionMade()
File "/usr/lib/python3.6/site-packages/autobahn/websocket/protocol.py", line 3340, in _connectionMade
WebSocketProtocol._connectionMade(self)
File "/usr/lib/python3.6/site-packages/autobahn/websocket/protocol.py", line 1055, in _connectionMade
self.onOpenHandshakeTimeout,
File "/usr/lib/python3.6/site-packages/txaio/_common.py", line 72, in call_later
self._notify_bucket, real_time,
File "/usr/lib/python3.6/site-packages/txaio/aio.py", line 382, in call_later
return self._config.loop.call_later(delay, real_call)
File "/usr/lib64/python3.6/asyncio/base_events.py", line 543, in call_later
timer = self.call_at(self.time() + delay, callback, *args)
File "/usr/lib64/python3.6/asyncio/base_events.py", line 553, in call_at
self._check_closed()
File "/usr/lib64/python3.6/asyncio/base_events.py", line 357, in _check_closed
raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
It seems reasonable that one might need to reopen an event loop after having closed an earlier one. Indeed this question even shows how: Asyncio Event Loop is Closed
The code below should achieve this:
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
so I am clearly doing something wrong. Can somebody see something missing?

I am fairly confident that your factory object is holding a reference to the old event loop, which it presumably gets from asyncio.get_event_loop. Asyncio consumers are bad about keeping hidden references to loops.
My recommendation is to reconstruct the WebSocket factory after closing the loop.
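
The failure mode described in this answer can be reproduced with the standard library alone. This is only an illustrative sketch: LoopHolder is a hypothetical stand-in for any library object that captures an event loop when it is constructed, and it is not how autobahn or txaio are actually implemented.

```python
import asyncio


class LoopHolder:
    """Hypothetical stand-in for a library object that caches a loop."""

    def __init__(self, loop):
        self.loop = loop  # reference captured once, at construction time

    def schedule(self, callback):
        return self.loop.call_soon(callback)


def demo():
    loop1 = asyncio.new_event_loop()
    stale = LoopHolder(loop1)
    loop1.close()

    loop2 = asyncio.new_event_loop()
    ran = []
    try:
        try:
            # Still points at loop1, so this fails even though loop2 exists.
            stale.schedule(lambda: ran.append("stale"))
            error = None
        except RuntimeError as exc:
            error = str(exc)

        # An object rebuilt after the switch uses the new loop and works.
        fresh = LoopHolder(loop2)
        fresh.schedule(lambda: ran.append("fresh"))
        loop2.run_until_complete(asyncio.sleep(0))
        return error, ran
    finally:
        loop2.close()


if __name__ == "__main__":
    print(demo())  # ('Event loop is closed', ['fresh'])
```

Creating a new loop does not help any object that was built against the old one; every such object has to be rebuilt or explicitly pointed at the new loop.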

Python Streamhandler over ftp doesn't work after second import

I have the following problem:
I wrote an FTPHandler(StreamHandler) that connects to a server via 'transport=paramiko.Transport(...)' and 'transport.connect(...)' and opens an SFTP connection with 'SFTPClient.from_transport(...)'.
I am importing this handler in a class named 'JUS_Logger.py', which is my module for logging. This 'FMP_Logger' is imported by another class, 'JUS_Reader'.
The problem is that if I start 'JUS_Reader', the transport is initialized but the connection fails. There is no exception; the program just hangs. If I kill it, I get this stack trace:
Traceback (most recent call last):
File "./JUS_Reader.py", line 24, in <module>
from JUS_Logger import logger
File "/<home>/.../JUS_Logger.py", line 74, in <module>
ftpHandler=FTPHandler(ftpOut,10)
File "/<home>/FTPHandler.py", line 21, in __init__
self.transport.connect(username=ftpOut['user'].decode('base64'),password=ftpOut['passwd'].decode('base64'))
File "/usr/lib/python2.7/dist-packages/paramiko/transport.py", line 1004, in connect
self.auth_password(username, password)
File "/usr/lib/python2.7/dist-packages/paramiko/transport.py", line 1165, in auth_password
return self.auth_handler.wait_for_response(my_event)
File "/usr/lib/python2.7/dist-packages/paramiko/auth_handler.py", line 158, in wait_for_response
event.wait(0.1)
File "/usr/lib/python2.7/threading.py", line 403, in wait
self.__cond.wait(timeout)
File "/usr/lib/python2.7/threading.py", line 262, in wait
_sleep(delay)
However, if I run 'JUS_Logger.py' by itself, everything works: the transport's connection is established and the SFTPClient connects as well.
Any ideas? Or further questions?

time.sleep is hanging

This is some strange regression that I can only reproduce on the more powerful production machine we have.
def test_foo(self):
    res = self._run_job( ....)
    self.assertTrue("Hello Input!" in res.json()["stdout"], res.text)
    .........
def _run_job(self, cbid, auth, d):
    .........
    while True:
        res = requests.get(URL+"/status/"+status_id, auth=auth)  # <--- hangs here
        if res.json()["status"] != "Running":
            break
        else:
            time.sleep(2)
    ..........
I have to break the process and this is the traceback:
Traceback (most recent call last):
File "test_full.py", line 231, in <module>
unittest.main()
File "/opt/graphyte/vens/gcs/local/lib/python2.7/site-packages/unittest2/main.py", line 98, in __init__
self.runTests()
File "/opt/graphyte/vens/gcs/local/lib/python2.7/site-packages/unittest2/main.py", line 232, in runTests
self.result = testRunner.run(self.test)
File "/opt/graphyte/vens/gcs/local/lib/python2.7/site-packages/unittest2/runner.py", line 162, in run
test(result)
File "/opt/graphyte/vens/gcs/local/lib/python2.7/site-packages/unittest2/suite.py", line 64, in __call__
return self.run(*args, **kwds)
File "/opt/graphyte/vens/gcs/local/lib/python2.7/site-packages/unittest2/suite.py", line 84, in run
self._wrapped_run(result)
File "/opt/graphyte/vens/gcs/local/lib/python2.7/site-packages/unittest2/suite.py", line 114, in _wrapped_run
test._wrapped_run(result, debug)
File "/opt/graphyte/vens/gcs/local/lib/python2.7/site-packages/unittest2/suite.py", line 116, in _wrapped_run
test(result)
File "/opt/graphyte/vens/gcs/local/lib/python2.7/site-packages/unittest2/case.py", line 398, in __call__
return self.run(*args, **kwds)
File "/opt/graphyte/vens/gcs/local/lib/python2.7/site-packages/unittest2/case.py", line 340, in run
testMethod()
File "test_full.py", line 59, in test_session
"cmd": "python helloworld.py"
File "test_full.py", line 129, in _run_job
time.sleep(2)
File "/opt/graphyte/vens/gcs/local/lib/python2.7/site-packages/gevent/hub.py", line 79, in sleep
switch_result = get_hub().switch()
File "/opt/graphyte/vens/gcs/local/lib/python2.7/site-packages/gevent/hub.py", line 164, in switch
return greenlet.switch(self)
KeyboardInterrupt
Exception KeyError: KeyError(155453036,) in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
Why is gevent involved? This is a functional test. It only makes HTTP requests through requests library so maybe the switch refers to requests.
But being a simple loop, how could this fail?

Are you monkey patching with gevent?
It could be switching on the network request and never getting back for some reason. I'd say stop monkey patching for now, and use gevent only where you need it.
It could be that, now that requests is asynchronous, it returns immediately, then sleeps (again asynchronously), then makes the next request, and so on.

Why is gevent involved?
The gevent library monkey-patches some standard modules to make them cooperative. Replacing time.sleep by gevent.sleep is one of the changes.
http://www.gevent.org/gevent.monkey.html#gevent.monkey.patch_time
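
To make the mechanism concrete, here is a standard-library sketch of what monkey patching time.sleep means. It is not gevent's implementation: cooperative_sleep, poll_once and demo are hypothetical names, and the replacement only records the call where a real one would switch to a scheduler (the gevent hub in the traceback above).

```python
import time


def poll_once():
    # Application code that believes it calls the standard time.sleep.
    time.sleep(2)


def demo():
    recorded = []

    def cooperative_sleep(seconds):
        # Hypothetical replacement: a real one would hand control to an
        # event loop here instead of just noting the request.
        recorded.append(seconds)

    original = time.sleep
    time.sleep = cooperative_sleep  # the monkey patch
    try:
        poll_once()  # unchanged caller now runs the replacement
        return time.sleep.__name__, recorded
    finally:
        time.sleep = original  # undo the patch


if __name__ == "__main__":
    print(demo())  # ('cooperative_sleep', [2])
```

Because the caller looks up time.sleep at call time, it picks up the replacement without any change to its own source, which is why gevent frames appear in a traceback for code that never imports gevent directly.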
