Python threading.Thread.join() is blocking - python

I'm working with asynchronous programming and wrote a small wrapper class for thread-safe execution of co-routines based on some ideas from this thread here: python asyncio, how to create and cancel tasks from another thread. After some debugging, I found that it hangs when calling the Thread class's join() function (I overrode it only for testing). Thinking I made a mistake, I basically copied the code that the OP said he used and tested it to find the same issue.
His mildly altered code:
import threading
import asyncio
from concurrent.futures import Future
import functools
class EventLoopOwner(threading.Thread):
    class __Properties:
        def __init__(self, loop, thread, evt_start):
            self.loop = loop
            self.thread = thread
            self.evt_start = evt_start
    def __init__(self):
        threading.Thread.__init__(self)
        self.__elo = self.__Properties(None, None, threading.Event())
    def run(self):
        self.__elo.loop = asyncio.new_event_loop()
        asyncio.set_event_loop(self.__elo.loop)
        self.__elo.thread = threading.current_thread()
        self.__elo.loop.call_soon_threadsafe(self.__elo.evt_start.set)
        self.__elo.loop.run_forever()
    def stop(self):
        self.__elo.loop.call_soon_threadsafe(self.__elo.loop.stop)
    def _add_task(self, future, coro):
        task = self.__elo.loop.create_task(coro)
        future.set_result(task)
    def add_task(self, coro):
        self.__elo.evt_start.wait()
        future = Future()
        p = functools.partial(self._add_task, future, coro)
        self.__elo.loop.call_soon_threadsafe(p)
        return future.result() # block until result is available
    def cancel(self, task):
        self.__elo.loop.call_soon_threadsafe(task.cancel)
async def foo(i):
    return 2 * i
async def main():
    elo = EventLoopOwner()
    elo.start()
    task = elo.add_task(foo(10))
    x = await task
    print(x)
    elo.stop(); print("Stopped")
    elo.join(); print("Joined") # note: giving it a timeout does not fix it
if __name__ == "__main__":
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    assert isinstance(loop, asyncio.AbstractEventLoop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.close()
About 50% of the time when I run it, it simply stalls: it prints "Stopped" but not "Joined". I've done some debugging and found that the hang is correlated with the Task itself raising an exception. This doesn't happen every time, but since it occurs when I'm calling threading.Thread.join(), I have to assume it is related to the destruction of the loop. What could possibly be causing this?
The exception is simply: "cannot join current thread" which tells me that the .join() is sometimes being run on the thread from which I called it and sometimes from the ELO thread.
What is happening and how can I fix it?
I'm using Python 3.5.1 for this.
Note: This is not replicated on IDE One: http://ideone.com/0LO2D9

Related

Callback from Ctypes sometimes fails

I have registered a Python callback with a DLL using the ctypes library. When the callback is triggered, I try to resolve an asyncio future I have set up. Since the callback happens in a separate thread that gets spawned by the DLL, I use the loop.call_soon_threadsafe() function to get back to the event loop that started it all.
Mostly this works fine, but every once in a while the future fails to be unblocked. In the minimal example here this also happens sometimes, but here I see that in those cases the callback doesn't even arrive (or at least the corresponding print doesn't happen).
I tried this only with Python 3.8.5 so far. Is there some race condition here that I did not notice?
Here's a minimal example:
import asyncio
import ctypes
import os
class testClass:
    loop = None
    future = None
    exampleDll = None
    def finish(self):
        #now in the right c thread and eventloop.
        print("callback in eventloop")
        self.future.set_result(999)
    def trampoline(self):
        #still in the other c thread
        self.loop.call_soon_threadsafe(self.finish)
    def example_callback(self):
        #in another c thread, so we need to do threadsafety stuff
        print("callback has arrived")
        self.trampoline()
        return
    async def register_and_wait(self):
        self.loop = asyncio.get_event_loop()
        self.future=self.loop.create_future()
        callback_type = ctypes.CFUNCTYPE(None)
        callback_as_cfunc = callback_type(self.example_callback)
        #now register the callback and wait
        self.exampleDll.fnminimalExample(callback_as_cfunc, ctypes.c_int(1))
        await self.future
        print("future has finished")
    def main(self):
        path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "minimalExample.dll")
        #print(path)
        ctypes.cdll.LoadLibrary(path)
        #for easy access
        self.exampleDll = ctypes.cdll.minimalExample
        asyncio.run(self.register_and_wait())
if __name__ == "__main__":
    for i in range(0,100000):
        print(i)
        test = testClass()
        test.main()
You can get the compiled example dll and its source from the repository here to reproduce.
The issue (at least in this minimal example) does not show up any more if I reuse the same event loop instead of spawning a new one for every iteration with asyncio.run.
The problem is thus fixed, but it doesn't feel right.

await does not return the value of a future with the value set in a thread

This code should print "hi" 3 times, but it doesn't always print.
I made a gif that shows the code being executed:
from asyncio import get_event_loop, wait_for, new_event_loop
from threading import Thread
class A:
    def __init__(self):
        self.fut = None
    def start(self):
        """
        Expects a future to be created and puts "hi" as a result
        """
        async def foo():
            while True:
                if self.fut:
                    self.fut.set_result('hi')
                    self.fut = None
        new_event_loop().run_until_complete(foo())
    async def make(self):
        """
        Create a future and print your result when it runs out
        """
        future = get_event_loop().create_future()
        self.fut = future
        print(await future)
a = A()
Thread(target=a.start).start()
for _ in range(3):
    get_event_loop().run_until_complete(a.make())
This is caused by await future, because when I replace
print(await future)
with
while not future.done():
    pass
print(future.result())
the code always prints "hi" 3 times.
Is there anything in my code that causes this problem in await future?
Asyncio functions are not thread-safe, except where explicitly noted. For set_result to work from another thread, you'd need to call it through call_soon_threadsafe.
But in your case this wouldn't work because A.start creates a different event loop than the one the main thread executes. This creates issues because the futures created in one loop cannot be awaited in another one. Because of this, and also because there is just no need to create multiple event loops, you should pass the event loop instance to A.start and use it for your async needs.
But - when using the event loop from the main thread, A.start cannot call run_until_complete() because that would try to run an already running event loop. Instead, it must call asyncio.run_coroutine_threadsafe to submit the coroutine to the event loop running in the main thread. This will return a concurrent.futures.Future (not to be confused with an asyncio Future) whose result() method can be used to wait for it to execute and propagate the result or exception, just like run_until_complete() would have done. Since foo will now run in the same thread as the event loop, it can just call set_result without call_soon_threadsafe.
One final problem is that foo contains an infinite loop that doesn't await anything, which blocks the event loop. (Remember that asyncio is based on cooperative multitasking, and a coroutine that spins without awaiting doesn't cooperate.) To fix that, you can have foo monitor an event that gets triggered when a new future is available.
Applying the above to your code can look like this, which prints "hi" three times as desired:
import asyncio
from asyncio import get_event_loop
from threading import Thread
class A:
    def __init__(self):
        self.fut = None
        self.have_fut = asyncio.Event()
    def start(self, loop):
        async def foo():
            while True:
                await self.have_fut.wait()
                self.have_fut.clear()
                if self.fut:
                    self.fut.set_result('hi')
                    self.fut = None
        asyncio.run_coroutine_threadsafe(foo(), loop).result()
    async def make(self):
        future = get_event_loop().create_future()
        self.fut = future
        self.have_fut.set()
        print(await future)
a = A()
Thread(target=a.start, args=(get_event_loop(),), daemon=True).start()
for _ in range(3):
    get_event_loop().run_until_complete(a.make())
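For the first point in the answer (calling set_result through call_soon_threadsafe), here is a minimal sketch of the simpler case where a plain worker thread resolves a future owned by the main thread's loop. The names worker and main are made up for this illustration, and it assumes Python 3.7+ for asyncio.run and get_running_loop:

```python
import asyncio
import threading

def worker(loop, future):
    # Runs in a plain thread. Instead of calling future.set_result() directly
    # (not thread-safe), ask the loop to call it on its own thread.
    loop.call_soon_threadsafe(future.set_result, "hi")

async def main():
    loop = asyncio.get_running_loop()
    future = loop.create_future()  # the future belongs to this loop
    threading.Thread(target=worker, args=(loop, future)).start()
    return await future

result = asyncio.run(main())
print(result)
```

call_soon_threadsafe also wakes the loop up, which is why the awaiting coroutine resumes promptly instead of waiting for some unrelated event.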

Running an event loop within its own thread

I'm playing with Python's new(ish) asyncio stuff, trying to combine its event loop with traditional threading. I have written a class that runs the event loop in its own thread, to isolate it, and then provide a (synchronous) method that runs a coroutine on that loop and returns the result. (I realise this makes it a somewhat pointless example, because it necessarily serialises everything, but it's just as a proof-of-concept).
import asyncio
import aiohttp
from threading import Thread
class Fetcher(object):
    def __init__(self):
        self._loop = asyncio.new_event_loop()
        # FIXME Do I need this? It works either way...
        #asyncio.set_event_loop(self._loop)
        self._session = aiohttp.ClientSession(loop=self._loop)
        self._thread = Thread(target=self._loop.run_forever)
        self._thread.start()
    def __enter__(self):
        return self
    def __exit__(self, *e):
        self._session.close()
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()
    def __call__(self, url:str) -> str:
        # FIXME Can I not get a future from some method of the loop?
        future = asyncio.run_coroutine_threadsafe(self._get_response(url), self._loop)
        return future.result()
    async def _get_response(self, url:str) -> str:
        async with self._session.get(url) as response:
            assert response.status == 200
            return await response.text()
if __name__ == "__main__":
    with Fetcher() as fetcher:
        while True:
            x = input("> ")
            if x.lower() == "exit":
                break
            try:
                print(fetcher(x))
            except Exception as e:
                print(f"WTF? {e.__class__.__name__}")
To avoid this sounding too much like a "Code Review" question, what is the purpose of asyncio.set_event_loop and do I need it in the above? It works fine with and without. Moreover, is there a loop-level method to invoke a coroutine and return a future? It seems a bit odd to do this with a module level function.
You would need to use set_event_loop if you called get_event_loop anywhere and wanted it to return the loop created when you called new_event_loop.
From the docs
If there’s need to set this loop as the event loop for the current context, set_event_loop() must be called explicitly.
Since you do not call get_event_loop anywhere in your example, you can omit the call to set_event_loop.
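To make that concrete, here is a small sketch (not taken from the question's code) of what set_event_loop changes inside a worker thread: once the new loop is registered, get_event_loop in that same thread returns it. The names with_set and outcome are made up for the illustration.

```python
import asyncio
import threading

outcome = {}

def with_set():
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)  # register the loop for this thread
    try:
        # get_event_loop() now hands back the loop we registered above
        outcome["same_loop"] = asyncio.get_event_loop() is loop
    finally:
        asyncio.set_event_loop(None)
        loop.close()

thread = threading.Thread(target=with_set)
thread.start()
thread.join()
print(outcome)
```

If nothing in the thread ever calls get_event_loop, as in the Fetcher class, the registration has no visible effect, which matches the "works either way" observation.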
I might be misinterpreting, but I think the comment by @dirn in the marked answer is incorrect in stating that get_event_loop works from a thread. See the following example:
import asyncio
import threading
async def hello():
    print('started hello')
    await asyncio.sleep(5)
    print('finished hello')
def threaded_func():
    el = asyncio.get_event_loop()
    el.run_until_complete(hello())
thread = threading.Thread(target=threaded_func)
thread.start()
This produces the following error:
RuntimeError: There is no current event loop in thread 'Thread-1'.
It can be fixed by:
- el = asyncio.get_event_loop()
+ el = asyncio.new_event_loop()
The documentation also specifies that this trick (creating an eventloop by calling get_event_loop) only works on the main thread:
If there is no current event loop set in the current OS thread, the OS thread is main, and set_event_loop() has not yet been called, asyncio will create a new event loop and set it as the current one.
Finally, the docs also recommend using get_running_loop instead of get_event_loop if you're on version 3.7 or higher.
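As a rough sketch of that recommendation (assuming Python 3.7+, with report and seen being names invented here): get_running_loop raises RuntimeError when no loop is running in the current thread, and returns the running loop when called from inside a coroutine, including one started in a worker thread with asyncio.run.

```python
import asyncio
import threading

seen = {}

async def report():
    # Inside a coroutine there is always a running loop to return.
    seen["inside"] = asyncio.get_running_loop()

def threaded_func():
    try:
        asyncio.get_running_loop()  # no loop is running in this thread yet
    except RuntimeError:
        seen["outside"] = "no running loop"
    asyncio.run(report())  # creates, runs and closes a loop for this thread

thread = threading.Thread(target=threaded_func)
thread.start()
thread.join()
print(seen["outside"])
```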

Is there a way to use asyncio.Queue in multiple threads?

Let's assume I have the following code:
import asyncio
import threading
queue = asyncio.Queue()
def threaded():
    import time
    while True:
        time.sleep(2)
        queue.put_nowait(time.time())
        print(queue.qsize())
@asyncio.coroutine
def async():
    while True:
        time = yield from queue.get()
        print(time)
loop = asyncio.get_event_loop()
asyncio.Task(async())
threading.Thread(target=threaded).start()
loop.run_forever()
The problem with this code is that the loop inside async coroutine is never finishing the first iteration, while queue size is increasing.
Why is this happening this way and what can I do to fix it?
I can't get rid of the separate thread, because in my real code I use a separate thread to communicate with a serial device, and I haven't found a way to do that using asyncio.
asyncio.Queue is not thread-safe, so you can't use it directly from more than one thread. Instead, you can use janus, which is a third-party library that provides a thread-aware asyncio queue.
import asyncio
import threading
import janus
def threaded(squeue):
    import time
    while True:
        time.sleep(2)
        squeue.put_nowait(time.time())
        print(squeue.qsize())
@asyncio.coroutine
def async_func(aqueue):
    while True:
        time = yield from aqueue.get()
        print(time)
loop = asyncio.get_event_loop()
queue = janus.Queue(loop=loop)
# loop.create_task works before the loop is running; asyncio.create_task would need a running loop
loop.create_task(async_func(queue.async_q))
threading.Thread(target=threaded, args=(queue.sync_q,)).start()
loop.run_forever()
There is also aioprocessing (full-disclosure: I wrote it), which provides process-safe (and as a side-effect, thread-safe) queues as well, but that's overkill if you're not trying to use multiprocessing.
Edit
As pointed out in other answers, for simple use-cases you can use loop.call_soon_threadsafe to add to the queue, as well.
If you do not want to use another library you can schedule a coroutine from the thread. Replacing the queue.put_nowait with the following works fine.
asyncio.run_coroutine_threadsafe(queue.put(time.time()), loop)
The variable loop represents the event loop in the main thread.
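A fuller, self-contained sketch of that replacement might look like the following. It assumes Python 3.7+ and a finite producer so the demo terminates; producer, consume and received are names invented for the illustration, not part of the question's code.

```python
import asyncio
import threading
import time

def producer(loop, queue, count):
    # Runs in a plain thread. queue.put() itself executes on the loop's
    # thread, because run_coroutine_threadsafe schedules it there.
    for i in range(count):
        time.sleep(0.01)
        asyncio.run_coroutine_threadsafe(queue.put(i), loop)

async def consume(count):
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()  # created inside the loop that will use it
    thread = threading.Thread(target=producer, args=(loop, queue, count))
    thread.start()
    items = [await queue.get() for _ in range(count)]
    # Join in an executor so the event loop itself is never blocked.
    await loop.run_in_executor(None, thread.join)
    return items

received = asyncio.run(consume(3))
print(received)
```

The producer deliberately does not wait on the returned concurrent.futures.Future; waiting on it while the loop thread is blocked in a plain join() would be an easy way to deadlock.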
EDIT:
The reason why your async coroutine is not doing anything is that the event loop never gives it a chance to do so. The queue object is not threadsafe and if you dig through the cpython code you find that this means that put_nowait wakes up consumers of the queue through the use of a future with the call_soon method of the event loop. If we could make it use call_soon_threadsafe it should work. The major difference between call_soon and call_soon_threadsafe, however, is that call_soon_threadsafe wakes up the event loop by calling loop._write_to_self(). So let's call it ourselves:
import asyncio
import threading
queue = asyncio.Queue()
def threaded():
    import time
    while True:
        time.sleep(2)
        queue.put_nowait(time.time())
        queue._loop._write_to_self()
        print(queue.qsize())
@asyncio.coroutine
def async():
    while True:
        time = yield from queue.get()
        print(time)
loop = asyncio.get_event_loop()
asyncio.Task(async())
threading.Thread(target=threaded).start()
loop.run_forever()
Then, everything works as expected.
As for the threadsafe aspect of accessing shared objects, asyncio.Queue uses collections.deque under the hood, which has threadsafe append and popleft. Checking that the queue is not empty and then calling popleft may not be atomic, but if you consume the queue in only one thread (the one running the event loop) it could be fine.
The other proposed solutions, loop.call_soon_threadsafe from Huazuo Gao's answer and my asyncio.run_coroutine_threadsafe, are just doing this: waking up the event loop.
BaseEventLoop.call_soon_threadsafe is at hand. See asyncio doc for detail.
Simply change your threaded() like this:
def threaded():
    import time
    while True:
        time.sleep(1)
        loop.call_soon_threadsafe(queue.put_nowait, time.time())
        loop.call_soon_threadsafe(lambda: print(queue.qsize()))
Here's a sample output:
0
1443857763.3355968
0
1443857764.3368602
0
1443857765.338082
0
1443857766.3392274
0
1443857767.3403943
What about just using threading.Lock with asyncio.Queue?
import asyncio
import queue
import threading
class ThreadSafeAsyncFuture(asyncio.Future):
    """ asyncio.Future is not thread-safe
    https://stackoverflow.com/questions/33000200/asyncio-wait-for-event-from-other-thread
    """
    def set_result(self, result):
        func = super().set_result
        call = lambda: func(result)
        self._loop.call_soon_threadsafe(call) # Warning: self._loop is undocumented
class ThreadSafeAsyncQueue(queue.Queue):
    """ asyncio.Queue is not thread-safe, queue.Queue is not awaitable
    works only with one putter to unlimited-size queue and with several getters
    TODO: add maxsize limits
    TODO: make put a coroutine
    """
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.lock = threading.Lock()
        self.loop = asyncio.get_event_loop()
        self.waiters = []
    def put(self, item):
        with self.lock:
            if self.waiters:
                self.waiters.pop(0).set_result(item)
            else:
                super().put(item)
    async def get(self):
        with self.lock:
            if not self.empty():
                return super().get()
            else:
                fut = ThreadSafeAsyncFuture()
                self.waiters.append(fut)
        result = await fut
        return result
See also - asyncio: Wait for event from other thread

Python multiprocessing with twisted's reactor

I am working on a xmlrpc server which has to perform certain tasks cyclically. I am using twisted as the core of the xmlrpc service but I am running into a little problem:
class cemeteryRPC(xmlrpc.XMLRPC):
    def __init__(self, dic):
        xmlrpc.XMLRPC.__init__(self)
    def xmlrpc_foo(self):
        return 1
    def cycle(self):
        print "Hello"
        time.sleep(3)
class cemeteryM( base ):
    def __init__(self, dic): # dic is for cemetery
        multiprocessing.Process.__init__(self)
        self.cemRPC = cemeteryRPC()
    def run(self):
        # Start reactor on a second process
        reactor.listenTCP( c.PORT_XMLRPC, server.Site( self.cemRPC ) )
        p = multiprocessing.Process( target=reactor.run )
        p.start()
        while not self.exit.is_set():
            self.cemRPC.cycle()
        #p.join()
if __name__ == "__main__":
    import errno
    test = cemeteryM()
    test.start()
    # trying new method
    notintr = False
    while not notintr:
        try:
            test.join()
            notintr = True
        except OSError, ose:
            if ose.errno != errno.EINTR:
                raise ose
        except KeyboardInterrupt:
            notintr = True
How should I go about joining these two processes so that their respective joins don't block?
(I am pretty confused by "join". Why would it block? I have googled but can't find a helpful explanation of how join is used. Can someone explain this to me?)
Regards
Do you really need to run Twisted in a separate process? That looks pretty unusual to me.
Try to think of Twisted's Reactor as your main loop - and hang everything you need off that - rather than trying to run Twisted as a background task.
The more normal way of performing this sort of operation would be to use Twisted's .callLater or to add a LoopingCall object to the Reactor.
e.g.
from twisted.web import xmlrpc, server
from twisted.internet import task
from twisted.internet import reactor
class Example(xmlrpc.XMLRPC):
    def xmlrpc_add(self, a, b):
        return a + b
    def timer_event(self):
        print "one second"
r = Example()
m = task.LoopingCall(r.timer_event)
m.start(1.0)
reactor.listenTCP(7080, server.Site(r))
reactor.run()
Hey asdvawev - .join() in multiprocessing works just like .join() in threading - it's a blocking call the main thread runs to wait for the worker to shut down. If the worker never shuts down, then .join() will never return. For example:
class myproc(Process):
    def run(self):
        while True:
            time.sleep(1)
Calling run on this means that join() will never, ever return. Typically to prevent this I'll use an Event() object passed into the child process to allow me to signal the child when to exit:
class myproc(Process):
    def __init__(self, event):
        self.event = event
        Process.__init__(self)
    def run(self):
        while not self.event.is_set():
            time.sleep(1)
Alternatively, if your work is encapsulated in a queue - you can simply have the child process work off of the queue until it encounters a sentinel (typically a None entry in the queue) and then shut down.
Both of these suggestions mean that prior to calling .join() you can set the event or insert the sentinel, and when join() is called, the process will finish its current task and then exit properly.
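The sentinel variant can be sketched as below. To keep the example short and runnable anywhere, it uses threading.Thread and queue.Queue rather than multiprocessing (the answer notes that join() behaves the same way in both); the names worker, tasks and SENTINEL are made up for the illustration.

```python
import queue
import threading

SENTINEL = None  # assumed marker meaning "no more work"

def worker(tasks, results):
    # Work off the queue until the sentinel shows up, then return so
    # that join() in the parent can complete.
    while True:
        item = tasks.get()
        if item is SENTINEL:
            break
        results.append(item * 2)

tasks = queue.Queue()
results = []
t = threading.Thread(target=worker, args=(tasks, results))
t.start()
for n in (1, 2, 3):
    tasks.put(n)
tasks.put(SENTINEL)  # signal shutdown before joining
t.join(timeout=5)
print(results, t.is_alive())
```

With multiprocessing the shape is the same, using multiprocessing.Process and multiprocessing.Queue, though results would have to travel back through a second queue rather than a shared list.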
