The Story:
I am currently looking through the asyncio basic examples, in particular this one - the simplest possible HTTP client. The main function starts an event loop, runs until the data fetching is complete and closes the event loop:
def main():
    loop = get_event_loop()
    try:
        body = loop.run_until_complete(fetch())
    finally:
        loop.close()
    print(body.decode('latin-1'), end='')
But, the code also works if I omit the loop.close():
def main():
    loop = get_event_loop()
    body = loop.run_until_complete(fetch())
    print(body.decode('latin-1'), end='')
The Question:
While there is an example, the question is a generic one - what can potentially go wrong if one would forget to close the asyncio event loop? Is the event loop going to be always implicitly closed?
.close() can be used by different event loop implementations to free up system resources allocated by the loop (or do anything else). If you take a look at the code of _UnixSelectorEventLoop, which is the default event loop on Linux, you will find the following code:
def close(self):
    super().close()
    for sig in list(self._signal_handlers):
        self.remove_signal_handler(sig)
Here, for example, close() removes signal handlers registered with loop.add_signal_handler().
Because multiple event loops can run on different threads, and new event loops can be created after an old one is closed (see asyncio.new_event_loop()), closing them should be considered a good habit.
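One way to make closing hard to forget is to let a context manager do it. The following is a minimal sketch, assuming a hypothetical fetch() coroutine that stands in for the HTTP client from the question; contextlib.closing() simply calls loop.close() on exit, even if the coroutine raises.

```python
import asyncio
from contextlib import closing


async def fetch() -> bytes:
    # Hypothetical stand-in for the HTTP fetch in the question
    await asyncio.sleep(0)
    return b"hello"


def main() -> bytes:
    # closing() calls loop.close() when the with-block exits
    with closing(asyncio.new_event_loop()) as loop:
        body = loop.run_until_complete(fetch())
    assert loop.is_closed()
    return body
```

Calling main() returns the fetched bytes and leaves no open loop behind.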
Update
Starting with Python 3.7 it is recommended to use asyncio.run instead of run_until_complete():
# Python 3.7+
def main():
    body = asyncio.run(fetch())
    print(body.decode('latin-1'), end='')
Among other things, asyncio.run takes care of finally close()ing the loop.
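The closing behaviour can be checked directly. This is a small sketch in which a hypothetical coroutine returns the loop it ran on, so that the loop can be inspected after asyncio.run() has returned.

```python
import asyncio


async def fetch():
    # Hypothetical coroutine: hand back the loop we are running on
    await asyncio.sleep(0)
    return asyncio.get_running_loop()


loop = asyncio.run(fetch())
# asyncio.run() has already closed the loop it created
closed_after_run = loop.is_closed()
```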
Related
My application needs remote control over SSH.
I wish to use this example: https://asyncssh.readthedocs.io/en/latest/#simple-server-with-input
The original app is rather big: it uses GPIO, about 600 lines of code, and 10 libraries. So I've made a simple example here:
import asyncio, asyncssh, sys, time

# here would be 10 libraries in the original 600-line application
is_open = True
return_value = 0

async def handle_client(process):
    process.stdout.write('Enter numbers one per line, or EOF when done:\n')
    process.stdout.write(str(is_open) + '\n')  # write() expects a str, not a bool
    total = 0
    try:
        async for line in process.stdin:
            line = line.rstrip('\n')
            if line:
                try:
                    total += int(line)
                except ValueError:
                    process.stderr.write('Invalid number: %s\n' % line)
    except asyncssh.BreakReceived:
        pass
    process.stdout.write('Total = %s\n' % total)
    process.exit(0)

async def start_server():
    await asyncssh.listen('', 8022, server_host_keys=['key'],
                          authorized_client_keys='key.pub',
                          process_factory=handle_client)

loop = asyncio.get_event_loop()
try:
    loop.run_until_complete(start_server())
except (OSError, asyncssh.Error) as exc:
    sys.exit('Error starting server: ' + str(exc))
loop.run_forever()

# here is the "old" program: that would not run now as loop.run_forever() runs.
#while True:
#    print(return_value)
#    time.sleep(0.1)
The main app is mostly driven by a while True loop with lots of functions and sleep.
I've commented that part out in the simple example above.
My question is: How should I implement the SSH part, that uses loop.run_forever() - and still be able to run my main loop?
Also: the handle_client(process) - must be able to interact with variables in the main program. (read/write)
You have basically three options:
Rewrite your main loop to be asyncio compatible
A main while True loop with lots of sleeps is exactly the kind of code you want to write asynchronously. Convert this:
while True:
    task_1()  # takes n ms
    sleep(0.2)
    task_2()  # takes n ms
    sleep(0.4)
into this:
async def task_1():
    while True:
        stuff()
        await asyncio.sleep(0.6)

async def task_2():
    while True:
        stuff()
        await asyncio.sleep(0.01)
        other_stuff()
        await asyncio.sleep(0.8)

loop = asyncio.get_event_loop()
# create_task() schedules each coroutine on the loop (there is no loop.add_task())
loop.create_task(task_1())
loop.create_task(task_2())
...
loop.run_forever()
This is the most work, but it is almost certain that your current code will be better written, clearer, easier to maintain and easier to develop if written as a bunch of coroutines. If you do this the problem goes away: with cooperative multitasking you tell the code when to yield, so sharing state is generally pretty easy. By not awaiting anything in between getting and using a state var you prevent race conditions: no need for any kind of thread-safe var.
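To illustrate the shared-state point, here is a minimal sketch in which two coroutines read and write one dictionary without any locking. The names state, main_loop and fake_client are hypothetical; fake_client stands in for an SSH handler such as handle_client, and the loop is finite only so that the example terminates.

```python
import asyncio

# Plain shared state: safe because only one coroutine runs at a time
state = {"return_value": 0}


async def main_loop(iterations: int) -> list:
    seen = []
    for _ in range(iterations):
        # No await between reading and using the value, so no race
        seen.append(state["return_value"])
        await asyncio.sleep(0.01)
    return seen


async def fake_client() -> None:
    # Hypothetical stand-in for an SSH client changing a variable
    await asyncio.sleep(0.02)
    state["return_value"] = 42


async def run() -> list:
    seen, _ = await asyncio.gather(main_loop(10), fake_client())
    return seen


seen = asyncio.run(run())
```

The first value read is the initial one, and later iterations observe the value written by the other coroutine.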
Run your asyncio loop in a thread
Leave your current loop intact, but run your asyncio loop in a thread (or process) with either threading or multiprocessing. Expose some kind of thread-safe variable to allow the background thread to change state, or transition to a (thread-safe) messaging paradigm, where the SSH thread emits messages into a queue which your main loop handles in its own time (a message could be something like ("a", 5), which would be handled by doing something like state_dict[msg[0]] = msg[1] for everything in the queue).
If you want to go this way, have a look at the multiprocessing and/or threading docs for examples of the right ways to pass variables or messages between threads. Note that this version will likely be less performant than a pure asyncio solution, particularly if your code is mostly sleeping in the main loop anyhow.
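A minimal sketch of the message-passing variant follows, assuming a hypothetical fake_ssh_session() coroutine in place of the real asyncssh server. The asyncio code runs in a background thread and only ever touches a queue.Queue, which is thread-safe; the main loop applies the messages to state_dict when it gets around to it.

```python
import asyncio
import queue
import threading

messages = queue.Queue()  # thread-safe channel from the asyncio thread


async def fake_ssh_session() -> None:
    # Hypothetical stand-in for the SSH handler: emit (key, value) messages
    for value in (1, 2, 3):
        await asyncio.sleep(0.01)
        messages.put(("return_value", value))


def run_asyncio_in_thread() -> None:
    # asyncio.run() creates and closes a loop private to this thread
    asyncio.run(fake_ssh_session())


state_dict = {"return_value": 0}
worker = threading.Thread(target=run_asyncio_in_thread, daemon=True)
worker.start()
worker.join()  # only for this demo; a real main loop would keep running

# One pass of the "main loop": drain whatever messages have arrived
while True:
    try:
        key, value = messages.get_nowait()
    except queue.Empty:
        break
    state_dict[key] = value
```

In a real program the drain step would sit inside the existing while True loop, next to the sleeps.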
Run your synchronous code in a thread, and have asyncio in the foreground
As @MisterMiyagi points out, asyncio has loop.run_in_executor() for running blocking code in an executor (a thread pool by default). It's more generally used to run the odd blocking bit of code without tying up the whole loop, but you can run your whole main loop in it. The same concerns about some kind of thread-safe variable or message sharing apply. This has the advantage (as @MisterMiyagi points out) of keeping asyncio where it expects to be. I have a few projects which use background asyncio threads in generally non-asyncio code (event-driven GUI code with an asyncio thread interacting with custom hardware over USB). It can be done, but you do have to be careful as to how you write it.
Note btw that if you do decide to use multiple threads, message-passing (with a queue) is usually easier than directly sharing variables.
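For the third option, a minimal sketch might look like the following. The blocking_main_loop() function is a hypothetical stand-in for the existing while True loop, and a threading.Event is used here as the simplest thread-safe way for the asyncio side to ask it to stop; the short asyncio.sleep() stands in for serving SSH clients.

```python
import asyncio
import threading
import time

stop_requested = threading.Event()  # thread-safe flag shared by both sides


def blocking_main_loop() -> int:
    # Hypothetical stand-in for the original synchronous main loop
    ticks = 0
    while not stop_requested.is_set():
        ticks += 1
        time.sleep(0.01)
    return ticks


async def main() -> int:
    loop = asyncio.get_running_loop()
    # None selects the loop's default executor (a thread pool)
    worker = loop.run_in_executor(None, blocking_main_loop)
    await asyncio.sleep(0.1)  # stand-in for serving SSH clients
    stop_requested.set()
    return await worker


ticks = asyncio.run(main())
```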
I want to use kqueue to monitor files for changes. I can see how to use select.kqueue() in a threaded way.
I'm searching for a way to use it with asyncio. I may have missed something really obvious here. I know that Python uses kqueue for asyncio on macOS. I'm happy for any solution to only work when the kqueue selector is used.
So far the only way I can see to do this is to create a thread that continually calls kqueue.control() and then injects the events into the loop with loop.call_soon_threadsafe(). I feel like there should be a better way.
You can add the FD from the kqueue object as a reader to the event loop using loop.add_reader(). The event loop will then inform you when events are ready to collect.
There are two features of doing this which might seem odd to those familiar with kqueue:
select.kqueue.control is a one-shot method which first changes the monitor and then waits for new events to arrive. Because we don't ever want it to block, the two actions must be split into one non-blocking call to modify the monitor and a second, later, non-blocking call to collect the resulting events.
Because we don't ever want to block, the timeout can never be used. It can be re-implemented with asyncio.wait_for().
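The add_reader() plus wait_for() pattern can be tried without kqueue at all. The sketch below is an illustration under assumptions: it uses a socket pair in place of the kqueue descriptor so that it runs on any Unix selector event loop, and wait_readable() is a hypothetical helper name.

```python
import asyncio
import socket


async def wait_readable(sock: socket.socket, timeout: float) -> bytes:
    loop = asyncio.get_running_loop()
    future = loop.create_future()

    def on_readable() -> None:
        # Called by the event loop once the descriptor has data
        loop.remove_reader(sock.fileno())
        if not future.done():
            future.set_result(sock.recv(1024))

    loop.add_reader(sock.fileno(), on_readable)
    try:
        # wait_for() supplies the timeout that a blocking call would have had
        return await asyncio.wait_for(future, timeout)
    finally:
        # Harmless if the callback already removed the reader
        loop.remove_reader(sock.fileno())


async def main() -> bytes:
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    try:
        writer.send(b"event")
        return await wait_readable(reader, timeout=1.0)
    finally:
        reader.close()
        writer.close()


data = asyncio.run(main())
```

Removing the reader in the finally block also covers the timeout case, where the callback never runs.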
There are more efficient ways to write this, but here's an example of how to completely replace select.kqueue.control with an async method (here named kqueue_control):
import asyncio
import select
from typing import Iterable, Optional

async def kqueue_control(kqueue: select.kqueue,
                         changes: Optional[Iterable[select.kevent]],
                         max_events: int,
                         timeout: Optional[int]):

    def receive_result():
        try:
            # Events are ready to collect; fetch them but do not block
            results = kqueue.control(None, max_events, 0)
        except Exception as ex:
            future.set_exception(ex)
        else:
            future.set_result(results)
        finally:
            loop.remove_reader(kqueue.fileno())

    # If this call is non-blocking then just execute it
    if timeout == 0 or max_events == 0:
        return kqueue.control(changes, max_events, 0)

    # Apply the changes, but DON'T wait for events
    kqueue.control(changes, 0)
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    loop.add_reader(kqueue.fileno(), receive_result)
    if timeout is None:
        return await future
    else:
        return await asyncio.wait_for(future, timeout)
Here is my code:
async def runTaskWrapped(options):
    layoutz = [[sg.Text("Running...", key="runstatus")]]
    windowz = sg.Window("New Task", layoutz)
    x = threading.Thread(target=runTask, args=(options,))
    x.start()
    startTime = time.time()
    while True:
        eventz, valuesz = windowz.read(timeout=100)
        if eventz == sg.WIN_CLOSED:
            if x.is_alive():
                continue
            break
        if x.is_alive() == False:
            x.join()
            windowz.FindElement('runstatus').Update(value='Done! Check the new log.txt for more info.')
            break
        else:
            windowz.FindElement('runstatus').Update(value='Running... (' + str(math.floor(time.time()-startTime)) + ')')

asyncio.run(runTaskWrapped(options))
I have tried everything and it still seems that execution pauses after asyncio.run(runTaskWrapped(options));
Any idea why this may be?
EDIT:
I tried threading.Thread, and although it didn't pause execution, PySimpleGUI (imported as sg) didn't do anything, and no window showed up as it does when called synchronously.
I tried trio too, but trio paused execution.
trio.run(runTaskWrapped, options)
When you call asyncio.run(some_function()), your program will not go to the next line until some_function() returns. In your case, runTaskWrapped doesn't return until you execute one of its "break" statements.
We deal with this sort of thing all the time. If you call any function f(), your program won't continue until f() returns. That's a familiar concept.
What's different about asyncio is that it creates a loop of its own, called the event loop, and launches some_function() from inside that loop. That allows you to start other tasks from within some_function(), and those other tasks get a chance to execute when some_function() encounters an "await" statement. That's a powerful concept if that's what you need. But it's only useful if you have two or more tasks that need to wait on external resources, like a network or a serial communications link, and one of the tasks can proceed while the other is waiting.
Your function runTaskWrapped does not contain any "await" statements. So asyncio creates an event loop and hands control to runTaskWrapped, and that's a blind alley: the function is more-or-less an infinite loop and doesn't "await" anything. Therefore control never returns to the event loop until runTaskWrapped finishes, and the rest of your program is effectively stalled at that point.
In order to make use of asyncio you must structure your program to have more than one task containing "await"s.
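As a minimal sketch of what "more than one task containing awaits" means, the two hypothetical workers below each yield to the event loop at their await, so their steps can interleave. asyncio.run() still returns only once main() has finished.

```python
import asyncio


async def worker(name: str, delay: float, log: list) -> None:
    for step in range(2):
        # Each await hands control back to the event loop
        await asyncio.sleep(delay)
        log.append(f"{name}:{step}")


async def main() -> list:
    log = []
    # Run both workers concurrently and wait for both to finish
    await asyncio.gather(worker("a", 0.01, log), worker("b", 0.015, log))
    return log


log = asyncio.run(main())
```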
You are writing a GUI program, which typically means that it already has an event loop of its own. In some cases it is possible to run the GUI's event loop and the asyncio event loop together, but unless you have a specific need to do this it doesn't gain you anything.
You are also trying to use asyncio with multiple threads, and although this is possible it needs to be done quite carefully. You can start other threads just as in any other Python program, but the presence of those other threads doesn't change what happens in your main thread. You must specifically write code to synchronize events between threads.
No matter what you do in those other threads, asyncio.run(some_function()) will not return until its argument is finished.
I'm running into some strange errors with initialising Locks and running asynchronous code. Suppose we had a class to use with some resource protected by a lock.
import asyncio

class C:
    def __init__(self):
        self.lock = asyncio.Lock()

    async def foo(self):
        async with self.lock:
            return 'foo'

def async_foo():
    c = C()
    asyncio.run(c.foo())

if __name__ == '__main__':
    async_foo()
    async_foo()
This throws an error when run. It occurs on lock initialisation in __init__.
RuntimeError: There is no current event loop in thread 'MainThread'.
Duplicating the asyncio.run call inside the function does not have this effect. It seems that the object needs to be initialised multiple times. It is also not enough to instantiate multiple locks in a single constructor. So perhaps it has something to do with the event loop's state after asyncio.run is called.
What is going on? And how could I modify this code to work? Let me also clarify a bit, the instance is created outside asyncio.run and async functions for a reason. I'd like for it to be usable elsewhere too. If that makes a difference.
Alternatively, can threading.Lock be used for async things also? It would have the added benefit of being thread-safe, which asyncio.Lock reportedly is not.
What is going on?
When an async object is created (asyncio.Lock()), it is attached to the current event loop and can only be used with that loop.
The main thread has a default current event loop (but other threads you create won't have a default event loop).
asyncio.run() internally creates a new event loop, sets it as the current one, and closes it when finished.
So you're trying to use the lock with an event loop other than the one it was attached to on creation. That leads to errors.
And how could I modify this code to work?
The ideal solution is the following:
import asyncio

async def main():
    # all your code is here
    ...

if __name__ == "__main__":
    asyncio.run(main())
This guarantees that every async object created is attached to the event loop that asyncio.run has created.
The running event loop (inside asyncio.run) is meant to be the global "entry point" of your async program.
I'd like for it to be usable elsewhere too.
You're able to create an object outside asyncio.run, but then you should move the creation of the async object out of __init__ to somewhere else, so that asyncio.Lock() isn't created until asyncio.run() has been called.
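One way to do that is to create the lock lazily, the first time a coroutine needs it. This is a minimal sketch of the class from the question with that change; the _lock attribute name is an assumption. Note that an instance reused across several asyncio.run() calls would still carry a lock from the earlier loop, so this fits the one-instance-per-run usage shown in the question.

```python
import asyncio
from typing import Optional


class C:
    def __init__(self) -> None:
        # No asyncio object is created here, so no event loop is needed yet
        self._lock: Optional[asyncio.Lock] = None

    async def foo(self) -> str:
        # Created inside a coroutine, i.e. while the loop is running.
        # There is no await between the check and the assignment,
        # so two coroutines cannot both create a lock.
        if self._lock is None:
            self._lock = asyncio.Lock()
        async with self._lock:
            return 'foo'


def async_foo() -> str:
    c = C()
    return asyncio.run(c.foo())


results = [async_foo(), async_foo()]
```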
Alternatively, can threading.Lock be used for async things also?
No, it is meant for working with threads, while asyncio runs coroutines inside a single thread (usually).
It would have the added benefit of being thread-safe, which asyncio.Lock reportedly is not.
In asyncio you usually don't need threads other than the main one. There are still some reasons to use them, but the thread-unsafety of asyncio.Lock shouldn't be an issue.
Consider reading the following links. They may help you understand the situation better:
why we need asyncio/threads at all
When should I write asynchronous code instead of synchronous?
I have an infinite loop running async but I can't terminate it. Here is a similar version of my code:
from multiprocessing import Pool
test_pool = Pool(processes=1)

self.button1.clicked.connect(self.starter)
self.button2.clicked.connect(self.stopper)

def starter(self):
    global test_pool
    test_pool.apply_async(self.automatizer)

def automatizer(self):
    i = 0
    while i != 0 :
        self.job1()
        # safe stop point
        self.job2()
        # safe stop point
        self.job3()
        # safe stop point

def job1(self):
    # doing some stuff

def job2(self):
    # doing some stuff

def job3(self):
    # doing some stuff

def stopper(self):
    global test_pool
    test_pool.terminate()
My problem is that terminate() inside the stopper function doesn't work. I tried putting terminate() inside the job1, job2 and job3 functions, which still didn't work, and at the end of the loop in the starter function, which didn't work either. How can I stop this async process?
While stopping the process at any time is good enough, is it possible to make it stop at the points I want? I mean, if a stop command (I'm not sure what command that would be) is given to the process, I want it to complete the steps up to the next "# safe stop point" marker and then terminate.
You really should be avoiding the use of terminate() in normal operation. It should only be used in unusual cases, such as hanging or unresponsive processes. The normal way to end a process pool is to call pool.close() followed by pool.join().
These methods do require the function that your pool is executing to return, and your call to pool.join() will block your main process until it does so. I would suggest you add a multiprocessing.Queue to give yourself a way to tell your subprocess to exit:
import multiprocessing
# The standard-library queue module is NOT the same as multiprocessing.Queue;
# it is needed here only for the Empty exception (Python 2 names the module Queue)
from queue import Empty

queue = multiprocessing.Queue()  # not the same as a queue.Queue()

def stopper(self):
    # don't need "global" keyword to call a global object's method
    # it's only necessary if we want to modify a global
    queue.put("Stop")
    test_pool.close()
    test_pool.join()

def automatizer(self):
    while True:  # cleaner infinite loop - yours was never executing
        for func in [self.job1, self.job2, self.job3]:  # iterate over methods
            func()  # call each one
            # between each function call, check the queue for "poison pill"
            try:
                if queue.get(block=False) == "Stop":
                    return
            except Empty:
                pass
Since you didn't provide a more complete code sample, you'll have to figure out where to actually instantiate the multiprocessing.Queue and how to pass things around. Also, the comment from Janne Karila was correct. You should switch your code to use a single Process instead of a pool if you're only using one process at a time anyway. The Process class also has a blocking join() method, which waits for the process to end once its target function has returned. The only safe way to end processes at "known safe points" is to implement some kind of interprocess communication like I've done here. Pipes would work as well.
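For the single-Process variant, a minimal sketch could look like this. It swaps the queue for a multiprocessing.Event as the stop signal, which is a different mechanism from the one above but is checked at the same safe points. The job names and run_demo() are hypothetical, and the "fork" start method is an assumption that limits the sketch to Unix-like systems.

```python
import multiprocessing
import time


def automatizer(stop_event) -> None:
    jobs = ("job1", "job2", "job3")  # hypothetical stand-ins for the real jobs
    while True:
        for _name in jobs:
            time.sleep(0.01)  # pretend to do the job
            # safe stop point: leave only between jobs
            if stop_event.is_set():
                return


def run_demo() -> int:
    # Assumption: a Unix-like OS where the "fork" start method exists
    ctx = multiprocessing.get_context("fork")
    stop_event = ctx.Event()
    worker = ctx.Process(target=automatizer, args=(stop_event,))
    worker.start()
    time.sleep(0.05)          # let it run a few jobs
    stop_event.set()          # ask it to stop at the next safe point
    worker.join(timeout=5)    # wait for the target function to return
    return worker.exitcode
```

An exit code of 0 from run_demo() indicates that the worker returned normally rather than being terminated.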