How to use kqueue for file monitoring in asyncio?

How to use kqueue for file monitoring in asyncio? - python

I want to use kqueue to monitor files for changes. I can see how to use select.kqueue() in a threaded way.
I'm searching for a way to use it with asyncio. I may have missed something really obvious here. I know that python uses kqueue for asyncio on macos. I'm happy for any solution to only work when kqueue selector is used.
So far the only way I can see to do this is create a thread to continually kqueue.control() from another thread and then inject the events in with asyncio.loop.call_soon_threadsafe(). I feel like there should be a better way.

You can add the FD from the kqueue objet as a reader to the control loop using loop.add_reader(). The control loop will then inform you events are ready to collect.
There's two features of doing this which might be odd to those familiar with kqueue:
select.kqueue.control is a one-shot method which first changes the monitor and waits for new events to arrive. Because we don't ever want it to block, the two actions must be split into one non-blocking call to modify the monitor and a second, later, non-blocking call to collect the resulting events.
Because we don't ever want to block, the timeout can never be used. This can be re-implemented with asyncio.wait_for()
There are more efficient ways to write this, but here's an example of how to completely replace select.kqueue.control with an async method (here named kqueue_control):
async def kqueue_control(kqueue: select.kqueue,
changes: Optional[Iterable[select.kevent]],
max_events: int,
timeout: Optional[int]):
def receive_result():
try:
# Events are ready to collect; fetch them but do not block
results = kqueue.control(None, max_events, 0)
except Exception as ex:
future.set_exception(ex)
else:
future.set_result(results)
finally:
loop.remove_reader(kqueue.fileno())
# If this call is non-blocking then just execute it
if timeout == 0 or max_events == 0:
return kqueue.control(changes, max_events, 0)
# Apply the changes, but DON'T wait for events
kqueue.control(changes, 0)
loop = asyncio.get_running_loop()
future = loop.create_future()
loop.add_reader(kqueue.fileno(), receive_result)
if timeout is None:
return await future
else:
return await asyncio.wait_for(future, timeout)

Related

Process I/O bound co-routines in an infinite loop

I have an always-on video stream being processed in an infinite loop. Once a certain object is detected, a second I/O bound method (let's refer to this as FuncIO) is triggered. Ideally, only 1 of FuncIO should run at a time. Once FuncIO completes, the parent loop should continue (i.e., wait for the next trigger of FuncIO).
Here is the pseudocode:
def FuncIO(self):
if self._funcio_running:
# Only 1 instance of FuncIO should run at a time.
# Is this the best place to enforce this?
return
self._funcio_running = true
PerformsBlockingIO()
self._funcio_running = false
return
def main_loop(self):
while True:
if detect_object():
# Run FuncIO asynchronously
else:
# Performs other tasks.
I'm a bit new to asyncio so I would like to know if there is an existing design pattern I can use to handle this scenario.
Thanks!

As far as I understand from your question, you don't leverage the advantages of asynchronous functionality with this approach, and maybe you don't even need it here.
If I didn't understand your question correctly, so there is a special lock mechanism in asyncio, you can read more here, would be something like that:
async def FuncIO(self):
await lock.acquire()
try:
PerformsBlockingIO()
finally:
lock.release()

asyncio loops: how to implement asynio in an existing python program - and share variables/data?

My application needs remote control over SSH.
I wish to use this example: https://asyncssh.readthedocs.io/en/latest/#simple-server-with-input
The original app is rather big, using GPIO and 600lines of code, 10 libraries. so I've made a simple example here:
import asyncio, asyncssh, sys, time
# here would be 10 libraries in the original 600line application
is_open = True
return_value = 0;
async def handle_client(process):
process.stdout.write('Enter numbers one per line, or EOF when done:\n')
process.stdout.write(is_open)
total = 0
try:
async for line in process.stdin:
line = line.rstrip('\n')
if line:
try:
total += int(line)
except ValueError:
process.stderr.write('Invalid number: %s\n' % line)
except asyncssh.BreakReceived:
pass
process.stdout.write('Total = %s\n' % total)
process.exit(0)
async def start_server():
await asyncssh.listen('', 8022, server_host_keys=['key'],
authorized_client_keys='key.pub',
process_factory=handle_client)
loop = asyncio.get_event_loop()
try:
loop.run_until_complete(start_server())
except (OSError, asyncssh.Error) as exc:
sys.exit('Error starting server: ' + str(exc))
loop.run_forever()
# here is the "old" program: that would not run now as loop.run_forever() runs.
#while True:
# print(return_value)
# time.sleep(0.1)
The main app is mostly driven by a while True loop with lots of functions and sleep.
I've commented that part out in the simple example above.
My question is: How should I implement the SSH part, that uses loop.run_forever() - and still be able to run my main loop?
Also: the handle_client(process) - must be able to interact with variables in the main program. (read/write)

You have basically three options:
Rewrite your main loop to be asyncio compatible
A main while True loop with lots of sleeps is exactly the kind of code you want to write asynchronously. Convert this:
while True:
task_1() # takes n ms
sleep(0.2)
task_2() # takes n ms
sleep(0.4)
into this:
async def task_1():
while True:
stuff()
await asyncio.sleep(0.6)
async def task_2():
while True:
stuff()
await asyncio.sleep(0.01)
other_stuff()
await asyncio.sleep(0.8)
loop = asyncio.get_event_loop()
loop.add_task(task_1())
loop.add_task(task_2())
...
loop.run_forever()
This is the most work, but it is almost certain that your current code will be better written, clearer, easier to maintain and easier to develop if written as a bunch of coroutines. If you do this the problem goes away: with cooperative multitasking you tell the code when to yield, so sharing state is generally pretty easy. By not awaiting anything in between getting and using a state var you prevent race conditions: no need for any kind of thread-safe var.
Run your asyncio loop in a thread
Leave your current loop intact, but run your ascynio loop in a thread (or process) with either threading or multiprocessing. Expose some kind of thread-safe variable to allow the background thread to change state, or transition to a (thread safe) messaging paradigm, where the ssh thread emits messages into a queue which your main loop handles in its own time (a message could be something like ("a", 5) which would be handled by doing something like state_dict[msg[0]] == msg[1] for everything in the queue).
If you want to go this way, have a look at the multiprocessing and/or threading docs for examples of the right ways to pass variables or messages between threads. Note that this version will likely be less performant than a pure asyncio solution, particularly if your code is mostly sleeping in the main loop anyhow.
Run your synchronous code in a thread, and have asyncio in the foreground
As #MisterMiyagi points out, asyncio has loop.run_in_executor() for launching a process to run blocking code. It's more generally used to run the odd blocking bit of code without tying up the whole loop, but you can run your whole main loop in it. The same concerns about some kind of thread safe variable or message sharing apply. This has the advantage (as #MisterMiyagi points out) of keeping asyncio where it expects to be. I have a few projects which use background asyncio threads in generally non-asyncio code (event-driven gui code with an asyncio thread interacting with custom hardware over usb). It can be done, but you do have to be careful as to how you write it.
Note btw that if you do decide to use multiple threads, message-passing (with a queue) is usually easier than directly sharing variables.

1-item asyncio queue - is this some standard thing?

In one of my asyncio projects I use one synchronisation method quite a lot and was wondering, if it is some kind of standard tool with a name I could give to google to learn more. I used the term "1-item queue" only because I don't have a better name. It is a degraded queue and it is NOT related to Queue(maxsize=1).
# [controller] ---- commands ---> [worker]
The controller sends commands to a worker (queue.put, actually put_nowait) and the worker waits for them (queue.get) and executes them, but the special rule is that the only the last command is important and immediately replaces all prior unfinished commands. For this reason, there is never more than 1 command waiting for the execution in the queue.
To implement this, the controller clears the queue before the put. There is no queue.clear, so it must discard (with get_nowait) the waiting item, if any. (The absence of queue.clear started my doubts resulting in this question.)
On the worker's side, if a command execution requires a sleep, it is replaced by a newcmd=queue.get with a timeout. When the timeout occurs, it was a sleep; when the get succeeds, the current work is aborted and the execution of newcmd starts.

The type of queue you are using is not standard - there is such a thing as a one-shot queue, but it's a different thing altogether.
The queue doesn't really fit your use case, though you made it work with some effort. You don't really need queuing of any kind, you need a slot that holds a single object (which can be replaced) and a wakeup mechanism. asyncio.Event can be used for the wakeup and you can attach the payload object (the command) to an attribute of the event. For example:
async def worker(evt):
while True:
await evt.wait()
evt.clear()
if evt.last_command is None:
continue
last_command = evt.last_command
evt.last_command = None
# execute last_command, possibly with timeout
print(last_command)
async def main():
evt = asyncio.Event()
workers = [asyncio.create_task(worker(evt)) for _ in range(5)]
for i in itertools.count():
await asyncio.sleep(1)
evt.last_command = f"foo {i}"
evt.set()
asyncio.run(main())
One difference between this and the queue-based approach is that setting the event will wake up all workers (if there is more than one), even if the first worker immediately calls evt.clear(). A queue item, on the other hand, will be guaranteed to be handed off to a single awaiter of queue.get().

Calling another function asynchronously and never wait it to finish in Python

I am working on a chatbot, where before I reply to the user I make a DB call to save the chat in a table. This will be done each time user types something, and it increases the response time.
So to decrease the response time, we need to call this asynchronously.
How to do this in Python 3?
I have read tutorials of asyncio library, but did not understand it completely and could not understand how to make it work.
Another workaround is to use queuing system, but that sounds like an overkill.
Example:
request = get_request_from_chat
res = call_some_function_to_prepare_response()
save_data() # this will be call asynchronously
reply() # this should not wait save_data() to finish
Any suggestions are welcome.

Use loop.create_task(some_async_function()) to run an async function "in the background". For example, this answer shows how to do that in case of a trivial client-server communication.
In your case the pseudo-code would look like this:
request = await get_request_from_chat()
res = call_some_function_to_prepare_response()
loop = asyncio.get_event_loop()
loop.create_task(save_data()) # runs in the "background"
reply() # doesn't wait for save_data() to finish
For this to work, of course, the program must be written for asyncio and save_data must be a coroutine. For a chat server it's a good approach to follow anyway, so I would recommend to give asyncio a chance.

Because you mentioned
Another workaround is to use queuing system, but that sounds like an
overkill.
I assume you are open to other solutions so I will propose multi-threading approach:
from concurrent.futures import ThreadPoolExecutor
from time import sleep
def long_runnig_funciton(param1):
print(param1)
sleep(10)
return "Complete"
with ThreadPoolExecutor(max_workers=10) as executor:
future = executor.submit(long_runnig_funciton,["Param1"])
print(future.result(timeout=12))
Steps:
1) You create a ThreadPoolExecutor and define maximum number of concurrent tasks.
2) You submit a function with arguments it needs
3) You call result() on the return value from submit() when you need the results
Note that the result() can throw exception if exception was thrown in the submitted function
You can also check if the result of your call is ready with future.done() which returns True or False

Necessity of closing asyncio event loop explicitly

The Story:
I am currently looking through the asyncio basic examples, in particular this one - the simplest possible HTTP client. The main function starts an event loop, runs until the data fetching is complete and closes the event loop:
def main():
loop = get_event_loop()
try:
body = loop.run_until_complete(fetch())
finally:
loop.close()
print(body.decode('latin-1'), end='')
But, the code also works if I omit the loop.close():
def main():
loop = get_event_loop()
body = loop.run_until_complete(fetch())
print(body.decode('latin-1'), end='')
The Question:
While there is an example, the question is a generic one - what can potentially go wrong if one would forget to close the asyncio event loop? Is the event loop going to be always implicitly closed?

.close() can be used by different event loop implementations to free up system resources allocated by the loop (or do anything else). If you'll take a look at the code of _UnixSelectorEventLoop, which is the (default) IOLoop used in Linux, you would find the following code:
def close(self):
super().close()
for sig in list(self._signal_handlers):
self.remove_signal_handler(sig)
Here, for example, close() removes signal handlers registered with loop.add_signal_handler().
As multiple IOLoops can be started on different threads, or new IOLoops can be created after an old one is closed, (see asyncio.new_event_loop()), closing them should be considered as a good habit.
Update
Starting with Python 3.7 it is recommended to use asyncio.run instead of run_until_complete():
# Python 3.7+
def main():
body = asyncio.run(fetch())
print(body.decode('latin-1'), end='')
Among other things, asyncio.run takes care of finally close()ing the loop.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

How to use kqueue for file monitoring in asyncio? - python

Related

Process I/O bound co-routines in an infinite loop

asyncio loops: how to implement asynio in an existing python program - and share variables/data?

1-item asyncio queue - is this some standard thing?

Calling another function asynchronously and never wait it to finish in Python

Necessity of closing asyncio event loop explicitly

Categories

Resources