The goal is to create a docx document in parallel with the rest of the program's execution.
The "first" function should just call the asynchronous "second", which creates the docx.
Right now I'm using the asyncio, multiprocessing, and concurrent.futures modules, but the docx is never created:
def first(self, event):
    pool = ThreadPoolExecutor(max_workers=multiprocessing.cpu_count())
    loop = asyncio.get_event_loop()
    loop.run_in_executor(pool, self.second)

async def second(self):
    document = Document()
    document.save('test.docx')
I'm sure the problem is in the "first" function and the way it calls "second", although someone told me that asynchrony isn't at fault. Before I arrived at this attempt, I kept running into the same issue: the document was only created after the entire program had finished executing, which is not the goal.
I'm working on an old project that there is no time to fix properly; it has a lot of errors in basic things, so searching the web didn't help much - this needs something specific to the situation. Please tell me how to solve the problem.
Thanks.
There's no need to make second async. I'll presume you can change it to a regular function.
You probably just want to start the file creation in a background OS thread:
def first():
    with ThreadPoolExecutor(max_workers=1) as executor:
        fut = executor.submit(second)  # start `second` in background
        # rest of the program
        fut.result()  # make sure `second` is finished

def second():
    document = Document()
    document.save('test.docx')
If the bottleneck is disk I/O, this should do the trick. If the bottleneck is CPU, consider using ProcessPoolExecutor instead of ThreadPoolExecutor.
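A minimal sketch of that CPU-bound variant (assuming second is a plain module-level function, which it must be so it can be pickled and sent to the worker process):

from concurrent.futures import ProcessPoolExecutor
from docx import Document  # python-docx

def second():
    document = Document()
    document.save('test.docx')

def first():
    with ProcessPoolExecutor(max_workers=1) as executor:
        fut = executor.submit(second)  # runs `second` in a separate process
        # rest of the program
        fut.result()                   # make sure `second` is finished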
Here's reproducible code to play with:
import time
from concurrent.futures import ThreadPoolExecutor

def first():
    with ThreadPoolExecutor(max_workers=1) as executor:
        fut = executor.submit(second)  # start `second` in background
        print('Rest of the program started')
        time.sleep(2)  # rest of the program
        print('Rest of the program finished')
        fut.result()  # make sure `second` is finished

def second():
    time.sleep(1)  # create doc
    print('Doc created')

first()
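If the rest of the program is itself asyncio code, the same idea can also be expressed through the event loop; here is a minimal sketch of that variant, assuming second stays a plain synchronous function as above:

import asyncio
from concurrent.futures import ThreadPoolExecutor
from docx import Document  # python-docx

def second():
    document = Document()
    document.save('test.docx')

async def main():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=1) as pool:
        fut = loop.run_in_executor(pool, second)  # starts `second` in the pool right away
        await asyncio.sleep(2)                    # rest of the (async) program
        await fut                                 # make sure the docx is written

asyncio.run(main())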
I'm using a library that itself makes the call to asyncio.run(internal_function) so I can't control that at all. I do however have access to the event loop, it's something that I pass into this library.
Given that, is there some way I can set up a recurring async task that will execute every X seconds while the main library is running?
This doesn't exactly work, but maybe it's close?
import asyncio
from third_party import run

loop = asyncio.new_event_loop()

async def periodic():
    while True:
        print("doing a thing...")
        await asyncio.sleep(30)

loop.create_task(periodic())
run(loop)  # internally this will call asyncio.run() using the given loop
The problem here of course is that the task I've created is never awaited. But I can't just await it, because that would block.
Edit: Here's a working example of what I'm facing. When you run this code you will only ever see "third party code executing" and never see "doing my stuff...".
import asyncio

# I don't know how the loop argument is used
# by the third party's run() function,
def third_party_run(loop):
    async def runner():
        while True:
            print("third party code executing")
            await asyncio.sleep(5)

    # but I do know that this third party eventually runs code
    # that looks **exactly** like this.
    try:
        asyncio.run(runner())
    except KeyboardInterrupt:
        return

loop = asyncio.new_event_loop()

async def periodic():
    while True:
        print("doing my stuff...")
        await asyncio.sleep(1)

loop.create_task(periodic())
third_party_run(loop)
If you run the above code you get:
third party code executing
third party code executing
third party code executing
^CTask was destroyed but it is pending!
task: <Task pending name='Task-1' coro=<periodic() running at example.py:22>>
/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/base_events.py:674: RuntimeWarning: coroutine 'periodic' was never awaited
You don't need to await a created task.
It will run in the background as long as the event loop is active and is not stuck in a CPU-bound operation.
According to your comment, you don't have access to the event loop. In that case you don't have many options other than running in a different thread (which will have its own loop), or changing the event loop policy in order to get hold of the loop, which is a very bad idea in most cases.
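The "different thread" option could look roughly like this; it's a minimal sketch, assuming the periodic work doesn't need to share any objects with the library's event loop:

import asyncio
import threading

async def periodic():
    while True:
        print("doing my stuff...")
        await asyncio.sleep(1)

def start_periodic_thread():
    # a dedicated loop in a daemon thread, independent of the library's loop
    t = threading.Thread(target=lambda: asyncio.run(periodic()), daemon=True)
    t.start()
    return t

start_periodic_thread()
# third_party_run(loop) then runs in the main thread as before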
I found a way to make your test program run. However, it's a hack. It could fail, depending on the internal design of your third party library. From the information you provided, the library has been structured to be a black box. You can't interact with the event loop or schedule a callback. It seems like there might be a very good reason for this.
If I were you I would try to contact the library designer and let him know what your problem is. Perhaps there is a better solution. If this is a commercial project, I would make 100% certain that the team understands the issue, before attempting to use my below solution or anything like it.
The script below overrides one method (new_event_loop) in the DefaultEventLoopPolicy. When this method is called, I create a task in this loop to execute your periodic function. I don't know how often, or for what purpose, the library will call this function. Also, if the library internally overrides the EventLoopPolicy then this solution will not work. In both of these cases it may lead to unforeseeable consequences.
OK, enough disclaimers.
The only significant change to your test script was to replace the infinite loop in runner with a one that times out. This allowed me to verify that the program shuts down cleanly.
import asyncio

# I don't know how the loop argument is used
# by the third party's run() function,
def third_party_run():
    async def runner():
        for _ in range(4):
            print("third party code executing")
            await asyncio.sleep(5)

    # but I do know that this third party eventually runs code
    # that looks **exactly** like this.
    try:
        asyncio.run(runner())
    except KeyboardInterrupt:
        return

async def periodic():
    while True:
        print("doing my stuff...")
        await asyncio.sleep(1)

class EventLoopPolicyHack(asyncio.DefaultEventLoopPolicy):
    def __init__(self):
        self.__running = None
        super().__init__()

    def new_event_loop(self):
        # Override to create our periodic task in the new loop.
        # Get a loop from the superclass.
        # This method must return that loop.
        print("New event loop")
        loop = super().new_event_loop()
        if self.__running is not None:
            self.__running.cancel()  # I have no way to test this idea
        self.__running = loop.create_task(periodic())
        return loop

asyncio.set_event_loop_policy(EventLoopPolicyHack())
third_party_run()
We have a rather big project that does a lot of networking (API calls, WebSocket messages) and also has a lot of internal jobs running at intervals in threads. Our current architecture involves spawning a lot of threads, and the app does not work very well when the system is under heavy load, so we've decided to give asyncio a try.
I know that the best way would be to migrate the whole codebase to async code, but that is not realistic in the very near future because of the size of the codebase and the limited development resources. However, we would like to start migrating parts of our codebase to use asyncio event loop and hopefully, we will be able to convert the whole project at some point.
The problem we have encountered so far is that the whole codebase is sync code, and in order to add non-blocking asyncio code, it needs to run in a different thread, since you can't really mix async and blocking sync code in the same thread.
In order to combine async and sync code, I came up with this approach of running the asyncio code in a separate thread that is created on app start. Other parts of the code add jobs to this loop simply by calling add_asyncio_task.
import threading
import asyncio

_tasks = []

def threaded_loop(loop):
    asyncio.set_event_loop(loop)
    global _tasks
    while True:
        if len(_tasks) > 0:
            # create a copy of needed tasks
            needed_tasks = _tasks.copy()
            # flush current tasks so that next tasks can be easily added
            _tasks = []
            # run tasks
            task_group = asyncio.gather(*needed_tasks)
            loop.run_until_complete(task_group)

def add_asyncio_task(task):
    _tasks.append(task)

def start_asyncio_loop():
    loop = asyncio.get_event_loop()
    t = threading.Thread(target=threaded_loop, args=(loop,))
    t.start()
and somewhere in app.py:
start_asyncio_loop()
and anywhere else in the code:
add_asyncio_task(some_coroutine)
Since I am new to asyncio, I am wondering whether this is a good approach in our situation, or whether it is considered an anti-pattern with problems that will hit us down the road. Or maybe asyncio already has a solution for this and I'm just reinventing the wheel here?
Thanks for your inputs!
The approach is fine in general. You have some issues though:
(1) Almost all asyncio objects are not thread-safe.
(2) Your code is not thread-safe on its own. What if a task appears after needed_tasks = _tasks.copy() but before _tasks = []? You need a lock here. By the way, making a copy is pointless: a plain needed_tasks = _tasks will do, since you rebind _tasks to a new list right afterwards.
(3) Some asyncio constructs are thread-safe. Use them:
import threading
import asyncio

# asyncio.get_event_loop() creates a new loop per thread. Keep
# a single reference to the main loop. You can even try
# _loop = asyncio.new_event_loop()
_loop = asyncio.get_event_loop()

def get_app_loop():
    return _loop

def asyncio_thread():
    loop = get_app_loop()
    asyncio.set_event_loop(loop)
    loop.run_forever()

def add_asyncio_task(task):
    asyncio.run_coroutine_threadsafe(task, get_app_loop())

def start_asyncio_loop():
    t = threading.Thread(target=asyncio_thread)
    t.start()
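On the calling side it could look roughly like this (some_coroutine is a placeholder); run_coroutine_threadsafe returns a concurrent.futures.Future, so the sync code can also wait for or inspect the result if it needs to:

import asyncio

async def some_coroutine():
    await asyncio.sleep(1)
    return 42

start_asyncio_loop()

# fire and forget
add_asyncio_task(some_coroutine())

# or keep the Future if the result is needed later
fut = asyncio.run_coroutine_threadsafe(some_coroutine(), get_app_loop())
print(fut.result(timeout=5))  # blocks the calling thread for up to 5 seconds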
I am trying to run a function in the background until some work is done in the main function, and then finish the thread. I have implemented the threading logic in a separate class and the main code in another file. But every time I run it, the target function only seems to run once and then wait.
Here is the main function
from ThreadRipper import *
thread_obj=ThreadRipper()
thread_obj.start_thread()
squish.snooze(10)
print("Main Continuing")
thread_obj.stop_thread()
And the implemented class is as follows
class ThreadRipper():
    def __init__(self):
        lib_path = "iCOMClient.dll"
        self.vx = IcomVx(lib_path)
        config = ConfigParser.SafeConfigParser(allow_no_value=True)
        config.readfp(open("vx_logger_config.cfg"))
        self.vx.connect(config.get("icom", "ip"), timeout_millis=30000)
        self.t = threading.Thread(target=self.task_to_do, args=(self.vx,))

    def stop_thread(self):
        self.t.do_run = False
        self.t.join()

    def start_thread(self):
        self.t.start()

    def task_to_do(self, arg):
        current_thread = threading.currentThread()
        while getattr(current_thread, "do_run", True):
            with open("vx.txt", 'a') as f:
                f.write(str(arg.get_next_message()._rawmsg) + "\n")
            time.sleep(1)
            print("Stopping")
            arg.disconnect()
When I run this, the vx.txt file is created but with only one entry; I expected it to be written to continuously until the while loop exits. I am quite new to threading and may have misunderstood something. Please advise.
Thank you
The reason is probably because
print("Stopping")
arg.disconnect()
are both inside the while loop. After disconnecting, arg doesn't seem to produce any more messages.
(Unless, of course, your code in the question is not what you really have, but in this case, you surely would have edited your question so it matches.)
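In other words, dedenting those two lines so they only run after the loop exits should give the expected behaviour; here is a sketch of the corrected task_to_do, using the same names as in the question:

def task_to_do(self, arg):
    current_thread = threading.currentThread()
    while getattr(current_thread, "do_run", True):
        with open("vx.txt", 'a') as f:
            f.write(str(arg.get_next_message()._rawmsg) + "\n")
        time.sleep(1)
    # reached only after stop_thread() sets do_run to False
    print("Stopping")
    arg.disconnect()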
I want to run a function independently. From the function I call it in, I want to return without waiting for the other function to finish.
I tried threading, but it waits for the end.
thread = threading.Thread(target=myFunc)
thread.daemon = True
thread.start()
return 'something'
Is it possible to return immediately while the other function keeps running?
Thanks for the answers.
EDITED
The working code looks like:
import concurrent.futures
executor = concurrent.futures.ThreadPoolExecutor(2)
executor.submit(myFunc, arg1, arg2)
You are more or less asking the following question:
Is it possible to run function in a subprocess without threading or writing a separate file/script
You have to change the example code from the link like this:
from multiprocessing import Process

def myFunc():
    pass  # whatever function you like

if __name__ == '__main__':  # guard needed on platforms that start processes by spawning
    p = Process(target=myFunc)
    p.start()  # start execution of myFunc() asynchronously
    print('something')
p.start() returns immediately, i.e. 'something' is printed right away no matter how time-consuming the execution of myFunc() is. The script executes myFunc() in the child process and does not wait for it to finish.
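If at some later point you do need to be sure myFunc has finished (for example, before the main script exits), keep the Process object around and join it; a small sketch along those lines:

from multiprocessing import Process

def myFunc():
    ...  # long-running work

if __name__ == '__main__':
    p = Process(target=myFunc)
    p.start()
    print('something')  # printed immediately
    # ... do other work ...
    p.join()            # only needed if you eventually have to wait for myFunc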
If I understood your request correctly, you might want to take a look at worker queues:
https://www.djangopackages.com/grids/g/workers-queues-tasks/
Basically, it's not a good idea to offload the work to a thread created in a view; this is usually handled by having a pool of background workers (processes or threads) and a queue for incoming tasks.
That said, I think the syntax you are using is correct, and I don't see why your request shouldn't return immediately. Did you verify that the request actually hangs until the thread is over?
I would suggest having myFunc write to a file so you can track this:
def myFunc():
    with open('file.txt', 'w') as f:
        while True:
            f.write('hello world\n')
            f.flush()  # flush so the output is visible while the loop is still running
I'm trying to accomplish something without using threading
I'd like to execute a function from within another function, but I don't want the first function's flow to stop. It's just a procedure, I don't expect any return value, and I need the caller to keep executing for other reasons.
Here is a snippet code of what I'd like to do:
def foo():
    a = 5
    dosomething()
    # I don't want to wait until dosomething finishes; just call it and carry on
    return a
Is there any way to do this?
Thanks in advance.
You can use concurrent.futures (https://docs.python.org/3/library/concurrent.futures.html) to achieve fire-and-forget behavior.
from concurrent.futures import ThreadPoolExecutor

# Note: a `with ThreadPoolExecutor(...)` block would wait for pending futures
# on exit, so keep the executor alive instead of using it as a context manager.
executor = ThreadPoolExecutor(max_workers=1)

def foo():
    a = 5
    future = executor.submit(dosomething)
    future.add_done_callback(on_something_done)
    # print(future.result())
    # continue without waiting for dosomething()
    # future.cancel()  # to cancel dosomething
    # future.done()    # returns True if done
    return a

def on_something_done(future):
    print(future.result())
[updates]
concurrent.futures is built into the standard library since Python 3.2.
For Python 2.x you can install the futures backport (e.g. futures 2.1.6) from PyPI.
A normal Python function call is synchronous, so you'll have to use some form of asynchronous processing to accomplish this.
While there are many, many ways to execute a function asynchronously, one way is to use python-rq. Python-rq allows you to queue jobs for processing in the background with workers. It is backed by Redis and designed to have a low barrier to entry, so it should integrate into your web stack easily.
For example:
from rq import Queue, use_connection

def foo():
    use_connection()
    q = Queue()
    # do some things
    a = 5
    # now process something else asynchronously
    q.enqueue(do_something)
    # do more here
    return a
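Note that the enqueued job only executes once an rq worker is running (started from the shell with the rq worker command), and do_something must live in a module the worker can import. A minimal placeholder for such a job module (the tasks.py name and the body are just illustrative):

# tasks.py - importable by both the web process and the rq worker
import time

def do_something():
    time.sleep(5)  # stand-in for the real work
    return "done"

foo() would then import it from that module and call q.enqueue(do_something) as above.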