Tracking the progress of a function in another function - Python

I have a Python back end running Flask, and it is going to have a function (or a few functions chained together) which will run several AJAX calls and perform some database operations.
This will take a while, so on the front end I'm looking to poll the server at regular intervals and update the UI as progress is made. A general outline might be something like this:
@app.route('/update', methods=['GET'])
def getUpdate():
    # return a response with the current status of the update
    ...

@app.route('/update', methods=['POST'])
def runUpdate():
    # asynchronously call update() and return status
    ...

def update():
    # perform AJAX calls
    # update database
    # query database
    # ...
    ...
I considered WebSockets, but I don't know if that's making things a little too complex for just a simple update in the UI. I know I could also use a module-scoped variable or store the status in a database table, but either of those feels like bad design to me. Is there a simple pattern I can use to achieve this?

Use a database to store the status. If you use something like Redis, you can even do it in real time with pub/sub and WebSockets.
A module-scoped variable is a bad choice: it doesn't scale beyond a single process.
If it is a long-running task, consider using a task queue, like RQ or Celery.
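For illustration, a minimal sketch of the status-in-a-database idea using Redis (this assumes a local Redis server and the redis-py client; the key name and set_status helper are illustrative, not from the answer):

import json

import redis
from flask import Flask

app = Flask(__name__)
r = redis.Redis()

def set_status(current, total):
    # called from update() as work progresses
    r.set('update:status', json.dumps({'current': current, 'total': total}))

@app.route('/update', methods=['GET'])
def getUpdate():
    # polled by the front end; returns the last status written by update()
    raw = r.get('update:status')
    return raw or json.dumps({'state': 'idle'}), 200, {'Content-Type': 'application/json'}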

Related

Update variable in the background

So I'm currently writing a Flask application, and I am new to Flask.
There is some processing going on, which I outsourced to a separate function.
As this processing takes some time, I wanted to give the user a progress update on how many iterations have passed. No problem so far.
However, as soon as I call render_template, the function ends and I cannot update that variable anymore.
I was imagining a loop: if the variable changes, render the template with the new value as input.
But after the first iteration, the loop would break.
Currently, render_template renders an HTML page, which just displays the variable it receives. I want to update that variable as soon as it changes.
Do you have any suggestions on how I could achieve this "background update"?
Cheers and thanks!
You need some kind of ongoing request/response cycle.
As soon as your app sends the response with the rendered template back to the browser, this connection is closed and there's no way to send any more data.
There are a few things that need to happen in order to accomplish what you want:
1. The long-running function needs to run in the background so it doesn't block execution of the rest of the application.
2. There has to be a way to get a status update out of the long-running function.
3. The client (i.e. the browser) needs a way to receive the status updates.
Points 1 and 2 can be solved with Celery. It lets you run tasks in the background, and a task can send status information through a side channel to be consumed elsewhere.
The easiest way to achieve point 3 is to set up a route in your Flask application that returns information about the task, and to request it periodically from the browser with some JavaScript. The nicer method in my opinion would be to use WebSockets to actively push the information to the client, but this is a bit more complicated.
This is just a rough outline, but Miguel Grinberg has a tutorial on how to set this up using Celery and polling from JavaScript.
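For reference, a minimal sketch of that pattern (assuming a Redis broker on localhost and Celery configured alongside Flask; long_task and the route name are illustrative):

from celery import Celery
from flask import Flask, jsonify

app = Flask(__name__)
celery = Celery(app.name, broker='redis://localhost:6379/0',
                backend='redis://localhost:6379/0')

@celery.task(bind=True)
def long_task(self):
    total = 100
    for i in range(total):
        # ... one unit of work ...
        self.update_state(state='PROGRESS', meta={'current': i, 'total': total})
    return {'current': total, 'total': total}

@app.route('/status/<task_id>')
def task_status(task_id):
    # polled from the browser, e.g. with setInterval and fetch()
    task = long_task.AsyncResult(task_id)
    meta = task.info if isinstance(task.info, dict) else {}
    return jsonify(state=task.state, **meta)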

submitting multiple POST requests as multiple threads in Python/Django

My development stack is Django/Python (that includes Django REST framework).
I am currently looking for ways to make multiple distinct API calls.
client.py
import requests

def submit_order(list_of_orders):
    # submit each element in list_of_orders in a thread
    for order in list_of_orders:
        try:
            response = requests.post("https://www.confidential.url.com", data=order)
        except requests.RequestException:
            pass  # retry again
        else:
            if response.status_code != 200:
                pass  # retry again
In the above method I am currently submitting orders one by one; I want to submit all orders at once. Secondly, I want to retry a submission up to x times if it fails.
I currently do not know how best to achieve this.
I am looking for ways that Python libraries or Django applications provide rather than re-inventing the wheel.
Thanks
As @Selcuk said, you can try django-celery, which is the recommended approach in my opinion, but you will need to do some configuration and read some manuals.
On the other hand, you can try using multiprocessing like this:
from multiprocessing import Pool

def process_order(order):
    # handle each order here, calling requests.post and retrying if necessary
    pass

def submit_order(list_of_orders):
    orders_pool = Pool(len(list_of_orders))
    results = orders_pool.map(process_order, list_of_orders)
    # do something with the results here
It will depend on what you need to get done. If the request operations can run in the background and your API user can be notified later, just use django-celery and notify the user accordingly; but if you want a simple approach that reacts immediately, you can use the one I "prototyped" for you.
You should also allow for some delay in the responses to your requests (as you are making POST requests), so make sure the number of POST requests doesn't grow too large, because that could affect the experience of the API clients calling your services.
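Not from the answer above, but for the retry requirement specifically, requests can do the retrying for you via urllib3's Retry (a sketch; allowed_methods requires urllib3 >= 1.26, and POST is not retried by default):

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retries = Retry(total=3, backoff_factor=0.5,
                status_forcelist=[500, 502, 503, 504],
                allowed_methods={'POST'})  # opt in to retrying POST
session.mount('https://', HTTPAdapter(max_retries=retries))

def process_order(order):
    # retries and backoff happen inside session.post
    response = session.post('https://www.confidential.url.com', data=order)
    return response.status_code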

Perform Task Directly After Returning JSON

I need to perform a task whenever the mobile app requests certain data. The user does not need the task performed right away, but may need it within the next 2 minutes.
I am still fairly new to Python / web dev so I am not quite sure how to accomplish this.
I don't want the user to wait while the task is performed; it'll probably take 30 seconds, and I'd rather the response be 30 seconds faster.
Is there any way I can send the response first, so that the user gets the required info immediately, and then perform the task right after sending the JSON?
Is it possible to send a Response to the mobile app that asked for the data without using return, so that the method can continue and perform the task the user does not need to wait for?
@app.route('/image/<image_id>/')
def images(image_id):
    # get the resource (unnecessary code removed)
    return Response(js, status=200, mimetype='application/json')
    # once the JSON response is returned, do some action
    # (what I would like to do somehow, but don't know how to get it to work)
On second thought, maybe I need to perform this action asynchronously so it does not block the route (but it still needs to be done right after returning the JSON).
UPDATE - in response to some answers
For me to perform such tasks, is a worker server on Heroku recommended / a must, or is there another, cheaper way to do this?
You can create a second thread to do the extra work:
import threading

t = threading.Thread(target=some_function, args=[argument])
t.daemon = False  # keep the process alive until the thread finishes
t.start()
You should also take a look at Celery or python-rq.
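For example, a minimal python-rq sketch (assuming a Redis server on localhost and a separate `rq worker` process running; some_function is the same placeholder as above):

from redis import Redis
from rq import Queue

q = Queue(connection=Redis())
job = q.enqueue(some_function, argument)  # returns immediately; the worker runs it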
Yes, you need a task queue. There are a couple of options.
Look at this other question: uWSGI for uploading and processing files
And of course your code is wrong, since once you return, you terminate execution of the function you're in.

Is anyone doing asynchronous DB commits?

Most of the longest (most time-consuming) logic I've encountered basically involves two things: sending email and committing items to the database.
Is there any kind of built-in mechanism for doing these things asynchronously so as not to slow down page load?
Validation should be handled synchronously, but it really seems that the most performant way to email and write to the database should be asynchronously.
For example, let's say that I want to track pageviews. Thus, every time I get a view, I do:
pv = PageView.objects.get(page=request.path)
pv.views = pv.views + 1
pv.save()  # SLOWWWWWWWWWWWWWW
Is it natural to think that I should speed this up by making the whole process asynchronous?
Take a look at Celery. It gives you asynchronous workers to offload tasks exactly like you're asking about: sending e-mails, counting page views, etc. It was originally designed to work only with Django, but now works in other environments too.
I use this pattern to update the text index (which is slow), since this can be done in the background. This way the user sees a fast response time:
import os

# create an empty file, like the "touch" shell command
marker_dir = os.path.join(settings.DT.HOME, 'var', 'belege-changed')
marker_file = os.path.join(marker_dir, str(self.id))
fd = open(marker_file, 'a')
fd.close()
A cron-job scans this directory every N minutes and updates the text index.
In your case, I would write request.path to a file and update the PageView model in the background. This improves performance, since you don't need to hit the database for every increment.
You can use a Python thread pool and hand the database writes off to it. Although the GIL prevents Python threads from running concurrently, this still lets the response continue before the write is finished.
I use this technique when the result of the write is not important for rendering the response.
Of course, if you are handling a POST request and want to return a 201, this is not a good practice.
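The answer names no specific library, but a minimal sketch of the idea with concurrent.futures might look like this (increment_pageview and my_view are illustrative names):

from concurrent.futures import ThreadPoolExecutor

write_pool = ThreadPoolExecutor(max_workers=4)

def increment_pageview(path):
    # the slow write from the question, now off the request thread
    pv = PageView.objects.get(page=path)
    pv.views = pv.views + 1
    pv.save()

def my_view(request):
    write_pool.submit(increment_pageview, request.path)  # don't wait for it
    ...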
http://www.mongodb.org/ can do this.

Is there any way to make an asynchronous function call from Python [Django]?

I am creating a Django application that does various long computations with uploaded files. I don't want to make the user wait for the file to be handled - I just want to show the user a page reading something like 'file is being parsed'.
How can I make an asynchronous function call from a view?
Something that might look like this:
def view(request):
    ...
    if form.is_valid():
        form.save()
        async_call(handle_file)
    return render_to_response(...)
Rather than trying to manage this via subprocesses or threads, I recommend you separate it out completely. There are two approaches: the first is to set a flag in a database table somewhere, and have a cron job running regularly that checks the flag and performs the required operation.
The second option is to use a message queue. Your file upload process sends a message on the queue, and a separate listener receives the message and does what's needed. I've used RabbitMQ for this sort of thing, but others are available.
Either way, your user doesn't have to wait for the process to finish, and you don't have to worry about managing subprocesses.
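A minimal sketch of the first approach (the model and field names are hypothetical, not from the answer):

from django.db import models

class UploadedFile(models.Model):
    file = models.FileField(upload_to='uploads/')
    processed = models.BooleanField(default=False)

# run from cron every few minutes, e.g. as a management command
def process_pending():
    for upload in UploadedFile.objects.filter(processed=False):
        handle_file(upload.file)  # the long computation from the question
        upload.processed = True
        upload.save()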
I tried to do the same and failed after multiple attempts due to the nature of Django and other asynchronous calls.
The solution I came up with, which may be a bit over the top for you, is to have another asynchronous server in the background processing message queues from the web requests and sending back chunked JavaScript which is parsed directly by the browser in an asynchronous way (i.e. AJAX).
Everything is made transparent to the end user via a mod_proxy setting.
Unless you specifically need to use a separate process (which seems to be the gist of the other questions S.Lott indicated as duplicates of yours), the threading module from the Python standard library (documented here) may offer the simplest solution. Just make sure that handle_file does not access any globals that might get modified, and especially does not modify any globals itself; ideally it should communicate with the rest of your process only through Queue instances; etc, etc, all the usual recommendations about threading ;-).
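A minimal sketch of that suggestion (the queue and worker names are illustrative):

import threading
import queue

file_queue = queue.Queue()

def worker():
    while True:
        uploaded = file_queue.get()
        handle_file(uploaded)  # touches no shared globals, per the advice above
        file_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

# in the view, instead of async_call(handle_file):
# file_queue.put(form.instance)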
threading will break runserver if I'm not mistaken. I've had good luck with multiprocessing in request handlers, both with mod_wsgi and with runserver. Maybe someone can enlighten me as to why this is bad:
def _bulk_action(action, objs):
    # mean ponies here
    ...

def bulk_action(request, t):
    ...
    objs = model.objects.filter(pk__in=pks)
    if request.method == 'POST':
        objs.update(is_processing=True)
        from multiprocessing import Process
        p = Process(target=_bulk_action, args=(action, objs))
        p.start()
        return HttpResponseRedirect(next_url)
    context = {'t': t, 'action': action, 'objs': objs, 'model': model}
    return render_to_response(...)
http://docs.python.org/library/multiprocessing.html (new in Python 2.6)
