How to handle simultaneous requests in Django? - python

I have a standard function-based view in Django which receives some parameters via POST after the user has clicked a button, computes something and then returns a template with context.
@csrf_exempt
def myview(request, param1, param2):
    if request.method == 'POST':
        return HttpResponseRedirect(reverse("app1:view_name", args=[param1, param2]))
    '''Calculate and database r/w'''
    template = loader.get_template('showData.html')
    return HttpResponse(template.render(context, request))
It works with no problem as long as one request is processed at a time (tested both with runserver and on an Apache server).
However, when I use two devices and click the button simultaneously on each, the requests get mixed up and run concurrently, and the site ends up throwing a 500 error, a 404, or sometimes succeeds but cannot GET static files (again, tested both with runserver and Apache).
How can I force Django to finish the execution of the current request before starting the next?
Or is there a better way to tackle this?
Any light on this will be appreciated. Thanks!

To coordinate threads within a single server process, use
from threading import RLock
lock = RLock()
and then within myview:
lock.acquire()
... # get template, render it
lock.release()
You might start your server with $ uwsgi --processes 1 --threads 2 ...
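For illustration, here is a minimal, Django-free sketch of the same idea with the lock used as a context manager, so it is released even if the critical section raises; the shared counter below just stands in for the template/database work in the view:

```python
from threading import RLock, Thread

lock = RLock()               # module-level, shared by all threads in the process
counter = {"value": 0}       # stand-in for shared state touched by the view

def critical_section():
    # Only one thread at a time runs this block; `with` releases
    # the lock even if the body raises an exception.
    with lock:
        current = counter["value"]
        counter["value"] = current + 1

threads = [Thread(target=critical_section) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter["value"])  # 100 -- no increments lost to interleaving
```

Note that a `threading` lock only coordinates threads within one process; with multiple uWSGI/Apache worker processes each gets its own copy of the lock, which is why the cross-process options further down exist.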

Django's development server on a local machine is not for production use, and it processes one request at a time. In production you need a WSGI server, like uWSGI, with which your app can be set up to serve more than one request at a time. Check https://docs.djangoproject.com/en/2.1/howto/deployment/wsgi/uwsgi/

I'm posting my solution in case it's of any help to others.
In the end I configured Apache with the prefork MPM to isolate requests from each other. According to the documentation, prefork is advised for sites using non-thread-safe libraries (my case, apparently).
With this fix Apache handles simultaneous requests well. However, I'd still be glad to hear other suggestions!

There should be a way to rewrite the code so that things do not get mixed up (at least in many cases this is possible).
One prerequisite (if your server uses threading) is to write thread-safe code.
That means not using global variables (which is bad practice anyway), or protecting them with locks,
and making no calls to functions that aren't thread-safe (or protecting those with locks as well).
As you don't provide any details, we cannot help with that part (that = finding a way to keep data integrity without making the whole request blocking).
Otherwise you could use a mutex/lock that works across multiple processes.
You could, for example, try to acquire a lock on a file via
https://pypi.org/project/filelock/ and block until the file is unlocked by the other view.
example code (after pip installing filelock):
from filelock import FileLock

lock = FileLock("my.lock")

with lock:
    if request.method == 'POST':
        return HttpResponseRedirect(reverse("app1:view_name", args=[param1, param2]))
    '''Calculate and database r/w'''
    template = loader.get_template('showData.html')
    return HttpResponse(template.render(context, request))
If you use uwsgi, then you could look at the uwsgi implementation of locks:
https://uwsgi-docs.readthedocs.io/en/latest/Locks.html
Here the example code from the uwsgi documentation:
def use_lock_zero_for_important_things():
    uwsgi.lock()  # Implicit parameter 0
    # Critical section
    uwsgi.unlock()  # Implicit parameter 0

def use_another_lock():
    uwsgi.lock(1)
    time.sleep(1)  # Take that, performance! Ha!
    uwsgi.unlock(1)

Related

Degraded performance using python's 'threading' on a Nginx server comparing to a local machine

I've built a Flask application that computes some paths in a graph. Usually it's a very greedy task, and it takes a lot of time to finish the calculations. While I was busy configuring the algorithm, I didn't really pay attention to the server-side implementation. We've set up an Nginx server that serves the whole thing. Here's the main Flask route:
@app.route('/paths', methods=['POST'])
def paths():
    form = SampleForm(request.form)
    if form.validate_on_submit():
        point_a = form.point_a.data
        point_b = form.point_b.data
        start = form.start.data.strftime('%Y%m%d')
        end = form.end.data.strftime('%Y%m%d')
        hops = form.hops.data
        rendering_time, collections = make_collection(point_a, point_b, start, end, hops)
        return render_template(
            'result.html',
            searching_time=rendering_time,
            collections=collections)
    else:
        logger.warning('Bad form: {}'.format(form.errors))
        return render_template('index.html', form=form)
The whole calculation lies in the make_collection method. So whenever a user sends a request to server.com/paths, they have to wait until the method finishes its calculations and returns something. This is not a pleasing solution; sometimes Nginx just times out.
The next version was built on the simple idea of delegating the heavy work to a thread and just returning an empty page to the user. Later on we can update the page contents with the latest search results.
@app.route('/paths', methods=['POST'])
def paths():
    form = SampleForm(request.form)
    if form.validate_on_submit():
        point_a = form.point_a.data
        point_b = form.point_b.data
        start = form.start.data.strftime('%Y%m%d')
        end = form.end.data.strftime('%Y%m%d')
        hops = form.hops.data
        finder = threading.Thread(
            target=make_collection,
            kwargs={
                'point_a': point_a,
                'point_b': point_b,
                'start': start,
                'end': end,
                'hops': hops})
        finder.start()
        rendering_time, collections = 0, []
        return render_template(
            'result.html',
            searching_time=rendering_time,
            collections=collections)
    else:
        logger.warning('Bad form: {}'.format(form.errors))
        return render_template('index.html', form=form)
The code above works fine and with acceptable searching time (it didn't change from the first version, as expected). The problem is that it works like that only on my local machine. When I deploy this behind Nginx, the overall performance is not even nearly close to what I'm expecting. For comparison, results that I find on my local machine in under 30 seconds, the Nginx deployment cannot fully find even in 300 seconds. What should I do?
P.S. Originally, setting up the Nginx server wasn't my part of the job and I'm not very familiar with how Nginx works, but if you need any info, please ask.
The first code snippet looks like an easy way to let the client fetch calculation results.
However, make_collection is blocking, and Nginx will keep one of its workers busy with it. Since the usual way to configure Nginx is one worker per CPU core, each HTTP request to /paths leaves you with one worker fewer. If there are multiple requests to /paths, then it is no surprise that you get poor performance. Not to mention the WSGI server you probably have (e.g. uwsgi, gunicorn, etc.) and its workers and threads per worker process.
The solution with threads might look good, but you can end up with a lot of threads. Pay attention to how threads work in Python, and avoid delegating CPU-bound work to threads unless you really know what you are doing.
In general, you should avoid blocking calls like the one you make and offload them to a separate worker queue, while keeping a reference for fetching the results later on.
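To illustrate the offloading idea, here is a minimal sketch of "submit and poll": the view hands the work to an executor and immediately returns a job id that the client can poll later. Everything here (the simplified make_collection, the in-memory job registry) is a stand-in for illustration, not part of Flask or Nginx:

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)
jobs = {}  # job_id -> Future; a stand-in for a real result backend

def make_collection(point_a, point_b):
    # stand-in for the expensive path search
    return [point_a, point_b]

def submit_job(point_a, point_b):
    # Called from the POST view: enqueue the work, return an id at once.
    job_id = str(uuid.uuid4())
    jobs[job_id] = executor.submit(make_collection, point_a, point_b)
    return job_id

def poll_job(job_id):
    # Called from a later "give me my results" request.
    future = jobs[job_id]
    return future.result() if future.done() else None  # None = still running

job_id = submit_job("A", "B")
jobs[job_id].result()        # wait only for this demo; a real client polls
print(poll_job(job_id))      # ['A', 'B']
```

For CPU-bound search work, swapping ThreadPoolExecutor for ProcessPoolExecutor sidesteps the GIL; in a real deployment a task queue such as Celery with a proper result backend is the more common shape of this pattern.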

Entry point for a Django app which is simply a redis subscribe loop that needs access to models - no urls/views

I currently have an external non-Django Python process, a simple redis subscribe loop that munges the messages it receives and inserts the result into a user mailbox (a redis list), which my main app accesses on request.
My listener now needs access to models, so it makes sense (to me) to make it a Django app. Being a loop, however, I imagine it's probably best to run it as a separate process.
Edit: removed my own proposed solution using AppConfig.ready() and running the separate process via gunicorn.
What I'm doing is pretty simple, but I'm a bit confused as to where the entry point for this app should be. Any ideas?
Any help/suggestions would be appreciated,
-Scott
I went ahead with @DanielRoseman's suggestion and used a management command as the entry point.
I simply added a management command 'runsubscriber', which looks like:
my_app/management/commands/runsubscriber.py
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    def handle(self, *args, **options):
        rsl = RedisSubcribeLoop()
        try:
            rsl.start()
        except KeyboardInterrupt:
            rsl.stop()
I can now run this as a separate process via ./manage.py runsubscriber
and kill it with ^C. My stop() looks like:
myapp/redis_subscribe_loop.py
def stop(self):
    self.rc.punsubscribe()  # unsubscribe from all channels
    self.rc.close()
so that it shuts down cleanly.
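The RedisSubcribeLoop itself isn't shown above, so purely as a hedged sketch, a loop with this start()/stop() shape might look like the following. The client is injected so the sketch runs without a Redis server; redis-py's pubsub()/psubscribe()/listen() interface is assumed, and FakeClient exists only for this demo:

```python
class RedisSubscribeLoop:
    """Sketch of a pub/sub loop; `client` is anything exposing a
    redis-py-style pubsub() interface (injected for testability)."""

    def __init__(self, client, pattern="mailbox.*"):
        self.rc = client
        self.pattern = pattern
        self.handled = []  # stand-in for "munge and insert into mailbox"

    def start(self):
        pubsub = self.rc.pubsub()
        pubsub.psubscribe(self.pattern)
        for message in pubsub.listen():  # blocks until unsubscribed/closed
            if message["type"] == "pmessage":
                self.handled.append(message["data"])

    def stop(self):
        self.rc.punsubscribe()  # unsubscribe from all channels
        self.rc.close()

# --- fakes below exist only so the sketch runs without Redis ---
class FakePubSub:
    def __init__(self, messages):
        self._messages = messages
    def psubscribe(self, pattern):
        pass
    def listen(self):
        yield from self._messages  # finite stream for the demo

class FakeClient:
    def __init__(self, messages):
        self._messages = messages
    def pubsub(self):
        return FakePubSub(self._messages)
    def punsubscribe(self):
        pass
    def close(self):
        pass

msgs = [{"type": "psubscribe", "data": 1},
        {"type": "pmessage", "data": "hello"}]
loop = RedisSubscribeLoop(FakeClient(msgs))
loop.start()
loop.stop()
print(loop.handled)  # ['hello']
```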
Thanks for all your help,
-Scott

too many sse connections hangs the webpage

What limits the number of SSE(server sent event) connections?
I have been working on a project using django/gunicorn/django-sse.
My project works great when I limit the number of SSE connections on the page (5 works, 6 hangs). This isn't a huge problem because I use pagination, so I can limit the number per page, but I would prefer to be able to have as many as I like.
My question is: is it the number of connections that is slowing it down, or is it the amount of data being transferred?
The first problem I think I could fix by making them share a connection, but the second would probably limit me a bit more.
Any ideas which it may be?
EDIT:
client side JS SSE code:
function event(url, resource_name, yes, no, audio_in, audio_out, current_draw) {
    /**
     * Listens for events posted by the server
     *
     * Useful site for understanding Server Sent Events:
     * http://www.w3.org/TR/eventsource/
     */
    var source = new EventSource(url);
    source.addEventListener("message", function(e) {
        resetTime(resource_name);
        data = updateStatus(e.data, yes, no, audio_in, audio_out, current_draw);
        document.getElementById(resource_name + "-in").src = data.audio_in_src;
        document.getElementById(resource_name + "-in").alt = data.audio_in_alt;
        document.getElementById(resource_name + "-out").src = data.audio_out_src;
        document.getElementById(resource_name + "-out").alt = data.audio_out_alt;
        document.getElementById(resource_name + "-current").innerHTML = data.current_draw + " A";
    });
}
in views.py
class ServerSentEvent(RedisQueueView):
    def get_redis_channel(self):
        """
        Overrides the RedisQueueView method to select the channel to listen to
        """
        return self.kwargs["resource_name"]
in urls.py
urlpatterns = patterns('',
    url(r'^$',
        views.Resources_page.as_view(),
        name='resources_page'),
    url(r'^(?P<resource_name>\w+)/$',
        views.StatusPage.as_view(),
        name='status_page'),
    url(r'^(?P<resource_name>\w+)/sse/$',
        views.ServerSentEvent.as_view(),
        name='sse'),
)
If you're using the sync worker for gunicorn (the default), then you can only have as many concurrent connections to your server as you have worker processes.
The sync worker is designed for CPU-bound tasks, hence the recommendation to use 2N + 1 workers (where N is the number of cores available). If your SSE endpoint is the logical equivalent of this...
while True:
    msg = "foo"
    yield msg
    sleep(1)
...then you have an I/O-bound view. No matter how much CPU time you throw at that block of code, it's designed to never end. If you use the django_sse project, then this is almost exactly what your SSE view is doing.
The solution is to use an asynchronous worker class for gunicorn. Install gevent and pass --worker-class=gevent option to gunicorn and you're on your way to an asynchronous utopia.
I had the exact same problem. I was sure I was using gevent as the worker, but I still only got around 6 connections.
The solution was stupid: it was a browser limitation. I am writing it down here for the next person who stumbles on this.
In Firefox, there is a parameter in about:config named network.http.max-persistent-connections-per-server that controls this. So the solution in my case was increasing that number from the default of 6, or using several browsers.
Browsers usually limit the number of connections to the same server. You can check that in your browser's configuration, e.g. in Firefox on the about:config page, where you can see a key/value pair like network.http.speculative-parallel-limit = 6; you can of course increase the number and test.
So this is not a problem on the server side.

Unable to use Python's threading.Thread in Django app

I am trying to create a web application as a front end to another Python app. I have the user enter data into a form; upon submitting, the idea is for the data to be saved in a database and passed to a thread object class. The thread is something that is kicked off strictly by a user action. My problem is that I can import threading, but cannot access threading.Thread. When the thread ends, it will update the server, so when the user views the job information they'll see the results.
View:
@login_required(login_url='/login')
def createNetworkView(request):
    if request.method == "POST":
        # grab my variables from POST
        job = models.MyJob()
        # load my variables into MyJob object
        job.save()
        t = ProcessJobThread(job.id, my, various, POST, inputs, here)
        t.start()
        return HttpResponseRedirect("/viewJob?jobID=" + str(job.id))
    else:
        return HttpResponseRedirect("/")
My thread class:
import threading  # this works
print "About to make thread object"  # This works, I see this in the log

class CreateNetworkThread(threading.Thread):  # failure here
    def __init__(self, jobid, blah1, blah2, blah3):
        threading.Thread.__init__(self)

    def run(self):
        doCoolStuff()
        updateDB()
I get:
Exception Type: ImportError
Exception Value: cannot import name Thread
However, if I run python on the command line, I can import threading and also do from threading import Thread. What's the deal?
I have seen other things, like How to use thread in Django and Celery but that seemed overkill, and I don't see how that example could import threading and use threading.Thread, when I can't.
Thank you.
Edit: I'm using Django 1.4.1, Python 2.7.3, Ubuntu 12.10, SQLite for the DB, and I'm running the web application with ./manage.py runserver.
This was a silly issue on my end. First, I had made a file called "threading.py", and someone suggested I delete it, which I did (or thought I did). The problem was that, because I was using Eclipse, the PyDev (Python) plugin only deleted the threading.py file I created and hid the .pyc file. I still had a stale threading.pyc lying around, even though PyDev has an option (which I had enabled) to delete orphaned .pyc files.
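When an import fails like this even though the module works from the command line, printing the module's __file__ makes any shadowing visible immediately:

```python
import threading

# If this path points inside your project instead of the standard
# library, a local threading.py (or a stale threading.pyc) is
# shadowing the real module.
print(threading.__file__)

# Once the shadowing file is gone, the class imports normally:
from threading import Thread
print(Thread.__name__)  # Thread
```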

Specifying timeout while fetching/putting data in memcache (django)

I have a django based http server and I use django.core.cache.backends.memcached.MemcachedCache as the client library to access memcache. I want to know whether we can set a timeout or something (say 500ms.) so that the call to memcached returns False if it is not able to access the cache for 500ms. and we make the call to the DB. Is there any such setting to do that?
Haven't tried this before, but you may be able to use threading and set up a timeout for the function call to cache. As an example, ignore the example provided in the main body at this link, but look at Jim Carroll's comment:
http://code.activestate.com/recipes/534115-function-timeout/
Adapted for something you might use:
from threading import Timer
import thread, time, sys

def timeout():
    thread.interrupt_main()

try:
    Timer(0.5, timeout).start()
    cache.get(stuff)
except:
    print "Use a function to grab it from the database!"
I don't have time to test it right now, but my concern would be whether Django itself is threaded, and if so, is interrupting the main thread what you really want to do? Either way, it's a potential starting point. I did look for a configuration option that would allow for this and found nothing.
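As an alternative sketch that avoids interrupting the main thread, you could run the cache call in a worker thread and bound the wait with Future.result(timeout=...). Note that slow_cache_get below is a stand-in for cache.get, not Django's cache API:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

executor = ThreadPoolExecutor(max_workers=4)

def slow_cache_get(key):
    # stand-in for cache.get(key); imagine this can stall on the network
    return "cached:" + key

def get_with_timeout(key, timeout=0.5):
    # Bound the caller's wait: give up after `timeout` seconds and
    # signal the caller to fall back to the database.
    future = executor.submit(slow_cache_get, key)
    try:
        return future.result(timeout=timeout)
    except TimeoutError:
        return None  # caller falls back to the DB

print(get_with_timeout("user:42"))  # cached:user:42
```

One caveat: the worker thread keeps running after the timeout, so this bounds the caller's wait but does not cancel the in-flight memcached call itself.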
