Django http connection timeout - python

I have a Django + mod_wsgi + Apache server. I need to change the default HTTP connection timeout. There is a Timeout directive in the Apache config, but it's not working.
How can I set this up?

I solved this problem with:
python manage.py runserver --http_timeout 120

There are a few timeout options in the mod_wsgi WSGIDaemonProcess directive (check out request-timeout):
https://modwsgi.readthedocs.io/en/develop/configuration-directives/WSGIDaemonProcess.html
inactivity-timeout=sss (2.0+)
Defines the maximum number of seconds allowed to pass before the
daemon process is shutdown and restarted when the daemon process has
entered an idle state. For the purposes of this option, being idle
means no new requests being received, or no attempts by current
requests to read request content or generate response content for the
defined period. This option exists to allow infrequently used
applications running in a daemon process to be restarted, thus
allowing memory being used to be reclaimed, with process size dropping
back to the initial startup size before any application had been
loaded or requests processed.
request-timeout=sss
Defines the maximum number of seconds that a request is allowed to run
before the daemon process is restarted. This can be used to recover
from a scenario where a request blocks indefinitely, and where if all
request threads were consumed in this way, would result in the whole
WSGI application process being blocked.
How this option is seen to behave is different depending on whether a
daemon process uses only one thread, or more than one thread for
handling requests, as set by the threads option.
If there is only a single thread, and so the process can only handle
one request at a time, as soon as the timeout has passed, a restart of
the process will be initiated.
If there is more than one thread, the request timeout is applied to
the average running time for any requests, across all threads. This
means that a request can run longer than the request timeout. This is
done to reduce the possibility of interrupting other running requests,
and causing a user to see a failure. So where there is still capacity
to handle more requests, restarting of the process will be delayed if
possible.
deadlock-timeout=sss (2.0+)
Defines the maximum number of seconds allowed to pass before the
daemon process is shutdown and restarted after a potential deadlock on
the Python GIL has been detected. The default is 300 seconds. This
option exists to combat the problem of a daemon process freezing as
the result of a rogue Python C extension module which doesn't properly
release the Python GIL when entering into a blocking or long running
operation.
shutdown-timeout=sss
Defines the maximum number of seconds allowed to pass when waiting for
a daemon process to gracefully shutdown as a result of the maximum
number of requests or inactivity timeout being reached, or when a user
initiated SIGINT signal is sent to a daemon process. When this timeout
has been reached the daemon process will be forced to exit even if
there are still active requests or it is still running Python exit
functions. If this option is not defined, then the shutdown timeout
will be set to 5 seconds. Note that this option does not change the
shutdown timeout applied to daemon processes when Apache itself is
being stopped or restarted. That timeout value is defined internally
to Apache as 3 seconds and cannot be overridden.
...
Docs about WSGIDaemonProcess:
Using mod_wsgi daemon mode
Defining Process Groups
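
As an illustration, here is a minimal sketch of how these options might be set in an Apache virtual host; the process group name, script path, and timeout values are placeholders, not recommendations:

WSGIDaemonProcess myproject processes=2 threads=15 request-timeout=120 inactivity-timeout=300 shutdown-timeout=5
WSGIProcessGroup myproject
WSGIScriptAlias / /path/to/myproject/wsgi.py

With a configuration along these lines, request-timeout is what bounds how long an individual request can run inside the daemon process.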

Related

Python function with request might hang, how to timeout?

I'm building a script to process messages using the O365 module (https://pypi.org/project/O365/).
The script runs great, but for some reason, after a random amount of time (usually about 20 hours), it gets stuck on a request without a response and just hangs there waiting.
It's not a server throttling issue as I've slowed my script down to one request every minute and it still hangs.
I think it might be a bug in the O365 module where it doesn't time out the requests, so I'm thinking of making the calls in a separate thread and, if it doesn't return within a certain amount of time, killing it.
But from what I understand, if I just try to join the thread, it will wait until the thread finishes (which is never). Is there a way to avoid this?
Thanks!
You can use multithreading and the join method. As explained in the documentation: "This blocks the calling thread until the thread whose join() method is called terminates – either normally or through an unhandled exception – or until the optional timeout occurs."
Your request will either terminate because it has been completed or because the maximum time limit has been reached.
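As a rough sketch of that approach (fetch_messages below is a stand-in for the O365 call that hangs; it is not part of the O365 API):

import threading
import time

def fetch_messages(results):
    # stand-in for the O365 request that occasionally hangs
    time.sleep(2)
    results.append("message")

results = []
worker = threading.Thread(target=fetch_messages, args=(results,), daemon=True)
worker.start()
worker.join(timeout=60)  # block for at most 60 seconds

if worker.is_alive():
    # still stuck: the daemon thread is abandoned, and the caller
    # can log the failure and retry
    print("request timed out")
else:
    print("received:", results)

Note that Python cannot forcibly kill the stuck thread; marking it as a daemon thread only keeps it from blocking interpreter exit.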

Celery doesn't acknowledge tasks if stopped too quickly

For a project using Celery, I would like to test the execution of a task.
I know that the documentation advises to mock it but since I'm not using the official client I would want to check for a specific test that everything works well.
Then I set up a very simple task that takes as parameters an Unix socket name and a message to write to it: the task opens the connection on the socket, writes the message and closes the connection.
Inside the tests, the Celery worker is launched with a subprocess: I start it before sending the task, send it a SIGTERM when I receive the message on the socket and then wait for the process to close.
Everything goes well: the message is received, it matches what is expected and the worker correctly terminates.
But I found that when the tests stop, a message still remains within the RabbitMQ queue, as if the task had never been acknowledged.
I confirmed this by looking at the RabbitMQ graphical interface: a "Deliver" occurs after the task is executed but no "Acknowledge".
This seems strange because, with the default configuration, the acknowledgement should be sent before task execution.
Going further in my investigations I noticed that if I add a sleep of a split second just before sending SIGTERM to the worker, the task is acknowledged.
I tried to inspect the executions with or without sleep using strace, here are the logs:
Execution with a sleep of 0.5s.
Execution without sleep.
The only noticeable difference I see is that with sleep the worker has time to start a new communication with the broker. It receives an EAGAIN from a recvfrom and sends a frame "\1\0\1\0\0\0\r\0<\0P\0\0\0\0\0\0\0\1\0\316".
Is this the acknowledge? Why does this occur so late?
I give you the parameters with which I launch the Celery worker: celery worker --app tests.functional.tasks.app --concurrency 1 --pool solo --without-heartbeat.
The --without-heartbeat is just here to reduce differences between executions with or without sleep. Otherwise an additional heartbeat frame would occur in the execution with sleep.
Thanks.
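
For context, a rough sketch of the harness described above (the socket path and the commented-out task dispatch are illustrative assumptions, not the actual test code):

import os
import signal
import socket
import subprocess
import time

SOCKET_PATH = "/tmp/test_task.sock"  # illustrative path

server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(SOCKET_PATH)
server.listen(1)

# start the worker with the same options as in the question
worker = subprocess.Popen([
    "celery", "worker",
    "--app", "tests.functional.tasks.app",
    "--concurrency", "1",
    "--pool", "solo",
    "--without-heartbeat",
])

# the task would be dispatched here, e.g.
# write_to_socket.delay(SOCKET_PATH, "hello")

conn, _ = server.accept()
message = conn.recv(1024)
conn.close()

time.sleep(0.5)  # without this pause, the ack never seems to reach RabbitMQ
worker.send_signal(signal.SIGTERM)
worker.wait()

os.unlink(SOCKET_PATH)
assert message == b"hello"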

How to do graceful application shutdown from mod_wsgi

So I have a Flask application which I've been running on Flask's built-in server, and I'm ready to move it to production. This application manages several child processes. Up to this point, I've been handling graceful shutdown using signals. In particular, one shutdown mode I've used is to send a SIGHUP to the Flask server, which causes the application to propagate that signal to its children (so they can shut down gracefully) and then lets the application process shut down.
In production, we're planning on using mod_wsgi. I've read that wsgi applications really shouldn't be handling signals.
So my question is, how should I achieve the following behavior with this setup?
When apache receives SIGTERM, it notifies the wsgi daemons before terminating them
The wsgi daemons are given a chance to do some cleanup on their own before shutting down
Send SIGTERM to the Apache parent process, and that is more or less what happens now.
What happens is that when the Apache parent process receives SIGTERM, it in turn sends SIGTERM to all its child worker processes, as well as to the managed mod_wsgi daemon processes if using daemon mode. Those sub-processes will stop accepting new requests and will be given up to 3 seconds to complete existing requests before they are forcibly shut down.
So the default behaviour of SIGTERM is to allow a bit of time to complete requests, but long-running requests will not be allowed to hold up the complete server shutdown. How long it waits for sub-processes to shut down is not configurable and is fixed at 3 seconds.
Instead of SIGTERM, you can send a SIGWINCH signal. This will cause Apache to do a graceful stop, but this has issues.
What happens in the case of SIGWINCH is that Apache will again send SIGTERM to its child worker processes, but instead of forcibly killing off the processes after 3 seconds, it will allow them to run until at least any active requests have completed.
A problem with this is that there is no failsafe. If those requests never finish, there is no timeout that I know of which will see the child worker processes forcibly shut down. As a result, your server could end up hanging on shutdown.
A second issue is that Apache will still forcibly kill off the managed mod_wsgi daemon processes after 3 seconds, and there isn't (or wasn't the last time I looked) a way to override how Apache manages those processes to enable a more graceful shutdown of the managed daemon processes. So the graceful stop signal doesn't change anything when using daemon mode.
The closest you can get to a graceful stop is to divert new traffic away from the Apache instance at a front-end routing layer. Then, through some mechanism, trigger on the host running Apache a script which sends a SIGUSR2 to the mod_wsgi daemon processes. Presuming you have set the graceful-timeout option on the daemon process group to some adequate failsafe, this will result in the daemon processes exiting once all active requests have finished. If the timeout expires, the process goes into its normal shutdown sequence: it stops accepting new requests from the Apache child worker processes and, after the shutdown-timeout (default 5 seconds) fires, if requests are still not complete, the process is forcibly shut down.
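A hedged sketch of such a script, assuming you have recorded the daemon process ids yourself (for example by having the WSGI application write os.getpid() to a file at startup; mod_wsgi does not write such a file for you):

import os
import signal

PID_FILE = "/var/run/myapp/wsgi-daemon.pids"  # illustrative path

with open(PID_FILE) as f:
    pids = [int(line) for line in f if line.strip()]

for pid in pids:
    # ask each mod_wsgi daemon process for a graceful restart
    os.kill(pid, signal.SIGUSR2)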
In this case, it isn't actually shutting down the processes but causing them to exit, which results in them being replaced, since we aren't telling the whole of Apache to stop, just telling the mod_wsgi daemon processes to do a graceful restart. In this situation, unless you monitor the set of daemon processes and know when they have all restarted, you don't have a clear indication that they are all done so that you can then shut down the whole Apache instance.
So it is a little bit fiddly to do, and it is hard for any server to do this in a nice generic way, as what is appropriate really depends on the hosted application and its requirements.
The question is whether you really need to go to these lengths. Requests will inevitably fail anyway and users have to deal with that, so interrupting a handful of requests on a restart is often not a big deal. What is so special about the application that you need to set a higher bar and attempt to ensure that zero requests are interrupted?
Since mod_wsgi 4.8.0 you can also do the following:
import mod_wsgi

def shutdown_handler(event, **kwargs):
    # do whatever you want on shutdown here
    pass

mod_wsgi.subscribe_shutdown(shutdown_handler)

When stopping Twisted, will the factory wait for the SQL execution to finish?

I wonder: if I stop the twistd process using
kill `cat twistd.pid`
what will happen if some SQL execution is committing at exactly that moment?
Will it wait for the execution to finish? Or is it unknown, so it could either finish or be abandoned?
I know that if I put the execution in the stopFactory method, the factory will wait for it to finish. But if I don't, i.e. the execution happens outside the stopFactory method, will it wait for the execution to finish before the factory stops?
Thanks.
kill sends SIGTERM by default. Twisted installs a SIGTERM handler which calls reactor.stop(). Anything that would happen when you call reactor.stop() will happen when you use that kill command.
More specifically, any shutdown triggers will run. This means any services attached to an Application will have their stopService method called (and if a Deferred is returned, it will be allowed to finish before shutdown proceeds). It also means worker threads in the reactor threadpool will be shut down in an orderly manner, i.e. allowed to complete whatever job they have in progress.
If you're using adbapi, then the ConnectionPool uses its own ThreadPool and also registers a shutdown trigger to shut that pool down in a similar orderly manner.
So, when you use kill to stop a Twisted-based process, any SQL mid-execution will be allowed to complete before shutdown takes place.
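As a small sketch of how that hooks together, a service can return a Deferred from stopService and the reactor will wait for it before finishing shutdown (SqlFlushService and flush_pending_sql are illustrative names, not part of your application):

from twisted.application import service
from twisted.internet import defer

class SqlFlushService(service.Service):
    """Illustrative service whose shutdown waits for pending work."""

    def stopService(self):
        service.Service.stopService(self)
        # Returning a Deferred delays reactor shutdown until it fires;
        # flush_pending_sql stands in for whatever cleanup is needed.
        return defer.maybeDeferred(self.flush_pending_sql)

    def flush_pending_sql(self):
        # placeholder for flushing any queued SQL
        return None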

Can AppEngine python threads last longer than the original request?

We're trying to use the new python 2.7 threading ability in Google App Engine and it seems like the created thread is getting killed before it finishes running. Our scenario:
User sends a message to the server
We update the user's data
We spawn a thread to do some more heavy duty processing
We return a response to the user before waiting for the heavy duty processing to finish
My assumption was that the thread would continue to run after the request had returned, as long as it did not exceed the total request time limit. What we're seeing, though, is that the thread is randomly killed partway through its execution. No exceptions, no errors, nothing. It just stops running.
Are threads allowed to exist after the response has been returned? This does not repro on the dev server, only on live servers.
We could of course use a task queue instead, but that's a real pain since we'd have to set up a URL for the action and serialize/deserialize the data.
The 'Sandboxing' section of this page:
http://code.google.com/appengine/docs/python/python27/using27.html#Sandboxing
indicates that threads cannot run past the end of the request.
Deferred tasks are the way to do this. You don't need a URL or serialization to use them:
from google.appengine.ext import deferred
deferred.defer(myfunction, arg1, arg2)
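If useful, deferred.defer also accepts task queue options; a small hedged example (the function, its arguments, and the queue name are placeholders):

from google.appengine.ext import deferred

def heavy_processing(user_id, payload):
    # the heavy-duty work that used to run inside the request
    pass

# run on a named queue, starting roughly ten seconds from now
deferred.defer(heavy_processing, 42, {"msg": "hi"}, _queue="background", _countdown=10)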
