Memory model for apache/modwsgi application in python?

Memory model for apache/modwsgi application in python? - python

In a regular application (like on Windows), when objects/variables are created on a global level it is available to the entire program during the entire duration the program is running.
In a web application written in PHP for instance, all variables/objects are destroyed at the end of the script so everything has to be written to the database.
a) So what about python running under apache/modwsgi? How does that work in regards to the memory?
b) How do you create objects that persist between web page requests and how do you ensure there isn't threading issues in apache/modwsgi?

Go read the following from the official mod_wsgi documentation:
http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
It explains the various modes things can be run in and gives some general guidelines about data scope and sharing.

All Python globals are created when the module is imported. When module is re-imported the same globals are used.
Python web servers do not do threading, but pre-forked processes. Thus there is no threading issues with Apache.
The lifecycle of Python processes under Apache depends. Apache has settings how many child processes are spawned, keep in reserve and killed. This means that you can use globals in Python processes for caching (in-process cache), but the process may terminate after any request so you cannot put any persistent data in the globals. But the process does not necessarily need to terminate and in this regard Python is much more efficient than PHP (the source code is not parsed for every request - but you need to have the server in reload mode to read source code changes during the development).
Since globals are per-process and there can be N processes, the processes share "web server global" state using mechanisms like memcached.
Usually Python globals only contain
Setting variables set during the process initialization
Cached data (session/user neutral)

Related

Keeping global state in Flask Application [duplicate]

This question already has answers here:
Are global variables thread-safe in Flask? How do I share data between requests?
(4 answers)
Closed 4 years ago.
This seems like a pretty obvious question, but a lot of the document around this is very confusing, and warn me not to keep a global state instead of telling me how to.
For example, if I need to have a database connection pool (I'm not using SQLAlchemy), or a pool of object instances (both of which need to be global pools, centrally managed), how do I do that?
If I use flask.g, that's not shared between threads, and if I use a python global, that's not shared between multiple processes of the same application (which, as I understand, can be spawned in the case of large production flask servers). Do I use flask.current_app? Do I make the pool a separate process itself? Something else?

The warnings about "not keeping a per-process global state" in a web backend app (you'll have the very same issue with Django or any wsgi app) only applies to state that you expect to be shared between requests AND processes.
If it's ok for you to have per-process state (for example db connection is typically a per-process state) then it's not an issue. wrt/ connections pooling, you could (or not) decide that having distinct pools per server process is ok.
For anything else - any state that needs to be shared amongst processes -, this is usually handled by some using some external database or cache process, so if you want to have one single connection pool for all your Flask processes you will have to use a distinct server process for maintaining the pool indeed.
Also note that:
multiple processes of the same application (which, as I understand, can be spawned in the case of large production flask servers)
Actually this has nothing to do with being "large". With a traditional "blocking" server, you can only handle concurrent requests by using either multithreading or multiprocessing. The unix philosophy traditionnally favors multiprocessing ("prefork" model) for various reasons, and Python's multithreading is bordering on useless anyway (at least in this context) so you don't have much choice if you hope to serve one more one single request at a time.
To make a long story short, consider that just any production setup for a wsgi app will run multiple processes in the background, period.

Managed VM Background thread loses environment

I am having some problems using Background Threads in a Managed VM in Google App Engine.
I am getting callbacks from a library linked via Ctypes which need to be executed in the background as I am explaining in a previous question.
The problem is: The Application loses its execution context (wsgi application) and is missing environment variables like the Application id. Without those I cannot make calls to the database as they fail.
I do call the background thread like
background_thread.start_new_background_thread(saveItemsToDatabase, [])
Is there a way to copy the environment to the background thread or maybe execute the task in a different context?
Update: The traceback which makes it already clear what the problem is:
_ToDatastoreError(err)google.appengine.api.datastore_errors.BadRequestError: Application Id (app) format is invalid: '_'

application context is thread local in appengine when created through standard app handler. Remember the applications in appengine run in python27 with thread enabled already have threads. So each wsgi call then environment variables has to be thread local, or information would leak between handled requests.
This means that additional threads you create will need to be passed the app context explicitly.
In fact when you start reading the docs on background threads it is pretty clear about what is going on, https://cloud.google.com/appengine/docs/python/modules/#Python_Background_threads - A background thread's os.environ and logging entries are independent of those of the spawning thread.
So you have to copy the env (os.environ) or the parts you need and pass it to the thread as arguments. The problem may not be limited to appid you may find thats only the first thing missing. For instance if you use namespaces.

Python, WSGI, multiprocessing and shared data

I am a bit confused about multiproessing feature of mod_wsgi and about a general design of WSGI applications that would be executed on WSGI servers with multiprocessing ability.
Consider the following directive:
WSGIDaemonProcess example processes=5 threads=1
If I understand correctly, mod_wsgi will spawn 5 Python (e.g. CPython) processes and any of these processes can receive a request from a user.
The documentation says that:
Where shared data needs to be visible to all application instances, regardless of which child process they execute in, and changes made to
the data by one application are immediately available to another,
including any executing in another child process, an external data
store such as a database or shared memory must be used. Global
variables in normal Python modules cannot be used for this purpose.
But in that case it gets really heavy when one wants to be sure that an app runs in any WSGI conditions (including multiprocessing ones).
For example, a simple variable which contains the current amount of connected users - should it be process-safe read/written from/to memcached, or a DB or (if such out-of-the-standard-library mechanisms are available) shared memory?
And will the code like
counter = 0
#app.route('/login')
def login():
...
counter += 1
...
#app.route('/logout')
def logout():
...
counter -= 1
...
#app.route('/show_users_count')
def show_users_count():
return counter
behave unpredictably in multiprocessing environment?
Thank you!

There are several aspects to consider in your question.
First, the interaction between apache MPM's and mod_wsgi applications. If you run the mod_wsgi application in embedded mode (no WSGIDaemonProcess needed, WSGIProcessGroup %{GLOBAL}) you inherit multiprocessing/multithreading from the apache MPM's. This should be the fastest option, and you end up having multiple processes and multiple threads per process, depending on your MPM configuration. On the contrary if you run mod_wsgi in daemon mode, with WSGIDaemonProcess <name> [options] and WSGIProcessGroup <name>, you have fine control on multiprocessing/multithreading at the cost of a small overhead.
Within a single apache2 server you may define zero, one, or more named WSGIDaemonProcesses, and each application can be run in one of these processes (WSGIProcessGroup <name>) or run in embedded mode with WSGIProcessGroup %{GLOBAL}.
You can check multiprocessing/multithreading by inspecting the wsgi.multithread and wsgi.multiprocess variables.
With your configuration WSGIDaemonProcess example processes=5 threads=1 you have 5 independent processes, each with a single thread of execution: no global data, no shared memory, since you are not in control of spawning subprocesses, but mod_wsgi is doing it for you. To share a global state you already listed some possible options: a DB to which your processes interface, some sort of file system based persistence, a daemon process (started outside apache) and socket based IPC.
As pointed out by Roland Smith, the latter could be implemented using a high level API by multiprocessing.managers: outside apache you create and start a BaseManager server process
m = multiprocessing.managers.BaseManager(address=('', 12345), authkey='secret')
m.get_server().serve_forever()
and inside you apps you connect:
m = multiprocessing.managers.BaseManager(address=('', 12345), authkey='secret')
m.connect()
The example above is dummy, since m has no useful method registered, but here (python docs) you will find how to create and proxy an object (like the counter in your example) among your processes.
A final comment on your example, with processes=5 threads=1. I understand that this is just an example, but in real world applications I suspect that performance will be comparable with respect to processes=1 threads=5: you should go into the intricacies of sharing data in multiprocessing only if the expected performance boost over the 'single process many threads' model is significant.

From the docs on processes and threading for wsgi:
When Apache is run in a mode whereby there are multiple child processes, each child process will contain sub interpreters for each WSGI application.
This means that in your configuration, 5 processes with 1 thread each, there will be 5 interpreters and no shared data. Your counter object will be unique to each interpreter. You would need to either build some custom solution to count sessions (one common process you can communicate with, some kind of persistence based solution, etc.) OR, and this is definitely my recommendation, use a prebuilt solution (Google Analytics and Chartbeat are fantastic options).
I tend to think of using globals to share data as a big form of global abuse. It's a bug well and portability issue in most of the environments I've done parallel processing in. What if suddenly your application was to be run on multiple virtual machines? This would break your code no matter what the sharing model of threads and processes.

If you are using multiprocessing, there are multiple ways to share data between processes. Values and Arrays only work if processes have a parent/child relation (they are shared by inheriting). If that is not the case, use a Manager and Proxy objects.

python Global Interpreter Lock GIL problem

I want to provide a service on the web that people can test out the performance of an algo, which is written in python and running on the linux machine
basically what I want to do is that, there is a very trivial PHP handler, let's say start_algo.php, which accepts the request coming from browser, and in the php code through system() or popen() (something like exec( "python algo.py" ) ) to issue a new process running the python script, I think it is doable in this part
problem is that since it is a web service, surely it has to serve multiple users at the same time, but I am quite confused by the Global Interpreter Lock GIL http://wiki.python.org/moin/GlobalInterpreterLock
that the 'standard' CPython has implemented,
does it mean, if I have 3 users running the algo now (which means 3 separated processes, correct me if I am wrong plz), at a particular moment, there is only one user is being served by the Python interpreter and the other 2 are waiting for their turns?
Many thanks in advance
Ted

If you are opening each script by invoking a new process; you will not run afoul of the GIL. Each process gets its own interpreter and therefore its own interpreter lock.

The GIL is per-process. If you start multiple python processes, each will have its own GIL that prevents the interpreter(s) in this specific process from running more than one thread at a time. But independent processes can run at the same time.
Also, multiple threads inside one Python process do take turns on running (rather frequently, IIRC once per hundred opcode instructions or a few dozen milliseconds depending on the version), so it's not like the GIL prevents concurrency at all - it just prevents multi-threading.

how to process long-running requests in python workers?

I have a python (well, it's php now but we're rewriting) function that takes some parameters (A and B) and compute some results (finds best path from A to B in a graph, graph is read-only), in typical scenario one call takes 0.1s to 0.9s to complete. This function is accessed by users as a simple REST web-service (GET bestpath.php?from=A&to=B). Current implementation is quite stupid - it's a simple php script+apache+mod_php+APC, every requests needs to load all the data (over 12MB in php arrays), create all structures, compute a path and exit. I want to change it.
I want a setup with N independent workers (X per server with Y servers), each worker is a python app running in a loop (getting request -> processing -> sending reply -> getting req...), each worker can process one request at a time. I need something that will act as a frontend: get requests from users, manage queue of requests (with configurable timeout) and feed my workers with one request at a time.
how to approach this? can you propose some setup? nginx + fcgi or wsgi or something else? haproxy? as you can see i'am a newbie in python, reverse-proxy, etc. i just need a starting point about architecture (and data flow)
btw. workers are using read-only data so there is no need to maintain locking and communication between them

The typical way to handle this sort of arrangement using threads in Python is to use the standard library module Queue. An example of using the Queue module for managing workers can be found here: Queue Example

Looks like you need the "workers" to be separate processes (at least some of them, and therefore might as well make them all separate processes rather than bunches of threads divided into several processes). The multiprocessing module in Python 2.6 and later's standard library offers good facilities to spawn a pool of processes and communicate with them via FIFO "queues"; if for some reason you're stuck with Python 2.5 or even earlier there are versions of multiprocessing on the PyPi repository that you can download and use with those older versions of Python.
The "frontend" can and should be pretty easily made to run with WSGI (with either Apache or Nginx), and it can deal with all communications to/from worker processes via multiprocessing, without the need to use HTTP, proxying, etc, for that part of the system; only the frontend would be a web app per se, the workers just receive, process and respond to units of work as requested by the frontend. This seems the soundest, simplest architecture to me.
There are other distributed processing approaches available in third party packages for Python, but multiprocessing is quite decent and has the advantage of being part of the standard library, so, absent other peculiar restrictions or constraints, multiprocessing is what I'd suggest you go for.

There are many FastCGI modules with preforked mode and WSGI interface for python around, the most known is flup. My personal preference for such task is superfcgi with nginx. Both will launch several processes and will dispatch requests to them. 12Mb is not as much to load them separately in each process, but if you'd like to share data among workers you need threads, not processes. Note, that heavy math in python with single process and many threads won't use several CPU/cores efficiently due to GIL. Probably the best approach is to use several processes (as much as cores you have) each running several threads (default mode in superfcgi).

The most simple solution in this case is to use the webserver to do all the heavy lifting. Why should you handle threads and/or processes when the webserver will do all that for you?
The standard arrangement in deployments of Python is:
The webserver start a number of processes each running a complete python interpreter and loading all your data into memory.
HTTP request comes in and gets dispatched off to some process
Process does your calculation and returns the result directly to the webserver and user
When you need to change your code or the graph data, you restart the webserver and go back to step 1.
This is the architecture used Django and other popular web frameworks.

I think you can configure modwsgi/Apache so it will have several "hot" Python interpreters
in separate processes ready to go at all times and also reuse them for new accesses
(and spawn a new one if they are all busy).
In this case you could load all the preprocessed data as module globals and they would
only get loaded once per process and get reused for each new access. In fact I'm not sure this isn't the default configuration
for modwsgi/Apache.
The main problem here is that you might end up consuming
a lot of "core" memory (but that may not be a problem either).
I think you can also configure modwsgi for single process/multiple
thread -- but in that case you may only be using one CPU because
of the Python Global Interpreter Lock (the infamous GIL), I think.
Don't be afraid to ask at the modwsgi mailing list -- they are very
responsive and friendly.

You could use nginx load balancer to proxy to PythonPaste paster (which serves WSGI, for example Pylons), that launches each request as separate thread anyway.

Another option is a queue table in the database.
The worker processes run in a loop or off cron and poll the queue table for new jobs.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Memory model for apache/modwsgi application in python? - python

Go read the following from the official mod_wsgi documentation: http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading It explains the various modes things can be run in and gives some general guidelines about data scope and sharing.

Related

Keeping global state in Flask Application [duplicate]

Managed VM Background thread loses environment

Python, WSGI, multiprocessing and shared data

python Global Interpreter Lock GIL problem

how to process long-running requests in python workers?

Categories

Resources