Django running under Apache and mod_wsgi uses "virtual" file system?

Ok, I know this is strange, but after a day of searching, I couldn't find any answer to this problem.
I've had this system running for two years with Django under Apache with a classic mod_wsgi installation. An exact mirror of the web site is used for development and testing.
In order to speed up a query, I used the built-in Django cache with a file backend. In development (built-in Django server) everything works fine and a file is created under /var/tmp/django_cache. Everything also works in production, but no file is created.
I was surprised, so I started experimenting: I inserted a bunch of prints in the django.core.cache modules and followed the execution of the cache code. At a certain point I got to an os.makedirs call, which doesn't create anything. I inserted an open() call to create a file (absolute path), and nothing was created. Yet when I tried to read back from the nonexistent file, the content was there.
I'm really puzzled. It seems that somehow there is a sort of "virtual" filesystem, which works correctly but in parallel with the real thing. I'm using Django 1.11.11.
Who is doing the magic? Django, Apache, mod_wsgi? Something else?

Ok, @DanielRoseman was right: "More likely the file is being created in another location". The reason it can affect any filesystem operation is that it's a systemd feature called PrivateTmp. From the documentation:
sets up a new file system namespace for the executed processes and mounts private /tmp and /var/tmp directories inside it that is not shared by processes outside of the namespace
In fact, there are a bunch of folders in both /tmp and /var/tmp with names like systemd-private-273bc022d82337529673d61c0673a579-apache2.service-oKiLBu.
Somehow my find command never reached those folders. All the created files are there, in a perfectly regular filesystem. Now I also understand why an Apache restart clears the Django cache: systemd deletes the process's private tmp and creates a new one for the new process.
I found the answer here: https://unix.stackexchange.com/a/303327/329567
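For anyone hunting for the same files, here is a minimal sketch that lists the per-service private directories described above. The function name find_private_tmp_dirs and the service_hint parameter are made up for illustration; the glob pattern only assumes the systemd-private-...-apache2.service-... naming shown above, and reading inside those folders normally requires root.

```python
import glob
import os


def find_private_tmp_dirs(service_hint="apache2", roots=("/tmp", "/var/tmp")):
    """Return systemd PrivateTmp directories whose name mentions service_hint."""
    matches = []
    for root in roots:
        # Directory names look like systemd-private-<id>-<unit>-<suffix>.
        pattern = os.path.join(
            root, "systemd-private-*-{0}*".format(service_hint)
        )
        matches.extend(sorted(glob.glob(pattern)))
    return matches
```

Calling find_private_tmp_dirs() on the production box should print the folders where the cache files really live, if the service does run with PrivateTmp enabled.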

Related

Django uwsgi subprocess and permissions

I'm trying to generate a PDF file from a LaTeX template. I've done it in the development environment (running python manage.py straight from eclipse)... but I can't make it work on the server, which runs cherokee and uwsgi.
We have realized that open(filename) creates a file owned by root (and the root group). This doesn't happen in the development environment... but the strangest thing about this issue is that somewhere else in our code we create a text file (the file LaTeX uses is a text file too), and it is created with the user cherokee is supposed to use, not root!
What happened? How can we fix it?
We are running this code on Ubuntu Linux in a virtual environment, both in development and production.
We started following some instructions to do it using python's temporary file and folder creation functions, but we thought the problem could be related to them, so we created the files "manually" to try to solve this issue... but it didn't work.

As I said in my comments, this issue was related to supervisord. I solved it by setting the right path and user in the "environment" variable of supervisord's config file.
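A quick way to confirm this kind of problem is to log, from inside the running app, which user the process runs as and who ends up owning a freshly created file. This is only a diagnostic sketch: report_process_identity is a hypothetical helper, not part of uwsgi or supervisord, and the pwd module it uses is Unix-only.

```python
import os
import pwd
import tempfile


def report_process_identity(directory=None):
    """Create a scratch file and report the process user and the file owner."""
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        owner_uid = os.stat(path).st_uid
        try:
            owner_name = pwd.getpwuid(owner_uid).pw_name
        except KeyError:
            owner_name = str(owner_uid)  # uid without a passwd entry
        return {
            "effective_uid": os.geteuid(),
            "file_owner_uid": owner_uid,
            "file_owner_name": owner_name,
            "path_env": os.environ.get("PATH"),
        }
    finally:
        os.remove(path)
```

If file_owner_name comes back as root in production, the process itself is running as root, whatever the web server's own user is supposed to be.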

In a custom Heroku Python buildpack, how can I set a config var?

I'm trying to set a custom config var for my Python app to use. Specifically, the current SHA to use as a URL param in static files to force the CDN to re-prime on each deploy. I'm trying to do it with a custom buildpack, based on the normal Heroku Python one (https://github.com/heroku/heroku-buildpack-python).
Where I am right now: I've started modifying the compile script. So far, I have been able to get the value I need by running this near the top, around line 30, before GIT_DIR is unset:
export GIT_SHA=$(git log -1 --format="%h")
then later, around line 175, I think is where it sets the config vars for the app. I tried adding my own:
set-env GIT_SHA '$GIT_SHA'
To no avail
I've run heroku labs:enable user-env-compile which I think is a necessary step, but I can't for the life of me figure out how to get the buildpack to actually set the config var for my app to use.
EDIT
Was able to solve this with Andrew's suggestion. I created a custom buildpack which calls a Python script that uses the Heroku python bindings to set the var, reading it from the environment variable set in the build pack.

If my understanding of your question is correct, you want to set an env variable at compile time, but read it during execution (whenever a static file URL is accessed in your app). Is that accurate?
Compilation is done on a totally different dyno than the application is served under, so executing set-env during compile time might change the environment of the compilation dyno but won't affect the environment of the production dynos, which are spun up later.
I don't think heroku labs:enable user-env-compile is relevant here because that lets you read from the config during compile time, but it does not allow you to write to it.
If you really want to use env variables, you could use the Heroku API's python bindings to dynamically modify the configuration of your app. You could also try to save a temporary file somewhere with the compiled output, and then read from that file in the part of your buildpack that starts up your dyno. Or it may be possible to fetch the SHA directly from a production dyno at start-up time, without involving the compilation dyno at all.
However, all of this is fairly irregular and there is probably a cleaner way to accomplish your goal of versioning static files on your CDN.
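Whichever way the SHA reaches the running dyno, the app side can stay small. Here is a rough sketch, assuming the value is exposed as a GIT_SHA environment variable as in the question; the function names, the "dev" fallback and the v query parameter are made up for illustration.

```python
import os


def read_version(env=None, default="dev"):
    """Return the deploy version from GIT_SHA, or a default when it is unset."""
    env = os.environ if env is None else env
    return env.get("GIT_SHA") or default


def versioned_url(path, version):
    """Append the version as a query parameter so each deploy gets a new URL."""
    separator = "&" if "?" in path else "?"
    return "{0}{1}v={2}".format(path, separator, version)
```

For example, versioned_url("/static/app.css", read_version()) gives a URL that changes whenever the deployed SHA changes, which is what forces the CDN to fetch the file again.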

Local development (invoking python <script>, port 8888) serves stale files

I had a similar issue when running FastCGI and was told there is no way to fix it without doing custom work (see "Files being served are stale / cached; Python as fcgi + web.py + nginx"). I was told to use the python method instead, which invokes a local "web server" to host the python page.
Even doing that, the files served are stale / cached. If I make edits to the files, save and refresh, the python web server is still serving the stale / cached file.
The only way to get it to serve the modified file is to kill (ctrl+c) the script and then restart it... this takes about 5 seconds every time and seriously impedes my development workflow.
Ideally any change to the script would be reflected next time the page is requested from the web server.
EDIT
@Jordan: Thanks for the suggestions. I've tried #2, which yields the following error:
app = web.application(urls, globals(), web.reloader)
AttributeError: 'module' object has no attribute 'reloader'
Per the documentation here: http://webpy.org/tutorial2.en
I then tried suggestion #4,
web.config.debug = True
Both still cause 'stale' files to get served.

Understandably you want a simple, set it up once and never worry about it again, solution. But you might be making this problem more difficult than it needs to be.
I generally write applications for an apache/modwsgi/nginx stack. If I have a caching problem, I just restart apache and voila, my python files are re-interpreted. I don't remember the commands to restart apache on all of my different boxes (Macs, Ubuntu, CentOS, etc.), and I shouldn't need to.
That is what command line aliases are for...
A python application is interpreted before it is run, and when run on a webserver, it is run once and should be considered stateless. This is unlike javascript running in a browser, which can be considered to have state since it is a continually running VM. You can edit javascript while it is running and that is probably fine for most applications of the language.
In python you generally write the code, run it, and if that doesn't work you start over. You don't edit the code in real time. That means you are knowingly saving the source and changing contexts to run it.
I am betting that you are editing your source from a graphical IDE instead of a command-line editor like vi or emacs (I might be wrong, and I'm not saying there is anything 'wrong' with that). I only write iOS applications using an IDE; for everything else I stick to ViM. Why? Because then I am always on the command line, and I am not distracted by anything (animations, mouse pointers, notifications). I finish writing my code, I quickly type ':wq' (write and quit), and then quickly type 'restartweb' (actually I usually type 're' then <Tab> to auto-complete), which is my alias for whatever the command to restart apache is. Voila, my python is reinterpreted.
My point is that you should probably keep it simple and use something like an alias to solve your problem. It might not be the coolest thing you could do. But it is what Ninja coders have been doing for the last 20 years to get work done fast and simple.
Now obviously I only suggested a solution for apache, and I have never used web.py before. But the same possible solution still applies. Make a bash script that goes in your project directory, call it something like restart.bash. In it put something like:
find . -name "*.pyc" -delete
Which will recursively remove all compiled pyc files, forcing your app to reload. Then make an alias in your ~/.bashrc that runs that file.
Something like:
alias restartproject="bash /full/path/to/restart.bash"
Magical, now you have a solution that works everywhere, regardless of which type of web server you choose to run your application from.
Edit:
Now you have a solution that works everywhere but on a Windows IIS server. And if you are trying to run python from Windows, you should probably Stahp! hugz
We are using virtualenv right? :) We want to keep our python nice and system-agnostic so we can sell it to anyone right? :) And you should really check out ViM and emacs if you don't use them... you will bang your head against the wall for a week getting used to it, then never want to touch a mouse again after that.
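If you'd rather not depend on bash at all, the cleanup step can be sketched in Python with just the standard library. remove_pyc is a made-up name, and this only covers the "delete the pyc files" part; a server process that is already running would still need to be restarted separately.

```python
from pathlib import Path


def remove_pyc(root="."):
    """Delete every .pyc file under root, subdirectories included."""
    removed = 0
    # Collect the matches first so the tree is not modified while walking it.
    for pyc in list(Path(root).rglob("*.pyc")):
        pyc.unlink()
        removed += 1
    return removed
```

Run it from the project directory, e.g. with python -c "import cleanup; cleanup.remove_pyc()" if you save it as cleanup.py (that file name is just an example).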

Right, so Python is a compiled language when run on a web server. It's outputting a .pyc file that's the compiled version. Your goal is to tell the web server that the .pyc file is out of date and is no longer valid.
You have a few options:
1. Delete the relevant .pyc file
2. For web.py, use the reloader middleware
3. Send it a HUP signal (I'm lazy and usually do killall -SIGHUP python). You can do this automatically with a file watching tool like watchdog (thanks barracel).
4. web.config.debug = True should be the default in your application
None of those options are working for you?
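To give an idea of what the file-watching option boils down to, here is a rough polling sketch using only the standard library. It is not watchdog's API; snapshot and changed_paths are made-up names, and what to do when something changes (send the HUP, restart the script) is left to the caller.

```python
import os


def snapshot(root, suffix=".py"):
    """Map each matching file under root to its last-modified time."""
    mtimes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                path = os.path.join(dirpath, name)
                mtimes[path] = os.stat(path).st_mtime
    return mtimes


def changed_paths(before, after):
    """Return paths added, removed, or modified between two snapshots."""
    return sorted(
        path
        for path in set(before) | set(after)
        if before.get(path) != after.get(path)
    )
```

A small loop that takes a snapshot every second and compares it with the previous one is enough to know when a restart is due.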

Ensuring a test case can delete the temporary directory it created

(Platform: Linux, specifically Fedora and Red Hat Enterprise Linux 6)
I have an integration test written in Python that does the following:
1. creates a temporary directory
2. tells a web service (running under apache) to run an rsync job that copies files into that directory
3. checks the files have been copied correctly (i.e. the configuration was correctly passed from the client through to an rsync invocation via the web service)
4. (tries to) delete the temporary directory
At the moment, the last step is failing because rsync is creating the files with their ownership set to that of the apache user, and so the test case doesn't have the necessary permissions to delete the files.
This Server Fault question provides a good explanation for why the cleanup step currently fails given the situation the integration test sets up.
What I currently do: I just don't delete the temporary directory in the test cleanup, so these integration tests leave dummy files around that need to be cleared out of /tmp manually.
The main solution I am currently considering is to add a setuid script specifically to handle the cleanup operation for the test suite. This should work, but I'm hoping someone else can suggest a more elegant solution. Specifically, I'd really like it if nothing in the integration test client needed to care about the uid of the apache process.
Approaches I have considered but rejected for various reasons:
Run the test case as root. This actually works, but needing to run the test suite as root is rather ugly.
Set the sticky bit on the directory created by the test suite. As near as I can tell, rsync is ignoring this because it's set to copy the flags from the remote server. However, even tweaking the settings to only copy the execute bit didn't seem to help, so I'm still not really sure why this didn't work.
Adding the test user to the apache group. As rsync is creating the files without group write permission, this didn't help.
Running up an Apache instance as the test user and testing against that. This has some advantages (in that the integration tests won't require that apache be already running), but has the downside that I won't be able to run the integration tests against an Apache instance that has been preconfigured with the production settings to make sure those are correct. So even though I'll likely add this capability to the test suite eventually, it won't be as a replacement for solving the current problem more directly.
One other thing I really don't want to do is change the settings passed to rsync just so the test suite can correctly clean up the temporary directory. This is an integration test for the service daemon, so I want to use a configuration as close to production as I can get.

Add the test user to the apache group (or httpd group, whichever has group ownership on the files).

With the assistance of the answers to that Server Fault question, I was able to figure out a solution using setfacl.
The code that creates the temporary directory for the integration test now does the following (it's part of a unittest.TestCase instance, hence the reference to addCleanup):
local_path = tempfile.mkdtemp().decode("utf-8")
self.addCleanup(shutil.rmtree, local_path)
acl = "d:u:{0}:rwX".format(os.geteuid())
subprocess.check_call(["setfacl", "-m", acl, local_path])
The first two lines just create the temporary directory and ensure it gets deleted at the end of the test.
The last two lines are the new part and set the default ACL for the directory such that the test user always has read/write access and will also have execute permissions for anything with the execute bit set.
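For reference, the same idea can be written as a standalone Python 3 helper, where tempfile.mkdtemp() already returns str and no decode is needed. This is a sketch rather than the code from my test suite: make_shared_tempdir and its run parameter are made up here so the setfacl call can be swapped out when the tool is not installed.

```python
import os
import subprocess
import tempfile


def make_shared_tempdir(run=subprocess.check_call):
    """Create a temp dir whose default ACL keeps the current user in control."""
    path = tempfile.mkdtemp()
    # "d:" sets the default ACL, inherited by files created in the directory;
    # "X" grants execute only where an execute bit is already set.
    acl = "d:u:{0}:rwX".format(os.geteuid())
    run(["setfacl", "-m", acl, path])
    return path
```

In a unittest.TestCase you would still register the cleanup yourself with self.addCleanup(shutil.rmtree, path), as in the snippet above.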

I change Python code, but can't see results

Sorry for the totally stupid question, but the situation is that I have to make some changes to a Django website, and I have about zero knowledge of python.
I've been reading the Django docs and found out where to make changes, but there is a very strange situation. When I change a view, template, config or anything on the web site, nothing happens.
It looks like the code is cached. When I completely delete the site's folder, everything works fine except that css stops working.
The only vital file that lies outside the site's folder is starter.py, with this code:
#!/usr/local/bin/pthon2.3
import sys, os
.... importing some pathes and other conf stuff
import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()
Please can anybody tell me what I am doing wrong?

Python web applications typically differ from PHP ones in that the software is not automatically reloaded once you change the source code. This makes sense because initialization, firing up the interpreter etc., doesn't have to be performed on every request. It's not that the code is "cached"; it's only loaded once. (Python does cache its bytecode, but it transparently detects changes, so you needn't worry about that.) So you need to find a means to restart the WSGI program. How this is done in your particular webhosting environment is for you to find out, with the help of the web host or system administrator.
In addition to this, Django does cache its views (if that feature is turned on). You'll need to invalidate the caches in that case.
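One restart mechanism worth asking the host about: if the site happens to run under mod_wsgi in daemon mode (an assumption, nothing in the question confirms it), updating the modification time of the WSGI script file is enough to make the processes reload the application. A minimal sketch, with touch as a made-up helper name:

```python
import os


def touch(path):
    """Set the modification time of path to now and return the new mtime."""
    os.utime(path, None)
    return os.stat(path).st_mtime
```

Here that would mean calling touch() on starter.py after each change. On other setups the file's timestamp is simply updated and nothing else happens.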
