Store static files on S3 but staticfiles.json manifest locally - python

I have a Django application running on Heroku. To store and serve my static files, I'm using django-storages with my S3 bucket, as well as the standard Django ManifestFilesMixin. I'm also using django-pipeline.
In code:
from django.contrib.staticfiles.storage import ManifestFilesMixin
from storages.backends.s3boto import S3BotoStorage
from pipeline.storage import PipelineMixin

class S3PipelineManifestStorage(PipelineMixin, ManifestFilesMixin, S3BotoStorage):
    pass
The setup works; however, the staticfiles.json manifest is also stored on S3. I can see two problems with that:
1. My app's storage instance would have to fetch staticfiles.json from S3, instead of just getting it from the local file system. This makes little sense performance-wise. The only consumer of the manifest file is the server app itself, so it might as well be stored on the local file system instead of remotely.
   I'm not sure how significant this issue is, since I suppose (or hope) that the server app caches the file after reading it once.
2. The manifest file is written during deployment by collectstatic, so if any already-running instances of the previous version of the server application read the manifest file from S3 before the deployment finishes and the new slug takes over, they could fetch the wrong static files: ones which should only be served for instances of the new slug.
   Note that specifically on Heroku, it's possible for new app instances to pop up dynamically, so even if the app does cache the manifest file, it's possible its first fetch of it would be during the deployment of the new slug.
   This scenario as described is specific to Heroku, but I guess there would be similar issues with other environments.
The obvious solution would be to store the manifest file on the local file system. Each slug would have its own manifest file, performance would be optimal, and there won't be any deployment races as described above.
Is it possible?

Some time ago I read this article, which I believe fits your case well.
Its last section says the following:
Where is staticfiles.json located?
By default staticfiles.json will reside in STATIC_ROOT which is the
directory where all static files are collected in.
We host all our static assets on an S3 bucket which means staticfiles.json by default would end up being synced to S3. However, we wanted it to live in the code directory so we could package it and ship it to each app server.
As a result of this, ManifestStaticFilesStorage will look for
staticfiles.json in STATIC_ROOT in order to read the mappings. We had
to overwrite this behaviour, so we subclassed ManifestStaticFilesStorage:
import os

from django.conf import settings
from django.contrib.staticfiles.storage import ManifestStaticFilesStorage


class KoganManifestStaticFilesStorage(ManifestStaticFilesStorage):

    def read_manifest(self):
        """
        Looks up staticfiles.json in Project directory
        """
        manifest_location = os.path.abspath(
            os.path.join(settings.PROJECT_ROOT, self.manifest_name)
        )
        try:
            # Open in binary mode so that .decode() also works on Python 3.
            with open(manifest_location, 'rb') as manifest:
                return manifest.read().decode('utf-8')
        except IOError:
            return None
With the above change, Django static template tag will now read the
mappings from staticfiles.json that resides in project root directory.
Haven't used it myself, so let me know if it helps!

The answer from @John, quoting Kogan, is great but doesn't give the full code needed to make this work. As @Danra mentioned, you also need to save staticfiles.json in the source folder. Here's the code I've created based on the above answer:
import json
import os
from django.conf import settings
from django.core.files.base import ContentFile
from django.core.files.storage import FileSystemStorage
from whitenoise.storage import CompressedManifestStaticFilesStorage
# or if you don't use WhiteNoiseMiddleware:
# from django.contrib.staticfiles.storage import ManifestStaticFilesStorage


class LocalManifestStaticFilesStorage(CompressedManifestStaticFilesStorage):
    """
    Saves and looks up staticfiles.json in Project directory
    """
    manifest_location = os.path.abspath(settings.BASE_DIR)  # or settings.PROJECT_ROOT depending on how you've set it up in your settings file.
    manifest_storage = FileSystemStorage(location=manifest_location)

    def read_manifest(self):
        try:
            with self.manifest_storage.open(self.manifest_name) as manifest:
                return manifest.read().decode('utf-8')
        except IOError:
            return None

    def save_manifest(self):
        payload = {'paths': self.hashed_files, 'version': self.manifest_version}
        if self.manifest_storage.exists(self.manifest_name):
            self.manifest_storage.delete(self.manifest_name)
        contents = json.dumps(payload).encode('utf-8')
        self.manifest_storage._save(self.manifest_name, ContentFile(contents))
Now you can use LocalManifestStaticFilesStorage for your STATICFILES_STORAGE. When running manage.py collectstatic, your manifest will be saved to your root project folder and Django will look for it there when serving the content.
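As a settings fragment, this could look roughly as follows. The dotted path myproject.storage is an assumption made for illustration; point it at wherever the class actually lives. On Django 4.2 and later the STORAGES setting takes over the role of STATICFILES_STORAGE, so set one or the other, not both.

```python
# settings.py (fragment). "myproject.storage" is a hypothetical module path.

# Older Django versions:
STATICFILES_STORAGE = "myproject.storage.LocalManifestStaticFilesStorage"

# Django 4.2 and later express the same choice through STORAGES instead:
# STORAGES = {
#     "default": {"BACKEND": "django.core.files.storage.FileSystemStorage"},
#     "staticfiles": {"BACKEND": "myproject.storage.LocalManifestStaticFilesStorage"},
# }
```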
If you have a deployment with multiple virtual machines, make sure to run collectstatic only once and copy the staticfiles.json file to all the machines in your deployment as part of your code deployment. The nice thing about this is that even if some machines don't have the latest update yet, they will still be serving the correct content (corresponding to the current version of the code), so you can perform a gradual deploy where there is a mixed state.
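To make the "each release ships its own manifest" idea concrete, here is a minimal standard-library sketch of what reading a local staticfiles.json amounts to. It is not Django's implementation: the function names are hypothetical, and falling back to the unhashed name for unknown files is a simplification chosen for this example. The payload shape ('paths' and 'version') follows the save_manifest code above.

```python
import json


def load_manifest_paths(manifest_path):
    """Return the name -> hashed-name mapping stored in a local manifest file.

    Returns an empty dict when the file does not exist.
    """
    try:
        with open(manifest_path, "rb") as fh:
            payload = json.loads(fh.read().decode("utf-8"))
    except FileNotFoundError:
        return {}
    return payload.get("paths", {})


def hashed_name(paths, name):
    """Look up the hashed file name, falling back to the original name."""
    return paths.get(name, name)
```

Because the mapping is read from a file that ships with the code, an old release keeps resolving to the hashed names it was built with, even while a newer release is being deployed.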

There is Django ticket #27590 that addresses this question. The ticket has a pull request that implements a solution, but it has not been reviewed yet.

Related

Referencing file in Django views.py

I am building a Django web interface (overkill - I know!) for a few small python functions. One of these transforms a txt file stored in Django root. Well it aims to.
I have the following setup in a few places:
with open('file.csv','r') as source:
    ...
However, without setting the entire directory on my machine (e.g. /home/...), it cannot find the file. I have tried putting this in the Static directory (as ideally I would like people to be able to download the file at a later stage) but same problem.
How do you work with files within Django? What is best practice to solve the above allowing someone to download it later?
If you only need the path:
import os
from django.conf import settings
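# STATIC_ROOT is where collectstatic gathers the files, so run collectstatic first.
file_path = os.path.join(settings.STATIC_ROOT, 'my_app', 'file.txt')
with open(file_path, 'r') as source:
    pass  # do stuff
Note that the file needs to live in a directory like my_app/static/my_app/file.txt; collectstatic then copies it to STATIC_ROOT/my_app/file.txt.
For more information, refer to the Django docs.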
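The underlying issue in the question is that a relative path passed to open() is resolved against the process's current working directory, not against the project. A small sketch of building the path from an explicit base directory instead; the helper names are hypothetical, and in a Django project base_dir would typically be settings.BASE_DIR, assuming your settings define it:

```python
from pathlib import Path


def resolve_data_file(base_dir, *parts):
    """Build an absolute path under base_dir, independent of the working directory."""
    return Path(base_dir).resolve().joinpath(*parts)


def read_text_file(base_dir, *parts):
    """Read a text file located under base_dir."""
    with resolve_data_file(base_dir, *parts).open("r", encoding="utf-8") as source:
        return source.read()
```

In a view this might be called as read_text_file(settings.BASE_DIR, 'data', 'file.csv'), where the data directory is just an example location.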

Where to run Python Code in Django once the file is uploaded?

My django application has a file uploader which uploads to a specific location in my local system.
It redirects to a new html page which shows successful message after upload is done.
Once the file is uploaded I need to do some processing of csv files.
I have a python code which does the processing.
Now my question is: where do I put the Python file in the Django project, and how do I call it so that it runs once the upload is done?
Any help is appreciated.
Thanks in advance
You can place it anywhere you like, it's just Python. Maybe in a csv_processing.py if it fits in a single module, or as a completely independent library if it's more. Django doesn't have an opinion on this.
The best way to run it is by doing it asynchronously using Celery.
Make sure the file is within a Python package; you do this by adding __init__.py to the directory (documentation here). In accordance with Django convention, you would place the file within the app that needs to use it, or within another app that you would name utils (documentation here).
Question:
I need the file to be run completely after the application uploads the file.
Answer:
new = Storage()
new.file = request.FILES['file']
new.save()
Now we have the database id (the object is assigned its id when it is saved to the database).
originalObj = Storage.objects.get(pk=new.id)
Now you can import the csv file and do modification here.
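As an illustration of the "it's just Python" advice, a processing module could be as small as the sketch below. The module name csv_processing.py comes from the answer above; the function name and what it computes are assumptions made for the example. In the view it would be called right after new.save(), for instance with new.file.path when the upload is stored on the local file system.

```python
# csv_processing.py (hypothetical module inside the app that handles uploads)
import csv


def summarize_csv(path):
    """Return (row_count, header) for the CSV file at path."""
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)
        # The first row is treated as the header; an empty file yields [].
        header = next(reader, [])
        row_count = sum(1 for _ in reader)
    return row_count, header
```

If the processing is slow, the same function can be handed to a task queue such as Celery instead of being called inline in the view.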

Downloading from a dynamic list of files in Django Admin

I currently have a Django app which generates PDFs and saves them to a specific directory. From the admin interface I want to have some way to view the list of files within that directory (similar to models.FilePathField()), but also be able to download them. I realize that Django was never intended to actually serve files, and I have been tinkering with django-sendfile as a possible option. It just doesn't seem like there is any way to create a dynamic list of files other than with FilePathField (which I don't believe can suit my purposes).
Would this project fit your needs? http://code.google.com/p/django-filebrowser/
I ended up realizing that I was going about the problem in a more complicated manner than was necessary. Using two separate views trivializes the issue. I was just under the impression that the admin interface would include such a basic feature.
What I did was create a download_list view to display the files in the directory, and a download_file view which uses django-sendfile to serve the file to the end user. download_list simply goes through the directory with listdir(), checks whether each extension is valid, and passes the complete file path to the download_file view (after the user selects one).
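The directory-scanning part of that approach can be sketched with the standard library alone. The function names and the allowed-extension set are assumptions for illustration; the second helper only accepts names that appear in the listing, so a crafted name such as ../settings.py is rejected before any file is served.

```python
import os

ALLOWED_EXTENSIONS = {".pdf"}  # assumption: only generated PDFs are offered


def list_downloadable(directory):
    """Return the sorted names of regular files with an allowed extension."""
    names = []
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)
        extension = os.path.splitext(name)[1].lower()
        if os.path.isfile(full_path) and extension in ALLOWED_EXTENSIONS:
            names.append(name)
    return names


def resolve_download(directory, requested_name):
    """Return the full path if requested_name is one of the listed files, else None."""
    if requested_name in list_downloadable(directory):
        return os.path.join(directory, requested_name)
    return None
```

A download_list view would render list_downloadable(...) as links, and a download_file view would pass the result of resolve_download(...) on to django-sendfile.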
Are the files in a directory that is served by your webserver? If all you want to do is list and download the files, it may be easier just to have a link to the directory with all the files in it and let the webserver take care of listing and serving the files. I understand this may not be the ideal solution for you, but it does avoid having Django serve static files, which is a task best left to the webserver.

Best way to structure a site with media to keep it separate from the base code. ( Opinion based )

I've recently taken a new server under my wing that has an interesting installation of Django on it. The previous developer mixed the media uploads in with the static content, and in other modules created a separate directory at the root level of the project. My first reaction to this was general annoyance. (I'm a huge fan of modular development.) However, after working to 'correct' it, it's raised a question.
Even though this question is tagged with django, feel free to post response according to java and asp.net.
How do you set up your static files? Do you stack everything inside a static directory, or do you take the time to link each module independently?
One of my tricks for every Django app I start is to put the following in the __init__.py of said app.
import os
from django.conf import settings as djsettings
TEMPLATES_DIR = (os.path.join(os.path.dirname(__file__),'./templates'),)
djsettings.TEMPLATES_DIR += TEMPLATES_DIR
I don't think your trick is really needed (anymore).
If you use django.template.loaders.app_directories.Loader (docs)
you can put a template dir in your app dir and Django will check it for your app-specific templates
And starting with Django 1.3 you can use the staticfiles app to do something similar with all your static media files. Check the docs for the staticfiles-finders
So finally, thanks to the collectstatic management command, you're now able to keep your static media files modularized (for easier development and distribution), yet you still can bundle them at a centralized place once it is time to deploy and serve your project.
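In current Django versions, the per-app lookup described above is usually switched on through settings rather than by patching them at import time. A sketch of the relevant fragment, assuming the stock Django template backend and the default finders (adjust to your project):

```python
# settings.py (fragment)

TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "DIRS": [],
        # Look for a "templates" directory inside each installed app.
        "APP_DIRS": True,
        "OPTIONS": {},
    },
]

STATIC_URL = "/static/"

# The second finder picks up a "static" directory inside each installed app.
STATICFILES_FINDERS = [
    "django.contrib.staticfiles.finders.FileSystemFinder",
    "django.contrib.staticfiles.finders.AppDirectoriesFinder",
]
```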
Right now I'm in the habit of putting a static folder in each app directory containing its static files. I keep templates in each app directory under templates. I alias the static path when putting the app behind nginx or apache.
One thing that I'm starting to do more of is putting static files such as javascript, css, or images behind a CDN like S3.

django authentication .htaccess static

In my app users can upload files for other users.
To make the uploaded files accessible only to the addressee, I need some kind of static files authentication system.
My idea is to create an Apache-hosted directory for each user and to limit access to this directory using .htaccess.
This means that each time a new Django user is created, I need to create a directory and the appropriate .htaccess file in it. I know I should do it using post_save signals on the User model, but I don't know how to create the .htaccess in the user's directory from Python. Can you help me with that?
Or perhaps you have better solution to my problem?
Use python to rewrite the .htaccess automatically?
Use a database with users and use Apache sessions to authenticate?
Why not have a PrivateUploadedFile object that has a field for the file and a m2m relation for any Users who are allowed to read that file? Then you don't have to mess with Apache conf at all...
import hashlib

from django.contrib.auth.models import User
from django.db import models


def generate_obfuscated_filename(instance, filename):
    # Hash the original name as a hex string (you could salt this with something).
    hashed_filename = hashlib.sha1(filename.encode('utf-8')).hexdigest()
    return u"your/upload/path/%s.%s" % (hashed_filename, filename.split(".")[-1])  # includes original file format extension


class PrivateUploadedFile(models.Model):
    file = models.FileField(upload_to=generate_obfuscated_filename)
    recipients = models.ManyToManyField(User)
    uploader = models.ForeignKey(User, related_name="files_uploaded", on_delete=models.CASCADE)  # on_delete is required on Django 2.0+

    def available_to(self, user):
        # Call this as my_uploaded_file_instance.available_to(request.user) or with any other user object.
        return user in self.recipients.all()  # NB: not particularly efficient, but can be tuned
Came across this django-sendfile which can be used to serve static files. Might be helpful.
Have Django handle authentication and authorization as normal, then use Apache's mod_xsendfile to have Apache handle sending the actual file. Remember to have the files uploaded to a place that cannot be accessed directly, ideally outside Apache's document root.
This question has a good example of how to implement this behaviour, but it basically boils down to setting response['X-Sendfile'] = file_path in your view.
django-sendfile does the same thing, but for several different web servers (and convenience shortcuts), and django-private-files is the same, but also implements PrivateFileField
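To show the shape of the X-Sendfile hand-off without tying it to a particular framework, here is a plain WSGI sketch rather than a Django view. The callables is_allowed and path_for are hypothetical and supplied by the caller, and the header only has an effect when the front-end server supports it (for example Apache with mod_xsendfile).

```python
def make_private_file_app(is_allowed, path_for):
    """Build a WSGI app that authorizes the request, then delegates the transfer.

    is_allowed(environ) -> bool and path_for(environ) -> str are hypothetical
    callables provided by the caller.
    """
    def app(environ, start_response):
        if not is_allowed(environ):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]
        # Empty body: a server with X-Sendfile support replaces it with the file.
        start_response("200 OK", [("X-Sendfile", path_for(environ))])
        return [b""]
    return app
```

The application only decides who may see the file; the bytes themselves are streamed by the web server, which is why the files should live outside the document root.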
Add a view that controls the authentication of the user, and serve the file via django's static files serving tools:
from django.views.static import serve


def get_file(request, some_id):
    # check that the user is allowed to see the file
    # obtain the file name:
    path = path_from_id(some_id)
    # serve the file:
    return serve(request, path, document_root=your_doc_root)
This is a perfectly secure solution, but perhaps not ideal if you serve an enormous number of files that way.
Edit: the disclaimer on the django page does not apply here. Obviously, it would be inefficient to serve all your files with static.serve. It is however secure in the sense that you only serve the files to the users that are allowed to.
