I have a page where users can upload PDF / image files to their profile. The model for these files is relatively straightforward:
class ResumeItemFile(models.Model):
    item = models.ForeignKey(ResumeItem, related_name='attachment_files')
    file = models.FileField(
        max_length=255, upload_to=RandomizedFilePath('resume_attachments'),
        verbose_name=_('Attachment'))
    name = models.CharField(max_length=255, verbose_name=_('Naam'), blank=True)
I am creating a view where all files linked to a profile (item) are gathered in a .zip file. I've got this working locally, but in production I run into the following error: NotImplementedError: This backend doesn't support absolute paths.
The main difference is that in production the media files are served through S3:
MEDIA_URL = 'https://******.s3.amazonaws.com/'
STATIC_URL = MEDIA_URL
DEFAULT_FILE_STORAGE = 'storages.backends.s3boto.S3BotoStorage'
STATICFILES_STORAGE = 'storages.backends.s3boto.S3BotoStorage'
In my view I created a list of the ResumeItemFile objects in the attachments variable; each entry is a tuple that looks like this: (ResumeItemFileObject, 'filename')
for file in attachments:
    storage = DefaultStorage()
    filename = file[1]
    file_extension = str(file[0].file).split('.')[-1]
    file_object = storage.open(file[0].file.path, mode='rb')
    zip_file.writestr(filename, file_object.read())  # zip_file is the ZipFile being built
    file_object.close()
Though this works fine locally, on staging it crashes on the file_object = storage.open(file[0].file.path, mode='rb') line.
If the backend does not support absolute paths, how am I to select the correct file? Does anyone have an idea of what I am doing wrong?
I think the problem comes from the fact that the s3boto storage class does not implement the path() method. As per the Django documentation,
For storage systems that aren’t accessible from the local filesystem,
this will raise NotImplementedError instead.
Instead of file.path use file.name in your code.
# file_object = storage.open(file[0].file.path, mode='rb')
file_object = storage.open(file[0].file.name, mode='rb')
You may want to look into the File object. It allows you to manipulate files in a largely Pythonic manner, but leverages the Django project's storage settings. In my case, this allows me to use local, on-disk storage locally and S3 in production:
https://docs.djangoproject.com/en/2.0/ref/files/file/
This will abstract away a lot of the boilerplate you're writing. There is an example here:
https://docs.djangoproject.com/en/2.0/topics/files/#the-file-object
Good luck!
How to download all files in a folder from GCS cloud bucket using python client api?
Files like .docx and .pdf
requirements.txt
google-cloud-storage
main.py
from google.cloud import storage
from os import makedirs

# use a downloaded credentials file to create the client, see
# https://cloud.google.com/storage/docs/reference/libraries#setting_up_authentication
# the docs tell you to export the file location, but I personally
# prefer the method used below as it allows for different credentials
# within the same application.
# IMHO separation of what each serviceaccount can access increases
# security tenfold. It's also useful when dealing with different
# projects in the same app.
#
# Note that you'll also have to give the serviceaccount the
# permission "Storage Object Viewer", or one with more permissions.
# Always grant the least permissions needed, for security reasons:
# https://cloud.google.com/storage/docs/access-control/iam-roles
cred_json_file_path = 'path/to/file/credentials.json'
client = storage.Client.from_service_account_json(cred_json_file_path)


def download_blob(bucket: storage.Bucket, remotefile: str, localpath: str = '.'):
    """downloads remotefile from the bucket to localpath"""
    localrelativepath = '/'.join(remotefile.split('/')[:-1])
    totalpath = f'{localpath}/{localrelativepath}'
    filename = f'{localpath}/{remotefile}'
    makedirs(totalpath, exist_ok=True)
    print(f'Current file details:\n remote file: {remotefile}\n local file: {filename}\n')
    blob = storage.Blob(remotefile, bucket)
    blob.download_to_filename(filename, client=client)


def download_blob_list(bucketname: str, bloblist: list, localpath: str = '.'):
    """downloads a list of blobs to localpath"""
    bucket = storage.Bucket(client, name=bucketname)
    for blob in bloblist:
        download_blob(bucket, blob, localpath)


def list_blobs(bucketname: str, remotepath: str = None, filetypes: list = []) -> list:
    """returns a list of blobs filtered by remotepath and filetypes;
    remotepath and filetypes are optional"""
    result = []
    blobs = list(client.list_blobs(bucketname, prefix=remotepath))
    for blob in blobs:
        name = str(blob.name)
        # skip "folder" names
        if not name.endswith('/'):
            # do we need to filter file types?
            if len(filetypes) > 0:
                for filetype in filetypes:
                    if name.endswith(filetype):
                        result.append(name)
            else:
                result.append(name)
    return result


bucketname = 'bucketnamegoeshere'
foldername = 'foldernamegoeshere'
filetypes = ['.pdf', '.docx']  # list of extensions to return
bloblist = list_blobs(bucketname, remotepath=foldername, filetypes=filetypes)
# I'm just using the bucketname as the local download location;
# it should work with any path
download_blob_list(bucketname, bloblist, localpath=bucketname)
I'm trying to attach a media file saved in an S3 bucket to an email, which I'm doing with this line of code:
email.attach_file(standard.download.url)
The model is defined as follows:
class Standard(models.Model):
    name = models.CharField(max_length=51)
    download = models.FileField(upload_to="standard_downloads/", null=True, blank=True)

    def __str__(self):
        return self.name
Within settings.py I have defined my media files as follows:
AWS_DEFAULT_ACL = 'public-read'
AWS_S3_CUSTOM_DOMAIN = '%s.s3.amazonaws.com' % AWS_STORAGE_BUCKET_NAME
AWS_S3_OBJECT_PARAMETERS = {
'CacheControl': 'max-age=86400',
}
DEFAULT_FILE_STORAGE = 'sme.storage_backends.MediaStorage'
MEDIA_ROOT = 'https://%s.s3.amazonaws.com/media/' % AWS_STORAGE_BUCKET_NAME
When trying to run the code I'm getting
No such file or directory:
'https:/bucket-name.s3.amazonaws.com/media/standard_downloads/filename.ext
Please note it is showing as https:/ (a single /). How do I correct this?
Here's the source code of attach_file from Django. It clearly says it attaches a file from the filesystem; it does not work with remote URLs. When you give it a URL, it treats it as a local path, and Path() collapses the double slash into a single one.
def attach_file(self, path, mimetype=None):
    """
    Attach a file from the filesystem.

    Set the mimetype to DEFAULT_ATTACHMENT_MIME_TYPE if it isn't specified
    and cannot be guessed.

    For a text/* mimetype (guessed or specified), decode the file's content
    as UTF-8. If that fails, set the mimetype to
    DEFAULT_ATTACHMENT_MIME_TYPE and don't decode the content.
    """
    path = Path(path)
    with path.open('rb') as file:
        content = file.read()
        self.attach(path.name, content, mimetype)
Django does not provide anything built-in for that. You will have to write something custom along the lines of the above code, using a library like requests or boto3. Basically the idea is to fetch the file from the remote url and then pass its content to attach().
Here's one example of how you could fetch the file on the fly:
import requests

response = requests.get("http://yoururl/somefile.pdf")
email.attach('somefile.pdf', response.content, mimetype="application/pdf")
A better way to do this would be to leverage default_storage, which will work whether you are using local file storage, S3 or any other storage backend:
from django.core.files.storage import default_storage

msg = EmailMessage(
    subject="Your subject",
    body="Your Message",
    from_email="email@yourdomain.com",
    to=["email@anotherdomain.com"],
)
filename = "standard_downloads/filename.ext"
with default_storage.open(filename, "rb") as fh:
    msg.attach(filename, fh.read())
msg.send()
My users can upload an image of themselves and use it as an avatar. Now I am struggling with how to retrieve a default fallback image when they haven't uploaded an image themselves.
The path to the avatar is "//mysite.com/avatar/username".
So far I have this code, which works fine when the user has uploaded an avatar themselves, but it gives me the following error when I try to retrieve the default image:
raise IOError(errno.EACCES, 'file not accessible', filename)
IOError: [Errno 13] file not accessible: '/Users/myuser/Documents/github/mysite/static/images/profile.png'
def get(self):
    path = self.request.path.split('/')
    action = self.get_action(path)
    if action:
        e = employees.filter('username = ', action).get()
        if e.avatar:
            self.response.headers['Content-Type'] = "image/png"
            self.response.out.write(e.avatar)
        else:
            self.response.headers['Content-Type'] = 'image/png'
            path = os.path.join(os.path.split(__file__)[0], 'static/images/profile.png')
            with open(path, 'rb') as f:
                self.response.out.write(f.read())
I have defined the "/static"-folder as a static_dir in my app.yaml.
I know I can place the profile.png in the root-folder, but I prefer to have it in the "/static/images"-folder.
Any ideas?
If you declared the file itself as a static_file, or its directory (or any directory in its filepath) as a static_dir, inside your app/service's .yaml config file, then by default it's not accessible to the application code.
You need to also configure it as application_readable. From Handlers element:
application_readable
Optional. Boolean. By default, files declared in static file handlers
are uploaded as static data and are only served to end users. They
cannot be read by an application. If this field is set to true, the
files are also uploaded as code data so your application can read
them. Both uploads are charged against your code and static data
storage resource quotas.
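For example, the relevant handler in app.yaml would look something like this (a sketch; the URL and directory names are taken from the question and may differ in your app):

```yaml
handlers:
- url: /static
  static_dir: static
  # also upload the files as code data so open() works from the app
  application_readable: true
```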
I have a file on my computer that I'm trying to serve up as JSON from a django view.
def serve(request):
    file = os.path.join(BASE_DIR, 'static', 'files', 'apple-app-site-association')
    response = HttpResponse(content=file)
    response['Content-Type'] = 'application/json'
    return response
What I get back is the path to the file when navigating to the URL
/Users/myself/Developer/us/www/static/files/apple-app-site-association
What am I doing wrong here?
os.path.join returns a string, which is why you get a path in the content of the response. You need to read the file at that path first.
For a static file
If the file is static and on disk, you could just return it using the webserver and avoid using Python and Django at all. If the file needs authentication to be downloaded, you could still handle that with Django and return an X-Sendfile header (this is dependent on the webserver).
Serving static files is a job for a webserver, Nginx and Apache are really good at this, while Python and Django are tools to handle application logic.
Simplest way to read a file
def serve(request):
    path = os.path.join(BASE_DIR, 'static', 'files', 'apple-app-site-association')
    with open(path, 'r') as myfile:
        data = myfile.read()
    response = HttpResponse(content=data)
    response['Content-Type'] = 'application/json'
    return response
This is inspired by How do I read a text file into a string variable in Python
For a more advanced solution
See dhke's answer on StreamingHttpResponse.
Additional information
Reading and writing files
Managing files with Django
If you feed HttpResponse a string as content, you tell it to serve that string as the HTTP body:
content should be an iterator or a string. If it’s an iterator, it should return strings, and those strings will be joined together to form the content of the response. If it is not an iterator or a string, it will be converted to a string when accessed.
Since you seem to be using your static storage directory, you might as well use staticfiles to handle content:
from django.contrib.staticfiles.storage import staticfiles_storage
from django.http.response import StreamingHttpResponse

file_path = os.path.join('files', 'apple-app-site-association')
response = StreamingHttpResponse(staticfiles_storage.open(file_path))
return response
As noted in Emile Bergeron's answer, for static files this should already be overkill, since those are supposed to be accessible from the outside anyway. So a simple redirect to static(file_path) should do the trick too (given that your webserver is correctly configured).
To serve an arbitrary file:
from django.http.response import StreamingHttpResponse

file_path = ...
response = StreamingHttpResponse(open(file_path, 'rb'))
return response
Note that from Django 1.10 and on, the file handle will be closed automatically.
Also, if the file is accessible from your webserver, consider using django-sendfile, so that the file's contents don't need to pass through Django at all.
In my project I've got S3 storages configured and working properly. Now I'm trying to configure direct uploads to S3 using s3direct. It is working almost fine: the user is able to upload the image and it gets stored in S3. The problems come when I save a reference to the image in the DB.
models.py
class FullResPicture(Audit):
    docfile = models.ImageField()
    picture = models.OneToOneField(Picture, primary_key=True)
settings.py
...
S3DIRECT_DESTINATIONS = {
    # Allow anybody to upload jpeg's and png's.
    'imgs': ('uploads/imgs', lambda u: u.is_authenticated(), ['image/jpeg', 'image/png'],
             'public-read', 'bucket-name'),
}
...
views.py
#Doc file is the url to the image that the user uploaded directly to S3
#https://s3-eu-west-1.amazonaws.com/bucket/uploads/imgs/picture.jpeg
fullRes = FullResPicture(docfile = form_list[1].cleaned_data['docfile'])
So if I look at my DB, I've got some images that work fine (those I uploaded using only django-storages), with a docfile value like this:
images/2015/08/11/image.jpg
When the application tries to access those images, S3 boto is able to get the image properly.
But then I've got the images uploaded directly from the user's browser. For those, I am storing the full url, so they look like this in the DB:
https://s3-eu-west-1.amazonaws.com/bucket/uploads/imgs/Most-Famous-Felines-034.jpg
When the application tries to access them, I've got this exception:
File "/Users/mariopersonal/Documents/dev/venv/pictures/lib/python2.7/site-packages/django/db/models/fields/files.py", line 49, in _get_file
    self._file = self.storage.open(self.name, 'rb')
File "/Users/mariopersonal/Documents/dev/venv/pictures/lib/python2.7/site-packages/django/core/files/storage.py", line 35, in open
    return self._open(name, mode)
File "/Users/mariopersonal/Documents/dev/venv/pictures/lib/python2.7/site-packages/storages/backends/s3boto.py", line 363, in _open
    name = self._normalize_name(self._clean_name(name))
File "/Users/mariopersonal/Documents/dev/venv/pictures/lib/python2.7/site-packages/storages/backends/s3boto.py", line 341, in _normalize_name
    name)
SuspiciousOperation: Attempted access to 'https:/s3-eu-west-1.amazonaws.com/bucket/uploads/imgs/Most-Famous-Felines-034.jpg' denied.
So apparently, s3boto doesn't like file references given as a full url.
For troubleshooting purposes, I tried hardcoding the stored value, so that instead of the full url only the last part was saved, but then I got this other exception when it tried to access the image:
IOError: File does not exist: uploads/imgs/Most-Famous-Felines-034.jpg
Does anybody know what is going wrong here? Does anybody have a working example of direct upload to S3 that stores the reference to the uploaded file in a model?
Thanks.
This is the way I fixed it, in case it helps somebody else. This solution applies if you already have django-storages working properly and django-s3direct uploading images from the client side, but you cannot make them work together.
Use the same bucket
The first thing I did was make sure that both django-storages and django-s3direct were configured to use the same bucket. As you already have both working separately, just check that they use the same bucket. For most users it's enough to do something like this:
settings.py
...
S3DIRECT_DESTINATIONS = {
    # Allow anybody to upload jpeg's and png's.
    'imgs': ('uploads/imgs', lambda u: u.is_authenticated(), ['image/jpeg', 'image/png'],
             'public-read', AWS_STORAGE_BUCKET_NAME),
}
...
Note that we are using AWS_STORAGE_BUCKET_NAME, which should already be defined for your django-storages configuration.
In my case it was a little more complex, as I am using a different bucket for different models.
Store only the key
When using s3direct, once the user has uploaded the image and submitted the form, our view receives the url where S3 has placed the image. If we store this url, django-storages won't be able to access the image, so what we have to do is store only the file's key.
The file's key is the path to the image inside the bucket. E.g. for the url https://s3-eu-west-1.amazonaws.com/bucket/uploads/imgs/Most-Famous-Felines-034.jpg the key is uploads/imgs/Most-Famous-Felines-034.jpg, so that is the value we need to store on our model. In my case I'm using this snippet to extract the key from the url:
def key_from_url(url, bucket):
    try:
        # skip past "<bucket>/" so only the key inside the bucket remains
        start = url.index(bucket) + len(bucket) + 1
        return url[start:]
    except ValueError:
        raise ValueError('The url provided does not match the bucket name')
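As a sanity check, the same extraction can be done by parsing the URL, assuming the path-style URLs shown above (the function name here is made up to avoid clashing with the snippet):

```python
from urllib.parse import urlparse


def key_from_path_style_url(url, bucket):
    """Return the object key from a path-style S3 url like
    https://s3-eu-west-1.amazonaws.com/<bucket>/<key>."""
    path = urlparse(url).path.lstrip('/')
    if not path.startswith(bucket + '/'):
        raise ValueError('The url provided does not match the bucket name')
    return path[len(bucket) + 1:]
```

For the example url above, key_from_path_style_url(url, 'bucket') returns 'uploads/imgs/Most-Famous-Felines-034.jpg', which is exactly the value django-storages expects in the model field.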
Once I made those changes, it worked seamlessly.
I hope this helps anybody in the same situation.