Django uploading file from disk to S3 using django-storages - python

In my Django project I use django-storages to save files uploaded via a form to S3.
The model is defined as:
class Uploads(models.Model):
    file = models.FileField(upload_to=GetUploadPath)
I'm making changes to the file that was uploaded via the form by saving it to disk, and then trying to pass a File object to the model.save() method.
s = r'C:\Users\XXX\File.csv'
with open(os.path.join(settings.MEDIA_ROOT, s), "rb") as f:
    file_to_s3 = File(f)
If I pass the file object using request.FILES.get('file'), the in-memory file gets uploaded properly; however, when I try to upload the modified file from disk, I get this error:
RuntimeError: Input C:\Users\XXX\File.csv of type: <class 'django.core.files.base.File'> is not supported.
I followed this post but it doesn't help; any thoughts, please?
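For what it's worth, a minimal sketch of how a file modified on disk is usually handed back to a FileField so that django-storages pushes it to S3 (the Uploads model and MEDIA_ROOT path come from the question; the helper name is made up):
import os
from django.conf import settings
from django.core.files import File

def save_modified_upload(relative_path):
    # Open the modified file from disk and let the FileField's save()
    # stream the content through django-storages to S3.
    with open(os.path.join(settings.MEDIA_ROOT, relative_path), "rb") as f:
        upload = Uploads()
        upload.file.save(os.path.basename(relative_path), File(f), save=True)
    return upload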

Related

Django - FileField - PDF vs octet-stream - AWS S3

I have a model with a file field like so:
class Document(models.Model):
    file = models.FileField(...)
Elsewhere in my application, I am trying to download a pdf file from an external url and upload it to the file field:
import requests
from django.core.files.base import ContentFile
...
# get the external file:
response = requests.get('<external-url>')
# convert to ContentFile:
file = ContentFile(response.content, name='document.pdf')
# update document:
document.file.save('document.pdf', content=file, save=True)
However, I have noticed the following behavior:
files uploaded via the django-admin portal have the content_type "application/pdf"
files uploaded via the script above have the content_type "application/octet-stream"
How can I ensure that files uploaded via the script have the "application/pdf" content_type? Is it possible to set the content_type on the ContentFile object? This is important for the frontend.
Other notes:
I am using AWS S3 as my file storage system.
Uploading a file from my local file storage via the script (i.e. using with open(...) as file:) still uploads the file as "application/octet-stream".
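One workaround, sketched here under the assumption that the django-storages S3 backend honours a content_type attribute on the file object (and otherwise guesses the type from the file name via mimetypes; exact behaviour varies by version), is to set the type explicitly before saving:
import requests
from django.core.files.base import ContentFile

response = requests.get('<external-url>')

# Give the file a .pdf name and an explicit content_type so the storage
# backend can send ContentType="application/pdf" instead of the default
# "application/octet-stream".
file = ContentFile(response.content, name='document.pdf')
file.content_type = 'application/pdf'  # assumption: the backend reads this attribute
document.file.save('document.pdf', content=file, save=True)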

Uploaded Files get overwritten in Dropbox

I am trying to upload users' files to Dropbox in Django. When I use the built-in open() function, it throws the following exception:
expected str, bytes or os.PathLike object, not TemporaryUploadedFile
When I don't, the file gets uploaded successfully but is blank (write mode).
UPLOAD HANDLER:
def upload_handler(DOC, PATH):
    dbx = dropbox.Dropbox(settings.DROPBOX_APP_ACCESS_TOKEN)
    with open(DOC, 'rb') as f:
        dbx.files_upload(f.read(), PATH)
    dbx.sharing_create_shared_link_with_settings(PATH)
How do I upload files or pass a mode to DropBox API without it being overwritten?
To specify a write mode when uploading files to Dropbox, pass the desired WriteMode to the files_upload method as the mode parameter. That would look like this:
dbx.files_upload(f.read(), PATH, mode=dropbox.files.WriteMode('overwrite'))
This only controls how Dropbox commits the file (see the WriteMode docs for info); it doesn't control what data you're uploading. In your code, it is uploading whatever is returned by f.read(), so make sure that's what you expect it to be.
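As for the original TypeError: DOC looks like a Django uploaded file object rather than a path, so it cannot be passed to open(). A minimal sketch of the handler that reads the uploaded object directly (assuming DOC comes from request.FILES, as the error message suggests):
import dropbox
from django.conf import settings

def upload_handler(DOC, PATH):
    dbx = dropbox.Dropbox(settings.DROPBOX_APP_ACCESS_TOKEN)
    # Read the uploaded file object directly instead of open()-ing a path,
    # and pass an explicit WriteMode so Dropbox knows how to commit the file.
    dbx.files_upload(DOC.read(), PATH, mode=dropbox.files.WriteMode('overwrite'))
    dbx.sharing_create_shared_link_with_settings(PATH)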

Error 500 while Uploading CSV file to S3 bucket using boto3 and python flask

I have kind of looked at all possible options.
I am using boto3 and python3.6 to upload a file to an S3 bucket. The funny thing is that while JSON and even .py files get uploaded fine, it throws Error 500 while uploading a CSV. On successful upload I am returning a JSON response to check all the values.
import boto3
from botocore.client import Config

@app.route("/upload", methods=['POST', 'GET'])
def upload():
    if request.method == 'POST':
        file = request.files['file']
        filename = secure_filename(file.filename)
        s3 = boto3.resource('s3', aws_access_key_id=os.environ.get('AWS_ACCESS_KEY_ID'), aws_secret_access_key=os.environ.get('AWS_SECRET_ACCESS_KEY'), config=Config(signature_version='s3v4'))
        s3.Bucket(os.environ.get('S3_BUCKET')).put_object(Key=filename, Body=open(filename, 'rb'), ContentEncoding='text/csv')
        return jsonify({'successful upload': filename, 'S3_BUCKET': os.environ.get('S3_BUCKET'), 'ke': os.environ.get('AWS_ACCESS_KEY_ID'), 'sec': os.environ.get('AWS_SECRET_ACCESS_KEY'), 'filepath': "https://s3.us-east-2.amazonaws.com/" + os.environ.get('S3_BUCKET') + "/" + filename})
Please help!!
You are getting a FileNotFoundError for file xyz.csv because the file does not exist.
This could be because the code in upload() does not actually save the uploaded file, it merely obtains a safe name for it and immediately tries to open it - which fails.
That it works for other files is probably due to the fact that those files already exist, perhaps left over from testing, so there is no problem.
Try saving the file to the file system using save() after obtaining the safe filename:
upload_file = request.files['file']
filename = secure_filename(upload_file.filename)
upload_file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
and then uploading it (assuming that you've configured an UPLOAD_FOLDER):
with open(os.path.join(app.config['UPLOAD_FOLDER'], filename), 'rb') as f:
    s3.Bucket(os.environ.get('S3_BUCKET')).put_object(Key=filename, Body=f, ContentType='text/csv')
return jsonify({...})
There is no need to actually save the file to the file system; it can be streamed directly to your S3 bucket using the stream attribute of the upload_file object:
upload_file = request.files['file']
filename = secure_filename(upload_file.filename)
s3 = boto3.resource('s3', aws_access_key_id='key', aws_secret_access_key='secret')
s3.Bucket('bucket').put_object(Key=filename, Body=upload_file.stream, ContentType=upload_file.content_type)
To make this more generic you should use the content_type attribute of the uploaded file as shown above.
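Putting the streaming approach together, a minimal sketch of the whole route (the environment-variable credentials and bucket name mirror the question and are assumptions about your setup):
import os
import boto3
from flask import Flask, request, jsonify
from werkzeug.utils import secure_filename

app = Flask(__name__)

@app.route("/upload", methods=['POST'])
def upload():
    upload_file = request.files['file']
    filename = secure_filename(upload_file.filename)
    s3 = boto3.resource(
        's3',
        aws_access_key_id=os.environ.get('AWS_ACCESS_KEY_ID'),
        aws_secret_access_key=os.environ.get('AWS_SECRET_ACCESS_KEY'),
    )
    # Stream the upload straight to S3 without touching the file system,
    # keeping the content type the browser supplied.
    s3.Bucket(os.environ.get('S3_BUCKET')).put_object(
        Key=filename,
        Body=upload_file.stream,
        ContentType=upload_file.content_type,
    )
    return jsonify({'uploaded': filename})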

Django and S3 direct uploads

In my project I've got S3 storage configured and working properly. Now I'm trying to configure direct uploads to S3 using s3direct. It is working almost fine: the user is able to upload the image and it gets stored in S3. The problems come when I save a reference to the image in the DB.
models.py
class FullResPicture(Audit):
    docfile = models.ImageField()
    picture = models.OneToOneField(Picture, primary_key=True)
settings.py
...
S3DIRECT_DESTINATIONS = {
    # Allow anybody to upload jpeg's and png's.
    'imgs': ('uploads/imgs', lambda u: u.is_authenticated(), ['image/jpeg', 'image/png'], 'public-read', 'bucket-name'),
}
...
views.py
#Doc file is the url to the image that the user uploaded directly to S3
#https://s3-eu-west-1.amazonaws.com/bucket/uploads/imgs/picture.jpeg
fullRes = FullResPicture(docfile = form_list[1].cleaned_data['docfile'])
So if I look at my DB, I've got some images that work fine (those I uploaded using only django-storages), with a docfile value like this:
images/2015/08/11/image.jpg
When the application tries to access those images, S3 boto is able to get the image properly.
But then I've got the images uploaded directly from the user's browser. For those, I am storing the full url, so they look like this in the DB:
https://s3-eu-west-1.amazonaws.com/bucket/uploads/imgs/Most-Famous-Felines-034.jpg
When the application tries to access them, I've got this exception:
File "/Users/mariopersonal/Documents/dev/venv/pictures/lib/python2.7/site-packages/django/db/models/fields/files.py", line 49, in _get_file
self._file = self.storage.open(self.name, 'rb')
File "/Users/mariopersonal/Documents/dev/venv/pictures/lib/python2.7/site-packages/django/core/files/storage.py", line 35, in open
return self._open(name, mode)
File "/Users/mariopersonal/Documents/dev/venv/pictures/lib/python2.7/site-packages/storages/backends/s3boto.py", line 363, in _open
name = self._normalize_name(self._clean_name(name))
File "/Users/mariopersonal/Documents/dev/venv/pictures/lib/python2.7/site-packages/storages/backends/s3boto.py", line 341, in _normalize_name
name)
SuspiciousOperation: Attempted access to 'https:/s3-eu-west-1.amazonaws.com/bucket/uploads/imgs/Most-Famous-Felines-034.jpg' denied.
So apparently, S3 boto doesn't like file references given as full urls.
For troubleshooting purposes, I tried hardcoding the value that is saved, so that instead of the full url it saves only the last part, but then I get this other exception when it tries to access the image:
IOError: File does not exist: uploads/imgs/Most-Famous-Felines-034.jpg
Does anybody know what is going wrong here? Does anybody have a working example of direct upload to S3 that stores the reference to the uploaded file in a model?
Thanks.
This is the way I fixed it, in case it helps somebody else. This solution applies if you already have django-storages working properly and django-s3direct uploading the images from the client side, but you cannot make the two work together.
Use the same bucket
The first thing I did was make sure that both django-storages and django-s3direct were configured to use the same bucket. As you already have them working separately, just check that both use the same bucket. For most users, that means something like this:
settings.py
...
S3DIRECT_DESTINATIONS = {
    # Allow anybody to upload jpeg's and png's.
    'imgs': ('uploads/imgs', lambda u: u.is_authenticated(), ['image/jpeg', 'image/png'], 'public-read', AWS_STORAGE_BUCKET_NAME),
}
...
Note that we are using AWS_STORAGE_BUCKET_NAME, which should already be defined in your django-storages configuration.
In my case it was a little more complex, as I am using a different bucket for different models.
Store only the key
When using s3direct, once the user has uploaded the image and submitted the form, our view receives the url where S3 has placed the image. If we store this url, django-storages won't be able to access the image later, so what we have to do is store only the file's key.
The file's key is the path to the image inside the bucket. E.g., for the url https://s3-eu-west-1.amazonaws.com/bucket/uploads/imgs/Most-Famous-Felines-034.jpg the key is uploads/imgs/Most-Famous-Felines-034.jpg, so that is the value we need to store on our model. In my case I'm using this snippet to extract the key from the url:
def key_from_url(url, bucket):
    try:
        index_of = url.index(bucket)
        # Drop everything up to and including the bucket name and the slash
        # after it, leaving only the key, e.g. 'uploads/imgs/...'.
        return url[index_of + len(bucket) + 1:]
    except ValueError:
        raise ValueError('The url provided does not match the bucket name')
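For example, with the url from above:
key_from_url('https://s3-eu-west-1.amazonaws.com/bucket/uploads/imgs/Most-Famous-Felines-034.jpg', 'bucket')
# -> 'uploads/imgs/Most-Famous-Felines-034.jpg'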
Once I made those changes, it worked seamlessly.
I hope this helps anybody in the same situation.

How to access temporary uploaded file in web2py?

I am using a SQLFORM to upload a video file and want to encode the video while it is uploading. But I noticed that the uploaded file is not saved to the uploads directory until it is completely uploaded. Is there a temporary file, and how can I access it? Thanks.
I'm not sure how you might be able to process the file while it is uploading (i.e., process the bytes as they are received by the server), but if you can wait until the file is fully uploaded, you can access the uploaded file as a Python cgi.FieldStorage object:
def upload():
    if request.vars.myfile:
        video = encode_video(request.vars.myfile.file)
        [do something with video]
    form = SQLFORM.factory(Field('myfile', 'upload',
                                 uploadfolder='/path/to/upload')).process()
    return dict(form=form)
Upon upload, request.vars.myfile will be a cgi.FieldStorage object, and the open file object will be in request.vars.myfile.file. Note, if the encoding takes a while, you might want to pass it off to a task queue rather than handle it in the controller.
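If the encoding is slow, here is a rough sketch of handing it off to web2py's built-in Scheduler rather than running it in the controller (the encode_video function and the db object are assumptions taken from the question's context):
# in a model file, e.g. models/scheduler.py
from gluon.scheduler import Scheduler

def encode_video_task(filename):
    # The long-running encoding runs in a scheduler worker, not in the controller.
    encode_video(filename)

scheduler = Scheduler(db)

# in the controller, after the form has been processed:
# scheduler.queue_task(encode_video_task, pvars=dict(filename=stored_filename))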
