Django and S3 direct uploads

In my project, django-storages is configured and working properly with S3. Now I'm trying to set up direct uploads to S3 using django-s3direct. It almost works: the user can upload an image and it gets stored in S3. The problem comes when I save a reference to the image in the DB.
models.py
class FullResPicture(Audit):
    docfile = models.ImageField()
    picture = models.OneToOneField(Picture, primary_key=True)
settings.py
...
S3DIRECT_DESTINATIONS = {
    # Allow anybody to upload jpeg's and png's.
    'imgs': ('uploads/imgs', lambda u: u.is_authenticated(), ['image/jpeg', 'image/png'], 'public-read', 'bucket-name'),
}
...
views.py
#Doc file is the url to the image that the user uploaded directly to S3
#https://s3-eu-west-1.amazonaws.com/bucket/uploads/imgs/picture.jpeg
fullRes = FullResPicture(docfile = form_list[1].cleaned_data['docfile'])
So if I look at my DB, I've got some images that work fine (those I upload using only django-storages) with a docfile value like this:
images/2015/08/11/image.jpg
When the application tries to access those images, S3 boto is able to get the image properly.
But then I've got the images uploaded directly from the user's browser. For those, I am storing the full url, so they look like this in the DB:
https://s3-eu-west-1.amazonaws.com/bucket/uploads/imgs/Most-Famous-Felines-034.jpg
When the application tries to access them, I've got this exception:
  File "/Users/mariopersonal/Documents/dev/venv/pictures/lib/python2.7/site-packages/django/db/models/fields/files.py", line 49, in _get_file
    self._file = self.storage.open(self.name, 'rb')
  File "/Users/mariopersonal/Documents/dev/venv/pictures/lib/python2.7/site-packages/django/core/files/storage.py", line 35, in open
    return self._open(name, mode)
  File "/Users/mariopersonal/Documents/dev/venv/pictures/lib/python2.7/site-packages/storages/backends/s3boto.py", line 363, in _open
    name = self._normalize_name(self._clean_name(name))
  File "/Users/mariopersonal/Documents/dev/venv/pictures/lib/python2.7/site-packages/storages/backends/s3boto.py", line 341, in _normalize_name
    name)
SuspiciousOperation: Attempted access to 'https:/s3-eu-west-1.amazonaws.com/bucket/uploads/imgs/Most-Famous-Felines-034.jpg' denied.
So apparently, S3 boto doesn't like file references stored as full URLs.
For troubleshooting purposes, I tried hardcoding the saved value so that, instead of the full URL, it stores only the last part, but then I get this other exception when it tries to access the image:
IOError: File does not exist: uploads/imgs/Most-Famous-Felines-034.jpg
Does anybody know what is going wrong here? Does anybody have a working example of direct upload to S3 that stores the reference to the uploaded file in a model?
Thanks.

This is how I fixed it, in case it helps somebody else. This solution applies if you already have django-storages working properly and django-s3direct uploading the images from the client side, but you cannot make them work together.
Use the same bucket
The first thing I did was make sure that both django-storages and django-s3direct were configured to use the same bucket. Since you already have both working separately, just check that they point at the same bucket. Most users just need to do something like this:
settings.py
...
S3DIRECT_DESTINATIONS = {
    # Allow anybody to upload jpeg's and png's.
    'imgs': ('uploads/imgs', lambda u: u.is_authenticated(), ['image/jpeg', 'image/png'], 'public-read', AWS_STORAGE_BUCKET_NAME),
}
...
Note that we are using AWS_STORAGE_BUCKET_NAME, which should already be defined for the django-storages configuration.
In my case it was a little more complex, as I am using different buckets for different models.
Store only the key
When using django-s3direct, once the user has uploaded the image and submitted the form, our view receives the URL where S3 has placed the image. If we store this URL, django-storages won't be able to access the image, so what we have to do is store only the file's key.
The file's key is the path to the image inside the bucket. E.g., for the URL https://s3-eu-west-1.amazonaws.com/bucket/uploads/imgs/Most-Famous-Felines-034.jpg the key is uploads/imgs/Most-Famous-Felines-034.jpg, so that is the value we need to store on our model. In my case I'm using this snippet to extract the key from the URL:
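def key_from_url(url, bucket):
    try:
        # Skip past the bucket name and the slash that follows it,
        # so that only the key (the path inside the bucket) is returned.
        index = url.index(bucket + '/')
        return url[index + len(bucket) + 1:]
    except ValueError:
        raise ValueError('The url provided does not match the bucket name')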
Once I made those changes, it worked seamlessly.
I hope this helps anybody in the same situation.
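The key extraction described above can also be sketched with the standard library's URL parser, which avoids matching the bucket name somewhere unexpected in the URL. This is only a sketch under assumptions: it targets Python 3 (the traceback in the question is from Python 2.7), it handles both the path-style URL shown in the question and the virtual-hosted style (bucket name as a subdomain), and it decodes percent-escapes in the path, which may or may not be what your setup needs.

```python
from urllib.parse import unquote, urlparse


def key_from_url(url, bucket):
    """Return the S3 object key for a path-style or virtual-hosted-style URL."""
    parsed = urlparse(url)
    # The key is the URL path, without the leading slash and percent-escapes.
    path = unquote(parsed.path).lstrip("/")
    # Virtual-hosted style: https://<bucket>.s3.amazonaws.com/<key>
    if parsed.netloc.startswith(bucket + "."):
        return path
    # Path style: https://s3-<region>.amazonaws.com/<bucket>/<key>
    prefix = bucket + "/"
    if path.startswith(prefix):
        return path[len(prefix):]
    raise ValueError("The URL provided does not match the bucket name")
```

In the view, the value returned for the submitted URL would then be the one assigned to docfile, instead of the full URL.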

Related

Django uploading file from disk to S3 using django-storages

In my Django project I use django-storages to save files to S3 uploaded via a Form.
Model is defined as
class Uploads(models.Model):
    file = models.FileField(upload_to=GetUploadPath)
I'm making changes to the file that was uploaded via Form by saving to disk and then trying to pass a File object to the model.save() method.
s = r'C:\Users\XXX\File.csv'  # raw string, so the backslashes are not treated as escapes
with open(os.path.join(settings.MEDIA_ROOT, s), "rb") as f:
    file_to_s3 = File(f)
If I pass the file object from request.FILES.get('file'), the in-memory file gets uploaded properly. However, when I try to upload the modified file from disk, I get this error:
RuntimeError: Input C:\Users\XXX\File.csv of type: <class 'django.core.files.base.File'> is not supported.
I followed this post but it doesn't help. Any thoughts, please?

Uploading image string to Google Drive using pydrive

I need to upload an image string (as the one you get from requests.get(url).content) to google drive using the PyDrive package. I checked a similar question but the answer accepted there was to save it in a temporary file on a local drive and then upload that.
However, I cannot do that because of local storage and permission restrictions.
The previously accepted answer was to use SetContentString(image_string.decode('utf-8')), since SetContentString requires a parameter of type str, not bytes.
However, this raised the error UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte, as noted in the comments on that answer.
Is there any way to do this without using a temporary file, using PIL/BytesIO/anything that can convert it to be uploaded correctly as a string or somehow using PIL manipulated as an image and uploaded using SetContentFile()?
A basic example of what I'm trying to do is:
img_content = requests.get('https://i.imgur.com/A5gIh7W.jpeg').content
file = drive.CreateFile({...})
file.SetContentString(img_content.decode('utf-8'))
file.Upload()

The pydrive documentation (Upload and update file content) says the following:
Managing file content is as easy as managing file metadata. You can set file content with either SetContentFile(filename) or SetContentString(content) and call Upload() just as you did to upload or update file metadata.
I also searched for a method for uploading binary data directly to Google Drive with pydrive, but I couldn't find one, so there might not be such a method. In this answer, I would therefore like to propose uploading the binary data with the requests module. The access token is retrieved from pydrive's authorization script. The sample script is as follows.
Sample script:
from pydrive.auth import GoogleAuth
import io
import json
import requests

url = 'https://i.imgur.com/A5gIh7W.jpeg'  # Please set the direct link of the image file.
filename = 'sample file'  # Please set the filename on Google Drive.
folder_id = 'root'  # Please set the folder ID. The file is put to this folder.

gauth = GoogleAuth()
gauth.LocalWebserverAuth()
metadata = {
    "name": filename,
    "parents": [folder_id]
}
files = {
    'data': ('metadata', json.dumps(metadata), 'application/json'),
    'file': io.BytesIO(requests.get(url).content)
}
r = requests.post(
    "https://www.googleapis.com/upload/drive/v3/files?uploadType=multipart",
    headers={"Authorization": "Bearer " + gauth.credentials.access_token},
    files=files
)
print(r.text)
Note:
This script assumes that your URL is a direct link to the image file. Please be careful about this.
In this case, uploadType=multipart is used. The official document says the following. Ref
Use this upload type to quickly transfer a small file (5 MB or less) and metadata that describes the file, in a single request. To perform a multipart upload, refer to Perform a multipart upload.
When you want to upload large data, please use the resumable upload. Ref
References:
Upload and update file content of pydrive
Upload file data of Drive API

TypeError: storage must be a werkzeug.FileStorage

I am trying to save images to the filesystem in a Flask app. It works fine: I can upload photos and retrieve them. But when I click Submit without selecting any image, I get an error:
TypeError: storage must be a werkzeug.FileStorage.
Any help, please!
I have very little experience, so I am out of my element here. The error comes from flask_uploads.py:
def save(self, storage, folder=None, name=None):
    """
    This saves a werkzeug.FileStorage into this upload set. If the
    upload is not allowed, an UploadNotAllowed error will be raised.
    Otherwise, the file will be saved and its name (including the folder)
    will be returned.
    :param storage: The uploaded file to save.
    :param folder: The subfolder within the upload set to save to.
    :param name: The name to save the file as. If it ends with a dot, the
                 file's extension will be appended to the end. (If you
                 are using `name`, you can include the folder in the
                 `name` instead of explicitly using `folder`, i.e.
                 ``uset.save(file, name="someguy/photo_123.")``
    """
    if not isinstance(storage, FileStorage):
        raise TypeError("storage must be a werkzeug.FileStorage")
from flask_uploads import UploadSet, configure_uploads, IMAGES

form = Add***()
if form.validate_on_submit():
    image_url = photos.url(photos.save(form.image.data))
    new_**** = ****(screenname=form.screenname.data, fullname=form.fullname.data, team=form.team.data, image=image_url)
    db.session.add(new_****)
    db.session.commit()
    flash("Added Successfully")
    return redirect(url_for('****'))
return render_template("****.html", form=form)
I expected the html to return to the page if the image was not selected.
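One possible guard for the behaviour expected above is to check whether a file was actually submitted before calling photos.save(). The helper below is a hypothetical sketch, not part of flask_uploads: has_upload is a made-up name, and it assumes that an empty file input reaches the view as None, as an empty string, or as a FileStorage-like object whose filename is empty (which one you get depends on the form library and field type in use).

```python
def has_upload(data):
    """Return True only if `data` looks like an uploaded file with a name."""
    # None and plain strings have no `filename` attribute, so they fall back
    # to the empty default; an empty FileStorage has filename == "".
    return bool(getattr(data, "filename", ""))
```

In the view, the call to photos.save(form.image.data) would then run only when has_upload(form.image.data) is true; otherwise the view can flash a message and fall through to render_template so the user returns to the form.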

Static files are still being served from file system instead of AWS-S3 in Flask

I have a script that generates an image.
That image is saved in a directory inside my OS static folder. image.png is saved in:
-static
  -images
    -monkeys
      - image.png
The server endpoint function should upload the image into my S3-bucket and return that static file from my bucket, not from my OS file system.
This does not work for some reason.
The image uploading works fine, I can see the image in the bucket, I'm just not able to serve the static image, I get an error:
"The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.". Basically it cannot find the image.
I am using flask-S3 in the following way:
app = Flask(__name__)
app.config['FLASKS3_BUCKET_NAME'] = os.environ.get('S3_BUCKET_NAMEING')
app.config['USE_S3_DEBUG'] = True
s3 = FlaskS3(app)
My endpoint for serving the static image:
@app.route('/image/monkey/<address>', methods=['GET'])
def monkey_image(address):
    # There is some code here that generates that image and places it
    # inside the monkeys folder. I did not include it because
    # it is not relevant to the question
    image = open(image_path, 'rb')
    S3_path = 'images/monkeysS3/' + monkey_image_name
    upload_image_to_s3_bucket(image, 'static/' + S3_path)
    return redirect(flask_url_for('static', filename=S3_path))
So the last two lines are the ones that matter.
upload_image_to_s3_bucket works. The issue comes from
return redirect(flask_url_for('static', filename=S3_path)).
It just can't find the image inside my S3 bucket.
This goes for development and production as well.
Thanks

Flask - Handling Form File & Upload to AWS S3 without Saving to File

I am using a Flask app to receive a multipart/form-data request with an uploaded file (a video, in this example).
I don't want to save the file in the local directory because this app will be running on a server, and saving it will slow things down.
I am trying to use the file object created by the Flask request.files[''] method, but it doesn't seem to be working.
Here is that portion of the code:
@bp.route('/video_upload', methods=['POST'])
def VideoUploadHandler():
    form = request.form
    video_file = request.files['video_data']
    if video_file:
        s3 = boto3.client('s3')
        s3.upload_file(video_file.read(), S3_BUCKET, 'video.mp4')
    return json.dumps('DynamoDB failure')
This returns an error:
TypeError: must be encoded string without NULL bytes, not str
on the line:
s3.upload_file(video_file.read(), S3_BUCKET, 'video.mp4')
I did get this to work by first saving the file and then accessing that saved file, so it's not an issue with catching the request file. This works:
video_file.save(form['video_id']+".mp4")
s3.upload_file(form['video_id']+".mp4", S3_BUCKET, form['video_id']+".mp4")
What would be the best method to handle this file data in memory and pass it to the s3.upload_file() method? I am using the boto3 methods here, and I am only finding examples with the filename used in the first parameter, so I'm not sure how to process this correctly using the file in memory. Thanks!

First you need to be able to access the raw data sent to Flask. This is not as easy as it seems, since you're reading a form. To read the raw stream you can use flask.request.stream, which behaves similarly to StringIO. The trick is that you cannot access request.form or request.files, because accessing those attributes will load the whole stream into memory or into a file.
You'll need some extra work to extract the right part of the stream (which unfortunately I cannot help you with because it depends on how your form is made, but I'll let you experiment with this).
Finally you can use the set_contents_from_file function from boto, since upload_file does not seem to deal with file-like objects (StringIO and such).
Example code:
import boto
import boto.s3.connection
from boto.s3.key import Key
from flask import request

@bp.route('/video_upload', methods=['POST'])
def VideoUploadHandler():
    # form = request.form <- Don't do that
    # video_file = request.files['video_data'] <- Don't do that either
    # This is a file-like object which does not only contain your video file
    video_file_and_metadata = request.stream
    # This is what you need to implement
    video_title, video_stream = extract_title_stream(video_file_and_metadata)
    # Then, upload to the bucket with boto (not boto3), whose Key API
    # accepts file-like objects
    conn = boto.connect_s3()
    bucket = conn.create_bucket(bucket_name, location=boto.s3.connection.Location.DEFAULT)
    k = Key(bucket)
    k.key = video_title
    k.set_contents_from_file(video_stream)
    return 'OK'
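If staying on boto3 is preferred, its S3 client also has an upload_fileobj method that takes a binary file-like object instead of a path. The sketch below is an alternative to the boto approach above, not the answer's method: upload_stream_to_s3 is a hypothetical helper name, and the client is passed in as a parameter so the function does not depend on how it was created. Note that, as mentioned above, reading request.files may still buffer the upload in memory or in a temporary file, so this avoids the explicit save() call rather than all buffering.

```python
def upload_stream_to_s3(s3_client, fileobj, bucket, key):
    """Send a binary file-like object to S3 without saving it under a local path."""
    # upload_fileobj reads from an object with a read() method that returns
    # bytes; upload_file, by contrast, expects a filename on disk.
    s3_client.upload_fileobj(fileobj, bucket, key)
    return key
```

In the question's view this could be called as upload_stream_to_s3(boto3.client('s3'), request.files['video_data'], S3_BUCKET, 'video.mp4'), assuming the uploaded file object behaves as a binary file-like object.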
