RESTful way to upload file along with some data in django - python

I am creating a web service with Django using Django REST framework.
Users are able to upload images and videos. Uploading media is a two-step action: first the user uploads the file and receives an ID, then in a separate request uses that ID to refer to the media (for example, to use it as a profile picture or in a chat message).
I need to know who is uploading the media, both for the HMAC authentication middleware and for setting the owner of the media in the database. All other requests are in JSON format and include a username field that is used by the HMAC middleware to retrieve the shared secret key.
It first came to my mind that the media upload API might look like this:
{
"username":"mjafar",
"datetime":"2015-05-08 19:05",
"media_type":"photo",
"media_data": /* base64 encoded image file */
}
But I thought that base64 encoding may have significant overhead for larger files like videos, or that there may be restrictions on the size of data that can be parsed as JSON or created on the client side. (This web service is supposed to communicate with an Android/iOS app, and those have limited memory.) Is this a good solution? Are my concerns real problems, or shouldn't I worry? Are there better solutions?
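To put a number on the overhead concern: standard base64 emits 4 output characters for every 3 input bytes, so the encoded payload is roughly a third larger than the raw file. A minimal sketch (the helper name `base64_size` and the 3 MB figure are illustrative assumptions, not part of the question):

```python
import base64
import math


def base64_size(raw_size: int) -> int:
    """Length of standard padded base64 output for raw_size input bytes."""
    return 4 * math.ceil(raw_size / 3)


raw = bytes(3_000_000)  # stand-in for a 3 MB photo
encoded = base64.b64encode(raw)

print(len(encoded))             # 4000000
print(len(encoded) / len(raw))  # about 1.33
```

The size increase itself is modest. The memory concern is more likely to come from a client or server that has to hold the whole JSON string in memory before it can decode it.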

You could separate the two: expose the metadata at one interface, with a URL pointing to the actual file. Depending on how you store the actual file, you could then reference the file directly via its URL at a later point.
You could then have the POST API accept the file directly and simply return the JSON metadata:
{
"username":"mjafar", // inferred from self.request.user
"datetime":"2015-05-08 19:05", // timestamp on server
"media_type":"photo", // inferred from header content-type?
// auto-generated hashed location for file
"url": "/files/1dde/2ecf/4075/f61b/5a9c/1cec/53e0/ca9b/4b58/c153/09da/f4c1/9e09/4126/271f/fb4e/foo.jpg"
}
Creating such an interface using DRF would be more along the lines of implementing rest_framework.views.APIView
Here's what I'm doing for one of my sites:
class UploadedFile(models.Model):
    creator = models.ForeignKey(auth_models.User, blank=True)
    creation_datetime = models.DateTimeField(blank=True, null=True)
    title = models.CharField(max_length=100)
    file = models.FileField(max_length=200, upload_to=FileSubpath)
    sha256 = models.CharField(max_length=64, db_index=True)
    def save(self, *args, **kw_args):
        if not self.creation_datetime:
            self.creation_datetime = UTC_Now()
        super(UploadedFile, self).save(*args, **kw_args)
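FileSubpath is not shown in the answer. The example URL earlier is made of 16 four-character hex segments, which is consistent with a SHA-256 hex digest split into directories. A minimal sketch under that assumption (`hashed_subpath` is a hypothetical helper, not the author's code):

```python
import hashlib
import posixpath


def hashed_subpath(hex_digest: str, filename: str, width: int = 4) -> str:
    """Split a hex digest into fixed-width directory names and append the file name."""
    parts = [hex_digest[i:i + width] for i in range(0, len(hex_digest), width)]
    # basename() drops any directory components a client may have put in the name
    return posixpath.join(*parts, posixpath.basename(filename))


digest = hashlib.sha256(b"example file contents").hexdigest()
path = hashed_subpath(digest, "foo.jpg")
print(path.count("/"))  # 16 separators: 16 directory segments, then the file name
```

Django accepts a callable for upload_to that receives the model instance and the original file name, so a module-level function with that signature could delegate to a helper like this one.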
The serializer:
class UploadedFileSerializer(serializers.HyperlinkedModelSerializer):
    class Meta:
        model = UploadedFile
        fields = ('url', 'creator', 'creation_datetime', 'title', 'file')
And the view to use this:
from rest_framework.views import APIView
from qc_srvr import serializers,models
from rest_framework.response import Response
from rest_framework import status
from rest_framework import parsers
from rest_framework import renderers
import django.contrib.auth.models as auth_models
import hashlib
class UploadFile(APIView):
    '''A page for uploading files.'''
    throttle_classes = ()
    permission_classes = ()
    # MultiPartParser is required for multipart form uploads to populate request.FILES.
    parser_classes = (parsers.FormParser, parsers.MultiPartParser, parsers.JSONParser,)
    renderer_classes = (renderers.JSONRenderer,)
    serializer_class = serializers.UploadedFileSerializer
    def calc_sha256(self, afile):
        '''Return the salted SHA-256 hex digest of afile, read in 64 KiB blocks.'''
        hasher = hashlib.sha256()
        blocksize = 65536
        hasher.update(b'af1f9847d67300b996edce88889e358ab81f658ff71d2a2e60046c2976eeebdb')  # salt
        buf = afile.read(blocksize)
        while len(buf) > 0:
            hasher.update(buf)
            buf = afile.read(blocksize)
        return hasher.hexdigest()
    def post(self, request):
        # is_authenticated is a method on the Django versions of that era; newer versions expose it as a property
        if not request.user.is_authenticated():
            return Response('User is not authenticated.', status=status.HTTP_401_UNAUTHORIZED)
        uploaded_file = request.FILES.get('file', None)
        if not uploaded_file:
            return Response('No upload file was specified.', status=status.HTTP_400_BAD_REQUEST)
        # calculate sha
        sha256 = self.calc_sha256(uploaded_file)
        # does the file already exist?
        existing_files = models.UploadedFile.objects.filter(sha256=sha256)
        if len(existing_files):
            serializer = self.serializer_class(instance=existing_files[0], context={'request': request})
        else:
            instance = models.UploadedFile.objects.create(
                creator=request.user,
                title=uploaded_file.name,
                file=uploaded_file,
                sha256=sha256)
            serializer = self.serializer_class(instance=instance, context={'request': request})
        return Response(serializer.data)
FYI, this is a bit of security-through-obscurity since all the uploaded files are retrievable if you have the URL to the file.
I'm still using DRF 2.4.4, so this may not work for you on 3+. I haven't upgraded due to the dropped nested-serializers support.

Related

Saving PDFs to disk as they are generated with django-wkhtmltopdf

What I'm trying to implement is this:
1. User sends query parameters from React FE microservice to the Django BE microservice.
   - URI is something like /api/reports?startingPage=12&dataView=Region
   - These PDFs are way too big to be generated in FE, so doing it server side
2. Request makes its way into the view.py where the data related to dataView=Region is queried from the database, each row is iterated through and a PDF report is generated for each item
   - Each dataView=Region can consist of a few hundred items and each of those items is its own report that can be a page long or several pages long
3. As the reports are generated, they should be saved to the server persistent volume claim and not be sent back to FE until they have all run.
4. When they have all run, I plan to use pypdf2 to combine all of the PDFs into one large file.
5. At that point, the file is sent back to the FE to download.
I'm only working on 1. and 3. at this point and I'm unable to:
- Get the files to save to storage
- Prevent the default behavior of the PDF being sent back to the FE after it has been generated
The PDFs are being generated, so that is good.
I'm trying to implement the suggestions as found here, but I'm not getting the desired results:
Save pdf from django-wkhtmltopdf to server (instead of returning as a response)
This is what I currently have on the Django side:
# urls.py
from django.urls import path
from .views import GeneratePDFView
app_name = 'Reports'
urlpatterns = [
    path('/api/reports',
         GeneratePDFView.as_view(), name='generate_pdf'),
]
# views.py
from django.conf import settings
from django.views.generic.base import TemplateView
from rest_framework.permissions import IsAuthenticated
from wkhtmltopdf.views import PDFTemplateResponse
# Create your views here.
class GeneratePDFView(TemplateView):
    permission_classes = [IsAuthenticated]
    template_name = 'test.html'
    filename = 'test.pdf'
    def generate_pdf(self, request, **kwargs):
        context = {'key': 'value'}
        # generate response
        response = PDFTemplateResponse(
            request=self.request,
            template=self.template_name,
            filename=self.filename,
            context=context,
            cmd_options={'load-error-handling': 'ignore'})
        self.save_pdf(response.rendered_content, self.filename)
    # Handle saving the document
    # This is what I'm using elsewhere where files are saved and it works there
    def save_pdf(self, file, filename):
        with open(settings.PDF_DIR + '/' + filename, 'wb+') as destination:
            for chunk in file.chunks():
                destination.write(chunk)
# settings.py
...
DOWNLOAD_ROOT = '/mnt/files/client-downloads/'
MEDIA_ROOT = '/mnt/files/client-submissions/'
PDF_DIR = '/mnt/files/pdf-sections/'
...
I should note the other DOWNLOAD_ROOT and MEDIA_ROOT are working fine where the app uses them. I've even tried using settings.MEDIA_ROOT because I know it works, but still nothing is saved there. But as you can see, I'm starting out super basic and haven't added a query, loops, etc.
My save_pdf() is different than the SO question I linked to because that is what I'm using in other parts of my application and it is saving files fine there. I did try what they provided in the SO question, but had the same results with it not saving. That being:
with open("file.pdf", "wb") as f:
f.write(response.rendered_content)
So what do I need to do to get these PDFs to save to disk?
Perhaps I need to be using a different library for my needs as django-wkhtmltopdf seems to do a number of things out of the box that I don't want that I'm not clear I can override.
OK, my smooth brain gained a few ripples overnight and figured it out this morning:
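# views.py
import os
from django.conf import settings
from django.http import HttpResponse
from django.views.generic.base import TemplateView
from rest_framework.permissions import IsAuthenticated
from wkhtmltopdf.views import PDFTemplateResponse
class GeneratePDFView(TemplateView):
    # Note: permission_classes is a DRF attribute; a plain Django TemplateView does not enforce it.
    permission_classes = [IsAuthenticated]
    def get(self, request, *args, **kwargs):
        template_name = 'test.html'
        filename = 'test.pdf'
        context = {'key': 'value'}
        # generate response
        response = PDFTemplateResponse(
            request=request,
            template=template_name,
            filename=filename,
            context=context,
            cmd_options={'load-error-handling': 'ignore'})
        # write the rendered content to a file
        with open(os.path.join(settings.PDF_DIR, filename), "wb") as f:
            f.write(response.rendered_content)
        return HttpResponse('Hello, World!')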
This saved the PDF to disk and also did not respond with the PDF. Obviously a minimally functioning example that I can expand on, but at least got those two issues figured out.
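Two things differ from the first attempt: the work now happens in get(), which the view actually dispatches to (nothing called generate_pdf()), and the rendered content is written directly as bytes instead of being iterated with .chunks(), which belongs to Django's uploaded-file objects. For the later step of saving one PDF per item, the write could be factored out roughly like this. It is a standard-library sketch only: `save_pdf_bytes`, the file-name pattern, and the placeholder bytes are assumptions standing in for response.rendered_content.

```python
import tempfile
from pathlib import Path


def save_pdf_bytes(content: bytes, directory: str, filename: str) -> Path:
    """Write already-rendered PDF bytes to directory/filename and return the path."""
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / Path(filename).name  # drop any directory part of filename
    target.write_bytes(content)
    return target


# Usage with placeholder bytes; a temporary directory stands in for settings.PDF_DIR.
with tempfile.TemporaryDirectory() as pdf_dir:
    for item_id in (1, 2, 3):
        fake_pdf = b"%PDF-1.4 placeholder for item " + str(item_id).encode()
        saved = save_pdf_bytes(fake_pdf, pdf_dir, f"report-{item_id}.pdf")
        print(saved.name, saved.stat().st_size)
```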

Why my image is not readable after uploaded on the Django Rest Framework?

I am new to Django and I would like to implement a request that allows uploading a file.
I wrote some code for this, but when I open the saved file locally, my computer says it may be damaged.
I don't understand why, because the file has the same size as the one I sent with Postman.
Here is my code:
view.py
def handle_uploaded_file(f):
    with open(f.name, 'wb+') as destination:
        for chunk in f.chunks():
            destination.write(chunk)
class FileUploadView(APIView):
    parser_classes = (FileUploadParser,)
    def put(self, request, filename, format="png"):
        file_obj = request.data['file']
        handle_uploaded_file(file_obj)
        return Response(filename, status.HTTP_201_CREATED)
Not sure if you're writing the files into the system manually but Django already has a way to handle the uploading of files - and DRF just builds on top of that. All you have to do is create a model with a FileField or any field that extends from it.
class Upload(models.Model):
    user_upload = models.FileField(upload_to='path/to/upload')
Bear in mind that the database does not store the file - it only stores the path to the file. The file is directly uploaded to the path you've specified in the field. More info about upload_to here.
To upload using DRF - all you have to do is create a serializer using ModelSerializer and use a generic API view like CreateAPIView unless you have other requirements.
Your ModelSerializer can be something like:
class UploadFileSerializer(serializers.ModelSerializer):
    class Meta:
        model = Upload  # reference the model above
        fields = '__all__'
And in your views:
class UploadFileView(CreateAPIView):
    serializer_class = UploadFileSerializer
    def create(self, request, *args, **kwargs):
        # In DRF 3, request.data already includes uploaded files; serializers no longer accept files=
        serializer = self.get_serializer(data=request.data)
        serializer.is_valid(raise_exception=True)
        self.perform_create(serializer)
        headers = self.get_success_headers(serializer.data)
        return Response(serializer.data, status=status.HTTP_201_CREATED, headers=headers)
That should do the trick!

Wagtail: Serializing page model

I am using Wagtail as a REST backend for a website. The website is built using React and fetches data via Wagtail's API v2.
The SPA website needs to be able to show previews of pages in Wagtail. My thought was to override serve_preview on the page model and simply serialize the new page as JSON and write it to a cache which could be accessed by my frontend. But I'm having trouble serializing my page to JSON. All attempts made feel very "hackish".
I've made several attempts using extensions of Wagtail's built-in serializers, but without success:
Attempt 1:
def serve_preview(self, request, mode_name):
    from wagtail.api.v2.endpoints import PagesAPIEndpoint
    endpoint = PagesAPIEndpoint()
    setattr(request, 'wagtailapi_router',
            WagtailAPIRouter('wagtailapi_v2'))
    endpoint.request = request
    endpoint.action = None
    endpoint.kwargs = {'slug': self.slug, 'pk': self.pk}
    endpoint.lookup_field = 'pk'
    serializer = endpoint.get_serializer(self)
Feels very ugly to use router here and set a bunch of attrs
Attempt 2:
def serve_preview(self, request, mode_name):
    from wagtail.api.v2.endpoints import PagesAPIEndpoint
    fields = PagesAPIEndpoint.get_available_fields(self)
    if hasattr(self, 'api_fields'):
        fields.extend(self.api_fields)
    serializer_class = get_serializer_class(
        type(self), fields, meta_fields=[PagesAPIEndpoint.meta_fields], base=PageSerializer)
    serializer = serializer_class(self)
Better, but I get context issues:
Traceback (most recent call last):
...
File "/usr/local/lib/python3.5/site-packages/wagtail/api/v2/serializers.py", line 92, in to_representation
self.context['view'].seen_types[name] = page.specific_class
KeyError: 'view'
Any thoughts?
Solved it by diving through the source code.
First define an empty dummy view:
class DummyView(GenericViewSet):
    def __init__(self, *args, **kwargs):
        super(DummyView, self).__init__(*args, **kwargs)
        # seen_types is a mapping of type name strings (format: "app_label.ModelName")
        # to model classes. When an object is serialised in the API, its model
        # is added to this mapping. This is used by the Admin API which appends a
        # summary of the used types to the response.
        self.seen_types = OrderedDict()
Then use this view and set the context of your serializer manually. I'm also using the same router as in my API in the context. It has methods which are called by the PageSerializer to resolve some fields. It's kind of strange that it is so tightly coupled with the Wagtail API, but at least this works:
def serve_preview(self, request, mode_name):
    import starrepublic.api as StarApi
    fields = StarApi.PagesAPIEndpoint.get_available_fields(self)
    if hasattr(self, 'api_fields'):
        fields.extend(self.api_fields)
    serializer_class = get_serializer_class(
        type(self), fields, meta_fields=[StarApi.PagesAPIEndpoint.meta_fields], base=PageSerializer)
    serializer = serializer_class(
        self, context={'request': request, 'view': DummyView(), 'router': StarApi.api_router})
Don't forget to import:
from collections import OrderedDict
from wagtail.api.v2.serializers import PageSerializer, get_serializer_class
from rest_framework.viewsets import GenericViewSet
from rest_framework import status
from rest_framework.response import Response
from django.http import JsonResponse
from django.http import HttpResponse
Possibly a non-answer answer, but I too have had challenges in the area of DRF, Wagtail's layering on top of DRF, and the need to cache json results (DRF has no built-in caching as far as I can tell, so that's an additional challenge). In a recent project, I ended up just building a list of dictionaries in a view and sending them back out with HttpResponse(), bypassing DRF and Wagtail API altogether. The code ended up simple, readable, and was easy to cache:
import json
from django.http import HttpResponse
from django.core.cache import cache
# The rest is the body of a view function:
data = cache.get('mydata')
if not data:
    datalist = []
    for foo in bar:
        somedata = {}
        # Populate somedata, "serializing" fields manually...
        datalist.append(somedata)
    # Cache for a week.
    data = datalist
    cache.set('mydata', datalist, 60 * 60 * 24 * 7)
return HttpResponse(json.dumps(data), content_type='application/json')
Not as elegant as using the pre-built REST framework, but sometimes the simpler approach is just more productive...

How can I safely look at the request body in a Django REST framework Authentication?

I've implemented a webhook. The caller provides a shared secret. They concatenate the request URL and body, sign it with the secret, and provide the signature in a header. To validate the signature, I need to repeat the process and compare the signature that I get to the one that the caller got.
I'm using Django REST framework. I'm checking the signature in a custom Authentication. The framework calls the Authentication's authenticate method with the request as a parameter. I can get the request body from request.stream.read(). However, that consumes the stream, which prevents the framework from parsing it for me: when the framework calls the view the request has no data attribute as it normally would.
request.stream doesn't allow me to seek(0). The request doesn't allow me to replace the stream with a rewound one such as a StringIO. My current workaround is to store the body in request.auth and parse it myself in the view, which works but is ugly.
Is there a way to access the request body safely in an Authentication?
Here's the version that works:
In myapp/authentications.py:
import hmac
from hashlib import sha1
from rest_framework.authentication import BaseAuthentication
from rest_framework.exceptions import AuthenticationFailed
class WebhookAuthentication(BaseAuthentication):
    def authenticate(self, request):
        actual_signature = request.META['X-Vendor-Signature']
        body, expected_signature = self.expected_signature(request)
        if actual_signature != expected_signature:
            raise AuthenticationFailed("Request X-Vendor-Signature did not match expected signature")
        return None, body  # Hack: preserve the body in request.auth
    def expected_signature(self, request):
        body = request.stream.read()
        raw = request.build_absolute_uri(request.META['PATH_INFO']) + body
        signature = hmac.new("the shared secret", raw, sha1).digest().encode("base64")
        return body, signature
In myapp/views.py:
import json
from rest_framework.views import APIView
from serializers import SomeSerializer
from myapp.authentications import WebhookAuthentication
class SomeView(APIView):
    authentication_classes = (WebhookAuthentication,)
    permission_classes = ()
    def post(self, request, _=None):
        data = json.loads(request.auth)  # Hack: get the body from where we stashed it
        serializer = SomeSerializer(data=data)  # I'd like to be able to use request.data here
        # Use the serializer ...
How can I do better than the hack?
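Separately from the stream problem, the signing step itself can be sketched with the standard library alone. This assumes, as described above, an HMAC-SHA1 over the URL followed by the raw body, base64-encoded; the function names and the example URL are hypothetical. It does not solve the question of reading the body without consuming the stream.

```python
import base64
import hashlib
import hmac


def expected_signature(secret: bytes, url: str, body: bytes) -> str:
    """Base64-encoded HMAC-SHA1 of the request URL followed by the raw body."""
    mac = hmac.new(secret, url.encode("utf-8") + body, hashlib.sha1)
    return base64.b64encode(mac.digest()).decode("ascii")


def signature_matches(secret: bytes, url: str, body: bytes, received: str) -> bool:
    """Compare in constant time; strip() tolerates surrounding whitespace in the header."""
    expected = expected_signature(secret, url, body)
    return hmac.compare_digest(expected.encode("utf-8"), received.strip().encode("utf-8"))


secret = b"the shared secret"
url = "https://example.com/webhooks/vendor/"  # hypothetical endpoint
body = b'{"event": "ping"}'

good = expected_signature(secret, url, body)
print(signature_matches(secret, url, body, good))         # True
print(signature_matches(secret, url, body + b" ", good))  # False
```

hmac.compare_digest is used instead of != so that the comparison time does not depend on how many leading characters match.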

Django REST Framework upload image: "The submitted data was not a file"

I am learning how to upload files in Django, and I have run into a should-be-trivial problem, with the error:
The submitted data was not a file. Check the encoding type on the form.
Below is the detail.
Note: I also looked at Django Rest Framework ImageField, and I tried
serializer = ImageSerializer(data=request.data, files=request.FILES)
but I get
TypeError: __init__() got an unexpected keyword argument 'files'
I have an Image model which I would like to interact with via Django REST framework:
models.py
class Image(models.Model):
    image = models.ImageField(upload_to='item_images')
    owner = models.ForeignKey(
        User, related_name='uploaded_item_images',
        blank=False,
    )
    time_created = models.DateTimeField(auto_now_add=True)
serializers.py
class ImageSerializer(serializers.ModelSerializer):
    image = serializers.ImageField(
        max_length=None, use_url=True,
    )
    class Meta:
        model = Image
        fields = ("id", 'image', 'owner', 'time_created', )
settings.py
'DEFAULT_PARSER_CLASSES': (
'rest_framework.parsers.JSONParser',
'rest_framework.parsers.FormParser',
'rest_framework.parsers.MultiPartParser',
),
The front end (using AngularJS and angular-restmod or $resource) sends JSON data with owner and image in the form:
Input:
{"owner": 5, "image": "..."}
In the backend, request.data shows
{u'owner': 5, u'image': u'...'}
But then ImageSerializer(data=request.data).errors shows the error
ReturnDict([('image', [u'The submitted data was not a file. Check the encoding type on the form.'])])
I wonder what I should do to fix the error?
EDIT: JS part
The related front-end code consists of two parts: an angular-file-dnd directive (available here) to drop the file onto the page, and angular-restmod, which provides CRUD operations:
<!-- The template: according to angular-file-dnd, -->
<!-- it will store the dropped image into variable $scope.image -->
<div file-dropzone="[image/png, image/jpeg, image/gif]" file="image" class='method' data-max-file-size="3" file-name="imageFileName">
<div layout='row' layout-align='center'>
<i class="fa fa-upload" style='font-size:50px;'></i>
</div>
<div class='text-large'>Drap & drop your photo here</div>
</div>
// A simple `Image` `model` to perform `POST`
$scope.image_resource = Image.$build();
$scope.upload = function() {
console.log("uploading");
$scope.image_resource.image = $scope.image;
$scope.image_resource.owner = Auth.get_profile().user_id;
return $scope.image_resource.$save();
};
An update concerning the problem: right now I switched to using ng-file-upload, which sends image data in proper format.
The problem that you are hitting is that Django REST framework expects files to be uploaded as multipart form data, through the standard file upload methods. This is typically a file field, but the JavaScript Blob object also works for AJAX.
You are looking to upload the files using a base64 encoded string, instead of the raw file, which is not supported by default. There are implementations of a Base64ImageField out there, but the most promising one came by a pull request.
Since these were mostly designed for Django REST framework 2.x, I've improved upon the one from the pull request and created one that should be compatible with DRF 3.
serializers.py
from rest_framework import serializers
class Base64ImageField(serializers.ImageField):
    """
    A Django REST framework field for handling image-uploads through raw post data.
    It uses base64 for encoding and decoding the contents of the file.
    Heavily based on
    https://github.com/tomchristie/django-rest-framework/pull/1268
    Updated for Django REST framework 3.
    """
    def to_internal_value(self, data):
        from django.core.files.base import ContentFile
        import base64
        import binascii
        import six
        import uuid
        # Check if this is a base64 string
        if isinstance(data, six.string_types):
            # Check if the base64 string is in the "data:" format
            if 'data:' in data and ';base64,' in data:
                # Break out the header from the base64 content
                header, data = data.split(';base64,')
            # Try to decode the file. Return validation error if it fails.
            # Python 2 raises TypeError here; Python 3 raises binascii.Error.
            try:
                decoded_file = base64.b64decode(data)
            except (TypeError, binascii.Error):
                self.fail('invalid_image')
            # Generate file name:
            file_name = str(uuid.uuid4())[:12]  # 12 characters are more than enough.
            # Get the file name extension:
            file_extension = self.get_file_extension(file_name, decoded_file)
            complete_file_name = "%s.%s" % (file_name, file_extension, )
            data = ContentFile(decoded_file, name=complete_file_name)
        return super(Base64ImageField, self).to_internal_value(data)
    def get_file_extension(self, file_name, decoded_file):
        # Note: imghdr was removed from the standard library in Python 3.13.
        import imghdr
        extension = imghdr.what(file_name, decoded_file)
        extension = "jpg" if extension == "jpeg" else extension
        return extension
This should be used in place of the standard ImageField provided by Django REST framework. So your serializer would become:
class ImageSerializer(serializers.ModelSerializer):
    image = Base64ImageField(
        max_length=None, use_url=True,
    )
    class Meta:
        model = Image
        fields = ("id", 'image', 'owner', 'time_created', )
This should allow you to either specify a base64-encoded string, or the standard Blob object that Django REST framework typically expects.
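The decoding step at the heart of that field can be tried on its own, without Django. The following is a standard-library sketch of the same idea, not the field's actual code: `decode_data_uri` is a hypothetical helper, and the strict validate=True mode is an assumption that goes beyond what the field above does.

```python
import base64
import binascii
from typing import Optional, Tuple


def decode_data_uri(value: str) -> Tuple[Optional[str], bytes]:
    """Return (mime_type, raw_bytes) for a data URI or a bare base64 string."""
    mime_type = None
    if value.startswith("data:") and ";base64," in value:
        header, value = value.split(";base64,", 1)
        mime_type = header[len("data:"):] or None
    try:
        # validate=True rejects characters outside the base64 alphabet
        raw = base64.b64decode(value, validate=True)
    except binascii.Error as exc:
        raise ValueError("not valid base64") from exc
    return mime_type, raw


payload = "data:image/png;base64," + base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
print(decode_data_uri(payload))  # ('image/png', b'\x89PNG\r\n\x1a\n')
```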
I ran into the same problem a few days ago. Here is my Django REST framework view to handle file uploading:
views.py
class PhotoUploadView(APIView):
    parser_classes = (FileUploadParser,)
    def post(self, request):
        user = self.request.user
        if not user:
            return Response(status=status.HTTP_403_FORBIDDEN)
        profile = None
        data = None
        photo = None
        file_form = FileUploadForm(request.POST, request.FILES)
        if file_form.is_valid():
            photo = request.FILES['file']
        else:
            return Response(ajax_response(file_form), status=status.HTTP_406_NOT_ACCEPTABLE)
        try:
            profile = Organizer.objects.get(user=user)
            profile.photo = photo
            profile.save()
            data = OrganizersSerializer(profile).data
        except Organizer.DoesNotExist:
            profile = Student.objects.get(user=user)
            profile.photo = photo
            profile.save()
            data = StudentsSerializer(profile).data
        return Response(data)
In front-end, I used angular-file-upload lib.
Here is my file input
<div ng-file-drop="" ng-file-select="" ng-model="organizer.photo" class="drop-box" drag-over-class="{accept:'dragover', reject:'dragover-err', delay:100}" ng-multiple="false" allow-dir="true" accept="image/*">
Drop Images or PDFs<div>here</div>
</div>
And here is my upload service
main.js
(function () {
'use strict';
angular
.module('trulii.utils.services')
.factory('UploadFile', UploadFile);
UploadFile.$inject = ['$cookies', '$http','$upload','$window','Authentication'];
/**
* @namespace Authentication
* @returns {Factory}
*/
function UploadFile($cookies, $http,$upload,$window,Authentication) {
/**
* @name UploadFile
* @desc The Factory to be returned
*/
var UploadFile = {
upload_file: upload_file,
};
return UploadFile;
function upload_file(file) {
return $upload.upload({
url: '/api/users/upload/photo/', // upload.php script, node.js route, or servlet url
//method: 'POST' or 'PUT',
//headers: {'Authorization': 'xxx'}, // only for html5
//withCredentials: true,
file: file, // single file or a list of files. list is only for html5
//fileName: 'doc.jpg' or ['1.jpg', '2.jpg', ...] // to modify the name of the file(s)
//fileFormDataName: myFile, // file formData name ('Content-Disposition'), server side request form name
// could be a list of names for multiple files (html5). Default is 'file'
//formDataAppender: function(formData, key, val){} // customize how data is added to the formData.
// See #40#issuecomment-28612000 for sample code
})
}
}
})();
