Django: Create zipfile from S3 using django-storage

I use django-storages and have user related content stored in folders on S3. Now I want users to have the ability to download all their files at once, preferably in a zip file. All the previous posts related to this are old or not working for me.
The closest to working code I have so far:
from io import BytesIO
import zipfile
from django.conf import settings
from ..models import Something
from django.core.files.storage import default_storage
class DownloadIncomeTaxFiles(View):
    def get(self, request, id):
        itr = Something.objects.get(id=id)
        files = itr.attachments
        zfname = 'somezip.zip'
        b = BytesIO()
        with zipfile.ZipFile(b, 'w') as zf:
            for current_file in files:
                try:
                    fh = default_storage.open(current_file.file.name, "r")
                    zf.writestr(fh.name, bytes(fh.read()))
                except Exception as e:
                    print(e)
        response = HttpResponse(zf, content_type="application/x-zip-compressed")
        response['Content-Disposition'] = 'attachment; filename={}'.format(zfname)
        return response
This creates what looks like a zipfile but the only content it has is '<zipfile.ZipFile [closed]>'
I got many different results, mainly with errors like zipfile expecting string or bytes content while a FieldFile is provided. At this point I'm completely stuck.

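The problem was that I needed to rewind to the beginning of the file before returning it. A ZipFile object has no seek() method, so the call goes on the BytesIO buffer, after the with block has closed the archive:
b.seek(0)
The buffer b (not the ZipFile object zf) is then what gets passed to the HttpResponse.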
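To make the buffer handling concrete, here is a minimal sketch using only the standard library, so it runs without Django or S3. The helper name build_zip_bytes and the sample file names are made up for illustration; in the view above, the returned bytes (or the rewound buffer) would be what is handed to HttpResponse.

```python
import zipfile
from io import BytesIO


def build_zip_bytes(files):
    """Pack a {archive_name: bytes} mapping into a ZIP and return its raw bytes."""
    buffer = BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    # The archive is only complete once the with block has closed it.
    buffer.seek(0)  # rewind, otherwise read() starts at the end and returns b""
    return buffer.read()


payload = build_zip_bytes({"a.txt": b"first file", "b.txt": b"second file"})

# Read the archive back to confirm it is a valid ZIP.
with zipfile.ZipFile(BytesIO(payload)) as check:
    names = sorted(check.namelist())
    first = check.read("a.txt")
```

Passing the ZipFile object itself to the response is what produced the '<zipfile.ZipFile [closed]>' content: the response received the object's string representation rather than the bytes in the buffer.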

Related

How to convert image address and save to imagefield in Django [duplicate]

please excuse me for my ugly english ;-)
Imagine this very simple model :
class Photo(models.Model):
    image = models.ImageField('Label', upload_to='path/')
I would like to create a Photo from an image URL (i.e., not by hand in the django admin site).
I think that I need to do something like this :
from myapp.models import Photo
import urllib
img_url = 'http://www.site.com/image.jpg'
img = urllib.urlopen(img_url)
# Here I need to retrieve the image (as the same way that if I put it in an input from admin site)
photo = Photo.objects.create(image=image)
I hope that I've well explained the problem, if not tell me.
Thank you :)
Edit :
This may work but I don't know how to convert content to a django File :
from urlparse import urlparse
import urllib2
from django.core.files import File
photo = Photo()
img_url = 'http://i.ytimg.com/vi/GPpN5YUNDeI/default.jpg'
name = urlparse(img_url).path.split('/')[-1]
content = urllib2.urlopen(img_url).read()
# problem: content must be an instance of File
photo.image.save(name, content, save=True)

I just created http://www.djangosnippets.org/snippets/1890/ for this same problem. The code is similar to pithyless' answer above except it uses urllib2.urlopen because urllib.urlretrieve doesn't perform any error handling by default so it's easy to get the contents of a 404/500 page instead of what you needed. You can create callback function & custom URLOpener subclass but I found it easier just to create my own temp file like this:
from django.core.files import File
from django.core.files.temp import NamedTemporaryFile
img_temp = NamedTemporaryFile(delete=True)
img_temp.write(urllib2.urlopen(url).read())
img_temp.flush()
im.file.save(img_filename, File(img_temp))

from myapp.models import Photo
import urllib
from urlparse import urlparse
from django.core.files import File
img_url = 'http://www.site.com/image.jpg'
photo = Photo() # set any other fields, but don't commit to DB (ie. don't save())
name = urlparse(img_url).path.split('/')[-1]
content = urllib.urlretrieve(img_url)
# See also: http://docs.djangoproject.com/en/dev/ref/files/file/
photo.image.save(name, File(open(content[0])), save=True)

Combining what Chris Adams and Stan said and updating things to work on Python 3, if you install Requests you can do something like this:
from urllib.parse import urlparse
import requests
from django.core.files.base import ContentFile
from myapp.models import Photo
img_url = 'http://www.example.com/image.jpg'
name = urlparse(img_url).path.split('/')[-1]
photo = Photo() # set any other fields, but don't commit to DB (ie. don't save())
response = requests.get(img_url)
if response.status_code == 200:
    photo.image.save(name, ContentFile(response.content), save=True)
More relevant docs in Django's ContentFile documentation and Requests' file download example.
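Several of the snippets here derive the file name with urlparse(img_url).path.split('/')[-1]. A small standard-library sketch of that step follows; the helper name filename_from_url and the example URL are only illustrative. Working on the parsed path rather than the raw URL means a query string or fragment does not end up in the file name.

```python
import posixpath
from urllib.parse import urlparse


def filename_from_url(url):
    """Return the last path segment of a URL, ignoring query string and fragment."""
    return posixpath.basename(urlparse(url).path)


name = filename_from_url("http://www.example.com/media/image.jpg?size=large#top")
empty = filename_from_url("http://www.example.com/")
```

Note that a URL ending in a slash yields an empty name, so a fallback name is probably needed before calling save().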

ImageField is just a string, a path relative to your MEDIA_ROOT setting. Just save the file (you might want to use PIL to check it is an image) and populate the field with its filename.
So it differs from your code in that you need to save the output of your urllib.urlopen to file (inside your media location), work out the path, save that to your model.

I do it this way on Python 3, which should work with simple adaptations on Python 2. This is based on my knowledge that the files I’m retrieving are small. If yours aren’t, I’d probably recommend writing the response out to a file instead of buffering in memory.
BytesIO is needed because Django calls seek() on the file object, and urlopen responses don’t support seeking. You could pass the bytes object returned by read() to Django's ContentFile instead.
from io import BytesIO
from urllib.request import urlopen
from django.core.files import File
# url, filename, model_instance assumed to be provided
response = urlopen(url)
io = BytesIO(response.read())
model_instance.image_field.save(filename, File(io))

Recently I have been using the following approach with Python 3 and Django 3; it may be interesting for others as well. It is similar to Chris Adams' solution, which no longer worked for me.
import urllib.request
from django.core.files.uploadedfile import SimpleUploadedFile
from urllib.parse import urlparse
from demoapp import models
img_url = 'https://upload.wikimedia.org/wikipedia/commons/f/f7/Stack_Overflow_logo.png'
basename = urlparse(img_url).path.split('/')[-1]
tmpfile, _ = urllib.request.urlretrieve(img_url)
new_image = models.ModelWithImageOrFileField()
new_image.title = 'Foo bar'
new_image.file = SimpleUploadedFile(basename, open(tmpfile, "rb").read())
new_image.save()

Just discovered that you don't have to generate a temporary file:
Stream url content directly from django to minio
I have to store my files in minio and have django docker containers without much disk space and need to download big video files, so this was really helpful to me.

It's been almost 11 years since the question and the highest-voted answer were posted. Thanks to @chris-adams for the response. I am just reposting the same answer with updated packages and support.
#! /usr/bin/python3
# lib/utils.py
import urllib3  # HTTP request package.
from typing import Optional
from django.db import models  # Needed for the type annotation below.
from django.core.files import File  # Handle files in Django.
from django.core.files.temp import NamedTemporaryFile  # Handling temporary files.
def fetch_image(url: str, instance: models.Model, field: str, name: Optional[str]=None):
    """
    fetch_image Fetches an image URL and adds it to the model field.
    The parameter instance does not need to be a saved instance.
    :url: str = A valid image URL.
    :instance: django.db.models.Model = Expecting a model with image field or file field.
    :field: str = image / file field name as string;
    [name:str] = Preferred file name, such as product slug or something.
    :return: updated instance as django.db.models.Model, status of the update as bool.
    """
    conn = urllib3.PoolManager()
    response = conn.request('GET', url)
    if response.status != 200:  # `<>` is not valid in Python 3
        print("[X] 404! IMAGE NOT FOUND")
        print(f"TraceBack: {url}")
        return instance, False
    file_obj = NamedTemporaryFile(delete=True)
    file_obj.write(response.data)
    file_obj.flush()
    img_format = url.split('.')[-1]
    if name is None:
        name = url.split('/')[-1]
    if not name.endswith(img_format):
        name += f'.{img_format}'
    django_file_obj = File(file_obj)
    (getattr(instance, field)).save(name, django_file_obj)
    return instance, True
Tested with Django==2.2.12 in Python 3.7.5
if __name__ == '__main__':
    instance = ProductImage()
    url = "https://www.publicdomainpictures.net/pictures/320000/velka/background-image.png"
    instance, saved = fetch_image(url, instance, field='banner_image', name='intented-image-slug')
    status = ["FAILED! ", "SUCCESS! "][saved]
    print(status, instance.banner_image and instance.banner_image.path)
    instance.delete()

This is the right and working way (Python 2):
class Product(models.Model):
    upload_path = 'media/product'
    image = models.ImageField(upload_to=upload_path, null=True, blank=True)
    image_url = models.URLField(null=True, blank=True)
    def save(self, *args, **kwargs):
        if self.image_url:
            import urllib, os
            from urlparse import urlparse
            filename = urlparse(self.image_url).path.split('/')[-1]
            # file_save_dir is not defined in this snippet; it has to be the
            # directory on disk that upload_path maps to.
            urllib.urlretrieve(self.image_url, os.path.join(file_save_dir, filename))
            self.image = os.path.join(self.upload_path, filename)
            self.image_url = ''
        super(Product, self).save(*args, **kwargs)

Fetch multiple images from the server using Django

I am trying to download multiple image files from the server. I am using Django for my backend.
A question about a single image has already been answered; I tried that code and it works for a single image. In my application, I want to download multiple images over a single HTTP connection.
from PIL import Image
img = Image.open('test.jpg')
img2 = Image.open('test2.png')
response = HttpResponse(content_type = 'image/jpeg')
response2 = HttpResponse(content_type = 'image/png')
img.save(response, 'JPEG')
img2.save(response2, 'PNG')
return response #SINGLE
How can I fetch both img and img2 at once? One way I was thinking of is to zip both images and unzip them on the client side, but I don't think that is a good solution. Is there a way to handle this?

I looked around and found an older solution using a temporary ZIP file on disk: https://djangosnippets.org/snippets/365/
It needed some updating, and this should work (tested on Django 2.0):
import tempfile, zipfile
from django.http import HttpResponse
from wsgiref.util import FileWrapper
def send_zipfile(request):
    """
    Create a ZIP file on disk and transmit it in chunks of 8KB,
    without loading the whole file into memory. A similar approach can
    be used for large dynamic PDF files.
    """
    temp = tempfile.TemporaryFile()
    archive = zipfile.ZipFile(temp, 'w', zipfile.ZIP_DEFLATED)
    for index in range(10):
        filename = 'C:/Users/alex1/Desktop/temp.png' # Replace by your files here.
        archive.write(filename, 'file%d.png' % index) # 'file%d.png' will be the
                                                      # name of the file in the
                                                      # zip
    archive.close()
    temp.seek(0)
    wrapper = FileWrapper(temp)
    response = HttpResponse(wrapper, content_type='application/zip')
    response['Content-Disposition'] = 'attachment; filename=test.zip'
    return response
Right now, this takes my .png and writes it 10 times in my .zip, then sends it.
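For anyone who wants to see the "write to a temp file, then send in chunks" idea in isolation, here is a rough standard-library sketch. It is not the snippet above, just the same pattern without Django: iter_zip_chunks and the sample file name are made-up names, and the 8192-byte default mirrors the 8KB mentioned in the docstring.

```python
import io
import tempfile
import zipfile


def iter_zip_chunks(files, chunk_size=8192):
    """Write a ZIP to a temporary file, then yield it in fixed-size chunks."""
    with tempfile.TemporaryFile() as temp:
        with zipfile.ZipFile(temp, "w", zipfile.ZIP_DEFLATED) as archive:
            for name, data in files.items():
                archive.writestr(name, data)
        temp.seek(0)  # rewind before reading the finished archive back
        while True:
            chunk = temp.read(chunk_size)
            if not chunk:
                break
            yield chunk


chunks = list(iter_zip_chunks({"file0.txt": b"x" * 50000}, chunk_size=1024))
archive_bytes = b"".join(chunks)

# Reassemble the chunks and confirm the archive is intact.
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as check:
    names = check.namelist()
    content = check.read("file0.txt")
```

In a Django view, a generator like this could presumably be handed to StreamingHttpResponse, so that only one chunk is held in memory at a time.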

You could add your files/images to a ZIP file and return that one in the response. I think that is the best approach.
Here is some example code of how you could achieve that (from this post):
import zipfile
from io import BytesIO
from django.http import HttpResponse
def zip_files(files):
    outfile = BytesIO()  # StringIO() on Python 2
    with zipfile.ZipFile(outfile, 'w') as zf:
        for n, f in enumerate(files):
            zf.writestr("{}.csv".format(n), f.getvalue())
    return outfile.getvalue()
zipped_file = zip_files(myfiles)
response = HttpResponse(zipped_file, content_type='application/octet-stream')
response['Content-Disposition'] = 'attachment; filename=my_file.zip'
Otherwise (if you don't like ZIP files) you could make individual requests from the client.

Download a file with Django

This might be a simple question, but I somehow cannot find the solution. Django offers a lot about uploading files, but how do I download a file?
Let's assume we have a button in the HTML page and a file at uploads/something.txt.
I tried django.views.static.serve, but that opened the file in the browser instead of downloading it.
My question is simple: what is the best and most Pythonic way for a user of our website to download a file?

You need to read that file.
Serve it using HttpResponse along with proper content type.
Here's some sample code:
content = open("uploads/something.txt").read()
return HttpResponse(content, content_type='text/plain')
This should serve a text file.
But as you described, in some browsers it will not prompt a download and will show the file in the browser instead. If you want to show a download prompt, use this:
response = HttpResponse(open("uploads/something.txt", 'rb').read())
response['Content-Type'] = 'text/plain'
response['Content-Disposition'] = 'attachment; filename=DownloadedText.txt'
return response
However, please note that it might be a better idea to serve static content or uploaded files via nginx or the reverse proxy of your choice. Sending large files through Django might not be the most efficient way of doing that.

import os
from django.conf import settings
from django.http import HttpResponse, Http404
def download(request, path):
    file_path = os.path.join(settings.MEDIA_ROOT, path)
    if os.path.exists(file_path):
        with open(file_path, 'rb') as fh:
            response = HttpResponse(fh.read(), content_type="application/vnd.ms-excel")
            response['Content-Disposition'] = 'inline; filename=' + os.path.basename(file_path)
            return response
    raise Http404

Maybe a little late but here is my solution:
def render(self, value):
    return format_html('<a href="/media/{0}" download>{0}</a>', value)

How to save file data from POST variable and load it back to response in Python Django?

I have the following problem: I am using Python 2.6 / Django 1.3 and I need to accept a POST variable with key 'f' that contains binary data. After that, I need to save the data to a file.
POST
T$topX$objectsX$versionY$archiverО©ҐR$0О©ҐО©ҐО©Ґull_=<---------------------- content of file -------------------->О©ҐО©Ґ_NSKeyedArchive(258:=CО©ҐО©Ґ
Code
from django.core.files.storage import default_storage
from django.core.files.base import ContentFile
def save(request):
    upload_file = request.POST['f']
    save_path = default_storage.save('%s%s' % (save_dir, filename),
                                     ContentFile(upload_file))
When I am trying to do
nano /tmp/myfile.zip
It returns data like
T^#^#^#$^#^#^#t^#^#^#o^#^#^#p^#^#^#X^#^#^#$^#^#^#o^#^#^#b^#^#^#j^#^#^#e^#^#^#c^#^#^#t^#^#^#s^#^#^#X^#^#^#$^#^#^#v^#^#^#e^#^#^#r^#^#^#s^#^#^#i^#^#$
When its done, I am going to read saved file
def read(request):
    user_file = default_storage.open(file_path).read()
    file_name = get_filename(file_path)
    response = HttpResponse(user_file, content_type = 'text/plain',
                            mimetype = 'application/force-download')
    response['Content-Disposition'] = 'attachment; filename=%s' % file_name
    response['Content-Length'] = default_storage.size(file_path)
    return response
When I write
print user_file
it prints the correct data, but when I return an HttpResponse, the data differs from the source.

It would probably be easier and more memory-efficient to just save the data into a file and, as @keckse said, let a browser stream it. Django is very inefficient at streaming data. It all depends on the size of the data. If you want to stream it with Django anyway, it can be done like this:
from django.http import HttpResponse
import os.path
import mimetypes
def stream(request, document, type=None):
    doc = Document.objects.get(pk=document)
    fsock = open(doc.file.path,"r")
    file_name = os.path.basename(doc.file.path)
    mime_type_guess = mimetypes.guess_type(file_name)
    if mime_type_guess is not None:
        response = HttpResponse(fsock, mimetype=mime_type_guess[0])
    response['Content-Disposition'] = 'attachment; filename=' + file_name
    return response
In your case you might want to set the MIME type manually; you can try application/octet-stream too. The main difference from your code is that you pass the "string" from file.read(), whereas this passes the file handle directly. Please note: if you use read(), you will be loading the whole file into memory.
More on passing iterators to HttpResponse. And I might be wrong, but I think you can drop the content-type.
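On the MIME type guessing: mimetypes.guess_type() always returns a (type, encoding) tuple, so it is the first element that can be None for an unknown extension, not the return value itself. A small sketch of guessing with a fallback to application/octet-stream, where guess_content_type and the file names are just illustrative:

```python
import mimetypes


def guess_content_type(file_name, default="application/octet-stream"):
    """Return a MIME type for file_name, falling back to a generic binary type."""
    # guess_type() returns a (type, encoding) tuple; type is None when unknown.
    mime_type, _encoding = mimetypes.guess_type(file_name)
    return mime_type or default


known = guess_content_type("report.txt")
unknown = guess_content_type("archive.unknownext")
```

With a fallback like this the response always gets some content type, which avoids the case where the guess fails and no response object is created at all.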

Django test FileField using test fixtures

I'm trying to build tests for some models that have a FileField. The model looks like this:
class SolutionFile(models.Model):
    '''
    A file from a solution.
    '''
    solution = models.ForeignKey(Solution)
    file = models.FileField(upload_to=make_solution_file_path)
I have encountered two problems:
When saving data to a fixture using ./manage.py dumpdata, the file contents are not saved, only the file name is saved into the fixture. While I find this to be the expected behavior as the file contents are not saved into the database, I'd like to somehow include this information in the fixture for tests.
I have a test case for uploading a file that looks like this:
def test_post_solution_file(self):
    import tempfile
    import os
    filename = tempfile.mkstemp()[1]
    f = open(filename, 'w')
    f.write('These are the file contents')
    f.close()
    f = open(filename, 'r')
    post_data = {'file': f}
    response = self.client.post(self.solution.get_absolute_url()+'add_solution_file/', post_data,
                                follow=True)
    f.close()
    os.remove(filename)
    self.assertTemplateUsed(response, 'tests/solution_detail.html')
    self.assertContains(response, os.path.basename(filename))
While this test works just fine, it leaves the uploaded file in the media directory after finishing. Of course, the deletion could be taken care of in tearDown(), but I was wondering if Django had another way of dealing with this.
One solution I was thinking of was using a different media folder for tests which must be kept synced with the test fixtures. Is there any way to specify another media directory in settings.py when tests are being run? And can I include some sort of hook to dumpdata so that it syncs the files in the media folders?
So, is there a more Pythonic or Django-specific way of dealing with unit tests involving files?

Django provides a great way to write tests on FileFields without mucking about in the real filesystem - use a SimpleUploadedFile.
from django.core.files.uploadedfile import SimpleUploadedFile
my_model.file_field = SimpleUploadedFile('best_file_eva.txt', b'these are the contents of the txt file')
It's one of django's magical features-that-don't-show-up-in-the-docs :). However it is referred to here.

You can override the MEDIA_ROOT setting for your tests using the @override_settings() decorator as documented:
from django.test import override_settings
@override_settings(MEDIA_ROOT='/tmp/django_test')
def test_post_solution_file(self):
    # your code here

I've written unit tests for an entire gallery app before, and what worked well for me was using the python tempfile and shutil modules to create copies of the test files in temporary directories and then delete them all afterwards.
The following example is not working/complete, but should get you on the right path:
import os, shutil, tempfile
PATH_TEMP = tempfile.mkdtemp(dir=os.path.join(MY_PATH, 'temp'))
def make_objects():
    filenames = os.listdir(TEST_FILES_DIR)
    if not os.access(PATH_TEMP, os.F_OK):
        os.makedirs(PATH_TEMP)
    for filename in filenames:
        name, extension = os.path.splitext(filename)
        new = os.path.join(PATH_TEMP, filename)
        shutil.copyfile(os.path.join(TEST_FILES_DIR, filename), new)
        #Do something with the files/FileField here
def remove_objects():
    shutil.rmtree(PATH_TEMP)
I run those methods in the setUp() and tearDown() methods of my unit tests and it works great! You've got a clean copy of your files to test your filefield that are reusable and predictable.
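Since the example above is explicitly incomplete, here is a self-contained sketch of the same setUp()/tearDown() pattern using plain unittest, so it runs without Django. TempMediaTestCase and the file name are hypothetical; in a real project the temporary directory would also have to be wired in as MEDIA_ROOT (for instance via override_settings), which this sketch does not do.

```python
import os
import shutil
import tempfile
import unittest


class TempMediaTestCase(unittest.TestCase):
    """Gives each test its own scratch directory and removes it afterwards."""

    def setUp(self):
        self.media_dir = tempfile.mkdtemp()

    def tearDown(self):
        # Runs after every test, whether it passed or failed.
        shutil.rmtree(self.media_dir)

    def test_file_is_written_inside_temp_dir(self):
        path = os.path.join(self.media_dir, "example.txt")
        with open(path, "w") as fh:
            fh.write("These are the file contents")
        with open(path) as fh:
            self.assertEqual(fh.read(), "These are the file contents")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TempMediaTestCase)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Creating the directory per test rather than once per module keeps tests from seeing each other's files, at the cost of a little extra setup time.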

With pytest and pytest-django, I use this in the conftest.py file:
import tempfile
import shutil
import pytest
from pytest_django.lazy_django import skip_if_no_django
from pytest_django.fixtures import SettingsWrapper
@pytest.fixture(scope='session')
# @pytest.yield_fixture()
def settings():
    """A Django settings object which restores changes after the testrun"""
    skip_if_no_django()
    wrapper = SettingsWrapper()
    yield wrapper
    wrapper.finalize()
@pytest.fixture(autouse=True, scope='session')
def media_root(settings):
    tmp_dir = tempfile.mkdtemp()
    settings.MEDIA_ROOT = tmp_dir
    yield settings.MEDIA_ROOT
    shutil.rmtree(tmp_dir)
@pytest.fixture(scope='session')
def django_db_setup(media_root, django_db_setup):
    print('inject_after')
These might be helpful:
https://dev.funkwhale.audio/funkwhale/funkwhale/blob/de777764da0c0e9fe66d0bb76317679be964588b/api/tests/conftest.py
https://framagit.org/ideascube/ideascube/blob/master/conftest.py
https://stackoverflow.com/a/56177770/5305401

This is what I did for my test. After uploading the file it should end up in the photo property of my organization model object:
import tempfile
filename = tempfile.mkstemp()[1]
f = open(filename, 'w')
f.write('These are the file contents')
f.close()
f = open(filename, 'r')
post_data = {'file': f}
response = self.client.post("/org/%d/photo" % new_org_data["id"], post_data)
f.close()
self.assertEqual(response.status_code, 200)
## Check the file
## org is where the file should end up
org = models.Organization.objects.get(pk=new_org_data["id"])
self.assertEqual("These are the file contents", org.photo.file.read())
## Remove the file
import os
os.remove(org.photo.path)
