In my Django project, I have large zip files I need to send to a user for them to download.
These files can be as large as 10 GB.
The files are already in zip format, and I don't want to extract them.
Instead, I would like to send, say, 100 MB of the file at a time, wait for the frontend to receive it, and then send the next 100 MB until the whole file has been read. I can handle the collection and download on the frontend, but how can I perform this request in Django?
Disclaimer: I'm a frontend developer, and very new to python.
Have you tried https://github.com/travcunn/django-zip-stream? It looks like what you may be looking for.
Example Use Case:
from django_zip_stream.responses import TransferZipResponse

def download_zip(request):
    # Files are located at /home/travis but Nginx is configured to serve from /data
    files = [
        ("/chicago.jpg", "/data/home/travis/chicago.jpg", 4096),
        ("/portland.jpg", "/data/home/travis/portland.jpg", 4096),
    ]
    return TransferZipResponse(filename='download.zip', files=files)
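If you would rather avoid an extra dependency, another option is to honor HTTP Range requests yourself, so the frontend can pull the archive in roughly 100 MB slices. A minimal sketch, assuming the zip's server-side path is already known and validated; the view name and chunk size are illustrative, and real Range parsing should be more defensive:

import os
from django.http import HttpResponse

CHUNK_LIMIT = 100 * 1024 * 1024  # serve at most ~100 MB per request

def download_zip_chunk(request, path):
    # path: server-side location of the zip, validated elsewhere
    size = os.path.getsize(path)
    start = 0
    range_header = request.META.get('HTTP_RANGE', '')
    if range_header.startswith('bytes='):
        # naive parse of "bytes=<start>-"; production code needs more care
        start = int(range_header.split('=')[1].split('-')[0])
    end = min(start + CHUNK_LIMIT, size) - 1
    with open(path, 'rb') as f:
        f.seek(start)
        data = f.read(end - start + 1)
    # 206 Partial Content tells the client there is more to fetch
    response = HttpResponse(data, status=206, content_type='application/zip')
    response['Content-Range'] = 'bytes %d-%d/%d' % (start, end, size)
    response['Accept-Ranges'] = 'bytes'
    return response

The frontend can then keep requesting Range: bytes=<next offset>- until the total size reported in Content-Range has been delivered.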
Related
I would like to develop a web app that parses locally stored data and allows users to create a sorted Excel file.
It would be amazing if I could somehow avoid uploading the files altogether. Some users are worried about their data, and the files can get really big, so I would have to implement async processing and so on...
Is something like that possible?
I'm currently working on a kind of cloud file manager. You can upload your files through a website, which is connected to a Python backend that stores the files and manages them using a database. It puts each of your files inside a folder and renames it to its hash. The database associates the file's name and its categories (which act like folders) with the hash, so that the file can be retrieved easily.
My problem is that I would like the file upload to be really user friendly: I have a quite bad connection, and when I try to download or upload a file on the internet I often run into problems like the upload failing at 90% and having to be restarted. I'd like to avoid that.
I'm using aiohttp to achieve this. How could I allow a file to be uploaded in multiple pieces, and what should I use to upload large files?
In previous code that managed really small files (less than 10 MB), I was using something like this:
data = await request.post()
data['file'].file.read()
Should I continue to do it this way for large files?
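For large files, no: request.post() buffers the whole body in memory. aiohttp can instead stream the multipart payload chunk by chunk. A minimal sketch; the destination path and chunk size are illustrative:

import hashlib
from aiohttp import web

CHUNK_SIZE = 64 * 1024  # read the upload 64 KB at a time

async def upload(request):
    reader = await request.multipart()
    field = await reader.next()  # assumes the file is the first form part
    hasher = hashlib.sha256()
    with open('/tmp/upload.part', 'wb') as out:
        while True:
            chunk = await field.read_chunk(CHUNK_SIZE)
            if not chunk:
                break
            hasher.update(chunk)  # hash as you go, matching your storage scheme
            out.write(chunk)
    return web.Response(text=hasher.hexdigest())

For resumable uploads, you would additionally have the client split the file and send each piece with its offset (or a chunk index), appending on the server side, so a failed piece can be retried without restarting the whole transfer.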
I have a server which generates some data on a daily basis. I am using D3 to visualise the data (d3.csv("path")).
The problem is I can only access the files if they are under my static_dir in the project.
However, if I put them there, they do eventually get cached and I stop seeing the updates, which is fine for css and js files but not for the underlying data.
Is there a way to put these files maybe in a different folder and prevent caching on them? Under what path will I be able to access them?
Or, alternatively, how would it be advisable to structure my project differently to avoid this problem in the first place? At the moment, I have a separate process, independent of the server, that generates the data and stores it in the given folder.
Many thanks,
Tony
When accessing the files, you can always append ?t=RANDOM to the request URL in order to get "fresh" data every time.
Because each request looks new to the server, there will be no caching, and from the client's side the extra parameter doesn't matter.
To get a new random value you can use Date.now():
url = "myfile.csv?t="+Date.now()
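If you also control the server, an alternative is to serve the data from its own route with caching disabled. A minimal sketch, assuming a Flask app (the data directory is hypothetical and can live outside static_dir):

from flask import Flask, send_from_directory

app = Flask(__name__)
DATA_DIR = '/srv/myapp/data'  # hypothetical folder, outside static_dir

@app.route('/data/<path:filename>')
def data(filename):
    response = send_from_directory(DATA_DIR, filename)
    # tell browsers and proxies never to reuse a stale copy
    response.headers['Cache-Control'] = 'no-cache, no-store, must-revalidate'
    return response

D3 can then load d3.csv("/data/myfile.csv") without the cache-busting query string.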
In my project I am receiving multiple files using request.FILES.getlist('filedname') and saving them using Django forms' save method. I then read the same files back using the tika server API in Python:
from tika import parser

def read_by_tika(self, path):
    '''file reading using tika server'''
    parsed = parser.from_file(str(path))
    contents = parsed["content"].encode('utf-8')
    return contents
Is there any way to feed the files from request.FILES directly to the tika server without saving them to the hard disk?
If the files are small, try tika's .from_buffer() with file.read(). Note, however, that Django spools uploads over 2.5 MB to temporary files anyway; see "Where uploaded data is stored" in the Django docs. In that case, use read_by_tika(file.temporary_file_path()). See also the file upload settings.
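Putting both cases together, a sketch of a save-free reader, assuming the tika-python client (the helper name is illustrative):

from tika import parser

def read_upload_by_tika(upload):
    '''Parse a Django UploadedFile without an extra save of our own.'''
    if hasattr(upload, 'temporary_file_path'):
        # large upload: Django has already spooled it to a temp file
        parsed = parser.from_file(upload.temporary_file_path())
    else:
        # small upload: still in memory, hand tika the raw bytes
        parsed = parser.from_buffer(upload.read())
    return (parsed.get('content') or '').encode('utf-8')

You would call read_upload_by_tika(f) for each f in request.FILES.getlist('filedname').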
I am trying to serve up some user-uploaded files with Flask, and have an odd problem, or at least one that I couldn't turn up any solutions for by searching. I need the files to retain their original filenames after being uploaded, so they will have the same name when the user downloads them.

Originally I did not want to deal with databases at all, and solved the problem of filename conflicts by storing each file in a randomly named folder, and just pointing to that location for the download. However, stuff came up later that required me to use a database to store some info about the files, but I still kept my old method of handling filename conflicts. I have a model for my files now, and storing the name would be as simple as adding another field, so that shouldn't be a big problem.

I decided, pretty foolishly after I had written the implementation, on using Amazon S3 to store the files. Apparently S3 does not deal with folders the way a traditional filesystem does, and I do not want to deal with the surely convoluted task of figuring out how to create folders programmatically on S3. In retrospect, this was a stupid way of dealing with the problem in the first place, when stuff like SQLAlchemy exists that makes databases easy as pie.

Anyway, I need a way to store multiple files with the same name on S3, without using folders. I thought of renaming the files with a random UUID after they are uploaded, and then, when they are downloaded (the user visits a page and presses a download button, so I need not have the filename in the URL), telling the browser to save the file under its original name retrieved from the database.

Is there a way to implement this in Python with Flask? When it is deployed I am planning on having the web server handle the serving of files; will it be possible to do something like this with the server? Or is there a smarter solution?
I'm stupid. Right in the Flask API docs it says you can pass the attachment_filename parameter to send_from_directory if the download name should differ from the filename on the filesystem.
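For completeness, a minimal sketch of that approach; the directory and the name lookup are illustrative, and in Flask 2.0+ the parameter was renamed download_name:

from flask import Flask, send_from_directory

app = Flask(__name__)
UPLOAD_DIR = '/srv/uploads'  # hypothetical storage location

@app.route('/download/<stored_name>')
def download(stored_name):
    # the original name would come from your database row for stored_name;
    # hard-coded here to keep the sketch self-contained
    original_name = 'report.pdf'
    return send_from_directory(
        UPLOAD_DIR, stored_name,
        as_attachment=True,
        attachment_filename=original_name,
    )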