Delete file when file download is complete on Python x Django [duplicate] - python

I'm using the following django/python code to stream a file to the browser:
wrapper = FileWrapper(file(path))
response = HttpResponse(wrapper, content_type='text/plain')
response['Content-Length'] = os.path.getsize(path)
return response
Is there a way to delete the file after the reponse is returned? Using a callback function or something?
I could just make a cron to delete all tmp files, but it would be neater if I could stream files and delete them as well from the same request.

You can use a NamedTemporaryFile:
from django.core.files.temp import NamedTemporaryFile
def send_file(request):
newfile = NamedTemporaryFile(suffix='.txt')
# save your data to newfile.name
wrapper = FileWrapper(newfile)
response = HttpResponse(wrapper, content_type=mime_type)
response['Content-Disposition'] = 'attachment; filename=%s' % os.path.basename(modelfile.name)
response['Content-Length'] = os.path.getsize(modelfile.name)
return response
temporary file should be deleted once the newfile object is evicted.

For future references:
I just had the case in which I couldn't use temp files for downloads.
But I still needed to delete them after it; so here is how I did it (I really didn't want to rely on cron jobs or celery or wossnames, its a very small system and I wanted it to stay that way).
def plug_cleaning_into_stream(stream, filename):
try:
closer = getattr(stream, 'close')
#define a new function that still uses the old one
def new_closer():
closer()
os.remove(filename)
#any cleaning you need added as well
#substitute it to the old close() function
setattr(stream, 'close', new_closer)
except:
raise
and then I just took the stream used for the response and plugged into it.
def send_file(request, filename):
with io.open(filename, 'rb') as ready_file:
plug_cleaning_into_stream(ready_file, filename)
response = HttpResponse(ready_file.read(), content_type='application/force-download')
# here all the rest of the heards settings
# ...
return response
I know this is quick and dirty but it works. I doubt it would be productive for a server with thousands of requests a second, but that's not my case here (max a few dozens a minute).
EDIT: Forgot to precise that I was dealing with very very big files that could not fit in memory during the download. So that is why I am using a BufferedReader (which is what is underneath io.open())

Mostly, we use periodic cron jobs for this.
Django already has one cron job to clean up lost sessions. And you're already running it, right?
See http://docs.djangoproject.com/en/dev/topics/http/sessions/#clearing-the-session-table
You want another command just like this one, in your application, that cleans up old files.
See this http://docs.djangoproject.com/en/dev/howto/custom-management-commands/
Also, you may not really be sending this file from Django. Sometimes you can get better performance by creating the file in a directory used by Apache and redirecting to a URL so the file can be served by Apache for you. Sometimes this is faster. It doesn't handle the cleanup any better, however.

One way would be to add a view to delete this file and call it from the client side using an asynchronous call (XMLHttpRequest). A variant of this would involve reporting back from the client on success so that the server can mark this file for deletion and have a periodic job clean it up.

This is just using the regular python approach (very simple example):
# something generates a file at filepath
from subprocess import Popen
# open file
with open(filepath, "rb") as fid:
filedata = fid.read()
# remove the file
p = Popen("rm %s" % filepath, shell=True)
# make response
response = HttpResponse(filedata, content-type="text/plain")
return response

Python 3.7 , Django 2.2.5
from tempfile import NamedTemporaryFile
from django.http import HttpResponse
with NamedTemporaryFile(suffix='.csv', mode='r+', encoding='utf8') as f:
f.write('\uFEFF') # BOM
f.write('sth you want')
# ref: https://docs.python.org/3/library/tempfile.html#examples
f.seek(0)
data=f.read()
response = HttpResponse(data, content_type="text/plain")
response['Content-Disposition'] = 'inline; filename=export.csv'

Related

In Flask, how can I send a temporary file and delete it after upload is finished? [duplicate]

I have a Flask view that generates data and saves it as a CSV file with Pandas, then displays the data. A second view serves the generated file. I want to remove the file after it is downloaded. My current code raises a permission error, maybe because after_request deletes the file before it is served with send_from_directory. How can I delete a file after serving it?
def process_data(data)
tempname = str(uuid4()) + '.csv'
data['text'].to_csv('samo/static/temp/{}'.format(tempname))
return file
#projects.route('/getcsv/<file>')
def getcsv(file):
#after_this_request
def cleanup(response):
os.remove('samo/static/temp/' + file)
return response
return send_from_directory(directory=cwd + '/samo/static/temp/', filename=file, as_attachment=True)
after_request runs after the view returns but before the response is sent. Sending a file may use a streaming response; if you delete it before it's read fully you can run into errors.
This is mostly an issue on Windows, other platforms can mark a file deleted and keep it around until it not being accessed. However, it may still be useful to only delete the file once you're sure it's been sent, regardless of platform.
Read the file into memory and serve it, so that's it's not being read when you delete it later. In case the file is too big to read into memory, use a generator to serve it then delete it.
#app.route('/download_and_remove/<filename>')
def download_and_remove(filename):
path = os.path.join(current_app.instance_path, filename)
def generate():
with open(path) as f:
yield from f
os.remove(path)
r = current_app.response_class(generate(), mimetype='text/csv')
r.headers.set('Content-Disposition', 'attachment', filename='data.csv')
return r

Best way to create a download link for a file in Flask?

In my project, when a user clicks a link, an AJAX request sends the information required to create a CSV. The CSV takes a long time to generate and so I want to be able to include a download link for the generated CSV in the AJAX response. Is this possible?
Most of the answers I've seen return the CSV in the following way:
return Response(
csv,
mimetype="text/csv",
headers={"Content-disposition":
"attachment; filename=myplot.csv"})
However, I don't think this is compatible with the AJAX response I'm sending with:
return render_json(200, {'data': params})
Ideally, I'd like to be able to send the download link in the params dict. But I'm also not sure if this is secure. How is this problem typically solved?
I think one solution may the futures library (pip install futures). The first endpoint can queue up the task and then send the file name back, and then another endpoint can be used to retrieve the file. I also included gzip because it might be a good idea if you are sending larger files. I think more robust solutions use Celery or Rabbit MQ or something along those lines. However, this is a simple solution that should accomplish what you are asking for.
from flask import Flask, jsonify, Response
from uuid import uuid4
from concurrent.futures import ThreadPoolExecutor
import time
import os
import gzip
app = Flask(__name__)
# Global variables used by the thread executor, and the thread executor itself
NUM_THREADS = 5
EXECUTOR = ThreadPoolExecutor(NUM_THREADS)
OUTPUT_DIR = os.path.dirname(os.path.abspath(__file__))
# this is your long running processing function
# takes in your arguments from the /queue-task endpoint
def a_long_running_task(*args):
time_to_wait, output_file_name = int(args[0][0]), args[0][1]
output_string = 'sleeping for {0} seconds. File: {1}'.format(time_to_wait, output_file_name)
print(output_string)
time.sleep(time_to_wait)
filename = os.path.join(OUTPUT_DIR, output_file_name)
# here we are writing to a gzipped file to save space and decrease size of file to be sent on network
with gzip.open(filename, 'wb') as f:
f.write(output_string)
print('finished writing {0} after {1} seconds'.format(output_file_name, time_to_wait))
# This is a route that starts the task and then gives them the file name for reference
#app.route('/queue-task/<wait>')
def queue_task(wait):
output_file_name = str(uuid4()) + '.csv'
EXECUTOR.submit(a_long_running_task, [wait, output_file_name])
return jsonify({'filename': output_file_name})
# this takes the file name and returns if exists, otherwise notifies it is not yet done
#app.route('/getfile/<name>')
def get_output_file(name):
file_name = os.path.join(OUTPUT_DIR, name)
if not os.path.isfile(file_name):
return jsonify({"message": "still processing"})
# read without gzip.open to keep it compressed
with open(file_name, 'rb') as f:
resp = Response(f.read())
# set headers to tell encoding and to send as an attachment
resp.headers["Content-Encoding"] = 'gzip'
resp.headers["Content-Disposition"] = "attachment; filename={0}".format(name)
resp.headers["Content-type"] = "text/csv"
return resp
if __name__ == '__main__':
app.run()

use django to serve downloading big zip file with some data appended

I have a views snippet like below, which get a zip filename form a request, and I want to append some string sign after the end of zip file
#require_GET
def download(request):
... skip
response = HttpResponse(readFile(abs_path, sign), content_type='application/zip')
response['Content-Length'] = os.path.getsize(abs_path) + len(sign)
response['Content-Disposition'] = 'attachment; filename=%s' % filename
return response
and the readFile function as below:
def readFile(fn, sign, buf_size=1024<<5):
f = open(fn, "rb")
logger.debug("started reading %s" % fn)
while True:
c = f.read(buf_size)
if c:
yield c
else:
break
logger.debug("finished reading %s" % fn)
f.close()
yield sign
It works fine when using runserver mode, but failed on big zip file when I use uwsgi + nginx or apache + mod_wsgi.
It seems timeout because need too long time to read a big file.
I don't understand why I use yield but the browser start to download after whole file read finished.(Because I see the browser wait until the log finished reading %s appeared)
Shouldn't it start to download right after the first chunk read?
Is any better way to serve a file downloading function that I need to append a dynamic string after the file?
Django doesn't allow streaming responses by default so it buffers the entire response. If it didn't, middlewares couldn't function the way they do right now.
To get the behaviour you are looking for you need to use the StreamingHttpResponse instead.
Usage example from the docs:
import csv
from django.utils.six.moves import range
from django.http import StreamingHttpResponse
class Echo(object):
"""An object that implements just the write method of the file-like
interface.
"""
def write(self, value):
"""Write the value by returning it, instead of storing in a buffer."""
return value
def some_streaming_csv_view(request):
"""A view that streams a large CSV file."""
# Generate a sequence of rows. The range is based on the maximum number of
# rows that can be handled by a single sheet in most spreadsheet
# applications.
rows = (["Row {}".format(idx), str(idx)] for idx in range(65536))
pseudo_buffer = Echo()
writer = csv.writer(pseudo_buffer)
response = StreamingHttpResponse((writer.writerow(row) for row in rows),
content_type="text/csv")
response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
return response
This is a use case for StreamingHttpResponse instead of HttpResponse.
It's better to use FileRespose, is a subclass of StreamingHttpResponse optimized for binary files. It uses wsgi.file_wrapper if provided by the wsgi server, otherwise it streams the file out in small chunks.
import os
from django.http import FileResponse
from django.core.servers.basehttp import FileWrapper
def download_file(request):
_file = '/folder/my_file.zip'
filename = os.path.basename(_file)
response = FileResponse(FileWrapper(file(filename, 'rb')), content_type='application/x-zip-compressed')
response['Content-Disposition'] = "attachment; filename=%s" % _file
return response

Django Serving a Download File

I'm trying to serve a txt file generated with some content and i am having some issues. I'vecreated the temp files and written the content using NamedTemporaryFile and just set delete to false to debug however the downloaded file does not contain anything.
My guess is the response values are not pointed to the correct file, hense nothing is being downloaded, heres my code:
f = NamedTemporaryFile()
f.write(p.body)
response = HttpResponse(FileWrapper(f), mimetype='application/force-download')
response['Content-Disposition'] = 'attachment; filename=test-%s.txt' % p.uuid
response['X-Sendfile'] = f.name
Have you considered just sending p.body through the response like this:
response = HttpResponse(mimetype='text/plain')
response['Content-Disposition'] = 'attachment; filename="%s.txt"' % p.uuid
response.write(p.body)
XSend requires the path to the file in
response['X-Sendfile']
So, you can do
response['X-Sendfile'] = smart_str(path_to_file)
Here, path_to_file is the full path to the file (not just the name of the file)
Checkout this django-snippet
There can be several problems with your approach:
file content does not have to be flushed, add f.flush() as mentioned in comment above
NamedTemporaryFile is deleted on closing, what might happen just as you exit your function, so the webserver has no chance to pick it up
temporary file name might be out of paths which web server is configured to send using X-Sendfile
Maybe it would be better to use StreamingHttpResponse instead of creating temporary files and X-Sendfile...
import urllib2;
url ="http://chart.apis.google.com/chart?cht=qr&chs=300x300&chl=s&chld=H|0";
opener = urllib2.urlopen(url);
mimetype = "application/octet-stream"
response = HttpResponse(opener.read(), mimetype=mimetype)
response["Content-Disposition"]= "attachment; filename=aktel.png"
return response

Stream a file to the HTTP response in Pylons

I have a Pylons controller action that needs to return a file to the client. (The file is outside the web root, so I can't just link directly to it.) The simplest way is, of course, this:
with open(filepath, 'rb') as f:
response.write(f.read())
That works, but it's obviously inefficient for large files. What's the best way to do this? I haven't been able to find any convenient methods in Pylons to stream the contents of the file. Do I really have to write the code to read a chunk at a time myself from scratch?
The correct tool to use is shutil.copyfileobj, which copies from one to the other a chunk at a time.
Example usage:
import shutil
with open(filepath, 'r') as f:
shutil.copyfileobj(f, response)
This will not result in very large memory usage, and does not require implementing the code yourself.
The usual care with exceptions should be taken - if you handle signals (such as SIGCHLD) you have to handle EINTR because the writes to response could be interrupted, and IOError/OSError can occur for various reasons when doing I/O.
I finally got it to work using the FileApp class, thanks to Chris AtLee and THC4k (from this answer). This method also allowed me to set the Content-Length header, something Pylons has a lot of trouble with, which enables the browser to show an estimate of the time remaining.
Here's the complete code:
def _send_file_response(self, filepath):
user_filename = '_'.join(filepath.split('/')[-2:])
file_size = os.path.getsize(filepath)
headers = [('Content-Disposition', 'attachment; filename=\"' + user_filename + '\"'),
('Content-Type', 'text/plain'),
('Content-Length', str(file_size))]
from paste.fileapp import FileApp
fapp = FileApp(filepath, headers=headers)
return fapp(request.environ, self.start_response)
The key here is that WSGI, and pylons by extension, work with iterable responses. So you should be able to write some code like (warning, untested code below!):
def file_streamer():
with open(filepath, 'rb') as f:
while True:
block = f.read(4096)
if not block:
break
yield block
response.app_iter = file_streamer()
Also, paste.fileapp.FileApp is designed to be able to return file data for you, so you can also try:
return FileApp(filepath)
in your controller method.

Categories