How to download large file from Cloud Run in Django

I have a Django project on Cloud Run. When I download a small file from a page that uses the code below, it works fine.
def download_view(request, pk):
    file_path = f'media/{pk}.mp3'
    name = f'{pk}.mp3'
    with open(file_path, 'rb') as f:
        response = HttpResponse(f.read(), content_type='audio/wav')
    response['Content-Disposition'] = f'attachment; filename={name}'
    return response
However, when I download a large file (50 MB), I get the error shown in this picture.
The Cloud Run log looks like this. I couldn't find any traceback in it.
2021-05-06 12:00:35.668 JST GET 500 606 B 66 ms Chrome 72 https://***/download/mp3/2500762/
2021-05-06 11:49:49.037 JST GET 500 606 B 61 ms Chrome 72 https://***/download/mp3/2500645/
I'm not sure whether the following message is related to the download error.
2021-05-06 16:48:32.570 JST Response size was too large. Please consider reducing response size.
I think this is an upload file size error, so it is not related to the download error.
When I run Django locally and download the same 50 MB file, it works, so I think the download error is related to Cloud Run. Cloud Run stops after the request/response finishes, so I suspect it was stopped while I was still downloading the file.
I don't know how to solve this download error. If you have any solution, please help me!

Cloud Run limits the HTTP request/response size to 32 MB. Use multipart/form-data to send your large file in chunks rather than sending the whole file at once.

Thank you @guillaume blaquiere! I solved the download error. I'm posting my code for others.
from django.http import StreamingHttpResponse

def _file_iterator(file, chunk_size=512):
    # Read the file in small pieces so it is never fully loaded into memory.
    with open(file, 'rb') as f:
        while True:
            c = f.read(chunk_size)
            if c:
                yield c
            else:
                break
def download_view(request, pk):
    file_path = f'media/{pk}.mp3'
    file_name = f'{pk}.mp3'
    response = StreamingHttpResponse(_file_iterator(file_path))
    response['Content-Type'] = 'audio/mpeg'
    response['Content-Disposition'] = f'attachment;filename="{file_name}"'
    return response
I think StreamingHttpResponse is the key to this problem. It returns a large file in chunks, so the response does not exceed Cloud Run's limit.
When I used multipart/form-data as the Content-Type, I could download the file, but it couldn't be opened on a smartphone because no application could be selected for it, and on a PC it wasn't shown with an audio file icon. We should set the exact content type.
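
The chunked-iterator idea can be checked on its own, without Django. The following is a minimal sketch, assuming a hypothetical helper name `iter_file_chunks` and an 8 KiB chunk size (both are illustrative choices, not part of the answer above). In a view, such a generator would be passed to StreamingHttpResponse as shown in the answer.

```python
import os
import tempfile


def iter_file_chunks(path, chunk_size=8192):
    """Yield the file at `path` in pieces of at most `chunk_size` bytes."""
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


# Demo with a throwaway 20,000-byte file standing in for the MP3.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'x' * 20000)
    demo_path = tmp.name

chunks = list(iter_file_chunks(demo_path))
os.remove(demo_path)
```

Only one chunk is held in memory at a time, which is the property that matters for a 50 MB file.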

Related

How to convert a blob file to a specific format?

I am building a web application with ReactJS and Django framework.
In this web application, there is a part where I record an audio file and send it to the backend to save it.
This is the blob data from ReactJS that I send:
Blob {
size: 29535,
type: "audio/wav; codecs=0"
}
And this is the code I am using in the backend:
@api_view(['POST'])
@csrf_exempt
def AudioModel(request):
    try:
        audio = request.FILES.get('audio')
    except KeyError:
        return Response({'audio': ['no audio ?']}, status=HTTP_400_BAD_REQUEST)
    destination = open('audio_name.wav', 'wb')
    for chunk in audio.chunks():
        destination.write(chunk)
    destination.close() # closing the file
    return Response("Done!", status=HTTP_200_OK)
When I play the file I saved, it plays some sound but it crashes when it reaches the end.
This problem made me look for some information about the file I saved (extension, ...).
For this reason I used the fleep library:
import fleep
with open("audio_name.wav", "rb") as file:
    info = fleep.get(file.read(128))
print(info.type)
print(info.extension)
print(info.mime)
OUTPUT:
['video']
['webm']
['video/webm']
But I'm getting video in the output!
Am I doing something wrong?
How can I fix this issue?
Is there anything I can use to save my file in the desired format?
Any help is appreciated.
EDIT:
Output of first 128 bytes of the saved file:
b'\x1aE\xdf\xa3\x9fB\x86\x81\x01B\xf7\x81\x01B\xf2\x81\x04B\xf3\x81\x08B\x82\x84webmB\x87\x81\x04B\x85\x81\x02\x18S\x80g\x01\xff\xff\xff\xff\xff\xff\xff\x15I\xa9f\x99*\xd7\xb1\x83\x0fB#M\x80\x86ChromeWA\x86Chrome\x16T\xaek\xbf\xae\xbd\xd7\x81\x01s\xc5\x87\xbd\x8d\xc0\xd5\xc6\xaf\xd0\x83\x81\x02\x86\x86A_OPUSc\xa2\x93OpusHead\x01\x01\x00\x00\x80\xbb\x00\x00'

Use SciPy to read data from and write data to a variety of file formats.
Usage examples:
Writing wav file in Python with wavfile.write from SciPy
scipy.io.wavfile.write

It looks like react-voice-recorder uses a MediaRecorder object with default options internally. In the options you can set the correct mimeType, for example audio/webm; codecs=opus.
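
The first bytes shown in the question already reveal the container: the file starts with the EBML magic number used by Matroska/WebM, not with the RIFF/WAVE header of a WAV file. Below is a minimal sketch of that check using only the standard library; the function name `sniff_container` is hypothetical, and it recognises only these two formats.

```python
def sniff_container(header: bytes) -> str:
    """Guess the container format from the first bytes of a file."""
    # Matroska/WebM files start with the EBML magic number 1A 45 DF A3.
    if header.startswith(b'\x1a\x45\xdf\xa3'):
        return 'webm' if b'webm' in header[:64] else 'matroska'
    # WAV files are RIFF containers: "RIFF", 4 size bytes, then "WAVE".
    if header[:4] == b'RIFF' and header[8:12] == b'WAVE':
        return 'wav'
    return 'unknown'


# First bytes of the saved file, copied from the question's output.
saved = (b'\x1aE\xdf\xa3\x9fB\x86\x81\x01B\xf7\x81\x01B\xf2\x81\x04'
         b'B\xf3\x81\x08B\x82\x84webm')
print(sniff_container(saved))
```

Saving the upload under a `.wav` name does not change its contents, so the file would still need to be transcoded if a real WAV file is required.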

Is it possible to save a temporary file in an Azure Function Linux Consumption Plan in Python?

First of all, sorry for my English. I have an Azure Function on a Linux Consumption Plan using Python, and I need to generate an HTML document, convert it to PDF using wkhtmltopdf, and send it by email.
# generate temporary PDF
config = pdfkit.configuration(wkhtmltopdf="binary/wkhtmltopdf")
pdfkit.from_string(pdf_content, 'report.pdf', configuration=config, options={})
# read PDF as bytes
with open('report.pdf', 'rb') as f:
    data = f.read()
# encode bytes
encoded = base64.b64encode(data).decode()
# send email
EmailSendData.sendEmail(html_content, encoded, spanish_month)
Code is running ok in my local development but when I deploy the function and execute the code I am getting an error saying:
Result: Failure Exception: OSError: wkhtmltopdf reported an error: Loading pages (1/6) [> ] 0% [======> ] 10% [==============================> ] 50% [============================================================] 100% QPainter::begin(): Returned false Error: Unable to write to destination
I think the error is reported because, for some reason, write permission is not available. Can you help me solve this problem?
Thanks in advance.

The tempfile.gettempdir() method returns a temporary folder, which on Linux is /tmp. Your application can use this directory to store temporary files generated and used by your functions during execution.
So use /tmp/report.pdf as the path for the temporary file.
with open('/tmp/report.pdf', 'rb') as f:
    data = f.read()
For more details, you could refer to this article.

Final correct code:
import os
import tempfile
import pdfkit
config = pdfkit.configuration(wkhtmltopdf="binary/wkhtmltopdf")
local_path = os.path.join(tempfile.gettempdir(), 'report.pdf')
logger.info(tempfile.gettempdir())
pdfkit.from_string(pdf_content, local_path, configuration=config, options={})
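
The write, read, and encode steps can be tried without pdfkit. This is a minimal sketch under stated assumptions: the file name `report_demo.pdf` and the placeholder bytes are made up for illustration, and the plain `open(..., 'wb')` call stands in for `pdfkit.from_string`.

```python
import base64
import os
import tempfile

# Build the path under the platform's temp directory (/tmp on Linux).
local_path = os.path.join(tempfile.gettempdir(), 'report_demo.pdf')

# Stand-in for pdfkit.from_string(...): write some placeholder bytes.
with open(local_path, 'wb') as f:
    f.write(b'%PDF-1.4 placeholder')

# Read the file back and base64-encode it for the email attachment.
with open(local_path, 'rb') as f:
    encoded = base64.b64encode(f.read()).decode()

# Clean up once the attachment has been built.
os.remove(local_path)
```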

Delete file when file download is complete on Python x Django [duplicate]

I'm using the following django/python code to stream a file to the browser:
wrapper = FileWrapper(file(path))
response = HttpResponse(wrapper, content_type='text/plain')
response['Content-Length'] = os.path.getsize(path)
return response
Is there a way to delete the file after the response is returned? Using a callback function or something?
I could just make a cron to delete all tmp files, but it would be neater if I could stream files and delete them as well from the same request.

You can use a NamedTemporaryFile:
from django.core.files.temp import NamedTemporaryFile
def send_file(request):
    newfile = NamedTemporaryFile(suffix='.txt')
    # save your data to newfile.name
    wrapper = FileWrapper(newfile)
    response = HttpResponse(wrapper, content_type=mime_type)
    response['Content-Disposition'] = 'attachment; filename=%s' % os.path.basename(modelfile.name)
    response['Content-Length'] = os.path.getsize(modelfile.name)
    return response
The temporary file should be deleted once the newfile object is evicted.
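
The delete-on-close behaviour can be observed directly. This sketch uses the standard library's `tempfile.NamedTemporaryFile` rather than Django's wrapper (an assumption made so it runs without Django), and the variable names are illustrative.

```python
import os
from tempfile import NamedTemporaryFile

newfile = NamedTemporaryFile(suffix='.txt')
newfile.write(b'report contents')
newfile.flush()

path = newfile.name
existed_while_open = os.path.exists(path)

newfile.close()  # closing the object removes the file from disk
exists_after_close = os.path.exists(path)
```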

For future reference:
I just had a case in which I couldn't use temp files for downloads.
But I still needed to delete them afterwards, so here is how I did it (I really didn't want to rely on cron jobs or celery or wossnames; it's a very small system and I wanted it to stay that way).
def plug_cleaning_into_stream(stream, filename):
    try:
        closer = getattr(stream, 'close')
        # define a new function that still uses the old one
        def new_closer():
            closer()
            os.remove(filename)
            # any cleaning you need added as well
        # substitute it for the old close() function
        setattr(stream, 'close', new_closer)
    except:
        raise
and then I just took the stream used for the response and plugged into it.
def send_file(request, filename):
    with io.open(filename, 'rb') as ready_file:
        plug_cleaning_into_stream(ready_file, filename)
        response = HttpResponse(ready_file.read(), content_type='application/force-download')
        # here all the rest of the header settings
        # ...
        return response
I know this is quick and dirty but it works. I doubt it would be productive for a server with thousands of requests a second, but that's not my case here (at most a few dozen a minute).
EDIT: I forgot to mention that I was dealing with very, very big files that could not fit in memory during the download. That is why I am using a BufferedReader (which is what is underneath io.open()).

Mostly, we use periodic cron jobs for this.
Django already has one cron job to clean up lost sessions. And you're already running it, right?
See http://docs.djangoproject.com/en/dev/topics/http/sessions/#clearing-the-session-table
You want another command just like this one, in your application, that cleans up old files.
See this http://docs.djangoproject.com/en/dev/howto/custom-management-commands/
Also, you may not really be sending this file from Django. Sometimes you can get better performance by creating the file in a directory used by Apache and redirecting to a URL so the file can be served by Apache for you. Sometimes this is faster. It doesn't handle the cleanup any better, however.
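
The core of such a cleanup command is a function that removes files older than some threshold. Here is a minimal standard-library sketch; the name `remove_old_files` and the one-hour threshold are assumptions for illustration, and wiring it into a Django management command is left out.

```python
import os
import tempfile
import time


def remove_old_files(directory, max_age_seconds):
    """Delete regular files in `directory` older than `max_age_seconds`.

    Returns the sorted names of the files that were removed.
    """
    cutoff = time.time() - max_age_seconds
    with os.scandir(directory) as it:
        entries = list(it)  # snapshot first, then delete
    removed = []
    for entry in entries:
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)


# Demo: one fresh file and one whose mtime is set two hours in the past.
demo_dir = tempfile.mkdtemp()
for name in ('fresh.txt', 'stale.txt'):
    with open(os.path.join(demo_dir, name), 'w') as f:
        f.write('data')
two_hours_ago = time.time() - 7200
os.utime(os.path.join(demo_dir, 'stale.txt'), (two_hours_ago, two_hours_ago))

removed = remove_old_files(demo_dir, max_age_seconds=3600)
remaining = sorted(os.listdir(demo_dir))
```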

One way would be to add a view to delete this file and call it from the client side using an asynchronous call (XMLHttpRequest). A variant of this would involve reporting back from the client on success so that the server can mark this file for deletion and have a periodic job clean it up.

This is just using the regular python approach (very simple example):
# something generates a file at filepath
from subprocess import Popen
# open file
with open(filepath, "rb") as fid:
    filedata = fid.read()
# remove the file
p = Popen("rm %s" % filepath, shell=True)
# make response
response = HttpResponse(filedata, content_type="text/plain")
return response

Python 3.7 , Django 2.2.5
from tempfile import NamedTemporaryFile
from django.http import HttpResponse
with NamedTemporaryFile(suffix='.csv', mode='r+', encoding='utf8') as f:
    f.write('\uFEFF') # BOM
    f.write('sth you want')
    # ref: https://docs.python.org/3/library/tempfile.html#examples
    f.seek(0)
    data = f.read()
    response = HttpResponse(data, content_type="text/plain")
    response['Content-Disposition'] = 'inline; filename=export.csv'

In Flask, how can I send a temporary file and delete it after upload is finished? [duplicate]

I have a Flask view that generates data and saves it as a CSV file with Pandas, then displays the data. A second view serves the generated file. I want to remove the file after it is downloaded. My current code raises a permission error, maybe because after_request deletes the file before it is served with send_from_directory. How can I delete a file after serving it?
def process_data(data):
    tempname = str(uuid4()) + '.csv'
    data['text'].to_csv('samo/static/temp/{}'.format(tempname))
    return tempname
@projects.route('/getcsv/<file>')
def getcsv(file):
    @after_this_request
    def cleanup(response):
        os.remove('samo/static/temp/' + file)
        return response
    return send_from_directory(directory=cwd + '/samo/static/temp/', filename=file, as_attachment=True)

after_request runs after the view returns but before the response is sent. Sending a file may use a streaming response; if you delete it before it's read fully you can run into errors.
This is mostly an issue on Windows; other platforms can mark a file as deleted and keep it around until it is no longer being accessed. However, it may still be useful to delete the file only once you're sure it's been sent, regardless of platform.
Read the file into memory and serve it, so that it's not being read when you delete it later. If the file is too big to read into memory, use a generator to serve it and then delete it.
@app.route('/download_and_remove/<filename>')
def download_and_remove(filename):
    path = os.path.join(current_app.instance_path, filename)
    def generate():
        with open(path) as f:
            yield from f
        os.remove(path)
    r = current_app.response_class(generate(), mimetype='text/csv')
    r.headers.set('Content-Disposition', 'attachment', filename='data.csv')
    return r
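
The generator approach can be exercised outside Flask. The sketch below is a variant rather than the answer's exact code: it reads in binary chunks and uses try/finally so the file is also removed if the generator is closed before it is exhausted. The name `stream_then_remove` and the 4 KiB chunk size are assumptions.

```python
import os
import tempfile


def stream_then_remove(path, chunk_size=4096):
    """Yield the file in chunks, deleting it once iteration ends."""
    try:
        with open(path, 'rb') as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                yield chunk
    finally:
        # Runs on normal exhaustion and when the generator is closed early.
        os.remove(path)


# Demo with a throwaway CSV file.
fd, demo_path = tempfile.mkstemp(suffix='.csv')
with os.fdopen(fd, 'wb') as f:
    f.write(b'a,b\n1,2\n')

body = b''.join(stream_then_remove(demo_path))
```

In a view, the generator would be handed to the response class in place of `generate()`.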

How can I download a zip to a Python directory from Google Storage after obtaining the response object?

After running the following code successfully, I think I am close to getting access to the zip file in gcloud storage. However, I really cannot figure out what to do next: download it, or do something else to make the zip file available in the Python environment as a programmable object.
from gs import GSClient
client = GSClient()
object_meta = client.get("b/rcmikejupyter/o/output1.zip")
with client.get("b/rcmikejupyter/o/output1.zip", params=dict(alt="media"), stream=True) as res:
    object_bytes = res.raw.read()

Assuming this is a bytes object:
with open("pathto/yourfile.zip", "wb") as file:
    file.write(object_bytes)
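
If the goal is to work with the archive in Python rather than save it, the bytes can also be opened in memory. This is a minimal sketch using the standard library's `zipfile` and `io` modules; since the real download is not available here, a small archive built in memory stands in for `object_bytes`, and the member name `hello.txt` is made up.

```python
import io
import zipfile

# Stand-in for the downloaded bytes: build a small zip archive in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:
    zf.writestr('hello.txt', 'hello from the archive')
object_bytes = buffer.getvalue()

# Wrap the bytes in a file-like object and open them as a zip archive.
with zipfile.ZipFile(io.BytesIO(object_bytes)) as archive:
    names = archive.namelist()
    text = archive.read('hello.txt').decode()
```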
