I have a django blog and want to download a backup zipfile with all the entries. The blog post text content is stored in the database.
I wrote this code intending to produce a zipfile containing one .txt file per post in the archive root, but all it outputs is a single corrupted zip file. It cannot be unzipped, yet for some reason it can be opened in Word, which shows all of the blog post text mashed together.
import zipfile
from io import BytesIO
from django.http import HttpResponse

def download_backups(request):
    zip_filename = "test.zip"
    s = BytesIO()
    zf = zipfile.ZipFile(s, "w")
    blogposts = Blog.objects.all()
    for blogpost in blogposts:
        filename = blogpost.title + ".txt"
        zf.writestr(filename, blogpost.content)
    resp = HttpResponse(s.getvalue())
    resp['Content-Disposition'] = 'attachment; filename=%s' % zip_filename
    return resp
Any help is appreciated.
Based on this answer to another question, you may be having an issue with the read mode. You'll also need to call zf.close(), either explicitly or implicitly, before the file will actually be complete.
I think there's a simpler way of handling this using a temporary file, which should have the advantage of not needing to fit all of the file's contents in memory.
from tempfile import TemporaryFile
from zipfile import ZipFile

with TemporaryFile() as tf:
    with ZipFile(tf, mode="w") as zf:
        zf.writestr("file1.txt", "The first file")
        zf.writestr("file2.txt", "A second file")
    tf.seek(0)
    print(tf.read())
The with blocks here ensure that zf.close is called implicitly (when the inner block exits) before you attempt to read the file, and that the temp file is deleted once the outer block exits.
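Putting the pieces together for the original view, here is a minimal sketch of building the archive correctly. A plain dict stands in for `Blog.objects.all()`, and the Django response wiring is assumed to stay as in the question:

```python
import io
import zipfile

def build_backup_zip(posts):
    """Build a valid zip in memory; `posts` maps title -> content."""
    buf = io.BytesIO()
    # Closing the archive (via the with block) writes the zip's
    # central directory; skipping this step is what corrupts the file.
    with zipfile.ZipFile(buf, "w") as zf:
        for title, content in posts.items():
            zf.writestr(title + ".txt", content)
    return buf.getvalue()
```

In the view you would then return `HttpResponse(build_backup_zip(...), content_type="application/zip")` with the same Content-Disposition header as before.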
If the goal here is just to back up the data rather than using this specific format, though, I'd suggest using the built-in dumpdata management command. You can call it from code if you want to serve the results through a view like this.
I have designed a webpage that allows the user to upload a zip file. What I want to do is store this zip file directly into my sqlite database as a large binary object, then be able to read this binary object as a zipfile using the zipfile package. Unfortunately this doesn't work because attempting to pass the file as a binary string in io.BytesIO into zipfile.ZipFile gives the error detailed in the title.
For my MWE, I exclude the database to better demonstrate my issue.
views = Blueprint('views', __name__)

@views.route("/upload", methods=["GET", "POST"])
def upload():
    # Assume that file in request is a zip file (checked already)
    f = request.files['file']
    zip_content = f.read()
    # Store in database
    # ...
    # at some point retrieve the file from database
    archive = zipfile.ZipFile(io.BytesIO(zip_content))
    return ""
I have searched for days on-end how to fix this issue without success. I have even printed out zip_content and the contents of io.BytesIO(zip_content) after applying .read() and they are exactly the same string.
What am I doing wrong?
Solved. Using f.read() only gets the name of the zip file. I needed to use f.getvalue() instead to get the full file contents.
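For reference, the round trip itself works fine with plain bytes; a minimal sketch with no Flask involved, where the blob below stands in for the value stored in the database:

```python
import io
import zipfile

# Build some zip bytes, as if they came from an upload.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "payload")
blob = buf.getvalue()  # what you would store as the binary object

# Later: reopen the stored bytes as a zip archive.
archive = zipfile.ZipFile(io.BytesIO(blob))
```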
I want to store data in a JSON file as an object in Python without having to store it locally, but I have not found any way of doing this! Currently, I create a file locally like so:
open(local_file_name + '.json', 'wb').write(file.content)
and then I proceed to use it for various functions. The problem, however, is that I create multiple of these files, so many in fact that it would simply be easier for me if I could somehow make them objects instead. Any suggestions?
It seems my code lacked one very important line of code: file.close()
Here is my code. Perhaps the data is stored locally for some time, but there is no JSON file created within the directory.
local_path = os.getcwd()
local_file_name = 'example.json'
f = open(local_file_name, 'wb')
f.write(file.content)
f.close()
upload_file_path = os.path.join(local_path, local_file_name)
Now I can upload the file object!
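If the goal is really to avoid files on disk, one option is to keep the payload as an in-memory object instead. A sketch, where `raw` stands in for `file.content` from the original code:

```python
import io
import json

raw = b'{"name": "example", "size": 3}'  # stand-in for file.content

obj = json.loads(raw)   # a plain dict, no file on disk
buf = io.BytesIO(raw)   # file-like object, for APIs that expect one
same = json.load(buf)
```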
I am trying to use "requests" package and retrieve info from Github, like the Requests doc page explains:
import requests
r = requests.get('https://api.github.com/events')
And this:
with open(filename, 'wb') as fd:
    for chunk in r.iter_content(chunk_size):
        fd.write(chunk)
I have to say I don't understand the second code block.
filename - in what form do I provide the path to the file if created? where will it be saved if not?
'wb' - what is this variable? (shouldn't second parameter be 'mode'?)
following two lines probably iterate over data retrieved with request and write to the file
Python docs explanation also not helping much.
EDIT: What I am trying to do:
use Requests to connect to an API (Github and later Facebook GraphAPI)
retrieve data into a variable
write this into a file (later, as I get more familiar with Python, into my local MySQL database)
Filename
When using open the path is relative to your current working directory (which is not necessarily the folder your Python script is in). So if you said open('file.txt','w') it would create a new file named file.txt in whatever directory you ran Python from. You can also specify an absolute path, for example /home/user/file.txt in linux. If a file by the name 'file.txt' already exists, the contents will be completely overwritten.
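A small sketch of the working-directory behaviour, using a scratch directory so nothing is left behind:

```python
import os
import tempfile

os.chdir(tempfile.mkdtemp())        # pretend this is where you launched python
with open("file.txt", "w") as f:    # relative path: created in the cwd
    f.write("hello")

path = os.path.abspath("file.txt")  # the absolute location it ended up at
```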
Mode
The 'wb' option is indeed the mode. The 'w' means write and the 'b' means bytes. You use 'w' when you want to write to (rather than read from) a file, and you use 'b' for binary files (rather than text files). It is actually a little odd to use 'b' in this case, as the content you are writing is a text file. Specifying 'w' would work just as well here. Read more on the modes in the docs for open.
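The difference between the two modes in a nutshell: 'wb' takes bytes, 'w' takes str. A quick sketch using a scratch directory:

```python
import os
import tempfile

d = tempfile.mkdtemp()

with open(os.path.join(d, "demo.bin"), "wb") as f:
    f.write(b"\x00\x01")       # binary mode requires bytes

with open(os.path.join(d, "demo.txt"), "w") as f:
    f.write("plain text")      # text mode requires str
```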
The Loop
This part is using the iter_content method from requests, which is intended for use with large files that you may not want in memory all at once. This is unnecessary in this case, since the page in question is only 89 KB. See the requests library docs for more info.
Conclusion
The example you are looking at is meant to handle the most general case, in which the remote file might be binary and too big to be in memory. However, we can make your code more readable and easy to understand if you are only accessing small webpages containing text:
import requests
r = requests.get('https://api.github.com/events')
with open('events.txt', 'w') as fd:
    fd.write(r.text)
filename is a string of the path you want to save it at. It accepts either a relative or an absolute path, so you can just have filename = 'example.html'
wb stands for write and bytes; see the docs for open to learn more.
The for loop goes over the entire returned content (in chunks, in case it is too large for proper memory handling), and writes them until there are none left. Useful for large files, but for a single webpage you could just do:
# just 'w' because we are not writing bytes anymore, just text
with open(filename, 'w') as fd:
    fd.write(r.text)
I am using a 3rd party python library that creates .svg's (specifically for evolutionary trees) which has a render function for tree objects. What I want is the svg in string form that I can edit. Currently I save the svg and read the file as follows:
tree.render('location/filename.svg', other_args...)
f = open('location/filename.svg', "r")
svg_string = f.read()
f.close()
This works, but is it possible to use a tempfile instead? So far I have:
t = tempfile.NamedTemporaryFile()
tmpdir = tempfile.mkdtemp()
t.name = os.path.join(tmpdir, 'tmp.svg')
tree.render(t.name, other_args...)
svg_string = t.read()
t.close()
Can anyone explain why this doesn't work and/or how I could do this without creating a file (which I just have to delete later). The svg_string I go on to edit for use in a django application.
EDIT: Importantly, the render function can also be used to create other filetypes e.g. .png - so the .svg extension needs to be specified.
You should not set the name of your temporary file yourself. When you create it, a random name is generated for you, and you can use it directly. Since your render function needs the file to end in .svg, pass that as the suffix.
t = tempfile.NamedTemporaryFile(suffix='.svg')  # keeps the extension render needs
tree.render(t.name, other_args...)
t.seek(0)  # reset the file pointer to the beginning
svg_string = t.read()
t.close()  # the temp file is deleted here
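To see why this works (on POSIX systems, where a second open of the same path reaches the same underlying file), here is a sketch with the render call simulated by a plain write:

```python
import tempfile

t = tempfile.NamedTemporaryFile(suffix=".svg")  # extension the renderer needs

# Stand-in for tree.render(t.name, ...): write through a second handle.
with open(t.name, "w") as f:
    f.write("<svg></svg>")

t.seek(0)                       # rewind before reading through t
svg_string = t.read().decode()  # the bytes the "renderer" wrote
t.close()                       # the temp file is deleted here
```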
I need to get a binary file from wtforms and store it as bytea in postgresql. I don't need to store it permanently as a file. From my understanding of the official Flask docs, I should be able to access the filename through either request.files['myfile'].filename or secure_filename(f.filename). However, both of them give me an error: IOError: [Errno 2] No such file or directory: u'myuploadpdf.pdf'
f = request.files['myfile']
if f and allowed_file(f.filename):
    #filename = secure_filename(f.filename)
    data = open(f.filename, 'rb').read()
    #data = open(filename, 'rb').read()
    binary = psycopg2.Binary(data)
open() expects a pathname to the file. Since the file hasn't been saved to disk, no such path exists. :)
What you actually want to do is call f.read() directly. Reading incoming files is covered in the Flask docs.
Also, definitely use secure_filename() if you work with anything on disk. Don't want to open yourself to any directory traversal attacks down the line.
The objects in request.files are FileStorage objects and they have the same methods as normal file objects in python. So to get the contents of the file as binary, try doing this:
data = request.files['myfile'].read()
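Since FileStorage behaves like a file object, the same line works against any file-like stand-in. A sketch with io.BytesIO playing the uploaded file (the psycopg2 step is assumed):

```python
import io

# Stand-in for request.files['myfile'] -- any file-like object works.
f = io.BytesIO(b"%PDF-1.4 fake upload bytes")

data = f.read()  # raw bytes, ready to wrap in psycopg2.Binary(data)
```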