Hello
My error is produced in generating a zip file. Can you inform what I should do?
main.py", line 2289, in get
buf=zipf.read(2048)
NameError: global name 'zipf' is not defined
The complete code is as follows:
def addFile(self,zipstream,url,fname):
# get the contents
result = urlfetch.fetch(url)
# store the contents in a stream
f=StringIO.StringIO(result.content)
length = result.headers['Content-Length']
f.seek(0)
# write the contents to the zip file
while True:
buff = f.read(int(length))
if buff=="":break
zipstream.writestr(fname,buff)
return zipstream
def get(self):
self.response.headers["Cache-Control"] = "public,max-age=%s" % 86400
start=datetime.datetime.now()-timedelta(days=20)
count = int(self.request.get('count')) if not self.request.get('count')=='' else 1000
from google.appengine.api import memcache
memcache_key = "ads"
data = memcache.get(memcache_key)
if data is None:
a= Ad.all().filter("modified >", start).filter("url IN", ['www.koolbusiness.com']).filter("published =", True).order("-modified").fetch(count)
memcache.set("ads", a)
else:
a = data
dispatch='templates/kml.html'
template_values = {'a': a , 'request':self.request,}
path = os.path.join(os.path.dirname(__file__), dispatch)
output = template.render(path, template_values)
self.response.headers['Content-Length'] = len(output)
zipstream=StringIO.StringIO()
file = zipfile.ZipFile(zipstream,"w")
url = 'http://www.koolbusiness.com/list.kml'
# repeat this for every URL that should be added to the zipfile
file =self.addFile(file,url,"list.kml")
# we have finished with the zip so package it up and write the directory
file.close()
zipstream.seek(0)
# create and return the output stream
self.response.headers['Content-Type'] ='application/zip'
self.response.headers['Content-Disposition'] = 'attachment; filename="list.kmz"'
while True:
buf=zipf.read(2048)
if buf=="": break
self.response.out.write(buf)
That is probably zipstream and not zipf. So replace that with zipstream and it might work.
i don't see where you declare zipf?
zipfile? Senthil Kumaran is probably right with zipstream since you seek(0) on zipstream before the while loop to read chunks of the mystery variable.
edit:
Almost certainly the variable is zipstream.
zipfile docs:
class zipfile.ZipFile(file[, mode[, compression[, allowZip64]]])
Open a ZIP file, where file can be either a path to a file (a string) or
a file-like object. The mode parameter
should be 'r' to read an existing
file, 'w' to truncate and write a new
file, or 'a' to append to an existing
file. If mode is 'a' and file refers
to an existing ZIP file, then
additional files are added to it. If
file does not refer to a ZIP file,
then a new ZIP archive is appended to
the file. This is meant for adding a
ZIP archive to another file (such as
python.exe).
your code:
zipsteam=StringIO.StringIO()
create a file-like object using StringIO which is essentially a "memory file" read more in docs
file = zipfile.ZipFile(zipstream,w)
opens the zipfile with the zipstream file-like object in 'w' mode
url = 'http://www.koolbusiness.com/list.kml'
# repeat this for every URL that should be added to the zipfile
file =self.addFile(file,url,"list.kml")
# we have finished with the zip so package it up and write the directory
file.close()
uses the addFile method to retrieve and write the retrieved data to the file-like object and returns it. The variables are slightly confusing because you pass a zipfile to the addFile method which aliases as zipstream (confusing because we are using zipstream as a StringIO file-like object). Anyways, the zipfile is returned, and closed to make sure everything is "written".
It was written to our "memory file", which we now seek to index 0
zipstream.seek(0)
and after doing some header stuff, we finally reach the while loop that will read our "memory-file" in chunks
while True:
buf=zipstream.read(2048)
if buf=="": break
self.response.out.write(buf)
You need to declare:
global zipf
right after your
def get(self):
line. you are modifying a global variable, and this is the only way python knows what you are doing.
Related
I apologize for the vague definition of my problem in the title, but I really can't figure out what sort of problem I'm dealing with. So, here it goes.
I have python file:
edit-json.py
import os, json
def add_rooms(data):
if(not os.path.exists('rooms.json')):
with open('rooms.json', 'w'): pass
with open('rooms.json', 'r+') as f:
d = f.read() # take existing data from file
f.truncate(0) # empty the json file
if(d == ''): rooms = [] # check if data is empty i.e the file was just created
else: rooms = json.loads(d)['rooms']
rooms.append({'name': data['roomname'], 'active': 1})
f.write(json.dumps({"rooms": rooms})) # write new data(rooms list) to the json file
add_rooms({'roomname': 'friends'})'
This python script basically creates a file rooms.json(if it doesn't exist), grabs the data(array) from the json file, empties the json file, then finally writes the new data into the file. All this is done in the function add_rooms(), which is then called at the end of the script, pretty simple stuff.
So, here's the problem, I run the file once, nothing weird happens, i.e the file is created and the data inside it is:
{"rooms": [{"name": "friends"}]}
But the weird stuff happens when the run the script again.
What I should see:
{"rooms": [{"name": "friends"}, {"name": "friends"}]}
What I see instead:
I apologize I had to post the image because for some reason I couldn't copy the text I got.
and I can't obviously run the script again(for the third time) because the json parser gives error due to those characters
I obtained this result in an online compiler. In my local windows system, I get extra whitespace instead of those extra symbols.
I can't figure out what causes it. Maybe I'm not doing file handling incorrectly? or is it due to the json module? or am I the only one getting this result?
When you truncate the file, the file pointer is still at the end of the file. Use f.seek(0) to move back to the start of the file:
import os, json
def add_rooms(data):
if(not os.path.exists('rooms.json')):
with open('rooms.json', 'w'): pass
with open('rooms.json', 'r+') as f:
d = f.read() # take existing data from file
f.truncate(0) # empty the json file
f.seek(0) # <<<<<<<<< add this line
if(d == ''): rooms = [] # check if data is empty i.e the file was just created
else: rooms = json.loads(d)['rooms']
rooms.append({'name': data['roomname'], 'active': 1})
f.write(json.dumps({"rooms": rooms})) # write new data(rooms list) to the json file
add_rooms({'roomname': 'friends'})
I added sections and its values to ini file, but configparser doesn't want to print what sections I have in total. What I've done:
import configparser
import os
# creating path
current_path = os.getcwd()
path = 'ini'
try:
os.mkdir(path)
except OSError:
print("Creation of the directory %s failed" % path)
# add section and its values
config = configparser.ConfigParser()
config['section-1'] = {'somekey' : 'somevalue'}
file = open(f'ini/inifile.ini', 'a')
with file as f:
config.write(f)
file.close()
# get sections
config = configparser.ConfigParser()
file = open(f'ini/inifile.ini')
with file as f:
config.read(f)
print(config.sections())
file.close()
returns
[]
The similar code was in the documentation, but doesn't work. What I do wrong and how I could solve this problem?
From the docs, config.read() takes in a filename (or list of them), not a file descriptor object:
read(filenames, encoding=None)
Attempt to read and parse an iterable of filenames, returning a list of filenames which were successfully parsed.
If filenames is a string, a bytes object or a path-like object, it is treated as a single filename. ...
If none of the named files exist, the ConfigParser instance will contain an empty dataset. ...
A file object is an iterable of strings, so basically the config parser is trying to read each string in the file as a filename. Which is sort of interesting and silly, because if you passed it a file that contained the filename of your actual config...it would work.
Anyways, you should pass the filename directly to config.read(), i.e.
config.read("ini/inifile.ini")
Or, if you want to use a file descriptor object instead, simply use config.read_file(f). Read the docs for read_file() for more information.
As an aside, you are duplicating some of the work the context manager is doing for no gain. You can use the with block without creating the object explicitly first or closing it after (it will get closed automatically). Keep it simple:
with open("path/to/file.txt") as f:
do_stuff_with_file(f)
I have a python script that gets input file names from the command prompt. I created a list to store all the input files and pass that to a function to create a new file with all the input files merged at once. Now, I pass this newly written file as an input to another function. I am getting an error message
TypeError: coercing to Unicode: need string or buffer, list found
Code:
file_list = []
for arg in range(1,len(sys.argv)-2):
file_list.append(sys.argv[arg])
process_name = sys.argv[len(sys.argv)-1]
integrate_files(file_list,process_name)
def integrate_files(file_list,process_name):
with open('result.log', 'w' ) as result:
for file_ in file_list:
for line in open( file_, 'r' ):
result.write( line )
start_process(result,process_name)
def start_process(result,process_name):
with open(result,'r') as mainFile:
content = mainFile.readlines()
I am getting this error highlighted at the lines having the word with.open(). I tried to print the abspath of the result.log file. It printed closed file 'result.log', mode 'w' at 0x000000000227578. Where am I going wrong ? How should I create a new file and pass it to a function?
Your problem is that result is a closed file object:
start_process(result,process_name)
I think you want
start_process('result.log', process_name)
You could clean the script up a bit with
import shutil
file_list = sys.argv[1:-1]
process_name = sys.argv[-1]
integrate_files(file_list,process_name)
def integrate_files(file_list,process_name):
with open('result.log', 'w' ) as result:
for file_ in file_list:
with open(file_) as infile:
shutil.copyfileobj(infile, result)
start_process('result.log',process_name)
def start_process(result,process_name):
with open(result,'r') as mainFile:
content = mainFile.readlines()
The issue is here:
with open('result.log', 'w' ) as result:
# ...
start_process(result,process_name)
Since you reopen your file in start_process, you should just pass the name:
start_process(result.name, process_name)
Or just be explicit:
start_process('result.log', process_name)
When you write with open('result.log', 'w') as result:, you make result be an object representing the actual file on disk. That is different from the name of the file.
You certainly can pass that result to another function. But since it will be the actual file object, and not a file name, you can't pass that to open - open expects a name of a file, and looks for the file with that name, in order to create a new file object.
You can call methods on that file object, but none of them will actually re-open the file. Instead, the simplest thing is to remember and pass the file name, so that start_process can open it again.
As shown in #matsjoyce's answer, the file object remembers the original file name. So you could pass the object, and have start_process get the name. But that's messy. Really, just pass the name. (You could, like mats showed, pass result.name explicitly instead of making your own name variable first). Passing file objects around is usually not what you want - do it only when you want to split the reading/writing work across functions (and have a good reason for that).
In this:
with open('result.log', 'w' ) as result:
When you define result above, you are only defining it for that single loop, so it won't pass when you call start_process
So either change start_process to:
with open('result.log','r') as mainFile:
Or you could pass the string result.log into start_process instead of the variable result:
file_list = []
for arg in range(1,len(sys.argv)-2):
file_list.append(sys.argv[arg])
process_name = sys.argv[len(sys.argv)-1]
integrate_files(file_list,process_name)
def integrate_files(file_list,process_name):
with open('result.log', 'w' ) as result:
for file_ in file_list:
for line in open( file_, 'r' ):
result.write( line )
start_process('result.log',process_name)
def start_process(result,process_name):
with open(result,'r') as mainFile:
content = mainFile.readlines()
How do I extract a zip to memory?
My attempt (returning None on .getvalue()):
from zipfile import ZipFile
from StringIO import StringIO
def extract_zip(input_zip):
return StringIO(ZipFile(input_zip).extractall())
extractall extracts to the file system, so you won't get what you want. To extract a file in memory, use the ZipFile.read() method.
If you really need the full content in memory, you could do something like:
def extract_zip(input_zip):
input_zip=ZipFile(input_zip)
return {name: input_zip.read(name) for name in input_zip.namelist()}
If you work with in-memory archives frequently, I would recommend making a tool. Something like this:
# Works in Python 2 and 3.
try:
import BytesIO
except ImportError:
from io import BytesIO # Python 3
import zipfile
class InMemoryZip(object):
def __init__(self):
# Create the in-memory file-like object for working w/IMZ
self.in_memory_zip = BytesIO()
# Just zip it, zip it
def append(self, filename_in_zip, file_contents):
# Appends a file with name filename_in_zip and contents of
# file_contents to the in-memory zip.
# Get a handle to the in-memory zip in append mode
zf = zipfile.ZipFile(self.in_memory_zip, "a", zipfile.ZIP_DEFLATED, False)
# Write the file to the in-memory zip
zf.writestr(filename_in_zip, file_contents)
# Mark the files as having been created on Windows so that
# Unix permissions are not inferred as 0000
for zfile in zf.filelist:
zfile.create_system = 0
return self
def read(self):
# Returns a string with the contents of the in-memory zip.
self.in_memory_zip.seek(0)
return self.in_memory_zip.read()
# Zip it, zip it, zip it
def writetofile(self, filename):
# Writes the in-memory zip to a physical file.
with open(filename, "wb") as file:
file.write(self.read())
if __name__ == "__main__":
# Run a test
imz = InMemoryZip()
imz.append("testfile.txt", "Make a test").append("testfile2.txt", "And another one")
imz.writetofile("testfile.zip")
print("testfile.zip created")
Probable reasons:
1.This module does not currently handle multi-disk ZIP files.
(OR)
2.Check with StringIO.getvalue() weather Unicode Error is coming up.
In a web app I am working on, the user can create a zip archive of a folder full of files. Here here's the code:
files = torrent[0].files
zipfile = z.ZipFile(zipname, 'w')
output = ""
for f in files:
zipfile.write(settings.PYRAT_TRANSMISSION_DOWNLOAD_DIR + "/" + f.name, f.name)
downloadurl = settings.PYRAT_DOWNLOAD_BASE_URL + "/" + settings.PYRAT_ARCHIVE_DIR + "/" + filename
output = "Download " + torrent_name + ""
return HttpResponse(output)
But this has the nasty side effect of a long wait (10+ seconds) while the zip archive is being downloaded. Is it possible to skip this? Instead of saving the archive to a file, is it possible to send it straight to the user?
I do beleive that torrentflux provides this excat feature I am talking about. Being able to zip GBs of data and download it within a second.
Check this Serving dynamically generated ZIP archives in Django
As mandrake says, constructor of HttpResponse accepts iterable objects.
Luckily, ZIP format is such that archive can be created in single pass, central directory record is located at the very end of file:
(Picture from Wikipedia)
And luckily, zipfile indeed doesn't do any seeks as long as you only add files.
Here is the code I came up with. Some notes:
I'm using this code for zipping up a bunch of JPEG pictures. There is no point compressing them, I'm using ZIP only as container.
Memory usage is O(size_of_largest_file) not O(size_of_archive). And this is good enough for me: many relatively small files that add up to potentially huge archive
This code doesn't set Content-Length header, so user doesn't get nice progress indication. It should be possible to calculate this in advance if sizes of all files are known.
Serving the ZIP straight to user like this means that resume on downloads won't work.
So, here goes:
import zipfile
class ZipBuffer(object):
""" A file-like object for zipfile.ZipFile to write into. """
def __init__(self):
self.data = []
self.pos = 0
def write(self, data):
self.data.append(data)
self.pos += len(data)
def tell(self):
# zipfile calls this so we need it
return self.pos
def flush(self):
# zipfile calls this so we need it
pass
def get_and_clear(self):
result = self.data
self.data = []
return result
def generate_zipped_stream():
sink = ZipBuffer()
archive = zipfile.ZipFile(sink, "w")
for filename in ["file1.txt", "file2.txt"]:
archive.writestr(filename, "contents of file here")
for chunk in sink.get_and_clear():
yield chunk
archive.close()
# close() generates some more data, so we yield that too
for chunk in sink.get_and_clear():
yield chunk
def my_django_view(request):
response = HttpResponse(generate_zipped_stream(), mimetype="application/zip")
response['Content-Disposition'] = 'attachment; filename=archive.zip'
return response
Here's a simple Django view function which zips up (as an example) any readable files in /tmp and returns the zip file.
from django.http import HttpResponse
import zipfile
import os
from cStringIO import StringIO # caveats for Python 3.0 apply
def somezip(request):
file = StringIO()
zf = zipfile.ZipFile(file, mode='w', compression=zipfile.ZIP_DEFLATED)
for fn in os.listdir("/tmp"):
path = os.path.join("/tmp", fn)
if os.path.isfile(path):
try:
zf.write(path)
except IOError:
pass
zf.close()
response = HttpResponse(file.getvalue(), mimetype="application/zip")
response['Content-Disposition'] = 'attachment; filename=yourfiles.zip'
return response
Of course this approach will only work if the zip files will conveniently fit into memory - if not, you'll have to use a disk file (which you're trying to avoid). In that case, you just replace the file = StringIO() with file = open('/path/to/yourfiles.zip', 'wb') and replace the file.getvalue() with code to read the contents of the disk file.
Does the zip library you are using allow for output to a stream. You could stream directly to the user instead of temporarily writing to a zip file THEN streaming to the user.
It is possible to pass an iterator to the constructor of a HttpResponse (see docs). That would allow you to create a custom iterator that generates data as it is being requested. However I don't think that will work with a zip (you would have to send partial zip as it is being created).
The proper way, I think, would be to create the files offline, in a separate process. The user could then monitor the progress and then download the file when its ready (possibly by using the iterator method described above). This would be similar what sites like youtube use when you upload a file and wait for it to be processed.