tmpfile and gzip combination problem - python

I have a problem with this code:
file = tempfile.TemporaryFile(mode='wrb')
file.write(base64.b64decode(data))
file.flush()
os.fsync(file)
# file.seek(0)
f = gzip.GzipFile(mode='rb', fileobj=file)
print f.read()
I don't know why it doesn't print anything. If I uncomment file.seek(0), this error occurs:
File "/usr/lib/python2.5/gzip.py", line 263, in _read
self._read_gzip_header()
File "/usr/lib/python2.5/gzip.py", line 162, in _read_gzip_header
magic = self.fileobj.read(2)
IOError: [Errno 9] Bad file descriptor
Just for information this version works fine:
x = open("test.gzip", 'wb')
x.write(base64.b64decode(data))
x.close()
f = gzip.GzipFile('test.gzip', 'rb')
print f.read()
EDIT: Regarding the 'wrb' mode: it doesn't raise an error when the file is created (Python 2.5.2), only when I try to read:
>>> t = tempfile.TemporaryFile(mode="wrb")
>>> t.write("test")
>>> t.seek(0)
>>> t.read()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IOError: [Errno 9] Bad file descriptor

'wrb' is not a valid mode; use 'w+b' to open a file for both writing and reading.
This works fine:
import tempfile
import gzip
with tempfile.TemporaryFile(mode='w+b') as f:
    f.write(data.decode('base64'))
    f.flush()
    f.seek(0)
    gzf = gzip.GzipFile(mode='rb', fileobj=f)
    print gzf.read()
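For Python 3, the same idea might look like the sketch below. It assumes the input is base64-encoded gzip data; the helper name gunzip_base64 and the sample payload are made up for illustration and are not part of the question.

```python
import base64
import gzip
import tempfile


def gunzip_base64(data):
    """Decode base64 data, spool it to a temp file, return the decompressed bytes."""
    # "w+b" opens the temp file for both writing and reading in binary mode.
    with tempfile.TemporaryFile(mode="w+b") as tmp:
        tmp.write(base64.b64decode(data))
        # Rewind; otherwise GzipFile would start reading at the end of the file.
        tmp.seek(0)
        with gzip.GzipFile(mode="rb", fileobj=tmp) as gz:
            return gz.read()


# Hypothetical payload standing in for the question's `data`.
payload = base64.b64encode(gzip.compress(b"hello gzip"))
result = gunzip_base64(payload)
print(result)
```

The seek(0) call is the important part: after the write, the file position sits at the end of the data.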

Some tips:
You can't .seek(0) or .read() a gzip file opened in wrb, wb, or w+b mode. The GzipFile class's __init__ decides between READ and WRITE by looking only at the first character of the mode, so wrb puts it in WRITE mode.
When you do f = gzip.GzipFile(mode='rb', fileobj=file), the real file is file, not f. I understood that only after reading the GzipFile class definition.
A working example for me was:
from tempfile import NamedTemporaryFile
import gzip
with NamedTemporaryFile(mode='w+b', delete=True, suffix='.txt.gz', prefix='f') as t_file:
    gzip_file = gzip.GzipFile(mode='wb', fileobj=t_file)
    gzip_file.write('SOMETHING HERE')
    gzip_file.close()
    t_file.seek(0)
    # Do something here with your t_file, maybe send it to an external storage or:
    print t_file.read()
I hope this is useful for someone out there; it took a lot of my time to make it work.
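A Python 3 take on the same write-then-rewind pattern, as a rough sketch only. The helper name gzip_to_tempfile is invented here, and the payload has to be bytes in Python 3.

```python
import gzip
from tempfile import NamedTemporaryFile


def gzip_to_tempfile(payload):
    """Compress `payload` (bytes) into a temp file and return the raw gzip bytes."""
    with NamedTemporaryFile(mode="w+b", suffix=".txt.gz") as t_file:
        # Leaving the inner block closes the GzipFile, which writes the gzip
        # trailer. It does not close t_file, because we passed it in as fileobj.
        with gzip.GzipFile(mode="wb", fileobj=t_file) as gzip_file:
            gzip_file.write(payload)
        # t_file is the real file: rewind it before reading the compressed bytes.
        t_file.seek(0)
        return t_file.read()


raw = gzip_to_tempfile(b"SOMETHING HERE")
restored = gzip.decompress(raw)
print(restored)
```

Reading from t_file returns the compressed bytes, which is what you would upload or store; gzip.decompress is used here only to check the round trip.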

Related

"[WinError 6] The handle is invalid" with Urllib

import urllib.request
def Download(url, file_name):
    urllib.request.urlretrieve(url, file_name)
f = open("links.txt", "r")
lines = f.readlines()
for line in lines:
    for x in range (1, 5):
        filenaame = x
        cut_string = line.split('?$')
        new_string = cut_string[0]
        numerator = new_string.split('/1/')
        separator = ''
        link = (separator.join(numerator[0] + "/{}/".format(x) + numerator[1]))
        file_name = link.split('/{}/'.format(x))
        file_name = file_name[1]
        file_name = file_name.split('.')
        file_name = (separator.join(file_name[0] + "{}".format(filenaame)))
        filenaame =+ 1
        print("Downloading: {}".format(file_name))
        Download(link, filenaame)
Error:
Traceback (most recent call last):
File "C:\python\downloader\rr.py", line 29, in <module>
Download(link, filenaame)
File "C:\python\downloader\rr.py", line 5, in Download
urllib.request.urlretrieve(url, file_name)
File "C:\python\lib\urllib\request.py", line 258, in urlretrieve
tfp = open(filename, 'wb')
OSError: [WinError 6] The handle is invalid
I have googled a lot about this and in every result I found the person was using the subprocess module, which I'm not, which makes it even more difficult.
The code is for downloading images. It does download the first one successfully, then it crashes. Anyone know what's causing the error? I'm still a beginner.
You're passing your counter as the second parameter. In the end, the function uses it as the filename to copy the downloaded data to.
Passing an integer triggers open's "wrap a low-level file descriptor" behavior, as the documentation describes:
file is either a string or bytes object giving the pathname (absolute or relative to the current working directory) of the file to be opened or an integer file descriptor of the file to be wrapped.
But the file descriptor doesn't exist (fortunately!)
Let's reproduce the issue here:
>>> open(4,"rb")
Traceback (most recent call last):
File "<string>", line 301, in runcode
File "<interactive input>", line 1, in <module>
OSError: [Errno 9] Bad file descriptor
The fix is:
Download(link, file_name)
(I'd suggest that you rename your filenaame counter to something more meaningful; that avoids mistakes like this.)
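To see the difference between an integer and a string argument to open, here is a small Python 3 sketch. It assumes that a descriptor number obtained from os.pipe() and closed again is still unused at the moment of the call; the names closed_fd, fd_error and the 1.jpg file name are illustrative only.

```python
import os
import tempfile

# Obtain a descriptor number that is known to be closed.
read_fd, write_fd = os.pipe()
os.close(read_fd)
os.close(write_fd)
closed_fd = read_fd

# An int is interpreted as a file descriptor, not as a file name.
try:
    open(closed_fd, "rb")
    fd_error = None
except OSError as exc:
    fd_error = exc

# A string is interpreted as a path, which is what urlretrieve needs.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, str(1) + ".jpg")
    with open(path, "wb") as fh:
        fh.write(b"image bytes")
    created = os.path.isfile(path)

print(type(fd_error).__name__, created)
```

The exact error text depends on the platform (a bad file descriptor error on POSIX systems, an invalid handle error on Windows), but in both cases it is an OSError.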
At this line,
urllib.request.urlretrieve(url, file_name)
file_name should be a string, so convert the integer filenaame to a string before calling Download(link, filenaame). Try:
Download(link, str(filenaame))
A working sample for downloading an image:
import random
import urllib.request
def download_img(url):
    name = random.randrange(100, 1000)
    full_name = str(name) + ".jpg"
    urllib.request.urlretrieve(url, full_name)
download_img("http://images.freeimages.com/images/small-previews/25d/eagle-1523807.jpg")

TypeError - What does this error mean?

So, I've been writing a program that takes an HTML file, replaces some text, and writes the result to a different file in a different directory.
This error happened:
Traceback (most recent call last):
File "/Users/Glenn/jack/HTML_Task/src/HTML Rewriter.py", line 19, in <module>
with open (os.path.join("/Users/Glenn/jack/HTML_Task/src", out_file)):
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/posixpath.py", line 89, in join
genericpath._check_arg_types('join', a, *p)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/genericpath.py", line 143, in _check_arg_types
(funcname, s.__class__.__name__)) from None
TypeError: join() argument must be str or bytes, not 'TextIOWrapper'
Below is my code. Does anyone have a solution I could implement, or should I kill it with fire?
import re
import os
os.mkdir ("dest")
file = open("2016-06-06_UK_BackToSchool.html").read()
text_filtered = re.sub(r'http://', '/', file)
print (text_filtered)
with open ("2016-06-06_UK_BackToSchool.html", "wt") as out_file:
    print ("testtesttest")
with open (os.path.join("/Users/Glenn/jack/HTML_Task/src", out_file)):
    out_file.write(text_filtered)
os.rename("/Users/Glenn/jack/HTML_Task/src/2016-06-06_UK_BackToSchool.html", "/Users/Glenn/jack/HTML_Task/src/dest/2016-06-06_UK_BackToSchool.html")
with open (os.path.join("/Users/Glenn/jack/HTML_Task/src", out_file)):
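Here out_file is a TextIOWrapper (an open file object), not a string.
os.path.join takes strings as arguments, so pass it the file name, not the file object.
Do not use the names of built-ins as variable names. file is not a keyword, but it was a built-in type in Python 2, so the name is best avoided.
Do not put a space between a function name and its opening parenthesis: write os.mkdir("dest"), not os.mkdir ("dest").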
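Putting those points together, one possible shape for the program is sketched below. The function name rewrite_html, the page.html sample, and the temporary directory are assumptions made for the example; the http:// substitution is the one from the question.

```python
import os
import re
import tempfile


def rewrite_html(src_path, dest_dir):
    """Read src_path, replace 'http://' with '/', write the result into dest_dir."""
    with open(src_path, "r") as src:
        text = src.read()
    text_filtered = re.sub(r"http://", "/", text)
    # exist_ok avoids an error when the destination directory already exists.
    os.makedirs(dest_dir, exist_ok=True)
    # os.path.join receives two strings: a directory and a file name.
    out_path = os.path.join(dest_dir, os.path.basename(src_path))
    with open(out_path, "w") as out_file:
        out_file.write(text_filtered)
    return out_path


# Small demonstration in a throwaway directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "page.html")
    with open(src, "w") as fh:
        fh.write('<a href="http://example.com">x</a>')
    out = rewrite_html(src, os.path.join(tmp, "dest"))
    with open(out) as fh:
        rewritten = fh.read()

print(rewritten)
```

Writing straight into the destination directory also removes the need for the os.rename step.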
Try changing this:
with open ("2016-06-06_UK_BackToSchool.html", "wt") as out_file:
to this:
with open ("2016-06-06_UK_BackToSchool.html", "w") as out_file:
or this:
with open ("2016-06-06_UK_BackToSchool.html", "wb") as out_file:

Use codecs to read file with correct encoding: TypeError

I need to read from a file line by line. I also need to make sure the encoding is handled correctly.
I wrote the following code:
#!/bin/bash
import codecs
filename = "something.x10"
f = open(filename, 'r')
fEncoded = codecs.getreader("ISO-8859-15")(f)
totalLength = 0
for line in fEncoded:
    totalLength+=len(line)
print("Total Length is "+totalLength)
This code does not work on all files; on some files I get:
Traceback (most recent call last):
File "test.py", line 11, in <module>
for line in fEncoded:
File "/usr/lib/python3.2/codecs.py", line 623, in __next__
line = self.readline()
File "/usr/lib/python3.2/codecs.py", line 536, in readline
data = self.read(readsize, firstline=True)
File "/usr/lib/python3.2/codecs.py", line 480, in read
data = self.bytebuffer + newdata
TypeError: can't concat bytes to str
I'm using Python 3.3, and the script must work with this Python version.
What am I doing wrong? I was not able to find out which files work and which do not; even some plain ASCII files fail.
You are opening the file in non-binary (text) mode. If you read from it, you get a string decoded according to your default encoding (http://docs.python.org/3/library/functions.html?highlight=open%20builtin#open).
The codecs StreamReader needs a byte stream (http://docs.python.org/3/library/codecs#codecs.StreamReader).
So this should work:
import codecs
filename = "something.x10"
f = open(filename, 'rb')  # binary mode: the StreamReader does the decoding
f_decoded = codecs.getreader("ISO-8859-15")(f)
total_length = 0
for line in f_decoded:
    total_length += len(line)
print("Total Length is " + str(total_length))  # convert the int before concatenating
or you can use the encoding parameter on open:
f_decoded = open(filename, mode='r', encoding='ISO-8859-15')
The reader returns decoded data, so I fixed your variable name. Also, consider PEP 8 as a guide for formatting and coding style.
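As a rough sketch of the encoding-parameter variant, here is a self-contained version. The function name total_length and the temporary sample file are made up for the example; the encoding is the one from the question.

```python
import os
import tempfile


def total_length(filename, encoding="ISO-8859-15"):
    """Sum the lengths of all decoded lines in the file."""
    # Text mode with an explicit encoding: open() does the decoding itself.
    with open(filename, mode="r", encoding=encoding) as f_decoded:
        return sum(len(line) for line in f_decoded)


# Write a small sample file; byte 0xA4 is the euro sign in ISO-8859-15.
fd, sample_path = tempfile.mkstemp(suffix=".x10")
with os.fdopen(fd, "wb") as fh:
    fh.write(b"abc\n\xa4\n")

try:
    length = total_length(sample_path)
    print("Total Length is " + str(length))
finally:
    os.remove(sample_path)
```

The count is in characters, not bytes, and includes the newline at the end of each line.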

Does fp.readlines() close a file?

In Python, I'm seeing evidence that fp.readlines() closes the file: I get nothing when I try to read from fp later in the program. Can you confirm this behavior? Do I need to re-open the file if I want to read from it again?
Is the file closed? is similar, but didn't answer all of my questions.
import sys
def lines(fp):
    print str(len(fp.readlines()))
def main():
    sent_file = open(sys.argv[1], "r")
    lines(sent_file)
    for line in sent_file:
        print line
This returns:
20
Once you have read a file, the file pointer has been moved to the end and no more lines will be 'found' beyond that point.
Re-open the file or seek back to the start:
sent_file.seek(0)
Your file is not closed; a closed file raises an exception when you attempt to access it:
>>> fileobj = open('names.txt')
>>> fileobj.close()
>>> fileobj.read()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file
It doesn't close the file, but it does read the lines in it so they cannot be read again without reopening the file or setting the file pointer back to the beginning with fp.seek(0).
As evidence that it doesn't close the file, try changing the function to actually close the file:
def lines(fp):
    print str(len(fp.readlines()))
    fp.close()
You will get the error:
Traceback (most recent call last):
File "test5.py", line 16, in <module>
main()
File "test5.py", line 12, in main
for line in sent_file:
ValueError: I/O operation on closed file
It won't be closed, but the file will be at the end. If you want to read its contents a second time then consider using
f.seek(0)
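Here is a short Python 3 sketch of that behavior, using a temporary file in place of the file from sys.argv[1]. The helper name count_then_reread and the sample contents are invented for the illustration.

```python
import tempfile


def count_then_reread(fp):
    """Count lines with readlines(), then rewind and read the lines again."""
    count = len(fp.readlines())
    still_open = not fp.closed   # readlines() does not close the file
    leftover = fp.read()         # the position is at end of file, so nothing is left
    fp.seek(0)                   # rewind to the start
    lines = [line.rstrip("\n") for line in fp]
    return count, still_open, leftover, lines


with tempfile.TemporaryFile(mode="w+") as sent_file:
    sent_file.write("one\ntwo\nthree\n")
    sent_file.seek(0)
    count, still_open, leftover, lines = count_then_reread(sent_file)

print(count, still_open, repr(leftover), lines)
```

Without the seek(0) call, the final loop would produce no lines, which matches what the question observed.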
You may want to use the with statement and a context manager:
>>> with open('data.txt', 'w+') as my_file:  # This will always ensure
...     my_file.write('TEST\n')               # that the file is closed.
...     my_file.seek(0)
...     my_file.read()
...
'TEST\n'
If you use a normal call, remember to close the file manually (in theory Python closes file objects when they are garbage collected):
>>> my_file = open('data.txt', 'w+')
>>> my_file.write('TEST\n')  # 'del my_file' should close it and garbage collect it
>>> my_file.seek(0)
>>> my_file.read()
'TEST\n'
>>> my_file.close()  # Makes sure buffers are flushed to disk

Python, unpacking a .jar file, doesn't work

So, I'm trying to unzip a .jar file using the code below.
It only unzips about 20 of the 500 files, and no folders or pictures.
The same thing happens when I use a .zip file as the filename.
Any suggestions?
import zipfile
zfilename = "PhotoVieuwer.jar"
if zipfile.is_zipfile(zfilename):
    print "%s is a valid zip file" % zfilename
else:
    print "%s is not a valid zip file" % zfilename
print '-'*40
zfile = zipfile.ZipFile( zfilename, "r" )
zfile.printdir()
print '-'*40
for info in zfile.infolist():
    fname = info.filename
    data = zfile.read(fname)
    if fname.endswith(".txt"):
        print "These are the contents of %s:" % fname
        print data
    filename = fname
    fout = open(filename, "w")
    fout.write(data)
    fout.close()
    print "New file created --> %s" % filename
    print '-'*40
But it doesn't work; it unzips maybe 10 out of 500 files.
Can anyone help me fix this?
Thanks in advance!
I tried adding the full output Python gave me, but got this:
Oops! Your edit couldn't be submitted because:
body is limited to 30000 characters; you entered 153562
So here is only the error:
Traceback (most recent call last):
File "C:\Python27\uc\TeStINGGFDSqAEZ.py", line 26, in <module>
fout = open(filename, "w")
IOError: [Errno 2] No such file or directory: 'net/minecraft/client/ClientBrandRetriever.class'
The files that get unzipped:
amw.Class
amx.Class
amz.Class
ana.Class
ane.Class
anf.Class
ang.Class
ank.Class
anm.Class
ann.Class
ano.Class
anq.Class
anr.Class
anx.Class
any.Class
anz.Class
aob.Class
aoc.Class
aod.Class
aoe.Class
This traceback tells you what you need to know:
Traceback (most recent call last):
File "C:\Python27\uc\TeStINGGFDSqAEZ.py", line 26, in <module>
fout = open(filename, "w")
IOError: [Errno 2] No such file or directory: 'net/minecraft/client/ClientBrandRetriever.class'
The error message says that either the file ClientBrandRetriever.class doesn't exist or the directory net/minecraft/client does not exist. When a file is opened for writing Python creates it, so it can't be a problem that the file does not exist. It must be the case that the directory does not exist.
Consider that this works:
>>> open('temp.txt', 'w')
<open file 'temp.txt', mode 'w' at 0x015FF0D0>
but this doesn't, giving nearly identical traceback to the one you are getting:
>>> open('bogus/temp.txt', 'w')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IOError: [Errno 2] No such file or directory: 'bogus/temp.txt'
Creating the directory fixes it:
>>> os.makedirs('bogus')
>>> open('bogus/temp.txt', 'w')
<open file 'bogus/temp.txt', mode 'w' at 0x01625D30>
Just prior to opening the file you should check if the directory exists and create it if necessary.
So to solve your problem, replace this
fout = open(filename, 'w')
with this (it needs import os at the top of the script)
head, tail = os.path.split(filename)  # isolate directory name
if head and not os.path.exists(head):  # top-level files have no directory part; otherwise see if it exists
    os.makedirs(head)  # if not, create it
fout = open(filename, 'w')
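The directory-creation fix can also be sketched as a complete Python 3 function. The name extract_members, the destination directory, and the tiny sample archive are assumptions for the example. The sketch writes in binary mode ("wb"), since .class files and pictures are binary data, and it trusts the member names in the archive as they are.

```python
import os
import tempfile
import zipfile


def extract_members(zfilename, dest):
    """Write every file in the archive under dest, creating directories as needed."""
    written = []
    with zipfile.ZipFile(zfilename, "r") as zfile:
        for info in zfile.infolist():
            if info.is_dir():
                # Directory entries carry no data; they are created on demand below.
                continue
            # Archive names always use "/" as the separator.
            target = os.path.join(dest, *info.filename.split("/"))
            head = os.path.dirname(target)
            if head:
                os.makedirs(head, exist_ok=True)
            with open(target, "wb") as fout:
                fout.write(zfile.read(info))
            written.append(info.filename)
    return written


# Build a tiny sample archive with one top-level and one nested member.
with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "sample.jar")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("amw.class", b"\xca\xfe")
        zf.writestr("net/minecraft/client/ClientBrandRetriever.class", b"\xba\xbe")
    out_dir = os.path.join(tmp, "out")
    written = extract_members(archive, out_dir)
    nested_exists = os.path.isfile(
        os.path.join(out_dir, "net", "minecraft", "client", "ClientBrandRetriever.class")
    )

print(written, nested_exists)
```

If you only need a plain extraction, ZipFile.extractall() does the same job with less code.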
If python -m zipfile -e PhotoVieuwer.jar dest works, then you could use:
import zipfile
with zipfile.ZipFile("PhotoVieuwer.jar") as z:
    z.extractall()
