SFTP error with Python/Paramiko writing a file - python

I'm trying to write a small Python script that will get query results from a database, write them to a file, and then sftp the file to a different server. The pieces work just fine but I'm getting a weird error when trying to sftp the file immediately after it's written.
The error I'm getting is
File "/usr/lib/python2.4/site-packages/paramiko/sftp_client.py", line 558, in put
file_size = os.stat(localpath).st_size
TypeError: coercing to Unicode: need string or buffer, file found
The offending line of code is just
sftp.put(outputfile, sftpoutputfile)
I tried using a copy of the output file instead of the one that's being written in the script and that worked exactly as it's supposed to. I'm calling file.close() after the file is written (and before setting up the sftp) so it seems like the file should be, well, closed and usable after that. Can someone tell me what I'm doing wrong? I can post more of the code if that would be helpful. Thank you very much.

The error message is telling you that it (in this case, os.stat) wants a stringlike object, and you're giving it the file instead.
Looking at the source of sftp_client.py in my copy of paramiko, we see
def put(self, localpath, remotepath, callback=None, confirm=True):
    [...]
    file_size = os.stat(localpath).st_size
    fl = file(localpath, 'rb')
    try:
        fr = self.file(remotepath, 'wb')
        fr.set_pipelined(True)
so I'm pretty sure that it wants the filename, not the file itself.
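A minimal sketch of the difference, assuming the variable name outputfile from the question and a hypothetical temporary path. It calls os.stat directly, since that is the call that fails inside put; the paramiko call itself appears only as a comment because it needs a live connection.

```python
import os
import tempfile

# Hypothetical stand-in for the query-results file written by the script.
output_path = os.path.join(tempfile.mkdtemp(), "results.txt")

outputfile = open(output_path, "w")
outputfile.write("row1\nrow2\n")
outputfile.close()

# Passing the file object reproduces the TypeError from the traceback,
# even though the file has been closed.
try:
    os.stat(outputfile)
    stat_failed = False
except TypeError:
    stat_failed = True

# Passing the path (a string) works; outputfile.name holds that string.
file_size = os.stat(outputfile.name).st_size

# With a connected paramiko SFTPClient, the call would then be:
# sftp.put(outputfile.name, sftpoutputfile)
```

Closing the file was never the issue: put needs the local path, and opens the file itself.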

Related

PyPDF2 PdfReadError: Could not read Boolean object

I am getting the following error, when reading certain PDF files using PyPDF2. Due to the confidential nature of these documents, I can't share them, but I can try and provide information which can help solve this problem.
Stacktrace -
inputpdf = PdfFileReader(open(pdfpath, "rb"), strict=False)
File "/home/tata/.virtualenvs/obu/local/lib/python2.7/site-packages/PyPDF2/pdf.py", line 1084, in __init__
self.read(stream)
File "/home/tata/.virtualenvs/obu/local/lib/python2.7/site-packages/PyPDF2/pdf.py", line 1732, in read
num = readObject(stream, self)
File "/home/tata/.virtualenvs/obu/local/lib/python2.7/site-packages/PyPDF2/generic.py", line 74, in readObject
return BooleanObject.readFromStream(stream)
File "/home/tata/.virtualenvs/obu/local/lib/python2.7/site-packages/PyPDF2/generic.py", line 137, in readFromStream
raise utils.PdfReadError('Could not read Boolean object')
PdfReadError: Could not read Boolean object
The exception seems to be raised from the following function, in generic.py:
def readFromStream(stream):
    word = stream.read(4)
    if word == b_("true"):
        return BooleanObject(True)
    elif word == b_("fals"):
        stream.read(1)
        return BooleanObject(False)
    else:
        raise utils.PdfReadError('Could not read Boolean object')
Printing the variable word prints the string trai, but I am not sure what this string represents.
Since the PyPDF2 project seems unmaintained, can someone help me figure out a solution for this?
Note: these PDFs are not password protected.
It seems that all of these PDFs are encrypted in some way. Following the solution cited in issue #53 in PyPDF2's GitHub repository, I used the following command to generate a decrypted copy of the original PDF:
qpdf --password= --decrypt input.pdf output.pdf
Reading output.pdf then worked for me. I am not sure how to determine beforehand whether a PDF is encrypted (or in this particular state), but this works around the problem for now.
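If the decryption step has to happen inside the script, the qpdf command can be run through subprocess. This is only a sketch, assuming qpdf is installed and on the PATH; the helper names build_qpdf_decrypt_command and decrypt_pdf are made up for illustration.

```python
import subprocess


def build_qpdf_decrypt_command(input_pdf, output_pdf, password=""):
    """Return the argument list for the qpdf call shown above."""
    return ["qpdf", "--password=" + password, "--decrypt", input_pdf, output_pdf]


def decrypt_pdf(input_pdf, output_pdf, password=""):
    """Run qpdf; raises CalledProcessError on a non-zero exit status."""
    subprocess.check_call(build_qpdf_decrypt_command(input_pdf, output_pdf, password))
    return output_pdf


# Building the command does not require qpdf to be present.
command = build_qpdf_decrypt_command("input.pdf", "output.pdf")
```

The path returned by decrypt_pdf can then be opened and passed to PdfFileReader as in the question.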

Python - Open file in notepad that is contained in a variable

I couldn't find this anywhere, so sorry if I missed it. It seems like it should be simple but somehow isn't. I have a simple program that opens a log (log1.lg let's say) and strips any lines that don't contain keywords. It then tosses them into a 2nd file that is renamed to Log1.lg.clean.
The way I've implemented this is by using os.rename so the code looks like this:
#define source and key words
source_log = 'Log1.lg'
bad_words = ['word', 'bad']
#clean up the log
with open(source_log) as orig_log, open('cleanlog.lg', 'w') as cleanlog:
    for line in orig_log:
        if not any(bad_word in line for bad_word in bad_words):
            cleanlog.write(line)
#rename file and open in Notepad
rename = orig_log + '.clean'
new_log = os.rename("cleanlog.lg", rename)
prog = "notepad.exe"
subprocess.Popen(prog, new_log)
Error I'm getting is this:
File "C:\Users\me\Downloads\PythonStuff\stripMmax.py", line 23, in cleanLog
subprocess.Popen(prog, new_log)
File "C:\Python27\lib\subprocess.py", line 339, in __init__
raise TypeError("bufsize must be an integer")
TypeError: bufsize must be an integer
I'm using Python 2.7 if that's relevant. I don't get why this isn't working or why it's requiring a bufsize. I've seen other examples where this works this way so I'm thinking maybe this command doesn't work in 2.7 the way I'm typing it?
The documentation shows how to use this properly using the actual file name in quotes, but as you can see, mine here is contained in a variable which seems to cause issues. Thanks in advance!
Look at the signature of the subprocess.Popen constructor: its second positional argument is bufsize, which explains your error. Also note that os.rename does not return anything, so new_log will be None; use your rename variable instead. Pass the program and its argument together as a single list:
subprocess.Popen([prog, rename])
You likely also want to wait on the created Popen object:
proc = subprocess.Popen([prog, rename])
proc.wait()
Or something like that.
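Putting the pieces together, a self-contained sketch could look like this. The helper names clean_log and open_in_viewer are hypothetical, the cleaned file is written straight to its final name instead of being renamed, and the demo uses the Python interpreter as a stand-in for notepad.exe so that it runs on any platform.

```python
import os
import subprocess
import sys
import tempfile


def clean_log(source_log, bad_words):
    """Copy lines without any bad word to '<source_log>.clean'; return that path."""
    cleaned_path = source_log + ".clean"  # built from the path string, not the file object
    with open(source_log) as orig, open(cleaned_path, "w") as cleaned:
        for line in orig:
            if not any(word in line for word in bad_words):
                cleaned.write(line)
    return cleaned_path


def open_in_viewer(viewer_args, path):
    """Start the viewer with the file as its last argument and wait for it to exit."""
    proc = subprocess.Popen(list(viewer_args) + [path])
    return proc.wait()


# Demo with a throwaway log file.
work_dir = tempfile.mkdtemp()
source_log = os.path.join(work_dir, "Log1.lg")
with open(source_log, "w") as f:
    f.write("keep this\nbad line\nanother word here\nkeep that\n")

cleaned = clean_log(source_log, ["word", "bad"])
# On Windows this would be open_in_viewer(["notepad.exe"], cleaned).
exit_code = open_in_viewer([sys.executable, "-c", "pass"], cleaned)
```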

Local Blast empty xml file python

I am trying to write a little script to automate a local BLAST alignment.
I ran the commands in the terminal and they work perfectly. However, when I try to automate this, I get a message like: Empty XML file.
Do we have to add a "system" waiting time to let the file be written, or did I do something wrong?
The code :
#sequence identifier as key, sequence as value.
for element in dictionnaryOfSequence:
    #I make a little temporary fasta file because the blast command needs a fasta file as input.
    out_fasta = open("tmp.fasta", 'w')
    query = ">" + element + "\n" + str(dictionnary[element])
    out_fasta.write(query) # And I have this file with my sequence correctly filled
    out_fasta.close() # EDIT : It was out of my loop....
    #Now the blast command, which works well in the terminal, I have my tmp.xml file well filled.
    os.system("blastn -db reads.fasta -query tmp.fasta -out tmp.xml -outfmt 5 -max_target_seqs 5000")
    #Parsing of the xml file.
    handle = open("tmp.xml", 'r')
    blast_records = NCBIXML.read(handle)
    print blast_records
I get an error: Your XML file was empty, and the blast_records object doesn't exist.
Did I do something wrong with the handles?
I'll take any advice. Thanks a lot for your ideas and help.
EDIT: Problem solved, sorry for the useless question. I handled the file handle wrongly and did not open the file in the right location. Same thing with the closing.
Sorry.
Try opening the file "tmp.xml" in Internet Explorer. Are all the tags closed?
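The fix described in the EDIT (closing the file inside the loop, before the external command reads it) can be sketched with a with block, which closes and flushes the file on exit. The sequence data is hypothetical, and a small Python child process stands in for blastn so the sketch runs without BLAST installed.

```python
import os
import subprocess
import sys
import tempfile

work_dir = tempfile.mkdtemp()
fasta_path = os.path.join(work_dir, "tmp.fasta")
sequences = {"seq1": "ACGTACGT", "seq2": "TTGACA"}  # hypothetical data

# Stand-in for blastn: a child process that reports how many bytes it can read.
reader = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(str(len(open(sys.argv[1], 'rb').read())))",
]

seen_sizes = {}
for name, seq in sequences.items():
    # The with block closes the file on exit, so everything is on disk
    # before the external program starts.
    with open(fasta_path, "w") as out_fasta:
        out_fasta.write(">" + name + "\n" + seq + "\n")
    output = subprocess.check_output(reader + [fasta_path])
    seen_sizes[name] = int(output.decode())
```

No waiting time is needed: once the file is closed, the child process sees its full contents.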

Large file upload fails

I'm writing a Python module to POST files to a server. I can upload files of up to 500 MB, but when I tried to upload a 1 GB file the upload failed. The same upload does not fail with something like cURL. I got the code after googling how to upload multipart form data using Python; the code can be found here. I just ran that code, and the error I'm getting is this:
Traceback (most recent call last):
File "<pyshell#7>", line 1, in <module>
opener.open("http://127.0.0.1/test_server/upload",params)
File "C:\Python27\lib\urllib2.py", line 392, in open
req = meth(req)
File "C:\Python27\MultipartPostHandler.py", line 35, in http_request
boundary, data = self.multipart_encode(v_vars, v_files)
File "C:\Python27\MultipartPostHandler.py", line 63, in multipart_encode
buffer += '\r\n' + fd.read() + '\r\n'
MemoryError
I'm new to Python and having a hard time grasping it. I also came across another program here; I'll be honest, I don't know how to run it. I tried running it by guessing based on the function name, but that didn't work.
The script in question isn't very smart and builds the POST body in memory.
Thus, to POST a 1GB file, you'll need 1GB of memory just to hold that data, plus the HTTP headers, boundaries, and python and the code itself.
You'd have to rework the script to use mmap instead: first construct the whole body in a temp file, then wrap that file in an mmap.mmap object and pass it to request.add_data.
See Python: HTTP Post a large file with streaming for hints on how to achieve that.
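A rough sketch of the temp-file-plus-mmap idea, leaving out the HTTP request itself. The helper name build_multipart_body, the field name, and the boundary string are made up for illustration, and a real handler would also need to set the Content-Type and Content-Length headers to match.

```python
import mmap
import os
import shutil
import tempfile


def build_multipart_body(field_name, file_path, boundary):
    """Write a single-file multipart body to a temp file without loading the upload into memory."""
    body = tempfile.TemporaryFile()
    header = (
        "--{0}\r\n"
        'Content-Disposition: form-data; name="{1}"; filename="{2}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).format(boundary, field_name, os.path.basename(file_path))
    body.write(header.encode("utf-8"))
    with open(file_path, "rb") as src:
        shutil.copyfileobj(src, body, 64 * 1024)  # copy in 64 KiB chunks
    body.write("\r\n--{0}--\r\n".format(boundary).encode("utf-8"))
    body.flush()
    return body


# Demo with a small throwaway file standing in for the large upload.
with tempfile.NamedTemporaryFile(delete=False) as sample:
    sample.write(b"x" * 200000)

body_file = build_multipart_body("upload", sample.name, "sketchboundary")
mapped = mmap.mmap(body_file.fileno(), 0, access=mmap.ACCESS_READ)
body_size = len(mapped)    # usable for the Content-Length header
body_start = mapped[:16]   # slices read only the requested bytes

mapped.close()
body_file.close()
os.remove(sample.name)
```

The mapped object behaves like a byte string for length and slicing, but the operating system pages the data in from disk on demand instead of holding it all in the Python process.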

Decompress zip file with password fails - bug in Python?

I get a strange error in Python. When I try to extract a password-protected file using the zipfile module, I get an exception when I set "oy" as the password. Everything else seems to work. Is this a bug in the ZipFile module?
import zipfile
zip = zipfile.ZipFile("file.zip", "r")
zip.setpassword("oy".encode('utf-8'))
zip.extractall() #Above password "oy" generates the error here
zip.close()
This is the exception I get:
Traceback (most recent call last):
File "unzip.py", line 4, in <module>
zip.extractall()
File "C:\Program Files\Python32\lib\zipfile.py", line 1002, in extrac
l
self.extract(zipinfo, path, pwd)
File "C:\Program Files\Python32\lib\zipfile.py", line 990, in extract
return self._extract_member(member, path, pwd)
File "C:\Program Files\Python32\lib\zipfile.py", line 1035, in _extra
member
shutil.copyfileobj(source, target)
File "C:\Program Files\Python32\lib\shutil.py", line 65, in copyfileo
buf = fsrc.read(length)
File "C:\Program Files\Python32\lib\zipfile.py", line 581, in read
data = self.read1(n - len(buf))
File "C:\Program Files\Python32\lib\zipfile.py", line 633, in read1
max(n - len_readbuffer, self.MIN_READ_SIZE)
zlib.error: Error -3 while decompressing: invalid block type
If I use UTF-16 as encoding I get this error:
zlib.error: Error -3 while decompressing: invalid distance too far back
EDIT
I have now tested on a virtual Linux machine with following stuff:
Python version: 2.6.5
I created a password protected zip file with zip -e file.zip hello.txt
Now it seems the problem is something else. Now I can extract the zip file even if the password is wrong!
try:
    zip.setpassword("ks") # "ks" is wrong password but it still extracts the zip
    zip.extractall()
except RuntimeException:
    print "wrong!"
Sometimes I can extract the zip file with an incorrect password. The file (inside the zip file) is then extracted but when I try to open it the information seems to be corrupted/decrypted.
If there's a problem with the password, usually you get the following exception:
RuntimeError: ('Bad password for file', <zipfile.ZipInfo object at 0xb76dec2c>)
Since your exception complains about the block type, your .zip archive is most probably corrupted. Have you tried to unpack it with a standalone unzip utility?
Or maybe you used something unusual, like 7zip, to create it, which makes incompatible .zip archives.
You don't provide enough information (OS version? Python version? ZIP archive creator and contents? Are there many files in those archives, or a single file in a single archive? Do all those files give the same errors, or can you unpack some of them?), so here's a quick Q&A section which should help you find and remedy the problem.
Q1. Is this a bug in Python?
A1. Unlikely.
Q2. What might cause this behaviour?
A2. Broken zip files, incompatible zip compressors -- since you don't tell us anything, it's hard to point to the exact cause.
Q3. How to find the cause?
A3. Try to isolate the problem, find the file which gives you an error, try to use zip.testzip() and/or decompress that particular file with different unzip utility, share the results. Only you have access to the problematic files, so nobody can help you unless you try to do something yourself.
Q4. How to fix this?
A4. You cannot. Use different zip extractor, ZipFile won't work.
Try using the testzip() method to check the file's integrity before extracting files.
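As a small illustration of what testzip() reports, the sketch below builds an unencrypted archive in memory, damages one byte of the stored data, and checks both versions. The helper name first_bad_member is made up, and the archive is unencrypted because the zipfile module cannot create password-protected archives; for an encrypted archive, setpassword would have to be called before testzip.

```python
import io
import zipfile

# Build a small archive in memory, stored without compression so the
# file data appears verbatim in the archive bytes.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
    archive.writestr("hello.txt", "hello world")
good_bytes = buffer.getvalue()


def first_bad_member(zip_bytes):
    """Return the name of the first member that fails its CRC check, or None."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes), "r") as archive:
        return archive.testzip()


# Change one byte of the stored data so its CRC no longer matches.
damaged_bytes = good_bytes.replace(b"hello world", b"jello world", 1)

good_result = first_bad_member(good_bytes)
damaged_result = first_bad_member(damaged_bytes)
```

testzip() returns None when every member reads back cleanly and the name of the first bad member otherwise, so it can be used as a check before calling extractall().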
It could possibly be a bug in zipfile, or a bug in your zip implementation. I noticed that your line numbers do not match mine, so I guess you are on a Python 3.2 release earlier than the current 3.2.3 that I have.
Now, as to your code, it does work for me on Python 3.2.3 on Linux. I suggest you update to the latest 3.2.x as there seem to be a number of bug fixes related to zipfile and zlib, including fixes for crashes.
