Python pickle throws TypeError [duplicate] - python

I'm using python3.3 and I'm having a cryptic error when trying to pickle a simple dictionary.
Here is the code:
import os
import pickle
from pickle import *
os.chdir('c:/Python26/progfiles/')
def storvars(vdict):
    f = open('varstor.txt','w')
    pickle.dump(vdict,f,)
    f.close()
    return
mydict = {'name':'john','gender':'male','age':'45'}
storvars(mydict)
and I get:
Traceback (most recent call last):
  File "C:/Python26/test18.py", line 31, in <module>
    storvars(mydict)
  File "C:/Python26/test18.py", line 14, in storvars
    pickle.dump(vdict,f,)
TypeError: must be str, not bytes

The output file needs to be opened in binary mode:
f = open('varstor.txt','w')
needs to be:
f = open('varstor.txt','wb')
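
As a minimal sketch of that fix (the helper names store_vars and load_vars and the temporary directory are illustrative assumptions, not part of the question), the following writes the dictionary in binary mode, reads it back, and then shows that text mode still fails on Python 3:

```python
import os
import pickle
import tempfile

def store_vars(vdict, path):
    # 'wb' gives pickle a binary stream; the with block closes the file.
    with open(path, 'wb') as f:
        pickle.dump(vdict, f)

def load_vars(path):
    # Reading the pickle back needs binary mode as well.
    with open(path, 'rb') as f:
        return pickle.load(f)

mydict = {'name': 'john', 'gender': 'male', 'age': '45'}
path = os.path.join(tempfile.mkdtemp(), 'varstor.txt')  # hypothetical location

store_vars(mydict, path)
restored = load_vars(path)

# Text mode reproduces the TypeError from the question on Python 3.
text_mode_failed = False
try:
    with open(path, 'w') as f:
        pickle.dump(mydict, f)
except TypeError:
    text_mode_failed = True
```

Using with instead of an explicit f.close() also guarantees the file is closed if pickle.dump() raises.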

I just had the same issue. In Python 3, the binary modes 'wb' and 'rb' must be specified explicitly, whereas in Python 2.x they are not needed. If you followed a tutorial written for Python 2.x, that is probably why you are here.
import pickle
class MyUser(object):
    def __init__(self,name):
        self.name = name
user = MyUser('Peter')
print("Before serialization: ")
print(user.name)
print("------------")
serialized = pickle.dumps(user)
filename = 'serialized.native'
with open(filename,'wb') as file_object:
    file_object.write(serialized)
with open(filename,'rb') as file_object:
    raw_data = file_object.read()
deserialized = pickle.loads(raw_data)
print("Loading from serialized file: ")
user2 = deserialized
print(user2.name)
print("------------")

pickle uses a binary protocol, hence it only accepts files opened in binary mode. As the documentation says in its first sentence, "The pickle module implements binary protocols for serializing and de-serializing".

Related

Import from <class 'bytes'> instead of file

In Python I have .pyd shared library that is encrypted to .epyd and that I read and decrypt with
with open('src_nuitka/src.epyd', 'rb') as f:
    my_pyd_module = decrypt(f.read())
Now I would like to import the module using the <class 'bytes'> object my_pyd_module directly without writing to disk first. How can I do this? Since it is not a Python code string, I cannot use exec. Is there an import hook available for this task? All examples of writing import hooks did this using exec or by instantiating the classes directly as in https://dev.to/dangerontheranger/dependency-injection-with-import-hooks-in-python-3-5hap.
So here is my first try using the ideas of @a_guest and https://dev.to/dangerontheranger/dependency-injection-with-import-hooks-in-python-3-5hap (and no en-/decrypting yet):
import importlib.abc
import importlib.machinery
import sys
class DependencyInjectorFinder(importlib.abc.MetaPathFinder):
    def __init__(self, loader):
        self._loader: DependencyInjectorLoader = loader
    def find_spec(self, fullname, path, target=None):
        if fullname == 'src2':
            return importlib.machinery.ModuleSpec(fullname, self._loader)
class DependencyInjectorLoader(importlib.machinery.ExtensionFileLoader):
    def get_data(self, path):
        with open('src_packaged/src_dist/src.pyd', 'rb') as f:
            module = f.read()
        return module
sys.meta_path.append(DependencyInjectorFinder(DependencyInjectorLoader('src2', 'src2')))
import src2
which results in
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: bad argument type for built-in operation
for the last line.

python io.open() integer required error

I'm getting the following error when attempting to open a new file with today's date.
Traceback (most recent call last):
  File "C:\BenPi\stacking\pi3\red_RTS\iotest.py", line 6, in <module>
    f = io.open('%s',today, 'w')
TypeError: an integer is required
Here is my code
import datetime
import io
import os
today = datetime.date.today().strftime('%m_%d_%Y')
print (today)
f = io.open('%s',today, 'w')
f.write('first line \n')
f.write('second line \n')
f.close()
It is my understanding that this is an issue that arises when someone inadvertently uses os.open() instead of io.open(), which is why I specified the io option. It should be noted that the same error comes up regardless if I import the os module.
I'm using python 3.2.5
Thoughts?
You're not formatting the string correctly: you're using , instead of %:
f = io.open('%s'%today, 'w')
Besides, you can just do:
f = io.open(today, 'w')
In the line
f = io.open('%s',today, 'w')
the first argument must be the file name, but '%s' is passed there instead. today then lands in the mode parameter and 'w' in the buffering parameter, which must be an integer, hence the error.
If you write it like this:
f = io.open(today, 'w')
it just works. Also consider using the "with" statement, so that the stream is closed even if an exception occurs:
with io.open(today, 'w') as f:
    f.write("hello world")
I hope I have been helpful.
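
To make the two answers above concrete, here is a small sketch that first triggers the TypeError from the comma version and then writes the file with a with block. The temporary directory is an assumption added so the example does not clutter the working directory; everything else follows the question's code.

```python
import datetime
import io
import os
import tempfile

today = datetime.date.today().strftime('%m_%d_%Y')
path = os.path.join(tempfile.mkdtemp(), today)  # temp dir is an assumption

# With commas, 'w' is passed as the buffering argument, which must be an int,
# so the call fails before any file is opened.
try:
    io.open('%s', today, 'w')
    comma_error = None
except TypeError as exc:
    comma_error = exc

# Passing the name directly (or '%s' % today) works.
with io.open(path, 'w') as f:
    f.write('first line \n')
    f.write('second line \n')

with io.open(path) as f:
    lines = f.readlines()
```

In Python 3, io.open is the same function as the built-in open, so importing io is optional here.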

Trying to write a cPickle object but get a 'write' attribute type error

When trying to apply some code I found on the internet in iPython, it's coming up with an error:
TypeError Traceback (most recent call last)
<ipython-input-4-36ec95de9a5d> in <module>()
13 all[i] = r.json()
14
---> 15 cPickle.dump(all, outfile)
TypeError: argument must have 'write' attribute
Here's what I have done in order:
outfile = "C:\John\Footy Bants\R COMPLAEX MATHS"
Then, I pasted in the following code:
import requests, cPickle, shutil, time
all = {}
errorout = open("errors.log", "w")
for i in range(600):
    playerurl = "http://fantasy.premierleague.com/web/api/elements/%s/"
    r = requests.get(playerurl % i)
    # skip non-existent players
    if r.status_code != 200: continue
    all[i] = r.json()
cPickle.dump(all, outfile)
Here's the original article to give you an idea of what I'm trying to achieve:
http://billmill.org/fantasypl/
The second argument to cPickle.dump() must be a file object. You passed in a string containing a filename instead.
You need to use the open() function to open a file object for that filename, then pass the file object to cPickle:
with open(outfile, 'wb') as pickle_file:
    cPickle.dump(all, pickle_file)
See the Reading and Writing Files section of the Python tutorial, including why using with when opening a file is a good idea (it'll be closed for you automatically).
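
The difference between a filename and a file object can be sketched as follows. This version assumes Python 3, where cPickle no longer exists and the plain pickle module is used instead; the dictionary contents and the output path are made-up stand-ins for the data fetched in the question.

```python
import os
import pickle
import tempfile

all_players = {1: {'web_name': 'example'}}  # made-up stand-in for the API data
outfile = os.path.join(tempfile.mkdtemp(), 'players.pickle')  # hypothetical path

# Passing the filename string fails: a str has no write() method.
try:
    pickle.dump(all_players, outfile)
    string_error = None
except TypeError as exc:
    string_error = exc

# Passing an open binary file object works.
with open(outfile, 'wb') as pickle_file:
    pickle.dump(all_players, pickle_file)

with open(outfile, 'rb') as pickle_file:
    reloaded = pickle.load(pickle_file)
```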

How to read from a text file compressed with 7z?

I would like to read (in Python 2.7), line by line, from a csv (text) file, which is 7z compressed. I don't want to decompress the entire (large) file, but to stream the lines.
I tried pylzma.decompressobj() unsuccessfully. I get a data error. Note that this code doesn't yet read line by line:
input_filename = r"testing.csv.7z"
with open(input_filename, 'rb') as infile:
    obj = pylzma.decompressobj()
    o = open('decompressed.raw', 'wb')
    obj = pylzma.decompressobj()
    while True:
        tmp = infile.read(1)
        if not tmp: break
        o.write(obj.decompress(tmp))
    o.close()
Output:
o.write(obj.decompress(tmp))
ValueError: data error during decompression
This will allow you to iterate the lines. It's partially derived from some code I found in an answer to another question.
At this point in time (pylzma-0.5.0) the py7zlib module doesn't implement an API that would allow archive members to be read as a stream of bytes or characters — its ArchiveFile class only provides a read() function that decompresses and returns the uncompressed data in a member all at once. Given that, about the best that can be done is return bytes or lines iteratively via a Python generator using that as a buffer.
The following does the latter, but may not help if the problem is the archive member file itself is huge.
The code below should work in Python 3.x as well as 2.7.
import io
import os
import py7zlib
class SevenZFileError(py7zlib.ArchiveError):
    pass
class SevenZFile(object):
    @classmethod
    def is_7zfile(cls, filepath):
        """ Determine if filepath points to a valid 7z archive. """
        is7z = False
        fp = None
        try:
            fp = open(filepath, 'rb')
            archive = py7zlib.Archive7z(fp)
            _ = len(archive.getnames())
            is7z = True
        finally:
            if fp: fp.close()
        return is7z
    def __init__(self, filepath):
        fp = open(filepath, 'rb')
        self.filepath = filepath
        self.archive = py7zlib.Archive7z(fp)
    def __contains__(self, name):
        return name in self.archive.getnames()
    def readlines(self, name, newline=''):
        r""" Iterator of lines from named archive member.
        `newline` controls how line endings are handled.
        It can be None, '', '\n', '\r', and '\r\n' and works the same way as it does
        in StringIO. Note however that the default value is different and is to enable
        universal newlines mode, but line endings are returned untranslated.
        """
        archivefile = self.archive.getmember(name)
        if not archivefile:
            raise SevenZFileError('archive member %r not found in %r' %
                                  (name, self.filepath))
        # Decompress entire member and return its contents iteratively.
        data = archivefile.read().decode()
        for line in io.StringIO(data, newline=newline):
            yield line
if __name__ == '__main__':
    import csv
    if SevenZFile.is_7zfile('testing.csv.7z'):
        sevenZfile = SevenZFile('testing.csv.7z')
        if 'testing.csv' not in sevenZfile:
            print('testing.csv is not a member of testing.csv.7z')
        else:
            reader = csv.reader(sevenZfile.readlines('testing.csv'))
            for row in reader:
                print(', '.join(row))
If you were using Python 3.3+, you might be able to do this using the lzma module which was added to the standard library in that version.
See: lzma Examples
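
One caveat before relying on this: the standard-library lzma module reads and writes .xz and legacy .lzma streams, not .7z archive containers. The sketch below therefore assumes the CSV can be stored as an .xz file instead; the file name testing.csv.xz and the sample rows are made up for illustration.

```python
import csv
import lzma
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'testing.csv.xz')  # hypothetical file

# Create a small compressed CSV so the example is self-contained.
with lzma.open(path, 'wt', newline='') as f:
    csv.writer(f).writerows([['a', 'b'], ['1', '2']])

# 'rt' mode decompresses incrementally and yields text lines as they are read,
# so the whole file is never held in memory at once.
rows = []
with lzma.open(path, 'rt', newline='') as f:
    for row in csv.reader(f):
        rows.append(row)
```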
If you can use Python 3, there is a useful library, py7zr, which supports partial 7zip decompression as below:
import py7zr
import re
filter_pattern = re.compile(r'<your/target/file_and_directories/regex/expression>')
with py7zr.SevenZipFile('archive.7z', 'r') as archive:
    allfiles = archive.getnames()
    # keep only the member names that match the pattern
    selective_files = [f for f in allfiles if filter_pattern.match(f)]
    archive.extract(targets=selective_files)

Python 3 how to generate md5 hash from file on stdin?

I am trying to calculate an md5 hash of a file from stdin using Python 3
Here is the error message returned. I can't see why it doesn't return the md5 hash. Any help appreciated.
$./pymd5.py < tmp.pdf
Traceback (most recent call last):
  File "./pymd5.py", line 29, in <module>
    main()
  File "./pymd5.py", line 25, in main
    print(m.hexdigest())
TypeError: 'str' does not support the buffer interface
$
The code:
#!/usr/local/bin/python3.2
import sys
import hashlib
BUFSIZE = 4096
def make_streams_binary():
    sys.stdin = sys.stdin.detach()
    sys.stdout = sys.stdout.detach()
def main():
    make_streams_binary()
    m = hashlib.md5()
    while True:
        data = sys.stdin.read(BUFSIZE)
        if not data:
            break
        m.update(data)
    print(m.hexdigest())
if __name__ == "__main__":
    main()
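When you do
sys.stdout = sys.stdout.detach()
you remove the ability to print normally on Python 3, because sys.stdout becomes the underlying binary buffer instead of the text wrapper that handles encoding and decoding, and print() writes str. Restoring it before printing with
sys.stdout = sys.__stdout__
does not help, because sys.__stdout__ is the same wrapper object that was just detached and is no longer usable. The simpler fix is to not detach stdout at all: only stdin needs to be binary here.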
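
An alternative that avoids detach() entirely is to read bytes from sys.stdin.buffer and leave sys.stdout untouched, so print() keeps working. The following is a sketch under that assumption; the function name md5_of_stream is hypothetical, and an in-memory stream stands in for stdin so the example runs on its own.

```python
import hashlib
import io
import sys

BUFSIZE = 4096

def md5_of_stream(stream, bufsize=BUFSIZE):
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    m = hashlib.md5()
    while True:
        data = stream.read(bufsize)
        if not data:
            break
        m.update(data)
    return m.hexdigest()

# In a script run as ./pymd5.py < tmp.pdf one would call:
#     print(md5_of_stream(sys.stdin.buffer))
# Here an in-memory stream stands in for stdin.
digest = md5_of_stream(io.BytesIO(b'hello'))
```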
