I have a variable that stores a JSON value, and I want to base64-encode it in Python. But the error 'does not support the buffer interface' is thrown. I know that base64 needs bytes as input, but as I am a newbie in Python, I have no idea how to convert JSON to a base64-encoded string. Is there a straightforward way to do it?
In Python 3.x you need to convert your str object to a bytes object for base64 to be able to encode it. You can do that using the str.encode method:
>>> import json
>>> import base64
>>> d = {"alg": "ES256"}
>>> s = json.dumps(d) # Turns your json dict into a str
>>> print(s)
{"alg": "ES256"}
>>> type(s)
<class 'str'>
>>> base64.b64encode(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.2/base64.py", line 56, in b64encode
    raise TypeError("expected bytes, not %s" % s.__class__.__name__)
TypeError: expected bytes, not str
>>> base64.b64encode(s.encode('utf-8'))
b'eyJhbGciOiAiRVMyNTYifQ=='
If you pass the output of your_str_object.encode('utf-8') to the base64 module, you should be able to encode it fine.
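To tie the steps above together, here is a minimal sketch of the full round trip. The helper names json_to_b64 and b64_to_json are made up for illustration and are not part of the base64 or json modules.

```python
import base64
import json


def json_to_b64(obj):
    """Serialize obj to JSON and return the base64 text (a str, not bytes)."""
    raw = json.dumps(obj).encode("utf-8")         # str -> bytes
    return base64.b64encode(raw).decode("ascii")  # bytes -> base64 bytes -> str


def b64_to_json(text):
    """Reverse of json_to_b64: base64 text -> original Python object."""
    raw = base64.b64decode(text)                  # accepts ASCII str or bytes
    return json.loads(raw.decode("utf-8"))


token = json_to_b64({"alg": "ES256"})
print(token)               # eyJhbGciOiAiRVMyNTYifQ==
print(b64_to_json(token))  # {'alg': 'ES256'}
```

Decoding the result to 'ascii' at the end is safe because base64 output only ever contains ASCII characters.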
Here are two methods that work in Python 3.
Note that encodestring is deprecated (and was removed in Python 3.9); use encodebytes instead.
import json
import base64
with open('test.json') as jsonfile:
    data = json.load(jsonfile)
print(type(data))  # dict
datastr = json.dumps(data)
print(type(datastr))  # str
print(datastr)
encoded = base64.b64encode(datastr.encode('utf-8'))  # method 1
print(encoded)
print(base64.encodebytes(datastr.encode()))  # method 2
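The two calls above are not interchangeable: b64encode returns a single line, while encodebytes wraps the output MIME-style, with a newline after every 76 characters and one at the end. A small sketch with a made-up 100-byte payload, chosen only so that the output is long enough to wrap:

```python
import base64

payload = b"x" * 100  # arbitrary sample data; encodes to 136 base64 characters

one_line = base64.b64encode(payload)
wrapped = base64.encodebytes(payload)

print(b"\n" in one_line)                         # False
print(wrapped.count(b"\n"))                      # 2
print(wrapped.replace(b"\n", b"") == one_line)   # True
print(base64.b64decode(wrapped) == payload)      # True, newlines are ignored
```

If the encoded value is going into a JSON field, a URL, or an HTTP header, the single-line b64encode form is usually the one you want.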
You could encode the string first, as UTF-8 for example, then base64 encode it:
import base64
data = '{"hello": "world"}'
enc = data.encode('utf-8')  # str -> bytes
print(base64.b64encode(enc))  # encodestring no longer exists in Python 3.9+
This also works in 2.7 :)
Here's a function that you can feed a string and it will output a base64 string.
import base64
def b64EncodeString(msg):
    msg_bytes = msg.encode('ascii')  # raises UnicodeEncodeError for non-ASCII input
    base64_bytes = base64.b64encode(msg_bytes)
    return base64_bytes.decode('ascii')
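One caveat with the function above: encoding the input as 'ascii' only works while the JSON text is pure ASCII. json.dumps escapes non-ASCII characters by default, so that is often the case, but not if you pass ensure_ascii=False. A hedged variant that encodes as UTF-8 instead (the name b64_encode_text is made up for this sketch):

```python
import base64
import json


def b64_encode_text(msg, encoding="utf-8"):
    """Encode a str to bytes with the given encoding, then to base64 text."""
    return base64.b64encode(msg.encode(encoding)).decode("ascii")


text = json.dumps({"navbar": "Operações de grupo"}, ensure_ascii=False)

# The 'ascii' codec cannot represent the accented characters.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

encoded = b64_encode_text(text)
decoded = base64.b64decode(encoded).decode("utf-8")

print(ascii_ok)         # False
print(decoded == text)  # True
```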
I'm using this library to download and decode MMS PDUs:
https://github.com/pmarti/python-messaging
The sample code almost works, except that this method:
mms = MMSMessage.from_data(response)
Is throwing an exception:
TypeError: unsupported operand type(s) for &: 'str' and 'int'
Which seems to obviously be some sort of binary formatting problem.
In the sample code, the HTTP response is passed directly into the from_data method. In my case, however, it comes through with HTTP headers on it, so I'm splitting the response on the double CRLF and then passing in just the PDU data:
data = buf.getvalue()
split = data.split("\r\n\r\n");
mms = MMSMessage.from_data(split[1].strip())
This throws an error BUT if I first write the exact same data to a file then use the from_file method it works:
data = buf.getvalue()
split = data.split("\r\n\r\n");
f = open('dump','w+')
f.write(split[1])
f.close()
path = 'dump'
mms = MMSMessage.from_file(path)
I looked in the from_file method, and all it does is load the contents and then pass it into the same method as the from_data method, so the first way should Just Work™.
What I did notice is that the file is opened in binary format, and the content is loaded like this:
data = array.array('B')
with open(filename, 'rb') as f:
    data.fromfile(f, num_bytes)
return self.decode_data(data)
So it seems obvious that somehow what I'm passing into the first function is actually a "string representation of binary data" and what's being read from the file is "actual binary data".
I tried using bytearray like this to "binaryfy" the string:
mms = MMSMessage.from_data(bytearray(split[1].strip(), "utf8"))
but that throws the error:
Traceback (most recent call last):
  File "decodepdu.py", line 41, in <module>
    mms = MMSMessage.from_data(bytearray(split[1].strip(), "utf8"))
UnicodeDecodeError: 'ascii' codec can't decode byte 0x8c in position 0: ordinal not in range(128)
which seems weird because it's using an 'ascii' codec but I specified utf8 encoding.
Anyway at this point I'm in over my head because I'm not really all that familiar with python, so for now I'm just writing the content to a temporary file but I would really rather not.
Any help would be most appreciated!
Okay, thanks to Paul M. in the comments, this works:
import array
data = buf.getvalue()
split = data.split("\r\n\r\n")
pdu = array.array('B')
pdu.fromstring(split[1])  # Python 2; the Python 3 equivalent is frombytes()
mms = MMSMessage.from_data(pdu)
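For anyone on Python 3, the same idea might look like the sketch below. The response bytes are made up to stand in for buf.getvalue(), and MMSMessage is deliberately not called because I have not checked how the library behaves under Python 3. The point is only that an array of unsigned bytes yields integers, which is what the & operator in the traceback needs.

```python
import array

# Made-up HTTP response standing in for buf.getvalue(); the body bytes are arbitrary.
response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/vnd.wap.mms-message\r\n"
    b"\r\n"
    b"\x8c\x84\x98abc"
)

# partition() splits on the first blank line only, so a body that happens to
# contain \r\n\r\n itself is not cut short.
headers, _, body = response.partition(b"\r\n\r\n")

pdu = array.array("B", body)  # unsigned bytes, like the array from_file builds
print(pdu[0] & 0x80)          # 128: elements are ints, so bitwise ops work
```

Note that the body is not passed through strip() here: on binary data that could remove bytes that are part of the PDU.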
The following is the string I want to decompress:

I have tried zlib:
import zlib
decompressed_data = zlib.decompress(data)
I get the following error:
TypeError: a bytes-like object is required, not 'str'
Then I did:
data = bytes(data, "utf-8")
decompressed_data = zlib.decompress(data)
I get an error again:
Error -3 while decompressing data: incorrect header check
You first need to decode the base64, then zlib decompress:
import zlib, base64
decompressed_data = zlib.decompress(base64.b64decode(data))
Looking at your data, it appears to be UTF-8 encoded XML, so we're almost there:
xml = decompressed_data.decode("utf-8")
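Since the original string is not reproduced here, the following sketch builds a stand-in the same way (zlib-compress, then base64-encode a made-up XML snippet) and shows both the failing and the working order of operations:

```python
import base64
import zlib

# Made-up sample; the real payload from the question is not available.
original_xml = '<?xml version="1.0" encoding="UTF-8"?><note>Opini\xe3o</note>'
data = base64.b64encode(zlib.compress(original_xml.encode("utf-8"))).decode("ascii")

# Skipping the base64 step hands zlib ASCII text instead of a zlib stream.
try:
    zlib.decompress(data.encode("utf-8"))
    error = None
except zlib.error as exc:
    error = str(exc)

# Correct order: base64-decode first, then decompress, then decode the text.
xml = zlib.decompress(base64.b64decode(data)).decode("utf-8")

print(error is not None)    # True
print(xml == original_xml)  # True
```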
So I'm trying to create a very simple program that opens a file, reads it, and converts its contents from hex to base64 using Python 3.
I tried this:
file = open("test.txt", "r")
contenu = file.read()
encoded = contenu.decode("hex").encode("base64")
print (encoded)
but I get the error:
AttributeError: 'str' object has no attribute 'decode'
I tried multiple other things but always get the same error.
The content of test.txt is:
4B
If you could explain what I'm doing wrong, that would be awesome.
Thank you
EDIT:
I should get Sw== as output.
This should do the trick. Your code works for Python <= 2.7 but needs updating in later versions.
import base64
with open("test.txt", "r") as file:
    contenu = file.read()
# strip() drops the trailing newline; "raw" avoids shadowing the built-in bytes
raw = bytearray.fromhex(contenu.strip())
encoded = base64.b64encode(raw).decode('ascii')
print(encoded)
You need to convert the hex string from test.txt to a bytes object using bytes.fromhex() before encoding it to base64.
import base64
with open("test.txt", "r") as file:
    content = file.read()
    encoded = base64.b64encode(bytes.fromhex(content))
print(encoded)
You should always use a with statement to open files, so that the file is closed automatically when the block finishes.
In IDLE:
>>> import base64
>>>
>>> with open('test.txt', 'r') as file:
...     content = file.read()
...     encoded = base64.b64encode(bytes.fromhex(content))
...
>>> encoded
b'Sw=='
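The same conversion can be wrapped in a small function so it is easy to test without a file. The name hex_to_base64 is made up for this sketch. The strip() call is there because bytes.fromhex() only started ignoring all ASCII whitespace, including a trailing newline, in Python 3.7.

```python
import base64


def hex_to_base64(hex_text):
    """Convert a string of hex digits to base64 text."""
    raw = bytes.fromhex(hex_text.strip())  # "4B" -> b"K"
    return base64.b64encode(raw).decode("ascii")


print(hex_to_base64("4B\n"))        # Sw==
print(hex_to_base64("48656c6c6f"))  # SGVsbG8=
```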
I'm just trying to load this JSON file (with non-ASCII characters) as a Python dictionary with Unicode encoding, but I keep getting this error:
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 90: ordinal not in range(128)
JSON file content = "tooltip":{
"dxPivotGrid-sortRowBySummary": "Sort\"{0}\"byThisRow",}
import sys
import json
data = []
with open('/Users/myvb/Desktop/Automation/pt-PT.json') as f:
    for line in f:
        data.append(json.loads(line.encode('utf-8','replace')))
You have several problems, as near as I can tell. The first is the file encoding. When you open a file in text mode without specifying an encoding, Python uses the locale's preferred encoding (locale.getpreferredencoding(False)), not a fixed default. Since that may vary (especially on Windows machines), it's a good idea to explicitly use encoding="utf-8" for most JSON files. Judging by your error message, I suspect that the file was opened with an ASCII encoding.
Next, the file is decoded from UTF-8 into Python strings as it is read by the file object. Each line has therefore already been decoded to a str and is ready for json to read. When you do line.encode('utf-8','replace'), you encode the line back into a bytes object, which json.loads (that is, "load string") can't handle before Python 3.6 and which is an unnecessary round trip in any version.
Finally, "tooltip":{ "navbar":"Operações de grupo"} isn't valid JSON on its own, but it does look like one line of a pretty-printed JSON file containing a single JSON object. My guess is that you should read the entire file as one JSON object.
Putting it all together you get:
import json
with open('/Users/myvb/Desktop/Automation/pt-PT.json', encoding="utf-8") as f:
    data = json.load(f)
From its name, it's possible that this file is encoded in a legacy Portuguese code page instead. If UTF-8 fails, an encoding such as "cp1252" (Windows Western European) or "cp860" (DOS Portuguese) may work better.
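To make the effect of the encoding argument concrete, here is a self-contained sketch that writes a small UTF-8 JSON file to a temporary directory and reads it back twice. The file name and contents are made up to resemble the question. Only the explicit "utf-8" read succeeds.

```python
import json
import os
import tempfile

sample = {"tooltip": {"navbar": "Opera\xe7\xf5es de grupo"}}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pt-PT.json")  # hypothetical stand-in file

    # Write pretty-printed JSON with real (unescaped) accented characters.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f, ensure_ascii=False, indent=2)

    # Reading the whole file with the right encoding works.
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

    # Forcing ASCII reproduces the kind of error from the question.
    try:
        with open(path, encoding="ascii") as f:
            json.load(f)
        ascii_error = None
    except UnicodeDecodeError as exc:
        ascii_error = exc

print(data == sample)                # True
print(type(ascii_error).__name__)    # UnicodeDecodeError
```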
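I had the same problem. What worked for me was writing a regular expression and applying it to every line of the JSON file:
import re
# Raw string avoids invalid escape warnings. Note that every character outside
# this set, including accented letters, is replaced by a space.
REGEXP = r"[^A-Za-z0-9':.;\-?!]+"
new_file_line = re.sub(REGEXP, ' ', old_file_line).strip()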
Having a file with content similar to yours I can read the file in one simple shot:
>>> import json
>>> fname = "data.json"
>>> with open(fname) as f:
...     data = json.load(f)
...
>>> data
{'tooltip': {'navbar': 'Operações de grupo'}}
You don't need to read each line. You have two options:
import sys
import json
data = []
with open('/Users/myvb/Desktop/Automation/pt-PT.json') as f:
    data.append(json.load(f))
Or, you can load all lines and pass them to the json module:
import sys
import json
data = []
with open('/Users/myvb/Desktop/Automation/pt-PT.json') as f:
    data.append(json.loads(''.join(f.readlines())))
Obviously, the first suggestion is the best.
I'm having real problems encoding/decoding strings to a specific charset (UTF-8).
My Unicode Object is:
>> u'Valor Econ\xf4mico - Opini\xe3o'
When I call print from python it returns:
>> Valor Econômico - Opinião
When I call .encode("utf-8") from my unicode object to write it to a JSON it returns:
>> 'Valor Econ\xc3\xb4mico - Opini\xc3\xa3o'
What am I doing wrong? What exactly is print() doing that I'm not?
Note: I'm creating this unicode object from a line of a file.
import codecs
with codecs.open(path, 'r') as local_file:
    for line in local_file:
        obj = unicode((line.replace(codecs.BOM_UTF8, '')).replace('\n', ''), 'utf-8')
Valor Econ\xc3\xb4mico - Opini\xc3\xa3o is the repr of the UTF-8 encoded byte string: the interactive shell shows non-ASCII bytes as \x escapes, whereas print() sends the text to the terminal for display. The encoding itself is correct. If you were to write this to a file (open("myfile", "wb").write("Valor Econ\xc3\xb4mico - Opini\xc3\xa3o")), then you'd have a valid UTF-8 file.
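The question is about Python 2, but the same distinction can be sketched in Python 3, where str and bytes are separate types and the difference between the text and its UTF-8 encoding is easier to see:

```python
text = "Valor Econ\xf4mico - Opini\xe3o"  # same code points as the u'...' literal
utf8_bytes = text.encode("utf-8")

print(len(text))         # 25 characters
print(len(utf8_bytes))   # 27 bytes: each accented letter takes two bytes
print(repr(utf8_bytes))  # b'Valor Econ\xc3\xb4mico - Opini\xc3\xa3o'
print(utf8_bytes.decode("utf-8") == text)  # True
```

The \xc3\xb4 and \xc3\xa3 escapes are simply how the two-byte UTF-8 sequences for the accented letters are displayed; nothing has been lost or corrupted.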
To create Unicode strings from a file, you can use automatic decoding in the io module (codecs.open() is being deprecated). With the "utf-8-sig" codec, a leading BOM is removed automatically; plain "utf-8" would leave it at the start of the first line:
import io
with io.open(path, "r", encoding="utf-8-sig") as local_file:
    for line in local_file:
        unicode_obj = line.strip()
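Whether a BOM survives decoding depends on the codec name. A short sketch with made-up in-memory bytes standing in for the file contents:

```python
import codecs

# Made-up file contents: a UTF-8 BOM followed by one line of text.
raw = codecs.BOM_UTF8 + "Valor Econ\xf4mico - Opini\xe3o\n".encode("utf-8")

plain = raw.decode("utf-8").strip()         # BOM is kept as U+FEFF
stripped = raw.decode("utf-8-sig").strip()  # BOM is removed

print(plain.startswith("\ufeff"))     # True
print(stripped.startswith("\ufeff"))  # False
print(len(plain) - len(stripped))     # 1
```

Note that strip() does not help here, since U+FEFF does not count as whitespace for str.strip().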
When it comes to creating a JSON response, use the result from json.dumps(my_object). By default it returns a str in which all non-ASCII characters are escaped as \uXXXX sequences, so the output is pure ASCII.
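A brief sketch of that default and of the ensure_ascii=False alternative, using the string from the question under a made-up "title" key:

```python
import json

title = "Valor Econ\xf4mico - Opini\xe3o"

escaped = json.dumps({"title": title})                      # default: ensure_ascii=True
literal = json.dumps({"title": title}, ensure_ascii=False)  # keeps the accented letters

print(escaped)  # {"title": "Valor Econ\u00f4mico - Opini\u00e3o"}
print(all(ord(c) < 128 for c in escaped))         # True
print(json.loads(escaped) == json.loads(literal)) # True
```

Both forms decode to the same object; the escaped one is just safer to pass through channels that are not guaranteed to be UTF-8 clean.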