Python adds extra to crypt result - python

I'm trying to create an API that uses a token to communicate between a Raspberry Pi and a web server. Right now I'm trying to generate the token with Python.
from Crypto.Cipher import AES
import base64
import os
import time
import datetime
import requests
BLOCK_SIZE = 32
BLOCK_SZ = 14
#!/usr/bin/python
salt = "123456789123" # Make sure the salt is always the same length! (12 chars)
iv = "1234567891234567" # Make sure the IV is always the same length! (16 chars)
currentDate = time.strftime("%d%m%Y")
currentTime = time.strftime("%H%M")
PADDING = '{'
pad = lambda s: s + (BLOCK_SIZE - len(s) % BLOCK_SIZE) * PADDING
EncodeAES = lambda c, s: base64.b64encode(c.encrypt(pad(s)))
DecodeAES = lambda c, e: c.decrypt(base64.b64decode(e)).rstrip(PADDING)
secret = salt + currentTime
cipher=AES.new(key=secret,mode=AES.MODE_CBC,IV=iv)
encode = currentDate
encoded = EncodeAES(cipher, encode)
print (encoded)
The problem is that the script adds an extra b' to the start of every encoded string, and a ' to the end:
C:\Python36-32>python.exe encrypt.py
b'Qge6lbC+SulFgTk/7TZ0TKHUP0SFS8G+nd5un4iv9iI='
C:\Python36-32>python.exe encrypt.py
b'DTcotcaU98QkRxCzRR01hh4yqqyC92u4oAuf0bSrQZQ='
Hopefully someone can explain what went wrong.
FIXED!
I was able to fix it by decoding the result as UTF-8:
sendtoken = encoded.decode('utf-8')

You are running Python 3.6, where str is Unicode text and bytes is a separate type. base64.b64encode(), which EncodeAES() calls last, returns bytes, and print() shows a bytes object as a literal: a b prefix followed by the quoted contents. The b and the quotes are not part of the data.
You could strip the b out of the output post-Python, or you could print(str(encoded)), which should give you the same characters, since ASCII is valid UTF-8.
EDIT:
What you need to do is decode the bytestring as UTF-8, as mentioned in the answer and in a comment above. I was wrong about str() doing the conversion for you; you need to call decode('UTF-8') on the bytestring you wish to print. That converts the bytes into a str (Unicode text), which then prints without the b'' wrapper.
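A minimal sketch of the difference between str() and decode(), using only the standard library. The sample value is made up for illustration and is not real AES output:

```python
import base64

# b64encode always returns bytes in Python 3, whatever produced its input.
encoded = base64.b64encode(b"example ciphertext")

as_repr = str(encoded)             # str() of bytes keeps the b'...' wrapper
as_text = encoded.decode("utf-8")  # decode() yields the plain text

print(as_repr)
print(as_text)
```

Base64 output is pure ASCII, so decode("ascii") would work just as well here.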

Related

Base64 decoding and encoding give different results

I have the following two encoded strings:
base64_str1 = 'eyJzZWN0aW9uX29mZnNldCI6MiwiaXRlbXNfb2Zmc2V0IjozNiwidmVyc2lvbiI6MX0%3D'
base64_str2 = 'eyJzZWN0aW9uX29mZnNldCI6MCwiaXRlbXNfb2Zmc2V0IjowLCJ2ZXJzaW9uIjoxfQ%3D%3D'
Using an online Base64 decoder/encoder, the results are as follows (which are the right results):
base64_str1_decoded = '{"section_offset":2,"items_offset":36,"version":1}7'
base64_str2_decoded = '{"section_offset":0,"items_offset":0,"version":1}'
However, when I tried to encode base64_str1_decoded or base64_str2_decoded back to Base64, I'm not able to obtain the initial base64 strings.
For instance, the output of the following code:
base64_str2_decoded = '{"section_offset":0,"items_offset":0,"version":1}'
recoded_str2 = base64.b64encode(bytes(base64_str2_decoded, 'utf-8'))
print(recoded_str2)
# output = b'eyJzZWN0aW9uX29mZnNldCI6MCwiaXRlbXNfb2Zmc2V0IjowLCJ2ZXJzaW9uIjoxfQ=='
# expected_output = eyJzZWN0aW9uX29mZnNldCI6MCwiaXRlbXNfb2Zmc2V0IjowLCJ2ZXJzaW9uIjoxfQ%3D%3D
I tried changing the encoding scheme but can't seem to make it work.
Notice that extra 7 at the end of base64_str1_decoded? That's because your input strings are incorrect. They have escape codes required for URLs. %3D is an escape code for =, which is what should be entered into the online decoder instead. You'll notice the 2nd string in the decoder has an extra ÃÜ on the next line you haven't shown due to using %3D%3D instead of ==. That online decoder is allowing invalid base64 to be decoded.
To correctly decode in Python use urllib.parse.unquote on the string to remove the escaping first:
import base64
import urllib.parse
base64_str1 = 'eyJzZWN0aW9uX29mZnNldCI6MiwiaXRlbXNfb2Zmc2V0IjozNiwidmVyc2lvbiI6MX0%3D'
base64_str2 = 'eyJzZWN0aW9uX29mZnNldCI6MCwiaXRlbXNfb2Zmc2V0IjowLCJ2ZXJzaW9uIjoxfQ%3D%3D'
# Demonstrate Python decoder detects invalid B64 encoding
try:
    print(base64.b64decode(base64_str1))
except Exception as e:
    print('Exception:', e)
try:
    print(base64.b64decode(base64_str2))
except Exception as e:
    print('Exception:', e)
# Decode after unquoting...
base64_str1_decoded = base64.b64decode(urllib.parse.unquote(base64_str1))
base64_str2_decoded = base64.b64decode(urllib.parse.unquote(base64_str2))
print(base64_str1_decoded)
print(base64_str2_decoded)
# See valid B64 encoding.
recoded_str1 = base64.b64encode(base64_str1_decoded)
recoded_str2 = base64.b64encode(base64_str2_decoded)
print(recoded_str1)
print(recoded_str2)
Output:
Exception: Invalid base64-encoded string: number of data characters (69) cannot be 1 more than a multiple of 4
Exception: Incorrect padding
b'{"section_offset":2,"items_offset":36,"version":1}'
b'{"section_offset":0,"items_offset":0,"version":1}'
b'eyJzZWN0aW9uX29mZnNldCI6MiwiaXRlbXNfb2Zmc2V0IjozNiwidmVyc2lvbiI6MX0='
b'eyJzZWN0aW9uX29mZnNldCI6MCwiaXRlbXNfb2Zmc2V0IjowLCJ2ZXJzaW9uIjoxfQ=='
Note that the b'' notation is Python's indication that the object is a byte string as opposed to a Unicode string and is not part of the string itself.
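To get back the URL-escaped form the question expected (ending in %3D%3D), the Base64 text can be percent-encoded again after encoding. This is a sketch under the assumption that the original tokens were produced by Base64 followed by URL quoting; the variable names are illustrative:

```python
import base64
import urllib.parse

decoded = b'{"section_offset":0,"items_offset":0,"version":1}'

# Step 1: plain Base64 text, ending in '=' padding.
b64_text = base64.b64encode(decoded).decode("ascii")

# Step 2: percent-encode for use in a URL; '=' becomes %3D.
url_token = urllib.parse.quote(b64_text, safe="")

print(b64_text)
print(url_token)

# Undo the two steps in the opposite order.
round_trip = base64.b64decode(urllib.parse.unquote(url_token))
print(round_trip == decoded)
```

Passing safe="" also escapes '/', which quote() would otherwise leave alone; whether that is wanted depends on where the token ends up in the URL.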

How to decode base64 in python3

I have a Base64-encoded string, and I can't decode it in Python 3.5:
import base64
code = "YWRtaW46MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA" # decoded value should contain 202cb962ac59075b964b07152d234b70
base64.b64decode(code)
Result:
binascii.Error: Incorrect padding
But a website (base64decode) can decode the same string.
Can anybody tell me why, and how to decode it with Python 3.5?
Thanks
Base64 needs a string whose length is a multiple of 4. If the string is too short, it is padded with one or two = characters.
import base64
code = "YWRtaW46MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA="
base64.b64decode(code)
# b'admin:202cb962ac59075b964b07152d234b70'
According to this answer, you can just add the required padding.
code = "YWRtaW46MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA"
b64_string = code
b64_string += "=" * ((4 - len(b64_string) % 4) % 4)
base64.b64decode(b64_string) # b'admin:202cb962ac59075b964b07152d234b70'
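The padding arithmetic above can be wrapped in a small helper. This is only a sketch; the function name b64decode_unpadded is made up for illustration and is not part of the base64 module:

```python
import base64

def b64decode_unpadded(text: str) -> bytes:
    """Decode Base64 text whose trailing '=' padding may have been stripped."""
    missing = -len(text) % 4  # characters needed to reach a multiple of 4
    return base64.b64decode(text + "=" * missing)

code = "YWRtaW46MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA"
print(b64decode_unpadded(code))
```

Already-padded input passes through unchanged, because missing is 0 when the length is a multiple of 4. An input that is three characters short is not valid Base64 in the first place, so b64decode should still be expected to reject it.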
I tried the other way around. If you know what the unencrypted value is:
>>> import base64
>>> unencoded = b'202cb962ac59075b964b07152d234b70'
>>> encoded = base64.b64encode(unencoded)
>>> print(encoded)
b'MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA='
>>> decoded = base64.b64decode(encoded)
>>> print(decoded)
b'202cb962ac59075b964b07152d234b70'
Now you see the correct padding: b'MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA='
It actually seems to just be that code is incorrectly padded (code is incomplete)
import base64
code = "YWRtaW46MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA"
base64.b64decode(code+"=")
returns b'admin:202cb962ac59075b964b07152d234b70'

bz2 decompress with Python 3.4 - TypeError: 'str' does not support the buffer interface

There are similar errors but I could not find a solution for bz2.
The following program fails on the decompress:
import bz2
un = 'BZh91AY&SYA\xaf\x82\r\x00\x00\x01\x01\x80\x02\xc0\x02\x00 \x00!\x9ah3M\x07<]\xc9\x14\xe1BA\x06\xbe\x084'
pw = 'BZh91AY&SY\x94$|\x0e\x00\x00\x00\x81\x00\x03$ \x00!\x9ah3M\x13<]\xc9\x14\xe1BBP\x91\xf08'
decoded_un = bz2.decompress(un)
decoded_pw = bz2.decompress(pw)
print(decoded_un)
print(decoded_pw)
I tried using bytes(un, 'UTF-8') but that would not work. I think I did not have this problem in Python 3.3.
EDIT: this was for the Python Challenge. I now have two bits of code which work, thanks to Martijn:
import bz2
un_saved = 'BZh91AY&SYA\xaf\x82\r\x00\x00\x01\x01\x80\x02\xc0\x02\x00 \x00!\x9ah3M\x07<]\xc9\x14\xe1BA\x06\xbe\x084'
pw_saved = 'BZh91AY&SY\x94$|\x0e\x00\x00\x00\x81\x00\x03$ \x00!\x9ah3M\x13<]\xc9\x14\xe1BBP\x91\xf08'
print(bz2.decompress(un_saved.encode('latin1')))
print(bz2.decompress(pw_saved.encode('latin1')))
This one works from the webpage:
# http://www.pythonchallenge.com/pc/def/integrity.html
import urllib.request
import re
import os.path
import bz2
fname = "008.html"
if not os.path.isfile(fname):
    url = 'http://www.pythonchallenge.com/pc/def/integrity.html'
    response = urllib.request.urlopen(url)
    webpage = response.read().decode("utf-8")
    with open(fname, "w") as fh:
        fh.write(webpage)
with open(fname, "r") as fh:
    webpage = fh.read()
re_un = '\\nun: \'(.*)\'\\n'
m = re.search(re_un, webpage)
un = m.group(1)
print(un)
pw_un = '\\npw: \'(.*)\'\\n'
m = re.search(pw_un, webpage)
pw = m.group(1)
print(pw)
unde = un.encode('latin-1').decode('unicode_escape').encode('latin1')
pwde = pw.encode('latin-1').decode('unicode_escape').encode('latin1')
decoded_un = bz2.decompress(unde)
decoded_pw = bz2.decompress(pwde)
print(decoded_un)
print(decoded_pw)
The bz2 library deals with bytes objects, not strings:
un = b'BZh91AY&SYA\xaf\x82\r\x00\x00\x01\x01\x80\x02\xc0\x02\x00 \x00!\x9ah3M\x07<]\xc9\x14\xe1BA\x06\xbe\x084'
pw = b'BZh91AY&SY\x94$|\x0e\x00\x00\x00\x81\x00\x03$ \x00!\x9ah3M\x13<]\xc9\x14\xe1BBP\x91\xf08'
In other words, using bytes() works just fine, as long as you use the correct encoding. UTF-8 is not that encoding; if you have bytes masquerading as string character codepoints, use Latin-1 to encode instead. Latin-1 maps characters one-to-one to bytes:
un = un.encode('latin1')
or
un = bytes(un, 'latin1')
Also see the Python Unicode HOWTO:
Latin-1, also known as ISO-8859-1, is a similar encoding. Unicode code points 0–255 are identical to the Latin-1 values, so converting to this encoding simply requires converting code points to byte values; if a code point larger than 255 is encountered, the string can’t be encoded into Latin-1.
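A small sketch of why the choice of encoding matters here. It compresses its own sample data rather than using the challenge strings, and simulates "bytes stored as code points" by decoding with Latin-1 first:

```python
import bz2

compressed = bz2.compress(b"hello")  # real bytes from bz2

# Simulate bytes masquerading as code points: each byte becomes one character.
as_text = compressed.decode("latin-1")

via_latin1 = as_text.encode("latin-1")  # one byte per character
via_utf8 = as_text.encode("utf-8")      # characters >= 0x80 become two bytes each

print(via_latin1 == compressed)
print(len(via_latin1), len(via_utf8))
print(bz2.decompress(via_latin1))
```

Only the Latin-1 result is guaranteed to match the original compressed bytes; the UTF-8 result differs whenever the data contains a byte of 0x80 or above.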
I'll leave the decoding to you. Have fun with the Python Challenge!
Note that if you loaded these characters as they are from a webpage, they will not be ready-made bytes! You'll have the characters '\', 'x', 8 and 2 rather than a codepoint with hex value 82. You'd need to interpret those sequences as a Python string literal first:
>>> un = r'BZh91AY&SYA\xaf\x82\r\x00\x00\x01\x01\x80\x02\xc0\x02\x00 \x00!\x9ah3M\x07<]\xc9\x14\xe1BA\x06\xbe\x084'
>>> un
'BZh91AY&SYA\\xaf\\x82\\r\\x00\\x00\\x01\\x01\\x80\\x02\\xc0\\x02\\x00 \\x00!\\x9ah3M\\x07<]\\xc9\\x14\\xe1BA\\x06\\xbe\\x084'
>>> un.encode('latin-1').decode('unicode_escape')
'BZh91AY&SYA¯\x82\r\x00\x00\x01\x01\x80\x02À\x02\x00 \x00!\x9ah3M\x07<]É\x14áBA\x06¾\x084'
>>> un.encode('latin-1').decode('unicode_escape').encode('latin1')
b'BZh91AY&SYA\xaf\x82\r\x00\x00\x01\x01\x80\x02\xc0\x02\x00 \x00!\x9ah3M\x07<]\xc9\x14\xe1BA\x06\xbe\x084'
Note the double backslashes in the representation of un. Only the last bytes result is then decompressable!
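The same three-step conversion can be checked end to end on self-made data. In this sketch the escaped text is generated from freshly compressed bytes instead of being scraped from the page, so the sample values are assumptions rather than the challenge data:

```python
import bz2

compressed = bz2.compress(b"hello")

# Build the kind of text a web page would show: literal backslash escapes.
escaped_text = compressed.decode("latin-1").encode("unicode_escape").decode("ascii")
print(escaped_text)

# Undo it: interpret the escapes, then map code points back to bytes.
restored = escaped_text.encode("latin-1").decode("unicode_escape").encode("latin-1")
print(bz2.decompress(restored))

# The same steps on a short literal: four-character escapes become single bytes.
short = r"\xaf\x82".encode("latin-1").decode("unicode_escape").encode("latin-1")
```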

TypeError while using pycrypto library in python 3.3.2

I have just started using PyCrypto package for python.
I am trying out the following code under python 3.3.2:
Code Reference : AES Encryption using python
#!/usr/bin/env python
from Crypto.Cipher import AES
import base64
import os
# the block size for the cipher object; must be 16, 24, or 32 for AES
BLOCK_SIZE = 32
# the character used for padding--with a block cipher such as AES, the value
# you encrypt must be a multiple of BLOCK_SIZE in length. This character is
# used to ensure that your value is always a multiple of BLOCK_SIZE
PADDING = '{'
# one-liner to sufficiently pad the text to be encrypted
pad = lambda s: s + (BLOCK_SIZE - len(s) % BLOCK_SIZE) * PADDING
# one-liners to encrypt/encode and decrypt/decode a string
# encrypt with AES, encode with base64
EncodeAES = lambda c, s: base64.b64encode(c.encrypt(pad(s)))
DecodeAES = lambda c, e: c.decrypt(base64.b64decode(e)).rstrip(PADDING)
# generate a random secret key
secret = os.urandom(BLOCK_SIZE)
# create a cipher object using the random secret
cipher = AES.new(secret)
# encode a string
encoded = EncodeAES(cipher, 'password')
print ('Encrypted string:', encoded)
# decode the encoded string
decoded = DecodeAES(cipher, encoded)
print ('Decrypted string:', decoded)
The error that I run into is :
Traceback (most recent call last):
  File "C:/Users/Hassan Javaid/Documents/Python files/crypto_example.py", line 34, in <module>
    decoded = DecodeAES(cipher, encoded)
  File "C:/Users/Hassan Javaid/Documents/Python files/crypto_example.py", line 21, in <lambda>
    DecodeAES = lambda c, e: c.decrypt(base64.b64decode(e)).rstrip(PADDING)
TypeError: Type str doesn't support the buffer API
Any pointers as to why I am getting this error?
This is because cipher.encrypt(plain_text) in python 3.x returns a byte string.
The example given in the page uses python 2.x in which case cipher.encrypt(plain_text) returned a regular string.
You can verify the same by using the type function:
In python 3.x:
>>> type(cipher.encrypt("ABCDEFGHIJKLMNOP"))
<class 'bytes'>
In python 2.x
>>> type(cipher.encrypt("ABCDEFGHIJKLMNOP"))
<class 'str'>
The error you are getting is because you are calling the rstrip method on a byte string while passing it a regular string (PADDING) as the argument.
Use:
DecodeAES = lambda c, e: c.decrypt(base64.b64decode(e)).decode("UTF-8").rstrip(PADDING)
This will decode the bytestring to regular string before using the rstrip method on it.
Another way to look at it is that the method rstrip accepts as argument a byte string if invoked on a byte string, or a regular string if invoked on a regular string.
Since decrypt of an AES object returns a byte string, PADDING should be defined as a byte string too:
PADDING = b'{'
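The type rule can be seen without any crypto library. This sketch reuses the BLOCK_SIZE and PADDING names from the question but works on bytes throughout; the pad and unpad functions are illustrative stand-ins for the lambdas, and no encryption takes place:

```python
BLOCK_SIZE = 32
PADDING = b"{"

def pad(data: bytes) -> bytes:
    # Append padding bytes until the length is a multiple of BLOCK_SIZE.
    return data + (BLOCK_SIZE - len(data) % BLOCK_SIZE) * PADDING

def unpad(data: bytes) -> bytes:
    # bytes.rstrip needs a bytes argument.
    return data.rstrip(PADDING)

padded = pad(b"password")
print(len(padded))
print(unpad(padded))

try:
    padded.rstrip("{")  # str argument on a bytes object
except TypeError as exc:
    print("TypeError:", exc)
```

Note that stripping a padding character this way also removes any genuine trailing '{' from the plaintext, which is a limitation of the scheme in the question rather than of bytes.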

How to print out 0xfb in python

I've fallen into Unicode hell.
My environment is Unix, Python 2.7.3
LC_CTYPE=zh_TW.UTF-8
LANG=en_US.UTF-8
I'm trying to dump hex-encoded data in a human-readable format. Here is the simplified code:
#! /usr/bin/env python
# encoding:utf-8
import sys
s=u"readable\n" # previous result keep in unicode string
s2="fb is not \xfb" # data read from binary file
s += s2
print s # method 1
print s.encode('utf-8') # method 2
print s.encode('utf-8','ignore') # method 3
print s.decode('iso8859-1') # method 4
# method 1-4 display following error message
#UnicodeDecodeError: 'ascii' codec can't decode byte 0xfb
# in position 0: ordinal not in range(128)
f = open('out.txt','wb')
f.write(s)
I just want to print out the 0xfb.
I should describe this in more detail. The key is 's += s2',
where s holds my previously decoded string
and s2 is the next string, which should be appended to s.
If I modify the code as follows, the error occurs when writing the file instead.
s=u"readable\n"
s2="fb is not \xfb"
s += s2.decode('cp437')
print s
f=open('out.txt','wb')
f.write(s)
# UnicodeEncodeError: 'ascii' codec can't encode character
# u'\u221a' in position 1: ordinal not in range(128)
I want the content of out.txt to be
readable
fb is not \xfb
or
readable
fb is not 0xfb
[Solution]
#! /usr/bin/env python
# encoding:utf-8
import sys
import binascii
def fmtstr(s):
    r = ''
    for c in s:
        if ord(c) > 127:
            # non-ASCII byte: write it as a literal \xHH escape
            r = ''.join([r, "\\x"+binascii.hexlify(c)])
        else:
            r = ''.join([r, c])
    return r
s=u"readable"
s2="fb is not \xfb"
s += fmtstr(s2)
print s
f=open('out.txt','wb')
f.write(s)
I strongly suspect that your code is actually erroring out on the previous line: the s += s2 one. s2 is just a series of bytes, which can't be arbitrarily tacked on to a unicode object (which is instead a series of code points).
If you had intended the '\xfb' to represent U+FB, LATIN SMALL LETTER U WITH CIRCUMFLEX, it would have been better to assign it like this instead:
s2 = u"\u00fb"
But you said that you just want to print out \xHH codes for control characters. If you just want it to be something humans can understand which still makes it apparent that special characters are in a string, then repr may be enough. First, don't have s be a unicode object, because you're treating your strings here as a series of bytes, not a series of code points.
s = s.encode('utf-8')
s += s2
print repr(s)
Finally, if you don't want the extra quotes on the outside that repr adds, for nice pretty printing or whatever, there's not a simple builtin way to do that in Python (that I know of). I've used something like this before:
import re
# control characters (0x00-0x1f), DEL, and all non-ASCII bytes
controlchars_re = re.compile(r'[\x00-\x1f\x7f-\xff]')
def _show_control_chars(match):
    txt = repr(match.group(0))
    return txt[1:-1]
def escape_special_characters(s):
    return controlchars_re.sub(_show_control_chars, s.replace('\\', '\\\\'))
You can pretty easily tweak the controlchars_re regex to define which characters you care about escaping.
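For comparison, here is a rough Python 3 sketch of the same idea. It is not part of the Python 2 answer above; the function name escape_non_ascii is made up, and it assumes the data arrives as bytes:

```python
def escape_non_ascii(data: bytes) -> str:
    """Return text where every byte except newline and printable ASCII is shown as \\xHH."""
    parts = []
    for byte in data:  # iterating over bytes yields ints in Python 3
        if byte == 0x0A or (0x20 <= byte < 0x7F and byte != 0x5C):
            parts.append(chr(byte))         # newline and printable ASCII pass through
        else:
            parts.append("\\x%02x" % byte)  # everything else, backslash included
    return "".join(parts)

text = "readable\n" + escape_non_ascii(b"fb is not \xfb")
print(text)
```

A literal backslash in the input is written as \x5c so that it cannot be mistaken for the start of an escape; the result is pure ASCII and can be written to a file with any ASCII-compatible encoding.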
