Fixed-digit base64 encode and decode in Python - python

I'm trying to encode and decode a base64 string. It works fine normally, but if I try to restrict the hash to 6 digits, I get an error on decoding:
from base64 import b64encode
from base64 import b64decode
s="something"
base 64 encode/decode:
# Encode:
hash = b64encode(s)
# Decode:
dehash = b64decode(hash)
print dehash
(works)
6-digit base 64 encode/decode:
# Encode:
hash = b64encode(s)[:6]
# Decode:
dehash = b64decode(hash)
print dehash
TypeError: Incorrect padding
What am I doing wrong?
UPDATE:
Based on Mark's answer, I added padding to the 6-digit hash to make it divisible by 4:
hash += "=="
But now the decode result = "some"
UPDATE 2
Wow that was stupid ..

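Base64 requires padding on the input whenever its length is not a multiple of 4. Every 4 base64 characters are turned into 3 bytes. Your 6-character input does not divide evenly by 4, hence the error. Note that truncating the encoded string also discards data: after padding, the first 6 characters only carry the first 4 bytes of the original, which is why the decode result is "some".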
Wikipedia has a good description of the specifics of Base64.
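As a rough illustration of the point above (written for Python 3, whereas the question uses Python 2 syntax), the following sketch encodes the same string, truncates it to 6 characters, pads it back to a multiple of 4 and decodes it. The variable names are only for illustration.

```python
import base64

# Encode the original 9-byte value; 9 bytes map to exactly 12 characters.
encoded = base64.b64encode(b"something").decode("ascii")

# Keep only the first 6 characters, as in the question.
truncated = encoded[:6]

# Pad back to a multiple of 4 so the decoder accepts the input.
padded = truncated + "=" * (-len(truncated) % 4)

# Only the bytes carried by the surviving characters come back.
decoded = base64.b64decode(padded)
print(encoded, truncated, decoded)
```

The remaining characters of the encoded string are simply gone, so no amount of padding can restore the rest of the original value.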

Related

Base64 decoding and encoding give different results

I have the following two encoded strings:
base64_str1 = 'eyJzZWN0aW9uX29mZnNldCI6MiwiaXRlbXNfb2Zmc2V0IjozNiwidmVyc2lvbiI6MX0%3D'
base64_str2 = 'eyJzZWN0aW9uX29mZnNldCI6MCwiaXRlbXNfb2Zmc2V0IjowLCJ2ZXJzaW9uIjoxfQ%3D%3D'
Using a Base64 online decoder/encoder, the results are as follows (which are the right results):
base64_str1_decoded = '{"section_offset":2,"items_offset":36,"version":1}7'
base64_str2_decoded = '{"section_offset":0,"items_offset":0,"version":1}'
However, when I tried to encode base64_str1_decoded or base64_str2_decoded back to Base64, I'm not able to obtain the initial base64 strings.
For instance, the output of the following code:
base64_str2_decoded = '{"section_offset":0,"items_offset":0,"version":1}'
recoded_str2 = base64.b64encode(bytes(base64_str2_decoded, 'utf-8'))
print(recoded_str2)
# output = b'eyJzZWN0aW9uX29mZnNldCI6MCwiaXRlbXNfb2Zmc2V0IjowLCJ2ZXJzaW9uIjoxfQ=='
# expected_output = eyJzZWN0aW9uX29mZnNldCI6MCwiaXRlbXNfb2Zmc2V0IjowLCJ2ZXJzaW9uIjoxfQ%3D%3D
I tried changing the encoding scheme but can't seem to make it work.
Notice that extra 7 at the end of base64_str1_decoded? That's because your input strings are incorrect. They have escape codes required for URLs. %3D is an escape code for =, which is what should be entered into the online decoder instead. You'll notice the 2nd string in the decoder has an extra ÃÜ on the next line you haven't shown due to using %3D%3D instead of ==. That online decoder is allowing invalid base64 to be decoded.
To correctly decode in Python use urllib.parse.unquote on the string to remove the escaping first:
import base64
import urllib.parse
base64_str1 = 'eyJzZWN0aW9uX29mZnNldCI6MiwiaXRlbXNfb2Zmc2V0IjozNiwidmVyc2lvbiI6MX0%3D'
base64_str2 = 'eyJzZWN0aW9uX29mZnNldCI6MCwiaXRlbXNfb2Zmc2V0IjowLCJ2ZXJzaW9uIjoxfQ%3D%3D'
# Demonstrate Python decoder detects invalid B64 encoding
try:
    print(base64.b64decode(base64_str1))
except Exception as e:
    print('Exception:', e)
try:
    print(base64.b64decode(base64_str2))
except Exception as e:
    print('Exception:', e)
# Decode after unquoting...
base64_str1_decoded = base64.b64decode(urllib.parse.unquote(base64_str1))
base64_str2_decoded = base64.b64decode(urllib.parse.unquote(base64_str2))
print(base64_str1_decoded)
print(base64_str2_decoded)
# See valid B64 encoding.
recoded_str1 = base64.b64encode(base64_str1_decoded)
recoded_str2 = base64.b64encode(base64_str2_decoded)
print(recoded_str1)
print(recoded_str2)
Output:
Exception: Invalid base64-encoded string: number of data characters (69) cannot be 1 more than a multiple of 4
Exception: Incorrect padding
b'{"section_offset":2,"items_offset":36,"version":1}'
b'{"section_offset":0,"items_offset":0,"version":1}'
b'eyJzZWN0aW9uX29mZnNldCI6MiwiaXRlbXNfb2Zmc2V0IjozNiwidmVyc2lvbiI6MX0='
b'eyJzZWN0aW9uX29mZnNldCI6MCwiaXRlbXNfb2Zmc2V0IjowLCJ2ZXJzaW9uIjoxfQ=='
Note that the b'' notation is Python's indication that the object is a byte string as opposed to a Unicode string and is not part of the string itself.
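If the URL-escaped form from the question is what is actually needed as output, one possible approach is to percent-encode the Base64 text after encoding. This is a minimal sketch using urllib.parse.quote; the variable names are illustrative and not from the original post.

```python
import base64
import urllib.parse

payload = b'{"section_offset":0,"items_offset":0,"version":1}'

# Plain Base64 text, ending in "==" padding.
b64_text = base64.b64encode(payload).decode("ascii")

# Percent-encode for use in a URL; "=" becomes "%3D".
url_form = urllib.parse.quote(b64_text)
print(url_form)

# unquote() reverses the escaping and gives back the plain Base64 text.
assert urllib.parse.unquote(url_form) == b64_text
```

By default quote() leaves "/" unescaped; passing safe="" would escape it too, which may or may not match what the producing system does.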

AES Python - Different output than expected

I'm trying to write an AES encryption script in Python for a HEX file, which should then be decrypted on a microcontroller. At the moment I want to encrypt a 16-byte test array (hex), which I have already done successfully on the microcontroller, but Python seems to do something different.
I expected the 'expected' value as the encrypted output, but I get a much larger output, even though the AES block size is 16 bytes, so it should work. When I look at the size of the IV or password after unhexlify, it reports 49, which seems totally wrong. What am I doing wrong here?
from base64 import b64encode, b64decode
from binascii import unhexlify
from Crypto.Cipher import AES
from Crypto.Util.Padding import pad, unpad
# Press the green button in the gutter to run the script.
if __name__ == '__main__':
    iv = "000102030405060708090A0B0C0D0E0F"
    password = "2b7e151628aed2a6abf7158809cf4f3c"
    msg = "6bc1bee22e409f96e93d7e117393172a"
    expected = "7649abac8119b246cee98e9b12e9197d"
    print(f"IV: {iv}")
    print(f"PWD: {password}")
    print(f"MSG: {msg}")
    # Convert Hex String to Binary
    iv = unhexlify(iv)
    password = unhexlify(password)
    # Pad to AES Block Size
    msg = pad(msg.encode(), AES.block_size)
    print(f"IV SIZE: {iv.__sizeof__()}")
    print(f"PSW SIZE: {password.__sizeof__()}")
    print(f"MSG SIZE: {msg.__sizeof__()}")
    # Encipher Text
    cipher = AES.new(password, AES.MODE_CBC, iv)
    cipher_text = cipher.encrypt(msg)
    print(cipher_text)
    # Encode Cipher_text as Base 64 and decode to String
    out = b64encode(cipher_text).decode('utf-8')
    print(f"OUT: {out}")
    # Decipher cipher text
    decipher = AES.new(password, AES.MODE_CBC, iv)
    # UnPad Based on AES Block Size
    plaintext = unpad(decipher.decrypt(b64decode(out)), AES.block_size).decode('utf-8')
    print(f'PT: {plaintext}')
Edit: When I use len(iv) instead of __sizeof__(), it gives the correct length. The remaining problem is that the message length is somehow 48 bytes, although AES.block_size is 16 bytes.
The expected ciphertext is produced when three conditions hold: first, the plaintext is not padded (padding is unnecessary because the plaintext length is already an integer multiple of the block size, 16 bytes for AES); second, the message is hex decoded; and third, the ciphertext is hex encoded.
I.e. you have to replace
msg = pad(msg.encode(), AES.block_size)
with
msg = unhexlify(msg)
and hex encode the ciphertext for the output (to get the expected value; hexlify must also be imported from binascii):
print(hexlify(cipher_text).decode('utf-8'))
Similarly, the decrypted message must not be unpadded, and it must be hex encoded rather than UTF-8 decoded.
I.e. you have to replace
plaintext = unpad(decipher.decrypt(b64decode(out)), AES.block_size).decode('utf-8')
with
plaintext = hexlify(decipher.decrypt(b64decode(out))).decode('utf-8')
Regarding the length, you have already recognized that __sizeof__() is the wrong function and that len() should be used.
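The length confusion can be reproduced without any crypto library. The sketch below only uses binascii and compares the two ways of turning the hex string into bytes; the names as_text and as_bytes are illustrative.

```python
from binascii import unhexlify

msg_hex = "6bc1bee22e409f96e93d7e117393172a"

# Encoding the hex *text* yields one byte per hex digit: 32 bytes.
as_text = msg_hex.encode()

# Hex decoding yields the intended 16 raw bytes.
as_bytes = unhexlify(msg_hex)

print(len(as_text), len(as_bytes))

# __sizeof__() reports the memory footprint of the bytes object,
# including interpreter overhead, so it is larger than len().
print(as_bytes.__sizeof__() > len(as_bytes))
```

Since PKCS#7 padding appends a full extra block to input that is already block-aligned, padding the 32-byte text form gives 48 bytes, which matches the length reported in the question.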

Converting RSA signature to String

I'm creating my RSA Signature like this.
transactionStr = json.dumps(GenesisTransaction())
signature = rsa.sign(transactionStr.encode(), client.privateKey, 'SHA-1')
But I'm unable to convert it to a string so that I can save it.
I have tried decoding it as UTF-8:
signature.decode("utf8")
but I get the error "'utf-8' codec can't decode byte 0xe3 in position 2"
Is there any way I can do this?
An RSA signature looks like this:
b'aL\xe3\xf4\xbeEM\xc4\x9e\n\x9e\xf4M`\xba\x85*\x13\xd52x\xd9\\\xe8F\x1c\x07\x90[/\x9dy\xce\xa9IV\x89\xe0\xcd9\\_3\x1e\xaa\x80\xdea\xd1\xbem/\x8e\x91\xbd\x13\x12o\x8c\xed\xf6\x89\xb5\x0b'
.decode('utf8') is used to decode text encoded in UTF8, not arbitrary bytes. Convert the byte string to a hexadecimal string instead:
>>> sig = b'aL\xe3\xf4\xbeEM\xc4\x9e\n\x9e\xf4M`\xba\x85*\x13\xd52x\xd9\\\xe8F\x1c\x07\x90[/\x9dy\xce\xa9IV\x89\xe0\xcd9\\_3\x1e\xaa\x80\xdea\xd1\xbem/\x8e\x91\xbd\x13\x12o\x8c\xed\xf6\x89\xb5\x0b'
>>> s = sig.hex()
>>> s
'614ce3f4be454dc49e0a9ef44d60ba852a13d53278d95ce8461c07905b2f9d79cea9495689e0cd395c5f331eaa80de61d1be6d2f8e91bd13126f8cedf689b50b'
To convert back, if needed:
>>> b = bytes.fromhex(s)
>>> b
b'aL\xe3\xf4\xbeEM\xc4\x9e\n\x9e\xf4M`\xba\x85*\x13\xd52x\xd9\\\xe8F\x1c\x07\x90[/\x9dy\xce\xa9IV\x89\xe0\xcd9\\_3\x1e\xaa\x80\xdea\xd1\xbem/\x8e\x91\xbd\x13\x12o\x8c\xed\xf6\x89\xb5\x0b'
>>> b==sig
True
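Base64 is another option for storing arbitrary bytes as text, and it produces a shorter string than hex. This is a small sketch using only the first 8 bytes of the sample signature above as stand-in data; it is an alternative to the hex approach, not something the original answer proposes.

```python
import base64

# Stand-in for a signature: the first 8 bytes of the sample above.
sig = bytes.fromhex("614ce3f4be454dc4")

# bytes -> ASCII text that is safe to store in JSON, a database, etc.
text = base64.b64encode(sig).decode("ascii")

# text -> the original bytes again
restored = base64.b64decode(text)

print(text, restored == sig)
```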

Best way to output a binary sequence in Python

I have a binary sequence, for example: 10010111010101. I need to output this sequence to a file and then read it later but I want it to be compressed as much as possible, what is the easiest way to do this?
I have tried to take every 8 bits (byte) in the sequence together and output the byte value and then when I read it later, I cut it bit by bit, is there an easier way? or a module that does this readily?
Compact textual encodings for binary data include Base64 (4 characters per 3 bytes) and Ascii85 (5 characters per 4 bytes).
ASCII85
import base64
import sys
# Length of the binary string in bytes (32 bytes will let you have a 256 digit binary character stream)
# Keep it as low as possible to save space
length = 32
binary_string = input('Enter binary string : ')
integer = int(binary_string, 2)  # parse base-2 digits without eval()
data = integer.to_bytes(length, sys.byteorder, signed=False)
print(base64.a85encode(data).decode('utf-8'))
Base64
import base64
import sys
# Length of the binary string in bytes (32 bytes will let you have a 256 digit binary character stream)
# Keep it as low as possible to save space
length = 32
binary_string = input('Enter binary string : ')
integer = int(binary_string, 2)  # parse base-2 digits without eval()
data = integer.to_bytes(length, sys.byteorder, signed=False)
print(base64.b64encode(data).decode('utf-8'))
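WARNING: sys.byteorder depends on the platform (it is typically little-endian), so you might run into problems when you load the file on a different machine. Passing the same explicit byte order ('big' or 'little') when writing and when reading avoids this.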
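To read the sequence back later, the reader needs the same byte order and some way to recover leading zeros. The following is one possible round-trip sketch, assuming a non-empty bit string; the helper names bits_to_text and text_to_bits and the "bit count, colon, Base64" layout are made up for this illustration.

```python
import base64


def bits_to_text(bits: str) -> str:
    """Pack a string of '0'/'1' characters into '<bit count>:<base64>'."""
    n_bytes = (len(bits) + 7) // 8  # smallest byte count that fits
    data = int(bits, 2).to_bytes(n_bytes, "big")  # explicit byte order
    return f"{len(bits)}:{base64.b64encode(data).decode('ascii')}"


def text_to_bits(text: str) -> str:
    """Inverse of bits_to_text; restores leading zeros from the bit count."""
    n_bits, encoded = text.split(":", 1)
    value = int.from_bytes(base64.b64decode(encoded), "big")
    return format(value, "b").zfill(int(n_bits))


packed = bits_to_text("10010111010101")
print(packed, text_to_bits(packed))
```

Storing the bit count alongside the data is a design choice for this sketch; without it, a sequence such as 0001 would come back as 1.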

How to decode base64 in python3

I have a base64-encoded string, and I can't decode it in Python 3.5:
import base64
code = "YWRtaW46MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA" # Unencrypt is 202cb962ac59075b964b07152d234b70
base64.b64decode(code)
Result:
binascii.Error: Incorrect padding
But a website (base64decode) can decode the same string.
Can anybody tell me why, and how to decode it with Python 3.5?
Thanks
Base64 needs a string whose length is a multiple of 4. If the string is too short, it is padded with one or two = characters.
import base64
code = "YWRtaW46MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA="
base64.b64decode(code)
# b'admin:202cb962ac59075b964b07152d234b70'
According to this answer, you can just add the required padding.
code = "YWRtaW46MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA"
b64_string = code
b64_string += "=" * ((4 - len(b64_string) % 4) % 4)
base64.b64decode(b64_string)  # b'admin:202cb962ac59075b964b07152d234b70'
I tried the other way around. If you know what the unencoded value is:
>>> import base64
>>> unencoded = b'202cb962ac59075b964b07152d234b70'
>>> encoded = base64.b64encode(unencoded)
>>> print(encoded)
b'MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA='
>>> decoded = base64.b64decode(encoded)
>>> print(decoded)
b'202cb962ac59075b964b07152d234b70'
Now you see the correct padding: b'MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA='
It actually seems to just be that code is incorrectly padded (code is incomplete)
import base64
code = "YWRtaW46MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA"
base64.b64decode(code+"=")
returns b'admin:202cb962ac59075b964b07152d234b70'
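The padding fix from the answers above can be wrapped in a small helper. This is a minimal sketch, and the function name b64decode_unpadded is hypothetical; it assumes the only thing wrong with the input is missing trailing = characters.

```python
import base64


def b64decode_unpadded(text: str) -> bytes:
    """Decode Base64 text whose trailing '=' padding may have been stripped."""
    remainder = len(text) % 4
    if remainder == 1:
        # No valid Base64 string has this length; padding cannot repair it.
        raise ValueError("invalid Base64 length")
    # Add 0, 1 or 2 '=' characters to reach a multiple of 4.
    return base64.b64decode(text + "=" * (-len(text) % 4))


code = "YWRtaW46MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA"
print(b64decode_unpadded(code))
```

Input that is already correctly padded passes through unchanged, since no extra characters are added in that case.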
