How to decode base64 in python3


I have a base64-encoded string that I can't decode in Python 3.5:
import base64
code = "YWRtaW46MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA" # the decoded value should contain 202cb962ac59075b964b07152d234b70
base64.b64decode(code)
Result:
binascii.Error: Incorrect padding
However, an online decoder (base64decode) decodes the same string without complaint.
Can anybody tell me why this fails, and how to decode it with Python 3.5?
Thanks

Base64 needs a string whose length is a multiple of 4. If the encoded string is shorter than that, it is padded with one or two = characters.
import base64
code = "YWRtaW46MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA="
base64.b64decode(code)
# b'admin:202cb962ac59075b964b07152d234b70'

According to this answer, you can just add the required padding.
import base64
code = "YWRtaW46MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA"
b64_string = code
b64_string += "=" * ((4 - len(b64_string) % 4) % 4)
base64.b64decode(b64_string) # b'admin:202cb962ac59075b964b07152d234b70'
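
The padding arithmetic above can be wrapped in a small helper. This is only a minimal sketch: the name b64decode_unpadded is hypothetical (it is not part of the base64 module), and it assumes the only thing wrong with the input is that its trailing = characters were stripped.

```python
import base64


def b64decode_unpadded(data: str) -> bytes:
    """Decode base64 text whose trailing '=' padding may have been stripped.

    Hypothetical helper for illustration, not part of the standard library.
    """
    # -len(data) % 4 is the number of characters missing from the last
    # 4-character group; it is 0 when the length is already a multiple of 4.
    missing = -len(data) % 4
    return base64.b64decode(data + "=" * missing)


code = "YWRtaW46MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA"
print(b64decode_unpadded(code))  # b'admin:202cb962ac59075b964b07152d234b70'
```

Input that is already padded passes through unchanged, because the helper then appends nothing.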

I tried it the other way around. If you know what the unencoded value is:
>>> import base64
>>> unencoded = b'202cb962ac59075b964b07152d234b70'
>>> encoded = base64.b64encode(unencoded)
>>> print(encoded)
b'MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA='
>>> decoded = base64.b64decode(encoded)
>>> print(decoded)
b'202cb962ac59075b964b07152d234b70'
Now you can see the correct padding: b'MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA='

It seems that code is simply missing its padding (the encoded string is incomplete):
import base64
code = "YWRtaW46MjAyY2I5NjJhYzU5MDc1Yjk2NGIwNzE1MmQyMzRiNzA"
base64.b64decode(code+"=")
This returns b'admin:202cb962ac59075b964b07152d234b70'

Related

How to decode a string representation of a bytes object?

I have a string which includes encoded bytes inside it:
str1 = "b'Output file \xeb\xac\xb8\xed\x95\xad\xeb\xb6\x84\xec\x84\x9d.xlsx Created'"
I want to decode it, but I can't since it has become a string. Therefore I want to ask whether there is any way I can convert it into
str2 = b'Output file \xeb\xac\xb8\xed\x95\xad\xeb\xb6\x84\xec\x84\x9d.xlsx Created'
Here str2 is a bytes object which I can decode easily using
str2.decode('utf-8')
to get the final result:
'Output file 문항분석.xlsx Created'

You could use ast.literal_eval:
>>> print(str1)
b'Output file \xeb\xac\xb8\xed\x95\xad\xeb\xb6\x84\xec\x84\x9d.xlsx Created'
>>> type(str1)
<class 'str'>
>>> from ast import literal_eval
>>> literal_eval(str1).decode('utf-8')
'Output file 문항분석.xlsx Created'
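
ast.literal_eval only helps when the string really contains the backslash escapes as text (a backslash, an x, and two hex digits), which is what print(str1) shows above. A small sketch under that assumption, using a raw string literal so that the backslashes survive; the variable names are illustrative:

```python
from ast import literal_eval

# Raw string: the backslashes are kept as literal characters, so the text
# looks exactly like the repr() of a bytes object.
str1 = r"b'Output file \xeb\xac\xb8\xed\x95\xad\xeb\xb6\x84\xec\x84\x9d.xlsx Created'"

as_bytes = literal_eval(str1)    # parses the text into a real bytes object
text = as_bytes.decode("utf-8")  # decodes the UTF-8 bytes into a str

print(type(as_bytes).__name__)
print(text)
```

literal_eval only accepts Python literals, so unlike eval it will not execute arbitrary code that happens to be in the string.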

Based on the SyntaxError mentioned in your comments, you may be running into a testing issue when printing: stdout may be set to ASCII in your console, and the console may not support some of the characters you are trying to print. You can try something like the following to set sys.stdout to UTF-8 and see what your console will print. It uses a string slice and encode to get bytes, rather than the ast.literal_eval approach that has already been suggested:
import codecs
import sys
sys.stdout = codecs.getwriter('utf-8')(sys.stdout.buffer)
s = "b'Output file \xeb\xac\xb8\xed\x95\xad\xeb\xb6\x84\xec\x84\x9d.xlsx Created'"
b = s[2:-1].encode().decode('utf-8')

A simple way is to assume that every character of the initial string is in the [0, 256) range and maps to the byte with the same value, which means the string can be treated as Latin-1 encoded.
The conversion is then trivial:
str1[2:-1].encode('Latin1').decode('utf8')
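
The Latin-1 approach fits the case where the escapes were already interpreted, so the string holds one character per original byte (all code points below 256). A minimal sketch, assuming that situation; the intermediate names are illustrative:

```python
# Non-raw literal: each \xNN escape becomes a single character with that
# code point, so the string holds "bytes in disguise".
str1 = "b'Output file \xeb\xac\xb8\xed\x95\xad\xeb\xb6\x84\xec\x84\x9d.xlsx Created'"

inner = str1[2:-1]             # drop the leading b' and the trailing '
raw = inner.encode("latin-1")  # code point N becomes byte N, for N < 256
text = raw.decode("utf-8")     # now decode the real UTF-8 bytes

print(text)
```

If the string contains any character above U+00FF, encode("latin-1") raises UnicodeEncodeError, which is a useful signal that the assumption does not hold.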

Finally I found an answer that uses a function to convert a string to bytes without encoding it. Given the string
str1 = "b'Output file \xeb\xac\xb8\xed\x95\xad\xeb\xb6\x84\xec\x84\x9d.xlsx Created'"
I take only the encoded text inside it
str1[2:-1]
and pass it to a function that converts the string to bytes without encoding its values:
import struct
def rawbytes(s):
    """Convert a string to raw bytes without encoding"""
    outlist = []
    for cp in s:
        num = ord(cp)
        if num <= 0xFF:  # fits in a single byte
            outlist.append(struct.pack('B', num))
        elif num <= 0xFFFF:  # fits in two bytes, big-endian
            outlist.append(struct.pack('>H', num))
        else:  # above U+FFFF: one high byte followed by the low 16 bits
            b = (num & 0xFF0000) >> 16
            H = num & 0xFFFF
            outlist.append(struct.pack('>BH', b, H))
    return b''.join(outlist)
Calling the function converts the text to bytes, which can then be decoded:
rawbytes(str1[2:-1]).decode('utf-8')
This gives the correct output:
'Output file 문항분석.xlsx Created'

Python adds extra to crypt result

I'm trying to create an API with a token to communicate between a Raspberry Pi and a web server. Right now I'm trying to generate a token with Python.
from Crypto.Cipher import AES
import base64
import os
import time
import datetime
import requests
BLOCK_SIZE = 32
BLOCK_SZ = 14
#!/usr/bin/python
salt = "123456789123" # Make sure the salt always has the same length! (12 chars)
iv = "1234567891234567" # Make sure the IV always has the same length! (16 chars)
currentDate = time.strftime("%d%m%Y")
currentTime = time.strftime("%H%M")
PADDING = '{'
pad = lambda s: s + (BLOCK_SIZE - len(s) % BLOCK_SIZE) * PADDING
EncodeAES = lambda c, s: base64.b64encode(c.encrypt(pad(s)))
DecodeAES = lambda c, e: c.decrypt(base64.b64decode(e)).rstrip(PADDING)
secret = salt + currentTime
cipher=AES.new(key=secret,mode=AES.MODE_CBC,IV=iv)
encode = currentDate
encoded = EncodeAES(cipher, encode)
print (encoded)
The problem is that the script adds an extra b' to the start of every encoded string, and a ' to the end:
C:\Python36-32>python.exe encrypt.py
b'Qge6lbC+SulFgTk/7TZ0TKHUP0SFS8G+nd5un4iv9iI='
C:\Python36-32>python.exe encrypt.py
b'DTcotcaU98QkRxCzRR01hh4yqqyC92u4oAuf0bSrQZQ='
Hopefully someone can explain what went wrong.
FIXED!
I was able to fix it by decoding the result as UTF-8:
sendtoken = encoded.decode('utf-8')

You are running Python 3.6, where string literals are Unicode str objects. EncodeAES() returns the result of base64.b64encode, which in Python 3 is a bytes object rather than a str, and Python indicates that by prepending the b to the literal it prints.
You could strip the b out of the output post-Python, or you could print(str(encoded)), which should give you the same characters, since ASCII is valid UTF-8.
EDIT:
What you need to do is decode the bytestring as UTF-8, as mentioned in the answer and in a comment above. I was wrong about str() doing the conversion for you: you need to call decode('UTF-8') on the bytestring you wish to print. That converts the bytes into a str, which then prints without the b prefix.
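
The difference between printing the bytes and printing the decoded text can be seen in a short, self-contained sketch. It uses a plain b"hello" payload instead of the AES output from the question, which is an assumption made only to keep the example runnable without PyCrypto:

```python
import base64

encoded = base64.b64encode(b"hello")  # b64encode returns bytes in Python 3
print(encoded)                        # b'aGVsbG8='
print(str(encoded))                   # b'aGVsbG8='  (str() keeps the b'' wrapper)

token = encoded.decode("ascii")       # base64 output is pure ASCII
print(token)                          # aGVsbG8=
```

Decoding with "utf-8" gives the same result here, since ASCII is a subset of UTF-8.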

Convert Python Bytes to String Without Encoding

I am using Python 3.6 and I have an image as bytes:
img = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00'
I need to convert the bytes into a string without encoding so it looks like:
raw_img = '\xff\xd8\xff\xe0\x00\x10JFIF\x00'
The goal is to incorporate this into an html image tag:
<img src="'data:image/png;base64," + base64.b64encode(raw_img) + "' />"

Why not just call str and remove the b after?
In:
str(img)[2:-1]
Out:
'\xff\xd8\xff\xe0\x00\x10JFIF\x00'

img.decode("utf-8")
You can decode the variable with the above. Then convert it to base64.
"<img src='data:image/png;base64,{}'/>".format( base64.b64encode(img.decode("utf-8")) )
UPDATED:
What you really want is this:
raw_img = repr(img)
"<img src='data:image/png;base64,{}'/>".format( base64.b64encode(raw_img) )

Since you just need to convert the image to a string, why not just use the str() function?
>>> img = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00'
>>> type(img)
<class 'bytes'>
>>> raw_img = str(img)
>>> type(raw_img)
<class 'str'>
img is in bytes, but when you use str() it is converted to type string.
An encoding can also be specified https://docs.python.org/3/library/stdtypes.html#str, which would be a more natural way to do things:
str(img, encoding='ansi')
As suggested in these answers

I didn't solve this, but here's some research on it (3 Feb 2022): this encoding is Latin-1, and it's hard to print because Python wants to print it in another format. But for your case they should be the same. And for a data:image/png;base64 URI, base64-encoded data should be used.
My test code:
import codecs
img = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"
desired = "\xff\xd8\xff\xe0\x00\x10JFIF\x00"
str_decode = img.decode("latin-1")
str_decode_2 = str(img, "latin-1")
codecs_decode = codecs.decode(img, "latin-1")
print(desired.encode("latin-1") == img)
print(str_decode == desired)
print(str_decode == str_decode_2)
print(str_decode == codecs_decode)
print("desired:", repr(desired)) ##devprint
This prints True four times, followed by desired: 'ÿØÿà\x00\x10JFIF\x00', with Python 3.10.

I'm pretty sure img is the byte string that you want to pass to base64.b64encode:
>>> import base64
>>> img = b'\xff\xd8\xff\xe0\x00\x10JFIF\x00'
>>> base64.b64encode(img)
b'/9j/4AAQSkZJRgA='
If you want to incorporate that into an HTML string, use
html = b'<img src="data:image/png;base64,' + base64.b64encode(img) + b'" />'

I've solved it (2022 - bit late to the party...)
If you try img_raw.decode() you get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
But if you leave img_raw as a bytes object, pass it into b64encode, and then decode the result, there is no UnicodeDecodeError, and you can pass it in as a data string to your image tag.
base64.b64encode(img_raw).decode()
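
Putting the pieces together, a minimal sketch of building the tag could look like this. The helper name img_tag and its mime parameter are hypothetical; the default of image/png simply mirrors the question, although the sample bytes begin with a JPEG signature, so image/jpeg may be the more accurate choice for real data.

```python
import base64

img_raw = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"  # sample bytes from the question


def img_tag(data: bytes, mime: str = "image/png") -> str:
    """Hypothetical helper: embed raw image bytes in an <img> tag as a data URI."""
    # Encode the raw bytes first, then decode the ASCII base64 text to str.
    b64_text = base64.b64encode(data).decode("ascii")
    return '<img src="data:{};base64,{}" />'.format(mime, b64_text)


html = img_tag(img_raw)
print(html)  # <img src="data:image/png;base64,/9j/4AAQSkZJRgA=" />
```

The raw image bytes are never decoded as text; only the base64 output, which is always ASCII, is turned into a str.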

Base58Check encoding for Bitcoin addresses too long

I'm trying to create a Bitcoin address with Python. I got the hashing part right, but I have some trouble with the Base58Check encoding. I use this package:
https://pypi.python.org/pypi/base58
Here is an example:
import base58
unencoded_string = "00010966776006953D5567439E5E39F86A0D273BEED61967F6"
encoded_string = base58.b58encode(unencoded_string)
print(encoded_string)
The output is:
bSLesHPiFV9jKNeNbUiMyZGJm45zVSB8bSdogLWCmvs88wxHjEQituLz5daEGCrHE7R7
According to the technical background for creating Bitcoin addresses the RIPEMD-160 hash above should be "16UwLL9Risc3QfPqBUvKofHmBQ7wMtjvM". That said, my output is wrong and obviously too long. Does anyone know what I did wrong?
EDIT:
I added a decoding to hex (.decode("hex")):
import base58
unencoded_string = "00010966776006953D5567439E5E39F86A0D273BEED61967F6"
encoded_string = base58.b58encode(unencoded_string.decode("hex"))
print(encoded_string)
The output looks better now:
1csU3KSAQMEYLPudM8UWJVxFfptcZSDvaYY477
Yet, it is still wrong. Does it have to be a byte encoding? How do you do that in Python?
EDIT2:
Fixed it now (thanks to Arpegius). Added str(bytearray.fromhex( hexstring )) to my code (in Python 2.7):
import base58
hexstring= "00010966776006953D5567439E5E39F86A0D273BEED61967F6"
unencoded_string = str(bytearray.fromhex( hexstring ))
encoded_string= base58.b58encode(unencoded_string)
print(encoded_string)
Output:
16UwLL9Risc3QfPqBUvKofHmBQ7wMtjvM

base58.b58encode needs bytes (a Python 2 str), not a hex string. You need to decode it first:
In [1]: import base58
In [2]: hexstring= "00010966776006953D5567439E5E39F86A0D273BEED61967F6"
In [3]: unencoded_string = bytes.fromhex(hexstring)
In [4]: encoded_string= base58.b58encode(unencoded_string)
In [5]: print(encoded_string)
16UwLL9Risc3QfPqBUvKofHmBQ7wMtjvM
In python 2.7 you can use str(bytearray.fromhex( hexstring )).
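
To see why the hex text and the decoded bytes give such different results, here is a pure-Python sketch of Base58 encoding. It is not the implementation used by the base58 package; b58encode_sketch is a hypothetical name, and the code assumes the standard Bitcoin alphabet.

```python
# Bitcoin Base58 alphabet: no 0, O, I or l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode_sketch(data: bytes) -> str:
    """Illustrative Base58 encoder (hypothetical, for explanation only)."""
    n = int.from_bytes(data, "big")  # treat the bytes as one big integer
    digits = []
    while n:
        n, rem = divmod(n, 58)
        digits.append(ALPHABET[rem])
    # Each leading zero byte is represented by a leading '1'.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + "".join(reversed(digits))


hexstring = "00010966776006953D5567439E5E39F86A0D273BEED61967F6"
as_text = hexstring.encode("ascii")   # 50 bytes: the hex digits as characters
as_bytes = bytes.fromhex(hexstring)   # 25 bytes: the values the digits describe

print(len(as_text), len(as_bytes))    # 50 25
print(b58encode_sketch(as_bytes))     # 16UwLL9Risc3QfPqBUvKofHmBQ7wMtjvM
```

Encoding as_text instead of as_bytes feeds twice as many bytes into the encoder, which is why the first attempt in the question produced an output that was far too long.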

Fixed-digit base64 encode and decode in Python

I'm trying to encode and decode a base64 string. It works fine normally, but if I try to restrict the hash to 6 digits, I get an error on decoding:
from base64 import b64encode
from base64 import b64decode
s="something"
base 64 encode/decode:
# Encode:
hash = b64encode(s)
# Decode:
dehash = b64decode(hash)
print dehash
(works)
6-digit base 64 encode/decode:
# Encode:
hash = b64encode(s)[:6]
# Decode:
dehash = b64decode(hash)
print dehash
TypeError: Incorrect padding
What am I doing wrong?
UPDATE:
Based on Mark's answer, I added padding to the 6-digit hash to make its length divisible by 4:
hash += "=="
But now the decoded result is "some"
UPDATE 2
Wow that was stupid ..

Base64 by definition requires padding when the encoded input does not consist of whole 4-character groups. Every 4 base64 characters are turned into 3 bytes. Your input length is not evenly divisible by 4, hence the error.
Wikipedia has a good description of the specifics of Base64.
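
The effect of truncating the encoded text can be reproduced in a few lines. This is a Python 3 sketch (the question uses Python 2), so the input is written as bytes:

```python
from base64 import b64decode, b64encode

full = b64encode(b"something")   # b'c29tZXRoaW5n'
short = full[:6]                 # b'c29tZX'

# Padding the truncated text to a multiple of 4 makes it decodable again,
# but only the bytes covered by the kept characters come back.
print(b64decode(short + b"=="))  # b'some'
print(b64decode(full))           # b'something'
```

Each base64 character carries 6 bits, so 6 characters hold 36 bits, which is enough for 4 whole bytes. A 6-character prefix therefore cannot recover the original 9 bytes; base64 is an encoding, not a fixed-length hash.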
