decode base64 like string with different index table(s) - python

My problem is that I have something encoded (base64-like) with a different index table:
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz+/
instead of
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
so when I use base64.b64decode() it gives me a wrong result.
Is there a way to set this table during conversion (as a parameter, maybe)?
Or should I "convert" the wrong base64 string, i.e. replace 0 with A, 1 with B, etc., and then use base64.b64decode()? If so, what is the best and fastest way to do this?
update1: I use this, which works, but looks a bit slow and unprofessional. :)
def correctbase64(str):
    dicta = [ ['0','A'], ['1','B'], ['2','C'], ['3','D'], ['4','E'], ['5','F'], ['6','G'], ['7','H'], ['8','I'], ['9','J'], ['A','K'], ['B','L'], ['C','M'], ['D','N'], ['E','O'], ['F','P'], ['G','Q'], ['H','R'], ['I','S'], ['J','T'], ['K','U'], ['L','V'], ['M','W'], ['N','X'], ['O','Y'], ['P','Z'], ['Q','a'], ['R','b'], ['S','c'], ['T','d'], ['U','e'], ['V','f'], ['W','g'], ['X','h'], ['Y','i'], ['Z','j'], ['a','k'], ['b','l'], ['c','m'], ['d','n'], ['e','o'], ['f','p'], ['g','q'], ['h','r'], ['i','s'], ['j','t'], ['k','u'], ['l','v'], ['m','w'], ['n','x'], ['o','y'], ['p','z'], ['q','0'], ['r','1'], ['s','2'], ['t','3'], ['u','4'], ['v','5'], ['w','6'], ['x','7'], ['y','8'], ['z','9'] ]
    l = list(str)
    for i in range(len(l)):
        for c in dicta:
            if l[i] == c[0]:
                l[i] = c[1]
                break
    return "".join(l)

Something like this should work (WARNING: untested code; may be full of mistakes):
import base64
import string
my_base64chars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz+/"
std_base64chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
s = s.translate(string.maketrans(my_base64chars, std_base64chars))
data = base64.b64decode(s)
It isn't possible to make the standard base64 functions (or the lower-level ones in binascii that they call) use a custom table.

You can use translate() and maketrans():
from string import maketrans
base64fixTable = maketrans("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz+/", "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")
def correctbase64(str):
    return str.translate(base64fixTable)

print "Hello Reverse Engineering!\n"
import string
import base64
my_base64chars = "WXYZlabcd3fghijko12e456789ABCDEFGHIJKL+/MNOPQRSTUVmn0pqrstuvwxyz"
std_base64chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
s = 'whatever encoded message you have that used my_base64chars index'
c = s.translate(string.maketrans(my_base64chars, std_base64chars))
data = base64.b64decode(c)
print (data)

Use maketrans to build a translation table and then translate from the first alphabet to the second. Then base64 decode.
import string
import base64
def decode(s):
    # make a translation table
    table = string.maketrans(
        # your alphabet
        string.digits + string.uppercase + string.lowercase + "+/",
        # the original alphabet
        string.uppercase + string.lowercase + string.digits + "+/"
    )
    # translate (translate() returns a new string, so keep the result)
    translated = s.translate(table)
    # finally decode
    return base64.b64decode(translated)

This will handle the error TypeError: Incorrect padding:
from string import maketrans
import base64
STANDARD_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
CUSTOM_ALPHABET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz+/'
def correctbase64(input):
    DECODE_TRANS = maketrans(CUSTOM_ALPHABET, STANDARD_ALPHABET)
    newStr = input.translate(DECODE_TRANS)
    # Pad with '=' so the length is a multiple of 4
    newStr += '=' * (-len(newStr) % 4)
    return base64.b64decode(newStr)
print correctbase64('Q6LiR6y') # hello
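The answers above are written for Python 2, where maketrans lives in the string module. As a minimal sketch for Python 3, assuming the input is a str and using a made-up helper name decode_custom, the same translate-then-decode idea might look like this:

```python
import base64

STANDARD_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
CUSTOM_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz+/"

# Build the table once; str.maketrans is the Python 3 counterpart of string.maketrans.
DECODE_TABLE = str.maketrans(CUSTOM_ALPHABET, STANDARD_ALPHABET)


def decode_custom(text):
    """Decode text that was base64-encoded with CUSTOM_ALPHABET (hypothetical helper)."""
    standard = text.translate(DECODE_TABLE)
    # Restore any '=' padding that was stripped, so the length is a multiple of 4.
    standard += "=" * (-len(standard) % 4)
    return base64.b64decode(standard)


print(decode_custom("Q6LiR6y"))  # b'hello'
```

Building the table at module level avoids rebuilding it on every call, which matters if many strings are decoded.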

Related

Convert utf-8 string to base64

I converted base64 str1= eyJlbXBsb3llciI6IntcIm5hbWVcIjpcInVzZXJzX2VtcGxveWVyXCIsXCJhcmdzXCI6XCIxMDQ5NTgxNjI4MzdcIn0ifQ
to str2={"employer":"{\"name\":\"users_employer\",\"args\":\"104958162837\"}"}
with help of http://www.online-decoder.com/ru
I want to convert str2 to str1 with help of python. My code:
import base64
data = """{"employer":"{"name":"users_employer","args":"104958162837"}"}"""
encoded_bytes = base64.b64encode(data.encode("utf-8"))
encoded_str = str(encoded_bytes, "utf-8")
print(encoded_str)
The code prints str3=
eyJlbXBsb3llciI6InsibmFtZSI6InVzZXJzX2VtcGxveWVyIiwiYXJncyI6IjEwNDk1ODE2MjgzNyJ9In0=
What should I change in code to print str1 instead of str3 ?
I tried
{"employer":"{\"name\":\"users_employer\",\"args\":\"104958162837\"}"}
and
{"employer":"{"name":"users_employer","args":"104958162837"}"}
but result is the same
The problem is that \" is a python escape sequence for the single character ". The fix is to keep the backslashes and use a raw string (note the "r" at the front).
data = r"""{"employer":"{\"name\":\"users_employer\",\"args\":\"104958162837\"}"}"""
This reproduces the original string, except that Python pads base64 output to a multiple of 4 characters with the "=" sign. Strip the padding and you have the original.
import base64
str1 = "eyJlbXBsb3llciI6IntcIm5hbWVcIjpcInVzZXJzX2VtcGxveWVyXCIsXCJhcmdzXCI6XCIxMDQ5NTgxNjI4MzdcIn0ifQ"
str1_decoded = base64.b64decode(str1 + "=" * (-len(str1) % 4)).decode("ascii")
print("str1", str1_decoded)
data = r"""{"employer":"{\"name\":\"users_employer\",\"args\":\"104958162837\"}"}"""
encoded_bytes = base64.b64encode(data.encode("utf-8"))
encoded_str = str(encoded_bytes.rstrip(b"="), "utf-8")
print("my encoded", encoded_str)
print("same?", str1 == encoded_str)
The results you want are just base64.b64decode(str1 + "=="). See this post for more information on the padding.
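The padding rule used in this answer can be wrapped in two small helpers. This is only a sketch; the names b64encode_nopad and b64decode_nopad are made up for illustration and are not part of the base64 module:

```python
import base64


def b64encode_nopad(data):
    """Base64-encode bytes and strip the trailing '=' padding."""
    return base64.b64encode(data).rstrip(b"=").decode("ascii")


def b64decode_nopad(text):
    """Decode base64 text whose '=' padding may have been stripped."""
    # For valid input, -len(text) % 4 is the number of '=' needed (0, 1 or 2).
    return base64.b64decode(text + "=" * (-len(text) % 4))


str1 = "eyJlbXBsb3llciI6IntcIm5hbWVcIjpcInVzZXJzX2VtcGxveWVyXCIsXCJhcmdzXCI6XCIxMDQ5NTgxNjI4MzdcIn0ifQ"
decoded = b64decode_nopad(str1)
print(decoded.decode("utf-8"))
print(b64encode_nopad(decoded) == str1)
```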

Python; Encode to MD5 (hashlib) shows error: "NoneType"

I wrote code that generates a random password five times, and I would like to hash those passwords with MD5. But when I try, I get the error 'NoneType' object has no attribute 'encode', and I don't know how to change the code to avoid it. Sorry, I'm a beginner in Python. My code is below. Thanks for the help.
import random, string
import hashlib
length = 6
chars = string.ascii_letters + string.digits
def ff():
    rnd = random.SystemRandom()
    a = (''.join(rnd.choice(chars) for i in range(length)))
    c = a
    return(c)
def ff2():
    for i in range(5):
        print(ff(),' ')
str = ff2()
result = hashlib.md5(str.encode())
print("The hexadecimal equivalent of hash is : ", end ="")
print(result.hexdigest())
The function ff2 doesn’t return anything so str will be of type NoneType.
IIUC, your ff2() function should call ff() five times but it should not print out the result. It should accumulate them in a string and return the string. Something like this perhaps:
def ff2():
    l = []
    for i in range(5):
        l.append(ff())
    return " ".join(l)
Here we accumulate the results of the five calls to ff() in a list l and then
use the string method join() to join them together.
The above returns a string that is the concatenation of the five strings that the calls to ff() returned, with spaces separating them. If you want commas as separators, just replace the return " ".join(l) with return ",".join(l).
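Putting the fix together, here is one way the whole script could look on Python 3. It is a sketch rather than the only correct layout: it hashes each password separately instead of hashing the joined string, and the helper names make_password and make_passwords are invented for this example.

```python
import hashlib
import random
import string

LENGTH = 6
CHARS = string.ascii_letters + string.digits


def make_password(rnd):
    """Return one random password of LENGTH characters."""
    return "".join(rnd.choice(CHARS) for _ in range(LENGTH))


def make_passwords(count=5):
    """Return a list of passwords instead of printing them."""
    rnd = random.SystemRandom()
    return [make_password(rnd) for _ in range(count)]


for password in make_passwords():
    # md5() needs bytes, so encode the str first.
    digest = hashlib.md5(password.encode("utf-8")).hexdigest()
    print(password, digest)
```

Because the functions return their results rather than printing them, the caller always has a real str to call encode() on.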

What is wrong with my decryption function?

import base64
import re
def encrypt(cleartext, key):
    to_return = bytearray(len(cleartext))
    for i in xrange(len(cleartext)):
        to_return[i] = ord(cleartext[i]) ^ ord(key)
    return base64.encodestring(str(to_return))
def decrypt(ciphertxt,key):
    x = base64.decodestring(re.escape(ciphertxt))
    to_return = bytearray(len(x))
    for i in xrange(len(x)):
        to_return[i] = ord(x[i]) ^ ord(key)
        while to_return[i]>127:
            to_return[i]-=127
    return to_return
When I encrypt bob and then use my decrypt function, it returns bob. However, for longer inputs such as paragraphs, where the ciphertext contains \ slashes, it does not work: instead of ASCII or base64 characters I get back weird Chinese characters or square characters. Any insight to point me in the right direction would help.
As jasonharper said, you're mangling your Base64 data by calling re.escape on it. Once you get rid of that, your code should be fine. I haven't tested it extensively, but it works correctly for me with multi-line text.
You should also get rid of this from your decrypt function:
while to_return[i]>127:
    to_return[i]-=127
It won't do anything if the original cleartext is valid ASCII, but it will mess up the decoding if the cleartext does contain bytes > 127.
However, those functions could be a little more efficient.
FWIW, here's a version that works correctly on both Python 2 and Python 3. This code isn't as efficient as it could be on Python 3, due to the compromises made to deal with the changes in text and bytes handling in Python 3.
import base64
def encrypt(cleartext, key):
    buff = bytearray(cleartext.encode())
    key = ord(key)
    buff = bytearray(c ^ key for c in buff)
    return base64.b64encode(bytes(buff))
def decrypt(ciphertext, key):
    buff = bytearray(base64.b64decode(ciphertext))
    key = ord(key)
    buff = bytearray(c ^ key for c in buff)
    return buff.decode()
# Test
s = 'This is a test\nof XOR encryption'
key = b'\x53'
coded = encrypt(s, key)
print(coded)
plain = decrypt(coded, key)
print(plain)
Python 3 output
b'Bzs6IHM6IHMycyc2ICdZPDVzCxwBczY9MCEqIyc6PD0='
This is a test
of XOR encryption
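If Python 2 support is not needed, the same single-byte XOR scheme can work on bytes throughout. The following is a sketch under that assumption; xor_bytes is a hypothetical helper, and the key is passed as an integer in the range 0 to 255 rather than as a one-character string:

```python
import base64


def xor_bytes(data, key):
    """XOR every byte of data with the integer key (0-255)."""
    return bytes(b ^ key for b in data)


def encrypt(cleartext, key):
    """str -> UTF-8 bytes -> XOR -> Base64 bytes."""
    return base64.b64encode(xor_bytes(cleartext.encode("utf-8"), key))


def decrypt(ciphertext, key):
    """Base64 bytes -> XOR -> UTF-8 bytes -> str."""
    return xor_bytes(base64.b64decode(ciphertext), key).decode("utf-8")


s = "This is a test\nof XOR encryption"
coded = encrypt(s, 0x53)
print(coded)
print(decrypt(coded, 0x53))
```

Since XOR with the same key is its own inverse, one helper serves both directions.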

Read Null terminated string in python

I'm trying to read a null-terminated string, but I'm having issues when unpacking a char and putting it together with a string.
This is the code:
def readString(f):
    str = ''
    while True:
        char = readChar(f)
        str = str.join(char)
        if (hex(ord(char))) == '0x0':
            break
    return str
def readChar(f):
    char = unpack('c',f.read(1))[0]
    return char
Now this is giving me this error:
TypeError: sequence item 0: expected str instance, int found
I'm also trying the following:
char = unpack('c',f.read(1)).decode("ascii")
But it throws me:
AttributeError: 'tuple' object has no attribute 'decode'
I don't even know how to read the chars and add them to the string. Is there a proper way to do this?
Here's a version that (ab)uses the lesser-known "sentinel" argument of the built-in iter():
with open('file.txt', 'rb') as f:
    val = ''.join(iter(lambda: f.read(1).decode('ascii'), '\x00'))
How about:
myString = myNullTerminatedString.split("\x00")[0]
For example:
myNullTerminatedString = "hello world\x00\x00\x00\x00\x00\x00"
myString = myNullTerminatedString.split("\x00")[0]
print(myString) # "hello world"
This works by splitting the string on the null character. Since the string should terminate at the first null character, we simply grab the first item in the list after splitting. split will return a list of one item if the delimiter doesn't exist, so it still works even if there's no null terminator at all.
It also will work with byte strings:
myByteString = b'hello world\x00'
myStr = myByteString.split(b'\x00')[0].decode('ascii') # "hello world" as normal string
If you're reading from a file, you can do a relatively larger read - estimate how much you'll need to read to find your null string. This is a lot faster than reading byte-by-byte. For example:
resultingStr = b''
extraBytes = 0
while True:
    buf = f.read(512)
    if len(buf)==0: break  # end of file, no null terminator found
    resultingStr += buf
    if (b"\x00" in resultingStr):
        nullIndex = resultingStr.index(b"\x00")
        # bytes read past the end of the string, counting the null byte itself
        extraBytes = len(resultingStr) - nullIndex
        resultingStr = resultingStr[:nullIndex]
        break
# now "resultingStr" contains the string (as bytes)
f.seek(0 - extraBytes,1) # seek backwards by the number of extra bytes, now the pointer will be on the null byte in the file
# or f.seek(1 - extraBytes,1) to skip the null byte in the file (only if a null byte was found)
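The chunked-read idea can be packaged as a function. This sketch assumes a binary file object that supports tell() and seek(); the name read_cstring and the chunk_size parameter are illustrative, not taken from any library:

```python
import io


def read_cstring(f, chunk_size=512):
    """Read bytes up to the next null byte and leave f positioned just after it."""
    start = f.tell()
    collected = b""
    while True:
        buf = f.read(chunk_size)
        if not buf:
            # End of file without a terminator: return everything that was read.
            return collected
        collected += buf
        end = collected.find(b"\x00")
        if end != -1:
            # Reposition just past the null byte so the next read starts there.
            f.seek(start + end + 1)
            return collected[:end]


f = io.BytesIO(b"sword\x00Sword_Wea_Dummy\x00\xcd\xcc\xcc=")
print(read_cstring(f))  # b'sword'
print(read_cstring(f))  # b'Sword_Wea_Dummy'
print(f.read())         # b'\xcd\xcc\xcc='
```

Seeking to an absolute offset computed from the starting position avoids the bookkeeping of how many extra bytes were read.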
(edit version 2, added extra way at the end)
Maybe there are libraries out there that can help with this, but as I don't know of any, let's attack the problem with what we know.
In Python 2, bytes and str are basically the same thing. That changed in Python 3, where str is what Python 2 calls unicode and bytes is its own separate type. So on Python 2 you don't need a separate readChar, and I don't think you need unpack for this particular case. With that in mind, let's define the new readString:
def readString(myfile):
    chars = []
    while True:
        c = myfile.read(1)
        # stop at the null character, or at end of file (read returns '')
        if c == chr(0) or not c:
            return "".join(chars)
        chars.append(c)
Just like your code, I read one character at a time, but I save the characters in a list instead. Strings are immutable, so doing str += char results in unnecessary copies. When I find the null character, I return the joined string. chr is the inverse of ord: it gives you the character for a given ASCII value. This excludes the null character; if you need it, just move the append before the check.
Now let's test it with your sample file.
For instance, let's try to read "Sword_Wea_Dummy" from it:
with open("sword.blendscn","rb") as archi:
    #lets simulate that some prior processing was made by
    #moving the pointer of the file
    archi.seek(6)
    string=readString(archi)
    print "string repr:", repr(string)
    print "string:", string
    print ""
    #and the rest of the file is there waiting to be processed
    print "rest of the file: ", repr(archi.read())
and this is the output
string repr: 'Sword_Wea_Dummy'
string: Sword_Wea_Dummy
rest of the file: '\xcd\xcc\xcc=p=\x8a4:\xa66\xbfJ\x15\xc6=\x00\x00\x00\x00\xeaQ8?\x9e\x8d\x874$-i\xb3\x00\x00\x00\x00\x9b\xc6\xaa2K\x15\xc6=;\xa66?\x00\x00\x00\x00\xb8\x88\xbf#\x0e\xf3\xb1#ITuB\x00\x00\x80?\xcd\xcc\xcc=\x00\x00\x00\x00\xcd\xccL>'
Other tests:
>>> with open("sword.blendscn","rb") as archi:
        print readString(archi)
        print readString(archi)
        print readString(archi)
sword
Sword_Wea_Dummy
ÍÌÌ=p=Š4:¦6¿JÆ=
>>> with open("sword.blendscn","rb") as archi:
        print repr(readString(archi))
        print repr(readString(archi))
        print repr(readString(archi))
'sword'
'Sword_Wea_Dummy'
'\xcd\xcc\xcc=p=\x8a4:\xa66\xbfJ\x15\xc6='
>>>
Now that I think about it, you mention that the data portion is of fixed size. If that is true for all files, and they all have the following structure
[unknown size data][known size data]
then that is a pattern we can exploit: we only need to know the size of the file, and we can get both parts as follows
import os
def getDataPair(filename,knowSize):
    size = os.path.getsize(filename)
    with open(filename, "rb") as archi:
        unknown = archi.read(size-knowSize)
        know = archi.read()
    return unknown, know
and once you know the size of the data portion (which I got by playing with the prior example), its use is simple
>>> string_data, data = getDataPair("sword.blendscn", 80)
>>> string_data
'sword\x00Sword_Wea_Dummy\x00'
>>> data
'\xcd\xcc\xcc=p=\x8a4:\xa66\xbfJ\x15\xc6=\x00\x00\x00\x00\xeaQ8?\x9e\x8d\x874$-i\xb3\x00\x00\x00\x00\x9b\xc6\xaa2K\x15\xc6=;\xa66?\x00\x00\x00\x00\xb8\x88\xbf#\x0e\xf3\xb1#ITuB\x00\x00\x80?\xcd\xcc\xcc=\x00\x00\x00\x00\xcd\xccL>'
>>> string_data.split(chr(0))
['sword', 'Sword_Wea_Dummy', '']
>>>
Now a simple split will suffice to get each string, and you can pass the rest of the file, contained in data, to the appropriate function to be processed.
Doing file I/O one character at a time is horribly slow.
Instead use readline0, now on pypi: https://pypi.org/project/readline0/ . Or something like it.
In 3.x, there's a "newline" argument to open, but it doesn't appear to be as flexible as readline0.
Here is my implementation:
import struct
def read_null_str(f):
    r_str = ""
    while 1:
        back_offset = f.tell()
        try:
            # try to decode a single byte first
            r_char = struct.unpack("c", f.read(1))[0].decode("utf8")
        except:
            # not decodable on its own: rewind and read a 2-byte little-endian value instead
            f.seek(back_offset)
            temp_char = struct.unpack("<H", f.read(2))[0]
            r_char = chr(temp_char)
        if ord(r_char) == 0:
            return r_str
        else:
            r_str += r_char

I'm getting a typeError: string indices must be integers, not type [duplicate]

alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
key = "XPMGTDHLYONZBWEARKJUFSCIQV"
def encode():
    alpha[""] = key["x"]
def decode():
    key[""] = alpha[""]
def menu():
    response = raw_input("""Crypto Menu
quit(0)
encode(1)
decode(2)""")
    return response
def main():
    keepGoing = True
    while keepGoing:
        response = menu()
        if response == "1":
            plain = raw_input("text to be encoded: ")
            print encode()
        elif response == "2":
            coded = raw_input("code to be decyphered: ")
            print decode()
        elif response == "0":
            print "Thanks for doing secret spy stuff with me."
            keepGoing = False
        else:
            print "I don't know what you want to do..."
print main()
I keep getting a TypeError saying string indices must be integers, not type. Not sure how to correct this, it is highlighting the decode and encode variables.
There's a lot going on here, but you're definitely having issues with your encode and decode functions. If I understand what you're trying to do, you could rewrite them as follows:
def encode(string):
    alpha = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    key = 'XPMGTDHLYONZBWEARKJUFSCIQV'
    encoded = []
    for character in string:
        character_index = alpha.index(character)
        encoded.append(key[character_index])
    return ''.join(encoded)
def decode(string):
    alpha = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    key = 'XPMGTDHLYONZBWEARKJUFSCIQV'
    decoded = []
    for character in string:
        character_index = key.index(character)
        decoded.append(alpha[character_index])
    return ''.join(decoded)
Each function is doing essentially the same thing. Here's what encode is doing:
Creating an empty list called encoded. I'll use this to store each character, translated, in order.
Loop through the characters of the string passed in.
At each iteration, find its index in the string alpha.
Find the character in key at that same index and append that character to the list encoded
Once all the characters have been translated, join them into a string and return that string.
Note: This will fail if a character in the string argument is not found in the alpha string. You could add some error checking.
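One possible form of that error checking, sketched for Python 3, is to copy unknown characters through unchanged instead of raising. The name encode_lenient and the choice to upper-case the input first are assumptions made for this example, not part of the answer's code:

```python
ALPHA = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
KEY = "XPMGTDHLYONZBWEARKJUFSCIQV"


def encode_lenient(text):
    """Substitute letters found in ALPHA; copy any other character unchanged."""
    encoded = []
    for character in text.upper():
        index = ALPHA.find(character)  # find() returns -1 instead of raising ValueError
        encoded.append(KEY[index] if index != -1 else character)
    return "".join(encoded)


print(encode_lenient("Hello, World!"))
```

Using find() rather than index() keeps the loop free of try/except while still detecting characters outside the alphabet.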
You could make this even more general if you wanted to allow for different keys. You could write a translate function like this:
def translate(string, from_language_string, to_language_string):
    translated = []
    for character in string:
        character_index = from_language_string.index(character)
        translated.append(to_language_string[character_index])
    return ''.join(translated)
And then your encode and decode functions could be written like this:
def encode(string):
    return translate(string, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'XPMGTDHLYONZBWEARKJUFSCIQV')
def decode(string):
    return translate(string, 'XPMGTDHLYONZBWEARKJUFSCIQV', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')
To address what's going on in the rest of your code, for the conditionals in your main function, you'll just want to make sure to pass the strings read in from raw_input to the encode and decode functions as needed. Something like this:
if response == '1':
    plain = raw_input('text to be encoded: ')
    print encode(plain)
# and so on
Good luck.
I think your problem is here:
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
key = "XPMGTDHLYONZBWEARKJUFSCIQV"
def encode():
    alpha[""] = key["x"]
def decode():
    key[""] = alpha[""]
I think you misunderstand how indexing in strings works. So let me try to correct this:
Take a string, like x = "hello". The reference x["h"] is meaningless. There's no way for Python to interpret this. On the other hand, x[0] is meaningful. It returns the element of x at index 0. That's "h", in our case.
Similarly, alpha[""] doesn't mean anything. When you use the square brackets, you are trying to specify an index in the string alpha. But the indices of alpha are integers. alpha[0] returns "A". alpha[1] returns "B". alpha[25] returns "Z".
So you need to use integers for your indices. Notation like key["x"] doesn't mean anything, and that raises errors.
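To make the indexing point concrete, and to show a built-in alternative to an index-based loop, here is a short Python 3 sketch. It relies on str.maketrans and str.translate from the standard library; the table names ENCODE_TABLE and DECODE_TABLE are chosen for this example:

```python
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
key = "XPMGTDHLYONZBWEARKJUFSCIQV"

# Integer indices work; a string index such as alpha["A"] raises TypeError.
print(alpha[0], alpha[25])   # A Z
print(alpha.index("C"))      # 2, the position of "C" in alpha

# A translation table maps each character of one string to the character
# at the same position in the other string.
ENCODE_TABLE = str.maketrans(alpha, key)
DECODE_TABLE = str.maketrans(key, alpha)


def encode(text):
    return text.translate(ENCODE_TABLE)


def decode(text):
    return text.translate(DECODE_TABLE)


secret = encode("HELLO")
print(secret)
print(decode(secret))
```

Characters that are not in the table, such as spaces or lowercase letters, pass through translate() unchanged.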
