Sum Hex Values using Python to get CheckSum - python

My question maybe is simple but i'm not good with bytes/hex operations. I need to do a checksum from a Serial Port data with this Values:
55 55 3A 0B 47 09 3E 08 FF 0F 93
The last value 93 is the sum value but i don't know how to do this.
55 + 55 + 3A + 0B + 47 + 09 + 3E + 08 + FF + 0F = 93

Convert the raw bytestring into a sequence of numbers, then add all but the last number, mask to byte-length, and compare the result with the last number in the sequence.
>>> data = bytearray('\x55\x55\x3a\x0b\x47\x09\x3e\x08\xff\x0f\x93')
>>> sum(data[:-1]) & 0xff == data[-1]
True

Related

What's a less bad way to write this hex formatting function in Python?

With Python, I wanted to format a string of hex characters:
spaces between each byte (easy enough): 2f2f -> 2f 2f
line breaks at a specified max byte width (not hard): 2f 2f 2f 2f 2f 2f 2f 2f\n
address ranges for each line (doable): 0x7f8-0x808: 2f 2f 2f 2f 2f 2f 2f 2f\n
replace large ranges of sequential 00 bytes with: ... trimmed 35 x 00 bytes [0x7 - 0x2a] ... ... it was at this point that I knew I was doing some bad coding. The function got bloated and hard to follow. Too many features piled up in a non-intuitive way.
Example output:
0x0-0x10: 5a b6 f7 6e 7c 65 45 a0 bc 6a e5 f5 77 2b 92 48
0x10-0x20: 47 d7 33 ea 40 15 44 ac 6b a4 50 78 6e f2 10 d4
0x20-0x30: 9c 7c c1 f7 5a bf ec 9f b0 2b b7 29 97 ee 56 31
0x30-0x40: ff 23 d9 1a 0b 4e fd 65 50 92 42 eb b2 77 7a 55
0x40-0x50:
I'm pretty sure the address ranges aren't correct anymore in certain cases (particularly when the 00 replacement occurs), the function just looks disgusting, and I'm embarrassed to even show it.
def pretty_print_hex(hex_str, byte_width=16, line_start=False, addr=0):
out = ''
condense_min = 12
total_bytes = int(len(hex_str) / 2)
line_width = False
if byte_width is not False:
line_width = byte_width * 2
if line_start is not False:
out += line_start
end = addr + byte_width
if (end > addr + total_bytes):
end = addr + total_bytes
out += f"{hex(addr)}-{hex(end)}:\t"
addr += byte_width
i = 0
if len(hex_str) == 1:
print('Cannot pretty print < 1 byte', hex_str)
return
condensing = False
cond_start_addr = 0
cond_end_addr = 0
condense_cache = []
while i < len(hex_str):
byte = hex_str[i] + hex_str[i + 1]
i += 2
if byte == '00':
condensing = True
cond_start_addr = (addr - byte_width) + ((i + 1) % byte_width)
condense_cache.append(byte)
else:
if condensing is True:
condensed_count = len(condense_cache)
if condensed_count >= condense_min:
cond_end_addr = cond_start_addr + condensed_count
out += f"... trimmed {condensed_count} x 00 bytes [{hex(cond_start_addr)} - {hex(cond_end_addr)}] ..."
else:
for byte in condense_cache:
out += f"{byte} "
condense_cache = []
condensing = False
if condensing is False:
out += byte + ' '
if (line_width is not False) and (i) % line_width == 0:
out += '\n'
if line_start is not False:
out += line_start
end = addr + byte_width
if end > addr + total_bytes:
end = addr + total_bytes
if (addr - end) != 0:
out += f"{hex(addr)}-{hex(end)}:\t"
addr += byte_width
if condensing is True:
condensed_count = len(condense_cache)
if condensed_count >= condense_min:
cond_end_addr = cond_start_addr + condensed_count
out += f"... trimmed {condensed_count} x 00 bytes [{hex(cond_start_addr)} - {hex(cond_end_addr)}] ..."
else:
for byte in condense_cache:
out += f"{byte} "
return out.rstrip()
example input / output:
hex_str = 'c8d8fb631cc7d072b62aaf9cd47bc270d4341e35f23b7a94acf24f33397a6cb4145b6eacfd56653d79bea10d2842023155e5b14bec3b5851a0a58cb3a523c476b126486e1392bdd2e3bcb6cbc333b23de387ae8624123009'
byte_width=16
line_start='\t'
addr=0
print(pretty_print_hex(hex_str , byte_width=16, line_start='\t', addr=0))
0x0-0x10: c8 d8 fb 63 1c c7 d0 72 b6 2a af 9c d4 7b c2 70
0x10-0x20: d4 34 1e 35 f2 3b 7a 94 ac f2 4f 33 39 7a 6c b4
0x20-0x30: 14 5b 6e ac fd 56 65 3d 79 be a1 0d 28 42 02 31
0x30-0x40: 55 e5 b1 4b ec 3b 58 51 a0 a5 8c b3 a5 23 c4 76
0x40-0x50: b1 26 48 6e 13 92 bd d2 e3 bc b6 cb c3 33 b2 3d
0x50-0x60: e3 87 ae 86 24 12 30 09
It gets much worse when you involve some 00 replacement, here's an example of that:
hex_str = 'c8000000000000000000000000000aaf9cd47bc270d4341e35f23b7a94acf24f33397a6cb4145b6eacfd56653d79bea10d2842023155e5b14bec3b5851a0a58cb3a523c476b126486e1392bdd2e3bcb6cbc333b23de387ae8624123009'
byte_width=16
line_start='\t'
addr=0
print(pretty_print_hex(hex_str, byte_width=16, line_start='\t', addr=0))
0x0-0x10: c8 ... trimmed 13 x 00 bytes [0xd - 0x1a] ...0a af
0x10-0x20: 9c d4 7b c2 70 d4 34 1e 35 f2 3b 7a 94 ac f2 4f
0x20-0x30: 33 39 7a 6c b4 14 5b 6e ac fd 56 65 3d 79 be a1
0x30-0x40: 0d 28 42 02 31 55 e5 b1 4b ec 3b 58 51 a0 a5 8c
0x40-0x50: b3 a5 23 c4 76 b1 26 48 6e 13 92 bd d2 e3 bc b6
0x50-0x60: cb c3 33 b2 3d e3 87 ae 86 24 12 30 09
It would also make more sense to make the address range (`0x0-0x10) portray the true range, to include the trimmed bytes on that line, but I couldn't even begin to think of how to add that in.
Rather than patch this bad looking function, I thought I might ask for a better approach entirely, if one exists.
I would suggest to not start a "trimmed 00 bytes" series in the middle of an output line, but only apply this compacting when it applies to complete output lines with only zeroes.
This means that you will still see non-compacted zeroes in a line that also contains non-zeroes, but in my opinion this results in a cleaner output format. For instance, if a line would end with just two 00 bytes, it really does not help to replace that last part of the line with the longer "trimmed 2 x 00 bytes" message. By only replacing complete 00-lines with this message, and compress multiple such lines with one message, the output format seems cleaner.
To produce that output format, I would use the power of regular expressions:
to identify a block of bytes to be output on one line: either a line with at least one non-zero, or a range of zero bytes which either runs to the end of the input, or else is a multiple of the "byte width" argument.
to insert spaces in a line of bytes
All this can be done through iterations in one expression:
def pretty_print_hex(hex_str, byte_width=16, line_start='\t', addr=0):
return "\n".join(f"{hex(start)}-{hex(last)}:{line_start}{line}"
for start, last, line in (
(match.start() // 2, match.end() // 2 - 1,
f"...trimmed {(match.end() - match.start()) // 2} x 00 bytes..." if match[1]
else re.sub("(..)(?!$)", r"\1 ", match[0])
)
for match in re.finditer(
f"(0+$|(?:(?:00){{{byte_width}}})+)|(?:..){{1,{byte_width}}}",
hex_str
)
)
)
If you want to use it rather than write it (not sure - tell me to delete if required), you can use the excellent (I am not associated with it) hexdump:
https://pypi.org/project/hexdump
python -m hexdump binary.dat
It is super cool - I guess you could also inspect the source for ideas.
It doesn't, however, look like it is still maintained...
I liked the challenge in this function, and this is what I could come up with this evening. It is somewhat shorter than your original one, but not as short as trincot's answer.
def hexpprint(
hexstring: str,
width: int = 16,
hexsep: str = " ",
addr: bool = False,
addrstart: int = 0,
linestart: str = "",
compress: bool = False,
):
# if address get hex address length size
if addr:
addrlen = len(f"{addrstart+len(hexstring):x}")
# compression buffer just count hex 0 chars
cbuf = 0
for i in range(0, len(hexstring), width):
j = i + width
row = hexstring[i:j]
# if using compression and compressable
if compress and row.count("0") == len(row):
cbuf += len(row)
continue
# if not compressable and has cbuf, flush it
if cbuf:
line = linestart
if addr:
beg = f"0x{addrstart+i-cbuf:0{addrlen}x}"
end = f"0x{addrstart+i:0{addrlen}x}"
line += f"{beg}-{end} "
line += f"compressed {cbuf//2} NULL bytes"
print(line)
cbuf = 0
# print formatted hex row
line = linestart
if addr:
beg = f"0x{addrstart+i:0{addrlen}x}"
end = f"0x{addrstart+i+len(row):0{addrlen}x}"
line += f"{beg}-{end} "
line += hexsep.join(row[i : i + 2] for i in range(0, width, 2))
print(line)
# flush cbuf if necessary
if cbuf:
line = linestart
if addr:
beg = f"0x{addrstart+i-cbuf:0{addrlen}x}"
end = f"0x{addrstart+len(hexstring):0{addrlen}x}"
line += f"{beg}-{end} "
line += f"compressed {cbuf//2} NULL bytes"
print(line)
PS: I don't really like the code repetition to print things, so I might come back and edit later.

Dynamic Two Dimensional Array Creation Python

I have a long string of repeating two hexadecimal characters separated by a space read in from a file that I would like to store into a two dimensional (array) list for processing later. The string is in the form:
file_content = "00 18 00 19 F0 0F 1A 80 FF C7 E8 11 7F 52 7D 00 F0 0D F0 0C 0B FF"
Each sub string that needs indexed begins with "00" and ends with "FF". There are no instances of "FF" mid string but there are instances of "00" possible which makes this tricky. I would like to store each one of these events to its own index in the list. For example:
event_list = [[00 18 00 19 F0 0F 1A 80 FF], [C7 E8 11 7F 52 7D 00 F0 0D F0 0C 0B FF], .....}
If I understand this correctly, you're splitting it up based on the 'FF's present in the string, so you could probably get away with something like:
event_list = [('%sFF' % x).strip().split(' ') for x in file_content.split('FF')[:-1]]
This will split your original string by the 'FF's present, then loop through the split parts, append an ' FF' to the end of them. It then splits the new string by the space character, generating a new list and appends it to the outer list, creating the 2D array you require in 1 line :)

How to find out address in binary file with python code only?

I have binary for example https://github.com/andrew-d/static-binaries/blob/master/binaries/linux/x86_64/nmap
1) How to find what is the address of this series of bytes :48 8B 45 A8 48 8D 1C 02 48 8B 45 C8 ? , the result need to be 0x6B0C67
2)How to find out the 12 bytes that in address 0x6B0C67 ? the result need to be 48 8B 45 A8 48 8D 1C 02 48 8B 45 C8 .
3) How to find which address call to specific string? for example i + 1 == features[i].index that locate in 0x6FC272 ? the result need to be 0x4022F6
How can I find all of this without open Ida? only with python/c code?
thanks
For 1) Is your file small enough to be loaded into memory? Then it's as simple as
offset = open(file, 'rb').read().find(
bytes.fromhex("48 8B 45 A8 48 8D 1C 02 48 8B 45 C8")
)
# offset will be -1 if not found
If not, you will need to read it in chunks.
For 2), do
with open(file, 'rb') as stream:
stream.seek(0x6b0c67)
data = stream.read(12)
I'm afraid I don't understand the question in 3)...

Diffing Binary Files In Python

I've got two binary files. They look something like this, but the data is more random:
File A:
FF FF FF FF 00 00 00 00 FF FF 44 43 42 41 FF FF ...
File B:
41 42 43 44 00 00 00 00 44 43 42 41 40 39 38 37 ...
What I'd like is to call something like:
>>> someDiffLib.diff(file_a_data, file_b_data)
And receive something like:
[Match(pos=4, length=4)]
Indicating that in both files the bytes at position 4 are the same for 4 bytes. The sequence 44 43 42 41 would not match because they're not in the same positions in each file.
Is there a library that will do the diff for me? Or should I just write the loops to do the comparison?
You can use itertools.groupby() for this, here is an example:
from itertools import groupby
# this just sets up some byte strings to use, Python 2.x version is below
# instead of this you would use f1 = open('some_file', 'rb').read()
f1 = bytes(int(b, 16) for b in 'FF FF FF FF 00 00 00 00 FF FF 44 43 42 41 FF FF'.split())
f2 = bytes(int(b, 16) for b in '41 42 43 44 00 00 00 00 44 43 42 41 40 39 38 37'.split())
matches = []
for k, g in groupby(range(min(len(f1), len(f2))), key=lambda i: f1[i] == f2[i]):
if k:
pos = next(g)
length = len(list(g)) + 1
matches.append((pos, length))
Or the same thing as above using a list comprehension:
matches = [(next(g), len(list(g))+1)
for k, g in groupby(range(min(len(f1), len(f2))), key=lambda i: f1[i] == f2[i])
if k]
Here is the setup for the example if you are using Python 2.x:
f1 = ''.join(chr(int(b, 16)) for b in 'FF FF FF FF 00 00 00 00 FF FF 44 43 42 41 FF FF'.split())
f2 = ''.join(chr(int(b, 16)) for b in '41 42 43 44 00 00 00 00 44 43 42 41 40 39 38 37'.split())
The provided itertools.groupby solution works fine, but it's pretty slow.
I wrote a pretty naive attempt using numpy and tested it versus the other solution on a particular 16MB file I happened to have, and it was about 42x faster on my machine. Someone familiar with numpy could likely improve this significantly.
import numpy as np
def compare(path1, path2):
x,y = np.fromfile(path1, np.int8), np.fromfile(path2, np.int8)
length = min(x.size, y.size)
x,y = x[:length], y[:length]
z = np.where(x == y)[0]
if(z.size == 0) : return z
borders = np.append(np.insert(np.where(np.diff(z) != 1)[0] + 1, 0, 0), len(z))
lengths = borders[1:] - borders[:-1]
starts = z[borders[:-1]]
return np.array([starts, lengths]).T

Question about DES encryption in python

I am trying to create an LM/NTLM response for which I require encrypting the challenge sent by server using DES algorithm
The following is what I did:
from M2Crypto.EVP import Cipher
def encryptChallenge(magic, key):
str_key = ""
for iter1 in key:
str_key = str_key + chr(iter1)
encrypt = 1
cipher = Cipher(alg='des_ede_ecb', key=str_key, op=encrypt, iv='\0'*16)
ciphertext = cipher.update(magic)
ciphertext += cipher.final()
return ciphertext
However when I try encrypting "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" using DES, I get the following result:
Key used to encrypt: ['0xfe', '0x9b', '0xd5', '0x16', '0xcd', '0x15', '0xc8', '0x49']
Challenge after encryption:
Encrypted_server_challenge_using_key_1 : ['0x66', '0xf7', '0xa', '0xf8', '0xda', '0x4e', '0x7', '0xaa', '0x65', '0xc3', '0x8d', '0xaa', '0x48', '0xcc', '0x67', '0x57', '0xe2', '0xb0', '0x6e', '0x10', '0xb', '0x5e', '0xdd', '0xb4']
The above response was not accepted by the server
Tried using a tool called DEScalc.jar (http://www.unsw.adfa.edu.au/~lpb/src/DEScalc/index.html) and found that the encrypted result is:
setKey(fe9bd516cd15c849)
encryptDES(0123456789abcdef)
IP: L0=cc00ccff, R0=f0aaf0aa
Rnd1 f(R0=f0aaf0aa, SK1=0b 2c 23 12 33 1c 2b 09 ) = 988995a0
Rnd2 f(R1=5489595f, SK2=21 15 0d 11 1c 1a 3b 38 ) = 63200664
Rnd3 f(R2=938af6ce, SK3=01 35 2f 05 3e 19 30 1f ) = c206c318
Rnd4 f(R3=968f9a47, SK4=06 37 07 01 03 37 1a 3e ) = bdf738ef
Rnd5 f(R4=2e7dce21, SK5=06 14 17 29 0f 17 27 25 ) = 76c68d3d
Rnd6 f(R5=e049177a, SK6=34 14 06 0d 28 2c 23 37 ) = c182a1c7
Rnd7 f(R6=efff6fe6, SK7=04 18 2e 05 31 3a 3e 17 ) = c3e45497
Rnd8 f(R7=23ad43ed, SK8=04 13 22 27 2f 30 1f 19 ) = 4977a92c
Rnd9 f(R8=a688c6ca, SK9=12 0a 38 0c 3d 33 19 26 ) = 4975507e
Rnd10 f(R9=6ad81393, SK10=10 0b 30 1e 1f 08 2f 2e ) = d52a9361
Rnd11 f(R10=73a255ab, SK11=19 0a 31 22 05 0f 33 1f ) = 38b2a619
Rnd12 f(R11=526ab58a, SK12=38 2e 30 22 1b 3b 13 31 ) = e9dec064
Rnd13 f(R12=9a7c95cf, SK13=3a 0a 1c 12 2a 3e 35 2b ) = d88ee399
Rnd14 f(R13=8ae45613, SK14=19 09 18 1b 0b 2d 3c 16 ) = 9de6ddb2
Rnd15 f(R14=079a487d, SK15=19 39 01 12 37 14 17 36 ) = 5fb60a90
Rnd16 f(R15=d5525c83, SK16=24 05 0d 39 31 1f 2d 34 ) = 6a40b6ea
FP: L=c337cd5c, R=bd44fc97
returns c337cd5cbd44fc97
Noticed that the above result is accepted by the server
Is there a specific algorithm that is used by DEScalc.jar which I am missing, because of which I don't get the results obtained by DEScalc.jar
Hi Everyone,
Thanks a lot for your help; The issue was with the way I represented the hexadecimal in python; I used the following function to convert "0123456789abcdef" to hex representation as Keith mentioned and it worked:
def HexToByte( hexStr ):
"""
Convert a string hex byte values into a byte string. The Hex Byte values may
or may not be space separated.
"""
# The list comprehension implementation is fractionally slower in this case
#
# hexStr = ''.join( hexStr.split(" ") )
# return ''.join( ["%c" % chr( int ( hexStr[i:i+2],16 ) ) \
# for i in range(0, len( hexStr ), 2) ] )
bytes = []
hexStr = ''.join( hexStr.split(" ") )
for i in range(0, len(hexStr), 2):
bytes.append( chr( int (hexStr[i:i+2], 16 ) ) )
return ''.join( bytes )
Thanks a lot
The problem here is in your source (plaintext) string. You have each character expanded to two bytes, instead of one byte. The Java program will take the input "0123456789abcdef", and use internally the hex string of that. Using pycrypto and a properly encoded plaintext I get this.
Python2> from Crypto.Cipher import DES
Python2> key
'\xfe\x9b\xd5\x16\xcd\x15\xc8I'
Python2> pw
'\x01#Eg\x89\xab\xcd\xef'
Python2> eng = DES.new(key, DES.MODE_ECB, "\0"*8)
Python2> hexdigest(eng.encrypt(pw))
'c337cd5cbd44fc97'
Which you can see is the same as the Java code.
Are you sure you need to use DES-EDE-ECB?
EDE means that you're actually using Triple DES: you run DES three times (with three different keys), and EDE means that you encrypt-decrypt-encrypt (each time with a different key).
But it sounds like you should just be using plain DES ('des_ecb').

Categories