python calculate signed crc32 integer to checksum - python

I have the following string I would like to calculate a checksum for.
3556.5:200:3557.0:2:3556.4:84:3557.4:4:3555.7:6:3557.7:14:3555.1:46:3558.6:21:3552.9:14:3558.7:10:3552.8:194:3558.8:106:3552.7:10:3558.9:10:3552.6:25:3560.2:178:3552.5:4:3560.5:111:3551.7:1:3561.7:1:3551.6:65:3562.5:18:3551.0:103:3562.6:111:3550.7:3:3562.7:3:3550.6:4:3562.8:185:3550.5:1:3563.7:1:3550.3:84:3564.2:1:3550.2:156:3564.8:153:3550.0:82:3565.0:400:3549.7:1:3565.9:60:3548.4:104:3566.1:20:3547.2:177:3566.5:40:3545.9:1:3568.0:20:3545.1:11:3569.4:12:3545.0:71:3570.0:82:3544.9:1:3570.6:4
I do it as follows:
import zlib
string2 = string.encode('ascii')
checksum = zlib.crc32(string2)
This gives me an integer of 3467096777. However, the server provider says it should be -949017128. I also tried many variants of the string and always ended up with a positive number, which makes me suspect that my way of calculating a signed crc32 integer is wrong.
I converted the -949017128 via the following
checksum_server = -949017128 & 0xffffffff
it yields 3345950168, which is still different from mine.
Is there a way to calculate the string out of the signed crc32 integer -949017128?

I think it is the BTC's price at okex. What a good time!
zlib.crc32 returns an unsigned number, which differs from the signed number described in the API documentation. If the server's check_sum is positive, the two values should be equal; if the server's check_sum is negative, you need a check like the one below (there is probably a better solution):
check_sum = zlib.crc32(checksum_str.encode("utf-8"))
if server_check_sum < 0 and 2 ** 32 - check_sum + server_check_sum == 0 or server_check_sum == check_sum:
    print(f"{instrument_id}: checksum successful")
You must ensure the string is correctly formatted, with no extra "0" added when you do the type conversion.
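The same comparison can be made by first converting zlib's unsigned result to a signed 32-bit value. This is a minimal sketch, assuming the server reports the checksum as a two's complement 32-bit integer; the helper names `to_signed32` and `signed_crc32` are made up for illustration.

```python
import zlib

def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit integer as signed (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

def signed_crc32(text: str) -> int:
    """CRC32 of the UTF-8 encoded text, as a signed 32-bit integer."""
    return to_signed32(zlib.crc32(text.encode("utf-8")))

# The unsigned value computed in the question maps back to the server's number.
print(to_signed32(3345950168))  # -949017128
```

If `signed_crc32(checksum_str)` still differs from the server's value, the likely cause is that the input string is not byte-for-byte what the server hashed.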

Related

Reversing byte order of negative integer with Python 3

After extracting some 32-bit sign bit value and keeping that same 32-bit representation sign-extended, I now have to reverse the byte order of the value (I need to follow a precise imposed workflow).
Here is what I previously did:
initially I have the value "11101101111110100111001110011010"
I converted that chain to int : I get 3992613786
I extracted the 32-bit sign bit (sign-extended) : I get -302353510
Now, I have to reverse the byte order of that last value (I am supposed to get -1703675155 in the end).
Does anyone know how to reverse the byte order of a negative sign-extended value with Python 3?
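There are probably better ways, but this seems to work:
from struct import pack
x = "11101101111110100111001110011010"
# Interpret the bit string as a signed 32-bit value (two's complement)
n = ((~int(x, 2) + 1) & 0xffffffff) * -1 if x[0] == '1' else int(x, 2)
# Pack as big-endian, then re-read the same bytes as little-endian
print(int.from_bytes(pack('!i', n), 'little', signed=True))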
Output:
-1703675155
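A shorter route, sketched here with only the built-in `int.to_bytes` and `int.from_bytes`, skips the intermediate signed value: write the 32-bit pattern out as four big-endian bytes, then read those bytes back as a little-endian signed integer. The function name `reverse_bytes_signed32` is made up for this example.

```python
def reverse_bytes_signed32(bits: str) -> int:
    """Reverse the byte order of a 32-bit pattern and read it as signed."""
    raw = int(bits, 2).to_bytes(4, "big")              # the four bytes as written
    return int.from_bytes(raw, "little", signed=True)  # same bytes, opposite order

print(reverse_bytes_signed32("11101101111110100111001110011010"))  # -1703675155
```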

Verifying CRC32 of UDP given .jpg file of payload

I'm running a server that receives UDP packets that contain a 2 byte CRC32 polynomial and a variable number of XOR'd DWORDs corresponding to a .jpg file. The packets also contain the index of the corresponding DWORD in the .jpg file for each DWORD in the packet. I am also given the actual .jpg file.
For example, the packet could contain 10 DWORDs and specify the starting index as 3, so we can expect the received DWORDs to correspond with the 4th through 13th DWORDs making up the .jpg.
I want to verify the integrity of each of the DWORDs by comparing their CRC32 values against the CRC32 values of the corresponding DWORDs in the .jpg.
I thought that the proper way to do this would be to divide each DWORD in the packet and its corresponding DWORD in the .jpg by the provided CRC polynomial and analyze the remainder. If the remainders are the same after doing these divisions, then there is no problem with the packet. However, even with packets that are guaranteed to be correct, these remainders are never equal.
Here is how I'm reading the bytes of the actual .jpg and splitting them up into DWORDs:
def split(data):
    # Split the .jpg data into DWORDs
    chunks = []
    for i in range(0, len(data), 4):
        chunks.append(data[i: i + 4])
    return chunks

def get_image_bytes():
    with open("dog.jpg", "rb") as image:
        f = image.read()
    jpg_bytes = split(f)
    return jpg_bytes
Now I have verified my split() function works and to my knowledge, get_image_bytes() reads the .jpg correctly by calling image.read().
After receiving a packet, I convert each DWORD to binary and perform the mod 2 division like so:
jpg_bytes = get_image_bytes()
crc_key_bin = '1000110111100' # binary representation of the received CRC32 polynomial
d_words = [b'\xc3\xd4)v', ... , b'a4\x96\xbb']
iteration = 0 # For simplicity, assume the packet specified that the starting index is 0
for d in d_words:
    d_bin = format(int(d.hex(), 16), "b") # binary representation of the DWORD from the packet
    jpg_dword = format(int(jpg_bytes[iteration].hex(), 16), "b") # binary representation of the corresponding DWORD in dog.jpg
    remainder1 = mod2div(d_bin, crc_key_bin) # <--- These remainders should be
    remainder2 = mod2div(jpg_dword, crc_key_bin) # <--- equal, but they're not!
    iteration += 1
I have tested the mod2div() function, and it returns the expected remainder after performing mod 2 division.
Where am I going wrong? I'm expecting the 2 remainders to be equal, but they never are. I'm not sure if the way I'm reading the bytes from the .jpg file is incorrect, if I'm performing the mod 2 division with the wrong values, or if I'm completely misunderstanding how to verify the CRC32 values. I'd appreciate any help.
First off, there's no such thing as a "2 byte CRC32 polynomial". A 32-bit CRC needs 32 bits to specify the polynomial.
Second, a CRC polynomial is something that is fixed for a given protocol. Why is a CRC polynomial being transmitted, as opposed to simply specified? Are you sure it's the polynomial? Where is this all documented?
What does "XOR'd DWORDs" mean? Exclusive-or'd with what?
And, yes, I think you are completely misunderstanding how to verify CRC values. All you need to do is calculate the check values on the message the same way it was done at the other end, and compare that to the check values that were transmitted. (That is true for any check value, not just CRCs.) However I cannot tell from your description what was calculated on what, or how.
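The "calculate it the same way and compare with what was transmitted" idea can be sketched as follows. Everything about the packet layout here is an assumption for illustration only: the check value is taken to be the standard CRC-32 from `zlib.crc32`, appended to the payload as 4 big-endian bytes, and `make_packet` and `verify_packet` are hypothetical helpers. The real protocol in the question may define something quite different.

```python
import zlib

def make_packet(payload: bytes) -> bytes:
    """Sender side (assumed layout): payload followed by its CRC-32, big-endian."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def verify_packet(packet: bytes) -> bool:
    """Receiver side: recompute the CRC and compare with the transmitted one."""
    payload, transmitted = packet[:-4], packet[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == transmitted

good = make_packet(b"\xc3\xd4)v")
bad = bytes([good[0] ^ 0x01]) + good[1:]  # flip one payload bit "in transit"
print(verify_packet(good), verify_packet(bad))  # True False
```

Note that the receiver never divides anything by hand; it only repeats the sender's calculation and compares the two check values.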

Converting bytes to signed numbers in Python

I am faced with a problem in Python and I think I don't understand how signed numbers are handled in Python. My logic works in Java, where everything is signed, so I need some help in Python.
I have some bytes that are coded in hex and I need to decode them and interpret them as numbers. The protocol is defined.
Say the input may look like:
raw = '016402570389FFCF008F1205DB2206CA'
And I decode like this:
import binascii
bin_bytes = binascii.a2b_hex(raw)
lsb = bin_bytes[5] & 0xff
msb = bin_bytes[6] << 8
aNumber = int(lsb | msb)
print(" X: " + str(aNumber / 4000.0))
After dividing by 4000.0, X can be in a range of -0.000025 to +0.25.
This logic works when X is in the positive range. When X is expected to be negative, I am getting back a positive number.
I think I am not handling "msb" correctly when it is a signed number.
How should I handle negative signed numbers in Python?
Any tips much appreciated.
You can use Python's struct module to convert the byte string to integers. It takes care of endianness and sign extension for you. I guess you are trying to interpret this 16-byte string as 8 2-byte signed integers, in big-endian byte order. The format string for this is '>8h'. The > character tells Python to interpret the string as big-endian, 8 means 8 of the following data type, and h means signed short integers.
import struct
nums = struct.unpack('>8h', bin_bytes)
Now nums is a tuple of integers that you can process further.
I'm not quite sure if your data is little or big endian. If it is little-endian, you can use < to indicate that in the struct.unpack format string.
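Applied to the sample input from the question, the two readings look like this. The sketch uses `bytes.fromhex` in place of `binascii.a2b_hex`, and the second part assumes, as the question's own code does, that byte 5 is the low byte and byte 6 the high byte of the value.

```python
import struct

raw = '016402570389FFCF008F1205DB2206CA'
bin_bytes = bytes.fromhex(raw)

# Eight big-endian signed 16-bit integers
nums = struct.unpack('>8h', bin_bytes)
print(nums)  # (356, 599, 905, -49, 143, 4613, -9438, 1738)

# Bytes 5 and 6 combined as in the question (byte 5 low, byte 6 high), read as signed
a_number = int.from_bytes(bin_bytes[5:7], 'little', signed=True)
print(a_number)  # -119
```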

Understanding Two's complement to float(Texas Instruments Sensor Tag)

I found some sample code to extract temperature from the Texas Instruments Sensor Tag on github:
https://github.com/msaunby/ble-sensor-pi/blob/master/sensortag/sensor_calcs.py
I don't understand what the following code does:
tosigned = lambda n: float(n-0x10000) if n>0x7fff else float(n)
How I read the above piece of code:
if n > 0x7fff: n = float(n - 0x10000)
else: n = float(n)
Basically what is happening is that the two's complement value (n) is converted to float. Why should the subtraction only happen when the value of n is greater than 0x7fff? If the value is 0x7fff or smaller, then we just convert n to float. Why? I don't understand this.
The sample code from Texas Instruments can be found here:
http://processors.wiki.ti.com/index.php/SensorTag_User_Guide#SensorTag_Android_Development
Why is the return value divided by 128.0 in this function in the TI sample code?
private double extractAmbientTemperature(BluetoothGattCharacteristic c) {
int offset = 2;
return shortUnsignedAtOffset(c, offset) / 128.0;
}
I did ask this to the developer, but didn't get a reply.
On disk and in memory, integers are stored with a certain bit width. Modern Python's ints allow us to ignore most of that detail because they can magically expand to whatever size is necessary, but sometimes when we get values from disk or other systems we have to think about how they are actually stored.
The positive values of a 16-bit signed integer will be stored in the range 0x0001-0x7fff, and its negative values from 0x8000-0xffff. If this value was read in some way that didn't already check the sign bit (perhaps as an unsigned integer, or part of a longer integer, or assembled from two bytes) then we need to recover the sign.
How? Well, if the value is over 0x7fff we know that it should be negative, and negative values are stored as two's complement. So we simply subtract 0x10000 from it and we get the negative value.
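The same rule generalizes to any bit width. Here is a small sketch; the function name `to_signed` and its `bits` parameter are illustrative and do not come from the TI or GitHub code.

```python
def to_signed(n: int, bits: int = 16) -> int:
    """Interpret the low `bits` bits of n as a two's complement integer."""
    n &= (1 << bits) - 1                          # keep only the stored bits
    return n - (1 << bits) if n >> (bits - 1) else n  # sign bit set: subtract 2**bits

for raw in (0x0001, 0x7fff, 0x8000, 0xffff):
    print(hex(raw), to_signed(raw))
# 0x1 1
# 0x7fff 32767
# 0x8000 -32768
# 0xffff -1
```

For 16 bits, `1 << bits` is 0x10000 and the sign-bit test is equivalent to `n > 0x7fff`, which matches the lambda in the question.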
So you're converting a 16-bit two's complement value to a float. In Python, negative floats are displayed with a minus sign, so you can ignore the way they are actually represented in memory. But in the raw 16-bit value, the sign is encoded as part of the value itself. So, to convert correctly, the 0x10000 offset is subtracted.
You can play with this yourself using the Python interpreter:
tosigned = lambda n: float(n-0x10000) if n>0x7fff else float(n)
print(tosigned(0x3fff))
versus:
unsigned = lambda n: float(n)
Check this out to learn more:
http://www.swarthmore.edu/NatSci/echeeve1/Ref/BinaryMath/NumSys.html

Need Assistance in Calculating Checksum

I am working on an interface in Python to a home automation system (ElkM1). I have sample code in C# below which apparently correctly calculates the checksum needed when sending messages to this system. I put together the python code below but it doesn't appear to be returning the correct value.
According to the documentation the checksum of the message needs to be the sum of the ASCII values of the message in mod256 then taken as 2s complement. From their manual: "This is the hexadecimal two's complement of the modulo-256 sum of the ASCII values of all characters in the message excluding the checksum itself and the CR-LF terminator at the end of the message. Permissible characters are ASCII 0-9 and upper case A-F. When all the characters are added to the Checksum, the value should equal 0."
The vendor has a tool which will calculate the correct checksum. As test data I have been using '00300005000' which should return a checksum of 74
My code returns 18
Thanks in advance.
My Code (Python)
def calc_checksum (string):
    '''
    Calculates checksum for sending commands to the ELKM1.
    Sums the ASCII character values mod256 and takes
    the Twos complement
    '''
    sum= 0
    for i in range(len(string)) :
        sum = sum + ord(string[i])
    temp = sum % 256 #mod256
    rem = temp ^ 256 #inverse
    cc1 = hex(rem)
    cc = cc1.upper()
    p=len(cc)
    return cc[p-2:p]
Their Code C#:
private string checksum(string s)
{
    int sum = 0;
    foreach (char c in s)
        sum += (int)c;
    sum = -(sum % 256);
    return ((byte)sum).ToString("X2");
}
FWIW, here's a literal translation of the C# code into Python:
def calc_checksum(s):
    sum = 0
    for c in s:
        sum += ord(c)
    sum = -(sum % 256)
    # '%02X' zero-pads to two hex digits, like ToString("X2") in the C# code
    return '%02X' % (sum & 0xFF)

print(calc_checksum('00300005000'))
It outputs E8 for the message shown, which is different from both your result and the value you expect. Given the description in the manual and doing the calculations by hand, I don't see how their answer could be 74. How do you know that's the correct answer?
After seeing Mark Ransom's comment that the C# code does indeed return E8, I spent some time debugging your Python code and found out why it doesn't produce the same result. One problem is that it doesn't calculate the two's complement correctly on the line with the comment #inverse in your code. There are at least a couple of ways to do that correctly.
A second problem is that the way the hex() function handles negative numbers is not what you might expect. With the -24 two's complement in this case, it produces -0x18, not 0xffe8 or something similar. This means that just taking the last two characters of the uppercased result would be incorrect. A really easy way to get the right result is to convert the lower byte of the value to uppercase hexadecimal using the % string interpolation operator. Here's a working version of your function:
def calc_checksum(string):
    '''
    Calculates checksum for sending commands to the ELKM1.
    Sums the ASCII character values mod256 and takes
    the Twos complement.
    '''
    sum = 0
    for i in range(len(string)):
        sum = sum + ord(string[i])
    temp = sum % 256 # mod256
    # rem = (temp ^ 0xFF) + 1 # two's complement, hard way (one's complement + 1)
    rem = -temp # two's complement, easier way
    return '%02X' % (rem & 0xFF) # zero-padded to two hex digits
A more Pythonic (and faster) implementation would be a one-liner like this, which makes use of the built-in sum() function:
def calc_checksum(s):
    """
    Calculates checksum for sending commands to the ELKM1.
    Sums the ASCII character values mod256 and returns
    the lower byte of the two's complement of that value.
    """
    return '%02X' % (-(sum(ord(c) for c in s) % 256) & 0xFF)
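The manual's statement that the characters plus the checksum should come to 0 gives a quick self-check. The sketch below assumes that "added to the Checksum" means adding the checksum's numeric value to the character sum, modulo 256; `verify_checksum` is a made-up helper, not part of any ElkM1 library.

```python
def calc_checksum(s: str) -> str:
    """Two's complement of the modulo-256 sum of character codes, as 2 hex digits."""
    return '%02X' % (-sum(ord(c) for c in s) & 0xFF)

def verify_checksum(message: str, checksum: str) -> bool:
    """True if the character sum plus the checksum value is 0 modulo 256."""
    return (sum(ord(c) for c in message) + int(checksum, 16)) % 256 == 0

cs = calc_checksum('00300005000')
print(cs, verify_checksum('00300005000', cs))  # E8 True
```

Under the same assumption, `verify_checksum('00300005000', '74')` returns False, which is consistent with the doubt raised above about 74 being the expected value for this exact string.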
