Custom CRC32 calculation in Python without libs

I have been looking for simple Python code that can generate a CRC32 sum. It is for an STM32 and I can't find a good example that is adjustable.
To get the right settings for my calculation I used the following site:
http://www.sunshine2k.de/coding/javascript/crc/crc_js.html
The settings are the following:
Polynomial: 0x4C11DB7
Initial Value: 0xFFFFFFFF
Final XOR Value: none (0x00); the input and the result are not reflected.
Does someone know where I could get a simple adjustable algorithm, or where I can learn how to write one?
Edit:
I use this function to create the table
def create_table():
    a = []
    for i in range(256):
        k = i
        for j in range(8):
            if k & 1:
                k ^= 0x4C11DB7
            k >>= 1
        a.append(k)
    return a
and the following for generating the crc-sum
def crc32(bytestream):
    crc_table = create_table()
    crc32 = 0xffffffff
    for byte in range( int(len(bytestream)) ):
        lookup_index = (crc32 ^ byte) & 0xff
        crc32 = (crc32 >> 8) ^ crc_table[lookup_index]
    return crc32
and call the function with this
print(hex(crc32(b"1205")))
the result is: 0x9f8e7b8c
but the website gives me: 0xA7D10A0A
Can someone help me?

First off, what you have is for a reflected CRC, not a non-reflected CRC. Though there is an error in your table construction. This:
if k & 1:
    k ^= 0x4C11DB7
k >>= 1
is wrong. The exclusive-or must be done after the shift. So it would need to be (for the reflected case):
k = (k >> 1) ^ 0xedb88320 if k & 1 else k >> 1
Note that the polynomial also needs to be reflected in this case.
Another error in your code is using range to make the integers 0, 1, ..., and using those instead of the actual data bytes to compute the CRC on! What you want for your for loop is simply:
for byte in bytestream:
The whole point of using a table is to make the CRC calculation faster. You don't want to regenerate the table every time you do a CRC. You want to generate the table once when your program starts, and then use it multiple times. Or you can generate the table separately from your program, and then put the table itself in your program. That's what's usually done.
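For completeness, here is a minimal sketch of the reflected case described above (the standard zlib/PKZIP CRC-32, which additionally applies a final xor of 0xFFFFFFFF, unlike your spec); the names are mine, the table is built once at module level as just described, and the result should agree with binascii.crc32:
def create_reflected_table():
    table = []
    for i in range(256):
        k = i
        for _ in range(8):
            k = (k >> 1) ^ 0xEDB88320 if k & 1 else k >> 1
        table.append(k)
    return table

REFLECTED_TABLE = create_reflected_table()

def crc32_reflected(data):
    crc = 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ REFLECTED_TABLE[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF  # final xor, part of the standard CRC-32 definition

print(hex(crc32_reflected(b'123456789')))  # 0xcbf43926, the standard CRC-32 check value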
Anyway, to do the non-reflected case, you need to flip things around. So to make the table:
def create_table():
    a = []
    for i in range(256):
        k = i << 24
        for _ in range(8):
            k = (k << 1) ^ 0x4c11db7 if k & 0x80000000 else k << 1
        a.append(k & 0xffffffff)
    return a
To use the table:
def crc32(bytestream):
    crc_table = create_table()
    crc = 0xffffffff
    for byte in bytestream:
        lookup_index = ((crc >> 24) ^ byte) & 0xff
        crc = ((crc & 0xffffff) << 8) ^ crc_table[lookup_index]
    return crc
Now it correctly implements your specification, which happens to be the MPEG-2 32-bit CRC specification (from Greg Cook's CRC catalogue):
width=32 poly=0x04c11db7 init=0xffffffff refin=false refout=false xorout=0x00000000 check=0x0376e6e7 residue=0x00000000 name="CRC-32/MPEG-2"
For the code above, if I do:
print(hex(crc32(b'123456789')))
I get 0x376e6e7, which matches the check value in the catalog.
Again, you need to take the create_table() out of the crc32() routine and do it somewhere else, once.
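As a rough sketch of that arrangement (the constant and function names here are mine), the table is built once at import time and then reused:
CRC_TABLE = create_table()   # built once, at module level

def crc32_mpeg2(bytestream):
    crc = 0xFFFFFFFF
    for byte in bytestream:
        lookup_index = ((crc >> 24) ^ byte) & 0xFF
        crc = ((crc & 0xFFFFFF) << 8) ^ CRC_TABLE[lookup_index]
    return crc

print(hex(crc32_mpeg2(b'123456789')))  # 0x376e6e7
# crc32_mpeg2(b'1205') should now also match the 0xa7d10a0a reported by the online calculator.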

Related

Most Significant Byte Calculation

I am trying to implement a larger cipher problem, and I am running into an issue I don't quite understand when taking the Most Significant Byte (not bit).
To turn an int into a byte I am using:
def binary(i):
    if i == 0:
        return "0"
    s = ''
    while i:
        if i & 1 == 1:
            s = "1" + s
        else:
            s = "0" + s
        i >>= 1
    return s
I am pretty sure the above is correct, it works with my test numbers. To then extract the Most Significant Byte I am using:
def msb(i):
    a = binary(i)
    b = a[0:7]
    c = int(b, 2)
    return c
However, this seems to return a number half what I would expect. Am I wrong in thinking you can get the most significant byte by just taking the first 8 bits, or am I missing something else silly?
Your example code only gets the seven leading bits, not 8:
def msb(i):
    a = binary(i)
    b = a[0:7]  # gets first SEVEN characters of string a
    c = int(b, 2)
    return c
Change it to a[0:8] to extract 8 leading characters/bits rather than 7.
There are much easier ways to do this. For example, if you want the top eight bits (ignoring byte alignment), you can do:
def msb(val):
    return val >> (val.bit_length() - 8)
For the most significant aligned byte, in Python 3 you can do:
def msb(val):
    return val.to_bytes((val.bit_length() + 7) // 8, 'big')[0]
In Py2, you'd have to convert to a hex string and back to match the to_bytes approach.
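For example (my own test value), with a number whose bit length is not a multiple of eight, the two versions pick different bytes:
val = 0x12345678  # val.bit_length() == 29
print(hex(val >> (val.bit_length() - 8)))                        # 0x91, top 8 bits, ignoring alignment
print(hex(val.to_bytes((val.bit_length() + 7) // 8, 'big')[0]))  # 0x12, most significant aligned byte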
A byte is 0xFF. You can get the most significant (leftmost) byte by doing
i & (0xFF << (8 * (n_bytes(i) - 1)))
(assuming an n_bytes() helper that returns how many bytes i occupies). I always get most significant and least significant confused; if you want the rightmost byte it's even easier:
i & 0xFF
I think that's right at least ... I'm not sure whether it is guaranteed to return the number of bytes or not ...
Based on your example I think the second code is what you want.
You could also do something like
import struct
s = struct.pack("i", i)
ord(s[0])   # leftmost
ord(s[-1])  # rightmost
If you want aligned bytes, this should work at least from Python 2.7 onwards (int.bit_length() is not available earlier):
def msb(val):
    return 0 if val == 0 else val >> (((val.bit_length() - 1) >> 3) << 3)
Or, if you prefer it more readable:
def msb(val):
    if val == 0:
        return 0
    else:
        return val >> (((val.bit_length() - 1) // 8) * 8)
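A quick check of the aligned version (my own example values):
print(hex(msb(0x12345678)))  # 0x12
print(msb(0))                # 0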

Incorrect LRC value calculated from checksum

I'm trying to calculate an LRC (Longitudinal Redundancy Check) value with Python.
My Python code is pulled from other posts on StackOverflow. It looks like this:
lrc = 0
for b in message:
    lrc ^= b
print lrc
If I plug in the value '\x02\x47\x30\x30\x03', I get an LRC value of 70 or 0x46 (F)
However, I am expecting a value of 68 - 0x44 (D) instead.
I have calculated the correct LRC value via C# code:
byte LRC = 0;
for (int i = 1; i < bytes.Length; i++)
{
    LRC ^= bytes[i];
}
return LRC;
If I plug in the same byte array values, I get the expected result of 0x44.
Functionally, the code looks very similar. So I'm wondering what the difference is between the code. Is it my input value? Should I format my string differently?
Arrays are zero-indexed in C#, so by starting the iteration from int i = 1; you are skipping the first byte.
The Python result is the correct one.
Fixed reference code:
byte LRC = 0;
for (int i = 0; i < bytes.Length; i++)
{
    LRC ^= bytes[i];
}
return LRC;
To avoid such mistakes you should consider using the foreach syntactic sugar (although I'm not familiar with C# practices).
/edit
To skip first byte in Python simply use slice syntax:
lrc = 0
for b in message[1:]:
    lrc ^= b
print lrc
So I figured out the answer to my question. Thanks to Nsh for his insight. I found a way to make the algorithm work. I just had to skip the first byte in the for-loop. There's probably a better way to do this but it was quick and it's readable.
def calcLRC(input):
    input = input.decode('hex')
    lrc = 0
    i = 0
    message = bytearray(input)
    for b in message:
        if i == 0:
            pass
        else:
            lrc ^= b
        i += 1
    return lrc
It now returns the expected 0x44 in my use case.
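If you prefer it more compact, here is a functools.reduce sketch (my own function name) that skips the first byte the same way, assuming the message is already bytes rather than a hex string:
from functools import reduce

def calcLRC2(message):
    # XOR every byte except the first one
    return reduce(lambda acc, b: acc ^ b, bytearray(message)[1:], 0)

print(hex(calcLRC2(b'\x02\x47\x30\x30\x03')))  # 0x44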

read single bit operation python 2.6

I am trying to read a single bit in a binary string but can't seem to get it to work properly. I read in a value and then convert it to a 32-bit string. From there I need to read a specific bit in the string, but it's not always the same one. The getBin function returns a 32-bit string with leading 0's. The code I have always returns a 1, even if the bit is a 0. Code example:
slot=195035377
getBin = lambda x, n: x >= 0 and str(bin(x))[2:].zfill(n) or "-" + str(bin(x))[3:].zfill(n)
bits = getBin(slot,32)
bit = (bits and (1 * (2 ** y)) != 0)
print("bit: %i\n"%(bit))
In this example bits = 00001011101000000000000011110011,
and if I am looking for bit 3, which is a 0, bit will be equal to 1. Any ideas?
To test for specific bits in an integer value, use the & bitwise operator; there is no need to convert it to a binary string.
if slot & (1 << 2):
    print 'bit 3 is set'
else:
    print 'bit 3 is not set'
The above code shifts a test bit to the left twice. Alternatively, shift slot to the right two times:
if (slot >> 2) & 1:
To make this generic for any bit position, subtract 1:
if slot & (1 << (bitpos - 1)):
    print 'bit {} is set'.format(bitpos)
or
if (slot >> (bitpos - 1)) & 1:
Your binary formatting code is overly verbose. Just use the format() function to create a binary string representation:
format(slot, '032b')
formats your binary value to a 0-padded 32-character binary string.
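Wrapped up as a tiny helper (my own naming), with a 1-based bit position counted from the least significant bit:
def get_bit(value, bitpos):
    return (value >> (bitpos - 1)) & 1

slot = 195035377
print(get_bit(slot, 3))  # 0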
n = 223
bitpos = 3
bit3 = (n >> (bitpos - 1)) & 1
is how you should be doing it ... don't use strings!
You can just use slicing to get the correct digit.
bits = getBin(slot, 32)
bit = bits[bit_location - 1:bit_location]  # assumes bit_location is 1-based, counted from the left of the string
print("bit: %s\n" % bit)

Convert string to big endian, index out of range

I'm trying to convert a string to a big endian but due to my lack of experience with bit shifting etc, I've got stuck with the following so far:
def my_func(self, b):
    a = [(len(b)+3) >> 2]
    for i, val in enumerate(b):
        a[i>>2] |= ord(b[i]) << (24-(i & 3)*8)
    return a
The above raises the error
a[i>>2] |= ord(b[i]) << (24-(i & 3)*8)
IndexError: list index out of range
and it also never gets further through the loop than index 4. The error message points at the a[] list.
Can anyone see what I'm doing wrong here? I'm porting this from JavaScript so that may be the issue (link to that http://pastebin.com/GKE3AeCm )
Without resorting to other methods, your code just needs to be adjusted more carefully from the JavaScript version. In JavaScript you are creating an Array of a certain length, but in your Python code you always create a list of size 1. Here it is corrected:
def my_func(b):
    a = [0] * ((len(b)+3) >> 2)
    for i, val in enumerate(b):
        a[i>>2] |= ord(b[i]) << (24-(i & 3)*8)
    return a
So what you are doing is considering sequences of 4 objects as being raw bytes and unpacking them to build an integer. Using struct, the correct way would be to be explicit about your data being bytes and passing it as such:
import struct

def my_func2(data):
    lb = len(data)
    if lb % 4:
        data += b'\x00' * (4 - (lb % 4))
    a = [struct.unpack('>i', data[i:i+4])[0] for i in range(0, lb, 4)]
    return a

print(my_func2(b'pass123'))
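On Python 3 only, an int.from_bytes sketch (my own function name) does the same thing, with the same zero-padding of the final chunk:
def my_func3(data):
    if len(data) % 4:
        data += b'\x00' * (4 - len(data) % 4)
    return [int.from_bytes(data[i:i + 4], 'big', signed=True)
            for i in range(0, len(data), 4)]

print(my_func3(b'pass123'))  # same values as my_func2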

How to get the signed integer value of a long in python?

If lv stores a long value, and the machine is 32 bits, the following code:
iv = int(lv & 0xffffffff)
results in an iv of type long, instead of the machine's int.
How can I get the (signed) int value in this case?
import ctypes
number = lv & 0xFFFFFFFF
signed_number = ctypes.c_long(number).value
You're working in a high-level scripting language; by nature, the native data types of the system you're running on aren't visible. You can't cast to a native signed int with code like this.
If you know that you want the value converted to a 32-bit signed integer--regardless of the platform--you can just do the conversion with the simple math:
iv = 0xDEADBEEF
if iv & 0x80000000:
    iv = -0x100000000 + iv
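Wrapped into a small reusable helper (the name and the example values are mine):
def to_int32(value):
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value -= 0x100000000
    return value

print(to_int32(0xDEADBEEF))  # -559038737
print(to_int32(0x7FFFFFFF))  # 2147483647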
Essentially, the problem is to sign extend from 32 bits to... an infinite number of bits, because Python has arbitrarily large integers. Normally, sign extension is done automatically by CPU instructions when casting, so it's interesting that this is harder in Python than it would be in, say, C.
By playing around, I found something similar to BreizhGatch's function, but that doesn't require a conditional statement. n & 0x80000000 extracts the 32-bit sign bit; then, the - keeps the same 32-bit representation but sign-extends it; finally, the extended sign bits are set on n.
def toSigned32(n):
    n = n & 0xffffffff
    return n | (-(n & 0x80000000))
Bit Twiddling Hacks suggests another solution that perhaps works more generally. n ^ 0x80000000 flips the 32-bit sign bit; then - 0x80000000 will sign-extend the opposite bit. Another way to think about it is that initially, negative numbers are above positive numbers (separated by 0x80000000); the ^ swaps their positions; then the - shifts negative numbers to below 0.
def toSigned32(n):
    n = n & 0xffffffff
    return (n ^ 0x80000000) - 0x80000000
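Either variant can be sanity-checked against the 32-bit boundary values:
print(toSigned32(0xFFFFFFFF))  # -1
print(toSigned32(0x80000000))  # -2147483648
print(toSigned32(0x7FFFFFFF))  # 2147483647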
Can I suggest this:
def getSignedNumber(number, bitLength):
    mask = (2 ** bitLength) - 1
    if number & (1 << (bitLength - 1)):
        return number | ~mask
    else:
        return number & mask

print iv, '->', getSignedNumber(iv, 32)
You may use struct library to convert values like that. It's ugly, but works:
from struct import pack, unpack
signed = unpack('=l', pack('=L', lv & 0xffffffff))[0]
A quick and dirty solution (x is never greater than 32-bit in my case).
if x > 0x7fffffff:
    x = x - 4294967296
If you know how many bits are in the original value, e.g. byte or multibyte values from an I2C sensor, then you can do the standard Two's Complement conversion:
def TwosComp8(n):
    return n - 0x100 if n & 0x80 else n

def TwosComp16(n):
    return n - 0x10000 if n & 0x8000 else n

def TwosComp32(n):
    return n - 0x100000000 if n & 0x80000000 else n
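Quick checks (my own example inputs):
print(TwosComp8(0xFF))         # -1
print(TwosComp16(0x7FFF))      # 32767
print(TwosComp32(0x80000000))  # -2147483648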
In case the hexadecimal representation of the number is 4 bytes, this would solve the problem.
def B2T_32(x):
    num = int(x, 16)
    if num & 0x80000000:  # if it has the negative sign bit (MSB = 1)
        num -= 0x80000000 * 2
    return num

print(B2T_32(input("enter an input as a hex value\n")))
Simplest solution for any bit length of number
Why is the representation of a signed integer so difficult for the human mind to understand? Because this is the machines' idea. :-)
Let me explain.
If we have a bi-directional 7-bit counter with the initial state
000 0000
and we get a pulse on the count-down input, then the next count will be
111 1111
And the people said:
Hey, counter, we need to know that this is a negative roll-over. You should add a sign bit letting us know about this.
And the counter added it:
1111 1111
And the people asked:
How are we going to work out that this is -1?
The counter replied: find the number one greater than the reading, subtract it from the reading, and you get the result.
    1111 1111
- 1 0000 0000
-------------
      -1 (dec)
def sigIntFromHex(a):  # e.g. a = 0xffe1
    if a & (1 << (a.bit_length() - 1)):  # check whether the highest bit is 1 via & with 0x8000
        return a - (1 << a.bit_length())  # 0xffe1 - 0x10000
    else:
        return a

# and more elegant:
def sigIntFromHex(a):
    return a - (1 << a.bit_length()) if a & (1 << (a.bit_length() - 1)) else a

b = 0xFFE1
print(sigIntFromHex(b))
I hope I helped
