CRC value calculation - python

I'm working on a communication command protocol between a PLC and a 3rd party device.
The manufacturer has provided me with the following information for calculating the CRC values that will change depending on the address of the device I wish to read information from.
A CRC is performed on a block of data, for example the first seven bytes of all transmissions are followed by a two byte CRC for that data. This CRC will be virtually unique for that particular combination of bytes. The process of calculating the CRC follows:
Inputs:
N.BYTES = Number of data bytes to CRC ( maximum 64 bytes )
DATA() = An array of data bytes of quantity N.BYTES
CRC.MASK = 0xC9DA a hexadecimal constant used in the process
Outputs:
CRC = two byte code redundancy check made up of CRC1 (High byte) and CRC2 (Low byte)
Process:
START
    CRC = 0xFFFF
    FOR N = 1 TO N.BYTES
        CRC = CRC XOR ( DATA(N) AND 0xFF )
        FOR I = 1 TO 8
            IF ( CRC AND 0x0001 ) = 0 THEN LSR CRC
            ELSE LSR CRC ; CRC = CRC XOR CRC.MASK
        NEXT I
    NEXT N
    X = CRC1 ; Change the two bytes in CRC around
    CRC1 = CRC2
    CRC2 = X
END
They also provided me with a couple of complete command strings for the first few device addresses.
RTU #1
05-64-00-02-10-01-00-6C-4B-53-45-EB-F7
RTU #2
05-64-00-02-10-02-00-1C-AE-53-45-EB-F7
RTU #3
05-64-00-02-10-03-00-CC-F2-53-45-EB-F7
The header CRC bytes in the previous three commands are 6C-4B, 1C-AE, and CC-F2 respectively.
I calculated the first few lines by hand to have something to compare against when I wrote the following code in Python.
byte1 = 05
byte2 = 100
byte3 = 00
byte4 = 02
byte5 = 16
byte6 = 01
byte7 = 00
byte8 = 00
mask = 51674
hexarray = [byte1, byte2, byte3, byte4, byte5, byte6, byte7, byte8]
#print hexarray
CRCdata = 65535
for n in hexarray:
    CRCdata = CRCdata ^ (n & 255)
    print(n, CRCdata)
    for i in range(1,8):
        if (CRCdata & 1) == 0:
            CRCdata = CRCdata >> 1
            # print 'if'
        else:
            CRCdata = CRCdata >> 1
            CRCdata = CRCdata ^ mask
            # print 'else'
        print(i, CRCdata)
print CRCdata
I added byte8 because some research I did mentioned that an extra byte of zeros needs to be appended to the data before the CRC calculation. I converted the final result and did the byte swap manually. The problem is that my CRC calculations, whether I keep byte8 or not, do not match any of the three examples provided.
I'm not quite sure where I am going wrong on this and any help would be greatly appreciated.

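I was able to solve the issue by updating the code to range(0,8) and dropping byte8. range(1,8) only runs seven times, whereas the pseudocode's FOR I = 1 TO 8 shifts eight times per byte, and the extra zero byte is not part of this algorithm.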
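For reference, the manufacturer's pseudocode with that fix applied can be sketched as a small Python 3 function. This is only a sketch, not vendor code: the names header_crc and CRC_MASK are my own, and returning the two bytes in transmission order (low byte of the register first, which is what the final byte swap amounts to) is an assumption based on the sample commands.

```python
CRC_MASK = 0xC9DA  # constant from the manufacturer's description


def header_crc(data: bytes, mask: int = CRC_MASK) -> bytes:
    """Return the two CRC bytes in the order they appear on the wire."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte & 0xFF
        for _ in range(8):          # eight shifts per byte (FOR I = 1 TO 8)
            lsb = crc & 0x0001
            crc >>= 1               # LSR CRC
            if lsb:
                crc ^= mask
    # Swapping CRC1/CRC2 is the same as sending the low byte first.
    return crc.to_bytes(2, "little")


# The three sample headers from the question (first seven bytes only).
for header in ("05640002100100", "05640002100200", "05640002100300"):
    crc = header_crc(bytes.fromhex(header))
    print("-".join(f"{b:02X}" for b in crc))
# prints 6C-4B, 1C-AE and CC-F2
```

The three printed values agree with the header CRC bytes of the RTU #1, #2 and #3 example commands.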


I want to merge four bytes into one digit for serial transmission

I want to send long values using python to an arduino board which runs c++. The serial communication breaks the 4 byte numbers up and sends them byte by byte. When I try to reassemble them on the back end, I only get a valid number for 2 bytes instead of the four bytes I sent.
Here is the python code sending instructions.
pos1 = int(input("pos1: "))
pos2 = int(input("pos2: "))
data = struct.pack('<ll', pos1, pos2)
ser.write(data)
Here is the arduino code to parse the bytes that it reads.
if(Serial.available()>0){
    size_t numbytes = Serial.readBytes(data, 8);
    for(int i=0; i<8; i++){
        Serial.println(data[i], HEX);
    }
    pos1 = readfourbytes(data[0], data[1], data[2], data[3]);
    pos2 = readfourbytes(data[4], data[5], data[6], data[7]);
    Serial.println(pos1);
    Serial.println(pos2);
}
long readfourbytes(byte fourthbyte, byte thirdbyte, byte secondbyte, byte firstbyte){
    long result = (firstbyte << 24) + (secondbyte << 16) + (thirdbyte << 8) + fourthbyte;
    return result;
}
I guess this means the Arduino is little-endian? My problem is that the second position value that is read is completely off. The Python code seems to be the problem, but I don't know why. When I send the int value 100 for both, I get an output of
b'd\x00\x00\x00d\x00\x00\x00'
from the Python code as the binary being sent in the data variable. But from the Arduino, I receive:
64
0
0
0
6D
2
0
0
100
621
So there is a disconnect between what I am sending and what I am receiving. The baud rates are the same and there is no other obvious fault that I am aware of.
All the expressions (<any>byte << <bits>) are evaluated as int, which is 16 bits on AVR-based Arduino boards, so any bits shifted past bit 15 are lost. Cast each <any>byte to long before shifting, and you're done.
long readfourbytes(byte fourthbyte, byte thirdbyte, byte secondbyte, byte firstbyte){
    long result = ((long)firstbyte << 24) + ((long)secondbyte << 16) + ((long)thirdbyte << 8) + fourthbyte;
    return result;
}
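To check the sending side independently of the board, the same reassembly can be sketched in Python, where integers never overflow. This is a minimal sketch: read_four_bytes is a hypothetical helper mirroring the Arduino function, not part of any library, and the sample values are arbitrary.

```python
import struct


def read_four_bytes(chunk: bytes) -> int:
    """Reassemble a little-endian signed 32-bit value from 4 bytes."""
    value = chunk[0] | (chunk[1] << 8) | (chunk[2] << 16) | (chunk[3] << 24)
    # Treat bit 31 as the sign bit, as a 32-bit C long would.
    return value - (1 << 32) if value & (1 << 31) else value


payload = struct.pack("<ll", 100, -70000)   # same format string as the sender
pos1 = read_four_bytes(payload[0:4])
pos2 = read_four_bytes(payload[4:8])
print(payload.hex(), pos1, pos2)

# If only the low 16 bits survive, larger values are mangled:
print(100000 & 0xFFFF)
```

If the values printed here match what was packed, the Python side is producing the right bytes and the problem lies in how the receiver widens them.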

How I can convert this value to hex? (amf value)

The context:
I'm decoding an AMF response from a Flex app with Python.
With PyAMF I can decode the whole response, but one value caught my attention.
The value \xa2C is decoded as 4419:
#\xa2C -> 4419
#\xddI -> 11977
I know \x introduces a hex value, but I can't find a function that transforms 4419 back into \xa2C.
4419 is an integer.
--- Update 1
The original values are not plain hex,
because \xa2I is decoded as 4425.
So what kind of value is \xa2I?
Thanks!
-- Update 2.
DJ = 5834
0F = 15
0G = error
1F = 31
a1f = 4294
adI = 5833
adg = 5863
adh = 5864
It is strange that letters after F are sometimes accepted and in other cases produce an error. These are definitely not plain hex values.
What you're seeing is the string representation of the bytes of an AmfInteger. The first example, \xa2C, consists of two bytes: 0xa2 (162 in decimal) and C, the ASCII character whose code is 67:
>>> ord("\xa2C"[0])
162
>>> ord("\xa2C"[1])
67
To convert this into an AmfInteger, we have to follow the AMF3 specifications, section 1.3.1 (the format of an AmfInteger is the same in AMF0 and AMF3, so it doesn't matter what specification we look at).
In that section, a U29 (variable length unsigned 29-bit integer, which is what AmfIntegers use internally to represent the value) is defined as either a 1-, 2-, 3- or 4-byte sequence. Each byte encodes information about the value itself, as well as whether another byte follows. To figure out whether another byte follows the current one, one just needs to check whether the most significant bit is set:
>>> (162 & 0x80) == 0x80
True
>>> (67 & 0x80) == 0x80
False
So we now confirmed that the byte sequence you see is indeed a full U29: the first byte has its high bit set, to indicate that it's followed by another byte. The second byte has the bit unset, to indicate the end of the sequence. To get the actual value from those bytes, we now only need to combine their values, while masking out the high bit of the first byte:
>>> 162 & 0x7f
34
>>> 34 << 7
4352
>>> 4352 | 67
4419
From this, it should be easy to figure out why the other values give the results you observe.
For completeness' sake, here is a Python snippet with an example implementation that parses a U29, including all corner cases:
def parse_u29(byte_sequence):
    value = 0
    # Handle the initial bytes
    for byte in byte_sequence[:-1]:
        # Ensure it has its high bit set.
        assert ord(byte) & 0x80
        # Extract the value and add it to the accumulator.
        value <<= 7
        value |= ord(byte) & 0x7F
    # Handle the last byte.
    value <<= 8 if len(byte_sequence) > 3 else 7
    value |= ord(byte_sequence[-1])
    # Handle sign.
    value = (value + 2**28) % 2**29 - 2**28
    return value
print parse_u29("\xa2C"), 4419
print parse_u29(map(chr, [0x88, 0x00])), 1024
print parse_u29(map(chr, [0xFF, 0xFF, 0x7E])), 0x1ffffe
print parse_u29(map(chr, [0x80, 0xC0, 0x80, 0x00])), 0x200000
print parse_u29(map(chr, [0xBF, 0xFF, 0xFF, 0xFE])), 0xffffffe
print parse_u29(map(chr, [0xC0, 0x80, 0x80, 0x01])), -268435455
print parse_u29(map(chr, [0xFF, 0xFF, 0xFF, 0x81])), -127
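Since the question asks for the opposite direction (turning 4419 back into \xa2C), here is a sketch of an encoder in Python 3. The name encode_u29 is hypothetical and not part of PyAMF; the layout simply inverts the parsing steps shown above, with 7 payload bits per leading byte and 8 bits in the fourth byte.

```python
def encode_u29(value: int) -> bytes:
    """Encode a signed 29-bit integer as a 1- to 4-byte U29 sequence."""
    if not -(1 << 28) <= value < (1 << 28):
        raise ValueError("value does not fit in 29 bits")
    value &= 0x1FFFFFFF  # two's complement within 29 bits
    if value < 0x80:
        return bytes([value])
    if value < 0x4000:
        return bytes([0x80 | (value >> 7), value & 0x7F])
    if value < 0x200000:
        return bytes([0x80 | (value >> 14),
                      0x80 | ((value >> 7) & 0x7F),
                      value & 0x7F])
    # Four bytes: three continuation bytes, then a full 8-bit final byte.
    return bytes([0x80 | (value >> 22),
                  0x80 | ((value >> 15) & 0x7F),
                  0x80 | ((value >> 8) & 0x7F),
                  value & 0xFF])


print(encode_u29(4419))   # b'\xa2C'
print(encode_u29(11977))  # b'\xddI'
```

The second byte prints as a letter only because its value happens to be a printable ASCII code, which is why the sequences look like "hex followed by a letter".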

Computing TCP checksum in python

I came across this piece of code for computing a checksum.
As far as I understand, the binary data has to be split into 16-bit words, as required for the TCP checksum.
I reckon w should be derived as w = (ord(msg[i]) << 8) + ord(msg[i+1]), unless the byte order has to be changed. I am not sure why w is assigned as w = ord(msg[i]) + (ord(msg[i+1]) << 8). Is there anything specific I am missing here?
def checksum(msg):
    s = 0
    # loop taking 2 characters at a time
    for i in range(0, len(msg), 2):
        w = ord(msg[i]) + (ord(msg[i+1]) << 8 )
        s = s + w
    s = (s>>16) + (s & 0xffff);
    s = s + (s >> 16);
    #complement and mask to a 16-bit (2 byte) value
    s = ~s & 0xffff
    return s
In this case I think "network order", "big endian" and "little endian" are being mixed up with the TCP checksum calculation.
The checksum computation is described in RFC 1071: https://www.rfc-editor.org/rfc/rfc1071
At the beginning of page 2:
Using the notation [a,b] for the 16-bit integer a*256+b, where a and b are bytes,
The bytes in the pseudo header and the partially filled TCP header are just "bytes", and no implication is made as to what they mean (they must already be in "network order").
The formula used by the author is just following RFC 1071, which also notes (section 2, "Byte Order Independence") that the sum can be computed in either byte order. Pairing the bytes as b*256+a yields the byte-swapped sum, so the result is correct as long as it is written back into the header in that same byte order.
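The byte order independence can be checked with a short Python 3 sketch. The helper name ones_complement_sum is my own, and the sample bytes are, as far as I recall, the ones used in the numerical example of RFC 1071.

```python
def ones_complement_sum(data: bytes, byteorder: str) -> int:
    """16-bit one's complement sum, pairing bytes in the given order."""
    if len(data) % 2:
        data += b"\x00"            # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += int.from_bytes(data[i:i + 2], byteorder)
    while total >> 16:             # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


data = bytes.fromhex("0001f203f4f5f6f7")
big = ones_complement_sum(data, "big")
little = ones_complement_sum(data, "little")
print(hex(big), hex(little))       # 0xddf2 0xf2dd

# Written back in the matching byte order, both give the same two bytes.
checksum_big = (~big & 0xFFFF).to_bytes(2, "big")
checksum_little = (~little & 0xFFFF).to_bytes(2, "little")
print(checksum_big == checksum_little)  # True
```

So the little-endian pairing in the question's code is not wrong in itself. It only matters that the resulting value is stored in the header without an additional byte swap.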

Read latest character sent from Arduino in Python

I'm a beginner in both Arduino and Python, and I have an idea but I can't get it to work. Basically, when in Arduino a button is pressed, it sends "4" through the serial port. What I want in Python is as soon as it reads a 4, it should do something. This is what I got so far:
import serial
ser = serial.Serial('/dev/tty.usbserial-A900frF6', 9600)
var = 1
while var == 1:
    if ser.inWaiting() > 0:
        ser.readline(1)
        print "hello"
But obviously this prints hello no matter what. What I would need is something like this:
import serial
ser = serial.Serial('/dev/tty.usbserial-A900frF6', 9600)
var = 1
while var == 1:
    if ser.inWaiting() > 0:
        ser.readline(1)
        if last.read == "4":
            print "hello"
But how can I define last.read?
I don't know a good way of synchronising the comms with readLine since it's not a blocking call. You can use ser.read(numBytes) which is a blocking call. You will need to know how many bytes Arduino is sending though to decode the byte stream correctly. Here is a simple example that reads 8 bytes and unpacks them into 2 unsigned shorts and a long (the <HHL part) in Python
try:
    data = [struct.unpack('<HHL', handle.read(8)) for i in range(PACKETS_PER_TRANSMIT)]
except OSError:
    self.emit(SIGNAL("connectionLost()"))
    self.connected = False
Here's a reference to the struct.unpack()
The Arduino code that goes with that. It reads two analog sensor values and the micro timestamp and sends them over the serial.
unsigned int SensA, SensB;
byte out_buffer[64];
unsigned int buffer_head = 0;
unsigned int buffer_size = 64;
SensA = analogRead(SENSOR_A);
SensB = analogRead(SENSOR_B);
micr = micros();
out_buffer[buffer_head++] = (SensA & 0xFF);
out_buffer[buffer_head++] = (SensA >> 8) & 0xFF;
out_buffer[buffer_head++] = (SensB & 0xFF);
out_buffer[buffer_head++] = (SensB >> 8) & 0xFF;
out_buffer[buffer_head++] = (micr & 0xFF);
out_buffer[buffer_head++] = (micr >> 8) & 0xFF;
out_buffer[buffer_head++] = (micr >> 16) & 0xFF;
out_buffer[buffer_head++] = (micr >> 24) & 0xFF;
Serial.write(out_buffer, buffer_size);
The Arduino playground and Processing Forums are good places to look around for this sort of code as well.
UPDATE
I think I might have misled you with readLine not blocking. Either way, the above code should work. I also found this other thread on SO regarding the same subject.
UPDATE You don't need to use the analog sensors; that's just what my project happened to be using, and you are of course free to pass whatever values you like over the serial link. The Arduino code keeps a buffer of type byte where the output is stored before being sent. The sensor values and the micros() timestamp are written to the buffer, and the buffer is then sent over the serial port. (SensA & 0xFF) is a bit mask operation that ANDs the bit pattern of SensA with the bit pattern of 0xFF (255 in decimal). Essentially this takes the low 8 bits of the 16-bit value of SensA, which is an Arduino unsigned int. The next line does the same thing but first shifts the bits right by 8 positions, thus taking the high 8 bits.
You'll need to understand bit patterns, bit masking and bit shifting for this.
The Python code in turn reads from the serial port 8 bytes at a time. Have a look at the struct.unpack docs. The list comprehension is just there to allow receiving more than one set of values. Because the Arduino board and the Python code are running out of sync, I added that to be able to send more than one "line" per transmit. You can just replace that with struct.unpack('<HHL',handle.read(8)). Remember that handle.read() takes a number of bytes, whereas the Arduino send code is dealing with bits.
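To see how the masking and shifting on the Arduino side lines up with the '<HHL' format string, the packing can be mimicked in Python without any hardware. This is only an illustration: pack_like_arduino is a hypothetical helper and the sample values are made up.

```python
import struct


def pack_like_arduino(sens_a: int, sens_b: int, micr: int) -> bytes:
    """Build the 8-byte packet the same way the Arduino buffer is filled."""
    out = bytearray()
    out.append(sens_a & 0xFF)          # low byte of SensA
    out.append((sens_a >> 8) & 0xFF)   # high byte of SensA
    out.append(sens_b & 0xFF)
    out.append((sens_b >> 8) & 0xFF)
    for shift in (0, 8, 16, 24):       # four bytes of the timestamp, low first
        out.append((micr >> shift) & 0xFF)
    return bytes(out)


packet = pack_like_arduino(513, 1023, 123456)
print(struct.unpack("<HHL", packet))   # (513, 1023, 123456)
```

The '<' prefix selects little-endian byte order with standard sizes, so H is 2 bytes and L is 4 bytes, which adds up to the 8 bytes passed to handle.read(8).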
I think it might work with these modifications:
import serial
ser = serial.Serial('/dev/tty.usbserial-A900frF6', 9600)
var = 1
while var == 1:
    if (ser.inWaiting() > 0):
        ser.readline(1)
        print "hello"
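To react only when a "4" arrives, the byte that was read has to be kept and compared. Below is a minimal Python 3 sketch of that idea; wait_for_byte is a hypothetical helper, and it is written against any object with a read(n) method so it can be tried without a board. With pyserial, a serial.Serial object opened with a timeout should fit, since its read(1) returns one byte, or an empty bytes object when the timeout expires.

```python
import io


def wait_for_byte(port, target=b"4"):
    """Read one byte at a time until target is seen.

    Returns True when target arrives, False if read() returns nothing
    (timeout on a serial port, end of data on a file-like object).
    """
    while True:
        received = port.read(1)
        if not received:
            return False
        if received == target:
            return True


# Stand-in for the serial port: some noise followed by the "4".
fake_port = io.BytesIO(b"12\r\n4")
if wait_for_byte(fake_port):
    print("hello")
```

Note that in Python 3 the data read from a serial port is bytes, so the comparison is against b"4" rather than "4".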

Calculate/validate bz2 (bzip2) CRC32 in Python

I'm trying to calculate/validate the CRC32 checksums for compressed bzip2 archives.
.magic:16 = 'BZ' signature/magic number
.version:8 = 'h' for Bzip2 ('H'uffman coding)
.hundred_k_blocksize:8 = '1'..'9' block-size 100 kB-900 kB
.compressed_magic:48 = 0x314159265359 (BCD (pi))
.crc:32 = checksum for this block
...
...
.eos_magic:48 = 0x177245385090 (BCD sqrt(pi))
.crc:32 = checksum for whole stream
.padding:0..7 = align to whole byte
http://en.wikipedia.org/wiki/Bzip2
So I know where the CRC checksums are in a bz2 file, but how would I go about validating them? Which chunks should I pass to binascii.crc32() to get both CRCs? I've tried calculating the CRC of various chunks, byte by byte, but have not managed to get a match.
Thank you. I'll be looking into the bzip2 sources and bz2 Python library code, to maybe find something, especially in the decompress() method.
Update 1:
The block headers are identified by the following tags as far as I can see. But tiny bz2 files do not contain the ENDMARK ones. (Thanks to adw, we've found out that one should look for bit shifted values of the ENDMARK, since the compressed data is not padded to bytes.)
#define BLOCK_HEADER_HI 0x00003141UL
#define BLOCK_HEADER_LO 0x59265359UL
#define BLOCK_ENDMARK_HI 0x00001772UL
#define BLOCK_ENDMARK_LO 0x45385090UL
This is from the bzlib2recover.c source, blocks seem to start always at bit 80, right before the CRC checksum, which should be omitted from the CRC calculation, as one can't CRC its own CRC to be the same CRC (you get my point).
searching for block boundaries ...
block 1 runs from 80 to 1182
Looking into the code that calculates this.
Update 2:
bzlib2recover.c does not have the CRC calculating functions, it just copies the CRC from the damaged files. However, I did manage to replicate the block calculator functionality in Python, to mark out the starting and ending bits of each block in a bz2 compressed file. Back on track, I have found that compress.c refers to some of the definitions in bzlib_private.h.
#define BZ_INITIALISE_CRC(crcVar) crcVar = 0xffffffffL;
#define BZ_FINALISE_CRC(crcVar) crcVar = ~(crcVar);
#define BZ_UPDATE_CRC(crcVar,cha)              \
{                                              \
   crcVar = (crcVar << 8) ^                    \
            BZ2_crc32Table[(crcVar >> 24) ^    \
                           ((UChar)cha)];      \
}
These definitions are accessed by bzlib.c as well, s->blockCRC is initialized and updated in bzlib.c and finalized in compress.c. There's more than 2000 lines of C code, which will take some time to look through and figure out what goes in and what does not. I'm adding the C tag to the question as well.
By the way, here are the C sources for bzip2 http://www.bzip.org/1.0.6/bzip2-1.0.6.tar.gz
Update 3:
Turns out bzlib2 block CRC32 is calculated using the following algorithm:
dataIn is the data to be encoded.
def bzip2_crc32(dataIn):
    crcVar = 0xffffffff # Init
    for cha in list(dataIn):
        crcVar = crcVar & 0xffffffff # Unsigned
        crcVar = ((crcVar << 8) ^ (BZ2_crc32Table[(crcVar >> 24) ^ (ord(cha))]))
    return '%08X' % (~crcVar & 0xffffffff)
Where BZ2_crc32Table is defined in crctable.c
For dataIn = "justatest" the CRC returned is 7948C8CB, having compressed a textfile with that data, the crc:32 checksum inside the bz2 file is 79 48 c8 cb which is a match.
Conclusion:
bzlib2 CRC32 is (quoting crctable.c)
Vaguely derived from code by Rob Warnock, in Section 51 of the comp.compression FAQ...
...thus, as far as I understand, cannot be precalculated/validated using standard CRC32 checksum calculators, but rather require the bz2lib implementation (lines 155-172 in bzlib_private.h).
The following is the CRC algorithm used by bzip2, written in Python:
def bzip2_crc32(dataIn):
    crcVar = 0xffffffff # Init
    for cha in list(dataIn):
        crcVar = crcVar & 0xffffffff # Unsigned
        crcVar = ((crcVar << 8) ^ (BZ2_crc32Table[(crcVar >> 24) ^ (ord(cha))]))
    return '%08X' % (~crcVar & 0xffffffff)
(C code definitions can be found on lines 155-172 in bzlib_private.h)
BZ2_crc32Table array/list can be found in crctable.c from the bzip2 source code. This CRC checksum algorithm is, quoting: "..vaguely derived from code by Rob Warnock, in Section 51 of the comp.compression FAQ..." (crctable.c)
The checksums are calculated over the uncompressed data.
Sources can be downloaded here: http://www.bzip.org/1.0.6/bzip2-1.0.6.tar.gz
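For a self-contained check in Python 3, the table can be generated instead of copied from crctable.c. This is a sketch under assumptions: it assumes BZ2_crc32Table is the usual MSB-first table for polynomial 0x04C11DB7, and the function names are my own. It compares the computed value with the block CRC stored in a freshly compressed stream, which for the first block sits right after the 4-byte header and the 6-byte block magic.

```python
import bz2


def make_table(poly=0x04C11DB7):
    """Build a 256-entry MSB-first (non-reflected) CRC-32 table."""
    table = []
    for byte in range(256):
        crc = byte << 24
        for _ in range(8):
            crc = ((crc << 1) ^ poly) if crc & 0x80000000 else (crc << 1)
            crc &= 0xFFFFFFFF
        table.append(crc)
    return table


CRC_TABLE = make_table()


def bzip2_crc32(data: bytes) -> int:
    """CRC over the uncompressed data, following the update step above."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ CRC_TABLE[(crc >> 24) ^ byte]
    return ~crc & 0xFFFFFFFF


data = b"justatest"
stream = bz2.compress(data)
stored = int.from_bytes(stream[10:14], "big")   # first block's CRC field
print(f"{bzip2_crc32(data):08X} {stored:08X}")
```

Only the first block is byte aligned, so reading stream[10:14] works for this small example but not for later blocks in a multi-block file.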
To add to the existing answer: there is a final checksum at the end of the stream (the one after eos_magic). It functions as a checksum over all the individual Huffman block checksums. It is initialized to zero and updated every time you have finished validating a block checksum. To update it, do as follows:
crc: u32 = # latest validated Huffman block CRC
ccrc: u32 = # current combined checksum
ccrc = (ccrc << 1) | (ccrc >> 31);
ccrc ^= crc;
In the end, validate the value of ccrc against the 32-bit unsigned value you read from the compressed file.
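The same update can be written in Python 3, where the rotation has to be masked to 32 bits by hand because Python integers do not wrap. This is a small sketch and combine_block_crcs is a hypothetical name.

```python
def combine_block_crcs(block_crcs):
    """Fold per-block CRCs into the stream CRC: rotate left by 1, then XOR."""
    combined = 0
    for crc in block_crcs:
        combined = ((combined << 1) | (combined >> 31)) & 0xFFFFFFFF
        combined ^= crc
    return combined


# With a single block, the stream CRC equals that block's CRC.
print(f"{combine_block_crcs([0x7948C8CB]):08X}")  # 7948C8CB
```

For a one-block file this means the two CRC fields in the stream hold the same value.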
Check the fastcrc Python library, which has a bzip2 CRC32 implementation.
https://fastcrc.readthedocs.io/en/latest/#fastcrc.crc32.bzip2
