Different CRC16 results in C and Python3?

I have two CRC16 calculators (one in C and one in Python), but I'm receiving different results. Why?
calculator in C:
unsigned short __update_crc16 (unsigned char data, unsigned short crc16)
{
    unsigned short t;
    crc16 ^= data;
    t = (crc16 ^ (crc16 << 4)) & 0x00ff;
    crc16 = (crc16 >> 8) ^ (t << 8) ^ (t << 3) ^ (t >> 4);
    return crc16;
}

unsigned short get_crc16 (void *src, unsigned int size, unsigned short start_crc)
{
    unsigned short crc16;
    unsigned char *p;
    crc16 = start_crc;
    p = (unsigned char *) src;
    while (size--)
        crc16 = __update_crc16 (*p++, crc16);
    return crc16;
}
calculator in Python3:
def crc16(data):
    crc = 0xFFFF
    for i in data:
        crc ^= i << 8
        for j in range(0, 8):
            if (crc & 0x8000) > 0:
                crc = (crc << 1) ^ 0x1021
            else:
                crc = crc << 1
    return crc & 0xFFFF

There is more than one CRC-16; 22 are catalogued at http://reveng.sourceforge.net/crc-catalogue/16.htm. A CRC is characterised by its width, polynomial, initial state, and the input and output bit order.
By applying the same data to each of your functions:
Python:
data = bytes([0x01, 0x23, 0x45, 0x67, 0x89])
print(hex(crc16(data)))
Result: 0x738E
C:
char data[] = {0x01, 0x23, 0x45, 0x67, 0x89};
printf ("%4X\n", get_crc16 (data, sizeof (data), 0xffffu));
Result: 0x9F0D
By also applying the same data to an online tool that generates multiple CRCs, such as https://crccalc.com/, you can identify the CRC from the result.
In this case your Python code is CRC-16/CCITT-FALSE, while the C result matches CRC-16/MCRF4XX. They both have the same polynomial but differ in their input-reflected and output-reflected parameters (both false for CCITT-FALSE, both true for MCRF4XX). This means that for MCRF4XX the bits of each byte are processed LSB first, and the entire CRC is bit-reversed on output.
https://pypi.org/project/crccheck/ supports both CCITT and MCRF4XX and many others.
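For reference, CRC-16/MCRF4XX is also simple to write out directly. The sketch below (the function name `crc16_mcrf4xx` is mine, not from any library) uses the reflected polynomial 0x8408 with init 0xFFFF, and should reproduce the C result above:

```python
# CRC-16/MCRF4XX: poly 0x1021 reflected to 0x8408, init 0xFFFF,
# input and output reflected, no final XOR.
def crc16_mcrf4xx(data, crc=0xFFFF):
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0x8408
            else:
                crc >>= 1
    return crc

# Should print 0x9f0d, matching the C result above.
print(hex(crc16_mcrf4xx(bytes([0x01, 0x23, 0x45, 0x67, 0x89]))))
```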

I implemented a version of CRC16 in C based on the Python crc16 lib. That lib calculates the CRC-CCITT (XModem) variant of CRC16. I used my implementation in an STM32L4 firmware. Here is my C implementation:
unsigned short _crc16(char *data_p, unsigned short length){
    unsigned int crc = 0;
    unsigned char i;
    for (i = 0; i < length; i++){
        crc = ((crc << 8) & 0xff00) ^ CRC16_XMODEM_TABLE[((crc >> 8) & 0xff) ^ data_p[i]];
    }
    return crc & 0xffff;
}
On the Python side, I was reading 18 bytes that were transmitted by the STM32. Here is a bit of my code (the CRC part):
import crc16
# read first time
crc_buffer = b''
bytes = serial_comunication.read(2)  # int16 - 2 bytes
crc_buffer = crc_buffer.join([crc_buffer, bytes])
crc = crc16.crc16xmodem(crc_buffer, 0)
aux = 0  # count the remaining 8 two-byte reads (2 + 16 = 18 bytes total)
while aux < 8:
    crc_buffer = b''
    bytes = serial_comunication.read(2)
    crc_buffer = crc_buffer.join([crc_buffer, bytes])
    crc = crc16.crc16xmodem(crc_buffer, crc)
    aux += 1
print(crc)
In my tests, C and Python crc16 values always match, unless some connection problem occurs. Hope this helps someone!
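The table-driven C loop above computes the same XModem CRC as the plain bitwise form (init 0x0000, poly 0x1021, no reflection). A pure-Python sketch of that form (the function name is mine) can serve as a cross-check against both implementations, and chaining works the same way as with `crc16.crc16xmodem`:

```python
# CRC-16/XMODEM, bitwise: init 0, poly 0x1021, MSB first, no final XOR.
def crc16_xmodem(data, crc=0):
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

# The crc argument carries state, so the CRC can be fed piecewise,
# just like the serial-reading loop above does.
whole = crc16_xmodem(b"123456789")
piecewise = crc16_xmodem(b"456789", crc16_xmodem(b"123"))
```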

Getting wrong values when I stitch 2 shorts back into an unsigned long

I am doing BLE communications with an Arduino Board and an FPGA.
I have a requirement which restrains me from changing the packet structure (the packet structure is basically short data types). Thus, to send a timestamp (from millis()) over, I have to split an unsigned long into 2 shorts on the Arduino side and stitch it back up on the FPGA side (Python).
This is the implementation which I have:
// Arduino code in c++
unsigned long t = millis();
// bitmask to get bits 1-16
short LSB = (short) (t & 0x0000FFFF);
// bitshift to get bits 17-32
short MSB = (short) (t >> 16);
// I then send the packet with MSB and LSB values
# FPGA python code to stitch it back up (I receive the packet and extract the MSB and LSB)
MSB = data[3]
LSB = data[4]
data = MSB << 16 | LSB
Now the issue is that my output for data on the FPGA side is sometimes negative, which tells me that I must have missed something somewhere, as timestamps are not negative. Does anyone know why?
When I transfer other data in the packet (i.e. other short values and not the timestamp), I am able to receive them as expected, so the problem most probably lies in the conversion that I did and not the sending/receiving of data.
short defaults to signed, and in the case of a negative number >> will keep the sign by shifting in one-bits from the left. See e.g. the Microsoft documentation.
From my earlier comment:
In Python, avoid attempting that by yourself (by the way, short in C has no fixed size; you always have to check the compiler manual or limits.h) and use the struct module instead.
You probably need/want to first convert the long to network byte order using htonl.
As guidot reminded, "short" is signed, and as the data are transferred to Python the code has an issue:
For t=0x00018000 the most significant short MSB = 1 and the least significant short LSB = -32768 (0x8000 in C++, -0x8000 in Python), so the Python expression
time = MSB << 16 | LSB
returns time = -32768 (see the start of the Python code below).
So we have an incorrect sign and we are losing MSB (any value, not only the 1 in our example).
MSB is lost because LSB, being negative, is sign-extended with 1-bits to the left; under the "|" operator those 1-bits override whatever MSB << 16 contributed, so the expression simply evaluates to LSB.
The straightforward fix (1.1) is declaring MSB and LSB as unsigned short. This alone could be enough, without any changes to the Python code.
To avoid bit operations we could use a "union", as per fix 1.2.
Without access to the C++ code we could fix it in Python instead, by converting the signed LSB and MSB (fix 2.1) or by using a ctypes "Union" (similar to the C++ "union", fix 2.2).
C++
#include <iostream>
using namespace std;

int main () {
    unsigned long t = 0x00018000;
    short LSB = (short)(t & 0x0000FFFF);
    short MSB = (short)(t >> 16);
    cout << hex << "t = " << t << endl;
    cout << dec << "LSB = " << LSB << " MSB = " << MSB << endl;

    // 1.1 Fix: use unsigned short instead of short
    unsigned short fixedLSB = (unsigned short)(t & 0x0000FFFF);
    unsigned short fixedMSB = (unsigned short)(t >> 16);
    cout << "fixedLSB = " << fixedLSB << " fixedMSB = " << fixedMSB << endl;

    // 1.2 Fix: use a union
    union {
        unsigned long t2;
        unsigned short unsignedShortArray[2];
    };
    t2 = 0x00018000;
    fixedLSB = unsignedShortArray[0];
    fixedMSB = unsignedShortArray[1];
    cout << "fixedLSB = " << fixedLSB << " fixedMSB = " << fixedMSB << endl;
}
Output
t = 18000
LSB = -32768 MSB = 1
fixedLSB = 32768 fixedMSB = 1
fixedLSB = 32768 fixedMSB = 1
Python
DATA = [0, 0, 0, 1, -32768]
MSB = DATA[3]
LSB = DATA[4]
data = MSB << 16 | LSB
print(f"MSB = {MSB} ({hex(MSB)})")
print(f"LSB = {LSB} ({hex(LSB)})")
print(f"data = {data} ({hex(data)})")
time = MSB << 16 | LSB
print(f"time = {time} ({hex(time)})")

# 2.1 Fix
def twosComplement(short):
    if short >= 0:
        return short
    return 0x10000 + short

fixedTime = twosComplement(MSB) << 16 | twosComplement(LSB)

# 2.2 Fix
import ctypes

class UnsignedIntUnion(ctypes.Union):
    _fields_ = [('unsignedInt', ctypes.c_uint),
                ('ushortArray', ctypes.c_ushort * 2),
                ('shortArray', ctypes.c_short * 2)]

unsignedIntUnion = UnsignedIntUnion(shortArray=(LSB, MSB))
print("unsignedIntUnion")
print("unsignedInt = ", hex(unsignedIntUnion.unsignedInt))
print("ushortArray[1] = ", hex(unsignedIntUnion.ushortArray[1]))
print("ushortArray[0] = ", hex(unsignedIntUnion.ushortArray[0]))
print("shortArray[1] = ", hex(unsignedIntUnion.shortArray[1]))
print("shortArray[0] = ", hex(unsignedIntUnion.shortArray[0]))
unsignedIntUnion.unsignedInt = twosComplement(unsignedIntUnion.shortArray[1]) << 16 | twosComplement(unsignedIntUnion.shortArray[0])

def toUInt(msShort: int, lsShort: int):
    return UnsignedIntUnion(ushortArray=(lsShort, msShort)).unsignedInt

fixedTime = toUInt(MSB, LSB)
print("fixedTime = ", hex(fixedTime))
print()
print()
Output
MSB = 1 (0x1)
LSB = -32768 (-0x8000)
data = -32768 (-0x8000)
time = -32768 (-0x8000)
unsignedIntUnion
unsignedInt = 0x18000
ushortArray[1] = 0x1
ushortArray[0] = 0x8000
shortArray[1] = 0x1
shortArray[0] = -0x8000
fixedTime = 0x18000
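A more compact Python-side alternative is the struct module mentioned in the comments: pack the two signed shorts back into bytes and reinterpret those bytes as one unsigned 32-bit integer. A sketch, using the example values from above:

```python
import struct

MSB, LSB = 1, -32768  # the example halves from above

# Pack the two signed 16-bit halves little-endian (low half first),
# then reinterpret the 4 bytes as one unsigned 32-bit integer.
t = struct.unpack('<I', struct.pack('<2h', LSB, MSB))[0]
print(hex(t))  # 0x18000

# Equivalent fix with plain bit operations: mask off the sign extension
# before shifting and OR-ing.
t2 = ((MSB & 0xFFFF) << 16) | (LSB & 0xFFFF)
```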

Python: How to calculate png crc value

crc_table = None

def make_crc_table():
    global crc_table
    crc_table = [0] * 256
    for n in range(256):
        c = n
        for k in range(8):
            if c & 1:
                c = 0xedb88320 ^ (c >> 1)
            else:
                c = c >> 1
        crc_table[n] = c

make_crc_table()

"""
/* Update a running CRC with the bytes buf[0..len-1]--the CRC
   should be initialized to all 1's, and the transmitted value
   is the 1's complement of the final running CRC (see the
   crc() routine below). */
"""
def update_crc(crc, buf):
    # buf is a bytes object; iterating it yields ints in Python 3
    c = crc
    for byte in buf:
        c = crc_table[(c ^ byte) & 0xff] ^ (c >> 8)
    return c

# /* Return the CRC of the bytes buf[0..len-1]. */
def crc(buf):
    return update_crc(0xffffffff, buf) ^ 0xffffffff
I used this code to calculate a PNG CRC value.
My IHDR chunk data is 000008A0 000002FA 08020000 00 and the result of that code was 0xa1565b1.
However, the real CRC is 0x84E42B87; I checked this value with a well-known PNG checker tool.
I can't understand how the correct value is calculated.
The CRC is calculated over the chunk type and the data, not just the data. So those bytes would be preceded by the four bytes IHDR. Then you get the correct CRC.
As an aside, I have no idea how you got 0xa1565b1L from 000008A0 000002FA 08020000 00. I get 0xa500050a as the CRC of those bytes. There must be something else that you're doing wrong as well. You would need to provide a complete example for us to be able to tell.
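Since PNG's chunk CRC is the standard CRC-32, the point in the answer can be checked with zlib instead of a hand-rolled table. A sketch, using the IHDR bytes from the question:

```python
import binascii
import zlib

chunk_type = b'IHDR'
chunk_data = binascii.unhexlify('000008A0000002FA0802000000')

# PNG's CRC covers the chunk type plus the chunk data, not the data alone.
crc = zlib.crc32(chunk_type + chunk_data) & 0xFFFFFFFF
print(hex(crc))  # 0x84e42b87, the value the PNG checker reports

# The data alone gives a different value (0xa500050a, as noted above).
crc_data_only = zlib.crc32(chunk_data) & 0xFFFFFFFF
```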

Can't reproduce working C bitwise encoding function in Python

I'm reverse engineering a proprietary network protocol that generates a (static) one-time pad on launch and then uses that to encode/decode each packet it sends/receives. It uses the one-time pad in a series of complex XORs, shifts, and multiplications.
I have produced the following C code after walking through the decoding function in the program with IDA. This function encodes/decodes the data perfectly:
void encodeData(char *buf)
{
    int i;
    size_t bufLen = *(unsigned short *)buf;
    unsigned long entropy = *((unsigned long *)buf + 2);
    int xorKey = 9 * (entropy ^ ((entropy ^ 0x3D0000) >> 16));
    unsigned short baseByteTableIndex = (60205 * (xorKey ^ (xorKey >> 4)) ^ (668265261 * (xorKey ^ (xorKey >> 4)) >> 15)) & 0x7FFF;

    // Skip first 24 bytes, as that is the header
    for (i = 24; i <= (signed int)bufLen; i++)
        buf[i] ^= byteTable[((unsigned short)i + baseByteTableIndex) & 2047];
}
Now I want to try my hand at making a Peach fuzzer for this protocol. Since I'll need a custom Python fixup to do the encoding/decoding prior to doing the fuzzing, I need to port this C code to Python.
I've made the following Python function but haven't had any luck with it decoding the packets it receives.
def encodeData(buf):
    newBuf = bytearray(buf)
    bufLen = unpack('H', buf[:2])
    entropy = unpack('I', buf[2:6])
    xorKey = 9 * (entropy[0] ^ ((entropy[0] ^ 0x3D0000) >> 16))
    baseByteTableIndex = (60205 * (xorKey ^ (xorKey >> 4)) ^ (668265261 * (xorKey ^ (xorKey >> 4)) >> 15)) & 0x7FFF
    # Skip first 24 bytes, since that is header data
    for i in range(24, bufLen[0]):
        newBuf[i] = xorPad[(i + baseByteTableIndex) & 2047]
    return str(newBuf)
I've tried with and without using array() or pack()/unpack() on various variables to force them to be the right size for the bitwise operations, but I must be missing something, because I can't get the Python code to work as the C code does. Does anyone know what I'm missing?
In case it would help you to try this locally, here is the one-time pad generating function:
def buildXorPad():
    global xorPad
    xorKey = array('H', [0xACE1])
    for i in range(0, 2048):
        xorKey[0] = -(xorKey[0] & 1) & 0xB400 ^ (xorKey[0] >> 1)
        xorPad = xorPad + pack('B', xorKey[0] & 0xFF)
And here is the hex-encoded original (encoded) and decoded packet.
Original: 20000108fcf3d71d98590000010000000000000000000000a992e0ee2525a5e5
Decoded: 20000108fcf3d71d98590000010000000000000000000000ae91e1ee25252525
Solution
It turns out that my problem didn't have much to do with the difference between C and Python types, but rather some simple programming mistakes.
def encodeData(buf):
    newBuf = bytearray(buf)
    bufLen = unpack('H', buf[:2])
    entropy = unpack('I', buf[8:12])
    xorKey = 9 * (entropy[0] ^ ((entropy[0] ^ 0x3D0000) >> 16))
    baseByteTableIndex = (60205 * (xorKey ^ (xorKey >> 4)) ^ (668265261 * (xorKey ^ (xorKey >> 4)) >> 15)) & 0x7FFF
    # Skip first 24 bytes, since that is header data
    for i in range(24, bufLen[0]):
        padIndex = (i + baseByteTableIndex) & 2047
        newBuf[i] ^= unpack('B', xorPad[padIndex])[0]
    return str(newBuf)
Thanks everyone for your help!
This line of C:
unsigned long entropy = *((unsigned long *)buf + 2);
should translate to
entropy = unpack('I', buf[8:12])
because buf is cast to an unsigned long first before adding 2 to the address, which adds the size of 2 unsigned longs to it, not 2 bytes (assuming an unsigned long is 4 bytes in size).
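The same offset arithmetic can be illustrated with struct on a dummy buffer (the buffer contents here are made up purely for illustration):

```python
import struct

buf = bytes(range(16))  # dummy packet: 0x00, 0x01, ..., 0x0F

# *((unsigned long *)buf + 2) advances by two 4-byte elements,
# so it reads the 4 bytes at byte offset 8, not at byte offset 2.
value = struct.unpack_from('<I', buf, 8)[0]
assert value == int.from_bytes(buf[8:12], 'little')
```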
Also:
newBuf[i] = xorPad[(i + baseByteTableIndex) & 2047]
should be
newBuf[i] ^= xorPad[(i + baseByteTableIndex) & 2047]
to match the C, otherwise the output isn't actually based on the contents of the buffer.
Python integers don't overflow - they are automatically promoted to arbitrary precision when they exceed sys.maxint (or -sys.maxint-1).
>>> sys.maxint
9223372036854775807
>>> sys.maxint + 1
9223372036854775808L
Using array and/or unpack does not seem to make a difference (as you discovered)
>>> array('H', [1])[0] + sys.maxint
9223372036854775808L
>>> unpack('H', '\x01\x00')[0] + sys.maxint
9223372036854775808L
To truncate your numbers, you'll have to simulate overflow by manually ANDing with an appropriate bitmask whenever you're increasing the size of the variable.
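In practice that means masking after each arithmetic step that could carry past 32 bits when porting such C code; a minimal sketch of the idea:

```python
# Simulate C's 32-bit unsigned wraparound by masking after each operation.
U32_MASK = 0xFFFFFFFF

def u32(n):
    return n & U32_MASK

# Python keeps the full product; C would have discarded the high bits.
assert 9 * 0x20000000 == 0x120000000
assert u32(9 * 0x20000000) == 0x20000000

# Overflow wraps to zero, as it would for a C uint32_t.
assert u32(0xFFFFFFFF + 1) == 0
```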

C++ - Reading in 16bit .wav files

I'm trying to read in a .wav file, which I thought was giving me the correct result; however, when I plot the same audio file in Matlab or Python, the results are different.
This is the result that I get, compared with the result that Python (plotted with matplotlib) gives: (plots not included)
The results do not seem that different, but, when it comes to analysis, this is messing up my results.
Here is the code that converts:
for (int i = 0; i < size; i += 2)
{
    int c = (data[i + 1] << 8) | data[i];
    double t = c / 32768.0;
    //cout << t << endl;
    rawSignal.push_back(t);
}
Where am I going wrong? This conversion seems fine and produces such similar results.
Thanks
EDIT:
Code to read the header / data:
bool readHeader(ifstream& file) {
    s_riff_hdr riff_hdr;
    s_chunk_hdr chunk_hdr;
    long padded_size;         // Size including the pad byte
    vector<uint8_t> fmt_data; // Vector to store the FMT data.
    s_wavefmt *fmt = NULL;

    file.read(reinterpret_cast<char*>(&riff_hdr), sizeof(riff_hdr));
    if (!file) return false;
    if (memcmp(riff_hdr.id, "RIFF", 4) != 0) return false;
    //cout << "size=" << riff_hdr.size << endl;
    //cout << "type=" << string(riff_hdr.type, 4) << endl;
    if (memcmp(riff_hdr.type, "WAVE", 4) != 0) return false;

    do
    {
        file.read(reinterpret_cast<char*>(&chunk_hdr), sizeof(chunk_hdr));
        if (!file) return false;
        padded_size = ((chunk_hdr.size + 1) & ~1);
        if (memcmp(chunk_hdr.id, "fmt ", 4) == 0)
        {
            if (chunk_hdr.size < sizeof(s_wavefmt)) return false;
            fmt_data.resize(padded_size);
            file.read(reinterpret_cast<char*>(&fmt_data[0]), padded_size);
            if (!file) return false;
            fmt = reinterpret_cast<s_wavefmt*>(&fmt_data[0]);
            sample_rate2 = fmt->sample_rate;
            if (fmt->format_tag == 1) // PCM
            {
                if (chunk_hdr.size < sizeof(s_pcmwavefmt)) return false;
                s_pcmwavefmt *pcm_fmt = reinterpret_cast<s_pcmwavefmt*>(fmt);
                bits_per_sample = pcm_fmt->bits_per_sample;
            }
            else
            {
                if (chunk_hdr.size < sizeof(s_wavefmtex)) return false;
                s_wavefmtex *fmt_ex = reinterpret_cast<s_wavefmtex*>(fmt);
                if (fmt_ex->extra_size != 0)
                {
                    if (chunk_hdr.size < (sizeof(s_wavefmtex) + fmt_ex->extra_size)) return false;
                    uint8_t *extra_data = reinterpret_cast<uint8_t*>(fmt_ex + 1);
                    // use extra_data, up to extra_size bytes, as needed...
                }
                //cout << "extra_size=" << fmt_ex->extra_size << endl;
            }
        }
        else if (memcmp(chunk_hdr.id, "data", 4) == 0)
        {
            // process chunk data, according to fmt, as needed...
            size = padded_size;
            if (bits_per_sample == 16)
            {
                //size = padded_size / 2;
            }
            data = new unsigned char[size];
            file.read(reinterpret_cast<char*>(data), size);
            if (!file) return false;
        }
        else
        {
            // skip other chunks as needed...
            file.ignore(padded_size);
            if (!file) return false;
        }
    } while (!file.eof());

    return true;
}
This is where the "conversion to double" happens:
if (bits_per_sample == 8)
{
    uint8_t c;
    //cout << size;
    for (unsigned i = 0; i < size; i++)
    {
        c = (unsigned)(unsigned char)(data[i]);
        double t = (c - 128) / 128.0;
        rawSignal.push_back(t);
    }
}
else if (bits_per_sample == 16)
{
    for (int i = 0; i < size; i += 2)
    {
        int c;
        c = (unsigned)(unsigned char)(data[i + 2] << 8) | data[i];
        double t = c / 32768.0;
        rawSignal.push_back(t);
    }
}
Note how "8bit" files work correctly?
I suspect your problem may be that data is an array of signed char values. So, when you do this:
int c = (data[i + 1] << 8) | data[i];
… it's not actually doing what you wanted. Let's look at some simple examples.
If data[i+1] == 64 and data[i] == 64, that's going to be 0x4000 | 0x40, or 0x4040, all good.
If data[i+1] == -64 and data[i] == -64, that's going to be 0xffffc000 | 0xffffffc0, or 0xffffffc0, which is obviously wrong.
If you were using unsigned char values, this would work, because instead of -64 those numbers would be 192, and you'd end up with 0xc000 | 0xc0 or 0xc0c0, just as you want. (But then your /32768.0 would give you numbers in the range 0.0 to 2.0, when you presumably want -1.0 to 1.0.)
Suggesting a "fix" is difficult without knowing what exactly you're trying to do. Obviously you want to convert some kind of 16-bit little-endian integer format into some kind of floating-point format, but a lot rests on the exact details of those formats. In a standard .wav file, 16-bit PCM samples are signed little-endian integers, so the right approach is to assemble the two bytes as unsigned chars and then reinterpret the combined 16-bit value as signed (e.g. as int16_t); dividing that by 32768.0 gives the conventional -1.0 to 1.0 range. Assembling unsigned values and dividing would instead give 0.0 to 2.0, which no audio format I know of uses.
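For comparison, the intended conversion (signed 16-bit little-endian PCM to floats in -1.0..1.0, which is presumably what Matlab and matplotlib are doing) is short in Python with struct. A sketch (the function name is mine):

```python
import struct

def pcm16_to_floats(raw):
    """Convert little-endian signed 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    count = len(raw) // 2
    # '<%dh' unpacks count little-endian signed 16-bit samples at once.
    samples = struct.unpack('<%dh' % count, raw[:2 * count])
    return [s / 32768.0 for s in samples]

# 0x8000 -> -32768 -> -1.0; 0x7FFF -> 32767 -> just under 1.0
print(pcm16_to_floats(b'\x00\x80\xff\x7f'))
```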

How to use struct.unpack and convert it to a value in Objective-c

Code in Python
struct.unpack("< I",data.read(4))[0] # Unpack to int.
The data is read from a file using read(). My question is how we can do the equivalent of read and struct.unpack in Objective-C.
I have the data as an NSFileHandle, which I can read byte by byte, so reading is not a problem now. The problem is converting the NSData I get into a value (int, short, float, string).
I don't know about Objective-C but in plain C you could use fread():
#include <inttypes.h> /* uint32_t and PRIu32 macros */
#include <stdbool.h>  /* bool type */
#include <stdio.h>

/*
  gcc *.c &&
  python -c'import struct, sys; sys.stdout.write(struct.pack("<I", 123))' |
  ./a.out
*/

static bool is_little_endian(void) {
    /* Find endianness of the system. */
    const int n = 1;
    return (*(char*)&n) == 1; /* 01 00 00 00 for little-endian */
}

static uint32_t reverse_byteorder(uint32_t n) {
    uint32_t i;
    char *c = (char*) &n;
    char *p = (char*) &i;
    p[0] = c[3];
    p[1] = c[2];
    p[2] = c[1];
    p[3] = c[0];
    return i;
}

int main() {
    uint32_t n; /* '<' format assumes 4-byte integer */
    if (fread(&n, sizeof(n), 1, stdin) != 1) {
        fprintf(stderr, "error while reading unsigned from stdin");
        return 1;
    }
    if (!is_little_endian())
        /* convert from big-endian to little-endian ('<' format) */
        n = reverse_byteorder(n);
    printf("%" PRIu32 " 0x%08x\n", n, n);
    return 0;
}
Output
123 0x0000007b
