C function to Python (different results)

I am trying to port this snippet of code from C to Python, but the outputs are different even though it's the same algorithm.
This is the C version of the code which works:
int main(void)
{
    uint8_t pac[] = {0x033,0x55,0x22,0x65,0x76};
    uint8_t len = 5;
    uint8_t chan = 0x64;

    btLeWhiten(pac, len, chan);
    for(int i = 0; i <= len; i++)
    {
        printf("Whiten %02d \r\n", pac[i]);
    }
    while(1)
    {
    }
    return 0;
}
void btLeWhiten(uint8_t* data, uint8_t len, uint8_t whitenCoeff)
{
    uint8_t m;

    while(len--){
        for(m = 1; m; m <<= 1){
            if(whitenCoeff & 0x80){
                whitenCoeff ^= 0x11;
                (*data) ^= m;
            }
            whitenCoeff <<= 1;
        }
        data++;
    }
}
What I currently have in Python is:
def whiten(data, len, whitenCoeff):
    idx = len
    while(idx > 0):
        m = 0x01
        for i in range(0,8):
            if(whitenCoeff & 0x80):
                whitenCoeff ^= 0x11
                data[len - idx -1 ] ^= m
                whitenCoeff <<= 1
            m <<= 0x01
        idx = idx - 1

pac = [0x33,0x55,0x22,0x65,0x76]
len = 5
chan = 0x64

def main():
    whiten(pac,5,chan)
    print pac

if __name__=="__main__":
    main()
The problem I see is that whitenCoeff always remains 8 bits in the C snippet, but it grows larger than 8 bits in Python on each loop pass.
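For example, the difference is easy to see in isolation (just illustrating the point above; 0xC8 is an arbitrary value with the high bit set):

w = 0xC8
w <<= 1
print(hex(w))          # 0x190: a Python int just keeps growing
print(hex(w & 0xFF))   # 0x90: masked back to 8 bits, which the C code gets for free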

You've got a few more problems.
whitenCoeff <<= 1; is outside of the if block in your C code, but it's inside of the if block in your Python code.
data[len - idx -1 ] ^= m wasn't translated correctly; it works backwards from the C code.
This code produces the same output as your C code:
def whiten(data, whitenCoeff):
    for index in range(len(data)):
        for i in range(8):
            if (whitenCoeff & 0x80):
                whitenCoeff ^= 0x11
                data[index] ^= (1 << i)
            whitenCoeff = (whitenCoeff << 1) & 0xff
    return data

if __name__=="__main__":
    print whiten([0x33,0x55,0x22,0x65,0x76], 0x64)

In C you are writing data from 0 to len-1 but in Python you are writing data from -1 to len-2. Remove the -1 from this line:
data[len - idx -1 ] ^= m
like this
data[len - idx] ^= m
you also need to put this line outside the if:
whitenCoeff <<= 1

whitenCoeff <<= 1 in C becomes 0 after a while because it is an 8-bit value.
In Python, there's no such limit, so you have to write:
whitenCoeff = (whitenCoeff<<1) & 0xFF
to mask the higher bits out.
(Don't forget to check vz0's remark on the array boundary.)
There was also an indentation issue.
Here is the rewritten code, which gives the same result:
def whiten(data, whitenCoeff):
    idx = len(data)
    while(idx > 0):
        m = 0x01
        for i in range(0,8):
            if(whitenCoeff & 0x80):
                whitenCoeff ^= 0x11
                data[-idx] ^= m
            whitenCoeff = (whitenCoeff<<1) & 0xFF
            m <<= 0x01
        idx = idx - 1

pac = [0x33,0x55,0x22,0x65,0x76]
chan = 0x64

def main():
    whiten(pac,chan)
    print(pac)

if __name__=="__main__":
    main()
Slightly off-topic: Note that the C version already has problems:
for(int i = 0;i<=len;i++)
should be
for(int i = 0;i<len;i++)

I solved it by ANDing the value in the Python code with 0xFF. That keeps the variable from growing beyond 8 bits.

Your C code does not appear to work as intended, since it displays one more value than is available in pac. Correcting for this should cause 5 values to be displayed instead of 6. To copy the logic from C over to Python, the following was written in an attempt to duplicate the results:
#! /usr/bin/env python3
def main():
    pac = bytearray(b'\x33\x55\x22\x65\x76')
    chan = 0x64
    bt_le_whiten(pac, chan)
    print('\n'.join(map('Whiten {:02}'.format, pac)))

def bt_le_whiten(data, whiten_coeff):
    for offset in range(len(data)):
        m = 1
        while m & 0xFF:
            if whiten_coeff & 0x80:
                whiten_coeff ^= 0x11
                data[offset] ^= m
            whiten_coeff <<= 1
            whiten_coeff &= 0xFF
            m <<= 1

if __name__ == '__main__':
    main()
To simulate 8-bit unsigned integers, the snippet & 0xFF is used in several places to truncate numbers to the proper size. The bytearray data type is used to store pac since that appears to be the most appropriate storage method in this case. The code still needs documentation to properly understand it.
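One more point in bytearray's favor (an illustrative aside, not from the original post): it is mutable in place and enforces the same 0-255 range as a C uint8_t array:

pac = bytearray(b'\x33\x55')
pac[0] ^= 0xFF                 # in-place mutation, like writing through a uint8_t pointer
print(hex(pac[0]))             # 0xcc
try:
    pac[0] = 0x1FF             # values outside 0..255 are rejected
except ValueError as error:
    print(error)               # "byte must be in range(0, 256)"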


Why does this algorithm work so much faster in python than in C++?

I was reading "Algorithms in C++" by Robert Sedgewick and I was given this exercise: rewrite this weighted quick-union with path compression by halving algorithm in another programming language.
The algorithm is used to check if two objects are connected. For example, for entries like 1 - 2, 2 - 3 and 1 - 3, the first two entries create new connections, whereas in the third entry 1 and 3 are already connected, because 3 can be reached from 1 (1 - 2 - 3), so the third entry would not require creating a new connection.
Sorry if the algorithm description is not understandable; English is not my mother tongue.
So here is the algorithm itself:
#include <iostream>
#include <ctime>
using namespace std;

static const int N {100000};

int main()
{
    srand(time(NULL));
    int i;
    int j;
    int id[N];
    int sz[N];    // Stores tree sizes
    int Ncount{}; // Counts the number of new connections
    int Mcount{}; // Counts the number of all attempted connections
    for (i = 0; i < N; i++)
    {
        id[i] = i;
        sz[i] = 1;
    }
    while (Ncount < N - 1)
    {
        i = rand() % N;
        j = rand() % N;
        for (; i != id[i]; i = id[i])
            id[i] = id[id[i]];
        for (; j != id[j]; j = id[j])
            id[j] = id[id[j]];
        Mcount++;
        if (i == j) // Checks if i and j are connected
            continue;
        if (sz[i] < sz[j]) // Smaller tree will be
                           // connected to a bigger one
        {
            id[i] = j;
            sz[j] += sz[i];
        }
        else
        {
            id[j] = i;
            sz[i] += sz[j];
        }
        Ncount++;
    }
    cout << "Mcount: " << Mcount << endl;
    cout << "Ncount: " << Ncount << endl;
    return 0;
}
I know a tiny bit of Python so I chose it for this exercise. This is what I got:
import random

N = 100000
idList = list(range(0, N))
sz = [1] * N
Ncount = 0
Mcount = 0

while Ncount < N - 1:
    i = random.randrange(0, N)
    j = random.randrange(0, N)
    while i is not idList[i]:
        idList[i] = idList[idList[i]]
        i = idList[i]
    while j is not idList[j]:
        idList[j] = idList[idList[j]]
        j = idList[j]
    Mcount += 1
    if i is j:
        continue
    if sz[i] < sz[j]:
        idList[i] = j
        sz[j] += sz[i]
    else:
        idList[j] = i
        sz[i] += sz[j]
    Ncount += 1

print("Mcount: ", Mcount)
print("Ncount: ", Ncount)
But I stumbled upon this interesting nuance: when I set N to 100000 or more, the C++ version appears to be a lot slower than the Python one. It took about 10 seconds for the Python program to complete the task, whereas the C++ version ran so slowly I just had to shut it down.
So my question is: what is the cause of that? Does this happen because of the difference in rand() % N and random.randrange(0, N)? Or have I just done something wrong?
I'd be very grateful if someone could explain this to me, thanks in advance!
Those two programs do different things.
You have to compare numbers in Python with ==.
>>> x=100000
>>> y=100000
>>> x is y
False
There might be other problems; I haven't checked. Have you compared the results of the two programs?
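(Worth knowing: CPython caches small integers, roughly -5 to 256, as an implementation detail, which is why is can appear to work for small values and then fail for larger ones:)

>>> a = 100
>>> b = 100
>>> a is b    # True only because CPython happens to cache small ints
True
>>> x = 100000
>>> y = 100000
>>> x is y    # distinct objects with equal values
False
>>> x == y
True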
As pointed out above, the codes are not equivalent, especially when it comes to the use of is vs ==.
Look at the following Python code:
while i is not idList[i]:
    idList[i] = idList[idList[i]]
    i = idList[i]
This is evaluated 0 or 1 times. Why? Because if the while evaluates to True the first time, then i = idList[i] makes i is idList[i] True on the second pass, because now i is for sure an object taken from idList itself.
The equivalent C++:
for (; i != id[i]; i = id[i])
    id[i] = id[id[i]];
Here the code is checking value equality, not object identity, and the number of times it runs is not fixed to 0 or 1.
So yes, using is vs == makes a huge difference, because in Python is tests object identity rather than value equality.
The comparison of Python and C++ above is therefore like comparing apples and pears.
Note: the short answer to the question is that the Python version runs much faster because it is doing a lot less work than the C++ version.
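For reference, here is a sketch of the corrected Python with is/is not replaced by ==/!= (the lookup loops are factored into a find() helper; otherwise it follows the original):

import random

N = 100000
idList = list(range(N))
sz = [1] * N
Ncount = 0   # number of new connections
Mcount = 0   # number of attempted connections

def find(x):
    # path compression by halving, as in the C++ version
    while x != idList[x]:              # value comparison, matching the C++ condition
        idList[x] = idList[idList[x]]
        x = idList[x]
    return x

while Ncount < N - 1:
    i = find(random.randrange(N))
    j = find(random.randrange(N))
    Mcount += 1
    if i == j:                         # already connected
        continue
    if sz[i] < sz[j]:                  # attach the smaller tree to the bigger one
        idList[i] = j
        sz[j] += sz[i]
    else:
        idList[j] = i
        sz[i] += sz[j]
    Ncount += 1

print("Mcount:", Mcount)
print("Ncount:", Ncount)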

CRC32 hash of python string

Using an existing C example algorithm, I want to generate the correct CRC32 hash for a string in Python. However, I am receiving incorrect results. I mask the result of every operation and attempt to copy the original algorithm's logic. The C code was provided by the same website that has a web-based string hash checking tool, so it is likely to be correct.
Below is a complete Python file, including in its comments the C code it attempts to mimic. All pertinent information is in the file.
P_32 = 0xEDB88320
init = 0xffffffff
_ran = True
tab32 = []

def mask32(n):
    return n & 0xffffffff

def mask8(n):
    return n & 0x000000ff

def mask1(n):
    return n & 0x00000001

def init32():
    for i in range(256):
        crc = mask32(i)
        for j in range(8):
            if (mask1(crc) == 1):
                crc = mask32(mask32(crc >> 1) ^ P_32)
            else:
                crc = mask32(crc >> 1)
        tab32.append(crc)
    global _ran
    _ran = False

def update32(crc, char):
    char = mask8(char)
    t = crc ^ char
    crc = mask32(mask32(crc >> 8) ^ tab32[mask8(t)])
    return crc

def run(string):
    if _ran:
        init32()
    crc = init
    for c in string:
        crc = update32(crc, ord(c))
    print(hex(crc)[2:].upper())

check0 = "The CRC32 of this string is 4A1C449B"
check1 = "123456789" # CBF43926

run(check0) # Produces B5E3BB64
run(check1) # Produces 340BC6D9
# Check CRC-32 on http://www.lammertbies.nl/comm/info/crc-calculation.html#intr
"""
/* http://www.lammertbies.nl/download/lib_crc.zip */
#define P_32 0xEDB88320L
static int crc_tab32_init = FALSE;
static unsigned long crc_tab32[256];
/*******************************************************************\
* *
* unsigned long update_crc_32( unsigned long crc, char c ); *
* *
* The function update_crc_32 calculates a new CRC-32 value *
* based on the previous value of the CRC and the next byte *
* of the data to be checked. *
* *
\*******************************************************************/
unsigned long update_crc_32( unsigned long crc, char c ) {
unsigned long tmp, long_c;
long_c = 0x000000ffL & (unsigned long) c;
if ( ! crc_tab32_init ) init_crc32_tab();
tmp = crc ^ long_c;
crc = (crc >> 8) ^ crc_tab32[ tmp & 0xff ];
return crc;
} /* update_crc_32 */
/*******************************************************************\
* *
* static void init_crc32_tab( void ); *
* *
* The function init_crc32_tab() is used to fill the array *
* for calculation of the CRC-32 with values. *
* *
\*******************************************************************/
static void init_crc32_tab( void ) {
int i, j;
unsigned long crc;
for (i=0; i<256; i++) {
crc = (unsigned long) i;
for (j=0; j<8; j++) {
if ( crc & 0x00000001L ) crc = ( crc >> 1 ) ^ P_32;
else crc = crc >> 1;
}
crc_tab32[i] = crc;
}
crc_tab32_init = TRUE;
} /* init_crc32_tab */
"""
There's just one thing wrong with the current implementation, and the fix is a single line of code at the end of your run function:
crc = crc ^ init
Added to your run function, it looks like this:
def run(string):
    if _ran:
        init32()
    crc = init
    for c in string:
        crc = update32(crc, ord(c))
    crc = crc ^ init
    print(hex(crc)[2:].upper())
This will give you the results you expect. The reason this is necessary is that after you are done updating the CRC-32, the finalization step is to XOR it with 0xFFFFFFFF. Since you only had the init-table and update functions, and not the finalization, you were one step off from the actual CRC.
Another C implementation that is a little more straightforward is this one; it is a little easier to see the whole process. The only thing slightly obscure is that the init poly ~0x0 is the same as 0xFFFFFFFF.
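As a cross-check (not part of the original answer): Python's standard library implements this exact CRC-32 variant (polynomial 0xEDB88320, initial value and final XOR 0xFFFFFFFF), so you can validate the fixed code against zlib:

import zlib

print('%08X' % (zlib.crc32(b'123456789') & 0xFFFFFFFF))   # CBF43926, the standard CRC-32 check value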

Rice coding in Cython

Here is a Python implementation of the well-known Rice coding (a Golomb code with M = 2^k, http://en.wikipedia.org/wiki/Golomb_coding), widely used in compression algorithms.
Unfortunately it is rather slow. What could be the cause of this low speed? (StringIO? The fact that data is written byte after byte?)
What would you recommend using to speed up the encoding? What tricks would you use to speed it up with Cython?
import struct
import StringIO

def put_bit(f, b):
    global buff, filled
    buff = buff | (b << (7-filled))
    if (filled == 7):
        f.write(struct.pack('B',buff))
        buff = 0
        filled = 0
    else:
        filled += 1

def rice_code(f, x, k):
    q = x / (1 << k)
    for i in range(q):
        put_bit(f, 1)
    put_bit(f, 0)
    for i in range(k-1, -1, -1):
        put_bit(f, (x >> i) & 1)

def compress(L, k):
    f = StringIO.StringIO()
    global buff, filled
    buff = 0
    filled = 0
    for x in L: # encode all numbers
        rice_code(f, x, k)
    for i in range(8-filled): # write the last byte (if necessary pad with 1111...)
        put_bit(f, 1)
    return f.getvalue()

if __name__ == '__main__':
    print struct.pack('BBB', 0b00010010, 0b00111001, 0b01111111) # see http://fr.wikipedia.org/wiki/Codage_de_Rice#Exemples
    print compress([1,2,3,10], k = 3)
PS: Should this question be moved to https://codereview.stackexchange.com/?
I would use a C-style buffer instead of StringIO when building the compressed result, and I would attempt to use only C-style temporaries in the encoding loop. I also noticed that you can pre-initialize your buffer to be filled with set ('1') bits, which makes encoding values with a large quotient faster because you can simply skip over those bits in the output buffer. I rewrote the compress function with those things in mind and measured the speed of the result; my version seems to be more than ten times faster than your encoder, but the resulting code is less readable.
Here is my version:
cimport cpython.string
cimport libc.stdlib
cimport libc.string

import struct

cdef int BUFFER_SIZE = 4096

def compress(L, k):
    result = ''

    cdef unsigned cvalue
    cdef char *position
    cdef int bit, nbit
    cdef unsigned q, r
    cdef unsigned ck = k
    cdef unsigned mask = (1 << ck) - 1

    cdef char *buff = <char *>libc.stdlib.malloc(BUFFER_SIZE)
    if buff is NULL:
        raise MemoryError
    try:
        # Initialize the buffer: the space is assumed to contain all set bits
        libc.string.memset(buff, 0xFF, BUFFER_SIZE)
        position = buff
        bit = 7
        for value in L:
            cvalue = value
            q = cvalue >> ck
            r = cvalue & mask

            # Skip ahead some number of pre-set one bits for the quotient
            position += q / 8
            bit -= q % 8
            if bit < 0:
                bit += 8
                position += 1

            # If we have gone off the end of the buffer, extract
            # the result and reset buffer pointers
            while position - buff >= BUFFER_SIZE:
                block = cpython.string.PyString_FromStringAndSize(
                    buff, BUFFER_SIZE)
                result = result + block
                libc.string.memset(buff, 0xFF, BUFFER_SIZE)
                position = position - BUFFER_SIZE

            # Clear the final bit to indicate the end of the quotient
            position[0] = position[0] ^ (1 << bit)
            if bit > 0:
                bit = bit - 1
            else:
                position += 1
                bit = 7

                # Check for buffer overflow
                if position - buff >= BUFFER_SIZE:
                    block = cpython.string.PyString_FromStringAndSize(
                        buff, BUFFER_SIZE)
                    result = result + block
                    libc.string.memset(buff, 0xFF, BUFFER_SIZE)
                    position = buff

            # Encode the remainder bits one by one
            for nbit in xrange(k - 1, -1, -1):
                position[0] = (position[0] & ~(1 << bit)) | \
                    (((r >> nbit) & 1) << bit)
                if bit > 0:
                    bit = bit - 1
                else:
                    position += 1
                    bit = 7

                    # Check for buffer overflow
                    if position - buff >= BUFFER_SIZE:
                        block = cpython.string.PyString_FromStringAndSize(
                            buff, BUFFER_SIZE)
                        result = result + block
                        libc.string.memset(buff, 0xFF, BUFFER_SIZE)
                        position = buff

        # Advance if we have partially used the last byte
        if bit < 7:
            position = position + 1

        # Extract the used portion of the buffer
        block = cpython.string.PyString_FromStringAndSize(
            buff, position - buff)
        result = result + block
        return result
    finally:
        libc.stdlib.free(buff)
def test():
    a = struct.pack('BBB', 0b00010010, 0b00111001, 0b01111111) # see http://fr.wikipedia.org/wiki/Codage_de_Rice#Exemples
    b = compress([1,2,3,10], k = 3)
    assert a == b
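If it helps, one way to build and try the Cython version is with pyximport (a sketch; the module name rice_cython.pyx is my assumption, not from the original answer):

import pyximport
pyximport.install()   # compiles .pyx modules on first import

import rice_cython    # assumes the code above is saved as rice_cython.pyx
print(repr(rice_cython.compress([1, 2, 3, 10], k=3)))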

converting C code to Python by hand

I am trying to convert the following C code to Python. I have no experience in C but a little in Python.
main( int argc, char *argv[])
{
    char a[] = "ds dsf ds sd dsfas";
    unsigned char c;
    int d, j;

    for(d = 0; d < 26; d++)
    {
        printf("d = %d: ", d);
        for (j = 0; j < 21; j++ )
        {
            if( a[j] == ' ')
                c = ' ';
            else
            {
                c = a[j] + d;
                if (c > 'z')
                    c = c - 26;
            }
            printf("%c", c);
        }
        printf("\n");
    }
}
I have managed to get up to this point, where I get a list index out of range exception. Any suggestions?
d=0
a=["ds dsf ds sd dsfas"]
while (d <26):
    print("d = ",d)
    d=d+1
    j=0
    while(j<21):
        if a[j]=='':
            c =''
        else:
            c = answer[j]+str(d)
            if c>'z':
                c=c-26
        j=j+1
        print("%c",c)
I hope this does what your C code is trying to achieve:
#! /usr/bin/python2.7
import string

a = 'ds dsf ds sd dsfas' # input
for d in range (26): # the 26 possible Caesar cipher keys
    shift = string.ascii_lowercase [d:] + string.ascii_lowercase [:d] # rotate the lowercase ASCII letters by offset d
    tt = string.maketrans (string.ascii_lowercase, shift) # convenience function to create a translation table, mapping each character to its encoded counterpart
    print 'd = {}:'.format (d) # print out the key
    print a.translate (tt) # translate the plain text and print it
The loop executes until j becomes 21, but I don't think you have that many elements in the a list. That's why you get the error. I think len(a) is 18. So changing the loop to:
while j < len(a):
    # code
or
while j < 18:
    # code
will clear the error.
See this; it's been explained with comments:
d=0
a=["ds dsf ds sd dsfas"]

# this will print 1, as a is a list object
# whose length is 1, and a[0] is "ds dsf ds sd dsfas"
print len(a)

# and the rest of your program is like this
while (d <26):
    print("d = ",d)
    d=d+1
    #j=0
    # while(j<21): is wrong, as the list length is 1,
    # so it will give a list index out of bound error;
    # in C an array does not check whether the index is within
    # range, so it will not give an out of bound error
    for charValue in a:
        if charValue is '':
            c =''
        else:
            c = charValue +str(d) # here you did not initialize answer[i]
            if c>'z':
                c=c-26
        #j=j+1
        print("%c",c)

Need help porting C function to Python

I'm trying to port a C function which calculates a GPS checksum over to Python. According to the receiving end, I am sometimes miscalculating the checksum, so I must still have a bug in there.
C code is
void ComputeAsciiChecksum(unsigned char *data, unsigned int len,
                          unsigned char *p1, unsigned char *p2)
{
    unsigned char c,h,l;

    assert(Stack_Low());

    c = 0;
    while (len--) {
        c ^= *data++;
    }

    h = (c>>4);
    l = c & 0xf;

    h += '0';
    if (h > '9') {
        h += 'A'-'9'-1;
    }

    l += '0';
    if (l > '9') {
        l += 'A'-'9'-1;
    }

    *p1 = h;
    *p2 = l;
}
My attempt at a Python function is
def calcChecksum(line):
    c = 0
    i = 0
    while i < len(line):
        c ^= ord(line[i]) % 256
        i += 1
    return '%02X' % c
Here is how you can set up a testing environment to diagnose your problem.
Copy the above C function to a file, remove the assert() line, and compile it to a shared library with
gcc -shared -o checksum.so checksum.c
(If you are on Windows or whatever, do the equivalent of the above.)
Copy this code to a Python file:
import ctypes
import random

c = ctypes.CDLL("./checksum.so")
c.ComputeAsciiChecksum.restype = None
c.ComputeAsciiChecksum.argtypes = [ctypes.c_char_p, ctypes.c_uint,
                                   ctypes.c_char_p, ctypes.c_char_p]

def compute_ascii_checksum_c(line):
    p1 = ctypes.create_string_buffer(1)
    p2 = ctypes.create_string_buffer(1)
    c.ComputeAsciiChecksum(line, len(line), p1, p2)
    return p1.value + p2.value

def compute_ascii_checksum_py(line):
    c = 0
    i = 0
    while i < len(line):
        c ^= ord(line[i]) % 256
        i += 1
    return '%02X' % c
Now you have access to both versions of the checksum function and can compare the results. I wasn't able to find any differences.
(BTW, how are you computing the length of the string in C? If you are using strlen(), this would stop at NUL bytes.)
As a side note, your Python version isn't really idiomatic Python. Here are two more idiomatic versions:
def compute_ascii_checksum_py(line):
    checksum = 0
    for c in line:
        checksum ^= ord(c)
    return "%02X" % checksum
or
import operator

def compute_ascii_checksum_py(line):
    return "%02X" % reduce(operator.xor, map(ord, line))
Note that these implementations should do exactly the same as yours.
Have you checked out this cookbook recipe? It hints at what input you should include in "line", returns an asterisk in front of the checksum, and gives one (input, output) data pair that you can use as test data.
Are you sure that "the receiver" is working correctly? Is the problem due to upper vs lower case hex letters?
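If the problem is the input rather than the algorithm: an NMEA checksum only covers the characters between the leading '$' and the trailing '*'. A sketch using the idiomatic function from above (the sentence below is just the usual NMEA example, not your data):

import operator
from functools import reduce   # reduce is a builtin in Python 2

def compute_ascii_checksum_py(line):
    return "%02X" % reduce(operator.xor, map(ord, line))

sentence = "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*"
payload = sentence[sentence.index('$') + 1 : sentence.rindex('*')]
print("*" + compute_ascii_checksum_py(payload))   # asterisk, then two uppercase hex digits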
