How to apply the CRC16 on a received bitstream? (Python)

Could anyone here help me understand how I apply the CRC16 to the received stream?
stream = [0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0]
rest = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1]
https://pastebin.com/7CMtSEE6
I received this APRS message. I also decoded it directly with multimon, and I got the same ASCII message as with my own Python code.
Inside the stream variable is the received message after removing the bitstuffing and before reversing the bit sequence of each byte.
What I tried so far, but which did not work, was to apply the polynomial x^16+x^12+x^5+1 to it. I wanted to get the same "rest" as you can see in the pastebin. This rest was also inside the stream, but for better readability I put it into the other variable. The bits inside rest were just copied and pasted out, so they are in the same order.
So, where in my train of thought am I going wrong?
Be frank with me, and please tell me step by step, because I need to "see" how it works to understand it better. That would be really helpful for me.
So starting first:
stream = [0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, ...
I would now divide the stream by the reverse of 0x11021:
0,1,1,0,0,0,0,1,0,1,0,0,0,1,0,1,0,0,0,0,0,0,1,...
1,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,1
0,0,1,0,0,0,1,1,0,1,0,0,0,0,0,1,0,1
Is that the correct start? Or do I have to reverse each set of bytes in the stream first?
Thank you in advance,
Andreas
This question is related to this one, but is specifically about the CRC16:
How to decode Bell 202 signal? (APRS data from International Space Station)

There are eleven CRC-16s in Greg Cook's catalog with that polynomial, and probably more that aren't in his catalog. So when you say "CRC-16", it could be a lot of different things.
Your binary data can be interpreted as bytes in two different ways, either most significant bit first, or least significant bit first. If the former, then the CRC in "rest" would be the CRC-16/GENIBUS. If the latter, then it is the X-25 CRC-16.
In both cases the CRC register is initialized with all 1's (0xffff), and at the end is inverted, i.e. exclusive-or'ed with 0xffff. It is that initialization and post-processing that you are probably missing in your attempt to replicate the CRC.
Here is some simple C code for the X-25 CRC-16, which would operate on your data assuming least significant bit first:
#include <stddef.h>

/* X-25 CRC-16: reflected polynomial 0x8408, init 0xffff, final xor with 0xffff */
unsigned crc16x25(unsigned char *data, size_t len) {
    unsigned crc = 0xffff;
    while (len--) {
        crc ^= *data++;             /* bring in the next byte */
        for (unsigned k = 0; k < 8; k++)
            crc = crc & 1 ? (crc >> 1) ^ 0x8408 : crc >> 1;
    }
    crc ^= 0xffff;                  /* final inversion */
    return crc;
}
So if you feed that 0x86 0xa2 ... 0x47, you will get 0xec3f as the CRC.
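Since the question is tagged Python, here is a rough Python sketch of the same approach: pack each group of 8 bits of "stream" into a byte least-significant-bit first (which should give 0x86 0xa2 ...), run the same X-25 CRC-16 over those bytes, and compare against the 16 bits in "rest" read the same way. This is an illustration of the steps above, assuming the stream and rest lists from the question are defined, not tested code:

def bits_to_bytes_lsb_first(bits):
    # Pack groups of 8 bits into bytes; the first bit of each group is the least significant.
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for k, bit in enumerate(bits[i:i + 8]):
            byte |= bit << k
        out.append(byte)
    return bytes(out)

def crc16_x25(data):
    # Bit-by-bit X-25 CRC-16: init 0xffff, reflected polynomial 0x8408, final xor 0xffff.
    crc = 0xffff
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xffff

data = bits_to_bytes_lsb_first(stream)                         # should start 0x86 0xa2 ...
fcs = int.from_bytes(bits_to_bytes_lsb_first(rest), "little")  # the FCS carried in "rest"
print(hex(crc16_x25(data)), hex(fcs))                          # both should be 0xec3f if everything lines up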

Related

Conflicting versions of sha256 bit calculation

I'm computing sha256 from two different sources, both run on bit arrays. In Python, I run:
from bitarray import bitarray
from hashlib import sha256
inbits = bitarray([1,0,1,0,1,0,1,0,1,0])
sha = sha256(inbits)
outbits = bitarray()
outbits.frombytes(sha.digest())
The other source is a circuit implementation of sha256 (implemented in circom). I'm just wondering if there are different implementations of sha256, since running the sha256 circuit and the Python code gives different outputs.
Output from circom:
0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0,
0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0,
1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0,
1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0,
0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1,
0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1,
1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1,
1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1,
1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1,
0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1,
0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0,
1, 1, 0, 0, 1, 0, 1, 1, 0]
and the output from Python:
bitarray('1110111001111011010110111001100001000011011101011100000100111011001101011111000000010101110000001001100011100001100011010111011110001100100010110010111111110011111101010111111110101000101111011010010001011101000001101110101110111011011010111100101101111100')
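One thing worth checking on the Python side: hashlib's sha256 works on whole bytes, so hashing a 10-bit bitarray means hashing it padded out to two bytes. A small sketch (assuming bitarray's default big-endian bit order) to see exactly which bytes are being digested, which is what the circuit would also have to hash for the results to match:

from bitarray import bitarray
from hashlib import sha256

inbits = bitarray([1, 0, 1, 0, 1, 0, 1, 0, 1, 0])

# hashlib only sees whole bytes: the 10 bits are padded with zeros up to 2 bytes
print(inbits.tobytes().hex())                  # 'aa80' with the default bit order
print(sha256(inbits.tobytes()).hexdigest())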

Conversion between binary vector and 128 bit number

Is there a way to convert back and forth between a binary vector and a 128-bit number? I have the following binary vector:
import numpy as np
bits = np.array([1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1,
0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0,
1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1,
0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1,
1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1], dtype=np.uint8)
which is an MD5 hash that I am trying to use as a feature for a scikit-learn machine learning classifier (I need to represent the hash as a single feature).
As commented above, NumPy only goes up to 64 bits, but Python has arbitrary-length ints, so a 128-bit int is no problem.
The following goes from a binary np.array to a Python int and back to a binary np.array.
import numpy as np
bits = np.array(
[
1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1,
0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0,
1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1,
0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1,
1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1
],
dtype=np.uint8
)
s = "".join(bits.astype("str")) # move from array to string
n = int(s, 2) # convert to int from string of base 2
print(n)
s = bin(n)[2:] # get binary of int, cut "0b" prefix
np.array(list(s), dtype=np.uint8) # put back in np.array
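A possible alternative (a sketch, not part of the original answer): np.packbits together with int.from_bytes avoids the string round trip and, with a fixed byte length, keeps leading zero bits, assuming the vector length is a multiple of 8:

import numpy as np

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, 128, dtype=np.uint8)   # stand-in for the 128-bit MD5 vector

# bits -> int: pack the 128 bits into 16 bytes, read them as one big-endian integer
n = int.from_bytes(np.packbits(bits).tobytes(), "big")

# int -> bits: write the integer back out as 16 bytes, then unpack into 128 bits
back = np.unpackbits(np.frombuffer(n.to_bytes(16, "big"), dtype=np.uint8))
assert np.array_equal(bits, back)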

How to count 0's & 1's in a matrix using python?

I have a matrix as shown below,
matrix([[0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1,
1, 0, 1, 1, 0, 1, 0, 0, 1],
[0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0,
1, 1, 0, 0, 0, 1, 0, 0, 1],
[0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0,
1, 1, 0, 0, 1, 0, 0, 0, 0],
[1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1,
0, 1, 0, 0, 0, 1, 1, 0, 0],
[1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0,
1, 0, 0, 0, 0, 1, 1, 0, 1],
[0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0,
0, 1, 0, 1, 0, 1, 0, 0, 0],
[0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1,
0, 1, 1, 1, 1, 0, 0, 1, 0],
[0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1,
1, 0, 1, 0, 1, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0,
1, 0, 0, 1, 0, 0, 1, 1, 1],
[1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 1, 1, 0, 1, 0]])
I want to count the 0's and 1's in that matrix.
The code I tried is:
def countZeroes(mat):
    # start from bottom-left
    # corner of the matrix
    N = 10;
    row = N-1;
    col = 0;
    # stores number of
    # zeroes in the matrix
    count = 0;
    while (col < N):
        # move up until
        # you find a 0
        while (mat[row][col]):
            # if zero is not found
            # in current column, we
            # are done
            if (row < 0):
                return count;
            row = row - 1;
        # add 0s present in
        # current column to result
        count = count + (row + 1);
        # move right to
        # next column
        col = col + 1;
    return count;
The above code is to count 0's.
Could you help me in solving this problem?
I would request you to provide me an answer using loops.
Thanks!
Looks like we don't care at all about the positions of the various 0s and 1s - so, we're not counting on a per-row or per-column basis, then.
matrix = [[0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1,
1, 0, 1, 1, 0, 1, 0, 0, 1],
[0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0,
1, 1, 0, 0, 0, 1, 0, 0, 1],
[0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0,
1, 1, 0, 0, 1, 0, 0, 0, 0],
[1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1,
0, 1, 0, 0, 0, 1, 1, 0, 0],
[1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0,
1, 0, 0, 0, 0, 1, 1, 0, 1],
[0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0,
0, 1, 0, 1, 0, 1, 0, 0, 0],
[0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1,
0, 1, 1, 1, 1, 0, 0, 1, 0],
[0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1,
1, 0, 1, 0, 1, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0,
1, 0, 0, 1, 0, 0, 1, 1, 1],
[1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 1, 1, 0, 1, 0]]
The pythonic solution would be to use a comprehension inside a call to the built-in sum() to just count the number of 1s, then subtract that from the size of the matrix:
matrix_height = len(matrix)
matrix_width = len(matrix[0])
num_ones = sum(cell for row in matrix for cell in row)
num_zeroes = (matrix_height * matrix_width) - num_ones
# num_zeroes = 155
However, we can also just use two nested for loops to count the number of zeroes that way:
num_zeroes = 0
for row in matrix:
    for cell in row:
        if cell == 0:
            num_zeroes += 1
# num_zeroes = 155
You can see that the comprehension I demonstrated before is essentially just these two for loops, but in a single line.
You can use sum() and map() to obtain the number of ones. Then subtract that number from the total number of elements to obtain the number of zeroes:
ones = sum(map(sum,matrix))
zeros = sum(map(len,matrix))-ones
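Since the data in the question is actually a numpy matrix, a numpy-based count is another option (a sketch, assuming the matrix from the question is available as "matrix"):

import numpy as np

m = np.asarray(matrix)              # works for a list of lists or a numpy matrix
num_ones = np.count_nonzero(m)      # every entry is 0 or 1, so the nonzero entries are the ones
num_zeroes = m.size - num_ones      # 155 for the matrix above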

only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer arrays are valid indices

I have an array called predictions
array([0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0,
1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1,
1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1,
1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1,
0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1,
1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0,
1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,
0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0,
1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0,
0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0,
1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1,
0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0],
dtype=int64)
I'm trying to convert this array to a DataFrame column but get the error mentioned in the title. I tried converting the elements of the ndarray to an integer, but none of this works.
Xtest["Survived"] = [int(x) for x in predictions]
Xtest["Survived"] = [x.astype(int) for x in predictions]
Xtest["Survived"] = pd.Series(predictions)
Also when I type
type(predictions[1])
I got
numpy.int64
As solved in the comments, the error is thrown because Xtest was an ndarray and not a DataFrame. An ndarray can't be indexed with a string ("Survived"), hence the error message about indices. The problem is unrelated to the predictions data. If Xtest is a DataFrame, any of the lines above should work. They could be simplified to
Xtest["Survived"] = predictions

Morgan fingerprint rdkit

Working on an example, I realized that there are at least two ways of computing Morgan fingerprints for a molecule using RDKit. But using the exact same parameters in both ways, I get different vectors. Am I missing something?
First approach:
info = {}
mol = Chem.MolFromSmiles('C/C1=C\\C[C@H]([C+](C)C)CC/C(C)=C/CC1')
fp = AllChem.GetMorganFingerprintAsBitVect(mol, useChirality=True, radius=2, nBits = 124, bitInfo=info)
vector = np.array(fp)
vector
array([0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1])
Second approach:
morgan_fp_gen = rdFingerprintGenerator.GetMorganGenerator(includeChirality=True, radius=2, fpSize=124)
mol = Chem.MolFromSmiles('C/C1=C\\C[C@H]([C+](C)C)CC/C(C)=C/CC1')
fp = morgan_fp_gen.GetFingerprint(mol)
vector = np.array(fp)
vector
array([0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0,
1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0])
Which are clearly different, even using chirality in both cases.
Besides, is there a way to get the bitInfo from a bit vector using the second approach?
By default the Morgan generator uses "count simulation": extra bits are added to the bit-vector fingerprint so that bit-vector similarities come closer to count-based similarities. If you turn this off by passing useCountSimulation=False, the fingerprints should be equivalent:
mol = Chem.MolFromSmiles('C/C1=C\\C[C@H]([C+](C)C)CC/C(C)=C/CC1')
fp1 = AllChem.GetMorganFingerprintAsBitVect(mol, useChirality=True, radius=2, nBits=124)
vec1 = np.array(fp1)
morgan_fp_gen = rdFingerprintGenerator.GetMorganGenerator(includeChirality=True, radius=2, fpSize=124, useCountSimulation=False)
fp2 = morgan_fp_gen.GetFingerprint(mol)
vec2 = np.array(fp2)
assert np.all(vec1 == vec2) == True
As for the bitInfo, I am not sure this can be done with the second method, although someone may correct me.
