Add mask to a byte-array knowing bits length and starting position - python

I have to apply a bit mask to a CAN-bus payload (8 bytes) in Python 3 to filter out a single signal (a message carries multiple signals). My inputs are:
The length of the signal I want to filter, in bits (think of it as a run of '1's).
The starting bit position of the signal.
The problem is that the signal can start in the middle of a byte and span more than one byte.
For example, I have to filter a signal with starting bit position = 50 and length = 10.
The mask will be byte 6 = (00111111) and byte 7 = (11000000), with all other bytes set to 0.
I've tried building an array of bytes filled with 1's and OR-ing it with an empty 8-byte array to get the mask, and also building the 8-byte array directly, but I can't work out how to shift the bits to the correct starting position.
I also tried the bitstring module and bytearray, but couldn't find a good solution.
Could anyone help?
Thank you very much.
Edit: adding my code, which does not work when the signal starts in the middle of a byte:
my_mask_byte = [0, 0, 0, 0, 0, 0, 0, 0]
message_bit_pos = 50
message_signal_lenght = 10
byte_pos = message_bit_pos // 8
bit_pos = message_bit_pos % 8
for i in range(0, message_signal_lenght):
    if i < 8:
        my_mask_byte[byte_pos + i // 8] = 1 << i + bit_pos | my_mask_byte[byte_pos + i // 8]
    else:
        my_mask_byte[byte_pos + i // 8] = 1 << i - 8 | my_mask_byte[byte_pos + i // 8]
for byte in my_mask_byte:
    print(bin(byte))

The mask should be byte 6 = (00111111) and byte 7 = (11110000); you missed 2 bits, since the length is 10.
You can easily achieve this with numpy:
import numpy as np
message_bit_pos = 50
message_signal_length = 10
mask = np.uint64(0)
while message_signal_length > 0:
    mask |= np.uint64(1 << (64 - message_bit_pos - message_signal_length))
    message_signal_length -= 1
print(f'mask: 0b{mask:064b}')
n = np.uint64(0b0000000000000000011000000000000000000000000000000011111111000000)
print(f'n: 0b{n:064b}')
n &= mask
print(f'n&m: 0b{n:064b}')
output:
mask: 0b0000000000000000000000000000000000000000000000000011111111110000
n: 0b0000000000000000011000000000000000000000000000000011111111000000
n&m: 0b0000000000000000000000000000000000000000000000000011111111000000
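For completeness, here is a numpy-free sketch that builds the same mask as a list of 8 byte values using plain Python ints; build_mask is a hypothetical helper name, and it assumes the bit numbering of the example above (bit 0 is the most significant bit of byte 0):
def build_mask(start_bit, length, n_bytes=8):
    # Set `length` consecutive bits starting at `start_bit` (counted from
    # the MSB of byte 0), then split the value back into individual bytes.
    total_bits = n_bytes * 8
    mask = ((1 << length) - 1) << (total_bits - start_bit - length)
    return list(mask.to_bytes(n_bytes, byteorder='big'))

for b in build_mask(50, 10):
    print(f'{b:08b}')
# bytes 0-5: 00000000, byte 6: 00111111, byte 7: 11110000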

Related

Converting lower and upper 8 bits into one value with python

I have pairs like these: (-102, -56), (123, -56). The first value of each pair represents the lower 8 bits and the second the upper 8 bits, both given in signed decimal form. I need to convert each pair into a single 16-bit value.
I think I was able to convert the (-102, -56) pair with:
l = bin(-102 & 0b1111111111111111)[-8:]
u = bin(-56 & 0b1111111111111111)[-8:]
int(u+l,2)
But when I try to do the same with the (123, -56) pair, I get the following error:
ValueError: invalid literal for int() with base 2: '11001000b1111011'.
I understand that it's due to the different lengths for different values and I need to fill them up to 8 bits.
Am I approaching this completely wrong? What's the best way to do this so it works both on negative and positive values?
UPDATE:
I was able to solve this by:
low_int = 123
up_int = -56
(low_int & 0xFF) | ((up_int & 0xFF) << 8)
You can try shifting the second (upper) value left by 8 bits; see the logic described here: https://stackoverflow.com/a/1857965/8947333
Just guessing:
l, u = -102 & 255, -56 & 255
# shift u 8 bits to the left; the parentheses matter, since + binds tighter than <<
(u << 8) + l
Bitwise operations are fine, but not strictly required.
In the most common 2's complement representation for 8 bits:
-1 signed == 255 unsigned
-2 signed == 254 unsigned
...
-127 signed == 129 unsigned
-128 signed == 128 unsigned
The two absolute values always sum to 256.
Use this to convert negative values:
if b < 0:
    b += 256
and then combine the high and low byte:
value = 256 * hi8 + lo8
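As a cross-check, a small sketch using the standard struct module gives the same results as the bitwise version above, assuming the combined value should be read as an unsigned 16-bit integer:
import struct

def combine(lo, hi):
    # Pack the two signed bytes little-endian (low byte first), then
    # reinterpret the same two bytes as one unsigned 16-bit integer.
    return struct.unpack('<H', struct.pack('<bb', lo, hi))[0]

print(combine(-102, -56))  # 51354
print(combine(123, -56))   # 51323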

Represent 3 values as 1 in numpy array

I am attempting to convert a 3-channel numpy array to a single-channel numpy array. I want to combine all 3 element values into one number using:
x << 16 + y << 8 + z
My code below does that, but it seems to turn a lot of the numbers into zero. Is that correct, or am I doing something wrong? Should those last 2 numbers be zero, or something else?
import cv2
import numpy as np

ar = np.array((
    ((255, 255, 255),),
    ((255, 20, 255),),
    ((0, 255, 255),),   # this becomes zero, is that correct?
    ((22, 10, 12),),    # this becomes zero, is that correct?
), dtype='uint8')
c1, c2, c3 = cv2.split(ar)
single = np.int32(c1) << 16 + np.int32(c2) << 8 + np.int32(c3)
print(single)
print(ar.shape)
[[1069547520]
[ 522240]
[ 0]
[ 0]]
(4, 1, 3)
Add a column of zeros to make the array 4 bytes wide:
ar4 = np.insert(ar, 0, 0, 2)
Then simply view it as a big-endian array of 4-byte integers:
ar4.view('>u4')
This gives:
array([[[16777215]],
[[16717055]],
[[ 65535]],
[[ 1444364]]], dtype=uint32)
The only step here which really takes time is np.insert(), so if you are able to add that extra column while loading your data, the rest of the transformation is basically free (i.e. does not require copying data).
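For what it's worth, the zeros in the original attempt come from operator precedence rather than from the data: Python parses x << 16 + y << 8 + z as x << (16 + y) << (8 + z), so the shift counts blow past the 32-bit width and the results are meaningless. A hedged sketch of the intended packing, with explicit parentheses (and plain numpy indexing standing in for cv2.split):
import numpy as np

ar = np.array((
    ((255, 255, 255),),
    ((255, 20, 255),),
    ((0, 255, 255),),
    ((22, 10, 12),),
), dtype='uint8')
c1, c2, c3 = ar[..., 0], ar[..., 1], ar[..., 2]
single = (np.int32(c1) << 16) + (np.int32(c2) << 8) + np.int32(c3)
print(single)
# [[16777215]
#  [16717055]
#  [   65535]
#  [ 1444364]]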

Python - reading 10 bit integers from a binary file

I have a binary file containing a stream of 10-bit integers. I want to read it and store the values in a list.
It is working with the following code, which reads my_file and fills pixels with integer values:
file = open("my_file", "rb")
pixels = []
new10bitsByte = ""
try:
    byte = file.read(1)
    while byte:
        bits = bin(ord(byte))[2:].rjust(8, '0')
        for bit in reversed(bits):
            new10bitsByte += bit
            if len(new10bitsByte) == 10:
                pixels.append(int(new10bitsByte[::-1], 2))
                new10bitsByte = ""
        byte = file.read(1)
finally:
    file.close()
It doesn't seem very elegant to expand the bytes into a bit string and then reassemble the bits into 10-bit values. Is there a better way to do it?
With 8- or 16-bit integers I could just use file.read(size) and convert the result to an int directly. But here, as each value is stored in 1.25 bytes, I would need something like file.read(1.25)...
Here's a generator that does the bit operations without using text string conversions. Hopefully, it's a little more efficient. :)
To test it, I write all the numbers in range(1024) to a BytesIO stream, which behaves like a binary file.
from io import BytesIO

def tenbitread(f):
    ''' Generate 10 bit (unsigned) integers from a binary file '''
    while True:
        b = f.read(5)
        if len(b) == 0:
            break
        n = int.from_bytes(b, 'big')
        # Split n into 4 10 bit integers
        t = []
        for i in range(4):
            t.append(n & 0x3ff)
            n >>= 10
        yield from reversed(t)
# Make some test data: all the integers in range(1024),
# and save it to a byte stream
buff = BytesIO()
maxi = 1024
n = 0
for i in range(maxi):
    n = (n << 10) | i
    # Convert the 40 bit integer to 5 bytes & write them
    if i % 4 == 3:
        buff.write(n.to_bytes(5, 'big'))
        n = 0

# Rewind the stream so we can read from it
buff.seek(0)

# Read the data in 10 bit chunks
a = list(tenbitread(buff))

# Check it
print(a == list(range(maxi)))
output
True
Doing list(tenbitread(buff)) is the simplest way to turn the generator output into a list, but you can easily iterate over the values instead, e.g.
for v in tenbitread(buff):
or
for i, v in enumerate(tenbitread(buff)):
if you want indices as well as the data values.
Here's a little-endian version of the generator which gives the same results as your code.
def tenbitread(f):
    ''' Generate 10 bit (unsigned) integers from a binary file '''
    while True:
        b = f.read(5)
        if not len(b):
            break
        n = int.from_bytes(b, 'little')
        # Split n into 4 10 bit integers
        for i in range(4):
            yield n & 0x3ff
            n >>= 10
We can improve this version slightly by "un-rolling" that for loop, which lets us get rid of the final masking and shifting operations.
def tenbitread(f):
    ''' Generate 10 bit (unsigned) integers from a binary file '''
    while True:
        b = f.read(5)
        if not len(b):
            break
        n = int.from_bytes(b, 'little')
        # Split n into 4 10 bit integers
        yield n & 0x3ff
        n >>= 10
        yield n & 0x3ff
        n >>= 10
        yield n & 0x3ff
        n >>= 10
        yield n
This should give a little more speed...
As there is no direct way to read a file x bits at a time in Python, we have to read it byte by byte. Following MisterMiyagi's and PM 2Ring's suggestions, I modified my code to read the file in 5-byte chunks (i.e. 40 bits) and then split the resulting string into four 10-bit numbers, instead of looping over the bits individually. It turned out to be twice as fast as my previous code.
file = open("my_file", "rb")
pixels = []
exit_loop = False
try:
    while not exit_loop:
        # Read 5 consecutive bytes into fiveBytesString
        fiveBytesString = ""
        for i in range(5):
            byte = file.read(1)
            if not byte:
                exit_loop = True
                break
            byteString = format(ord(byte), '08b')
            fiveBytesString += byteString[::-1]
        # Split fiveBytesString into 4 10-bit numbers, and add them to pixels
        pixels.extend([int(fiveBytesString[i:i+10][::-1], 2) for i in range(0, 40, 10) if len(fiveBytesString[i:i+10]) > 0])
finally:
    file.close()
Adding a Numpy based solution suitable for unpacking large 10-bit packed byte buffers like the ones you might receive from AVT and FLIR cameras.
This is a 10-bit version of #cyrilgaudefroy's answer to a similar question; there you can also find a Numba alternative capable of yielding an additional speed increase.
import numpy as np

def read_uint10(byte_buf):
    data = np.frombuffer(byte_buf, dtype=np.uint8)
    # 5 bytes contain 4 10-bit pixels (5x8 == 4x10)
    b1, b2, b3, b4, b5 = np.reshape(data, (data.shape[0]//5, 5)).astype(np.uint16).T
    o1 = (b1 << 2) + (b2 >> 6)
    o2 = ((b2 % 64) << 4) + (b3 >> 4)
    o3 = ((b3 % 16) << 6) + (b4 >> 2)
    o4 = ((b4 % 4) << 8) + b5
    unpacked = np.reshape(np.concatenate((o1[:, None], o2[:, None], o3[:, None], o4[:, None]), axis=1), 4*o1.shape[0])
    return unpacked
Reshape can be omitted if returning a buffer instead of a Numpy array:
unpacked = np.concatenate((o1[:, None], o2[:, None], o3[:, None], o4[:, None]), axis=1).tobytes()
Or if image dimensions are known it can be reshaped directly, e.g.:
unpacked = np.reshape(np.concatenate((o1[:, None], o2[:, None], o3[:, None], o4[:, None]), axis=1), (1024, 1024))
If the use of the modulus operator appears confusing, try playing around with:
np.unpackbits(np.array([255%64], dtype=np.uint8))
Edit: It turns out that the Allied Vision Mako-U cameras employ a different ordering than the one I originally suggested above:
o1 = ((b2 % 4) << 8) + b1
o2 = ((b3 % 16) << 6) + (b2 >> 2)
o3 = ((b4 % 64) << 4) + (b3 >> 4)
o4 = (b5 << 2) + (b4 >> 6)
So you might have to test different orders if images come out looking wonky initially for your specific setup.
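A minimal usage sketch under assumed conditions (the file name frame.raw and the 1024x1024 frame size are placeholders; the buffer length must be a multiple of 5 bytes):
# uses read_uint10() as defined above
with open("frame.raw", "rb") as f:
    raw = f.read()

pixels = read_uint10(raw)            # 1-D uint16 array, values 0..1023
image = pixels.reshape(1024, 1024)   # reshape to the assumed frame size
print(image.shape, image.dtype)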

Ascii string of bytes packed into bitmap/bitstring back to string?

I have a string that is packed such that each character was originally an unsigned byte but is stored as 7 bits and then packed into an unsigned byte array. I'm trying to find a quick way to unpack this string in Python. The function I wrote using the bitstring module works, but it is very slow. It seems like something like this should not be so slow, so I'm probably doing it very inefficiently...
This seems like something that is probably trivial, but I just don't know what to use. Maybe there is already a function that will unpack the string?
from bitstring import BitArray

def unpackString(raw):
    msg = ''
    bits = BitArray(bytes=raw)
    mask = BitArray('0b01111111')
    i = 0
    while 1:
        try:
            iByte = (bits[i:i + 8] & mask).int
            # value of 0 denotes a line break
            if iByte == 0:
                msg += '\n'
            elif iByte >= 32 and iByte <= 126:
                msg += chr(iByte)
            i += 7
        except:
            break
    return msg
This took me a while to figure out, as your solution seems to ignore the first bit of data. Given the input byte of 129 (0b10000001) I would expect to see 64 '1000000' printed by the following, but your code produces 1 '0000001' -- ignoring the first bit.
bs = b'\x81' # one byte string, whose value is 129 (0x81)
arr = BitArray(bs)
mask = BitArray('0b01111111')
byte = (arr[0:8] & mask).int
print(byte, repr("{:07b}".format(byte)))
Simplest solution would be to modify your solution to use bitstring.ConstBitStream -- I got an order of magnitude speed increase with the following.
from bitstring import ConstBitStream

def unpack_bitstream(raw):
    num_bytes, remainder = divmod(len(raw) * 8 - 1, 7)
    bitstream = ConstBitStream(bytes=raw, offset=1)  # use offset to ignore leading bit
    msg = b''
    for _ in range(num_bytes):
        byte = bitstream.read("uint:7")
        if not byte:
            msg += b'\n'
        elif 32 <= byte <= 126:
            msg += bytes((byte,))
            # msg += chr(byte) # python 2
    return msg
However, this can be done quite easily using only the standard library. This makes the solution more portable and, in the instances I tried, faster by another order of magnitude (I didn't try the cythonised version of bitstring).
def unpack_bytes(raw, zero_replacement=ord("\n")):
    # use - 1 to ignore leading bit
    num_bytes, remainder = divmod(len(raw) * 8 - 1, 7)
    i = int.from_bytes(raw, byteorder="big")
    # i = int(raw.encode("hex"), 16) # python 2
    if remainder:
        # remainder means there are unused trailing bits, so remove these
        i >>= remainder
    msg = []
    for _ in range(num_bytes):
        byte = i & 127
        if not byte:
            msg.append(zero_replacement)
        elif 32 <= byte <= 126:
            msg.append(byte)
        i >>= 7
    msg.reverse()
    return bytes(msg)
    # return b"".join(chr(c) for c in msg) # python 2
I've used python 3 to create these methods. If you're using python 2 then there are a number of adjustments you'll need to make. I've added these as comments after the line they are intended to replace and marked them python 2.
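To make the assumed layout concrete, here is a hedged round-trip sketch; pack_bytes is a hypothetical helper that produces the format unpack_bytes expects (one leading pad bit, then each character as 7 bits MSB-first, then trailing padding up to a byte boundary):
def pack_bytes(msg):
    # Hypothetical packer for the layout described above.
    i = 0
    for c in msg:
        i = (i << 7) | (c & 0x7f)
    nbits = 1 + 7 * len(msg)        # leading pad bit + payload bits
    total = (nbits + 7) // 8 * 8    # round up to whole bytes
    i <<= total - nbits             # trailing pad bits
    return i.to_bytes(total // 8, byteorder="big")

packed = pack_bytes(b"Hello world!")
print(unpack_bytes(packed))  # b'Hello world!'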

Does Python sha1 take an integer

How can I pass an integer to hashlib.sha1()?
Please see the code below, in which I take the IP as a string and convert it to an integer; hashlib.sha1() then doesn't accept that integer...
import hashlib
import socket
import struct

class blommy(object):
    def __init__(self):
        self.bitarray = [0]*2048

    def hashes(self, ip):
        # convert decimal dotted quad string to long integer
        intip = struct.unpack('>L', socket.inet_aton(ip))[0]
        index = [0, 1]
        hbyte = hashlib.sha1(intip)  # how to take sha1 of integer??
        index[0] = ord(hbyte[0]) | ord(hbyte[1]) << 8
        index[1] = ord(hbyte[2]) | ord(hbyte[3]) << 8
I need to convert this C code to Python. Please advise; some of the code is written above. If I take the IP as an int I cannot compute the sha1, and even if I convert the IP using socket, sha1 doesn't accept it. Suggestions? See comments below.
//fixed parameters
k = 2
m = 256*8
//the filter
byte[m/8] bloom ##

function insertIP(byte[] ip) {
    byte[20] hash = sha1(ip)
    int index1 = hash[0] | hash[1] << 8  # how to in python?
    int index2 = hash[2] | hash[3] << 8
    // truncate index to m (11 bits required)
    index1 %= m  ## ?
    index2 %= m  ## ?
    // set bits at index1 and index2
    bloom[index1 / 8] |= 0x01 << index1 % 8  ## ??
    bloom[index2 / 8] |= 0x01 << index2 % 8  ## ??
}

// insert IP 192.168.1.1 into the filter:
insertIP(byte[4] {192,168,1,1})
The answer to your question is no: you can hash strings (or bytes), but not integers directly. Try something like:
hashlib.sha1(str(1234)).digest()
to get the hash of your integer as a string. (In Python 3, encode the string first: hashlib.sha1(str(1234).encode()).digest().)
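To tie this back to the C snippet, here is a hedged Python 3 sketch of insertIP; the constants follow the C code, and in Python 3 sha1() takes bytes, so the 4 raw address bytes from socket.inet_aton() can be hashed directly:
import hashlib
import socket

m = 256 * 8                # filter size in bits, as in the C code
bloom = bytearray(m // 8)

def insert_ip(ip):
    # digest() is a bytes object in Python 3, so indexing it yields ints
    h = hashlib.sha1(socket.inet_aton(ip)).digest()
    index1 = (h[0] | h[1] << 8) % m
    index2 = (h[2] | h[3] << 8) % m
    # set bits at index1 and index2
    bloom[index1 // 8] |= 1 << (index1 % 8)
    bloom[index2 // 8] |= 1 << (index2 % 8)

# insert IP 192.168.1.1 into the filter:
insert_ip("192.168.1.1")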
