How to concatenate bits in Python

I have two bytes, e.g. 01010101 and 11110000. I need to concatenate the four most significant bits of the second byte ("1111") with the whole first byte, resulting in something like 0000010101011111: four padding zeros, then the whole first byte, and finally the four most significant bits of the second byte.
Any idea?

Try this:
first = 0b01010101
second = 0b11110000
res = (first<<4) | (second>>4)
print(bin(res))
Shifting the first byte left by 4 bits (first<<4) appends 4 zero bits. The second part (second>>4) shifts the 4 least significant bits of the second byte out to the right, discarding them, so only its 4 most significant bits remain. A bitwise OR (| in Python) then combines both partial results.
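For reference, here is a breakdown of the intermediate values (this is just the snippet above with the steps printed separately; bin() drops leading zeros, so the last line uses a width format to show the full 16-bit result):
first = 0b01010101
second = 0b11110000
print(bin(first << 4))    # 0b10101010000 (first byte followed by 4 zero bits)
print(bin(second >> 4))   # 0b1111 (only the 4 MSBs of the second byte remain)
res = (first << 4) | (second >> 4)
print(f"{res:016b}")      # 0000010101011111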
Splitting result back
To answer @JordanMackie's question, you can split res back into two variables, but you will lose the original 4 least significant bits of second.
first = 0b01010101
second = 0b11110000
res = (first<<4) | (second>>4)
print ("res : %16s" %(bin(res)) )
first2 = (res>>4) & 255
second2 = (res&0b1111)<<4
print ("first2 : %16s" % (bin(first2)) )
print ("second2: %16s" % (bin(second2)) )
Output looks like this:
res : 0b10101011111
first2 : 0b1010101
second2: 0b11110000
The first command extracts the original first byte. It shifts right (operator >>) by 4 bits, throwing away the 4 LSBs that came from second. The bitwise AND (&) with 255 then keeps only the lowest 8 bits and discards any higher bits:
first2 = (res>>4) & 255
The second command can restore only the 4 MSBs of the second variable. It selects the 4 LSBs of the result, which belong to second, using a bitwise AND (anything & 1 = anything, anything & 0 = 0); the higher bits are discarded because they are ANDed with 0.
Those 4 bits are then shifted left, so zero bits appear in the 4 least significant positions:
second2 = (res&0b1111)<<4
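To make the loss concrete, here is a small variation (my own addition, not from the original answer): if second's low nibble is not zero, the recovered second2 no longer matches second:
first = 0b01010101
second = 0b11111010              # low nibble is non-zero this time
res = (first << 4) | (second >> 4)
second2 = (res & 0b1111) << 4
print(bin(second2))              # 0b11110000 -- the original low nibble 1010 is gone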

Related

Not able to understand random.sample(range(1,40),3) with sum(1<<i for i in random.sample(range(1,40),3))

I am trying to debug this piece of code and learn a little. I can see that it generates 3 unique values for i, but how does the value of the sum become this big?
If I run it and debug it, it goes something like this. It keeps changing since the values are chosen randomly.
i = 6
i = 26
i = 38
test_num = 274945015872
Output:
100000000000100000000000000000001000000
Why is the value of test_num 274945015872? It then uses this value to generate a 39-bit binary string. How does that work?
Can someone explain?
Here is the code:
import random

test_num = sum(1<<i for i in random.sample(range(1,40),3))
#left to right one bit shifting
print(f"{test_num:039b}")
this is how addition works ...
  0b1000       8
+ 0b0100    +  4
--------    ----
  0b1100      12
each 1<<N creates an integer whose binary value has exactly one '1' bit and the rest zeros
summing them sets all of those one bits (an integer with 3 distinct bits set to '1'). If your indexes happened to be [1,2,3] you would end up with the smallest possible value, 0b1110 (14), but range(1,40) gives 39 possible positions (1..39), so it would be rare to get [1,2,3] as your output
as pointed out in the comments the sum can be replaced with
from functools import reduce
reduce(lambda x, y: x | y, (1 << i for i in ...))
this works because, when the set bit positions are guaranteed to be distinct, NUMERIC ADDITION and BITWISE OR are identical operations
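As a quick illustration (my own sketch, using the example indexes from the question), both forms produce the same number, and the :039b format then renders it as the 39-character binary string the question asks about:
from functools import reduce

indexes = [6, 26, 38]   # the example values from the question
via_sum = sum(1 << i for i in indexes)
via_or = reduce(lambda x, y: x | y, (1 << i for i in indexes))
print(via_sum, via_or)        # 274945015872 274945015872
print(f"{via_sum:039b}")      # the same 39-bit string, with bits 38, 26 and 6 set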

trouble with grabbing tag bits

I'm implementing a direct-mapped cache in Python. Each line in the cache contains 4 bytes. I'm having trouble pulling out the first (in this case) 27 bits, and also the last 5 bits, using bit shifting.
I'm not sure what exactly I'm doing wrong with the bit shifting, but nothing I've tried gives me the bits I want. For now I'm using a sort of "hard-coded" solution: converting the integer stored in the cache to a bit string and using Python's string indexing to grab the first 27 bits, though I do want to know how to do it via bit shifting.
def getTag(d_bytes):
    b = bin(d_bytes)
    b = b[2:]
    return b[0:27]
Is the hard-coded solution I'm referring to.
If the value stored in cache is
0b11010101010101010000100010001
I would like to have a tag of:
110101010101010100001000 (the first 27 bits, as tag = line size - index - offset)
An index of:
100 - next 3 bits following tag
and an offset of:
01 (The last two bits) - last two bits
You can extract the bits by masking and shifting.
To get the lowest n bits, the mask to use is 000011..(n ones)..11. This mask can simply be generated with (1<<n)-1: that expression equals 2^n - 1, whose binary representation is exactly the mask we want.
Now, if you want to extract a bitfield at an arbitrary position in your word, you first shift the word right so the field starts at bit 0, then apply the mask.
So for your problem, you can use
# extract n bits of x starting at position m
def getfield(x, n, m):
    r = x >> m                   # shift right so the lsb of the bitfield is at position 0
    return r & ((1 << n) - 1)    # then mask to extract n bits

lsb27 = getfield(tag, 27, 0)     # get bits x[26:0]
msb5 = getfield(tag, 5, 27)      # get bits x[31:27]
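Applied to the question's 27/3/2 split (treating the stored value as a 32-bit address, which is my assumption, and reusing getfield from above), a possible usage looks like this:
addr = 0b11010101010101010000100010001   # value from the question

tag = getfield(addr, 27, 5)      # top 27 bits
index = getfield(addr, 3, 2)     # next 3 bits
offset = getfield(addr, 2, 0)    # last 2 bits
print(f"{tag:027b} {index:03b} {offset:02b}")
# 000110101010101010100001000 100 01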

decode/revert characters with shift in python

I have a function. The input is a word, and each character is added onto the shifted value of the result.
def magic2(b):
    res = 0
    for c in b:
        res = (res << 8) + ord(c)
        print(res)
    return res
Because it uses shifts, I will lose some data. I want to decode/reverse this to get back the exact letters of the input word.
For example, if the input is "saman", the result is 495555797358, and step by step it would be:
115
29537
7561581
1935764833
495555797358
How can I get back to the input word just with these outputs?
Consider what you're doing: for each character, you shift left by 8 bits, and then add on another 8 bits.[1]
So, how do you undo that? Well, for each character, you grab the rightmost 8 bits, then shift everything else right by 8 bits. How do you know when you're done? When shifting right by 8 bits leaves you with 0, you must have just gotten the leftmost character. So:
def unmagic2(n):
    while n > 0:
        c = chr(n & 0xff)   # 0xff is (1 << 8) - 1
        n = n >> 8
Now you just have to figure out what to do with each c to get your original string back. It's not quite as trivial as you might at first think, because we're getting the leftmost character last, not first. But you should be able to figure it out from here.
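For completeness, one possible way to finish it (my own sketch, not part of the original answer) is to prepend each recovered character, since they come out last-to-first:
def unmagic2(n):
    s = ""
    while n > 0:
        c = chr(n & 0xff)   # take the rightmost 8 bits as a character
        s = c + s           # prepend, because characters are recovered right to left
        n >>= 8
    return s

print(unmagic2(495555797358))   # 'saman'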
[1] If you're using the full gamut of Unicode, this is of course lossy, because you're shifting left by 8 bits and then adding on another 21 bits, so there's no way to reverse that. But I'll assume you're using Latin-1 strings here, or bytes, or Python 2 str.

Python: mask/remove the least significant 2 bits of every 16-bit integer

I want to remove the least significant 2 bits of every 16-bit integer from a bitarray. They're stored like this:
010101**00**10010101101100**00**10101010.....
(The zeroes between the asterisks will be removed. There are two of them every 16 bits (ignoring the very first)).
I can simply eliminate them with a regular for loop checking indexes (the 7th and 8th after every 16 bits).
But... is there another, more pythonic way to do this? I'm thinking of some slice notation or maybe list comprehensions. Perhaps I could divide every number by 4 and encode each one with 14 bits (if there's a way to do that).
You can clear bits quite easily with masking. If you want to clear bits 8 and 7 you can do it like this:
a = int('10010101101100',2)
mask = ~((1 << 7) | (1 << 8))
bin(a&mask)
more information about masking from here!
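If the data is already available as a list of 16-bit integers, the same mask can be applied to each of them; a minimal sketch with made-up example values:
mask = ~((1 << 7) | (1 << 8)) & 0xffff   # clear bits 7 and 8, keep it a 16-bit mask
values = [0b0101010010010101, 0b1011000010101010]   # example 16-bit integers (made up)
cleared = [v & mask for v in values]
print([f"{v:016b}" for v in cleared])
# ['0101010000010101', '1011000000101010']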

Read 14 bit number from 2 bytes

I am trying to decode the run-length-encoding described in this specification here.
it says:
There may be 1, 2, 3, or 4 bytes per count. The first two bits of the first count byte contain 0, 1, 2, or 3, indicating that the count is contained in 1, 2, 3, or 4 bytes. Then the rest of the byte (6 bits) represents the six most significant bits of the count. The next byte, if present, represents decreasing significance.
I have successfully read the first 2 bits for the length, but am unable to figure out how to get the value encoded in the next 14 bits.
Here's how I got the length:
number_of_bytes = (firstbyte >> 6) + 1
It seems that the data is big-endian. I have tried bit shifting and unpacking and repacking with different endiannesses, but I can't get the numbers I expect.
To get the 6 least significant bits, use
firstbyte & 0b111111
so to get the full 14-bit value:
((firstbyte & 0b111111) << 8) + secondbyte
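A small end-to-end sketch (my own, with made-up example bytes) putting the length prefix and the 14-bit count together:
firstbyte = 0b01000011    # '01' prefix -> count uses 2 bytes; 000011 are the 6 MSBs of the count
secondbyte = 0b10101010   # the next, less significant, 8 bits of the count

number_of_bytes = (firstbyte >> 6) + 1              # 2
count = ((firstbyte & 0b111111) << 8) + secondbyte
print(number_of_bytes, count)                       # 2 938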
