Parsing out bit offsets from a hex number in Python - python

My script receives a 64-bit hex number, 0x0000040800000000, as input. I want to take this number and extract bits 39:32.
How can I do this? I have been parsing individual parts of a string and have ended up in a mess.
I was initially converting it into binary and parsing out sections of the string from
command_register = "".join(["{0:04b}".format(int(c,16)) for c in str(command_register)])

You simply need to first convert your hex string into an integer and then use normal maths to extract the bits.
Bit numbering is usually done from the least significant bit, i.e. the furthest right when displayed in binary is bit 0. So to extract bits 39:32 (8 consecutive bits), you would simply need a mask of 0xFF00000000. Simply AND your number and shift the result 32 bits to the right.
Using your hex value and extracting bits 39 to 32 would give you a value of 0x08. The following script shows you how:
hex_string = "0x0000040800000000"
number = int(hex_string, 16)    # Convert to an integer
mask_39_to_32 = 0xFF00000000    # Suitable mask to extract the bits with
print(f"As hex: 0x{number:X}")
print()
print("Bits 39-32:                         xxxxxxxx")
print(f" As binary: {bin(number)[2:]:0>64s}")
print(f"      Mask: {bin(mask_39_to_32)[2:]:0>64s}")
print(f"AND result: {bin(number & mask_39_to_32)[2:]:0>64s}")
print(f"   Shifted: {bin((number & mask_39_to_32) >> 32)[2:]:0>64s}")
print(f" As an int: {(number & mask_39_to_32) >> 32}")
Which displays the following output:
As hex: 0x40800000000

Bits 39-32:                         xxxxxxxx
 As binary: 0000000000000000000001000000100000000000000000000000000000000000
      Mask: 0000000000000000000000001111111100000000000000000000000000000000
AND result: 0000000000000000000000000000100000000000000000000000000000000000
   Shifted: 0000000000000000000000000000000000000000000000000000000000001000
 As an int: 8
The mask needed for 47 to 40 would be:
Bits 47-40:                 xxxxxxxx
 As binary: 0000000000000000111111110000000000000000000000000000000000000000
    As hex: 0xFF0000000000
The use of hexadecimal simply makes it less verbose, and clearer once you get used to it. Groups of 8 bits for masks always end up as 'FF'.
The Wikipedia article on bitwise operations should help you to understand the process.
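The same mask-and-shift steps can be wrapped in a small helper so the mask does not have to be written out by hand for each field. This is only a sketch: the function name extract_bits and its (value, high, low) parameters are assumptions made for illustration and are not part of the answer above.

```python
def extract_bits(value: int, high: int, low: int) -> int:
    """Return bits high..low (inclusive, bit 0 = least significant) of value."""
    if high < low or low < 0:
        raise ValueError("expected high >= low >= 0")
    width = high - low + 1
    mask = (1 << width) - 1        # e.g. width 8 -> 0xFF
    return (value >> low) & mask   # shift the field down to bit 0, then mask


number = int("0x0000040800000000", 16)
print(hex(extract_bits(number, 39, 32)))  # 0x8
print(hex(extract_bits(number, 47, 40)))  # 0x4
```

Shifting first and masking afterwards gives the same result as ANDing with 0xFF00000000 and then shifting, but the mask then depends only on the field width.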

Related

How does Python infer the number of bits to use in a binary NOT?

I'm translating this from C to Python:
int_fast16_t x;
int_fast16_t y = compute_value();
x = ~y;
y = x+1;
I don't think
y = compute_value()
y = (~y) + 1
will work: how would it know on how many bits should the binary NOT be done? (8? 16? 32?).
In the case of C, it is explicit with the type int_fast16_t, but with Python we don't know a number of bits in advance.
How do you do this in Python?
I have read How do I do a bitwise Not operation in Python? but here it's more specific: how does Python infer the number of bits to use in a binary NOT?
Example:
How does Python know if the binary NOT of 3 (0b11) should be done on 4 bits: 0b1100 or on 8 bits: 0b11111100 or on 16 bits: 0b1111111111111100?
A Python int behaves as if it had an infinite number of bits, so inverting them all yields a negative number: ~x equals -x - 1, and for example ~1 is -2. To get a specific number of bits in the result, you have to mask off those bits. (~1)&0xffff is 0xfffe, which is the 16-bit two's complement pattern for -2.
how does Python infer the number of bits to use in a binary NOT?
It uses all bits in the binary representation it has for the operand. CPython stores an integer as a variable-length array of fixed-size digits (see Python integer objects), so there is no fixed width.
How does Python know if the binary NOT of 3 (0b11) should be done on 4 bits: 0b1100 or on 8 bits: 0b11111100 or on 16 bits: 0b1111111111111100?
There is no case where it would be just 4 or 8 bits. The operation behaves as if it were applied to every bit of an infinitely sign-extended two's complement representation, however many internal words the original value needs, which is why ~3 is always -4.
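One way to reproduce the C snippet from the question is to choose the width yourself and mask after every operation. The sketch below assumes exactly 16 bits (in C, int_fast16_t is only guaranteed to be at least that wide), and the names BITS, MASK and to_signed are hypothetical helpers introduced here.

```python
BITS = 16                 # assumed width; int_fast16_t may be wider in C
MASK = (1 << BITS) - 1    # 0xFFFF


def to_signed(value: int, bits: int = BITS) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


y = 3                     # stand-in for compute_value()
x = ~y & MASK             # 16-bit NOT
y = (x + 1) & MASK        # add 1 and wrap to 16 bits

print(hex(x), hex(y))               # 0xfffc 0xfffd
print(to_signed(x), to_signed(y))   # -4 -3
```

Without any masking, (~y) + 1 in Python is simply -y, so the unmasked translation already gives the right signed value as long as overflow at 16 bits is not part of the intended behaviour.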

trouble with grabbing tag bits

I'm implementing a direct-mapped cache in Python. Each line in the cache contains 4 bytes. For some reason I'm having trouble pulling out the first 27 bits (in this case), and also the last 5 bits, using bit shifting.
I'm not sure exactly what I'm doing wrong with the bit shifting, but nothing I've tried gives me the bits I want. For now I'm using a sort of "hard-coded" solution: converting the integer stored in the cache to a bit string and using Python's string indexing to get the first 27 bits. I would still like to know how to do it via bit shifting.
def getTag(d_bytes):
    b = bin(d_bytes)
    b = b[2:]
    return (b[0:27])
Is the hard-coded solution I'm referring to.
If the value stored in cache is
0b11010101010101010000100010001
I would like to have a tag of:
110101010101010100001000 (the first 27 bits, since tag = line size - index - offset)
An index of:
100 (the next 3 bits following the tag)
and an offset of:
01 (the last two bits)
You can extract the bits by masking and shifting.
To get the lowest n bits, the mask to use is 000011..(n times)..11. This mask can be generated with (1<<n)-1, which equals the number 2^n-1, whose binary representation is exactly the mask we want.
Now, if you want to extract a bitfield at any position in your word, first shift it right so that its least significant bit is at position 0, then apply the mask.
So for your problem, you can use
# extract n bits of x starting at position m
def getfield(x, n, m):
    r = x >> m  # shift it right to have lsb of bitfield at position 0
    return r & ((1 << n) - 1)  # then mask to extract n bits
lsb27 = getfield(tag, 27, 0)  # get bits x[26:0]
msb5 = getfield(tag, 5, 27)  # get bits x[31:27]
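Applied to the value from the question, the same helper can split an address into tag, index and offset. This is a sketch under stated assumptions: addresses are taken to be 32 bits wide (which is what makes the tag 27 bits), and the constant names OFFSET_BITS and INDEX_BITS are made up for the example.

```python
OFFSET_BITS = 2   # 4 bytes per line, as in the question
INDEX_BITS = 3    # assumed: 3 index bits, i.e. 8 cache lines


def getfield(x, n, m):
    """Extract n bits of x starting at bit position m (bit 0 = LSB)."""
    return (x >> m) & ((1 << n) - 1)


address = 0b11010101010101010000100010001

offset = getfield(address, OFFSET_BITS, 0)
index = getfield(address, INDEX_BITS, OFFSET_BITS)
tag = address >> (OFFSET_BITS + INDEX_BITS)   # everything above the index

print(f"{tag:027b}")    # the 24 tag digits from the question, zero-padded to 27
print(f"{index:03b}")   # 100
print(f"{offset:02b}")  # 01
```

Slicing the output of bin() goes wrong here because bin() drops leading zeros, so the "first 27 characters" of the string are not the top 27 bits of a 32-bit word. Shifting works on the numeric value and is unaffected by that.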

Convert an Integer into 32bit Binary Python

I am trying to write a program that converts a given integer (limited to the values a 32-bit int can hold) into a 32-bit binary number. For example, 1 should return (000..31 times)1. I have searched the documentation but haven't found a concrete way. I have it working as a string whose number of bits depends on the size of the number. Can anybody suggest a more efficient way to go about this?
'{:032b}'.format(n) where n is an integer. If the binary representation is greater than 32 digits it will expand as necessary:
>>> '{:032b}'.format(100)
'00000000000000000000000001100100'
>>> '{:032b}'.format(8589934591)
'111111111111111111111111111111111'
>>> '{:032b}'.format(8589934591 + 1)
'1000000000000000000000000000000000' # N.B. this is 34 digits long, wider than the requested 32
You can left- or right-shift the integer and convert it to a string for display if you need to.
>>> 1<<1
2
>>> "{:032b}".format(2)
'00000000000000000000000000000010'
Or, if you just need a binary string, consider bin:
>>> bin(4)
'0b100'
Say, for example, the number you want to convert into 32-bit binary is 4, so num = 4.
Here is code that does this; "s" starts as an empty string.
num = 4
s = ""
for i in range(31, -1, -1):
    cur = (num >> i) & 1  # shift bit i down to position 0 and isolate it with AND 1
    s += str(cur)
print(s)  # s contains the 32-bit binary representation of 4
00000000000000000000000000000100
Let's say
a = 4
print(bin(a)) # 0b100
For the output, you can pad with 0s on the left of 100 until the string is 32 digits long to get the 32-bit representation of the integer 4.
If you don't want the 0b prefix, you can slice it off:
print(bin(a)[2:]) # 100
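None of the approaches above says what should happen for negative inputs or for values that do not fit. A minimal sketch, assuming negative numbers should be shown in 32-bit two's complement and out-of-range values rejected (the function name to_bin32 is hypothetical):

```python
def to_bin32(n: int) -> str:
    """Format n as exactly 32 binary digits (two's complement for negatives)."""
    if not -(1 << 31) <= n < (1 << 32):
        raise ValueError("value does not fit in 32 bits")
    return format(n & 0xFFFFFFFF, "032b")   # mask first, then zero-pad


print(to_bin32(1))    # 31 zeros followed by a 1
print(to_bin32(-1))   # 32 ones
```

Masking with 0xFFFFFFFF is what turns a negative Python int into its 32-bit pattern; without it, format() would print a minus sign followed by the magnitude.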

How to create Python fixed length bits?

I wish to do bitwise negation in Python.
My expectation:
negate(0001) => 1110
But Python's ~0b0001 returns -0b10. It seems Python truncates 1110 into -0b10.
How can I keep the leading bits?
Moreover, why does
bin(~0b1) yield -0b10?
How many bits are reserved for that datatype?
Python uses arbitrary-precision arithmetic, so there is no fixed number of bits reserved for an int. bin(~0b1) returns -0b10 because the result is -2, and bin() shows a negative number as a minus sign followed by the binary digits of its magnitude (10).
But we can represent the number as we like using format function, like this
def negate(number, bits=32):
    return format(~number & (2 ** bits - 1), "0{}b".format(bits))
print(negate(1))
# 11111111111111111111111111111110
print(negate(1, bits = 4))
# 1110
Or, as suggested by eryksun,
def negate(number, bits=32):
    return "{:0{}b}".format(~number & (2 ** bits - 1), bits)
Python acts as if its integers have infinitely many bits. As such, if you use ~ on one, the string representation can't start with an infinite number of 1s, or generating the string would never terminate. Instead, Python represents the result as a negative number, as it would be in two's complement. If you want to restrict the integer to a number of bits, & it against an appropriate mask:
>>> bin((~1) & 0b1111)
'0b1110'
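If an integer result is preferred over a formatted string, XOR with the mask is an equivalent way to flip a fixed number of bits. The helper name invert_bits below is an assumption for illustration only.

```python
def invert_bits(number: int, bits: int = 32) -> int:
    """Flip the low `bits` bits of number and return the result as an int."""
    mask = (1 << bits) - 1
    return (number & mask) ^ mask   # XOR with all ones flips every masked bit


print(bin(invert_bits(0b0001, bits=4)))        # 0b1110
print(format(invert_bits(0b0001, 4), "04b"))   # 1110
print(invert_bits(1, 4) == (~1 & 0b1111))      # True
```

Keeping the value as an int and formatting it only for display makes it easy to continue with further bitwise operations.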

Interpreting 5bit subsets within Packed Binary Data Python

I have been having some real trouble with this for a while. I am receiving a string of binary data in Python and I am having trouble unpacking and interpreting only a 5-bit subset (not an entire byte) of the data. Every method that comes to mind seems to fail miserably.
Let's say I have two bytes of packed binary data, and I would like to interpret the first 10 bits of the 16. How could I convert this into 2 integers representing 5 bits each?
Use bitmasks and bitshifting:
>>> example = 0x1234 # Hexadecimal example; 2 bytes, 4660 decimal.
>>> bin(example) # Show as binary digits
'0b1001000110100'
>>> example & 31 # Grab the 5 least significant bits
20
>>> bin(example & 31) # Same, now represented as binary digits
'0b10100'
>>> (example >> 5) & 31 # Grab the next 5 bits (shift right 5 times first)
17
>>> bin(example >> 5 & 31)
'0b10001'
The trick here is to know that 31 is a 5-bit bitmask:
>>> bin(31)
'0b11111'
>>> 0b11111
31
>>> example & 0b11111
20
As you can see you could also just use the 0b binary number literal notation if you find that easier to work with.
See the Python Wiki on bit manipulation for more background info.
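The session above starts from an integer and takes fields from the low end. For data that arrives as bytes, a sketch along the following lines may be closer to the question. It assumes that "the first 10 bits" means the 10 most significant bits and that the two bytes are in big-endian order; the variable names are illustrative.

```python
data = b"\x12\x34"                     # example: two packed bytes

value = int.from_bytes(data, "big")    # 0x1234, first byte most significant

first = (value >> 11) & 0b11111        # bits 15..11
second = (value >> 6) & 0b11111        # bits 10..6

print(first, second)                   # 2 8
```

If the data is little-endian instead, passing "little" to int.from_bytes changes which byte supplies the high bits, and the shift amounts stay the same.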
