Understanding binary addition in python using bit manipulation - python

I am looking at solutions for this question:
Given two integers a and b, return the sum of the two integers without using the operators + and -. (Input Limits: -1000 <= a, b <= 1000)
In all these solutions, I am struggling to understand why the solutions do ~(a ^ mask) when a exceeds 32-bit number max 0x7fffffff when evaluating a + b [see code below].
def getSum(self, a: int, b: int) -> int:
    # 32bit mask
    mask = 0xFFFFFFFF  # 8Fs = all 1s for 32 bits
    while True:
        # Handle addition and carry
        a, b = (a ^ b) & mask, ((a & b) << 1) & mask
        if b == 0:
            break
    max_int = 0x7FFFFFFF
    print("A:", a)
    print("Bin A:", bin(a))
    print("Bin M:", bin(mask))
    print(" A^M:", bin(a ^ mask))
    print("~ A^M:", bin(~(a ^ mask)))
    print(" ~ A:", bin(~a))
    return a if a < max_int else ~(a ^ mask)
I don't get why we need to mask a again when returning answer?
When exiting the loop it was already masked: a = (a ^ b) & mask. So why can't we just do ~a if the 32nd bit is set to 1 for a?
I looked at The meaning of Bit-wise NOT in Python, to understand ~ operation, but did not get it.
Output for a = -12, b = -8. Correctly returns -20:
A: 4294967276
Bin A: 0b11111111111111111111111111101100
Bin M: 0b11111111111111111111111111111111
A^M: 0b10011
~ A^M: -0b10100
~ A: -0b11111111111111111111111111101101
Edit:
Here's a helpful link - Differentiating a 2's complement negative number and the corresponding positive number.
Because of the bounds, we will never have a positive number overflow. (numbers are only between [-1000, 1000]). So we know that when the 32nd bit was set it was because it was a negative number.
So we do ~(a ^ mask) to keep the low 32 bits of a unchanged and add infinite leading 1s to it, so that Python knows this is a negative number.
If you look at Bin A, it is already the 2s complement representation of -20. But right now Python treats it as a positive number. So we need to tell Python to treat it as a negative number using ~(a ^ mask).

You forgot to specify the constraints of the original problem: a and b are in [-1000, 1000]. That code is a port of a C or Java implementation, which assumes a and b are 4-byte signed integers. I'm also not sure you understand the (a ^ b) and (a & b) << 1 code correctly. The former adds the i-th bits of a and b at every bit position, ignoring the carry. The latter collects all those ignored carry bits, shifted to the position where they have to be added.
To the main point, the last operation ~(a ^ mask) deals with negative integers. a ^ mask inverts the lower 32 bits of a and preserves the upper bits. So the bitwise NOT of the result, ~(a ^ mask), preserves the lower 32 bits of a and inverts the upper bits. Because the upper bits (everything above the lower 32 bits) of a are all zero, ~(a ^ mask) just sets all the upper bits to one. It's equivalent to a | ~mask, which is much more readable.
See the following example.
print(~(0xffffffff ^ 0xffffffff)) # This will output -1.
print(~(0xfffffffe ^ 0xffffffff)) # This will output -2.
print(~0xffffffff) # This will output -4294967296(-0x100000000).
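The equivalence above can be checked with a minimal sketch. The names to_signed32 and get_sum are hypothetical (they are not from the question), and the sketch assumes the same 32-bit wraparound as the original code:

```python
MASK = 0xFFFFFFFF      # low 32 bits
MAX_INT = 0x7FFFFFFF   # largest positive 32-bit signed value


def to_signed32(a):
    """Reinterpret a value in [0, 2**32) as a 32-bit two's-complement integer."""
    # a | ~MASK sets every bit above bit 31 to 1, which is how Python
    # represents a negative number (conceptually infinite leading ones).
    return a if a <= MAX_INT else a | ~MASK


def get_sum(a, b):
    """Add two integers without + or -, emulating 32-bit wraparound."""
    while b != 0:
        # XOR adds bit by bit without carries; AND plus shift yields the carries.
        a, b = (a ^ b) & MASK, ((a & b) << 1) & MASK
    # Mask once more so a negative a with b == 0 is handled as well.
    return to_signed32(a & MASK)


print(get_sum(-12, -8))   # -20
```

Masking keeps the loop finite because every carry eventually shifts past bit 31 and is discarded; the final conversion only changes how Python interprets the 32 bits that remain.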

Related

Simulating a C cast in Python [duplicate]

Let's say I have this number i = -6884376.
How do I refer to it as to an unsigned variable?
Something like (unsigned long)i in C.
Assuming:
You have 2's-complement representations in mind; and,
By (unsigned long) you mean unsigned 32-bit integer,
then you just need to add 2**32 (or 1 << 32) to the negative value.
For example, apply this to -1:
>>> -1
-1
>>> _ + 2**32
4294967295L
>>> bin(_)
'0b11111111111111111111111111111111'
Assumption #1 means you want -1 to be viewed as a solid string of 1 bits, and assumption #2 means you want 32 of them.
Nobody but you can say what your hidden assumptions are, though. If, for example, you have 1's-complement representations in mind, then you need to apply the ~ prefix operator instead. Python integers work hard to give the illusion of using an infinitely wide 2's complement representation (like regular 2's complement, but with an infinite number of "sign bits").
And to duplicate what the platform C compiler does, you can use the ctypes module:
>>> import ctypes
>>> ctypes.c_ulong(-1) # stuff Python's -1 into a C unsigned long
c_ulong(4294967295L)
>>> _.value
4294967295L
C's unsigned long happens to be 4 bytes on the box that ran this sample.
To get the value equivalent to your C cast, just bitwise and with the appropriate mask. e.g. if unsigned long is 32 bit:
>>> i = -6884376
>>> i & 0xffffffff
4288082920
or if it is 64 bit:
>>> i & 0xffffffffffffffff
18446744073702667240
Do be aware though that although that gives you the value you would have in C, it is still a signed value, so any subsequent calculations may give a negative result and you'll have to continue to apply the mask to simulate a 32 or 64 bit calculation.
This works because although Python looks like it stores all numbers as sign and magnitude, the bitwise operations are defined as working on two's complement values. C stores integers in twos complement but with a fixed number of bits. Python bitwise operators act on twos complement values but as though they had an infinite number of bits: for positive numbers they extend leftwards to infinity with zeros, but negative numbers extend left with ones. The & operator will change that leftward string of ones into zeros and leave you with just the bits that would have fit into the C value.
Displaying the values in hex may make this clearer (and I rewrote the string of f's as an expression to show we are interested in either 32 or 64 bits):
>>> hex(i)
'-0x690c18'
>>> hex(i & ((1 << 32) - 1))
'0xff96f3e8'
>>> hex(i & ((1 << 64) - 1))
'0xffffffffff96f3e8L'
For a 32 bit value in C, positive numbers go up to 2147483647 (0x7fffffff), and negative numbers have the top bit set going from -1 (0xffffffff) down to -2147483648 (0x80000000). For values that fit entirely in the mask, we can reverse the process in Python by using a smaller mask to remove the sign bit and then subtracting the sign bit:
>>> u = i & ((1 << 32) - 1)
>>> (u & ((1 << 31) - 1)) - (u & (1 << 31))
-6884376
Or for the 64 bit version:
>>> u = 18446744073702667240
>>> (u & ((1 << 63) - 1)) - (u & (1 << 63))
-6884376
This inverse process will leave the value unchanged if the sign bit is 0, but obviously it isn't a true inverse because if you started with a value that wouldn't fit within the mask size then those bits are gone.
Python doesn't have builtin unsigned types. You can use mathematical operations to compute a new int representing the value you would get in C, but there is no "unsigned value" of a Python int. The Python int is an abstraction of an integer value, not a direct access to a fixed-byte-size integer.
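As a rough sketch of that idea, the masking and sign-restoring steps shown earlier can be wrapped in two small helpers. The names as_unsigned and as_signed and the default width of 32 bits are assumptions made for illustration:

```python
def as_unsigned(n, bits=32):
    """Value a C cast to an unsigned type of the given width would produce."""
    return n & ((1 << bits) - 1)


def as_signed(u, bits=32):
    """Reinterpret the low `bits` bits of u as a two's-complement signed value."""
    u &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    # Flipping and then subtracting the sign bit maps [2**(bits-1), 2**bits)
    # onto the negative range and leaves smaller values unchanged.
    return (u ^ sign_bit) - sign_bit


i = -6884376
print(as_unsigned(i))          # 4288082920
print(as_unsigned(i, 64))      # 18446744073702667240
print(as_signed(4288082920))   # -6884376
```

Both functions return ordinary Python ints, so later arithmetic has to be masked again to stay within the chosen width.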
Since version 3.2 :
def unsignedToSigned(n, byte_count):
    return int.from_bytes(n.to_bytes(byte_count, 'little', signed=False), 'little', signed=True)
def signedToUnsigned(n, byte_count):
    return int.from_bytes(n.to_bytes(byte_count, 'little', signed=True), 'little', signed=False)
output :
In [3]: unsignedToSigned(5, 1)
Out[3]: 5
In [4]: signedToUnsigned(5, 1)
Out[4]: 5
In [5]: unsignedToSigned(0xFF, 1)
Out[5]: -1
In [6]: signedToUnsigned(0xFF, 1)
---------------------------------------------------------------------------
OverflowError Traceback (most recent call last)
Input In [6], in <cell line: 1>()
----> 1 signedToUnsigned(0xFF, 1)
Input In [1], in signedToUnsigned(n, byte_count)
4 def signedToUnsigned(n, byte_count):
----> 5 return int.from_bytes(n.to_bytes(byte_count, 'little', signed=True), 'little', signed=False)
OverflowError: int too big to convert
In [7]: signedToUnsigned(-1, 1)
Out[7]: 255
Explanations : to/from_bytes convert to/from bytes, in 2's complement considering the number as one of size byte_count * 8 bits. In C/C++, chances are you should pass 4 or 8 as byte_count for respectively a 32 or 64 bit number (the int type).
I first pack the input number in the format it is supposed to be from (using the signed argument to control signed/unsigned), then unpack to the format we would like it to have been from. And you get the result.
Note the Exception when trying to use fewer bytes than required to represent the number (In [6]). 0xFF is 255 which can't be represented using a C's char type (-128 ≤ n ≤ 127). This is preferable to any other behavior.
You could use the struct Python built-in library:
Encode:
import struct
i = -6884376
print('{0:b}'.format(i))
packed = struct.pack('>l', i) # Packing a long number.
unpacked = struct.unpack('>L', packed)[0] # Unpacking a packed long number to unsigned long
print(unpacked)
print('{0:b}'.format(unpacked))
Out:
-11010010000110000011000
4288082920
11111111100101101111001111101000
Decode:
dec_pack = struct.pack('>L', unpacked) # Packing an unsigned long number.
dec_unpack = struct.unpack('>l', dec_pack)[0] # Unpacking a packed unsigned long number to long (revert action).
print(dec_unpack)
Out:
-6884376
[NOTE]:
> is BigEndian operation.
l is long.
L is unsigned long.
With the > prefix, struct uses standard sizes, in which int (i, I) and long (l, L) are both 4 bytes, so you could use i and I instead of l and L respectively.
[UPDATE]
According to hl037_'s comment, this approach works for int32 but not for int64 or int128, because I used the long format in struct.pack(). For int64, the code only needs the long long format character (q) in struct instead, as follows:
Encode:
i = 9223372036854775807 # the largest int64 number
packed = struct.pack('>q', i) # Packing an int64 number
unpacked = struct.unpack('>Q', packed)[0] # Unpacking signed to unsigned
print(unpacked)
print('{0:b}'.format(unpacked))
Out:
9223372036854775807
111111111111111111111111111111111111111111111111111111111111111
Decoding works the same way. Keep in mind that q is a long long integer (8 bytes) and Q is an unsigned long long.
But in the case of int128, the situation is slightly different as there is no 16-byte operand for struct.pack(). Therefore, you should split your number into two int64.
Here's how it should be:
i = 10000000000000000000000000000000000000 # an int128 number
print(len('{0:b}'.format(i)))
max_int64 = 0xFFFFFFFFFFFFFFFF
packed = struct.pack('>qq', (i >> 64) & max_int64, i & max_int64)
a, b = struct.unpack('>QQ', packed)
unpacked = (a << 64) | b
print(unpacked)
print('{0:b}'.format(unpacked))
Out:
123
10000000000000000000000000000000000000
111100001011110111000010000110101011101101001000110110110010000000011110100001101101010000000000000000000000000000000000000
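If you only need the magnitude of the number, use abs. Be aware that this is not a C-style cast: (unsigned long)-12 on a 32-bit type is 4294967284, whereas abs(-12) is 12.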
a=-12
b=abs(a)
print(b)
Output:
12

How does this bitwise manipulation change a single bit a particular index?

I understand what each of the individual operators does by itself, but I don't know how they interact in order to get the correct results.
def kill(n, k):
    # Takes int n and replaces the bit k from right with 0. Returns the new number
    return n & ~(1<<k-1)
I tested the program with the n as 37 and k as 3.
def b(n,s=""):
    print (str(format(n, 'b')) +" "+ s)
def kill(n, k):
    b(n, "n ")
    b(1<<k-1, "1<<k-1")
    b(~(1<<k-1), "~(1<<k-1) ")
    b( n & ~(1<<k-1)," n & ~(1<<k-1) ")
    return n & ~(1<<k-1)
#TESTS
kill(37, 3)
I decided to run through it step by step.
I printed both the binary representations of both n and ~(1<<k-1) but after that I was lost. ~(1<<k-1) gave me -101 and I'm not sure how to visualize that in binary. Can someone go through it step by step with visualizations for the binary?
All numbers below are printed in binary representation.
Say n has m digits in binary representation. Observe that n & 11...1 (m ones) returns n. Indeed, working bitwise, if x is a one-bit digit (0 or 1), then x & 1 = x.
Moreover, observe that x & 0 = 0. Therefore, to set the kth digit of n to 0, we need to AND (&) it with 11111..1011..1, where the 0 is exactly at the kth position from the end.
Now we need to generate 11111..1011..1. It has all ones except one digit. If we negate it, we get 00000..0100..0, which is what 1 << k-1 produces.
All in all: 1 << k-1 gives us 00000..0100..0. Its negation provides 11111..1011..1. Finally, we do & with the input.
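To visualize the steps for n = 37 and k = 3, here is a small sketch. Python prints ~(1<<k-1) as -101 because format shows a minus sign followed by the magnitude; masking to a fixed width first reveals the two's-complement bit pattern. The helper name bits and the width of 8 are assumptions chosen for display only:

```python
def kill(n, k):
    """Return n with bit k (counted from the right, starting at 1) cleared."""
    return n & ~(1 << (k - 1))


def bits(x, width=8):
    """Show the low `width` bits of x, so negatives appear in two's complement."""
    return format(x & ((1 << width) - 1), "0{}b".format(width))


n, k = 37, 3
print(bits(n))                  # 00100101
print(bits(1 << (k - 1)))       # 00000100
print(bits(~(1 << (k - 1))))    # 11111011
print(bits(kill(n, k)))         # 00100001
print(kill(n, k))               # 33
```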

Need help understanding Python Code

Please help in understanding the logic behind the following function:
def bit_rev (x, b): # reverse b lower bits of x
    return sum (1<<(b-1-i) for i in range (0, b) if (x>>i) & 1)
I took a look at the code and it doesn't seem to account for bits past the b'th bit. So, I added another addition. (Unless all you want is up to the b'th bit):
def bit_rev (x, b): # reverse b lower bits of x
    return (x >> b << b) + sum(1 << (b - 1 - i) for i in range (0, b) if (x >> i) & 1)
Now for the explaining the logic.
x >> b << b
So, let's say we're using 5 (as x) in this example with 2 as b.
The binary representation of 5 is 101. We want to reverse only the last 2 bits, which are 01. The original code reverses them but ignores the bits past b, so it drops the first (from left to right) 1.
Now the first operations:
x >> b in our case is 5 >> 2. 101 moving to the right 2 is 1, since we end up chopping off the 01.
Next we shift it back. We are guaranteed (in Python) to get 0's back from the bit shift, so we now have 100, or 4.
Now for the meaty part,
sum(1 << (b - 1 - i) for i in range (0, b) if (x >> i) & 1)
It would probably be easier to understand this outside of a generator expression, so I rewrote it as a for-loop.
summation = 0
for i in range (0, b):
    if (x >> i) & 1:
        summation += 1 << (b - 1 - i)
Basically, on each iteration we find the reversed position of a bit and then add it to the total (summation).
This code seems to be kind of difficult to understand because there is a lot going on.
Let's start with the for loop itself. for i in range (0, b) iterates over the values 0 through b - 1, that is, over the indices of the b lowest bits (the ones you want to reverse). All the reversing happens later on in the code.
Next we check whether the bit we are going to move is a 1. In binary only 1s add value to the total, so it's logical to ignore all 0s. In if (x >> i) & 1: we bit-shift x to the right by i bits. So, 101 shifted to the right by 1 bit is 10. We then check whether that last bit is a 1 by doing & 1. Basically, what & 1 does in this program is ignore all bits beyond the lowest bit.
The and bitwise operator works as follows:
0101
&1100
=0100
And requires both to be true. Since all bits past 1 would be 0, it effectively ignores the rest.
Now we get a 0 or a 1 from (x >> i) & 1, and Python treats all non-zero integers as True and zero as False. This makes us ignore all bits that are zero.
Next, we add to summation using: summation += 1 << (b - 1 - i). We get the position where the bit is going to end up by using b - 1 - i. Then we shift 1 over to that position and add it to the total.
When adding two binary integers, you can add a 1 at a position in the number just as you would in base 10. So, if I had the number 9000 and wanted a 1 in the hundreds digit, I could do 9000 + 100. That is similar to what we are doing here: we move the 1 to the left in base 2 by using the << operator instead of multiplying by a power of 10. So we are setting the newly reversed bit to whatever the original bit was.
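One way to convince yourself of this logic is to compare the function against a string-based reversal. This is only a sketch: bit_rev_str is a hypothetical helper introduced here as a cross-check, and it assumes b is at least 1.

```python
def bit_rev(x, b):
    """Reverse the b lowest bits of x; higher bits are dropped."""
    return sum(1 << (b - 1 - i) for i in range(b) if (x >> i) & 1)


def bit_rev_str(x, b):
    """Same result via a zero-padded binary string (cross-check only)."""
    low = format(x & ((1 << b) - 1), "0{}b".format(b))
    return int(low[::-1], 2)


print(bit_rev(0b0011, 4))   # 12, i.e. 0b1100
print(bit_rev(5, 2))        # 2: the low bits 01 become 10, the leading 1 is dropped
```

The string version makes the reversal visible directly, while the shift-based version avoids building intermediate strings.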

16 bit hex into 14 bit signed int python?

I get a 16 bit Hex number (so 4 digits) from a sensor and want to convert it into a signed integer so I can actually use it.
There are plenty of code snippets on the internet that get the job done, but with this sensor it is a bit more awkward.
In fact, the number has only 14 bit, the first two (from the left) are irrelevant.
I tried to do it (in Python 3) but failed pretty hard.
Any suggestions how to "cut" the first two digits of the number and then make the rest a signed integer?
The datasheet says that E002 should be -8190 and 1FFE should be +8190.
Thanks a lot!
Let's define a conversion function:
>>> def f(x):
...     r = int(x, 16)
...     return r if r < 2**15 else r - 2**16
...
Now, let's test the function against the values that the datasheet provided:
>>> f('1FFE')
8190
>>> f('E002')
-8190
The usual convention for signed numbers is that a number is negative if the high bit is set and positive if it isn't. Following this convention, '0000' is zero and 'FFFF' is -1. The issue is that int assumes that a number is positive and we have to correct for that:
For any number equal to or less than 0x7FFF, the high bit is unset and the number is positive. Thus we return r=int(x,16) if r<2**15.
For any number r=int(x,16) that is equal to or greater than 0x8000, we return r - 2**16.
While your sensor may only produce 14-bit data, the manufacturer is following the standard convention for 16-bit integers.
Alternative
Instead of converting x to r and testing the value of r, we can directly test whether the high bit in x is set:
>>> def g(x):
...     return int(x, 16) if x[0] in '01234567' else int(x, 16) - 2**16
...
>>> g('1FFE')
8190
>>> g('E002')
-8190
Ignoring the upper bits
Let's suppose that the manufacturer is not following standard conventions and that the upper 2-bits are unreliable. In this case, we can use modulo, %, to remove them and, after adjusting the other constants as appropriate for 14-bit integers, we have:
>>> def h(x):
...     r = int(x, 16) % 2**14
...     return r if r < 2**13 else r - 2**14
...
>>> h('1FFE')
8190
>>> h('E002')
-8190
There is a general algorithm for sign-extending a two's-complement integer value val whose number of bits is nbits (so that the top-most of those bits is the sign bit).
That algorithm is:
treat the value as a non-negative number, and if needed, mask off additional bits
invert the sign bit, still treating the result as a non-negative number
subtract the numeric value of the sign bit considered as a non-negative number, producing as a result, a signed number.
Expressing this algorithm in Python produces:
from __future__ import print_function
def sext(val, nbits):
    assert nbits > 0
    signbit = 1 << (nbits - 1)
    mask = (1 << nbits) - 1
    return ((val & mask) ^ signbit) - signbit
if __name__ == '__main__':
    print('sext(0xe002, 14) =', sext(0xe002, 14))
    print('sext(0x1ffe, 14) =', sext(0x1ffe, 14))
which when run shows the desired results:
sext(0xe002, 14) = -8190
sext(0x1ffe, 14) = 8190

Writing a bit flip algorithm

I am trying to write an algorithm for the following problem.
Problem Statement.
You will be given a list of 32-bits unsigned integers. You are required to output the list of the unsigned integers you get by flipping bits in its binary representation (i.e. unset bits must be set, and set bits must be unset).
The code is as follows:
def bit_flip(a):
    return ~a & 0xffffffff
t = raw_input("")
a = map(int, t.split())
map(lambda x: x ^ 0xffffffff, a)
for i in a:
    print bit_flip(int(i))
The input is
3
2147483647
1
0
The output that I get is
4294967292
whereas the output is supposed to be
2147483648
4294967294
4294967295
I am not sure where I am wrong. The output is close to at least one line of the output, but not the same.
Your output is correct. The 32-bit unsigned complement of 3 is indeed 4294967292. The full output produced by your program is:
4294967292
2147483648
4294967294
4294967295
which corresponds correctly to the numbers 3, 2147483647, 1, 0. You can more easily see this if you write them in hex:
Dec Hex ~Hex ~Dec
3 3 FFFFFFFC 4294967292
2147483647 7FFFFFFF 80000000 2147483648
1 1 FFFFFFFE 4294967294
0 0 FFFFFFFF 4294967295
You seem to be flipping the bits twice, but throwing away the results the first time: map(lambda x: x ^ 0xffffffff, a) returns a list containing the flipped values, which you don't assign to anything. If you changed that line to assign the result back to a, you wouldn't need bit_flip anymore (which also flips the bits, just via a different method):
t = raw_input("")
a = map(int, t.split())
a = map(lambda x: x ^ 0xffffffff, a)
for i in a:
    print i
Or even shorter:
for i in map(lambda x: int(x) ^ 0xffffffff, raw_input("").split()):
    print i
The list returned by the map(lambda x: x ^ 0xffffffff, a) is the answer, but you're not using it.
At least it has the integers with the bits flipped, I'm not sure why the expected output has one fewer element than the input.
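One possible explanation, offered only as an assumption since the problem statement quoted above does not say so, is that the first number in the input is the count of integers that follow rather than a value to flip. Under that assumption, a Python 3 sketch (with the hypothetical names flip32 and flip_all) would look like this:

```python
def flip32(n):
    """Flip all 32 bits of an unsigned 32-bit integer."""
    return n ^ 0xFFFFFFFF


def flip_all(text):
    """Parse whitespace-separated input whose first number is a count (assumed)."""
    count, *values = map(int, text.split())
    return [flip32(v) for v in values[:count]]


print(flip_all("3 2147483647 1 0"))   # [2147483648, 4294967294, 4294967295]
```

With the leading 3 treated as a count, the result matches the three expected lines exactly.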
