Python invert operator - visualizing the negative number - python

In simple words, the definition given for ~n is -n - 1.
For example ~1
1 = 0001
~1 = 1110 (which is -2)
For even numbers
say ~2
2 = 0010
~2 = 1101 (which is the representation of -3 in twos complement)
But the question is
1110 = -2 can be easily visualized as -2 (the right two bits are 10 and the rest all 1)
1101 = -3 can't be visualized like this (going by the above logic it should be -5)
So I am wondering is there a simple way to see and tell from twos complement binary what the negative number represents without doing much calculations .

~n = -n - 1 is equivalent to -n = ~n + 1. That means that to figure out what the negative of a number is, you can invert it (in your head) and add one. Pretend zeros are ones and vice versa, then add one.
Example: Pretend this
1101
is this
0010
then add 1
0011
Thus, 1101 represents -3.

The definition comes because of 2's complement representation of integers. So "really" ~n is n with all the bits flipped. But the result of flipping "all" the bits depends in a sense on how many bits n has in the first place. CPython uses fixed-width integers internally but the language doesn't present them to the programmer, so the only definition that makes sense in general is the arithmetic one ~n = -n - 1. But the motivation for that definition is flipping the bits of a fixed-width 2's complement integer.
1110 = ~2 can be easily visualized as -2
...1110 is not ~2, it is -2. ~2 is ...1101 because 2 is ...010.
1101 = ~3 can't be visualized like this
...1101 is not ~3, it is -3. ~3 is ...1100 because 3 is ...011.
The way I visualize this (when I visualize it at all -- as a mathematician by training I prefer not to consider specific numbers), is to know that in 2's complement, ...10... is always a negated power of two. So ...10 is -2, ...100 is -4, etc.
Then in order to know what for example ...110110 is, it's ...110000 + 110, that is to say -16 + 6, which is -10.
Of course ...110110 is also (by bit flipping) ~1001, that is to say ~9, which by the formula is -9-1, which is also -10. So the system works ;-)

Related

Python - weird properties for bit operations [duplicate]

I was trying to understand bitwise NOT in python.
I tried following:
print('{:b}'.format(~ 0b0101))
print(~ 0b0101)
The output is
-110
-6
I tried to understand the output as follows:
Bitwise negating 0101 gives 1010. With 1 in most significant bit, python interprets it as a negative number in 2's complement form and to get back corresponding decimal it further takes 2's complement of 1010 as follows:
1010
0101 (negating)
0110 (adding 1 to get final value)
So it prints it as -110 which is equivalent to -6.
Am I right with this interpretation?
You're half right..
The value is indeed represented by ~x == -(x+1) (add one and invert), but the explanation of why is a little misleading.
Two's compliment numbers require setting the MSB of the integer, which is a little difficult if the number can be an arbitrary number of bits long (as is the case with python). Internally python keeps a separate number (there are optimizations for short numbers however) that tracks how long the digit is. When you print a negative int using the binary format: f'{-6:b}, it just slaps a negative sign in front of the binary representation of the positive value (one's compliment). Otherwise, how would python determine how many leading one's there should be? Should positive values always have leading zeros to indicate they're positive? Internally it does indeed use two's compliment for the math though.
If we consider signed 8 bit numbers (and display all the digits) in 2's compliment your example becomes:
~ 0000 0101: 5
= 1111 1010: -6
So in short, python is performing correct bitwise negation, however the display of negative binary formatted numbers is misleading.
Python integers are arbitrarily long, so if you invert 0b0101, it would be 1111...11111010. How many ones do you write? Well, a 4-bit twos complement -6 is 1010, and a 32-bit twos complement -6 is 11111111111111111111111111111010. So an arbitrarily long -6 could ideally just be written as -6.
Check what happens when ~5 is masked to look at the bits it represents:
>>> ~5
-6
>>> format(~5 & 0xF,'b')
'1010'
>>> format(~5 & 0xFFFF,'b')
'1111111111111010'
>>> format(~5 & 0xFFFFFFFF,'b')
'11111111111111111111111111111010'
>>> format(~5 & 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,'b')
'11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111010'
A negative decimal representation makes sense and you must mask to limit a representation to a specific number of bits.

Unsigned ints of arbitrary length in python

I am trying to simulate a fixed-point filter implementation. I want to capture low-level hardware features like 2s-complement wraparound/overflow and fixed register widths. Some of the registers widths are set by hardware features at unusual and long widths (ie 72b).
I've been making some progress using the built-in integers. The infinite width is incredibly useful... but I find myself fighting Python a lot because it sometimes wants to interpret a binary as a positive integer, and sometimes it seems to want to interpret a very similar binary as a negative 2's complement. For example:
>> a = 0b11111 # sign-extended -1
>> b = 0b0011
>> print("{0:b}".format(a*b))
5f
>> print("{0:b}".format((a*b)&a)) # Truncate to correct product length
11101 # == -3 in 2s complement. Great!
>> print("{0:b}".format(~((a*b)&a)+1)) # Actually perform the 2's complement
-11101 # Arrrrggggghhh
>> print("{0:b}".format((~((a*b)&a)&a)+1)) # Truncate with extreme prejudice
11 # OK. Fine.
I guess if I think hard enough I can figure out why all this works the way it does, but if I could just do it all in unsigned space without worrying about python adding sign bits it would make things easier and less error-prone. Anyone know if there's a relatively easy way to do this? I considered bit strings, but I have to do a lot of adds & multiplies in this application and built-in integer arithmetic is really useful for that.
~x is literally defined on arbitrary precision integers as -(x+1). It does not do bit arithmetic: ~0 is 255 in one-byte integers, 65535 in two-byte integers, 1023 for 10-bit integers etc; so defining ~ via bit inversion on stretchy integers is useless.
If a defines the fixed width of your integers (with 0b11111 saying you are working with five-bit numbers), bit inversion is as simple as a^x.
print("{0:b}".format(a ^ b)
# => 11100
Two's complement is meanwhile easiest done as a+1-b, or equivalently a^b+1:
print("{0:b}".format((a + 1) - b))
# => 11101
print("{0:b}".format((a ^ b) + 1))
# => 11101
tl;dr: Don't use ~ if you want to stay unsigned.

Get binary mask in python

What is the easiest/fastest way to get a int in python which can be represented by all ones in binary. This is for generating N bit masks.
E.g:
If total number of bits is 4, then binary '1111' or int 15
If total number of bits is 8 then, binary '1111 1111' or 255
I was under the impression ~0 is for that purpose, looks like that is not the case or I am missing something.
it's very easy to achieve with bit shifting:
>>> (1<<4)-1
15
shifting 4 times 1 to the left gives you 0b10000, substract 1 you get 0b1111 aka 15.
(the int("1"*4,2) method is overkill because it involves building a string and parsing it back)

How to create Python fixed length bits?

I wish to do bitwise negation in Python.
My expectation:
negate(0001) => 1110
But Python's ~0b0001 returns -0b10. It seems Python truncate 1110 into -0b10.
How to keep the leading bits?
Moreover, why
bin(~0b1) yields -0b10?
How many bits are reserved for that datatype?
Python uses arbitrary precision arithmetic, so you don't have to worry about the number of bits used. Also it returns -0b10 for bin(~0b1), because it understands that the result is -2 and represents the number as it is 10 and keeps the sign in the front (only for the negative numbers).
But we can represent the number as we like using format function, like this
def negate(number, bits = 32):
return format(~number & 2 ** bits - 1, "0{}b".format(bits))
print(negate(1))
# 11111111111111111111111111111110
print(negate(1, bits = 4))
# 1110
Or, as suggested by eryksun,
def negate(number, bits = 32):
return "{:0{}b}".format(~number & 2 ** bits - 1, bits)
Python acts as if its integers have infinitely many bits. As such, if you use ~ on one, the string representation can't start with an infinite number of 1s, or generating the string would never terminate. Instead, Python chooses to represent it as a negative number as it would be using two's compliment. If you want to restrict the integer to a number of bits, & it against an appropriate mask:
>>> bin((~1) & 0b1111)
'0b1110'

The tilde operator in Python

What's the usage of the tilde operator in Python?
One thing I can think about is do something in both sides of a string or list, such as check if a string is palindromic or not:
def is_palindromic(s):
return all(s[i] == s[~i] for i in range(len(s) / 2))
Any other good usage?
It is a unary operator (taking a single argument) that is borrowed from C, where all data types are just different ways of interpreting bytes. It is the "invert" or "complement" operation, in which all the bits of the input data are reversed.
In Python, for integers, the bits of the twos-complement representation of the integer are reversed (as in b <- b XOR 1 for each individual bit), and the result interpreted again as a twos-complement integer. So for integers, ~x is equivalent to (-x) - 1.
The reified form of the ~ operator is provided as operator.invert. To support this operator in your own class, give it an __invert__(self) method.
>>> import operator
>>> class Foo:
... def __invert__(self):
... print 'invert'
...
>>> x = Foo()
>>> operator.invert(x)
invert
>>> ~x
invert
Any class in which it is meaningful to have a "complement" or "inverse" of an instance that is also an instance of the same class is a possible candidate for the invert operator. However, operator overloading can lead to confusion if misused, so be sure that it really makes sense to do so before supplying an __invert__ method to your class. (Note that byte-strings [ex: '\xff'] do not support this operator, even though it is meaningful to invert all the bits of a byte-string.)
~ is the bitwise complement operator in python which essentially calculates -x - 1
So a table would look like
i ~i
-----
0 -1
1 -2
2 -3
3 -4
4 -5
5 -6
So for i = 0 it would compare s[0] with s[len(s) - 1], for i = 1, s[1] with s[len(s) - 2].
As for your other question, this can be useful for a range of bitwise hacks.
One should note that in the case of array indexing, array[~i] amounts to reversed_array[i]. It can be seen as indexing starting from the end of the array:
[0, 1, 2, 3, 4, 5, 6, 7, 8]
^ ^
i ~i
Besides being a bitwise complement operator, ~ can also help revert a boolean value, though it is not the conventional bool type here, rather you should use numpy.bool_.
This is explained in,
import numpy as np
assert ~np.True_ == np.False_
Reversing logical value can be useful sometimes, e.g., below ~ operator is used to cleanse your dataset and return you a column without NaN.
from numpy import NaN
import pandas as pd
matrix = pd.DataFrame([1,2,3,4,NaN], columns=['Number'], dtype='float64')
# Remove NaN in column 'Number'
matrix['Number'][~matrix['Number'].isnull()]
The only time I've ever used this in practice is with numpy/pandas. For example, with the .isin() dataframe method.
In the docs they show this basic example
>>> df.isin([0, 2])
num_legs num_wings
falcon True True
dog False True
But what if instead you wanted all the rows not in [0, 2]?
>>> ~df.isin([0, 2])
num_legs num_wings
falcon False False
dog True False
I was solving this leetcode problem and I came across this beautiful solution by a user named Zitao Wang.
The problem goes like this for each element in the given array find the product of all the remaining numbers without making use of divison and in O(n) time
The standard solution is:
Pass 1: For all elements compute product of all the elements to the left of it
Pass 2: For all elements compute product of all the elements to the right of it
and then multiplying them for the final answer
His solution uses only one for loop by making use of. He computes the left product and right product on the fly using ~
def productExceptSelf(self, nums):
res = [1]*len(nums)
lprod = 1
rprod = 1
for i in range(len(nums)):
res[i] *= lprod
lprod *= nums[i]
res[~i] *= rprod
rprod *= nums[~i]
return res
Explaining why -x -1 is correct in general (for integers)
Sometimes (example), people are surprised by the mathematical behaviour of the ~ operator. They might reason, for example, that rather than evaluating to -19, the result of ~18 should be 13 (since bin(18) gives '0b10010', inverting the bits would give '0b01101' which represents 13 - right?). Or perhaps they might expect 237 (treating the input as signed 8-bit quantity), or some other positive value corresponding to larger integer sizes (such as the machine word size).
Note, here, that the signed interpretation of the bits 11101101 (which, treated as unsigned, give 237) is... -19. The same will happen for larger numbers of bits. In fact, as long as we use at least 6 bits, and treating the result as signed, we get the same answer: -19.
The mathematical rule - negate, and then subtract one - holds for all inputs, as long as we use enough bits, and treat the result as signed.
And, this being Python, conceptually numbers use an arbitrary number of bits. The implementation will allocate more space automatically, according to what is necessary to represent the number. (For example, if the value would "fit" in one machine word, then only one is used; the data type abstracts the process of sign-extending the number out to infinity.) It also does not have any separate unsigned-integer type; integers simply are signed in Python. (After all, since we aren't in control of the amount of memory used anyway, what's the point in denying access to negative values?)
This breaks intuition for a lot of people coming from a C environment, in which it's arguably best practice to use only unsigned types for bit manipulation and then apply 2s-complement interpretation later (and only if appropriate; if a value is being treated as a group of "flags", then a signed interpretation is unlikely to make sense). Python's implementation of ~, however, is consistent with its other design choices.
How to force unsigned behaviour
If we wanted to get 13, 237 or anything else like that from inverting the bits of 18, we would need some external mechanism to specify how many bits to invert. (Again, 18 conceptually has arbitrarily many leading 0s in its binary representation in an arbitrary number of bits; inverting them would result in something with leading 1s; and interpreting that in 2s complement would give a negative result.)
The simplest approach is to simply mask off those arbitrarily-many bits. To get 13 from inverting 18, we want 5 bits, so we mask with 0b11111, i.e., 31. More generally (and giving the same interface for the original behaviour):
def invert(value, bits=None):
result = ~value
return result if bits is None else (result & ((1 << bits) - 1))
Another way, per Andrew Jenkins' answer at the linked example question, is to XOR directly with the mask. Interestingly enough, we can use XOR to handle the default, arbitrary-precision case. We simply use an arbitrary-sized mask, i.e. an integer that conceptually has an arbitrary number of 1 bits in its binary representation - i.e., -1. Thus:
def invert(value, bits=None):
return value ^ (-1 if bits is None else ((1 << bits) - 1))
However, using XOR like this will give strange results for a negative value - because all those arbitrarily-many set bits "before" (in more-significant positions) the XOR mask weren't cleared:
>>> invert(-19, 5) # notice the result is equal to 18 - 32
-14
it's called Binary One’s Complement (~)
It returns the one’s complement of a number’s binary. It flips the bits. Binary for 2 is 00000010. Its one’s complement is 11111101.
This is binary for -3. So, this results in -3. Similarly, ~1 results in -2.
~-3
Output : 2
Again, one’s complement of -3 is 2.
This is minor usage is tilde...
def split_train_test_by_id(data, test_ratio, id_column):
ids = data[id_column]
in_test_set = ids.apply(lambda id_: test_set_check(id_, test_ratio))
return data.loc[~in_test_set], data.loc[in_test_set]
the code above is from "Hands On Machine Learning"
you use tilde (~ sign) as alternative to - sign index marker
just like you use minus - is for integer index
ex)
array = [1,2,3,4,5,6]
print(array[-1])
is the samething as
print(array[~1])

Categories