Get binary mask in python - python

What is the easiest/fastest way to get an int in Python whose binary representation is all ones? This is for generating N-bit masks.
E.g:
If total number of bits is 4, then binary '1111' or int 15
If total number of bits is 8 then, binary '1111 1111' or 255
I was under the impression ~0 is for that purpose, looks like that is not the case or I am missing something.

It's very easy to achieve with bit shifting:
>>> (1<<4)-1
15
Shifting 1 to the left 4 times gives you 0b10000; subtract 1 and you get 0b1111, aka 15.
(The int("1"*4,2) method is overkill because it involves building a string and parsing it back.)
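The shift trick generalizes to any width. Below is a minimal sketch; `bit_mask` is a hypothetical helper name, not a built-in. It also illustrates why `~0` alone does not give the mask: Python ints have no fixed width, so `~0` is simply -1 until it is ANDed with a mask of the desired size.

```python
def bit_mask(n_bits):
    """Return an int whose lowest n_bits bits are all 1 (hypothetical helper)."""
    if n_bits < 0:
        raise ValueError("n_bits must be non-negative")
    return (1 << n_bits) - 1

print(bit_mask(4))         # 15
print(bin(bit_mask(8)))    # 0b11111111
print(~0)                  # -1: Python ints have no fixed width
print(~0 & bit_mask(8))    # 255
```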

Related

Python - weird properties for bit operations [duplicate]

I was trying to understand bitwise NOT in python.
I tried following:
print('{:b}'.format(~ 0b0101))
print(~ 0b0101)
The output is
-110
-6
I tried to understand the output as follows:
Bitwise negating 0101 gives 1010. With 1 in most significant bit, python interprets it as a negative number in 2's complement form and to get back corresponding decimal it further takes 2's complement of 1010 as follows:
1010
0101 (negating)
0110 (adding 1 to get final value)
So it prints it as -110 which is equivalent to -6.
Am I right with this interpretation?
You're half right.
The value is indeed given by ~x == -(x+1) (add one and negate), but the explanation of why is a little misleading.
Two's complement numbers require setting the MSB of the integer, which is difficult when the number can be arbitrarily many bits long (as is the case in Python). Internally Python keeps a separate count (with optimizations for short numbers) that tracks how many digits the number has. When you print a negative int using the binary format, f'{-6:b}', it just puts a minus sign in front of the binary representation of the absolute value (sign-magnitude notation, not two's complement). Otherwise, how would Python determine how many leading ones there should be? Should positive values always have leading zeros to indicate they're positive? For bitwise operations, though, Python behaves as if the number were stored in two's complement with an infinite number of sign bits.
If we consider signed 8-bit numbers (and display all the digits) in two's complement, your example becomes:
~ 0000 0101: 5
= 1111 1010: -6
So in short, Python is performing correct bitwise negation; it is the display of negative numbers in binary format that is misleading.
Python integers are arbitrarily long, so if you invert 0b0101, the result would be 1111...11111010. How many ones do you write? Well, a 4-bit two's complement -6 is 1010, and a 32-bit two's complement -6 is 11111111111111111111111111111010. So an arbitrarily long -6 is best written simply as -6.
Check what happens when ~5 is masked to look at the bits it represents:
>>> ~5
-6
>>> format(~5 & 0xF,'b')
'1010'
>>> format(~5 & 0xFFFF,'b')
'1111111111111010'
>>> format(~5 & 0xFFFFFFFF,'b')
'11111111111111111111111111111010'
>>> format(~5 & 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,'b')
'11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111010'
A negative decimal representation makes sense and you must mask to limit a representation to a specific number of bits.
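The masking step can be wrapped in a small function. This is only a sketch; `twos_complement_bits` is a hypothetical name, and the width is whatever fixed size you want to display.

```python
def twos_complement_bits(value, width):
    """Format value as a width-bit two's complement string (hypothetical helper)."""
    mask = (1 << width) - 1          # width ones, e.g. 0xFF for width=8
    return format(value & mask, "0{}b".format(width))

print(twos_complement_bits(~5, 4))   # 1010
print(twos_complement_bits(~5, 8))   # 11111010
print(twos_complement_bits(5, 8))    # 00000101
```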

Python Integer Bit masking while keeping sign

I am trying to replicate/validate bitwise arithmetic logic in Python.
I have cases where bits from the absolute value are truncated (no matter if they are 0 or 1) while the sign is preserved. This happens in various bit length representations.
How to implement truncating bits from the absolute value also for negative integers elegantly in Python ?
For positive integers I can easily apply a bit mask:
n=7; nMod=n & 0b11; print(nMod) #truncate MSB
#expected and actual: 3
For negative integers it does not work, probably due to the internal 2's complement and variable number of bits representation:
n=-7; nMod=n & 0b11; print(nMod)
#expected:-3; actual: 1
One could certainly analyze the absolute value, determine which bits are actually Ones and remove them by shifting left and right but my wish would be a simple one-liner like for the positive numbers.
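One possible approach, assuming the intended semantics are "mask the absolute value, then put the sign back", is sketched below. `mask_magnitude` is a hypothetical helper name, not something from the standard library.

```python
def mask_magnitude(n, mask):
    """Apply mask to abs(n) and restore the sign of n (hypothetical helper)."""
    magnitude = abs(n) & mask
    return -magnitude if n < 0 else magnitude

print(mask_magnitude(7, 0b11))    # 3
print(mask_magnitude(-7, 0b11))   # -3
```

As a one-liner this is `(abs(n) & mask) * (-1 if n < 0 else 1)`.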

Convert an Integer into 32bit Binary Python

I am trying to write a program that converts a given integer (limited to the values a 32-bit int can hold) into a 32-bit binary number. For example, 1 should return 31 zeros followed by a 1. I have searched the documentation but haven't found a concrete way. I got it working with the number of bits depending on the size of the number, but as a string. Can anybody suggest a more efficient way to go about this?
'{:032b}'.format(n) where n is an integer. If the binary representation is greater than 32 digits it will expand as necessary:
>>> '{:032b}'.format(100)
'00000000000000000000000001100100'
>>> '{:032b}'.format(8589934591)
'111111111111111111111111111111111'
>>> '{:032b}'.format(8589934591 + 1)
'1000000000000000000000000000000000' # N.B. this is 34 digits long, more than the 32 requested
You can just left- or right-shift the integer and convert it to a string for display if you need to.
>>> 1<<1
2
>>> "{:032b}".format(2)
'00000000000000000000000000000010'
Or, if you just need a binary string, consider bin:
>>> bin(4)
'0b100'
Say, for example, the number you want to convert into 32-bit binary is 4, so num = 4.
Here is code that does this; "s" starts as the empty string.
num = 4
s = ""
for i in range(31, -1, -1):
    cur = (num >> i) & 1  # shift bit i down to position 0 and isolate it
    s += str(cur)
print(s)  # s contains the 32-bit binary representation of 4
00000000000000000000000000000100
Let's say
a = 4
print(bin(a)) # 0b100
For the output, you can pad with 0s on the left (the MSB side) of 100 to get the 32-bit representation of the integer 4.
If you don't want the 0b prefix, you can slice it off:
print(bin(a)[2:]) # 100
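None of the answers above cover negative inputs, where '{:032b}' prints a minus sign instead of two's complement bits. A minimal sketch that handles them by masking first follows; `to_bin32` is a hypothetical name, and the accepted range (signed or unsigned 32-bit values) is an assumption about what the question needs.

```python
def to_bin32(n):
    """Return n as a 32-character binary string (hypothetical helper).

    Negative values are shown in 32-bit two's complement.
    """
    # Assumption: accept both signed (-2**31 ..) and unsigned (.. 2**32 - 1) values.
    if not -(1 << 31) <= n < (1 << 32):
        raise ValueError("value does not fit in 32 bits")
    return format(n & 0xFFFFFFFF, "032b")

print(to_bin32(4))
print(to_bin32(-1))
```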

Python: mask/remove the least significant 2 bits of every 16-bit integer

I want to remove the least significant 2 bits of every 16-bit integer from a bitarray. They're stored like this:
010101**00**10010101101100**00**10101010.....
(The zeroes between the asterisks will be removed. There are two of them every 16 bits (ignoring the very first)).
I can simply eliminate them with a regular for loop checking indexes (the 7th and 8th after every 16 bits).
But... is there another more pythonic way to do this? I'm thinking about some slice notation or maybe comprehension lists. Perhaps I could divide every number by 4 and encode every one with 14 bits (if there's a way to do that).
You can clear bits quite easily with masking. If you want to clear bits 8 and 7 you can do it like this:
a = int('10010101101100',2)
mask = ~((1 << 7) | (1 << 8))
bin(a&mask)
more information about masking from here!
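Clearing bits leaves them in place as zeros. If the goal is to actually drop the two least significant bits and re-encode each value in 14 bits, as the question suggests, a right shift by 2 does it. The sketch below assumes the data is a plain string of '0'/'1' characters whose length is a multiple of 16, with the low bits last in each field; `drop_low_two_bits` is a hypothetical name.

```python
def drop_low_two_bits(bits):
    """Drop the 2 least significant bits of each 16-bit field.

    bits is assumed to be a string of '0'/'1' characters whose length is a
    multiple of 16; each field is re-encoded in 14 bits.
    """
    if len(bits) % 16:
        raise ValueError("length must be a multiple of 16")
    fields = (int(bits[i:i + 16], 2) for i in range(0, len(bits), 16))
    return "".join(format(value >> 2, "014b") for value in fields)

sample = "0101011001010100" + "1111000011110011"
print(drop_low_two_bits(sample))
```

Under these assumptions the same result can be had with pure slicing, `bits[i:i + 14]` for each field, but the shift makes the "divide by 4" intent explicit.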

Safest way to convert float to integer in python?

Python's math module contains handy functions like floor & ceil. These functions take a floating point number and return the nearest integer below or above it. However, in Python 2 these functions return the answer as a floating point number. For example:
import math
f=math.floor(2.3)
Now f returns:
2.0
What is the safest way to get an integer out of this float, without running the risk of rounding errors (for example if the float is the equivalent of 1.99999) or perhaps I should use another function altogether?
All integers that can be represented by floating point numbers have an exact representation. So you can safely use int on the result. Inexact representations occur only if you are trying to represent a rational number with a denominator that is not a power of two.
That this works is not trivial at all! It's a property of the IEEE floating point representation that int∘floor = ⌊⋅⌋ if the magnitude of the numbers in question is small enough, but different representations are possible where int(floor(2.3)) might be 1.
To quote from Wikipedia,
Any integer with absolute value less than or equal to 2^24 can be exactly represented in the single precision format, and any integer with absolute value less than or equal to 2^53 can be exactly represented in the double precision format.
Using int(your_non_integer_number) will do it.
print(int(2.3)) # "2"
print(int(math.sqrt(5))) # "2"
You could use the round function. If you omit the second parameter (the number of decimal digits), then I think you will get the behavior you want. Note that Python 3 rounds exact halves to the nearest even integer, so round(2.5) is 2.
IDLE output.
>>> round(2.99999999999)
3
>>> round(2.6)
3
>>> round(2.5)
2
>>> round(2.4)
2
Combining two of the previous results, we have:
int(round(some_float))
This converts a float to an integer fairly dependably.
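The candidates discussed so far do not agree on every input, so it helps to see them side by side. A small comparison follows, assuming Python 3 (where math.floor, math.ceil and one-argument round all return int); the `results` name is just for illustration.

```python
import math

# For each sample value: (int, floor, ceil, round)
results = {
    x: (int(x), math.floor(x), math.ceil(x), round(x))
    for x in (2.5, 3.5, -2.7)
}

for x, row in results.items():
    print(x, row)
# 2.5 (2, 2, 3, 2)     round() sends exact halves to the even neighbour
# 3.5 (3, 3, 4, 4)
# -2.7 (-2, -3, -2, -3)   int() truncates toward zero, floor() goes down
```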
That this works is not trivial at all! It's a property of the IEEE floating point representation that int∘floor = ⌊⋅⌋ if the magnitude of the numbers in question is small enough, but different representations are possible where int(floor(2.3)) might be 1.
This post explains why it works in that range.
In a double, you can represent 32-bit integers without any problems. There cannot be any rounding issues. More precisely, doubles can represent all integers between and including 2^53 and -2^53.
Short explanation: A double can store up to 53 binary digits. When you require more, the number is padded with zeroes on the right.
It follows that 53 ones is the largest number that can be stored without padding. Naturally, all (integer) numbers requiring fewer digits can be stored accurately.
Adding one to 111(omitted)111 (53 ones) yields 100...000 (a 1 followed by 53 zeroes). Since we can store only 53 digits, the rightmost zero is padding.
This is where 2^53 comes from.
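The boundary is easy to check from Python itself, since Python floats are IEEE-754 doubles on common platforms. A quick sketch (the `limit` name is just for illustration):

```python
limit = 2 ** 53

print(float(limit - 1) == limit - 1)      # True: 53 ones fit exactly
print(float(limit) == float(limit + 1))   # True: 2^53 + 1 rounds back to 2^53
print(int(float(limit + 1)) == limit)     # True: the +1 has been lost
```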
More detail: We need to consider how IEEE-754 floating point works.
  1 bit     11 / 8       52 / 23      # bits (double / single precision)
[ sign  |  exponent  |  mantissa  ]
The number is then calculated as follows (excluding special cases that are irrelevant here):
(-1)^sign × 1.mantissa × 2^(exponent - bias)
where bias = 2^(e - 1) - 1 and e is the number of exponent bits, i.e. 1023 and 127 for double and single precision respectively.
Knowing that multiplying by 2^X simply shifts all bits X places to the left, it's easy to see that any integer must have all the mantissa bits that end up to the right of the decimal point set to zero.
Any integer except zero has the following form in binary:
1x...x where the x-es represent the bits to the right of the MSB (most significant bit).
Because we excluded zero, there will always be an MSB that is one, which is why it's not stored. To store the integer, we must bring it into the aforementioned form: (-1)^sign × 1.mantissa × 2^(exponent - bias).
That's the same as shifting the bits across the decimal point until only the MSB remains to the left of the decimal point. All the bits to the right of the decimal point are then stored in the mantissa.
From this, we can see that we can store at most 52 binary digits apart from the MSB.
It follows that the highest number where all bits are explicitly stored is
111(omitted)111. That's 53 ones (52 + the implicit 1) in the case of doubles.
For this, we need to set the exponent such that the decimal point is shifted 52 places. If we were to increase the exponent by one, we could not know the digit immediately to the left of the decimal point.
111(omitted)111x.
By convention, it's 0. Setting the entire mantissa to zero, we receive the following number:
100(omitted)00x. = 100(omitted)000.
That's a 1 followed by 53 zeroes, 52 stored and 1 added due to the exponent.
It represents 2^53, which marks the boundary (both negative and positive) between which we can accurately represent all integers. If we wanted to add one to 2^53, we would have to set the implicit zero (denoted by the x) to one, but that's impossible.
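The layout described above can be inspected directly with the standard struct module. The following is a sketch, assuming the platform's Python float is an IEEE-754 double; `double_fields` is a hypothetical helper name.

```python
import struct

def double_fields(x):
    """Split a Python float into (sign, biased exponent, mantissa) bit fields."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = raw >> 63                    # 1 bit
    exponent = (raw >> 52) & 0x7FF      # 11 bits, still biased by 1023
    mantissa = raw & ((1 << 52) - 1)    # 52 explicitly stored bits
    return sign, exponent, mantissa

sign, exponent, mantissa = double_fields(float(2 ** 53))
print(sign, exponent - 1023, mantissa)   # 0 53 0
```

For 2^53 the stored mantissa is all zeros and the unbiased exponent is 53, matching the "1 followed by 53 zeroes" picture.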
If you need to convert a string float to an int you can use this method.
Example: '38.0' to 38
In order to convert this to an int you can cast it as a float then an int. This will also work for float strings or integer strings.
>>> int(float('38.0'))
38
>>> int(float('38'))
38
Note: This will strip any numbers after the decimal.
>>> int(float('38.2'))
38
math.floor will always return an integer number and thus int(math.floor(some_float)) will never introduce rounding errors.
The rounding error might already be introduced in math.floor(some_large_float), though, or even when storing a large number in a float in the first place. (Large numbers may lose precision when stored in floats.)
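A short illustration of that last point: the precision is lost when the large value is stored as a float, not in the floor or int step. The value 1e23 is an arbitrary example chosen because 10**23 has no exact double representation.

```python
import math

big = 1e23  # the nearest double to 10**23 is not exactly 10**23

print(int(big) == 10 ** 23)               # False: error came from storing the float
print(int(math.floor(big)) == int(big))   # True: floor and int add no further error
```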
Another code sample that converts a real/float to an integer using variables.
"vel" is a real/float number that is rounded up to the next integer, "newvel".
import math, os, sys
import arcpy.da
.
.
with arcpy.da.SearchCursor(densifybkp, [floseg, vel, Length]) as cursor:
    for row in cursor:
        curvel = float(row[1])
        newvel = int(math.ceil(curvel))  # round up, then convert to int
Since you're asking for the 'safest' way, I'll provide another answer other than the top answer.
An easy way to make sure you don't lose any precision is to check if the values would be equal after you convert them.
if int(some_value) == some_value:
    some_value = int(some_value)
If the float is 1.0 for example, 1.0 is equal to 1. So the conversion to int will execute. And if the float is 1.1, int(1.1) equates to 1, and 1.1 != 1. So the value will remain a float and you won't lose any precision.
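The same idea can be written with float.is_integer(), which also sidesteps the exceptions that int() raises for infinity and NaN. A minimal sketch; `to_int_if_exact` is a hypothetical helper name.

```python
def to_int_if_exact(value):
    """Return int(value) when value is a whole-number float, else value unchanged."""
    # is_integer() is False for inf and nan, so int() is never called on them.
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value

print(to_int_if_exact(1.0))            # 1
print(to_int_if_exact(1.1))            # 1.1
print(to_int_if_exact(float("inf")))   # inf
```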
