I'm trying to perform some 64-bit additions, e.g.:
a = 0x15151515
b = 0xFFFFFFFF
c = a + b
print hex(c)
My problem is that the above outputs:
0x115151514
I would like the addition to be 64-bit and disregard the overflow, i.e. the expected output would be:
0x15151514
NB: I'm not looking to truncate the string output, I would like c = 0x15151514. I'm trying to simulate some 64-bit register operations.
Then just use the bitwise AND operator &:
c = 0xFFFFFFFF & (a+b)
By the way, these are 32-bit values, not 64-bit values (count the Fs: every two hex digits are one byte == 8 bits; there are eight Fs, so four bytes, so 32 bits).
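For an actual 64-bit simulation the same idea applies with a wider mask; a minimal sketch (the constant name and values are mine):
MASK64 = (1 << 64) - 1  # i.e. 0xFFFFFFFFFFFFFFFF

a = 0x1515151515151515
b = 0xFFFFFFFFFFFFFFFF
c = (a + b) & MASK64  # wraps around like a 64-bit register would
print(hex(c))         # 0x1515151515151514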
Another solution using numpy:
import numpy as np
a = np.array([0x15151515], dtype=np.uint32) # use np.uint64 for 64 bits operations
b = np.array([0xFFFFFFFF], dtype=np.uint32)
c = a + b
print(c, c.dtype)
[353703188] uint32
Pros: more readable than a bit mask when there are many operations, especially when other operations such as division are involved; in that case you cannot just apply the mask to the final result, you also need to apply it to the intermediate results, e.g. (0xFFFFFFFF + 1) // 2 (see the example below).
Cons: adds a dependency and requires care with literals:
c = a + 2**32 # 2**32 does not fit in np.uint32 so numpy changes the type of c
print(c, c.dtype)
[4648670485] uint64
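To illustrate the division point from the pros above with plain ints (my own example; a real 32-bit register would give 0 here, not 0x80000000):
MASK32 = 0xFFFFFFFF
print(hex(((0xFFFFFFFF + 1) // 2) & MASK32))            # 0x80000000, mask applied only at the end
print(hex((((0xFFFFFFFF + 1) & MASK32) // 2) & MASK32)) # 0x0, mask applied after the addition too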
Related
Let's say I have this number i = -6884376.
How do I refer to it as to an unsigned variable?
Something like (unsigned long)i in C.
Assuming:
You have 2's-complement representations in mind; and,
By (unsigned long) you mean unsigned 32-bit integer,
then you just need to add 2**32 (or 1 << 32) to the negative value.
For example, apply this to -1:
>>> -1
-1
>>> _ + 2**32
4294967295L
>>> bin(_)
'0b11111111111111111111111111111111'
Assumption #1 means you want -1 to be viewed as a solid string of 1 bits, and assumption #2 means you want 32 of them.
Nobody but you can say what your hidden assumptions are, though. If, for example, you have 1's-complement representations in mind, then you need to apply the ~ prefix operator instead. Python integers work hard to give the illusion of using an infinitely wide 2's complement representation (like regular 2's complement, but with an infinite number of "sign bits").
And to duplicate what the platform C compiler does, you can use the ctypes module:
>>> import ctypes
>>> ctypes.c_ulong(-1) # stuff Python's -1 into a C unsigned long
c_ulong(4294967295L)
>>> _.value
4294967295L
C's unsigned long happens to be 4 bytes on the box that ran this sample.
To get the value equivalent to your C cast, just bitwise-AND with the appropriate mask. E.g. if unsigned long is 32 bit:
>>> i = -6884376
>>> i & 0xffffffff
4288082920
or if it is 64 bit:
>>> i & 0xffffffffffffffff
18446744073702667240
Do be aware, though, that although this gives you the value you would have in C, it is still a signed Python int, so any subsequent calculation may give a negative result and you'll have to keep applying the mask to simulate a 32 or 64 bit calculation.
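For example (my own illustration of why the mask has to be re-applied):
>>> x = 0
>>> x - 1                    # plain Python arithmetic goes negative...
-1
>>> (x - 1) & 0xffffffff     # ...so mask again to stay within the simulated 32 bits
4294967295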
This works because although Python looks like it stores all numbers as sign and magnitude, the bitwise operations are defined as working on two's complement values. C stores integers in twos complement but with a fixed number of bits. Python bitwise operators act on twos complement values but as though they had an infinite number of bits: for positive numbers they extend leftwards to infinity with zeros, but negative numbers extend left with ones. The & operator will change that leftward string of ones into zeros and leave you with just the bits that would have fit into the C value.
Displaying the values in hex may make this clearer (and I rewrote the string of f's as an expression to show that we are interested in either 32 or 64 bits):
>>> hex(i)
'-0x690c18'
>>> hex (i & ((1 << 32) - 1))
'0xff96f3e8'
>>> hex(i & ((1 << 64) - 1))
'0xffffffffff96f3e8L'
For a 32 bit value in C, positive numbers go up to 2147483647 (0x7fffffff), and negative numbers have the top bit set going from -1 (0xffffffff) down to -2147483648 (0x80000000). For values that fit entirely in the mask, we can reverse the process in Python by using a smaller mask to remove the sign bit and then subtracting the sign bit:
>>> u = i & ((1 << 32) - 1)
>>> (u & ((1 << 31) - 1)) - (u & (1 << 31))
-6884376
Or for the 64 bit version:
>>> u = 18446744073702667240
>>> (u & ((1 << 63) - 1)) - (u & (1 << 63))
-6884376
This inverse process will leave the value unchanged if the sign bit is 0, but obviously it isn't a true inverse because if you started with a value that wouldn't fit within the mask size then those bits are gone.
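Wrapped up as helpers, a sketch of the forward and inverse conversions described above (the names are mine):
def to_unsigned(n, bits):
    # reinterpret n as an unsigned value of the given width
    return n & ((1 << bits) - 1)

def to_signed(u, bits):
    # inverse: keep everything below the sign bit, then subtract the sign bit's weight
    return (u & ((1 << (bits - 1)) - 1)) - (u & (1 << (bits - 1)))

print(to_unsigned(-6884376, 32))  # 4288082920
print(to_signed(4288082920, 32))  # -6884376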
Python doesn't have builtin unsigned types. You can use mathematical operations to compute a new int representing the value you would get in C, but there is no "unsigned value" of a Python int. The Python int is an abstraction of an integer value, not a direct access to a fixed-byte-size integer.
Since Python 3.2:
def unsignedToSigned(n, byte_count):
    return int.from_bytes(n.to_bytes(byte_count, 'little', signed=False), 'little', signed=True)

def signedToUnsigned(n, byte_count):
    return int.from_bytes(n.to_bytes(byte_count, 'little', signed=True), 'little', signed=False)
Output:
In [3]: unsignedToSigned(5, 1)
Out[3]: 5
In [4]: signedToUnsigned(5, 1)
Out[4]: 5
In [5]: unsignedToSigned(0xFF, 1)
Out[5]: -1
In [6]: signedToUnsigned(0xFF, 1)
---------------------------------------------------------------------------
OverflowError Traceback (most recent call last)
Input In [6], in <cell line: 1>()
----> 1 signedToUnsigned(0xFF, 1)
Input In [1], in signedToUnsigned(n, byte_count)
4 def signedToUnsigned(n, byte_count):
----> 5 return int.from_bytes(n.to_bytes(byte_count, 'little', signed=True), 'little', signed=False)
OverflowError: int too big to convert
In [7]: signedToUnsigned(-1, 1)
Out[7]: 255
Explanation: to_bytes/from_bytes convert to/from bytes in 2's complement, treating the number as one of byte_count * 8 bits. Coming from C/C++, you would typically pass 4 or 8 as byte_count for a 32-bit or 64-bit number respectively.
I first pack the input number in the format it is supposed to come from (using the signed argument to control signed/unsigned), then unpack it in the format we would like it to have come from. That gives the result.
Note the exception when trying to use fewer bytes than required to represent the number (In [6]): 0xFF is 255, which cannot be represented by C's signed char type (-128 ≤ n ≤ 127). Failing loudly here is preferable to any other behavior.
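For the 32-bit case from the earlier question, a quick check with these helpers (my own example):
>>> signedToUnsigned(-6884376, 4)
4288082920
>>> unsignedToSigned(4288082920, 4)
-6884376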
You could use Python's built-in struct library:
Encode:
import struct
i = -6884376
print('{0:b}'.format(i))
packed = struct.pack('>l', i) # Packing a long number.
unpacked = struct.unpack('>L', packed)[0] # Unpacking a packed long number to unsigned long
print(unpacked)
print('{0:b}'.format(unpacked))
Out:
-11010010000110000011000
4288082920
11111111100101101111001111101000
Decode:
dec_pack = struct.pack('>L', unpacked) # Packing an unsigned long number.
dec_unpack = struct.unpack('>l', dec_pack)[0] # Unpacking a packed unsigned long number to long (revert action).
print(dec_unpack)
Out:
-6884376
[NOTE]:
> means big-endian byte order.
l is long.
L is unsigned long.
Note that with a byte-order prefix such as >, struct uses standard sizes, so l and L are always 4 bytes; i and I (int/unsigned int) are also 4 bytes, so you could use them interchangeably here.
[UPDATE]
As hl037_ pointed out in a comment, this approach works for int32, not int64 or int128, because I used the long format (l/L) in struct.pack(). For int64 the code only needs to switch to the long long format (q), as follows:
Encode:
i = 9223372036854775807 # the largest int64 number
packed = struct.pack('>q', i) # Packing an int64 number
unpacked = struct.unpack('>Q', packed)[0] # Unpacking signed to unsigned
print(unpacked)
print('{0:b}'.format(unpacked))
Out:
9223372036854775807
111111111111111111111111111111111111111111111111111111111111111
Next, follow the same approach for the decoding stage. Keep in mind that q is a signed long long (8 bytes) and Q is an unsigned long long.
For int128 the situation is slightly different, as there is no 16-byte format code for struct.pack(), so you have to split the number into two 64-bit halves.
Here's how it should be:
i = 10000000000000000000000000000000000000 # an int128 number
print(len('{0:b}'.format(i)))
max_int64 = 0xFFFFFFFFFFFFFFFF
packed = struct.pack('>qq', (i >> 64) & max_int64, i & max_int64)
a, b = struct.unpack('>QQ', packed)
unpacked = (a << 64) | b
print(unpacked)
print('{0:b}'.format(unpacked))
Out:
123
10000000000000000000000000000000000000
111100001011110111000010000110101011101101001000110110110010000000011110100001101101010000000000000000000000000000000000000
You can use abs to simply drop the sign and get the magnitude in Python:
a=-12
b=abs(a)
print(b)
Output:
12
I was looking at the source of sorted_containers and was surprised to see this line:
self._load, self._twice, self._half = load, load * 2, load >> 1
Here load is an integer. Why use bit shift in one place, and multiplication in another? It seems reasonable that bit shifting may be faster than integer division by 2, but why not replace the multiplication by a shift as well? I benchmarked the following cases:
(times, divide)
(shift, shift)
(times, shift)
(shift, divide)
and found that #3 is consistently faster than other alternatives:
# self._load, self._twice, self._half = load, load * 2, load >> 1
import random
import timeit
import pandas as pd
x = random.randint(10 ** 3, 10 ** 6)
def test_naive():
    a, b, c = x, 2 * x, x // 2

def test_shift():
    a, b, c = x, x << 1, x >> 1

def test_mixed():
    a, b, c = x, x * 2, x >> 1

def test_mixed_swapped():
    a, b, c = x, x << 1, x // 2

def observe(k):
    print(k)
    return {
        'naive': timeit.timeit(test_naive),
        'shift': timeit.timeit(test_shift),
        'mixed': timeit.timeit(test_mixed),
        'mixed_swapped': timeit.timeit(test_mixed_swapped),
    }

def get_observations():
    return pd.DataFrame([observe(k) for k in range(100)])
The question:
Is my test valid? If so, why is (multiply, shift) faster than (shift, shift)?
I run Python 3.5 on Ubuntu 14.04.
Edit
Above is the original statement of the question. Dan Getz provides an excellent explanation in his answer.
For the sake of completeness, here are sample illustrations for larger x when multiplication optimizations do not apply.
This seems to be because multiplication of small numbers is optimized in CPython 3.5, in a way that left shifts by small numbers are not. Positive left shifts always create a larger integer object to store the result, as part of the calculation, while for multiplications of the sort you used in your test, a special optimization avoids this and creates an integer object of the correct size. This can be seen in the source code of Python's integer implementation.
Because integers in Python are arbitrary-precision, they are stored as arrays of integer "digits", with a limit on the number of bits per integer digit. So in the general case, operations involving integers are not single operations, but instead need to handle the case of multiple "digits". In pyport.h, this bit limit is defined as 30 bits on a 64-bit platform, or 15 bits otherwise. (I'll just call this 30 from here on to keep the explanation simple. But note that if you were using Python compiled for 32-bit, your benchmark's result would depend on whether x is less than 32,768.)
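You can inspect this limit on your own interpreter via sys.int_info; on a typical 64-bit CPython 3.5 build it reports 30-bit digits:
>>> import sys
>>> sys.int_info
sys.int_info(bits_per_digit=30, sizeof_digit=4)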
When an operation's inputs and outputs stay within this 30-bit limit, the operation can be handled in an optimized way instead of the general way. The beginning of the integer multiplication implementation is as follows:
static PyObject *
long_mul(PyLongObject *a, PyLongObject *b)
{
    PyLongObject *z;

    CHECK_BINOP(a, b);

    /* fast path for single-digit multiplication */
    if (Py_ABS(Py_SIZE(a)) <= 1 && Py_ABS(Py_SIZE(b)) <= 1) {
        stwodigits v = (stwodigits)(MEDIUM_VALUE(a)) * MEDIUM_VALUE(b);
#ifdef HAVE_LONG_LONG
        return PyLong_FromLongLong((PY_LONG_LONG)v);
#else
        /* if we don't have long long then we're almost certainly
           using 15-bit digits, so v will fit in a long.  In the
           unlikely event that we're using 30-bit digits on a platform
           without long long, a large v will just cause us to fall
           through to the general multiplication code below. */
        if (v >= LONG_MIN && v <= LONG_MAX)
            return PyLong_FromLong((long)v);
#endif
    }
So when multiplying two integers where each fits in a 30-bit digit, this is done as a direct multiplication by the CPython interpreter, instead of working with the integers as arrays. (MEDIUM_VALUE() called on a positive integer object simply gets its first 30-bit digit.) If the result fits in a single 30-bit digit, PyLong_FromLongLong() will notice this in a relatively small number of operations, and create a single-digit integer object to store it.
In contrast, left shifts are not optimized this way, and every left shift deals with the integer being shifted as an array. In particular, if you look at the source code for long_lshift(), in the case of a small but positive left shift, a 2-digit integer object is always created, if only to have its length truncated to 1 later: (my comments in /*** ***/)
static PyObject *
long_lshift(PyObject *v, PyObject *w)
{
    /*** ... ***/
    wordshift = shiftby / PyLong_SHIFT;               /*** zero for small w ***/
    remshift = shiftby - wordshift * PyLong_SHIFT;    /*** w for small w ***/
    oldsize = Py_ABS(Py_SIZE(a));                     /*** 1 for small v > 0 ***/
    newsize = oldsize + wordshift;
    if (remshift)
        ++newsize;    /*** here newsize becomes at least 2 for w > 0, v > 0 ***/
    z = _PyLong_New(newsize);
    /*** ... ***/
}
Integer division
You didn't ask about the worse performance of integer floor division compared to right shifts, because that fit your (and my) expectations. But dividing a small positive number by another small positive number is not as optimized as small multiplications, either. Every // computes both the quotient and the remainder using the function long_divrem(). This remainder is computed for a small divisor with a multiplication, and is stored in a newly-allocated integer object, which in this situation is immediately discarded.
Or at least, that was the case when this question was originally asked. In CPython 3.6, a fast path for small int // was added, so // now beats >> for small ints too.
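A quick way to re-check this on whatever CPython you have installed (a sketch only; the absolute numbers will vary by version and machine):
import timeit

for stmt in ('x * 2', 'x << 1', 'x // 2', 'x >> 1'):
    t = timeit.timeit(stmt, setup='x = 123456', number=10**7)
    print(stmt, round(t, 3))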
I want to write extendible hashing. On the wiki I found a good implementation in Python. But this code uses the least significant bits, so when I have hash 1101, for d = 1 the value is 1 and for d = 2 the value is 01. I would like to use the most significant bits. For example: hash 1101, d = 1 value is 1, d = 2 value is 11. Is there any simple way to do that? I tried, but I can't get it working.
Do you understand why it uses the least significant bits?
More or less. It makes things efficient when we are using arrays. OK, so for the hash function I would like to use four bits of a 4-byte integer, but taken from the left (most significant) end.
h = hash(k)
h = h & 0xf #use mask to get four least bits
p = self.pp[ h >> ( 4 - GD)]
And it doesn't work, and I don't know why.
Computing a hash using the least significant bits is the fastest way to compute a hash, because it only requires a bitwise AND operation. This makes it very popular.
Here is an implementation (in C) of a hash using the most significant bits. Since there is no direct way to know the most significant bit, it repeatedly shifts until the remaining value fits in the specified number of bits.
int significantHash(int value, int bits) {
    int mask = (1 << bits) - 1;
    while (value > mask) {
        value >>= 1;
    }
    return value;
}
I recommend the overlapping hash, which makes use of all the bits of the number. Essentially, it cuts the number into parts with an equal number of bits and XORs them. It runs slower than the least-significant hash, but faster than the most-significant hash. Above all, it offers better dispersion than the other two methods, making it a better candidate when the numbers to be hashed follow a bit-related pattern.
int overlappingHash(int value, int bits) {
    int mask = (1 << bits) - 1;
    int answer = 0;
    do {
        answer ^= (value & mask);
        value >>= bits;
    } while (value > 0);
    return answer;
}
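Since the question is about Python, here is a rough translation of the two helpers above (my own sketch; like the C versions it assumes non-negative values):
def significant_hash(value, bits):
    # shift right until only `bits` bits remain
    mask = (1 << bits) - 1
    while value > mask:
        value >>= 1
    return value

def overlapping_hash(value, bits):
    # XOR together successive `bits`-wide slices of the value
    mask = (1 << bits) - 1
    answer = value & mask
    value >>= bits
    while value > 0:
        answer ^= value & mask
        value >>= bits
    return answer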
I want to convert an integer (int or long) to a big-endian byte string. The byte string has to be of variable length, so that only the minimum number of bytes is used (the total length of the preceding data is known, so the variable length can be inferred).
My current solution is
import bitstring
bitstring.BitString(hex=hex(456)).tobytes()
This obviously depends on the endianness of the machine and gives wrong results, because 0 bits are appended rather than prepended.
Does anyone know a way to do this without making any assumptions about the length or endianness of an int?
Something like this. Untested (until next edit). For Python 2.x. Assumes n > 0.
tmp = []
while n:
    n, d = divmod(n, 256)
    tmp.append(chr(d))
result = ''.join(tmp[::-1])
Edit: tested.
If you don't read manuals but like bitbashing, instead of the divmod caper, try this:
d = n & 0xFF; n >>= 8
Edit 2: If your numbers are relatively small, the following may be faster:
result = ''
while n:
    result = chr(n & 0xFF) + result
    n >>= 8
Edit 3: The second method doesn't assume that the int is already big-endian. Here's what happens in a notoriously little-endian environment:
Python 2.7 (r27:82525, Jul 4 2010, 09:01:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> n = 65539
>>> result = ''
>>> while n:
...     result = chr(n & 0xFF) + result
...     n >>= 8
...
>>> result
'\x01\x00\x03'
>>> import sys; sys.byteorder
'little'
>>>
A solution using struct and itertools:
>>> import itertools, struct
>>> "".join(itertools.dropwhile(lambda c: not(ord(c)), struct.pack(">i", 456))) or chr(0)
'\x01\xc8'
We can drop itertools by using a simple string strip:
>>> struct.pack(">i", 456).lstrip(chr(0)) or chr(0)
'\x01\xc8'
Or even drop struct using a recursive function:
def to_bytes(n):
    return ([chr(n & 255)] + to_bytes(n >> 8) if n > 0 else [])
"".join(reversed(to_bytes(456))) or chr(0)
If you're using Python 2.7 or later then you can use the bit_length method to round the length up to the next byte:
>>> i = 456
>>> bitstring.BitString(uint=i, length=(i.bit_length()+7)/8*8).bytes
'\x01\xc8'
otherwise you can just test for whole-byteness and pad with a zero nibble at the start if needed:
>>> s = bitstring.BitString(hex=hex(i))
>>> ('0x0' + s if s.len%8 else s).bytes
'\x01\xc8'
I reformulated John Machin's second answer in one line for use on my server:
def bytestring(n):
    return ''.join([chr((n >> (i * 8)) & 0xFF) for i in range(n.bit_length() / 8, -1, -1)])
I have found that the second method, using bit shifting, is faster for large numbers as well as small ones, not just small numbers.
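On Python 3 the same result can be had without any helper at all; a hedged equivalent of the one-liner above using int.to_bytes (not part of the original Python 2 answers):
def bytestring3(n):
    # minimum-length big-endian bytes; at least one byte so that 0 maps to b'\x00'
    return n.to_bytes(max((n.bit_length() + 7) // 8, 1), 'big')

print(bytestring3(456))  # b'\x01\xc8'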
Building on How Do You Express Binary Literals in Python, I was thinking about sensible, intuitive ways to do that Programming 101 chestnut of displaying integers in base-2 form. This is the best I came up with, but I'd like to replace it with a better algorithm, or at least one that should have screaming-fast performance.
def num_bin(N, places=8):
    def bit_at_p(N, p):
        ''' find the bit at place p for number n '''
        two_p = 1 << p   # 2 ^ p, using bitshift, will have exactly one
                         # bit set, at place p
        x = N & two_p    # binary composition, will be one where *both* numbers
                         # have a 1 at that bit. this can only happen
                         # at position p. will yield two_p if N has a 1 at
                         # bit p
        return int(x > 0)
    bits = (bit_at_p(N, x) for x in xrange(places))
    return "".join(str(x) for x in bits)
    # or, more concisely
    # return "".join([str(int((N & 1 << x) > 0)) for x in xrange(places)])
For best efficiency, you generally want to process more than a single bit at a time.
You can use a simple method to get a fixed width binary representation. eg.
def _bin(x, width):
    return ''.join(str((x >> i) & 1) for i in xrange(width - 1, -1, -1))
_bin(x, 8) will now give a zero padded representation of x's lower 8 bits. This can be used to build a lookup table, allowing your converter to process 8 bits at a time (or more if you want to devote the memory to it).
_conv_table = [_bin(x,8) for x in range(256)]
Then you can use this in your real function, stripping off leading zeroes when returning it. I've also added handling for signed numbers, as without it you would get an infinite loop (negative integers conceptually have an infinite number of set sign bits).
def bin(x):
    if x == 0:
        return '0'  # Special case: don't strip the leading zero if there are no other digits
    elif x < 0:
        sign = '-'
        x *= -1
    else:
        sign = ''
    l = []
    while x:
        l.append(_conv_table[x & 0xff])
        x >>= 8
    return sign + ''.join(reversed(l)).lstrip("0")
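A quick usage check of the function above (my own examples):
>>> bin(0x55)
'1010101'
>>> bin(-10)
'-1010'
>>> bin(0)
'0'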
[Edit] Changed code to handle signed integers.
[Edit2] Here are some timing figures of the various solutions. bin is the function above, constantin_bin is from Constantin's answer and num_bin is the original version. Out of curiosity, I also tried a 16 bit lookup table variant of the above (bin16 below), and tried out Python3's builtin bin() function. All timings were for 100000 runs using an 01010101 bit pattern.
Num Bits:            8       16       32       64      128      256
--------------------------------------------------------------------
bin              0.544    0.586    0.744    1.942    1.854    3.357
bin16            0.542    0.494    0.592    0.773    1.150    1.886
constantin_bin   2.238    3.803    7.794   17.869   34.636   94.799
num_bin          3.712    5.693   12.086   32.566   67.523  128.565
Python3's bin    0.079    0.045    0.062    0.069    0.212    0.201
As you can see, when processing long values, using large chunks really pays off, but nothing beats the low-level C code of Python 3's builtin (which bizarrely seems consistently faster at 256 bits than at 128!). Using a 16-bit lookup table improves things, but probably isn't worth it unless you really need it, as it uses up a large chunk of memory and can introduce a small but noticeable startup delay to precompute the table.
Not screaming-fast, but straightforward:
>>> def bin(x):
...     sign = '-' if x < 0 else ''
...     x = abs(x)
...     bits = []
...     while x:
...         x, rmost = divmod(x, 2)
...         bits.append(rmost)
...     return sign + ''.join(str(b) for b in reversed(bits or [0]))
It is also faster than num_bin:
>>> import timeit
>>> t_bin = timeit.Timer('bin(0xf0)', 'from __main__ import bin')
>>> print t_bin.timeit(number=100000)
4.19453350997
>>> t_num_bin = timeit.Timer('num_bin(0xf0)', 'from __main__ import num_bin')
>>> print t_num_bin.timeit(number=100000)
4.70694716882
Even more, it actually works correctly (for my definition of "correctness" :)):
>>> bin(1)
'1'
>>> num_bin(1)
'10000000'