Python high precision integer to array of numpy integer - python

I understand that numpy can't handle non-native integers, but how can I store Python arbitrary-precision integers as an array of native integers (in either endianness)? e.g.
a = 105951305240504794066266398962584786593081186897777398483830058739006966285013
can't be stored as a native integer because it's 256 bits wide. But it can be stored as
A = array([18196013122530525909, 15462736877728584896,
12869567647602165677, 16879016735257358861], dtype=uint64)
using little-endian word order (i.e. a == A[0] + (A[1] << 64) + (A[2] << 128) + (A[3] << 192), with parentheses since + binds tighter than << in Python), or A[::-1] for big-endian. How can I convert from a to A here?
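As a quick sanity check of that decomposition, using the numbers above in pure Python:
words = [18196013122530525909, 15462736877728584896,
         12869567647602165677, 16879016735257358861]
assert a == sum(w << (64 * i) for i, w in enumerate(words))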
I want to convert this "python-side" number to "numpy-side" so that I can run highly efficient algorithms on it (e.g. fast multiplication using Fourier transform).
I believe Python internally should already be using a similar structure. All I need to do is "expose" it to numpy, but I'm not sure about the exact structure or how I can "expose" it. The most straightforward way is of course a while loop:
import numpy as np

A = np.zeros(4, 'uint64')
i = 0
while a > 0:
    A[i] = a & (2**64 - 1)
    a >>= 64
    i += 1
But I'm wondering whether there are more "native" or "efficient" ways.
Thanks for your help!
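For reference, a more "native" route exists on Python 3 (where int.to_bytes is available); a minimal sketch:
import numpy as np
# 32 bytes = 256 bits; 'little' puts the least significant word first,
# and '<u8' reads the buffer as little-endian uint64 words.
A = np.frombuffer(a.to_bytes(32, 'little'), dtype='<u8')
For big-endian word order, use a.to_bytes(32, 'big') with dtype='>u8'.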

Related

Is there a built-in python function to count bit flip in a binary string?

Given a binary string of arbitrary length, how can I return the number of bit flips in the string? For example, the bit-flip count of '01101' is 3, and that of '1110011' is 2, etc.
The way I came up with to solve this is a for loop and a counter, but that seems too lengthy. Is there a way to do it faster, or a built-in function in Python that does it directly? Thanks for the help!
There is a very fast way to do that without any explicit loops, using only Python builtins: convert the string to a binary number, detect all the bit flips with an XOR-based integer trick, and then convert the integer back to a string to count the bit flips. Here is the code:
# Example input
s = '01101'
# Convert the binary string `s` to an integer: "01101" -> 0b01101
n = int(s, 2)
# Build a binary mask to skip the most significant bit of n: 0b01101 -> 0b01111
mask = (1 << (len(s) - 1)) - 1
# Check if the ith bit of n is different from the (i+1)th bit of n using a bit-wise XOR:
# 0b01101 & 0b01111 -> 0b1101 (discard the first bit)
# 0b01101 >> 1 -> 0b0110
# 0b1101 ^ 0b0110 -> 0b1011
bitFlips = (n & mask) ^ (n >> 1)
# Convert the integer back to a string and count the bit flips: 0b1011 -> "0b1011" -> 3
flipCount = bin(bitFlips).count('1')
This trick is much faster than the other methods since integer operations are heavily optimized compared to loop-based interpreted code or code working on iterables. Here are performance results for a string of size 1000 on my machine:
ljdyer's solution: 96 us x1.0
Karl's solution: 39 us x2.5
This solution: 4 us x24.0
If you are working with short bounded strings, then there are even faster ways to count the number of bits set in an integer.
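As a side note, assuming Python 3.10+ (where int.bit_count() exists), the final round-trip through a string can be skipped entirely:
flipCount = bitFlips.bit_count()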
Don't know about a built-in function, but here's a one-liner (s is the input string):
bit_flip_count = sum(1 for i in range(1, len(s)) if s[i] != s[i-1])
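For example, with s = '1110011' this gives 2, matching the expected count.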
Given a sequence of values, you can find the number of times that the value changes by grouping contiguous values and then counting the groups. There will be one more group than the number of changes (since the elements before the first change are also in a group). (Of course, for an empty sequence, this gives you a result of -1; you may want to handle this case separately.)
Grouping in Python is built-in, via the standard library itertools.groupby. This tool only considers contiguous groups, which is often a drawback (if you want to make a histogram, for example, you have to sort the data first) but in our case is exactly what we want. The overall interface of this tool is a bit complex, but in our case we can use it simply:
from itertools import groupby

def changes_in(sequence):
    return len(list(groupby(sequence))) - 1
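For example, changes_in('01101') returns 3 and changes_in('1110011') returns 2, matching the counts above.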

Is there any efficient way to increment the corresponding set positions of an integer in an integer array?

Any solution consuming less than O(Bit Length) time is welcome. I need to process around 100 million large integers.
answer = [0 for i in xrange(100)]

def pluginBits(val):
    global answer
    for j in xrange(len(answer)):
        if val <= 0:
            break
        answer[j] += (val & 1)
        val >>= 1
A speedier way to do this would be to use '{:b}'.format(someval) to convert from integer to a string of '1's and '0's. Python still needs to do similar work to perform this conversion, but doing it at the C layer in the interpreter internals involves significantly less overhead for larger values.
For conversion to actual list of integer 1s and 0s, you could do something like:
# Done once at top level to make translation table:
import string
bitstr_to_intval = string.maketrans(b'01', b'\x00\x01')
# Done for each value to convert:
bits = bytearray('{:b}'.format(origint).translate(bitstr_to_intval))
Since bytearray is a mutable sequence of values in range(256) that iterates the actual int values, you don't need to convert to list; it should be usable in 99% of the places the list would be used, using less memory and running faster.
This does generate the values in the reverse of the order your code produces (that is, bits[-1] here is the same as your answer[0], bits[-2] is your answer[1], etc.), and it's unpadded, but since you're summing bits, the padding isn't needed, and reversing the result is a trivial reversing slice (add [::-1] to the end). Summing the bits from each input can be made much faster by making answer a numpy array (that allows a bulk element-wise addition at the C layer), and putting it all together gets:
import string
import numpy

bitstr_to_intval = string.maketrans(b'01', b'\x00\x01')
answer = numpy.zeros(100, numpy.uint64)

def pluginBits(val):
    bits = bytearray('{:b}'.format(val).translate(bitstr_to_intval))[::-1]
    answer[:len(bits)] += bits
In local tests, this definition of pluginBits takes a little under one-seventh the time to sum the bits at each position for 10,000 random input integers of 100 bits each, and gets the same results.
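For readers on Python 3 (where string.maketrans no longer exists), here is a minimal sketch of the same idea using bytes.maketrans; it is a direct translation, not tested against the benchmark above:
import numpy

bitstr_to_intval = bytes.maketrans(b'01', b'\x00\x01')
answer = numpy.zeros(100, numpy.uint64)

def pluginBits(val):
    # format(val, 'b') is MSB-first; translate() maps b'0'/b'1' to 0/1 bytes,
    # and the [::-1] slice flips to LSB-first so bits[0] lines up with answer[0].
    bits = bytearray(format(val, 'b').encode('ascii').translate(bitstr_to_intval))[::-1]
    answer[:len(bits)] += bits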

Python f.read() and Octave fread(): reading a binary file to get the same values

I'm reading a binary file with signal samples both in Octave and Python.
The thing is, I want to obtain the same values from both, which is not the case.
The binary file is basically a signal in complex I,Q format, recorded as 16-bit ints.
So, based on the Octave code:
[data, cnt_data] = fread(fid, 2 * secondOfData * fs, 'int16');
and then:
data = data(1:2:end) + 1i * data(2:2:end);
It seems simple, just reading the binary data as 16 bits ints. And then creating the final array of complex numbers.
Therefore I assumed that in Python I need to do the following:
rel=int(f.read(2).encode("hex"),16)
img=int(f.read(2).encode("hex"),16)
in_clean.append(complex(rel,img))
Ok, the main problem I have is that both real and imaginary parts values are not the same.
For instance, in Octave, the first value is: -20390 - 10053i
While in Python (applying the code above), the value is: (23216+48088j)
As the signs are different, the first thing I thought was that maybe the endianness of the computer that recorded the file and the one I'm using to read it are different. So I turned to the struct.unpack function, as it lets you force the endianness.
I was not able to find an "int16" in the unpack documentation:
https://docs.python.org/2/library/struct.html
Therefore I went for the "i" option, adding "x" (padding bytes) to meet the 32-bit size requirement from the table in the struct documentation.
So with:
struct.unpack("i","xx"+f.read(2))[0]
the result is (-1336248200-658802568j). Using
struct.unpack("<i","xx"+f.read(2))[0]
provides the same result.
With:
struct.unpack(">i","xx"+f.read(2))[0]
The value is: (2021153456+2021178328j)
With:
struct.unpack(">i",f.read(2)+"xx")[0]
The value is: (1521514616-1143441288j)
With:
struct.unpack("<i",f.read(2)+"xx")[0]
The value is: (2021175386+2021185723j)
I also tried with numpy and "frombuffer":
np.frombuffer(f.read(1).encode("hex"),dtype=np.int16)
which provides: (24885+12386j)
So, any idea about what I'm doing wrong? I'd like to obtain the same values as in Octave.
What is the proper way of reading and interpreting the values in Python so I can obtain the same result as Octave's fread with 'int16'?
I've been searching the Internet for an answer to this, but I was not able to find a method that provides the same values.
Thanks a lot
Best regards
It looks like the binary data in your question is 5ab0bbd8. To unpack signed 16 bit integers with struct.unpack, you use the 'h' format character. From that (23216+48088j) output, it appears that the data is encoded as little-endian, so we need to use < as the first item in the format string.
from struct import unpack
data = b'\x5a\xb0\xbb\xd8'
# The wrong way
rel=int(data[:2].encode("hex"),16)
img=int(data[2:].encode("hex"),16)
c = complex(rel, img)
print c
# The right way
rel, img = unpack('<hh', data)
c = complex(rel, img)
print c
output
(23216+48088j)
(-20390-10053j)
Note that rel, img = unpack('<hh', data) will also work correctly on Python 3.
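Since the Octave code reads the whole file in one call, a numpy sketch of the equivalent bulk read may also help (the filename here is a placeholder):
import numpy as np

raw = np.fromfile('signal.bin', dtype='<i2')  # '<i2' = little-endian int16
data = raw[0::2] + 1j * raw[1::2]             # interleaved I,Q -> complex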
FWIW, in Python 3, you could also decode 2 bytes to a signed integer like this:
def int16_bytes_to_int(b):
    n = int.from_bytes(b, 'little')
    if n > 0x7fff:
        n -= 0x10000
    return n
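For example, int16_bytes_to_int(b'\x5a\xb0') returns -20390, matching the unpack result above. (Python 3 can also do this in one step with int.from_bytes(b, 'little', signed=True).)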
The rough equivalent in Python 2 is:
def int16_bytes_to_int(b):
    lo, hi = b
    n = (ord(hi) << 8) + ord(lo)
    if n > 0x7fff:
        n -= 0x10000
    return n
But having to do that subtraction to handle signed numbers is annoying, and using struct.unpack is bound to be much more efficient.
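In particular, struct.unpack can decode a whole buffer at once using a repeat count in the format string, which is typically faster than one call per value. A minimal sketch, assuming f is the open binary file and it contains whole int16 samples:
from struct import unpack

buf = f.read()                    # read everything at once
n = len(buf) // 2                 # number of int16 values in the buffer
samples = unpack('<%dh' % n, buf)
data = [complex(samples[i], samples[i+1]) for i in range(0, n, 2)]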

Process RGBA data efficiently using python?

I'm trying to process an RGBA buffer (list of chars), and run "unpremultiply" on each pixel. The algorithm is color_out=color*255/alpha.
This is what I came up with:
def rgba_unpremultiply(data):
    for i in range(0, len(data), 4):
        a = ord(data[i+3])
        if a != 0:
            data[i] = chr(255*ord(data[i])/a)
            data[i+1] = chr(255*ord(data[i+1])/a)
            data[i+2] = chr(255*ord(data[i+2])/a)
    return data
It works, but it's a major performance bottleneck.
I'm wondering, besides writing a C module, what are my options to optimize this particular function?
This is exactly the kind of code NumPy is great for.
import numpy

def rgba_unpremultiply(data):
    a = numpy.fromstring(data, 'B')  # Treat the string as an array of bytes
    a = a.astype('I')                # Cast to uints, since temporary values need to be larger than a byte
    alpha = a[3::4]                  # Every 4th element, starting from index 3
    alpha = numpy.where(alpha == 0, 255, alpha)   # Don't modify colors where alpha is 0
    a[0::4] = a[0::4] * 255 // alpha  # Operate on entire slices instead of looping over each element
    a[1::4] = a[1::4] * 255 // alpha
    a[2::4] = a[2::4] * 255 // alpha
    return a.astype('B').tostring()  # Cast back to bytes
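On Python 3 with recent NumPy, fromstring and tostring are deprecated in favor of frombuffer and tobytes; a minimal sketch of the same approach there:
import numpy as np

def rgba_unpremultiply(data):
    a = np.frombuffer(data, np.uint8).astype(np.uint32)  # bytes -> uints (astype also makes a writable copy)
    alpha = a[3::4]
    alpha = np.where(alpha == 0, 255, alpha)  # leave fully transparent pixels unchanged
    for c in range(3):                        # R, G, B channels
        a[c::4] = a[c::4] * 255 // alpha
    return a.astype(np.uint8).tobytes()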
How big is data? Assuming this is Python 2.x, try using xrange instead of range so that you don't have to allocate a large list up front.
You could convert all the data to integers up front so you're not constantly converting to and from characters.
Look into using numpy to vectorize this: I suspect that simply storing the data as integers and using a numpy array will greatly improve the performance.
And another relatively simple thing you could do is write a little Cython:
http://wiki.cython.org/examples/mandelbrot
Basically Cython will compile your above function into C code with just a few lines of type hints. It greatly reduces the barrier to writing a C extension.
I don't have a concrete answer, but some useful pointers might be:
Python's array module
numpy
OpenCV if you have actual image data
There are some minor things you can do, but I do not think you can improve a lot.
Anyway, here are some hints:
def rgba_unpremultiply(data):
    # xrange() is more performant than range(); it does not precalculate the whole list
    for i in xrange(0, len(data), 4):
        a = ord(data[i+3])
        if a != 0:
            # Not sure about this, but maybe (c << 8) - c is faster than c*255,
            # so you could rearrange the code to do that.
            # Check for an actual performance improvement.
            data[i] = chr(((ord(data[i]) << 8) - ord(data[i]))/a)
            data[i+1] = chr(255*ord(data[i+1])/a)
            data[i+2] = chr(255*ord(data[i+2])/a)
    return data
I've just run a quick benchmark of << versus *, and there don't seem to be measurable differences, but you can do a better evaluation on your own project.
Anyway, a C module may be a good option, even if the problem does not seem language-related.

Wrong results with Python multiply() and prod()

Can anyone explain the following? I'm using Python 2.5
Consider 1*3*5*7*9*11 ... *49. If you type all that from within IPython(x,y) interactive console, you'll get 58435841445947272053455474390625L, which is correct. (why odd numbers: just the way I did it originally)
multiply.reduce() or prod() (these come from numpy) should yield the same result for the equivalent range. And they do, up to a certain point. Here, the result is already wrong:
>>> k = range(1, 50, 2)
>>> multiply.reduce(k)
-108792223
Using prod(k) will also generate -108792223 as the result. Other incorrect results start to appear for equivalent ranges of length 12 (that is, k = range(1,24,2)).
I'm not sure why. Can anyone help?
This is because numpy.multiply.reduce() converts the range list to an array of type numpy.int32, and the reduce operation overflows what can be stored in 32 bits at some point:
>>> type(numpy.multiply.reduce(range(1, 50, 2)))
<type 'numpy.int32'>
As Mike Graham says, you can use the dtype parameter to use Python integers instead of the default:
>>> res = numpy.multiply.reduce(range(1, 50, 2), dtype=object)
>>> res
58435841445947272053455474390625L
>>> type(res)
<type 'long'>
But using numpy to work with Python objects is pointless in this case; the best solution is KennyTM's:
>>> import functools, operator
>>> functools.reduce(operator.mul, range(1, 50, 2))
58435841445947272053455474390625L
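(On Python 3.8+, math.prod(range(1, 50, 2)) gives the same result.)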
The CPU doesn't multiply arbitrarily large numbers; it only performs specific operations defined on fixed-width ranges of numbers represented in base-2 bits.
Python's * handles large integers perfectly, through an arbitrary-precision representation and special code beyond the CPU's or FPU's multiply instructions.
This is actually unusual as languages go.
In most other languages, a number is usually represented as a fixed array of bits. For example, in C or SQL you could choose an 8-bit integer that can represent 0 to 255 (or -128 to +127), or a 16-bit integer that can represent up to 2^16-1, which is 65535. When only a limited range of numbers can be represented, going past the limit with an operation like * or + can have an undesirable effect, such as getting a negative number. You likely hit such a problem here because the library involved (numpy) is natively C, not Python.
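To see the wrap-around directly (numpy may also emit a RuntimeWarning here, depending on the version):
>>> numpy.int32(2147483647) + numpy.int32(1)  # largest int32, plus one
-2147483648
>>> (2**31 - 1) + 1  # a plain Python int just keeps growing
2147483648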
