I use itertools.product to generate all possible sequences of length 13 drawn from 4 elements. The 4 and 13 can be arbitrary, but as it is, I get 4^13 results, which is a lot. I need the result as a NumPy array and currently do the following:
length = 13
c = it.product([1, -1, 1j, -1j], repeat=length)  # 1j/-1j replace np.complex(0, 1)/np.complex(0, -1), which recent NumPy no longer provides
sendbuf = np.array(list(c))
With some simple profiling code shoved in between, it looks like the first line is pretty much instantaneous, whereas the conversion to a list and then to a NumPy array takes about 3 hours.
Is there a way to make this quicker? It's probably something really obvious that I am overlooking.
Thanks!
The NumPy equivalent of itertools.product() is numpy.indices(), but it will only get you the product of ranges of the form 0,...,k-1:
numpy.rollaxis(numpy.indices((2, 3, 3)), 0, 4)
array([[[[0, 0, 0],
         [0, 0, 1],
         [0, 0, 2]],
        [[0, 1, 0],
         [0, 1, 1],
         [0, 1, 2]],
        [[0, 2, 0],
         [0, 2, 1],
         [0, 2, 2]]],
       [[[1, 0, 0],
         [1, 0, 1],
         [1, 0, 2]],
        [[1, 1, 0],
         [1, 1, 1],
         [1, 1, 2]],
        [[1, 2, 0],
         [1, 2, 1],
         [1, 2, 2]]]])
For your special case, you can use
a = numpy.indices((4,)*13)
b = 1j ** numpy.rollaxis(a, 0, 14)
(This won't run on a 32-bit system, because the array is too large. Extrapolating from the size I can test, it should run in less than a minute, though.)
EDIT: Just to mention it: the call to numpy.rollaxis() is more or less cosmetic, to get the same output as itertools.product(). If you don't care about the order of the indices, you can just omit it (but it is cheap anyway, as long as you don't have any follow-up operations that would transform your array into a contiguous array).
EDIT2: To get the exact analogue of
numpy.array(list(itertools.product(some_list, repeat=some_length)))
you can use
numpy.array(some_list)[numpy.rollaxis(
    numpy.indices((len(some_list),) * some_length), 0, some_length + 1)
    .reshape(-1, some_length)]
This got completely unreadable -- just tell me whether I should explain it any further :)
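For a tiny concrete check (my illustration, with made-up symbols):
import itertools
import numpy

some_list = [10, 20]      # two illustrative symbols
some_length = 2
idx = numpy.rollaxis(numpy.indices((len(some_list),) * some_length), 0, some_length + 1)
combos = numpy.array(some_list)[idx.reshape(-1, some_length)]
print(numpy.array_equal(combos, numpy.array(list(itertools.product(some_list, repeat=some_length)))))  # True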
The first line seems instantaneous because no actual operation takes place: a generator object is merely constructed, and the values are only computed when you iterate over it. As you said, you get 4^13 = 67,108,864 tuples, and all of them are computed and stored during your list call. Since np.array accepts a tuple just as well as a list, you could try building a tuple from the iterator instead and passing that to np.array, to see whether it changes the overall performance of your program. That can only be determined by trying it for your use case, though there are some reports that tuples are slightly faster.
To try with a tuple instead of a list, just do
sendbuf = np.array(tuple(c))
You could speed things up by skipping the conversion to a list:
numpy.fromiter(c, dtype=…, count=…)  # fromiter() needs a dtype; giving count also speeds things up, but it's optional
With this function, the NumPy array is first allocated and then initialized element by element, without having to go through the additional step of a list construction.
PS: fromiter() does not handle the tuples returned by product(), so this might not be a solution for now. If fromiter() handled dtype=object, this would work, though.
PPS: As Joe Kington pointed out, this can be made to work by putting the tuples in a structured array. However, this does not appear to always give a speed up.
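Here is a sketch of that structured-array idea (my illustration, not Joe Kington's exact code; whether fromiter() accepts a record dtype this way can depend on your NumPy version):
import itertools as it
import numpy

length = 8                                  # smaller than the question's 13, to keep the demo modest
symbols = [1, -1, 1j, -1j]
c = it.product(symbols, repeat=length)

record = numpy.dtype(','.join(['complex128'] * length))   # one complex field per position
flat = numpy.fromiter(c, dtype=record, count=len(symbols)**length)

# Common recipe to get a plain (4**length, length) complex array from the record array
# (assumes the record dtype has no padding, which holds here):
sendbuf = flat.view(complex).reshape(-1, length)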
Let numpy.meshgrid do all the work:
length = 13
x = [1, -1, 1j, -1j]
mesh = numpy.meshgrid(*([x] * length))
result = numpy.vstack([y.flat for y in mesh]).T
On my notebook it takes ~2 minutes.
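One caveat I would add (not part of the original answer): with meshgrid's default indexing='xy', the row order differs from itertools.product; passing indexing='ij' reproduces the product order:
import itertools
import numpy

length = 3                                   # small length just for the check
x = [1, -1, 1j, -1j]
mesh = numpy.meshgrid(*([x] * length), indexing='ij')
result = numpy.vstack([y.flat for y in mesh]).T
expected = numpy.array(list(itertools.product(x, repeat=length)))
print(numpy.array_equal(result, expected))   # True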
You might want to try a completely different approach: first create an empty array of the desired size:
result = np.empty((4**length, length), dtype=complex)
then use NumPy's slicing abilities to fill out the array yourself:
# Set up of the last "digit":
result[::4, length-1] = 1
result[1::4, length-1] = -1
result[2::4, length-1] = 1j
result[3::4, length-1] = -1j
You can do similar things for the other "digits" (i.e. the columns result[:, length-2] down to result[:, 0]). The whole thing could certainly be put in a loop that iterates over each digit, as sketched below.
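Here is one possible shape of that loop (my sketch, not the answer author's code; it reshapes the C-contiguous result so that every assignment is a plain slice write):
import numpy as np

length = 13                  # the question's size; reduce it for a quick test
symbols = [1, -1, 1j, -1j]
result = np.empty((4**length, length), dtype=complex)

for col in range(length):
    # In itertools.product order, column col repeats each symbol in blocks of
    # 4**(length - 1 - col) rows, and the four blocks repeat down the array.
    block = 4**(length - 1 - col)
    view = result.reshape(-1, 4, block, length)   # a view, since result is C-contiguous
    for k, s in enumerate(symbols):
        view[:, k, :, col] = s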
Transposing the whole operation (np.empty((length, 4**length)…)) is worth trying, as it might bring a speed gain (through a better use of the memory cache).
Probably not optimal, but much less reliant on Python type conversions:
import numpy as np

ints = [1, 2, 3, 4]
repeat = 3

def prod(ints, repeat):
    w = repeat
    l = len(ints)
    h = l**repeat
    ints = np.array(ints)
    A = np.empty((h, w), dtype=int)
    rng = np.arange(h)
    for i in range(w):
        x = l**i
        idx = np.mod(rng, l * x) // x   # integer division so the result can index ints
        A[:, i] = ints[idx]
    return A
Related
I have two lists which are very large. The basic structure is:
a = [1,0,0,0,1,1,0,0] and b=[1,0,1,0]. There is no restriction on the length of either list and there is also no restriction on the value of the elements in either list.
I want to multiply each element of a by the contents of b.
For example, the following code does the job:
multiplied = []
for a_bit in a:
    for b_bit in b:
        multiplied.append(a_bit*b_bit)
So for the even simpler case of a=[1,0] and b = [1,0,1,0], the output multiplied would be equal to:
>>> print(multiplied)
[1,0,1,0,0,0,0,0]
Is there a way with numpy or map or zip to do this? There are similar questions that multiply lists with lists, and a bunch of other variations, but I haven't seen this one. The problem is that my nested for loops above work fine, but they take forever to process on larger arrays.
You can do this with broadcasting (an outer product) and then flattening the result.
>>> a = np.array([1,0]).reshape(-1,1)
>>> b = np.array([1,0,1,0])
>>> a*b
array([[1, 0, 1, 0],
       [0, 0, 0, 0]])
>>> (a*b).flatten()
array([1, 0, 1, 0, 0, 0, 0, 0])
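Equivalently, np.outer builds the same table in one call (a small variant added for reference, not part of the original answer):
>>> np.outer([1, 0], [1, 0, 1, 0]).ravel()
array([1, 0, 1, 0, 0, 0, 0, 0])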
Assume I have an array like [2,3,4]. I am looking for a way in NumPy (or TensorFlow) to convert it to [0,0,1,1,1,2,2,2,2], so that I can apply tf.math.segment_sum() on a tensor that has a size of 2+3+4.
No elegant idea comes to my mind, only loops and list comprehension.
Would something like this work for you?
import numpy
arr = numpy.array([2, 3, 4])
numpy.repeat(numpy.arange(arr.size), arr)
# array([0, 0, 1, 1, 1, 2, 2, 2, 2])
You don't need to use numpy. You can use nothing but list comprehensions:
>>> foo = [2,3,4]
>>> sum([[i]*foo[i] for i in range(len(foo))], [])
[0, 0, 1, 1, 1, 2, 2, 2, 2]
It works like this:
You can create expanded lists by multiplying a list by an integer, so [0] * 2 == [0, 0]. For each index in the array, we expand with [i]*foo[i]. In other words:
>>> [[i]*foo[i] for i in range(len(foo))]
[[0, 0], [1, 1, 1], [2, 2, 2, 2]]
Then we use sum to reduce the lists into a single list:
>>> sum([[i]*foo[i] for i in range(len(foo))], [])
[0, 0, 1, 1, 1, 2, 2, 2, 2]
Because we are "summing" lists, not integers, we pass [] to sum to make an empty list the starting value of the sum.
(Note that this likely will be slower than numpy, though I have not personally compared it to something like @Patol75's answer.)
I really like the answer from @Patol75 since it's neat. However, there is no pure TensorFlow solution yet, so I provide one, which may be kinda complex. Just for reference and fun!
BTW, I didn't see a tf.repeat API in tf master. Please check this PR, which adds tf.repeat support equivalent to numpy.repeat.
import tensorflow as tf
repeats = tf.constant([2,3,4])
values = tf.range(tf.size(repeats)) # [0,1,2]
max_repeats = tf.reduce_max(repeats) # max repeat is 4
tiled = tf.tile(tf.reshape(values, [-1,1]), [1,max_repeats]) # [[0,0,0,0],[1,1,1,1],[2,2,2,2]]
mask = tf.sequence_mask(repeats, max_repeats) # [[1,1,0,0],[1,1,1,0],[1,1,1,1]]
res = tf.boolean_mask(tiled, mask) # [0,0,1,1,1,2,2,2,2]
Patol75's answer uses NumPy, but Gort the Robot's answer is actually faster (on your example list, at least).
I'll keep this answer up as another solution, but it's slower than both.
Given that a = [2,3,4] this could be done using a loop like so:
b = []
for i in range(len(a)):
    for j in range(a[i]):
        b.append(range(len(a))[i])
Which, as a list comprehension one-liner, is this diabolical thing:
b = [range(len(a))[i] for i in range(len(a)) for j in range(a[i])]
Both end up with b = [0,0,1,1,1,2,2,2,2].
So I already took a look at this question.
I know you can conditionally replace a single column, but what about multiple columns? When I tried it, it didn't seem to work.
the_data = np.array([[0, 1, 1, 1],
                     [0, 1, 3, 1],
                     [3, 4, 1, 3],
                     [0, 1, 2, 0],
                     [2, 1, 0, 0]])
the_data[:,0][the_data[:,0] == 0] = -1 # this works
columns_to_replace = [0, 1, 3]
the_data[:,columns_to_replace][the_data[:,columns_to_replace] == 0] = -1 # this does not work
I initially thought that the second case doesn't work because I thought the_data[:,columns_to_replace] creates a copy instead of directly referencing the elements. However, if that were the case, then the first case, where only a single column is replaced, shouldn't work either.
You're indeed getting a copy because you're using advanced indexing:
Advanced indexing is triggered when the selection object, obj, is a non-tuple sequence object, an ndarray (of data type integer or bool), or a tuple with at least one sequence object or ndarray (of data type integer or bool). There are two types of advanced indexing: integer and Boolean.
Advanced indexing always returns a copy of the data (contrast with basic slicing that returns a view).
(Taken from the docs)
The first part works because it uses basic slicing.
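A quick way to see the difference (a small illustrative snippet, not part of the original answer):
import numpy as np

arr = np.arange(6).reshape(2, 3)

view = arr[:, 0]          # basic slicing -> a view; writes propagate
view[:] = -1
print(arr[0, 0])          # -1

sub = arr[:, [1, 2]]      # integer-array (advanced) indexing -> a copy
sub[:] = 99
print(arr[0, 1])          # still 1; the original is untouched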
I think you can do this without copying, but still with some memory overhead:
columns_to_replace = [0, 1, 3]
mask = np.zeros(the_data.shape, bool) # don't use too much memory
mask[:, columns_to_replace] = 1
np.place(the_data, (the_data == 0) * mask, [-1]) # this doesn't copy anything
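For completeness, here is a direct alternative (my addition, not the answer's np.place approach): assignment through a single boolean mask calls __setitem__ on the original array, so it writes in place:
import numpy as np

the_data = np.array([[0, 1, 1, 1],
                     [0, 1, 3, 1],
                     [3, 4, 1, 3],
                     [0, 1, 2, 0],
                     [2, 1, 0, 0]])
columns_to_replace = [0, 1, 3]

mask = np.zeros(the_data.shape, dtype=bool)
mask[:, columns_to_replace] = True
the_data[mask & (the_data == 0)] = -1       # combined condition, written back in place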
I'm coding a hash-table-ish indexing mechanism that returns an integer's interval number (0 to n), according to a set of splitting points.
For example, if integers are split at value 3 (one split point, so two intervals), we can find the interval number for each array element using a simple comparison:
>>> import numpy as np
>>> x = np.array(range(7))
>>> [int(i>3) for i in x]
[0, 0, 0, 0, 1, 1, 1]
When there are many intervals, we can define a function as below:
>>> def get_interval_id(input_value, splits):
...     for i, split_point in enumerate(splits):
...         if input_value < split_point:
...             return i
...     return len(splits)
...
>>> [get_interval_id(i, [2,4]) for i in x]
[0, 0, 1, 1, 2, 2, 2]
But this solution does not look elegant. Is there any Pythonic (better) way to do this job?
Since you're already using NumPy, I would suggest you use its digitize function:
>>> import numpy as np
>>> np.digitize(np.array([0, 1, 2, 3, 4, 5, 6]), [2, 4])
array([0, 0, 1, 1, 2, 2, 2])
From the documentation:
Return the indices of the bins to which each value in input array belongs.
Python, per se, does not have a handy built-in function for this process, which is called binning. If you wanted, you could wrap your function into a one-liner, but it's more readable this way.
However, data frame packages usually have full-featured binning methods; the most popular one in Python is pandas. It lets you collect or classify values by equal intervals, equal divisions (the same number of entries in each bin), or custom split values (your case). See this question for a good discussion and examples.
Of course, this means that you'd have to install and import pandas and convert your list to a data frame. If that's too much trouble, just keep your current implementation; it's readable, straightforward, and reasonably short.
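For reference, if you do go the pandas route, here is a minimal sketch with custom split values (my example; the split points mirror the question's [2, 4], and the infinite edges are just one way to make the outer bins open-ended):
import numpy as np
import pandas as pd

x = list(range(7))
# Custom split points 2 and 4; right=False makes each bin half-open, like the question's comparison.
codes = pd.cut(x, bins=[-np.inf, 2, 4, np.inf], right=False, labels=False)
print(list(codes))   # [0, 0, 1, 1, 2, 2, 2]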
How about wrapping the whole process inside of one function instead of only half the process?
>>> get_interval_ids([0 ,1, 2, 3, 4, 5 ,6], [2, 4])
[0, 0, 1, 1, 2, 2, 2]
and your function would look like
def get_interval_ids(values, splits):
    def get_interval_id(input_value):
        for i, split_point in enumerate(splits):
            if input_value < split_point:
                return i
        return len(splits)
    return [get_interval_id(val) for val in values]
The type of matrix I am dealing with was created from a vector as shown below:
Start with a 1-d vector V of length L.
To create a matrix A from V with N rows, make the i'th column of A the N consecutive entries of V starting from the i'th entry, so long as there are enough entries left in V to fill up the column. This means A has L - N + 1 columns.
Here is an example:
V = [0, 1, 2, 3, 4, 5]
N = 3
A =
[0 1 2 3
 1 2 3 4
 2 3 4 5]
Representing the matrix this way requires more memory than my machine has. Is there any reasonable way of storing this matrix sparsely? I am currently storing N * (L - N + 1) values, when I only need to store L values.
You can take a view of your original vector as follows:
>>> import numpy as np
>>> from numpy.lib.stride_tricks import as_strided
>>>
>>> v = np.array([0, 1, 2, 3, 4, 5])
>>> n = 3
>>>
>>> a = as_strided(v, shape=(n, len(v)-n+1), strides=v.strides*2)
>>> a
array([[0, 1, 2, 3],
       [1, 2, 3, 4],
       [2, 3, 4, 5]])
This is a view, not a copy of your original data, e.g.
>>> v[3] = 0
>>> v
array([0, 1, 2, 0, 4, 5])
>>> a
array([[0, 1, 2, 0],
       [1, 2, 0, 4],
       [2, 0, 4, 5]])
But you have to be careful not to do any operation on a that triggers a copy, since that would send your memory use through the ceiling.
If you're already using numpy, use its strided or sparse arrays, as Jaime explained.
If you're not already using numpy, you may want to strongly consider using it.
If you need to stick with pure Python, there are three obvious ways to do this, depending on your use case.
For strided or sparse-but-clustered arrays, you could do effectively the same thing as numpy.
Or you could use a simple run-length-encoding scheme, plus maybe a higher-level list of runs, or a list of pointers to every Nth element, or even a whole stack of such lists (one for every 100 elements, one for every 10000, etc.).
But for mostly-uniformly-dense arrays, the easiest thing is to simply store a dict or defaultdict mapping indices to values. Random-access lookups or updates are still O(1) (albeit with a higher constant factor), and the storage you waste keeping (in effect) a hash, key, and value instead of just a value for each non-default element is more than made up for by not storing values for the default elements, as long as your density stays below about 1/3.
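A bare-bones sketch of that dict-based idea (illustrative class and names of my own, not from the answer):
class SparseVector:
    def __init__(self, length, default=0):
        self.length = length
        self.default = default
        self.data = {}                      # index -> non-default value only

    def __getitem__(self, i):
        return self.data.get(i, self.default)

    def __setitem__(self, i, value):
        if value == self.default:
            self.data.pop(i, None)          # never store default values
        else:
            self.data[i] = value

v = SparseVector(10**6)
v[123456] = 7
print(v[123456], v[0])                      # 7 0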