I am trying to write some code that will give me the itertools product, for a varying number of inputs. For example, this works for me.
test = np.array([x for x in itertools.product([0,2],[0,2],[0,2])])
this gives me my desired result:
>>> test
array([[0, 0, 0],
       [0, 0, 2],
       [0, 2, 0],
       [0, 2, 2],
       [2, 0, 0],
       [2, 0, 2],
       [2, 2, 0],
       [2, 2, 2]])
However, I'd like to be able to pass a varying number of lists to the product function. For example:
test = np.array([x for x in itertools.product([0,2],[0,2],[0,2],[0,2])])
or
test = np.array([x for x in itertools.product([0,2],[0,2])])
I have tried
test = np.array([x for x in itertools.product(([0,2],) * 3)])
and
test = np.array([x for x in itertools.product([[0,2]]*3)])
but neither gives me the desired result. Surely there is an easy way to do this. I would appreciate any help.
It looks to me like you were grasping for the splat-unpack syntax:
>>> n = 3
>>> L = [0, 2]
>>> np.array([x for x in itertools.product(*([L] * n))])
array([[0, 0, 0],
       [0, 0, 2],
       [0, 2, 0],
       [0, 2, 2],
       [2, 0, 0],
       [2, 0, 2],
       [2, 2, 0],
       [2, 2, 2]])
It may be easier to use itertools.product's repeat keyword argument, though.
>>> np.array(list(itertools.product(L, repeat=3)))
array([[0, 0, 0],
       [0, 0, 2],
       [0, 2, 0],
       [0, 2, 2],
       [2, 0, 0],
       [2, 0, 2],
       [2, 2, 0],
       [2, 2, 2]])
itertools.product supports a keyword argument repeat, as in itertools.product(*iterables, repeat=1), which controls how many times the input iterables are repeated in the Cartesian product. Note that it must be passed as a keyword argument so it isn't mistaken for another input iterable.
So your example
test = np.array([x for x in itertools.product([0,2],[0,2],[0,2],[0,2])])
becomes
test = np.array([x for x in itertools.product([0,2], repeat=4)])
You can try this.
For 3 repetitions:
test = np.array([x for x in itertools.product(*itertools.repeat([0,2], 3))])
For n repetitions:
test = np.array([x for x in itertools.product(*itertools.repeat([0,2], n))])
itertools.repeat([0,2], n) repeats the element [0,2] n times (or endlessly if n is omitted), and the * in front unpacks those repetitions as separate arguments to product.
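Just to make the intermediate step visible, here is what the unpacked repeat call expands to:
>>> import itertools
>>> list(itertools.repeat([0, 2], 3))
[[0, 2], [0, 2], [0, 2]]
>>> # so product(*itertools.repeat([0, 2], 3)) is the same call as product([0, 2], [0, 2], [0, 2])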
You need to add * to expand the list of lists:
In [244]: list(itertools.product(*[[0,2]]*2))
Out[244]: [(0, 0), (0, 2), (2, 0), (2, 2)]
This expansion and the use of repeat are equal in timing tests.
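If you want to check that yourself, a minimal timing sketch (the exact numbers will depend on your machine and on n):
>>> import timeit
>>> setup = "import itertools; L = [0, 2]"
>>> timeit.timeit("list(itertools.product(*[L]*10))", setup=setup, number=1000)
>>> timeit.timeit("list(itertools.product(L, repeat=10))", setup=setup, number=1000)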
I would like to know the fastest way to extract the indices of the first n non-zero values per column in a 2D array.
For example, with the following array:
arr = np.array([
    [4, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 4, 0, 0],
    [2, 0, 9, 0],
    [6, 0, 0, 0],
    [0, 7, 0, 0],
    [3, 0, 0, 0],
    [1, 2, 0, 0],
])
With n=2 I would get [0, 0, 1, 1, 2] as xs and [0, 3, 2, 5, 3] as ys: two values from each of the first and second columns and one from the third.
Here is how it is currently done:
x = []
y = []
n = 3
for i, c in enumerate(arr.T):
    a = c.nonzero()[0][:n]
    if len(a):
        x.extend([i] * len(a))
        y.extend(a)
In practice I have arrays of size (405, 256).
Is there a way to make it faster?
Here is a method, admittedly a bit convoluted since it chains several functions, that does not require sorting the array (only a linear scan is needed to find the non-null values):
n = 2
# Get indices of non-null values, column indices first
nnull = np.stack(np.where(arr.T != 0))
# Split the indices by unique column value
cols_ids = np.array_split(range(len(nnull[0])), np.where(np.diff(nnull[0]) > 0)[0] + 1)
# Take (at most) n in each group and concatenate the whole
np.concatenate([nnull[:, u[:n]] for u in cols_ids], axis=1)
outputs:
array([[0, 0, 1, 1, 2],
       [0, 3, 2, 5, 3]], dtype=int64)
Here is one approach using argsort; it gives a different order though:
n = 2
m = arr != 0
# non-zero values first
idx = np.argsort(~m, axis=0)
# keep the first n rows and ensure they are non-zero
m2 = np.take_along_axis(m, idx, axis=0)[:n]
y, x = np.where(m2)
# result: column indices and the corresponding row indices
x, idx[y, x]
# (array([0, 1, 2, 0, 1]), array([0, 2, 3, 3, 5]))
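If you need the same column-major ordering as the other answers, one option (my addition, not part of the original answer) is to lexsort the resulting pairs:
rows = idx[y, x]
order = np.lexsort((rows, x))   # sort by column, then by row
x[order], rows[order]
# (array([0, 0, 1, 1, 2]), array([0, 3, 2, 5, 3]))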
Use a shifted comparison on the row results of the transposed nonzero: since the column indices returned by nonzero are sorted, an entry is among the first n of its column exactly when it differs from the entry n positions earlier.
>>> n = 2
>>> i, j = arr.T.nonzero()
>>> mask = np.concatenate([[True] * n, i[n:] != i[:-n]])
>>> i[mask], j[mask]
(array([0, 0, 1, 1, 2], dtype=int64), array([0, 3, 2, 5, 3], dtype=int64))
I have a 1d PyTorch tensor containing integers between 0 and n-1. Now I need to create a 2d PyTorch tensor with n-1 columns, where each row is a sequence from 0 to n-1 excluding the value in the first tensor. How can I achieve this efficiently?
Ex:
n = 3
a = torch.Tensor([0, 1, 2, 1, 2, 0])
# desired output
b = [
    [1, 2],
    [0, 2],
    [0, 1],
    [0, 2],
    [0, 1],
    [1, 2]
]
Typically, a.numel() >> n.
Detailed Explanation:
The first element of a is 0, hence it has to map to the sequence [0, 1, 2] excluding 0, which is [1, 2].
Similarly, the second element of a is 1, hence it has to map to [0, 2] and so on.
PS: I actually have an additional batch dimension, which I've excluded here for simplicity. Hence, I need the solution to be easily extendable to one additional dimension.
We can construct a tensor with the desired sequences and index with tensor a.
import torch

n = 3
a = torch.Tensor([0, 1, 2, 1, 2, 0])  # using torch.tensor is recommended

def exclude_gather(a, n):
    # row i of `sequences` holds all indices in 0..n-1 except i
    sequences = torch.nonzero(torch.arange(n) != torch.arange(n)[:, None], as_tuple=True)[1].reshape(-1, n - 1)
    return sequences[a.long()]

exclude_gather(a, n)
Output
tensor([[1, 2],
        [0, 2],
        [0, 1],
        [0, 2],
        [0, 1],
        [1, 2]])
We can add a batch dimension with functorch.vmap
from functorch import vmap
n = 4
b = torch.Tensor([[0, 1, 2, 1, 3, 0],[0, 3, 1, 0, 2, 1]])
vmap(exclude_gather, in_dims=(0, None))(b, n)
Output
tensor([[[1, 2, 3],
         [0, 2, 3],
         [0, 1, 3],
         [0, 2, 3],
         [0, 1, 2],
         [1, 2, 3]],

        [[1, 2, 3],
         [0, 1, 2],
         [0, 2, 3],
         [1, 2, 3],
         [0, 1, 3],
         [0, 2, 3]]])
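Note that functorch has since been merged into PyTorch itself; on PyTorch 2.x the same call should work via torch.vmap (or torch.func.vmap) without the extra import:
import torch

# equivalent on PyTorch >= 2.0, where the standalone functorch package is deprecated
torch.vmap(exclude_gather, in_dims=(0, None))(b, n)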
All you have to do is build, for each element of a, a tensor of all possible indices using torch.arange(), and then drop the index you don't want from each of them with a boolean mask.
import torch
a = torch.Tensor([0, 1, 2, 1, 2, 0])
n = 3
b = [torch.arange(n) for i in range(len(a))]
c = [b[i] != a[i] for i in range(len(b))]
# use the boolean tensors as masks to apply on b
d = [b[i][c[i]] for i in range(len(b))]
print(d)  # this can be converted to a list of numbers or a torch tensor
This prints [tensor([1, 2]), tensor([0, 2]), tensor([0, 1]), tensor([0, 2]), tensor([0, 1]), tensor([1, 2])], which you can easily convert to ints, a numpy array, or a torch tensor.
This can be extended to multiple dimensions as well.
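For what it's worth, here is a vectorized variant of the same masking idea (my own sketch, not part of the answer above) that avoids the Python-level list comprehensions:
mask = torch.arange(n) != a[:, None].long()                   # True where the index differs from a[i]
d = torch.arange(n).expand(len(a), n)[mask].reshape(len(a), n - 1)
# rows: [1, 2], [0, 2], [0, 1], [0, 2], [0, 1], [1, 2]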
The following does the trick
b = []
for i in range(n - 1):
    b.append(i * torch.ones_like(a) + (a <= i))
b = torch.stack(b, dim=1)
Since n << size(a), the for loop should not be very costly.
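Since ones_like and the comparison are elementwise, the same loop seems to carry over to the batched case by stacking along the last dimension instead; a minimal sketch (assuming a has shape (batch, length)):
import torch

n = 3
a = torch.tensor([[0, 1, 2, 1, 2, 0],
                  [2, 0, 1, 0, 1, 2]])          # shape (batch, length)
cols = [i * torch.ones_like(a) + (a <= i) for i in range(n - 1)]
b = torch.stack(cols, dim=-1)                   # shape (batch, length, n-1)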
Let's say I now have a 1x1 matrix, like:
M = Matrix([[2]])
How can I create a new 2x2 matrix from this, filling all the blanks with 0s? That is:
N = Matrix([[2, 0], [0, 0]])
If it were numpy, I could use np.newaxis; however, it seems that there is no newaxis in sympy.
So, I tried:
N = M.reshape(2, 2)
I got the following error:
ValueError: Invalid reshape parameters 2 2
I found that the following expression works:
N = Matrix(2, 2, [M[0], 0, 0, 0])
However, this is a bit awkward.
Is there any better way?
Please note that a scalar multiplication N = M[0] * Matrix(2, 2, [1, 0, 0, 0]) is not acceptable, since next time I may need to convert a 2x2 to a 3x3.
Use sympy.diag.
>>> import sympy as sp
>>> m = sp.Matrix([[2]])
>>> sp.diag(m, 0)
Matrix([
[2, 0],
[0, 0]])
>>> sp.diag(m, 0, 0)
Matrix([
[2, 0, 0],
[0, 0, 0],
[0, 0, 0]])
>>> sp.diag(sp.Matrix([[1, 2], [3, 4]]), 0)
Matrix([
[1, 2, 0],
[3, 4, 0],
[0, 0, 0]])
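Equivalently, you can make the target size explicit by starting from a zero matrix and assigning the original block into the top-left corner (my own variation, not part of the answer above):
>>> N = sp.zeros(3, 3)
>>> N[:m.rows, :m.cols] = m
>>> N
Matrix([
[2, 0, 0],
[0, 0, 0],
[0, 0, 0]])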
I would like to create, from a list, all the different lists where 0, 1, 2, 3, ... up to all of its elements are replaced by another value.
For example, if the replacement item is 0:
L=[1,2,3]
->[1,2,3],[0,2,3],[1,0,3],[1,2,0],[0,0,3],[0,2,0],[1,0,0],[0,0,0]
So far, I have managed to do what I want using itertools, but only in the case where exactly one value is replaced by 0.
Does anyone know how to do this?
Everyone's trying too hard here. We want each value to be either the original value or 0 -- we want pairs like (1,0), (2,0), and (3,0):
>>> from itertools import product, repeat
>>> L = [1, 2, 3]
>>> zip(L, repeat(0))
<zip object at 0x7f931ad1bf08>
>>> list(zip(L, repeat(0)))
[(1, 0), (2, 0), (3, 0)]
and then we can just pass that into product:
>>> list(product(*zip(L, repeat(0))))
[(1, 2, 3), (1, 2, 0), (1, 0, 3), (1, 0, 0), (0, 2, 3), (0, 2, 0), (0, 0, 3), (0, 0, 0)]
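Nothing here is specific to 0; the same pattern works for any replacement value, and you can wrap it in map(list, ...) if you want lists rather than tuples:
>>> r = 9
>>> list(map(list, product(*zip(L, repeat(r)))))
[[1, 2, 3], [1, 2, 9], [1, 9, 3], [1, 9, 9], [9, 2, 3], [9, 2, 9], [9, 9, 3], [9, 9, 9]]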
This is one way using itertools. The benefit of this method is that it is lazy.
A new list is produced on every __next__ call of the generator transformer.
Alternatively, as below, you can output all combinations by calling list on the generator function.
from itertools import combinations, chain
A = [1, 2, 3]
def transformer(x):
    idx = chain.from_iterable(combinations(range(len(x)), i) for i in range(len(x) + 1))
    for indices in idx:
        y = x.copy()
        for j in indices:
            y[j] = 0
        yield y
res = list(transformer(A))
print(res)
[[1, 2, 3], [0, 2, 3], [1, 0, 3], [1, 2, 0], [0, 0, 3], [0, 2, 0], [1, 0, 0], [0, 0, 0]]
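To actually exploit the laziness mentioned above, pull results one at a time instead of materialising the whole list:
gen = transformer(A)
print(next(gen))  # [1, 2, 3]
print(next(gen))  # [0, 2, 3]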
You can use recursion. First, create a function that generates the full set of index combinations of the input:
def full_combinations(d, current=[]):
    if len(d) == len(current):
        yield current
    else:
        yield current
        for i in range(len(d)):
            if len(set(current + [i])) == len(current) + 1:
                yield from full_combinations(d, current + [i])
combination_list = list(full_combinations([1, 2, 3]))
new_results = [[0 if c in i else a for c, a in enumerate([1, 2, 3])] for i in combination_list]
full = [a for i, a in enumerate(new_results) if a not in new_results[:i]]
Output:
[[1, 2, 3], [0, 2, 3], [0, 0, 3], [0, 0, 0], [0, 2, 0], [1, 0, 3], [1, 0, 0], [1, 2, 0]]
It's not pretty, but I'm sure you can get this idea to work.
The idea is to use itertools.combinations to get all combinations of indices for every length, then flatten this list with itertools.chain().
Then we loop through this list of index tuples, setting those indices to the replacement value.
import itertools

l = [1, 2, 3]
replace = 0
indices = list(itertools.chain(*[list(itertools.combinations(list(range(len(l))), z + 1)) for z in range(len(l))]))
allcombs = [l[:]]  # start with the unmodified list
for i in indices:
    l2 = l[:]
    for j in i:
        l2[j] = replace
    allcombs.append(l2)
print(allcombs)
[[1, 2, 3], [0, 2, 3], [1, 0, 3], [1, 2, 0], [0, 0, 3], [0, 2, 0], [1, 0, 0], [0, 0, 0]]
Briefly: there is a similar question and the best answer suggests using numpy.bincount. I need the same thing, but for a matrix.
I've got two arrays:
a = np.array([1, 2, 1, 1, 2])
b = np.array([2, 1, 1, 1, 1])
together they make indices that should be incremented:
>>> np.array([a, b]).T
array([[1, 2],
[2, 1],
[1, 1],
[1, 1],
[2, 1]])
I want to get this matrix:
array([[0, 0, 0],
       [0, 2, 1],   # (1,1) twice, (1,2) once
       [0, 2, 0]])  # (2,1) twice
The matrix will be small (like, 5×5), and the number of indices will be large (somewhere near 10^3 or 10^5).
So, is there anything better (faster) than a for-loop?
You can still use bincount(). The trick is to convert a and b into a single 1D array of flat indices.
If the matrix is nxm, you could apply bincount() to a * m + b, and construct the matrix from the result.
To take the example in your question:
In [15]: a = np.array([1, 2, 1, 1, 2])
In [16]: b = np.array([2, 1, 1, 1, 1])
In [17]: cnt = np.bincount(a * 3 + b)
In [18]: cnt.resize((3, 3))
In [19]: cnt
Out[19]:
array([[0, 0, 0],
       [0, 2, 1],
       [0, 2, 0]])
If the shape of the array is more complicated, it might be easier to use np.ravel_multi_index() instead of computing flat indices by hand:
In [20]: cnt = np.bincount(np.ravel_multi_index(np.vstack((a, b)), (3, 3)))
In [21]: np.resize(cnt, (3, 3))
Out[21]:
array([[0, 0, 0],
       [0, 2, 1],
       [0, 2, 0]])
(Hat tip @Jaime for pointing out ravel_multi_index.)
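A small variation (my addition): passing minlength to bincount guarantees the full n*m bins even when the largest flat index never occurs, so you don't have to rely on resize padding the tail with zeros:
In [22]: np.bincount(a * 3 + b, minlength=9).reshape(3, 3)
Out[22]:
array([[0, 0, 0],
       [0, 2, 1],
       [0, 2, 0]])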
If you already have a matrix m that you want to add the counts to in place (assuming m is 2-D with the right shape, and a indexes rows and b indexes columns as in the question):
m1 = m.view(np.ndarray)  # create a flat view of the matrix
m1.shape = -1            # make it a one-dimensional array
m1 += np.bincount(a * m.shape[1] + b, minlength=m1.size)