numpy select each row with different lengths - python

I have two arrays:
x=array([[0, 0, 0, 0, 0],
[1, 0, 0, 0, 0],
[2, 2, 2, 2, 2]])
I want to select the leading elements of each row, with the number of elements for each row given by array y:
y = array([3, 2, 4])
My target is z:
z = array([[0, 0, 0],
[1, 0,],
[2, 2, 2, 2]])
How can I do that with NumPy functions instead of a list or a loop?
Thank you so much for your help.

NumPy arrays are optimized for homogeneous data with fixed dimensions. I like to think of an array as a matrix: it does not make sense for a matrix to have a different number of elements in each row.
That said, depending on how you want to use the processed array, you can simply make a list of arrays:
z = [array([0, 0, 0]),
array([1, 0,]),
array([2, 2, 2, 2])]
You will still need to build that list manually:
x = array([[0, 0, 0, 0, 0], [1, 0, 0, 0, 0], [2, 2, 2, 2, 2]])
y = array([3, 2, 4])
z = [x_item[:y_item] for x_item, y_item in zip(x, y)]
The list comprehension iterates over the rows of x and the lengths in y, paired by zip(), and slices each row to its length.

Something like this also works:
z = [x[i,:e] for i,e in enumerate(y)]
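If the loop itself is the concern, the selection can also be done with a boolean mask and a single split. This is only a sketch of a different technique from the list comprehensions above; the names mask and flat are my own, and the result is still a list of arrays because NumPy has no ragged array type.

```python
import numpy as np

x = np.array([[0, 0, 0, 0, 0],
              [1, 0, 0, 0, 0],
              [2, 2, 2, 2, 2]])
y = np.array([3, 2, 4])

# Column numbers (5,) compared against lengths (3, 1) broadcast to a (3, 5) mask
# that is True for the first y[i] entries of row i.
mask = np.arange(x.shape[1]) < y[:, None]

# Boolean indexing returns the selected values flattened in row-major order.
flat = x[mask]

# Cut the flat array back into one piece per row.
z = np.split(flat, np.cumsum(y)[:-1])
```

The mask and the flat array are fully vectorized; only np.split produces Python-level objects, one per row.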

Related

How to set values in a 2d numpy array given 1D indices for each row?

In numpy you can set the indices of a 1d array to a value
import numpy as np
b = np.array([0, 0, 0, 0, 0])
indices = [1, 3]
b[indices] = 1
b
array([0, 1, 0, 1, 0])
I'm trying to do this with multi-rows and an index for each row in the most programmatically elegant and computationally efficient way possible. For example
b = np.array([[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]])
indices = [[1, 3], [0, 1], [0, 3]]
The desired result is
array([[0, 1, 0, 1, 0],
[1, 1, 0, 0, 0],
[1, 0, 0, 1, 0]])
I tried b[indices] and b[:,indices] but they resulted in an error or undesired result.
From searching, there are a few workarounds, but each tends to need at least one loop in Python.
Solution 1: Run a loop through each row of the 2d array. The drawback is that the loop runs in Python, so this part won't take advantage of NumPy's C processing.
Solution 2: Use numpy put. The drawback is that put works on a flattened version of the input array, so the indices need to be flattened too and offset by the row size and row number, which would use a double for loop in Python.
Solution 3: put_along_axis seems to only be able to set 1 value per row, so I would need to repeat this function for the number of values per row.
What would be the most computationally efficient and programmatically elegant solution? Is there anything where NumPy would handle all the operations?
In [330]: b = np.zeros((3,5),int)
To set the (3,2) columns, the row indices need to be (3,1) shape (matching by broadcasting):
In [331]: indices = np.array([[1,3],[0,1],[0,3]])
In [332]: b[np.arange(3)[:,None], indices] = 1
In [333]: b
Out[333]:
array([[0, 1, 0, 1, 0],
[1, 1, 0, 0, 0],
[1, 0, 0, 1, 0]])
np.put_along_axis does the same thing:
In [335]: b = np.zeros((3,5),int)
In [337]: np.put_along_axis(b, indices,1,axis=1)
In [338]: b
Out[338]:
array([[0, 1, 0, 1, 0],
[1, 1, 0, 0, 0],
[1, 0, 0, 1, 0]])
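To make the broadcasting remark concrete, here is a small sketch that checks how the (3,1) row index pairs up with the (3,2) column index. The names rows and paired_shape are mine, used only for illustration.

```python
import numpy as np

b = np.zeros((3, 5), dtype=int)
indices = np.array([[1, 3], [0, 1], [0, 3]])

# A column vector of row numbers: [[0], [1], [2]], shape (3, 1).
rows = np.arange(b.shape[0])[:, None]

# Broadcasting (3, 1) against (3, 2) repeats each row number across
# that row's column indices, giving one (row, column) pair per entry.
paired_shape = np.broadcast(rows, indices).shape

b[rows, indices] = 1
```

Without the [:, None], np.arange(3) has shape (3,), which cannot broadcast against (3,2), so the indexing raises an IndexError.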
One solution is to build the indices for each dimension and then use integer-array (advanced) indexing:
from itertools import chain
b = np.array([[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]])
# Find the indices along the axis 0
y = np.arange(len(indices)).repeat(np.fromiter(map(len, indices), dtype=np.int_))
# Flatten the list and convert it to an array
x = np.fromiter(chain.from_iterable(indices), dtype=np.int_)
# Finally set the items
b[y, x] = 1
It works even for indices lists with variable-sized sub-lists like indices = [[1, 3], [0, 1], [0, 2, 3]]. If your indices list always contains the same number of items in each sub-list then you can use the (more efficient) following code:
b = np.array([[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]])
indices = np.array(indices)
n, m = indices.shape
y = np.arange(n).repeat(m)
x = indices.ravel()
b[y, x] = 1
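As a quick check of the variable-length case, here is the first version run end to end on the ragged example indices = [[1, 3], [0, 1], [0, 2, 3]]. It is a sketch only; I renamed y and x to rows and cols for readability.

```python
import numpy as np
from itertools import chain

b = np.zeros((3, 5), dtype=int)
indices = [[1, 3], [0, 1], [0, 2, 3]]  # sub-lists of different lengths

# Number of column indices given for each row: [2, 2, 3].
lengths = np.fromiter(map(len, indices), dtype=np.int_)

# Repeat each row number once per column index: [0, 0, 1, 1, 2, 2, 2].
rows = np.arange(len(indices)).repeat(lengths)

# Flatten the nested list of column indices: [1, 3, 0, 1, 0, 2, 3].
cols = np.fromiter(chain.from_iterable(indices), dtype=np.int_)

b[rows, cols] = 1
```

Flattening the nested list still iterates in Python, but the assignment itself is a single NumPy operation.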
Simple one-liner based on Jérôme's answer (requires all items of indices to be equal-length):
>>> b[np.arange(np.size(indices)) // len(indices[0]), np.ravel(indices)] = 1
>>> b
array([[0, 1, 0, 1, 0],
[1, 1, 0, 0, 0],
[1, 0, 0, 1, 0]])

How to replace non-zero elements of a (0,1) numpy array by its column/row indices without using a loop?

I have a numpy array of 0's and 1's.
def random_array(p):
    return np.random.choice(2, 6, p=[p, 1-p])
my_matrix = np.array([random_array(j) for j in np.random.uniform(0.3, 1.0, 4)])
I can get the indices of all non-zero elements with np.nonzero(my_matrix). Could you help me replace each non-zero element with the number of the column it is in (zero-based column indices would also work)? For example,
array([[0, 2, 0, 4, 5, 6],
[0, 0, 0, 4, 5, 0],
[1, 2, 0, 4, 0, 0],
[0, 2, 3, 4, 5, 6]])
Here, all 1's have been replaced with the column number. This is for a matrix whose non-zero elements are all 1. If you can find the zero-based column indices instead, that would also be fabulous, because I can get the same result by adding 1.
Note: I do not wish to use any loop for this task.
If I understand correctly, you want to replace all non-zero elements with their column indices (starting from 1 instead of 0), right?
Then you can do it like:
idx = np.nonzero(my_matrix)
my_matrix[idx[0], idx[1]] = idx[1]+1
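Here is a reproducible sketch of that approach. The fixed input matrix is my assumption, chosen so that it yields the example output from the question, and I work on a copy (named result here) so that the original stays intact.

```python
import numpy as np

# Binary matrix matching the example in the question (assumed input).
my_matrix = np.array([[0, 1, 0, 1, 1, 1],
                      [0, 0, 0, 1, 1, 0],
                      [1, 1, 0, 1, 0, 0],
                      [0, 1, 1, 1, 1, 1]])

# Row and column positions of every non-zero element.
rows, cols = np.nonzero(my_matrix)

# Write the 1-based column number into each non-zero position.
result = my_matrix.copy()
result[rows, cols] = cols + 1

# Equivalent formulation with np.where and a broadcast column range.
alt = np.where(my_matrix != 0, np.arange(1, my_matrix.shape[1] + 1), 0)
```

Both forms give the same array; the np.where version never materializes the index arrays.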
You can multiply your array with another array of the same size containing the corresponding row/column index:
## Dummy data
# Array size
s = (6,4)
# Axis along which we need to calculate the index:
a = 0
# Random binary array
x = np.random.rand(*s).round()
# Get the index along one axis using broadcasting (starting with 1)
x = x*(np.expand_dims(range(s[a]),len(s)-a-1)+1)
In [169]: def random_array(p):
     ...:     return np.random.choice(2, 6, p=[p, 1-p])
     ...: my_matrix = np.array([random_array(j) for j in np.random.uniform(0.3, 1.0, 4)])
In [170]: my_matrix
Out[170]:
array([[0, 0, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 0],
[0, 0, 1, 1, 0, 1],
[1, 1, 0, 0, 1, 0]])
Just multiply by the index range. Thanks to broadcasting, a (6,) arange works for columns:
In [171]: np.arange(1,7)*my_matrix
Out[171]:
array([[0, 0, 0, 0, 0, 0],
[0, 2, 3, 0, 0, 0],
[0, 0, 3, 4, 0, 6],
[1, 2, 0, 0, 5, 0]])
For rows, use a (4,1) column vector:
In [172]: np.arange(1,5)[:,None]*my_matrix
Out[172]:
array([[0, 0, 0, 0, 0, 0],
[0, 2, 2, 0, 0, 0],
[0, 0, 3, 3, 0, 3],
[4, 4, 0, 0, 4, 0]])

Count indices to array to produce heatmap

I'd like to accumulate indices that point into an m-by-n array in another array of that same shape to produce a heatmap. For example, these indices:
[
[0, 1, 2, 0, 1, 2]
[0, 1, 0, 0, 0, 2]
]
would produce the following array:
[
[2, 0, 0]
[1, 1, 0]
[1, 0, 1]
]
I've managed to successfully implement an algorithm, but I started wondering whether there is already a built-in NumPy solution for this kind of problem.
Here's my code:
a = np.array([[0, 1, 2, 0, 1, 2], [0, 1, 0, 0, 0, 2]])
def _gather_indices(indices: np.ndarray, shape: tuple):
    heat = np.zeros(shape)
    # Each column of indices is one (row, column) pair; count it once per occurrence.
    for i in range(indices.shape[-1]):
        heat[tuple(indices[:, i])] += 1
    return heat
Two methods could be suggested.
With np.add.at -
heat = np.zeros(shape,dtype=int)
np.add.at(heat,(a[0],a[1]),1)
Or, more compactly, with tuple() -
np.add.at(heat,tuple(a),1)
With bincount -
idx = np.ravel_multi_index(a,shape)
np.bincount(idx,minlength=np.prod(shape)).reshape(shape)
Additionally, we could compute shape using the max-limits of the indices in a -
shape = a.max(axis=1)+1
Sample run -
In [147]: a
Out[147]:
array([[0, 1, 2, 0, 1, 2],
[0, 1, 0, 0, 0, 2]])
In [148]: shape = (3,3)
In [149]: heat = np.zeros(shape,dtype=int)
...: np.add.at(heat,(a[0],a[1]),1)
In [151]: heat
Out[151]:
array([[2, 0, 0],
[1, 1, 0],
[1, 0, 1]])
In [173]: idx = np.ravel_multi_index(a,shape)
In [174]: np.bincount(idx,minlength=np.prod(shape)).reshape(shape)
Out[174]:
array([[2, 0, 0],
[1, 1, 0],
[1, 0, 1]])
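It may help to see why np.add.at is needed rather than plain fancy-index addition. The sketch below compares the two on the same data; the name naive is mine.

```python
import numpy as np

a = np.array([[0, 1, 2, 0, 1, 2],
              [0, 1, 0, 0, 0, 2]])
shape = (3, 3)

# Plain fancy-index addition is buffered: a repeated index such as (0, 0)
# is incremented only once, so duplicates are lost.
naive = np.zeros(shape, dtype=int)
naive[a[0], a[1]] += 1

# np.add.at is unbuffered and accumulates every occurrence.
heat = np.zeros(shape, dtype=int)
np.add.at(heat, tuple(a), 1)
```

Here naive[0, 0] ends up as 1 while heat[0, 0] is 2, because the pair (0, 0) occurs twice in a.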

Python Concatenate and stack many matrices [duplicate]

I want to create a square block matrix in which each element is itself a square matrix: either the square matrix B, the negative identity matrix, or zeros. I have created the B matrix as well as -I, and I have also created a matrix Z of zeros. B, I and Z are square matrices with the same n1*n1 (or n2*n2) dimensions, and the final matrix should have n*n dimensions, where n = n1 * n2.
If, for example, B, I and Z are 4*4, the final matrix will be 16*16.
I know how to concatenate and stack matrices, but I don't know how to implement this in a better way, since I need to repeat the process below 64 times.
for iter in range(64):
    if iter == 0:
        temp = B
        temp = np.hstack((temp, I))
        temp = np.hstack((temp, Z))
        temp = np.hstack((temp, Z))
    if iter == 1:
        temp2 = I
        temp2 = np.hstack((temp2, B))
        temp2 = np.hstack((temp2, I))
        temp2 = np.hstack((temp2, Z))
    if iter == 2:
        temp3 = Z
        temp3 = np.hstack((temp3, I))
        temp3 = np.hstack((temp3, B))
        temp3 = np.hstack((temp3, I))
    if iter == 3:
        temp4 = Z
        temp4 = np.hstack((temp4, Z))
        temp4 = np.hstack((temp4, I))
        temp4 = np.hstack((temp4, B))
.......
........
........
st1 = np.vstack((temp, temp2))
st2 = np.vstack((st1, temp3))
.......
Can I save n*n matrices into array elements and then concatenate or stack them?
Depending on whether you are dealing with numpy arrays or lists, you can use the following example to append arrays:
import numpy as np
x = np.array([[11.1, 12.1, 13.1], [21.1, 22.1, 23.1]])
print(x.shape)
y = np.array([[11.2, 12.2],[21.2, 22.2]])
print(y.shape)
z = np.append(x,y, axis=1)
print(z.shape)
print(z)
Please note that, as mentioned by @user2699, np.append can get slow for large arrays (see: Fastest way to grow a numpy numeric array).
With lists, you can use the append command:
x = [1, 2, 3]
x.append([4, 5])
print(x)  # [1, 2, 3, [4, 5]]
This example is taken from: Difference between append vs. extend list methods in Python
np.block helps you create an array like this:
In [109]: B =np.arange(1,5).reshape(2,2)
In [110]: I =np.eye(2).astype(int)
In [111]: Z = np.zeros((2,2),int)
In [112]: np.block?
In [113]: np.block([[B,I,Z,Z],[I,B,I,Z],[Z,I,B,I],[Z,Z,I,B]])
Out[113]:
array([[1, 2, 1, 0, 0, 0, 0, 0],
[3, 4, 0, 1, 0, 0, 0, 0],
[1, 0, 1, 2, 1, 0, 0, 0],
[0, 1, 3, 4, 0, 1, 0, 0],
[0, 0, 1, 0, 1, 2, 1, 0],
[0, 0, 0, 1, 3, 4, 0, 1],
[0, 0, 0, 0, 1, 0, 1, 2],
[0, 0, 0, 0, 0, 1, 3, 4]])
block does a nested sequence of concatenate starting with the innermost lists. Previous versions used hstack on the inner lists, and vstack of those results.
In [118]: np.vstack((np.hstack([B,I,Z,Z]),np.hstack([I,B,I,Z])))
Out[118]:
array([[1, 2, 1, 0, 0, 0, 0, 0],
[3, 4, 0, 1, 0, 0, 0, 0],
[1, 0, 1, 2, 1, 0, 0, 0],
[0, 1, 3, 4, 0, 1, 0, 0]])
The lists of lists in [113] could be constructed with code, using the desired sizes, but I won't go into those details.
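For completeness, one way the nested list might be generated is sketched below. The helper name block_tridiagonal and its n_blocks parameter are hypothetical, and the rule "B on the diagonal, I next to it, zeros elsewhere" is read off the 4-by-4 layout in [113].

```python
import numpy as np

def block_tridiagonal(B, I, n_blocks):
    """Build the nested list for np.block: B on the block diagonal,
    I on the two adjacent block diagonals, zeros everywhere else."""
    Z = np.zeros_like(B)
    rows = []
    for i in range(n_blocks):
        row = []
        for j in range(n_blocks):
            if i == j:
                row.append(B)
            elif abs(i - j) == 1:
                row.append(I)
            else:
                row.append(Z)
        rows.append(row)
    return np.block(rows)

B = np.arange(1, 5).reshape(2, 2)
I = np.eye(2).astype(int)
result = block_tridiagonal(B, I, 4)
```

The loops only build a small list of references to B, I and Z; the actual copying happens once inside np.block.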
Another approach is to create a np.zeros((8,8)) target array, and fill in the desired blocks. Maybe even better to make a np.zeros((4,2,4,2)), fill that, and reshape later.
In [119]: res = np.zeros((4,2,4,2),int)
In [120]: res[np.arange(4),:,np.arange(4),:] = B
In [121]: res[np.arange(3),:,np.arange(1,4),:] = I
In [122]: res[np.arange(1,4),:,np.arange(3),:] = I
In [124]: res.reshape(8,8)
Out[124]:
array([[1, 2, 1, 0, 0, 0, 0, 0],
[3, 4, 0, 1, 0, 0, 0, 0],
[1, 0, 1, 2, 1, 0, 0, 0],
[0, 1, 3, 4, 0, 1, 0, 0],
[0, 0, 1, 0, 1, 2, 1, 0],
[0, 0, 0, 1, 3, 4, 0, 1],
[0, 0, 0, 0, 1, 0, 1, 2],
[0, 0, 0, 0, 0, 1, 3, 4]])
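Yet another option, not used in the answers above, is the Kronecker product. This is a sketch under the same assumption about the layout (B on the block diagonal, I on the adjacent block diagonals); the names diag and off are mine.

```python
import numpy as np

B = np.arange(1, 5).reshape(2, 2)
I = np.eye(2, dtype=int)
n_blocks = 4

# Pattern matrices that mark where each kind of block goes.
diag = np.eye(n_blocks, dtype=int)
off = np.eye(n_blocks, k=1, dtype=int) + np.eye(n_blocks, k=-1, dtype=int)

# np.kron replaces every entry p of the pattern with the block p * M,
# so the sum places B on the diagonal and I beside it.
result = np.kron(diag, B) + np.kron(off, I)
```

This scales to any number of blocks by changing n_blocks, at the cost of building two full-size intermediate arrays.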

Numpy: given the nonzero indices of a matrix how to extract the elements into a submatrix

I have very sparse matrices, so I want to extract the smallest rectangular region of a matrix that has non-zero values. I know that numpy.nonzero(a) gives you the indices of the elements that are non-zero, but how can I use this to extract a submatrix that contains the elements of the matrix at those indices.
To give an example, this is what I am aiming for:
>>> test
array([[0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0],
[0, 0, 1, 1, 0, 0]])
>>> np.nonzero(test)
(array([1, 1, 1, 1, 2, 2]), array([1, 2, 3, 4, 2, 3]))
>>> submatrix(test)
array([[1, 1, 1, 1],
[0, 1, 1, 0]])
Does anyone know a simple way to do this in numpy? Thanks.
It seems like you're looking to find the smallest region of your matrix that contains all the nonzero elements. If that's true, here's a method:
import numpy as np
def submatrix(arr):
    x, y = np.nonzero(arr)
    # Using the smallest and largest x and y indices of nonzero elements,
    # we can find the desired rectangular bounds.
    # And don't forget to add 1 to the top bound to avoid the fencepost problem.
    return arr[x.min():x.max()+1, y.min():y.max()+1]

test = np.array([[0, 0, 0, 0, 0, 0],
                 [0, 1, 1, 1, 1, 0],
                 [0, 0, 1, 1, 0, 0]])
print(submatrix(test))
# Result:
# [[1 1 1 1]
# [0 1 1 0]]
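One caveat: on an all-zero matrix, x.min() is called on an empty array and raises a ValueError. Below is a sketch of a guarded variant; the name bounding_box and the choice to return an empty (0, 0) array in that case are my assumptions, not part of the answer above.

```python
import numpy as np

def bounding_box(arr):
    """Smallest rectangular slice of a 2-D array containing all non-zero values.
    Returns an empty (0, 0) slice if the array has no non-zero values."""
    # Indices of the rows and columns that contain at least one non-zero value.
    rows = np.flatnonzero(arr.any(axis=1))
    cols = np.flatnonzero(arr.any(axis=0))
    if rows.size == 0:
        return arr[:0, :0]
    return arr[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

test = np.array([[0, 0, 0, 0, 0, 0],
                 [0, 1, 1, 1, 1, 0],
                 [0, 0, 1, 1, 0, 0]])
box = bounding_box(test)
empty = bounding_box(np.zeros((3, 6), dtype=int))
```

Reducing with any() along each axis also avoids building the full index arrays that np.nonzero returns, which can matter for large matrices.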
