Applying a mask to a multidimensional array - python

I want to do this in a proper way:
data = [
    [1, 1, 2, 1],
    [0, 1, 3, 2],
    [0, 2, 3, 2],
    [2, 4, 3, 1],
    [0, 2, 1, 4],
    [3, 1, 4, 1]]
data = np.array(data)
This should become (the rows that start with 0 are deleted):
[1, 1, 2, 1]
[2, 4, 3, 1]
[3, 1, 4, 1]
So far I did it like this:
lines = []
for i in range(0, len(data[0])):
    if data[0,i] != 0:
        lines.append(data[:,i])
lines = np.array(lines)
Then I found this fine method:
mask = 1 <= data[0,:]
Now I want to apply that mask to the array. The mask reads: [True, False, False, True, False, True]. How do I do that?

Why not just:
[ar for ar in data if ar[0] != 0]
This assumes that none of the rows is empty.

Judging by the data[0,:] and data[0,i] in your question, I presume you have a numpy array and that you actually mean data[:, 0]:
import numpy as np
data = np.array([
    [1, 1, 2, 1],
    [0, 1, 3, 2],
    [0, 2, 3, 2],
    [2, 4, 3, 1],
    [0, 2, 1, 4],
    [3, 1, 4, 1]])
data = data[data[:,0] != 0]
print(data)
Output:
[[1 1 2 1]
 [2 4 3 1]
 [3 1 4 1]]
Note that data[0,:] is the first row, [1 1 2 1], not the first column.
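To tie this back to the mask from the question, here is a minimal sketch that builds the mask explicitly from the first column and then applies it. The names mask and filtered are only illustrative and do not appear in the original posts.

```python
import numpy as np

data = np.array([
    [1, 1, 2, 1],
    [0, 1, 3, 2],
    [0, 2, 3, 2],
    [2, 4, 3, 1],
    [0, 2, 1, 4],
    [3, 1, 4, 1],
])

# Build the mask from the first column: one boolean per row.
mask = data[:, 0] != 0

# Indexing with a 1-D boolean array keeps the rows where the mask is True.
filtered = data[mask]
print(mask)
print(filtered)
```

Because the mask has one entry per row, it goes in the first index position. A mask with one entry per column would be applied as data[:, mask] instead.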

Using a list comprehension:
In [56]: [elem for elem in data if elem[0] !=0]
Out[56]: [[1, 1, 2, 1], [2, 4, 3, 1], [3, 1, 4, 1]]

Related

How to create a matrix like below using Numpy

The matrix looks like this:
[0, 1, 2]
[1, 2, 3]
[2, 3, 4]
To clarify, I don't need just this one matrix but many other matrices that follow the same pattern, for example:
[0, 1, 2, 3]
[1, 2, 3, 4]
[2, 3, 4, 5]
You can use sliding_window_view:
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view as swv
cols = 4
rows = 3
out = swv(np.arange(cols+rows-1), cols).copy()
NB: sliding_window_view returns a read-only view, so you need .copy() to get a mutable array. The copy is not necessary if a read-only object is sufficient (e.g., for display or indexing).
Output:
array([[0, 1, 2, 3],
       [1, 2, 3, 4],
       [2, 3, 4, 5]])
Output with cols = 3; rows = 5:
array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4],
       [3, 4, 5],
       [4, 5, 6]])
Alternative, using broadcasting:
cols = 4
rows = 3
out = np.arange(rows)[:,None] + np.arange(cols)
Output:
array([[0, 1, 2, 3],
       [1, 2, 3, 4],
       [2, 3, 4, 5]])
Using a list comprehension:
L = 3
np.array([
    np.array(range(L)) + j
    for j in range(L)
])
Or, with a small optimization:
L = 3
a = np.array(range(L))
np.array([
    a + j
    for j in range(L)
])
You can easily create a matrix like that using broadcasting, for instance:
>>> np.arange(3)[:, None] + np.arange(4)
array([[0, 1, 2, 3],
       [1, 2, 3, 4],
       [2, 3, 4, 5]])
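Since the question asks for many matrices of this kind, the broadcasting approach can be wrapped in a small helper. This is only a sketch: the function name shifted_rows is hypothetical, and the comparison against sliding_window_view assumes NumPy 1.20 or later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def shifted_rows(rows, cols):
    """Return a rows x cols array whose entry (i, j) equals i + j."""
    # A column vector of row offsets broadcast against a row vector of column offsets.
    return np.arange(rows)[:, None] + np.arange(cols)


m = shifted_rows(3, 4)

# The sliding-window construction yields the same values (as a read-only view).
windowed = sliding_window_view(np.arange(3 + 4 - 1), 4)
same = np.array_equal(m, windowed)
print(m)
print(same)
```

The broadcasting version returns a fresh, writable array, so no .copy() is needed.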

PyTorch indexing: select complement of indices

Say I have a tensor and index:
x = torch.tensor([1,2,3,4,5])
idx = torch.tensor([0,2,4])
If I want to select all elements not in the index, I can manually define a Boolean mask like so:
mask = torch.ones_like(x, dtype=torch.bool)
mask[idx] = False
x[mask]
Is there a more elegant way of doing this, i.e. a syntax where I can pass the indices directly instead of creating a mask? For example, something like:
x[~idx]
I couldn't find a satisfactory way to compute the complement of a multi-dimensional tensor of indices, so I implemented my own. It also works on CUDA and benefits from fast parallel computation.
import torch
def complement_idx(idx, dim):
    """
    Compute the complement: set(range(dim)) - set(idx).
    idx is a multi-dimensional tensor; the complement is computed along its trailing dimension,
    and all other dimensions are treated as batch dimensions.
    Args:
        idx: input index, shape: [N, *, K]
        dim: size of the index range; valid indices are 0 .. dim - 1
    """
    a = torch.arange(dim, device=idx.device)
    ndim = idx.ndim
    dims = idx.shape
    n_idx = dims[-1]
    dims = dims[:-1] + (-1, )
    # Add leading dimensions so that a can be expanded to the batch shape of idx.
    for i in range(1, ndim):
        a = a.unsqueeze(0)
    a = a.expand(*dims)
    # Overwrite the selected positions with 0, then sort so they move to the front.
    masked = torch.scatter(a, -1, idx, 0)
    compl, _ = torch.sort(masked, dim=-1, descending=False)
    # Drop the first n_idx entries along the trailing dimension.
    compl = compl.permute(-1, *tuple(range(ndim - 1)))
    compl = compl[n_idx:].permute(*(tuple(range(1, ndim)) + (0,)))
    return compl
Example:
>>> import torch
>>> a = torch.rand(3, 4, 5)
>>> a
tensor([[[0.7849, 0.7404, 0.4112, 0.9873, 0.2937],
[0.2113, 0.9923, 0.6895, 0.1360, 0.2952],
[0.9644, 0.9577, 0.2021, 0.6050, 0.7143],
[0.0239, 0.7297, 0.3731, 0.8403, 0.5984]],
[[0.9089, 0.0945, 0.9573, 0.9475, 0.6485],
[0.7132, 0.4858, 0.0155, 0.3899, 0.8407],
[0.2327, 0.8023, 0.6278, 0.0653, 0.2215],
[0.9597, 0.5524, 0.2327, 0.1864, 0.1028]],
[[0.2334, 0.9821, 0.4420, 0.1389, 0.2663],
[0.6905, 0.2956, 0.8669, 0.6926, 0.9757],
[0.8897, 0.4707, 0.5909, 0.6522, 0.9137],
[0.6240, 0.1081, 0.6404, 0.1050, 0.6413]]])
>>> b, c = torch.topk(a, 2, dim=-1)
>>> b
tensor([[[0.9873, 0.7849],
[0.9923, 0.6895],
[0.9644, 0.9577],
[0.8403, 0.7297]],
[[0.9573, 0.9475],
[0.8407, 0.7132],
[0.8023, 0.6278],
[0.9597, 0.5524]],
[[0.9821, 0.4420],
[0.9757, 0.8669],
[0.9137, 0.8897],
[0.6413, 0.6404]]])
>>> c
tensor([[[3, 0],
[1, 2],
[0, 1],
[3, 1]],
[[2, 3],
[4, 0],
[1, 2],
[0, 1]],
[[1, 2],
[4, 2],
[4, 0],
[4, 2]]])
>>> compl = complement_idx(c, 5)
>>> compl
tensor([[[1, 2, 4],
[0, 3, 4],
[2, 3, 4],
[0, 2, 4]],
[[0, 1, 4],
[1, 2, 3],
[0, 3, 4],
[2, 3, 4]],
[[0, 3, 4],
[0, 1, 3],
[1, 2, 3],
[0, 1, 3]]])
>>> al = torch.cat([c, compl], dim=-1)
>>> al
tensor([[[3, 0, 1, 2, 4],
[1, 2, 0, 3, 4],
[0, 1, 2, 3, 4],
[3, 1, 0, 2, 4]],
[[2, 3, 0, 1, 4],
[4, 0, 1, 2, 3],
[1, 2, 0, 3, 4],
[0, 1, 2, 3, 4]],
[[1, 2, 0, 3, 4],
[4, 2, 0, 1, 3],
[4, 0, 1, 2, 3],
[4, 2, 0, 1, 3]]])
>>> al, _ = al.sort(dim=-1)
>>> al
tensor([[[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]],
[[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]],
[[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]]])
You may want to try this single-line expression:
x[np.setdiff1d(range(len(x)), idx)]
Though it is arguably not very elegant either :)
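For the simple 1-D case from the question, a boolean mask can also be built directly in PyTorch. The following is a minimal sketch, not a built-in complement syntax; the names rest and rest_idx are illustrative.

```python
import torch

x = torch.tensor([1, 2, 3, 4, 5])
idx = torch.tensor([0, 2, 4])

# Start with an all-True boolean mask and switch off the listed positions.
mask = torch.ones(x.shape[0], dtype=torch.bool)
mask[idx] = False

rest = x[mask]                            # elements whose positions are not in idx
rest_idx = mask.nonzero(as_tuple=True)[0] # the complementary indices themselves
print(rest, rest_idx)
```

The dtype matters here: indexing with an integer tensor of ones and zeros would be treated as positions 0 and 1 rather than as a mask.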

Swapping columns in a numpy array by given indices

I am trying to rearrange the columns of a matrix according to a given array of indices:
import numpy as np
t = np.array([[0, 1, 2, 3, 4],
              [0, 1, 2, 3, 4],
              [0, 1, 2, 3, 4],
              [0, 1, 2, 3, 4]])
indexs = np.array([3, 4, 2, 1, 0])
check = [False for i in range(len(indexs))]
for i in range(len(indexs)):
    check[i] = True
    if (i != indexs[i] and check[indexs[i]] == False):
        check[indexs[i]] = True
        t[:, [i, indexs[i]]] = t[:, [indexs[i], i]]
print(t)
The result I want:
[[3 4 2 1 0]
 [3 4 2 1 0]
 [3 4 2 1 0]
 [3 4 2 1 0]]
I want to get an array whose columns are ordered as given by indexs, but my code does not produce it.
How can I achieve that?
Just index the array along the dimension you want:
t[:, indexs]
If you transpose the matrix, it's easy:
transposed = t.T
result = np.array([transposed[i] for i in indexs])
result = result.T
Output:
array([[3, 4, 2, 1, 0],
       [3, 4, 2, 1, 0],
       [3, 4, 2, 1, 0],
       [3, 4, 2, 1, 0]])
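The two answers above can be checked against each other with a short sketch. The variable names reordered and via_transpose are illustrative; np.tile is used here only to build the same input array compactly.

```python
import numpy as np

t = np.tile(np.arange(5), (4, 1))  # four identical rows [0, 1, 2, 3, 4]
indexs = np.array([3, 4, 2, 1, 0])

# Integer-array indexing along axis 1 picks the columns in the given order.
reordered = t[:, indexs]

# The transpose-based variant selects rows of t.T and transposes back.
via_transpose = np.array([t.T[i] for i in indexs]).T

print(reordered)
print(np.array_equal(reordered, via_transpose))
```

Indexing with an integer array returns a new array, so t itself is left unchanged.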

How to Transpose each element in a 3D np array

Given a 3D array a, I want to call np.transpose on each element along its first axis.
For example, given the array:
array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],
       [[2, 2, 2, 2],
        [2, 2, 2, 2],
        [2, 2, 2, 2]],
       [[3, 3, 3, 3],
        [3, 3, 3, 3],
        [3, 3, 3, 3]]])
I want:
array([[[1, 1, 1],
        [1, 1, 1],
        [1, 1, 1],
        [1, 1, 1]],
       [[2, 2, 2],
        [2, 2, 2],
        [2, 2, 2],
        [2, 2, 2]],
       [[3, 3, 3],
        [3, 3, 3],
        [3, 3, 3],
        [3, 3, 3]]])
Essentially I want to transpose each element inside the array. I tried to reshape it but I can't find a good way of doing it. Looping through it and calling transpose on each would be too slow. Any advice?
You can use the built-in numpy transpose method and directly specify the axes to transpose:
>>> a = np.array([[[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]],
...               [[2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2]],
...               [[3, 3, 3, 3], [3, 3, 3, 3], [3, 3, 3, 3]]])
>>> print(a.transpose((0, 2, 1)))
[[[1 1 1]
  [1 1 1]
  [1 1 1]
  [1 1 1]]
 [[2 2 2]
  [2 2 2]
  [2 2 2]
  [2 2 2]]
 [[3 3 3]
  [3 3 3]
  [3 3 3]
  [3 3 3]]]
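As a quick sanity check, the sketch below compares the axis permutation with an explicit per-element loop on an array with distinct values, so that a wrong axis order would be visible. The names batched, looped and swapped are illustrative.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)  # distinct values make the result easy to verify

# Keep axis 0 in place and swap the last two axes.
batched = a.transpose(0, 2, 1)

# Reference result: transpose each 2-D element in a Python loop.
looped = np.array([m.T for m in a])

# np.swapaxes expresses the same permutation.
swapped = np.swapaxes(a, 1, 2)

print(batched.shape)
print(np.array_equal(batched, looped), np.array_equal(batched, swapped))
```

Both transpose and swapaxes return views where possible, so no data is copied until the result is modified or made contiguous.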

Calculations for different columns in a numpy array

I have a 2D array filled with some values (column 0) and zeros (the remaining columns). I would like to do much the same as I would in MS Excel, but using numpy: fill the remaining columns with values calculated from the first column. Here is an MWE:
import numpy as np
a = np.zeros(20, dtype=np.int8).reshape(4,5)
b = [1, 2, 3, 4]
b = np.array(b)
a[:, 0] = b
# don't change the first column
for column in a[:, 1:]:
    a[:, column] = column[0]+1
The expected output:
array([[1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7],
       [4, 5, 6, 7, 8]], dtype=int8)
The resulting output:
array([[1, 0, 0, 0, 0],
       [1, 0, 0, 0, 0],
       [1, 0, 0, 0, 0],
       [1, 0, 0, 0, 0]], dtype=int8)
Any help would be appreciated.
Looping is slow and there is no need to loop to produce the array that you want:
>>> a = np.ones(20, dtype=np.int8).reshape(4,5)
>>> a[:, 0] = b
>>> a
array([[1, 1, 1, 1, 1],
       [2, 1, 1, 1, 1],
       [3, 1, 1, 1, 1],
       [4, 1, 1, 1, 1]], dtype=int8)
>>> np.cumsum(a, axis=1)
array([[1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7],
       [4, 5, 6, 7, 8]])
What went wrong
Let's start, as in the question, with this array:
>>> a
array([[1, 0, 0, 0, 0],
       [2, 0, 0, 0, 0],
       [3, 0, 0, 0, 0],
       [4, 0, 0, 0, 0]], dtype=int8)
Now, using the code from the question, let's do the loop and see what column actually is:
>>> for column in a[:, 1:]:
...     print(column)
...
[0 0 0 0]
[0 0 0 0]
[0 0 0 0]
[0 0 0 0]
As you can see, column is not the index of a column but an array of values. Iterating over a 2D array yields its rows, so each column here is actually one row of a[:, 1:], which happens to be [0 0 0 0]. Consequently, the following does not do what you would hope. It indexes column 0 four times and overwrites it with 0 + 1 = 1:
a[:, column] = column[0]+1
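The effect of a single loop iteration can be reproduced in isolation. This is a sketch under the same setup as the question; the name first_item is illustrative and stands for what the loop variable column holds on the first pass.

```python
import numpy as np

a = np.zeros((4, 5), dtype=np.int8)
a[:, 0] = [1, 2, 3, 4]

# The first item produced by iterating over a[:, 1:] is row 0 of that slice.
first_item = next(iter(a[:, 1:]))
print(first_item)

# Used as an index, [0, 0, 0, 0] selects column 0 four times,
# so the assignment overwrites column 0 with 0 + 1 = 1 in every row.
a[:, first_item] = first_item[0] + 1
print(a)
```

After this one assignment the array already matches the "resulting output" shown in the question, and the later iterations only repeat the same write.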
Another method
If we want to loop (so that we can do something more complex), here is another approach to generating the desired array:
>>> b = np.array([1, 2, 3, 4])
>>> np.column_stack([b+i for i in range(5)])
array([[1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7],
       [4, 5, 6, 7, 8]])
Your usage of column is ambiguous: in for column in a[:, 1:] it is treated as an array of values, whereas in the loop body it is treated as a column index. You can try this instead:
for column in range(1, a.shape[1]):
    a[:, column] = a[:, column-1]+1
a
#array([[1, 2, 3, 4, 5],
#       [2, 3, 4, 5, 6],
#       [3, 4, 5, 6, 7],
#       [4, 5, 6, 7, 8]], dtype=int8)
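For this particular pattern, where each column is the first column plus a fixed offset, the whole array can also be produced in one step with broadcasting. This is a sketch of an alternative, not the approach of either answer above; the name offsets is illustrative.

```python
import numpy as np

b = np.array([1, 2, 3, 4], dtype=np.int8)

# One offset per column: 0 keeps the first column unchanged.
offsets = np.arange(5, dtype=np.int8)

# A (4, 1) column broadcast against a (5,) row gives a (4, 5) result.
a = b[:, None] + offsets
print(a)
```

Keeping both operands as int8 preserves the dtype of the original array. The loop-based version remains the better fit when each column depends on the previous one in a more complex way.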
