PyTorch indexing: select complement of indices - python

Say I have a tensor and index:
x = torch.tensor([1,2,3,4,5])
idx = torch.tensor([0,2,4])
If I want to select all elements not in the index, I can manually define a Boolean mask like so:
mask = torch.ones_like(x, dtype=torch.bool)
mask[idx] = False
x[mask]
Is there a more elegant way of doing this, i.e. a syntax where I can pass the indices directly instead of creating a mask? Something like:
x[~idx]

I couldn't find a satisfactory way to compute the complement of a multi-dimensional tensor of indices, so I implemented my own. It works on CUDA tensors and is fully vectorized, so it benefits from parallel computation.
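import torch

def complement_idx(idx, dim):
    """
    Compute the complement: set(range(dim)) - set(idx).
    idx is a multi-dimensional tensor; the complement is taken along its trailing
    dimension, and all other dimensions are treated as batch dimensions.
    Indices within each row are assumed to be unique.
    Args:
        idx: input index, shape: [N, *, K]
        dim: size of the index range; the complement is taken within [0, dim)
    """
    a = torch.arange(dim, device=idx.device)
    ndim = idx.ndim
    dims = idx.shape
    n_idx = dims[-1]
    dims = dims[:-1] + (-1, )
    # Add leading singleton dimensions, then broadcast to [N, *, dim].
    for i in range(1, ndim):
        a = a.unsqueeze(0)
    a = a.expand(*dims)
    # Overwrite the selected positions with 0 (out of place, so a is untouched).
    masked = torch.scatter(a, -1, idx, 0)
    # After sorting, the n_idx zeroed entries come first in every row.
    compl, _ = torch.sort(masked, dim=-1, descending=False)
    # Move the last dimension to the front, drop the zeroed entries, move it back.
    compl = compl.permute(-1, *tuple(range(ndim - 1)))
    compl = compl[n_idx:].permute(*(tuple(range(1, ndim)) + (0,)))
    return compl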
Example:
>>> import torch
>>> a = torch.rand(3, 4, 5)
>>> a
tensor([[[0.7849, 0.7404, 0.4112, 0.9873, 0.2937],
[0.2113, 0.9923, 0.6895, 0.1360, 0.2952],
[0.9644, 0.9577, 0.2021, 0.6050, 0.7143],
[0.0239, 0.7297, 0.3731, 0.8403, 0.5984]],
[[0.9089, 0.0945, 0.9573, 0.9475, 0.6485],
[0.7132, 0.4858, 0.0155, 0.3899, 0.8407],
[0.2327, 0.8023, 0.6278, 0.0653, 0.2215],
[0.9597, 0.5524, 0.2327, 0.1864, 0.1028]],
[[0.2334, 0.9821, 0.4420, 0.1389, 0.2663],
[0.6905, 0.2956, 0.8669, 0.6926, 0.9757],
[0.8897, 0.4707, 0.5909, 0.6522, 0.9137],
[0.6240, 0.1081, 0.6404, 0.1050, 0.6413]]])
>>> b, c = torch.topk(a, 2, dim=-1)
>>> b
tensor([[[0.9873, 0.7849],
[0.9923, 0.6895],
[0.9644, 0.9577],
[0.8403, 0.7297]],
[[0.9573, 0.9475],
[0.8407, 0.7132],
[0.8023, 0.6278],
[0.9597, 0.5524]],
[[0.9821, 0.4420],
[0.9757, 0.8669],
[0.9137, 0.8897],
[0.6413, 0.6404]]])
>>> c
tensor([[[3, 0],
[1, 2],
[0, 1],
[3, 1]],
[[2, 3],
[4, 0],
[1, 2],
[0, 1]],
[[1, 2],
[4, 2],
[4, 0],
[4, 2]]])
>>> compl = complement_idx(c, 5)
>>> compl
tensor([[[1, 2, 4],
[0, 3, 4],
[2, 3, 4],
[0, 2, 4]],
[[0, 1, 4],
[1, 2, 3],
[0, 3, 4],
[2, 3, 4]],
[[0, 3, 4],
[0, 1, 3],
[1, 2, 3],
[0, 1, 3]]])
>>> al = torch.cat([c, compl], dim=-1)
>>> al
tensor([[[3, 0, 1, 2, 4],
[1, 2, 0, 3, 4],
[0, 1, 2, 3, 4],
[3, 1, 0, 2, 4]],
[[2, 3, 0, 1, 4],
[4, 0, 1, 2, 3],
[1, 2, 0, 3, 4],
[0, 1, 2, 3, 4]],
[[1, 2, 0, 3, 4],
[4, 2, 0, 1, 3],
[4, 0, 1, 2, 3],
[4, 2, 0, 1, 3]]])
>>> al, _ = al.sort(dim=-1)
>>> al
tensor([[[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]],
[[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]],
[[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]]])
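As a cross-check on the sort-based approach, the same complement can be sketched with a Boolean membership mask built by broadcasting. This is only an illustrative alternative, not the method above: complement_idx_mask is a hypothetical name, and the sketch assumes the indices in each row are unique and lie in [0, dim). It builds an intermediate tensor of shape [N, *, K, dim], so it trades memory for simplicity.

```python
import torch

def complement_idx_mask(idx, dim):
    """Hypothetical helper: complement of idx along its last dimension.

    Assumes the indices in each row are unique and lie in [0, dim).
    """
    positions = torch.arange(dim, device=idx.device)
    # present[..., j] is True when position j occurs somewhere in that row of idx.
    present = (idx.unsqueeze(-1) == positions).any(dim=-2)
    # Boolean indexing flattens the result row by row, so reshape it back.
    kept = positions.expand_as(present)[~present]
    return kept.reshape(*idx.shape[:-1], dim - idx.shape[-1])

c = torch.tensor([[3, 0], [1, 2]])   # first two rows of c from the example above
compl = complement_idx_mask(c, 5)
print(compl)
```

For these two rows the result should match the first two rows of compl in the example above.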

You may want to try this one-liner (np is NumPy):
x[np.setdiff1d(range(len(x)), idx)]
though it is arguably not much more elegant :).
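For the 1-D case from the question, a pure PyTorch variant can be sketched without NumPy. It assumes torch.isin is available (to my knowledge it was added in PyTorch 1.10); on older versions the explicit Boolean mask from the question is the safer route.

```python
import torch

x = torch.tensor([1, 2, 3, 4, 5])
idx = torch.tensor([0, 2, 4])

# True at every position of x that is NOT listed in idx.
keep = ~torch.isin(torch.arange(x.numel()), idx)
result = x[keep]
print(result)
```

This still builds a mask internally, but it keeps everything on the tensor's side and reads as a single expression.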

Related

Numpy add array of shape NxD to NxK into the first D in array NxK

I have two arrays
a = np.array([[0, 0, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 2, 2, 3, 4]])
and
b = np.array([[1, 1],
[2, 2],
[3, 3]])
I want a single array in which the values of b are added to the first two columns of a, like this:
c = np.array([[1, 1, 2, 3, 4],
[2, 3, 2, 3, 4],
[3, 5, 2, 3, 4]])
If it helps, you can think of the first two columns in a as x, y coordinates and of b as dx, dy.
My current method is as follows:
c = np.concatenate([a[:, 0:2] + b, a[:, 2:]], 1)
but I am looking for a better method.
Thank you.
You can use np.pad to add zeros to b to make its shape the same as a's, then add them:
>>> a + np.pad(b, ((0, 0), (0, 3)))
array([[1, 1, 2, 3, 4],
[2, 3, 2, 3, 4],
[3, 5, 2, 3, 4]])
In general (for 2-D):
>>> a = np.array([[0, 0, 2, 3, 4],
... [0, 1, 2, 3, 4],
... [0, 2, 2, 3, 4]])
>>> b = np.array([[1, 1],
... [2, 2],
... [3, 3],
... [4, 4],
... [5, 5]])
>>> a_shape, b_shape = a.shape, b.shape
>>> max_w = max(a_shape[0], b_shape[0])
>>> max_h = max(a_shape[1], b_shape[1])
>>> padded_a = np.pad(a,
((0, np.abs(a_shape[0] - max_w)),
(0, np.abs(a_shape[1] - max_h))))
>>> padded_a
array([[0, 0, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 2, 2, 3, 4],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])
>>> padded_b = np.pad(b,
((0, np.abs(b_shape[0] - max_w)),
(0, np.abs(b_shape[1] - max_h))))
>>> padded_b
array([[1, 1, 0, 0, 0],
[2, 2, 0, 0, 0],
[3, 3, 0, 0, 0],
[4, 4, 0, 0, 0],
[5, 5, 0, 0, 0]])
>>> padded_a + padded_b
array([[1, 1, 2, 3, 4],
[2, 3, 2, 3, 4],
[3, 5, 2, 3, 4],
[4, 4, 0, 0, 0],
[5, 5, 0, 0, 0]])
In general (2-D, using a zeros array and adding to it):
>>> c = np.zeros((max_w, max_h), dtype=a.dtype)  # (rows, columns): max_w was computed from shape[0]
>>> c[:a_shape[0], :a_shape[1]] += a
>>> c[:b_shape[0], :b_shape[1]] += b
>>> c
array([[1, 1, 2, 3, 4],
[2, 3, 2, 3, 4],
[3, 5, 2, 3, 4],
[4, 4, 0, 0, 0],
[5, 5, 0, 0, 0]])
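For the original case in the question, where b has the same number of rows as a, a plain slice addition may be the simplest option. The following is a minimal sketch under that assumption; it copies a first so the input is left unchanged.

```python
import numpy as np

a = np.array([[0, 0, 2, 3, 4],
              [0, 1, 2, 3, 4],
              [0, 2, 2, 3, 4]])
b = np.array([[1, 1],
              [2, 2],
              [3, 3]])

c = a.copy()              # leave a untouched
c[:, :b.shape[1]] += b    # add b to the first b.shape[1] columns
print(c)
```

If modifying a in place is acceptable, the copy can be dropped and the slice addition applied to a directly.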

Create a 4D list from a 3D list

I'm working with lists and ran into a problem. Say I have a 3D list and I want to create a 4D list by splitting each inner list into groups of two consecutive elements. Here is what I tried:
mylist = [[[0, 2, 1], [0, 3, 1], [0, 4, 3, 1], [0, 4, 3, 1], [0, 3, 2, 1], [0, 2, 3, 4, 1]],
[[0, 2, 1], [0, 4, 2, 3, 1], [0, 4, 3, 1], [0, 4, 3, 1], [0, 3, 2, 1], [0, 2, 3, 4, 1]]]
newlist = [mylist[i: i + 2] for i in range(0, len(mylist), 2)]
print(newlist)
newlist = [[[[0, 2, 1], [0, 3, 1], [0, 4, 3, 1], [0, 4, 3, 1], [0, 3, 2, 1], [0, 2, 3, 4, 1]],
[[0, 2, 1], [0, 4, 2, 3, 1], [0, 4, 3, 1], [0, 4, 3, 1], [0, 3, 2, 1], [0, 2, 3, 4, 1]]]]
but I was expecting something like:
newlist = [[[[0, 2, 1], [0, 3, 1]], [[0, 4, 3, 1], [0, 4, 3, 1]], [[0, 3, 2, 1], [0, 2, 3, 4, 1]]],
[[[0, 2, 1], [0, 4, 2, 3, 1]], [[0, 4, 3, 1], [0, 4, 3, 1]], [[0, 3, 2, 1], [0, 2, 3, 4, 1]]]]
I believe I'm missing a for in my list comprehension, something like:
newlist = [[mylist[j: j + 2] for j in i] for i in range(0, len(mylist), 2)]
but I get an error and can't figure out what the problem is. Any help will be appreciated, thank you so much!
Try this:
newlist=[[list(ls) for ls in zip(i[::2], i[1::2])] for i in mylist]
print(newlist)
Output:
[[[[0, 2, 1], [0, 3, 1]],
[[0, 4, 3, 1], [0, 4, 3, 1]],
[[0, 3, 2, 1], [0, 2, 3, 4, 1]]],
[[[0, 2, 1], [0, 4, 2, 3, 1]],
[[0, 4, 3, 1], [0, 4, 3, 1]],
[[0, 3, 2, 1], [0, 2, 3, 4, 1]]]]
Here is a possible solution. You were very close!
newlist = [[lst[i:i+2] for i in range(0, len(lst), 2)] for lst in mylist]
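The two answers agree when every inner list has an even number of elements, as in the question, but they differ on odd lengths: zip stops at the shorter input and drops the unpaired last element, while slicing keeps it as a shorter group. A small sketch with a made-up odd-length input (rows is hypothetical, not from the question):

```python
rows = [["a", "b", "c"]]  # hypothetical input with an odd-length inner list

# zip-based grouping: the unpaired trailing element is dropped
zipped = [[list(pair) for pair in zip(r[::2], r[1::2])] for r in rows]

# slice-based grouping: the trailing element survives as a group of one
sliced = [[r[i:i + 2] for i in range(0, len(r), 2)] for r in rows]

print(zipped)  # [[['a', 'b']]]
print(sliced)  # [[['a', 'b'], ['c']]]
```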

Transposing a matrix using python numpy

This is my current matrix:
[[0, 1, 2, 4],
[0, 3, 1, 3],
[0, 2, 3, 2],
[0, 2, 4, 1],
[0, 4, 1, 2],
[0, 3, 2, 2],
[1, 2, 2, 2]]
I want to transpose it and get this as output:
[[0, 0, 0, 0, 1],
[2, 2, 4, 3, 2],
[3, 4, 1, 2, 2],
[2, 1, 2, 2, 2]]
I used inverse = np.swapaxes(ate, 0, 7), but I am not sure what my axis2 value should be. Here axis2 is 7.
I think what you're looking for is np.transpose()
You can use np.swapaxes, but its arguments are axis ("dimension") numbers, not sizes, so for a 2-D matrix each of them can only be 0 or 1:
>>> np.swapaxes(arr, 0, 1) # assuming your matrix is called arr
array([[0, 0, 0, 0, 0, 0, 1],
[1, 3, 2, 2, 4, 3, 2],
[2, 1, 3, 4, 1, 2, 2],
[4, 3, 2, 1, 2, 2, 2]])
To get your desired output you'd need to remove the first two rows before the swapaxes:
>>> np.swapaxes(arr[2:], 0, 1)
array([[0, 0, 0, 0, 1],
[2, 2, 4, 3, 2],
[3, 4, 1, 2, 2],
[2, 1, 2, 2, 2]])
However generally you should use np.transpose or .T if you want to transpose the matrix/array:
>>> arr[2:].T
array([[0, 0, 0, 0, 1],
[2, 2, 4, 3, 2],
[3, 4, 1, 2, 2],
[2, 1, 2, 2, 2]])
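To make the point about axis numbers concrete, here is a small sketch using the matrix from the question (arr is the assumed variable name). It checks that the three spellings give the same result and that an axis number of 7 is rejected for a 2-D array; the sketch relies on NumPy's axis error being a ValueError subclass.

```python
import numpy as np

arr = np.array([[0, 1, 2, 4],
                [0, 3, 1, 3],
                [0, 2, 3, 2],
                [0, 2, 4, 1],
                [0, 4, 1, 2],
                [0, 3, 2, 2],
                [1, 2, 2, 2]])

t1 = np.swapaxes(arr, 0, 1)
t2 = np.transpose(arr)
t3 = arr.T
same = np.array_equal(t1, t2) and np.array_equal(t2, t3)
print(arr.shape, t3.shape)  # (7, 4) (4, 7)
print(same)                 # True

# 7 is the number of rows, not a valid axis of a 2-D array.
try:
    np.swapaxes(arr, 0, 7)
    axis_rejected = False
except ValueError:
    axis_rejected = True
print(axis_rejected)
```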

Python: extract the core of a 2D numpy array

Say I have a 2D numpy array like this:
In[1]: x
Out[1]:
array([[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[2, 2, 2, 2, 2],
[3, 3, 3, 3, 3],
[4, 4, 4, 4, 4],
[5, 5, 5, 5, 5]], dtype=int64)
and I want to extract the (n-2)*(m-2) core, which would be:
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4]], dtype=int64)
How could I do this, since the data structure is not flat? Do you suggest flattening it first?
This is a simplified version of a much bigger array, whose core has dimension (n-33)*(n-33).
You can combine a normal (positive) start index with a negative stop index to drop leading and trailing rows/columns:
>>> x[1:-1, 1:-1]
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]], dtype=int64)
For your new example:
>>> t = np.array([[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[2, 2, 2, 2, 2],
[3, 3, 3, 3, 3],
[4, 4, 4, 4, 4],
[5, 5, 5, 5, 5]], dtype=np.int64)
>>> t[1:-1, 1:-1]
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4]], dtype=int64)
You could also remove 2 leading and trailing columns:
>>> t[1:-1, 2:-2]
array([[1],
[2],
[3],
[4]], dtype=int64)
or rows:
>>> t[2:-2, 1:-1]
array([[2, 2, 2],
[3, 3, 3]], dtype=int64)
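If the margin is a variable, one pitfall is worth knowing: a margin of 0 turns the stop index into -0, which is just 0, so the slice comes out empty. A minimal sketch of a helper that avoids this by computing explicit stop indices (core is a hypothetical name, and the sketch assumes a 2-D array with the same margin on every edge):

```python
import numpy as np

def core(arr, margin):
    """Hypothetical helper: drop margin rows/columns from every edge of a 2-D array."""
    rows, cols = arr.shape
    # Explicit stop indices avoid the margin == 0 pitfall: arr[0:-0] is empty.
    return arr[margin:rows - margin, margin:cols - margin]

t = np.repeat(np.arange(6), 5).reshape(6, 5)  # rows 0..5, five columns each
print(core(t, 1).shape)  # (4, 3)
print(core(t, 0).shape)  # (6, 5)
print(t[0:-0].shape)     # (0, 5)
```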

Python combination and permutation code

I have the following code to generate sets of combinations, append each set to a list, and return the list.
def make_combination():
    import itertools
    max_range = 5
    indexes = combinations_plus = []
    for i in range(0, max_range):
        indexes.append(i)
        for i in xrange(2, max_range):
            each_combination = [list(x) for x in itertools.combinations(indexes, i)]
            combinations_plus.append(each_combination)
    retrun combinations_plus
It generates many combinations that I don't want (too many to display). Instead, I want the following combinations:
1) [[0, 1], [0, 2], [0, 3], [0, 4], [1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]
2) [[0, 1, 2], [0, 1, 3], [0, 1, 4], [0, 2, 3], [0, 2, 4], [0, 3, 4], [1, 2, 3], [1, 2, 4], [1, 3, 4], [2, 3, 4]]
3) [[0, 1, 2, 3], [0, 1, 2, 4], [0, 1, 3, 4], [0, 2, 3, 4], [1, 2, 3, 4]]
I think the problem is in the following line, but I don't know what it is. Any idea what the mistake is?
combinations_plus.append(each_combination)
An easier way of doing what you want is the following:
list(list(itertools.combinations(list(range(5)), i)) for i in range(2, 5))
To fix your original code, there were two problems:
indexes = combinations_plus = []
The above creates two names for the exact same list. Appending to either appends to both which is not what you want.
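The aliasing is easy to see in isolation. The names in this small sketch are made up for illustration:

```python
first = second = []          # one list object bound to two names
first.append(0)
aliased = first is second    # True: both names refer to the same list
snapshot = list(second)      # [0]: the append is visible through either name

left, right = [], []         # two independent lists
left.append(0)
print(aliased, snapshot, right)  # True [0] []
```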
The two for statements shouldn't be nested; otherwise the list of indexes is still incomplete when the combinations are generated:
for i in range(0, max_range):
    indexes.append(i)
for i in xrange(2, max_range):
    each_combination = [list(x) for x in itertools.combinations(indexes, i)]
    combinations_plus.append(each_combination)
In fact, initialize indexes with range and skip the first for loop:
indexes = range(max_range) # becomes [0,1,2,3,4]
combinations_plus = []
With these fixes (and fixing the spelling of return), you have:
def make_combination():
    import itertools
    max_range = 5
    indexes = range(max_range)
    combinations_plus = []
    for i in xrange(2, max_range):
        each_combination = [list(x) for x in itertools.combinations(indexes, i)]
        combinations_plus.append(each_combination)
    return combinations_plus
Which returns (newlines added for readability):
[[[0, 1], [0, 2], [0, 3], [0, 4], [1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]],
[[0, 1, 2], [0, 1, 3], [0, 1, 4], [0, 2, 3], [0, 2, 4], [0, 3, 4], [1, 2, 3], [1, 2, 4], [1, 3, 4], [2, 3, 4]],
[[0, 1, 2, 3], [0, 1, 2, 4], [0, 1, 3, 4], [0, 2, 3, 4], [1, 2, 3, 4]]]
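The code above is written for Python 2, where xrange exists and range returns a list. A rough Python 3 equivalent could look like the sketch below; make_combinations and its max_range parameter are illustrative names, not part of the original code.

```python
from itertools import combinations

def make_combinations(max_range=5):
    """Sketch: all combinations of range(max_range) with sizes 2 .. max_range - 1."""
    indexes = range(max_range)  # combinations() accepts any iterable
    return [[list(c) for c in combinations(indexes, k)]
            for k in range(2, max_range)]

result = make_combinations()
print([len(group) for group in result])  # [10, 10, 5]
print(result[2])
```

Each group holds the combinations of one size, so the three groups correspond to the three lists requested in the question.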
