Python: extract the core of a 2D numpy array

Say I have a 2D numpy array like this:
In[1]: x
Out[1]:
array([[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[2, 2, 2, 2, 2],
[3, 3, 3, 3, 3],
[4, 4, 4, 4, 4],
[5, 5, 5, 5, 5]], dtype=int64)
and I want to extract the (n-2)*(m-2) core, i.e. the array without its first and last row and column, which would be:
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4]], dtype=int64)
How could I do this, since the data structure is not flat? Do you suggest flattening it first?
This is a simplified version of a much bigger array, whose core has dimension (n-33)*(n-33).

You can combine ordinary start indices with negative stop indices to drop the leading and trailing rows/columns:
>>> x[1:-1, 1:-1]
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]], dtype=int64)
For your new example:
>>> t = np.array([[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[2, 2, 2, 2, 2],
[3, 3, 3, 3, 3],
[4, 4, 4, 4, 4],
[5, 5, 5, 5, 5]], dtype=np.int64)
>>> t[1:-1, 1:-1]
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4]], dtype=int64)
You could also remove 2 leading and trailing columns:
>>> t[1:-1, 2:-2]
array([[1],
[2],
[3],
[4]], dtype=int64)
or rows:
>>> t[2:-2, 1:-1]
array([[2, 2, 2],
[3, 3, 3]], dtype=int64)
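For a larger margin (like the 33 mentioned in the question) the same slicing can be wrapped in a small helper. This is only a sketch: `crop_border` and its `rows`/`cols` parameters are names made up here, not part of NumPy. It uses explicit stop indices instead of negative ones, because a margin of 0 written as `arr[0:-0]` would give an empty result.

```python
import numpy as np

def crop_border(arr, rows=1, cols=1):
    """Return the core of a 2D array with `rows`/`cols` trimmed from each side.

    Hypothetical helper for illustration; not a NumPy function.
    """
    n, m = arr.shape
    # Explicit stops (n - rows, m - cols) also work when a margin is 0,
    # whereas arr[0:-0] would be empty.
    return arr[rows:n - rows, cols:m - cols]

# Same values as the 6x5 example array: row i is filled with i.
t = np.arange(6).repeat(5).reshape(6, 5)
core = crop_border(t)  # same result as t[1:-1, 1:-1]
```

Like plain slicing, this returns a view on the original data, so call `.copy()` on the result if you need to modify it independently.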

Related

PyTorch indexing: select complement of indices

Say I have a tensor and index:
x = torch.tensor([1,2,3,4,5])
idx = torch.tensor([0,2,4])
If I want to select all elements not in the index, I can manually define a Boolean mask like so:
mask = torch.ones_like(x, dtype=torch.bool)
mask[idx] = False
x[mask]
Is there a more elegant way of doing this?
That is, a syntax where I can pass the indices directly instead of creating a mask, e.g. something like:
x[~idx]
I couldn't find a satisfactory way to compute the complement of a multi-dimensional tensor of indices, so I implemented my own. It runs on CUDA and is fully vectorized.
def complement_idx(idx, dim):
    """
    Compute the complement: set(range(dim)) - set(idx).
    idx is a multi-dimensional tensor; the complement is taken along its trailing dimension,
    and all other dimensions are treated as batch dimensions.
    Args:
        idx: input index, shape: [N, *, K]
        dim: size of the index range; the complement is taken within range(dim)
    """
    a = torch.arange(dim, device=idx.device)
    ndim = idx.ndim
    dims = idx.shape
    n_idx = dims[-1]
    dims = dims[:-1] + (-1, )
    # Add leading batch dimensions so `a` can be expanded to idx's batch shape.
    for i in range(1, ndim):
        a = a.unsqueeze(0)
    a = a.expand(*dims)
    # Overwrite the selected indices with 0, then sort so they move to the front.
    masked = torch.scatter(a, -1, idx, 0)
    compl, _ = torch.sort(masked, dim=-1, descending=False)
    # Drop the first n_idx entries along the trailing dimension.
    compl = compl.permute(-1, *tuple(range(ndim - 1)))
    compl = compl[n_idx:].permute(*(tuple(range(1, ndim)) + (0,)))
    return compl
Example:
>>> import torch
>>> a = torch.rand(3, 4, 5)
>>> a
tensor([[[0.7849, 0.7404, 0.4112, 0.9873, 0.2937],
[0.2113, 0.9923, 0.6895, 0.1360, 0.2952],
[0.9644, 0.9577, 0.2021, 0.6050, 0.7143],
[0.0239, 0.7297, 0.3731, 0.8403, 0.5984]],
[[0.9089, 0.0945, 0.9573, 0.9475, 0.6485],
[0.7132, 0.4858, 0.0155, 0.3899, 0.8407],
[0.2327, 0.8023, 0.6278, 0.0653, 0.2215],
[0.9597, 0.5524, 0.2327, 0.1864, 0.1028]],
[[0.2334, 0.9821, 0.4420, 0.1389, 0.2663],
[0.6905, 0.2956, 0.8669, 0.6926, 0.9757],
[0.8897, 0.4707, 0.5909, 0.6522, 0.9137],
[0.6240, 0.1081, 0.6404, 0.1050, 0.6413]]])
>>> b, c = torch.topk(a, 2, dim=-1)
>>> b
tensor([[[0.9873, 0.7849],
[0.9923, 0.6895],
[0.9644, 0.9577],
[0.8403, 0.7297]],
[[0.9573, 0.9475],
[0.8407, 0.7132],
[0.8023, 0.6278],
[0.9597, 0.5524]],
[[0.9821, 0.4420],
[0.9757, 0.8669],
[0.9137, 0.8897],
[0.6413, 0.6404]]])
>>> c
tensor([[[3, 0],
[1, 2],
[0, 1],
[3, 1]],
[[2, 3],
[4, 0],
[1, 2],
[0, 1]],
[[1, 2],
[4, 2],
[4, 0],
[4, 2]]])
>>> compl = complement_idx(c, 5)
>>> compl
tensor([[[1, 2, 4],
[0, 3, 4],
[2, 3, 4],
[0, 2, 4]],
[[0, 1, 4],
[1, 2, 3],
[0, 3, 4],
[2, 3, 4]],
[[0, 3, 4],
[0, 1, 3],
[1, 2, 3],
[0, 1, 3]]])
>>> al = torch.cat([c, compl], dim=-1)
>>> al
tensor([[[3, 0, 1, 2, 4],
[1, 2, 0, 3, 4],
[0, 1, 2, 3, 4],
[3, 1, 0, 2, 4]],
[[2, 3, 0, 1, 4],
[4, 0, 1, 2, 3],
[1, 2, 0, 3, 4],
[0, 1, 2, 3, 4]],
[[1, 2, 0, 3, 4],
[4, 2, 0, 1, 3],
[4, 0, 1, 2, 3],
[4, 2, 0, 1, 3]]])
>>> al, _ = al.sort(dim=-1)
>>> al
tensor([[[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]],
[[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]],
[[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]]])
You may want to try the single-line expression:
x[np.setdiff1d(range(len(x)), idx)]
Though it is arguably not very elegant either :).
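For the simple 1-D case from the question, the mask approach can also be wrapped in a small function. This is a sketch, assuming `idx` holds valid positions into `x`; `select_complement` is a made-up name, not a PyTorch function. The important detail is that the mask is created with a boolean dtype, since an integer tensor of 0s and 1s would be interpreted as indices rather than as a mask.

```python
import torch

def select_complement(x, idx):
    """Return the elements of 1-D tensor `x` whose positions are not in `idx`.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    # The mask must be boolean: an integer 0/1 tensor would be used as indices.
    mask = torch.ones(x.shape[0], dtype=torch.bool, device=x.device)
    mask[idx] = False
    return x[mask]

x = torch.tensor([1, 2, 3, 4, 5])
idx = torch.tensor([0, 2, 4])
rest = select_complement(x, idx)  # keeps the elements at positions 1 and 3
```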

How can I transpose arrays in a numpy matrix?

Similar to the premise in this question, I'd like to transpose each sub-array in the matrix. However, my sub-arrays are of different sizes. I've tried the following lines of code:
import numpy as np
test_array = np.array([
np.array([[1, 1, 1, 1],
[1, 1, 1, 1]]),
np.array([[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2]]),
np.array([[3, 3],
[3, 3],
[3, 3]])
])
new_test_array = np.apply_along_axis(test_array, 0, np.transpose)
*** numpy.AxisError: axis 0 is out of bounds for array of dimension 0
new_test_array = np.transpose(test_array, (0, 2, 1))
*** ValueError: axes don't match array
new_test_array = np.array(list(map(np.transpose, test_array)))
returns original array
My expected output is
new_test_array = np.array([
np.array([[1, 1],
[1, 1],
[1, 1],
[1, 1]]),
np.array([[2, 2, 2],
[2, 2, 2],
[2, 2, 2],
[2, 2, 2]]),
np.array([[3, 3, 3],
[3, 3, 3]])
])
In short, you can do this on your data to get what you want:
new_test_array = [np.transpose(x) for x in test_array]
However, a regular NumPy array cannot hold sub-arrays of different sizes, so your example does not actually build a 3-D array. That is also why your attempts did not work.
A cleaner approach is to start from a plain list, convert each item into a NumPy array, and then transpose each array individually.
Here's an example:
test_list = [[[1, 1, 1, 1],
[1, 1, 1, 1]],
[[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2]],
[[3, 3],
[3, 3],
[3, 3]]]
list_of_arrays = [np.array(x) for x in test_list]
transposed_arrays = [np.transpose(x) for x in list_of_arrays]
Printing transposed_arrays will give you this:
[array([[1, 1],
[1, 1],
[1, 1],
[1, 1]]),
array([[2, 2, 2],
[2, 2, 2],
[2, 2, 2],
[2, 2, 2]]),
array([[3, 3, 3],
[3, 3, 3]])]
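If you really want the result in a NumPy container rather than a list, one option is a 1-D array with dtype=object that is filled slot by slot. The following is a sketch under that assumption; the variable names (`ragged`, `transposed`) are only illustrative.

```python
import numpy as np

arrays = [np.ones((2, 4), dtype=int),
          np.full((3, 4), 2),
          np.full((3, 2), 3)]

# Pre-allocate a 1-D object array and fill it element by element, so NumPy
# does not try to merge the sub-arrays into one multi-dimensional array.
ragged = np.empty(len(arrays), dtype=object)
for i, a in enumerate(arrays):
    ragged[i] = a

# Transpose every sub-array individually.
transposed = np.empty(len(ragged), dtype=object)
for i, a in enumerate(ragged):
    transposed[i] = a.T

shapes = [a.shape for a in transposed]
```

Keep in mind that an object array is just a container of references, so vectorized operations such as np.transpose on the outer array do not reach into the sub-arrays.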

How to expand the elements of a numpy matrix into sub matrices [duplicate]

Let's say I have a numpy array:
x = np.array([[1, 2],
[3, 4]])
What is the easiest way to expand the elements into submatrices?
An intermediary result could look like this:
x = np.array([[[[1, 1],[1, 1]], [[2, 2],[2, 2]]],
[[[3, 3],[3, 3]], [[4, 4],[4, 4]]]])
And the desired result:
x = np.array([[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]])
You can use two repeats over the desired axes:
In [34]: np.repeat(np.repeat(x, 2, 1), 2, 0)
Out[34]:
array([[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]])
Or as a faster approach (more suitable for larger arrays and repeat numbers) you can use as_strided:
In [43]: from numpy.lib.stride_tricks import as_strided
In [44]: r, c = x.shape
In [45]: rs, cs = x.strides
In [46]: result = as_strided(x, (r, 2, c, 2), (rs, 0, cs, 0))
In [47]: result.reshape(r*2, c*2)
Out[47]:
array([[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]])
You can use numpy.repeat for the task. It has an axis argument.
>>> a = np.array([[1, 2], [3, 4]])
>>> a
array([[1, 2],
[3, 4]])
>>> np.repeat(a, 2)
array([1, 1, 2, 2, 3, 3, 4, 4])
>>> np.repeat(a, 2, axis=1)
array([[1, 1, 2, 2],
[3, 3, 4, 4]])
>>> np.repeat(np.repeat(a, 2, axis=1), 2, axis=0)
array([[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]])
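Another way to express the same block expansion is the Kronecker product with a block of ones. This is offered as an alternative sketch, not as something the answers above used; it is compared against the double np.repeat to show that both give the same array.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# Each element of `a` is multiplied by a 2x2 block of ones,
# which turns it into a 2x2 sub-matrix filled with that element.
block = np.ones((2, 2), dtype=a.dtype)
upsampled = np.kron(a, block)

# The double repeat from the answers, for comparison.
via_repeat = np.repeat(np.repeat(a, 2, axis=1), 2, axis=0)
```

np.kron performs a multiplication for every output element, so for large arrays np.repeat is usually the more direct choice.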

Reshaping a multidimensional Numpy array

I have a numpy array of shape (1429,1) where each row itself is a numpy array of shape (3,100) where l may vary from row to row.
How can I reshape this array by flattening each row such that the resulting numpy array will have the shape (1429, 300)?
I guess your initial array's shape is (1429, 3, 100). If that's true, you can change its shape as below:
import numpy as np
a = a.flatten().reshape((1429, 300)) #a is the initial numpy array
The dtype of your outer array is probably object: it is just a collection of references to 1429 numpy.ndarrays.
As an example:
a=np.empty((1429,1),object)
for x in a :
    x[0]=np.random.rand(3,100)
In [19]: a.shape,a.dtype
Out[19]: ((1429, 1), dtype('O'))
In [20]: a[0,0].shape
Out[20]: (3, 100)
The data is therefore probably not contiguous in memory. To get a single block containing all of it, you must rebuild the array with the right layout:
b=np.array([x.ravel() for x in a.ravel()])
In [21]: b.shape
Out[21]: (1429, 300)
ravel discards the unwanted dimensions.
Assuming it is an object dtype array with shape (1429,1), and all elements are 2d of shape (3,100), a good way to 'flatten' is to use concatenate or stack.
np.stack(arr.ravel()).reshape(-1,300)
I use arr.ravel() so the array looks like a (1429) element list to stack. stack then concatenates the elements, creating a (1429, 3, 100) array. The reshape then converts that to (1429, 300).
In [939]: arr = np.empty((5,1),object)
In [940]: arr[:,0] = [np.arange(6).reshape(2,3) for _ in range(5)]
In [941]: arr
Out[941]:
array([[array([[0, 1, 2],
[3, 4, 5]])],
[array([[0, 1, 2],
[3, 4, 5]])],
[array([[0, 1, 2],
[3, 4, 5]])],
[array([[0, 1, 2],
[3, 4, 5]])],
[array([[0, 1, 2],
[3, 4, 5]])]], dtype=object)
In [942]: np.stack(arr.ravel())
Out[942]:
array([[[0, 1, 2],
[3, 4, 5]],
[[0, 1, 2],
[3, 4, 5]],
[[0, 1, 2],
[3, 4, 5]],
[[0, 1, 2],
[3, 4, 5]],
[[0, 1, 2],
[3, 4, 5]]])
In [943]: np.stack(arr.ravel()).reshape(-1,6)
Out[943]:
array([[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5]])
np.stack with the default axis=0 is the same as np.array(...).
Or with concatenate:
In [950]: np.concatenate(arr.ravel(),axis=0)
Out[950]:
array([[0, 1, 2],
[3, 4, 5],
[0, 1, 2],
[3, 4, 5],
[0, 1, 2],
[3, 4, 5],
[0, 1, 2],
[3, 4, 5],
[0, 1, 2],
[3, 4, 5]])
In [951]: np.concatenate(arr.ravel(),axis=0).reshape(5,6)
Out[951]:
array([[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5]])
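Since np.stack only works when every element has the same shape, it can help to check that first and fail with a clear message. The helper below is a sketch; `flatten_object_rows` is a made-up name, and it assumes every element of the object array is itself an ndarray.

```python
import numpy as np

def flatten_object_rows(arr):
    """Turn an object array of equally shaped sub-arrays into a 2-D array.

    Hypothetical helper for illustration; assumes each element is an ndarray.
    """
    items = list(arr.ravel())
    shapes = {item.shape for item in items}
    if len(shapes) != 1:
        # np.stack needs identical shapes; report the mismatch explicitly.
        raise ValueError(f"sub-arrays have different shapes: {sorted(shapes)}")
    # One row per element, each element flattened in row-major order.
    return np.stack(items).reshape(len(items), -1)

arr = np.empty((5, 1), dtype=object)
for i in range(5):
    arr[i, 0] = np.arange(6).reshape(2, 3)

flat = flatten_object_rows(arr)  # shape (5, 6)
```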

How to add ones to matrix?

I have an array:
X = [[2, 2, 2],
[3, 3, 3],
[4, 4, 4]]
I need to add an extra column to the numpy array and fill it with ones, using hstack and reshape, like this:
X = [[2, 2, 2, 1],
[3, 3, 3, 1],
[4, 4, 4, 1]]
What I do:
X = np.hstack(X, np.ones(X.reshape(X, (2,3))))
And I get an error:
TypeError: only length-1 arrays can be converted to Python scalars
What's the problem? What have I done wrong?
Here are a couple of ways, using numpy.append, numpy.hstack or numpy.column_stack:
# numpy is imported as np
>>> x
array([[2, 2, 2],
[3, 3, 3],
[4, 4, 4]])
>>> np.append(x, np.ones([x.shape[0], 1], dtype=np.int32), axis=1)
array([[2, 2, 2, 1],
[3, 3, 3, 1],
[4, 4, 4, 1]])
>>> np.hstack([x, np.ones([x.shape[0], 1], dtype=np.int32)])
array([[2, 2, 2, 1],
[3, 3, 3, 1],
[4, 4, 4, 1]])
>>> np.column_stack([x, np.ones([x.shape[0], 1], dtype=np.int32)])
array([[2, 2, 2, 1],
[3, 3, 3, 1],
[4, 4, 4, 1]])
You can use numpy.insert():
>>> X
array([[2, 2, 2],
[3, 3, 3],
[4, 4, 4]])
Ones at the beginning of the matrix:
>>> np.insert(X, 0, 1, axis=1)
array([[1, 2, 2, 2],
[1, 3, 3, 3],
[1, 4, 4, 4]])
Ones at the end of the matrix (again starting from the original X):
>>> np.insert(X, 3, 1, axis=1)
array([[2, 2, 2, 1],
[3, 3, 3, 1],
[4, 4, 4, 1]])
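As for the hstack-and-reshape combination the question asked about, here is a minimal sketch. Two things matter: np.ones expects a shape (not an array), and np.hstack expects a single sequence of arrays. The variable names are only illustrative.

```python
import numpy as np

X = np.array([[2, 2, 2],
              [3, 3, 3],
              [4, 4, 4]])

# Build a 1-D vector of ones (one per row) and reshape it into a column.
ones_col = np.ones(X.shape[0], dtype=X.dtype).reshape(-1, 1)

# np.hstack takes ONE sequence (tuple or list) of arrays.
with_ones = np.hstack((X, ones_col))

# Without dtype=..., np.ones is float64 and the result is upcast to float.
upcast = np.hstack((X, np.ones((X.shape[0], 1))))
```

Passing dtype=X.dtype keeps the integer dtype of the original array; if you want floats anyway (e.g. for a bias column in a regression), the default is fine.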
