Inserting an array of arrays as the last column - python

I have an array A:
array([[1, 2, 3],
[1, 1, 1],
[2, 2, 2]])
and an array B:
array([[1, 0],
[1, 0],
[0, 1]])
I want to make array B the last column of array A, so the result array (let's call it C) should look like this:
array([[1, 2, 3, [1, 0]],
[1, 1, 1, [1, 0]],
[2, 2, 2, [0, 1]]])
I tried np.insert(a,-1,b,axis=1), but this gave me an error:
ValueError: could not broadcast input array from shape (2,3) into shape (3,3)

Maybe that's what you're looking for:
import numpy as np
a = np.array([[1, 2, 3],
[1, 1, 1],
[2, 2, 2]])
b = np.array([[1, 0],
[1, 0],
[0, 1]])
np.hstack([a,b])
Which results in:
array([[1, 2, 3, 1, 0],
[1, 1, 1, 1, 0],
[2, 2, 2, 0, 1]])

print(list(zip(*(a.T.tolist() + [b.tolist()]))))
although it won't be a numpy array afterwards
>>> a
array([[1, 2, 3],
[1, 1, 1],
[2, 2, 2]])
>>> b
array([[1, 0],
[1, 0],
[0, 1]])
>>> list(zip(*(a.T.tolist() + [b.tolist()])))
[(1, 2, 3, [1, 0]), (1, 1, 1, [1, 0]), (2, 2, 2, [0, 1])]
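If the nested layout from the question is really needed as a NumPy array, the array has to use dtype=object, because a regular integer array cannot hold a list in a single cell. A minimal sketch, assuming the a and b from the question (the name c is only for illustration):

```python
import numpy as np

a = np.array([[1, 2, 3],
              [1, 1, 1],
              [2, 2, 2]])
b = np.array([[1, 0],
              [1, 0],
              [0, 1]])

# Object array with one extra column; each cell can hold any Python object.
c = np.empty((a.shape[0], a.shape[1] + 1), dtype=object)
c[:, :-1] = a                        # copy the numeric columns
for i, row in enumerate(b.tolist()):
    c[i, -1] = row                   # store each row of b as a list in the last cell

print(c)
```

Object arrays give up most of NumPy's vectorized speed, so np.hstack is usually the better choice whenever a flat (3, 5) result is acceptable.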

Related

How does the transpose of high-dimensional arrays work?

The transpose of a 2-D array is easy to understand, but I really cannot understand how the transpose of a higher-dimensional array works.
For example:
c = np.indices([4,5]).T.reshape(20,1,2)
d = np.indices([4,5]).reshape(20,1,2)
np.all(c==d) # output is False
Why do c and d differ?
In [143]: c = np.indices([4,5])
In [144]: c
Out[144]:
array([[[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[2, 2, 2, 2, 2],
[3, 3, 3, 3, 3]],
[[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]]])
In [145]: c.shape
Out[145]: (2, 4, 5)
In [146]: c.T.shape
Out[146]: (5, 4, 2)
Look at one 2d array from the size 2 dimension:
In [150]: c[0,:,:]
Out[150]:
array([[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[2, 2, 2, 2, 2],
[3, 3, 3, 3, 3]])
In [151]: c.T[:,:,0]
Out[151]:
array([[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3]])
The 2nd is the usual 2d transpose, a (5,4) array.
MATLAB doesn't transpose 3d arrays, or at least doesn't call the operation a transpose; its permute function rearranges dimensions instead. numpy, with its general shape/strides multidimensional implementation, easily generalizes the 2d transpose to 1d, 3d, or more: .T simply reverses the order of the axes.
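The axis-reversal rule can be checked directly. The sketch below reuses np.indices([4, 5]) from the question; the names t, with_T and without_T are only for illustration. It shows that c.T is the same as c.transpose(2, 1, 0), and that reshaping after the transpose reads the elements in a different order than reshaping the original array:

```python
import numpy as np

c = np.indices([4, 5])               # shape (2, 4, 5)
t = c.T                              # shape (5, 4, 2): axes in reverse order

# .T is shorthand for reversing all axes.
assert np.array_equal(t, c.transpose(2, 1, 0))
# Element rule: t[k, j, i] == c[i, j, k]
assert t[3, 2, 1] == c[1, 2, 3]

# reshape walks the array in row-major order, so the axis order matters.
with_T = c.T.reshape(20, 1, 2)       # each entry is a (row, column) index pair
without_T = c.reshape(20, 1, 2)      # pairs of neighbouring values from c[0], then c[1]
print(np.array_equal(with_T, without_T))
print(with_T[:3, 0])
print(without_T[:3, 0])
```

Since c.T[k, j] holds the pair (j, k), only the transposed version yields index pairs after the reshape.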

PyTorch indexing: select complement of indices

Say I have a tensor and index:
x = torch.tensor([1,2,3,4,5])
idx = torch.tensor([0,2,4])
If I want to select all elements not in the index, I can manually define a Boolean mask like so:
mask = torch.ones_like(x, dtype=torch.bool)
mask[idx] = False
x[mask]
Is there a more elegant way of doing this?
That is, a syntax where I can pass the indices directly instead of creating a mask, e.g. something like:
x[~idx]
I couldn't find a satisfactory way to compute the complement of a multi-dimensional tensor of indices, so I implemented my own. It works on CUDA and relies only on parallel tensor operations, so it is fast.
import torch

def complement_idx(idx, dim):
    """
    Compute the complement: set(range(dim)) - set(idx).
    idx is a multi-dimensional tensor; the complement is taken along its trailing dimension,
    and all other dimensions are treated as batch dimensions.
    Args:
        idx: input index, shape: [N, *, K]
        dim: the max index for complement
    """
    a = torch.arange(dim, device=idx.device)
    ndim = idx.ndim
    dims = idx.shape
    n_idx = dims[-1]
    dims = dims[:-1] + (-1, )
    # Add leading batch dimensions, then expand to match idx.
    for i in range(1, ndim):
        a = a.unsqueeze(0)
    a = a.expand(*dims)
    # Zero out the selected indices; after sorting, the zeros come first.
    masked = torch.scatter(a, -1, idx, 0)
    compl, _ = torch.sort(masked, dim=-1, descending=False)
    # Move the last dimension to the front, drop the first n_idx entries, move it back.
    compl = compl.permute(-1, *tuple(range(ndim - 1)))
    compl = compl[n_idx:].permute(*(tuple(range(1, ndim)) + (0,)))
    return compl
Example:
>>> import torch
>>> a = torch.rand(3, 4, 5)
>>> a
tensor([[[0.7849, 0.7404, 0.4112, 0.9873, 0.2937],
[0.2113, 0.9923, 0.6895, 0.1360, 0.2952],
[0.9644, 0.9577, 0.2021, 0.6050, 0.7143],
[0.0239, 0.7297, 0.3731, 0.8403, 0.5984]],
[[0.9089, 0.0945, 0.9573, 0.9475, 0.6485],
[0.7132, 0.4858, 0.0155, 0.3899, 0.8407],
[0.2327, 0.8023, 0.6278, 0.0653, 0.2215],
[0.9597, 0.5524, 0.2327, 0.1864, 0.1028]],
[[0.2334, 0.9821, 0.4420, 0.1389, 0.2663],
[0.6905, 0.2956, 0.8669, 0.6926, 0.9757],
[0.8897, 0.4707, 0.5909, 0.6522, 0.9137],
[0.6240, 0.1081, 0.6404, 0.1050, 0.6413]]])
>>> b, c = torch.topk(a, 2, dim=-1)
>>> b
tensor([[[0.9873, 0.7849],
[0.9923, 0.6895],
[0.9644, 0.9577],
[0.8403, 0.7297]],
[[0.9573, 0.9475],
[0.8407, 0.7132],
[0.8023, 0.6278],
[0.9597, 0.5524]],
[[0.9821, 0.4420],
[0.9757, 0.8669],
[0.9137, 0.8897],
[0.6413, 0.6404]]])
>>> c
tensor([[[3, 0],
[1, 2],
[0, 1],
[3, 1]],
[[2, 3],
[4, 0],
[1, 2],
[0, 1]],
[[1, 2],
[4, 2],
[4, 0],
[4, 2]]])
>>> compl = complement_idx(c, 5)
>>> compl
tensor([[[1, 2, 4],
[0, 3, 4],
[2, 3, 4],
[0, 2, 4]],
[[0, 1, 4],
[1, 2, 3],
[0, 3, 4],
[2, 3, 4]],
[[0, 3, 4],
[0, 1, 3],
[1, 2, 3],
[0, 1, 3]]])
>>> al = torch.cat([c, compl], dim=-1)
>>> al
tensor([[[3, 0, 1, 2, 4],
[1, 2, 0, 3, 4],
[0, 1, 2, 3, 4],
[3, 1, 0, 2, 4]],
[[2, 3, 0, 1, 4],
[4, 0, 1, 2, 3],
[1, 2, 0, 3, 4],
[0, 1, 2, 3, 4]],
[[1, 2, 0, 3, 4],
[4, 2, 0, 1, 3],
[4, 0, 1, 2, 3],
[4, 2, 0, 1, 3]]])
>>> al, _ = al.sort(dim=-1)
>>> al
tensor([[[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]],
[[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]],
[[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]]])
You may want to try the single-line expression:
x[np.setdiff1d(range(len(x)), idx)]
Though it is arguably not elegant either :).

Transposing a matrix using python numpy

This is my current matrix:
[[0, 1, 2, 4],
[0, 3, 1, 3],
[0, 2, 3, 2],
[0, 2, 4, 1],
[0, 4, 1, 2],
[0, 3, 2, 2],
[1, 2, 2, 2]]
I want to transpose it and get this as output:
[[0, 0, 0, 0, 1],
[2, 2, 4, 3, 2],
[3, 4, 1, 2, 2],
[2, 1, 2, 2, 2]]
I used inverse = np.swapaxes(ate,0,7), but I am not sure what my axis2 value should be. Here axis2 is 7.
I think what you're looking for is np.transpose().
You can use np.swapaxes, but it swaps "dimensions" (axes), so for a matrix the only valid axes are 0 and 1, because it has two dimensions:
>>> np.swapaxes(arr, 0, 1) # assuming your matrix is called arr
array([[0, 0, 0, 0, 0, 0, 1],
[1, 3, 2, 2, 4, 3, 2],
[2, 1, 3, 4, 1, 2, 2],
[4, 3, 2, 1, 2, 2, 2]])
To get your desired output you'd need to remove the first two rows before the swapaxes:
>>> np.swapaxes(arr[2:], 0, 1)
array([[0, 0, 0, 0, 1],
[2, 2, 4, 3, 2],
[3, 4, 1, 2, 2],
[2, 1, 2, 2, 2]])
However generally you should use np.transpose or .T if you want to transpose the matrix/array:
>>> arr[2:].T
array([[0, 0, 0, 0, 1],
[2, 2, 4, 3, 2],
[3, 4, 1, 2, 2],
[2, 1, 2, 2, 2]])
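The points above can be verified in a few lines. This is a small sketch using the matrix from the question; the names t, axis_error and desired are only for illustration:

```python
import numpy as np

arr = np.array([[0, 1, 2, 4],
                [0, 3, 1, 3],
                [0, 2, 3, 2],
                [0, 2, 4, 1],
                [0, 4, 1, 2],
                [0, 3, 2, 2],
                [1, 2, 2, 2]])

# For a 2-D array, all three spellings give the same (4, 7) result.
t = arr.T
assert np.array_equal(np.swapaxes(arr, 0, 1), t)
assert np.array_equal(np.transpose(arr), t)

# Axis 7 does not exist on a 2-D array, so NumPy rejects it.
try:
    np.swapaxes(arr, 0, 7)
    axis_error = None
except ValueError as exc:            # NumPy's AxisError is a subclass of ValueError
    axis_error = exc

# Dropping the first two rows before transposing gives the (4, 5) target.
desired = arr[2:].T
print(desired)
```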

np.choose not giving desired result after broadcasting

I would like to pick the nth elements from suitCounts, where n is given by maxsuit. I broadcast the maxsuit array, so I do get a result, but not the desired one. Any suggestions about what I'm doing conceptually wrong are appreciated. I don't understand the result of np.choose(self.maxsuit[:,:,None]-1, self.suitCounts), which is not what I'm looking for.
>>> self.maxsuit
Out[38]:
array([[3, 3],
[1, 1],
[1, 1]], dtype=int64)
>>> self.maxsuit[:,:,None]-1
Out[33]:
array([[[2],
[2]],
[[0],
[0]],
[[0],
[0]]], dtype=int64)
>>> self.suitCounts
Out[34]:
array([[[2, 1, 3, 0],
[1, 0, 3, 0]],
[[4, 1, 2, 0],
[3, 0, 3, 0]],
[[2, 2, 0, 0],
[1, 1, 1, 0]]])
>>> np.choose(self.maxsuit[:,:,None]-1, self.suitCounts)
Out[35]:
array([[[2, 2, 0, 0],
[1, 1, 1, 0]],
[[2, 1, 3, 0],
[1, 0, 3, 0]],
[[2, 1, 3, 0],
[1, 0, 3, 0]]])
The desired result would be:
[[3,3],[4,3],[2,1]]
You could use advanced-indexing for a broadcasted way to index into the array, like so -
In [415]: val # Data array
Out[415]:
array([[[2, 1, 3, 0],
[1, 0, 3, 0]],
[[4, 1, 2, 0],
[3, 0, 3, 0]],
[[2, 2, 0, 0],
[1, 1, 1, 0]]])
In [416]: idx # Indexing array
Out[416]:
array([[3, 3],
[1, 1],
[1, 1]])
In [417]: m,n = val.shape[:2]
In [418]: val[np.arange(m)[:,None],np.arange(n),idx-1]
Out[418]:
array([[3, 3],
[4, 3],
[2, 1]])
A bit cleaner way with np.ogrid to use open range arrays -
In [424]: d0,d1 = np.ogrid[:m,:n]
In [425]: val[d0,d1,idx-1]
Out[425]:
array([[3, 3],
[4, 3],
[2, 1]])
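On NumPy 1.15 or later, np.take_along_axis can express the same lookup without building the row and column index arrays. A sketch, assuming the val and idx arrays shown above (idx is 1-based, hence the idx - 1; the name picked is only for illustration):

```python
import numpy as np

val = np.array([[[2, 1, 3, 0],
                 [1, 0, 3, 0]],
                [[4, 1, 2, 0],
                 [3, 0, 3, 0]],
                [[2, 2, 0, 0],
                 [1, 1, 1, 0]]])
idx = np.array([[3, 3],
                [1, 1],
                [1, 1]])

# The index array needs the same number of dimensions as val,
# so add a trailing axis, pick along the last axis, then drop it again.
picked = np.take_along_axis(val, (idx - 1)[..., None], axis=-1)[..., 0]
print(picked)
```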
This is the best I can do with choose:
In [23]: np.choose([[1,2,0],[1,2,0]], suitcounts[:,:,:3])
Out[23]:
array([[4, 2, 3],
[3, 1, 3]])
choose prefers a list of arrays rather than a single array; this is supposed to prevent misuse. So the problem could be written as:
In [24]: np.choose([[1,2,0],[1,2,0]], [suitcounts[0,:,:3], suitcounts[1,:,:3], suitcounts[2,:,:3]])
Out[24]:
array([[4, 2, 3],
[3, 1, 3]])
The idea is to select items from the 3 subarrays, based on an index array like:
In [25]: np.array([[1,2,0],[1,2,0]])
Out[25]:
array([[1, 2, 0],
[1, 2, 0]])
The output will match the indexing array in shape. The choice arrays have to match in shape as well, hence my use of [...,:3].
Values for the first column are selected from suitcounts[1,:,:3], for the 2nd column from suitcounts[2...] etc.
choose is limited to 32 choices; this is a limitation imposed by the broadcasting mechanism.
Speaking of broadcasting, I could simplify the expression:
In [26]: np.choose([1,2,0], suitcounts[:,:,:3])
Out[26]:
array([[4, 2, 3],
[3, 1, 3]])
This broadcasts [1,2,0] to match the 2x3 shape of the subarrays.
I could get the target order by reordering the columns:
In [27]: np.choose([0,1,2], suitcounts[:,:,[2,0,1]])
Out[27]:
array([[3, 4, 2],
[3, 3, 1]])
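To see why the original call in the question returned whole sub-arrays, it may help to spell out what np.choose computes once the (3, 2, 1) index array is broadcast against the three (2, 4) choice arrays. The sketch below reuses the question's data; the snake_case names are only for illustration:

```python
import numpy as np

suit_counts = np.array([[[2, 1, 3, 0],
                         [1, 0, 3, 0]],
                        [[4, 1, 2, 0],
                         [3, 0, 3, 0]],
                        [[2, 2, 0, 0],
                         [1, 1, 1, 0]]])
max_suit = np.array([[3, 3],
                     [1, 1],
                     [1, 1]])
index = max_suit[:, :, None] - 1     # shape (3, 2, 1)

# np.choose treats the first axis of suit_counts as a list of 3 choices,
# each of shape (2, 4), and broadcasts index to shape (3, 2, 4).
result = np.choose(index, suit_counts)

# Equivalent explicit rule: result[i, j, k] = suit_counts[index[i, j, 0], j, k]
expected = np.empty((3, 2, 4), dtype=suit_counts.dtype)
for i in range(3):
    for j in range(2):
        for k in range(4):
            expected[i, j, k] = suit_counts[index[i, j, 0], j, k]

print(np.array_equal(result, expected))
```

In other words, the index values select which (2, 4) block to read from, not which position along the last axis, which is why advanced indexing fits this problem better.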

Is there a way to compose two matrix of numbers in one matrix of text?

I would like to combine two matrices of numbers into one matrix of formatted text in Python.
Is there an easy way?
I could use for loops, but I would prefer a simpler way because it is more convenient to work with.
As a simple example:
array([[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2]])
array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4]])
to
array([['0:0', '1:0', '2:0'],
['0:1', '1:1', '2:1'],
['0:2', '1:2', '2:2'],
['0:3', '1:3', '2:3'],
['0:4', '1:4', '2:4']])
You can use np.dstack to combine both arrays and a nested comprehension with string manipulation to build each cell of the combined array:
>>> arr = np.dstack((arr1, arr2))
>>> np.array([np.array([':'.join(map(str,cell)) for cell in row ]) for row in arr])
array([['0:0', '1:0', '2:0'],
['0:1', '1:1', '2:1'],
['0:2', '1:2', '2:2'],
['0:3', '1:3', '2:3'],
['0:4', '1:4', '2:4']],
dtype='<U3')
You could use nditer to iterate over the arrays, and make strings as needed: e.g.
import numpy as np
a1 = np.array([[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2]])
a2 = np.array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4]])
out = np.empty(a1.shape, dtype='U5')  # unicode strings, up to 5 characters each
for x, y, o in np.nditer([a1, a2, out], op_flags=['readwrite']):
    o[...] = "{}:{}".format(x, y)
print(out)
Result:
[['0:0' '1:0' '2:0']
['0:1' '1:1' '2:1']
['0:2' '1:2' '2:2']
['0:3' '1:3' '2:3']
['0:4' '1:4' '2:4']]
Use list comprehensions and zip() to form a new array:
from numpy import array
ar1 = array([[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2]])
ar2 = array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4]])
res = array([['%s:%s' % (j1, j2) for j1, j2 in zip(i1, i2)] for i1, i2 in zip(ar1, ar2)])
print(res)
Result:
[['0:0' '1:0' '2:0']
['0:1' '1:1' '2:1']
['0:2' '1:2' '2:2']
['0:3' '1:3' '2:3']
['0:4' '1:4' '2:4']]
This solution also works for plain Python two-dimensional lists (just remove the array() calls).
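A loop-free variant is also possible with NumPy's string routines. This is a sketch, assuming the same two input arrays; it converts both to strings and concatenates them element-wise with np.char.add (the names a1, a2 and res are only for illustration):

```python
import numpy as np

a1 = np.array([[0, 1, 2],
               [0, 1, 2],
               [0, 1, 2],
               [0, 1, 2],
               [0, 1, 2]])
a2 = np.array([[0, 0, 0],
               [1, 1, 1],
               [2, 2, 2],
               [3, 3, 3],
               [4, 4, 4]])

# Element-wise concatenation: "<a1 value>" + ":" + "<a2 value>"
res = np.char.add(np.char.add(a1.astype(str), ':'), a2.astype(str))
print(res)
```

The result is a unicode string array with the same shape as the inputs.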
