np.dot in NumPy printing the transpose of what should be expected - python

I'm really new to Python and am wondering why this is printing the opposite of expected. A (7x4)(4x2)(2x1) multiplication should result in a 7x1 column vector.
import numpy as np
nutrition = np.array([[61, 100, 7, 2.2, 1, 7, 215],
[156, 340, 18, 7, 44, 5, 0],
[19, 110, 9, 3.3, 0, 6, 16],
[27, 60, 2, 0.5, 8, 2, 16]])
meals = np.array([[2, 1, 0, 0],
[0, 1, 1, 1]])
M = np.array([40, 10])
print(np.dot(nutrition.T, np.dot(meals.T, M)))
Instead, it is printing a 1x7 row vector:
[13140. 26700. 1570. 564. 2360. 890. 17520.]
Any explanation or problems to look into would be appreciated.

Your array M is of shape (2,) and NOT (2,1):
print(M.shape)
(2,)
Hence, the output has shape (7,) and NOT (7,1). It is a 1-D array, which NumPy prints on a single row:
print(np.dot(nutrition.T, np.dot(meals.T, M)).shape)
(7,)
If you want a (7,1) output, simply reshape your M to (2,1):
M = M.reshape(-1,1)
#[[40]
# [10]]
And output would be:
[[13140.]
[26700.]
[ 1570.]
[ 564.]
[ 2360.]
[ 890.]
[17520.]]
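As a quick check of the explanation above, here is a minimal sketch that runs the question's data both ways. The names flat and column are only illustrative, and M[:, None] is used as an equivalent to M.reshape(-1, 1).

```python
import numpy as np

nutrition = np.array([[61, 100, 7, 2.2, 1, 7, 215],
                      [156, 340, 18, 7, 44, 5, 0],
                      [19, 110, 9, 3.3, 0, 6, 16],
                      [27, 60, 2, 0.5, 8, 2, 16]])
meals = np.array([[2, 1, 0, 0],
                  [0, 1, 1, 1]])
M = np.array([40, 10])

# A 1-D right-hand operand gives a 1-D result.
flat = nutrition.T @ (meals.T @ M)
# A 2-D column operand of shape (2, 1) gives a column of shape (7, 1).
column = nutrition.T @ (meals.T @ M[:, None])

print(flat.shape)    # (7,)
print(column.shape)  # (7, 1)
```

The values are identical in both cases; only the number of dimensions differs.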

Related

Defining numpy indexing arrays

I am having a point of confusion over numpy indexing. Let's say I have a three-dimensional array, like:
test_arr = np.arange(3*2*3).reshape(3,2,3)
test_arr
array([[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 6, 7, 8],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17]]])
I would like to index this by a boolean array along dimension 1:
dim1_idx = np.array([True, False])
test_arr[:, dim1_idx, :]
which gives me
array([[[ 0, 1, 2]],
[[ 6, 7, 8]],
[[12, 13, 14]]])
All good so far.
My question is, is there a way that I can define this boolean index array in advance - like (and this doesn't work):
all_dim_idx = dim1_idx[np.newaxis, :, np.newaxis]
test_arr[all_dim_idx]
I realize that this doesn't work because all_dim_idx can't be broadcast to fit test_arr. I could use np.tile or np.reshape to make the index array fit the larger array, but that would not generalize to other array shapes, and I get the impression that there's probably a better way. Can anyone enlighten me?
Thanks in advance!
In [600]: test_arr = np.arange(3*2*3).reshape(3,2,3)
In [601]: dim1_idx = np.array([True, False])
Define an indexing tuple:
In [602]: idx = (slice(None), dim1_idx, slice(None))
In [603]: test_arr[idx]
Out[603]:
array([[[ 0, 1, 2]],
[[ 6, 7, 8]],
[[12, 13, 14]]])
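The indexing tuple can be built for any axis and any number of dimensions. The sketch below assumes a small helper, axis_index, which is a hypothetical name and not part of NumPy.

```python
import numpy as np

def axis_index(ndim, axis, index):
    """Build an indexing tuple that applies `index` along `axis` only."""
    idx = [slice(None)] * ndim   # equivalent to ':' on every axis
    idx[axis] = index            # replace one slot with the real index
    return tuple(idx)

test_arr = np.arange(3 * 2 * 3).reshape(3, 2, 3)
dim1_idx = np.array([True, False])

idx = axis_index(test_arr.ndim, 1, dim1_idx)
result = test_arr[idx]
print(result.shape)  # (3, 1, 3)
```

For the boolean case specifically, np.compress(dim1_idx, test_arr, axis=1) should return the same array without building a tuple.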

numpy's hstack on tensorflow for a single matrix/tensor

The numpy version of hstack for a single matrix
c=np.array([[[2,3,4],[4,5,6]],[[20,30,40],[40,50,60]]])
np.hstack(c)
output:
array([[ 2, 3, 4, 20, 30, 40],
[ 4, 5, 6, 40, 50, 60]])
I am hoping to achieve the same behavior in TF.
c_t=tf.constant(c)
tf.stack(c_t,axis=1).eval()
I am getting the error
TypeError: Expected list for 'values' argument to 'pack' Op, not <tf.Tensor 'Const_14:0' shape=(2, 2, 3) dtype=int64>.
So I tried
tf.stack([c_t],axis=1).eval()
The output
array([[[[ 2, 3, 4],
[ 4, 5, 6]]],
[[[20, 30, 40],
[40, 50, 60]]]])
This is not the behaviour I am looking for. tf.reshape and tf.concat are not helping me either.
We can swap/permute axes and reshape -
tf.reshape(tf.transpose(c_t,(1,0,2)),(c_t.shape[1],-1))
Relevant - Intuition and idea behind reshaping 4D array to 2D array in NumPy
One way to make it work is to first unstack the tensor into a list, and then concatenate the tensors in the list along axis 1:
new_c = tf.concat(tf.unstack(c_t), axis=1)
sess.run(new_c)
array([[ 2, 3, 4, 20, 30, 40],
[ 4, 5, 6, 40, 50, 60]])
If you want to do it manually, slice by slice, the approach below works as well.
In [132]: c=np.array([[[2,3,4],[4,5,6]],[[20,30,40],[40,50,60]]])
In [133]: tfc = tf.convert_to_tensor(c)
In [134]: slices = [tf.squeeze(tfc[:1, ...]), tf.squeeze(tfc[1:, ...])]
In [135]: stacked = tf.concat(slices, axis=1)
In [136]: stacked.eval()
Out[136]:
array([[ 2, 3, 4, 20, 30, 40],
[ 4, 5, 6, 40, 50, 60]])
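The transpose-and-reshape and unstack-and-concatenate ideas above can be checked in plain NumPy, where the calls mirror the TensorFlow ones. This is only a sketch of the equivalence, not a TensorFlow test; the variable names are illustrative.

```python
import numpy as np

c = np.array([[[2, 3, 4], [4, 5, 6]],
              [[20, 30, 40], [40, 50, 60]]])

expected = np.hstack(c)

# Swap the first two axes, then merge the last two into one.
via_transpose = c.transpose(1, 0, 2).reshape(c.shape[1], -1)

# Split along axis 0 and join the pieces along axis 1.
via_concat = np.concatenate(list(c), axis=1)

print(np.array_equal(expected, via_transpose))  # True
print(np.array_equal(expected, via_concat))     # True
```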

Fastest method for determining if 2 (vertically or horizontally) adjacent elements of a numpy array have the same value

I am looking for the fastest way of determining if 2 (vertically or horizontally) adjacent elements have the same value.
Let's say I have a numpy array of size 4x4.
array([
[8, 7, 4, 3],
[8, 4, 0, 4],
[3, 2, 2, 1],
[9, 8, 7, 6]])
I want to be able to identify that there are two adjacent 8s in the first column and there are two adjacent 2s in the third row. I could hard code a check but that would be ugly and I want to know if there is a faster way.
All guidance is appreciated. Thank you.
We can compute the differences between neighbouring elements along rows and columns and look for zeros, which signal repeated values. Thus, we could do -
(np.diff(a,axis=0) == 0).any() | (np.diff(a,axis=1) == 0).any()
Or with slicing for performance boost -
(a[1:] == a[:-1]).any() | (a[:,1:] == a[:,:-1]).any()
So, (a[1:] == a[:-1]).any() is the vertical adjacency, whereas the other one is for horizontal one.
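Applied to the 4x4 array from the question, the slicing comparison can be sketched as follows. np.argwhere is added here only to show where the matches are; the names vertical and horizontal are illustrative.

```python
import numpy as np

a = np.array([[8, 7, 4, 3],
              [8, 4, 0, 4],
              [3, 2, 2, 1],
              [9, 8, 7, 6]])

vertical = a[1:] == a[:-1]          # compares each row with the row above it
horizontal = a[:, 1:] == a[:, :-1]  # compares each column with the one to its left

has_adjacent = vertical.any() or horizontal.any()
print(has_adjacent)             # True
print(np.argwhere(vertical))    # [[0 0]]
print(np.argwhere(horizontal))  # [[2 1]]
```

Each reported coordinate is the first element of the matching pair, so (0, 0) is the upper 8 and (2, 1) is the left 2.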
Extending to n adjacent ones (of same value) along rows or columns -
from scipy.ndimage import convolve1d as conv
def vert_horz_adj(a, n=1):
    k = np.ones(n,dtype=int)
    v = (conv((a[1:]==a[:-1]).astype(int),k,axis=0,mode='constant')>=n).any()
    h = (conv((a[:,1:]==a[:,:-1]).astype(int),k,axis=1,mode='constant')>=n).any()
    return v | h
Sample run -
In [413]: np.random.seed(0)
...: a = np.random.randint(11,99,(10,4))
...: a[[2,3,4,6,7,8],0] = 1
In [414]: a
Out[414]:
array([[55, 58, 75, 78],
[78, 20, 94, 32],
[ 1, 98, 81, 23],
[ 1, 76, 50, 98],
[ 1, 92, 48, 36],
[88, 83, 20, 31],
[ 1, 80, 90, 58],
[ 1, 93, 60, 40],
[ 1, 30, 25, 50],
[43, 76, 20, 68]])
In [415]: vert_horz_adj(a, n=1)
Out[415]: True # Because of first col
In [416]: vert_horz_adj(a, n=2)
Out[416]: True # Because of first col
In [417]: vert_horz_adj(a, n=3)
Out[417]: False
In [418]: a[-1] = 10
In [419]: vert_horz_adj(a, n=3)
Out[419]: True # Because of last row
You can find the coordinates of the pairs with the following code:
import numpy as np
a = np.array([
[8, 7, 4, 3],
[8, 4, 0, 4],
[3, 2, 2, 1],
[9, 8, 7, 6]])
# drop only the first row/column (the wrap-around comparison from np.roll)
vertical = np.where((a == np.roll(a, 1, 0))[1:])
print(vertical) # (0,0) is the coordinate of the first of the repeating 8's
horizontal = np.where((a == np.roll(a, 1, 1))[:, 1:])
print(horizontal) # (2,1) is the coordinate of the first of the repeating 2's
which returns
(array([0], dtype=int64), array([0], dtype=int64))
(array([2], dtype=int64), array([1], dtype=int64))
If you want to locate the first occurrence of each pair:
A=np.array([
[8, 7, 4, 3],
[8, 4, 0, 4],
[3, 2, 2, 1],
[9, 8, 7, 6]])
x=(A[1:]==A[:-1]).nonzero()
y=(A[:,1:]==A[:,:-1]).nonzero()
In [45]: x
Out[45]: (array([0], dtype=int64), array([0], dtype=int64))
In [47]: y
Out[47]: (array([2], dtype=int64), array([1], dtype=int64))
In [48]: A[x]
Out[48]: array([8])
In [49]: A[y]
Out[49]: array([2])
x and y give respectively the locations of the first 8 and the first 2.

Indexing tensor with binary matrix in numpy

I have a tensor A such that A.shape = (32, 19, 2) and a binary matrix B such that B.shape = (32, 19). Is there a one-line operation I can perform to get a matrix C, where C.shape = (32, 19) and C(i,j) = A[i, j, B[i,j]]?
Essentially, I want to use B as an indexing matrix, where if B[i,j] = 1 I take A[i,j,1] to form C(i,j).
np.where to the rescue. It's the same principle as mtrw's answer:
In [344]: A=np.arange(4*3*2).reshape(4,3,2)
In [345]: B=np.zeros((4,3),dtype=int)
In [346]: B[[0,1,1,2,3],[0,0,1,2,2]]=1
In [347]: B
Out[347]:
array([[1, 0, 0],
[1, 1, 0],
[0, 0, 1],
[0, 0, 1]])
In [348]: np.where(B,A[:,:,1],A[:,:,0])
Out[348]:
array([[ 1, 2, 4],
[ 7, 9, 10],
[12, 14, 17],
[18, 20, 23]])
np.choose can be used if the last dimension is larger than 2 (but smaller than 32). choose operates on a list or on the 1st dimension, hence the rollaxis.
In [360]: np.choose(B,np.rollaxis(A,2))
Out[360]:
array([[ 1, 2, 4],
[ 7, 9, 10],
[12, 14, 17],
[18, 20, 23]])
B can also be used directly as an index. The trick is to specify the other dimensions in a way that broadcasts to the same shape.
In [373]: A[np.arange(A.shape[0])[:,None], np.arange(A.shape[1])[None,:], B]
Out[373]:
array([[ 1, 2, 4],
[ 7, 9, 10],
[12, 14, 17],
[18, 20, 23]])
This last approach can be modified to work when B does not match the 1st 2 dimensions of A.
np.ix_ may simplify this indexing
I, J = np.ix_(np.arange(4),np.arange(3))
A[I, J, B]
You can do it using list comprehension:
C = np.array([[A[i, j, B[i, j]] for j in range(A.shape[1])] for i in range(A.shape[0])])
C = A[:,:,0]*(B==0) + A[:,:,1]*(B==1) should work. You can generalize this as np.sum([A[:,:,k]*(B==k) for k in np.arange(A.shape[-1])], axis=0) if you need to index more planes.
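Another option, not mentioned in the answers above, is np.take_along_axis (available in NumPy 1.15 and later). This is a minimal sketch reusing the small A and B from the np.where example:

```python
import numpy as np

A = np.arange(4 * 3 * 2).reshape(4, 3, 2)
B = np.zeros((4, 3), dtype=int)
B[[0, 1, 1, 2, 3], [0, 0, 1, 2, 2]] = 1

# B needs a trailing axis so it has the same number of dimensions as A;
# the result has shape (4, 3, 1), so the last axis is dropped again.
C = np.take_along_axis(A, B[..., np.newaxis], axis=2)[..., 0]
print(C)
```

This also works when the last dimension of A has more than two entries, as long as B holds valid integer indices into it.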

Apply Mask Array 2d to 3d

I want to apply a mask of 2 dimensions (an NxM array) to a 3 dimensional array (a KxNxM array). How can I do this?
2d = lat x lon
3d = time x lat x lon
import numpy as np
a = np.array(
[[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
b = np.array(
[[ 0, 1, 0],
[ 1, 0, 1],
[ 0, 1, 1]])
c = np.ma.array(a, mask=b) # this behavior is wanted
There are quite a few different ways to choose from. What you want to do is align the lower-dimensional mask with the array that has the extra dimension. The important part is that the mask ends up with the same number of elements as the array, as the first example shows:
np.ma.array(a, mask=np.concatenate((b,b,b))) # shapes are (3, 3, 3) and (9, 3)
np.ma.array(a, mask=np.tile(b, (a.shape[0],1))) # same as above, just more general as it doesn't require you to specify just how many times you need to stack b.
np.ma.array(a, mask=a*b[np.newaxis,:,:]) # uses broadcasting; note this only masks where a is nonzero, so it relies on a having no zeros where b is 1
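A variant that does not depend on the values in a is to broadcast the mask itself to the shape of a. This is a sketch using np.broadcast_to, which is not in the original answer; the .astype(bool) call makes a writable full-size copy of the mask.

```python
import numpy as np

a = np.arange(27).reshape(3, 3, 3)   # time x lat x lon
b = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 1]])            # lat x lon

# Repeat the 2-D mask along the leading (time) axis without tiling by hand.
mask = np.broadcast_to(b, a.shape).astype(bool)
c = np.ma.array(a, mask=mask)

print(c.mask.shape)  # (3, 3, 3)
print(c.count())     # 12 unmasked values: 4 per time step
```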
