The question is how to join two arrays more efficiently in this case. There is a numpy array one of shape (N, M, 1) and an array two of shape (M, F). The second array needs to be joined with the first to create an array of shape (N, M, F+1), with the elements of the second array broadcast along N.
One solution is to copy array two so that its size matches the first along all dims but the last, and then concatenate. But if the copying could instead happen as a broadcast during the join/concat, it would use much less memory.
Any suggestions on how to make this more efficient?
The setup:
import numpy as np
arr1 = np.random.randint(0,10,(5,10))
arr1 = np.expand_dims(arr1, axis=-1) #(5,10, 1)
arr2 = np.random.randint(0,4,(10,15))
arr2 = np.expand_dims(arr2, axis=0) #(1, 10, 15)
arr2_2 = arr2
for i in range(len(arr1)-1):
    arr2_2 = np.concatenate([arr2_2, arr2], axis=0)
arr2_2.shape #(5, 10, 15)
np.concatenate([arr1, arr2_2],axis=-1) # (5, 10, 16) -> correct end result
The goal is to join arr1 and arr2 to get that (5, 10, 16) result without the explicit copy.
Try this:
>>> a = np.random.randint(0, 10, (5, 10))
>>> b = np.random.randint(0, 4, (10, 15))
>>> c = np.dstack((a[:, :, np.newaxis], np.broadcast_to(b, (a.shape[0], *b.shape))))
>>> a.shape, b.shape, c.shape
((5, 10), (10, 15), (5, 10, 16))
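A closely related variant (a sketch along the same lines, not part of the original answer) is to pass the broadcast view straight to np.concatenate. np.broadcast_to returns a read-only view, so the only new allocation is the final output array:
import numpy as np
arr1 = np.random.randint(0, 10, (5, 10, 1))
arr2 = np.random.randint(0, 4, (10, 15))
# Broadcast arr2 to a read-only (5, 10, 15) view -- no data is copied here.
arr2_b = np.broadcast_to(arr2, (arr1.shape[0], *arr2.shape))
# concatenate allocates only the final (5, 10, 16) result.
out = np.concatenate([arr1, arr2_b], axis=-1)
print(out.shape)  # (5, 10, 16)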
Related
I have several 3-dimensional numpy arrays that I want to join together to feed them as a training set for my LSTM neural network. They are mostly of shape (1, m, n).
I want to join them so that, e.g., an array of shape (1, 50, 20) joined with another of shape (1, 50, 20) gives (2, 50, 20), and (1, 50, 20) joined with (3, 50, 20) gives (4, 50, 20).
Which of the stack functions of numpy would suit my problem? Or is there another way to solve it more efficiently?
Use np.concatenate along the first axis.
import numpy as np
rng = np.random.default_rng()
a = rng.integers(0, 10, (1, 3, 20))
b = rng.integers(-10, -1, (2, 3, 20))
c = np.concatenate((a, b), axis=0)
print(c.shape)
(3, 3, 20)
Use np.vstack
x = np.array([[[2,3,5],[4,5,1]]])
y = np.array([[[1,5,8],[8,0,9]]])
x.shape
(1, 2, 3)
np.vstack((x, y)).shape
(2, 2, 3)
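For completeness (this comparison is an addition, not part of the original replies): np.vstack is simply concatenation along axis 0, which is exactly what this use case needs, while np.stack creates a new leading axis and therefore requires all inputs to have identical shapes:
import numpy as np
a = np.zeros((1, 50, 20))
b = np.zeros((3, 50, 20))
# Join along the existing first axis -- leading sizes may differ.
print(np.concatenate((a, b), axis=0).shape)  # (4, 50, 20)
print(np.vstack((a, b)).shape)               # (4, 50, 20)
# np.stack adds a new axis and needs identical shapes, so it does not fit here.
print(np.stack((a, a)).shape)                # (2, 1, 50, 20)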
I have 3 numpy arrays which store image data, each of shape (4, 100, 100).
arr1= np.load(r'C:\Users\x\Desktop\py\output\a1.npy')
arr2= np.load(r'C:\Users\x\Desktop\py\output\a2.npy')
arr3= np.load(r'C:\Users\x\Desktop\py\output\a3.npy')
I want to merge all 3 arrays into 1 array.
I have tried in this way:
merg_arr = np.zeros((len(arr1)+len(arr2)+len(arr3), 4, 100, 100), dtype=arr1.dtype)
Now this makes an array of the required length, but I don't know how to copy all the data into it. Maybe using a loop?
This will do the trick:
merge_arr = np.concatenate([arr1, arr2, arr3], axis=0)
np.concatenate joins arrays along an existing axis; their dimensions (except along that axis) need to match. np.stack, by contrast, arranges arrays along a new dimension and requires all dimensions to match.
Demo:
arr1 = np.empty((60, 4, 10, 10))
arr2 = np.empty((14, 4, 10, 10))
arr3 = np.empty((6, 4, 10, 10))
merge_arr = np.concatenate([arr1, arr2, arr3], axis=0)
print(merge_arr.shape) # (80, 4, 10, 10)
I am working with Keras and the provided MNIST data set. I believe the dataset is a numpy array. I have reshaped it as follows:
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28)
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28)
This gives a (60000, 1, 28, 28) numpy array, which can be read as 60000 images of 28 x 28 pixels. I want to extract every single 28 x 28 image and apply some sort of function f to it. I have tried the following:
f = lambda a, _: print a.shape
np.apply_over_axes(f, data, [2,3])
But I am unsure exactly how the second parameter (the list of axes) comes into play here...
I have also tried:
f = lambda a: print a.shape
np.apply_along_axis(f, 0, data)
But the shape is always (60000,) instead of what I would expect (1, 28, 28). How do I get each subimage?
There is no performance gain from using np.apply_along_axis, np.vectorize, etc. Just use a loop:
import numpy as np
s = (4,1,28,28)
a = np.zeros(s)
for img in a[:,0]:
    print(img.shape)
# (28, 28)
# (28, 28)
# (28, 28)
# (28, 28)
This lambda doesn't make sense:
lambda a, _: print a.shape
it's equivalent to
def foo(a, x):
    return print a.shape
print a.shape prints something and returns nothing; as a statement inside a lambda or a return it is in fact a syntax error.
lambda a,x: a.shape is better, returning the shape of a, and ignoring the x argument.
If the size 1 dimension is in the way, why not just omit it?
X_train = X_train.reshape(X_train.shape[0], 28, 28)
or remove it
X_train[:,0,...]
np.squeeze(X_train)
But what's the point of the apply_over_axes call? Just to find the shape of a set of submatrices?
In [304]: X = np.ones((6,1,2,3))
In [305]: [x.shape for x in X]
Out[305]: [(1, 2, 3), (1, 2, 3), (1, 2, 3), (1, 2, 3), (1, 2, 3), (1, 2, 3)]
or
[x.shape for x in X[:,0]]
to remove the 2nd dimension, getting just the shape of the last 2.
This apply_along_axis iterates over the last 3 dimensions, passing a 1d array to the lambda. So in effect it is returning X[:,0,i,j].shape.
In [308]: np.apply_along_axis(lambda a: a.shape, 0, X)
Out[308]:
array([[[[6, 6, 6],
[6, 6, 6]]]])
Generally iterations like this aren't needed, and when they are used, they are slow compared to 'full-array' operations.
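To illustrate that last point (a sketch; taking the per-image mean as the function f is an assumption, since the question never says what f does), the whole-array reduction gives the same result as the per-image loop without any Python-level iteration:
import numpy as np
X = np.random.random((1000, 1, 28, 28))
# Per-image statistic via a Python loop over the sub-images...
loop_means = np.array([img.mean() for img in X[:, 0]])
# ...versus a single vectorized reduction over the trailing axes.
vec_means = X.mean(axis=(1, 2, 3))
print(np.allclose(loop_means, vec_means))  # True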
I'm working on a CNN and am stuck on a tensor operation.
I want to be able to iterate through a list, or tuple, of dimensions and choose a range of elements of X (a multi-dimensional array) along each of those dimensions, while leaving the other dimensions alone.
x = np.random.random((10,3,32,32)) #some multi dimensional array
dims = [2,3] #aka the 32s
#for a dimension in dims
#I want the array of numbers from i:i+window in that dimension
#something like
arr1 = x.index(i:i+3, axis=dims[0])
#returns shape 10,3,3,32
arr2 = arr1.index(i:i+3, axis=dims[1])
#returns shape 10,3,3,3
np.take should work for you (read its docs)
In [237]: x=np.ones((10,3,32,32),int)
In [238]: dims=[2,3]
In [239]: arr1=x.take(range(1,1+3), axis=dims[0])
In [240]: arr1.shape
Out[240]: (10, 3, 3, 32)
In [241]: arr2=x.take(range(1,1+3), axis=dims[1])
In [242]: arr2.shape
Out[242]: (10, 3, 32, 3)
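Note that the second take in that session is applied to x rather than arr1; chaining them (an addition to the transcript above) gives the (10, 3, 3, 3) shape the question describes:
arr3 = arr1.take(range(1, 1+3), axis=dims[1])
print(arr3.shape)  # (10, 3, 3, 3)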
You can try slicing with
arr1 = x[:,:,i:i+3,:]
and
arr2 = arr1[:,:,:,i:i+3]
Shape is then
>>> x[:,:,i:i+3,:].shape
(10, 3, 3, 32)
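If the axes really do arrive as a runtime list, as the question suggests, one option (a sketch, not taken from either answer) is to build the index tuple programmatically from slice objects:
import numpy as np
x = np.random.random((10, 3, 32, 32))
dims = [2, 3]
i, window = 1, 3
# slice(None) leaves an axis untouched; the chosen axes get the i:i+window range.
idx = [slice(None)] * x.ndim
for d in dims:
    idx[d] = slice(i, i + window)
out = x[tuple(idx)]
print(out.shape)  # (10, 3, 3, 3)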
Suppose I create a 2-dimensional array:
m = np.random.normal(0, 1, size=(1000, 2))
q = np.zeros(shape=(1000, 1))
print(m[:, 0] - q)
When I take m[:,0].shape I get (1000,) as opposed to (1000,1) which is what I want. How do I coerce m[:,0] to a (1000,1) array?
By selecting the 0th column in particular, as you've noticed, you reduce the dimensionality:
>>> m = np.random.normal(0, 1, size=(5, 2))
>>> m[:,0].shape
(5,)
You have a lot of options to get a 5x1 object back out. You can index using a list, rather than an integer:
>>> m[:, [0]].shape
(5, 1)
You can ask for "all the columns up to but not including 1":
>>> m[:,:1].shape
(5, 1)
Or you can use None (or np.newaxis), which is a general trick to extend the dimensions:
>>> m[:,0,None].shape
(5, 1)
>>> m[:,0][:,None].shape
(5, 1)
>>> m[:,0, None, None].shape
(5, 1, 1)
Finally, you can reshape:
>>> m[:,0].reshape(5,1).shape
(5, 1)
but I'd use one of the other methods for a case like this.
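A follow-up worth spelling out (this part is an addition, not from the original answer): the shape matters in the question's m[:, 0] - q expression because of broadcasting. A (1000,) array minus a (1000, 1) array broadcasts to (1000, 1000), whereas a true column vector subtracts elementwise:
import numpy as np
m = np.random.normal(0, 1, size=(1000, 2))
q = np.zeros(shape=(1000, 1))
print((m[:, 0] - q).shape)    # (1000, 1000) -- broadcasting, probably not intended
print((m[:, [0]] - q).shape)  # (1000, 1)    -- elementwise column subtraction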