How to reshape numpy array of numpy arrays? [duplicate] - python

I'm trying to turn a list of 2d numpy arrays into a 2d numpy array. For example,
import numpy as np
dat_list = []
for i in range(10):
    dat_list.append(np.zeros([5, 10]))
What I would like to get out of this list is an array that is (50, 10). However, when I try the following, I get a (10,5,10) array.
output = np.array(dat_list)
Thoughts?

You want to stack them:
np.vstack(dat_list)
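A minimal sketch comparing the two calls on the list from the question. The variable names other than dat_list are made up for illustration, and np.concatenate is shown only as an equivalent alternative for this 2D case.

```python
import numpy as np

dat_list = [np.zeros((5, 10)) for _ in range(10)]

stacked_new_axis = np.array(dat_list)              # adds a new leading axis
stacked_rows = np.vstack(dat_list)                 # joins along the existing first axis
same_as_vstack = np.concatenate(dat_list, axis=0)  # same result for 2D inputs

print(stacked_new_axis.shape)  # (10, 5, 10)
print(stacked_rows.shape)      # (50, 10)
print(same_as_vstack.shape)    # (50, 10)
```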

The accepted answer above is correct for 2D arrays, as you requested. For 3D input arrays, though, vstack() concatenates along the existing first axis rather than adding a new one, which may be surprising. To get a new leading axis instead, use stack(<list of 3D arrays>, 0).
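To make the 3D case concrete, here is a small sketch. The list name blocks and the shape (2, 3, 4) are hypothetical, chosen only so the two results are easy to tell apart.

```python
import numpy as np

blocks = [np.zeros((2, 3, 4)) for _ in range(5)]  # hypothetical 3D inputs

v = np.vstack(blocks)          # concatenates along the first axis
s = np.stack(blocks, axis=0)   # inserts a new first axis

print(v.shape)  # (10, 3, 4)
print(s.shape)  # (5, 2, 3, 4)
```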

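See https://docs.scipy.org/doc/numpy/reference/generated/numpy.append.html
for details. You can use np.append, but you will want to specify the axis along which to append. Note that the list method dat_list.append does not accept an axis argument; np.append works on an existing array (here called output, with 10 columns) and returns a new one.
output = np.append(output, np.zeros([5, 10]), axis=0)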
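A Python list's append method takes no axis argument, whereas np.append does. The sketch below assumes you start from an empty array with zero rows and 10 columns (an assumption, not something stated in the question) and grow it in a loop.

```python
import numpy as np

output = np.empty((0, 10))   # assumed starting point: zero rows, 10 columns
for _ in range(10):
    # np.append returns a new array each time; it does not modify in place
    output = np.append(output, np.zeros((5, 10)), axis=0)

print(output.shape)  # (50, 10)
```

Because every call copies the whole array, collecting the pieces in a list and calling np.vstack once is usually the cheaper choice.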

Related

python:numpy array slicing of 1D array

I initialise an array as a=numpy.array([1,2,3]).
Running the statement print(a[0,:]) raises an error. Does this slicing method only work for 2d arrays?
Just replace "a[0,:]" with "a[0:]".
import numpy as np
a = np.array([1, 2, 3])
print(a[0:])
You could solve this issue with
a = a[np.newaxis, :]
before printing, which turns it into a 1 x 3 array instead of one with shape (3,). Obviously this only makes sense if you need your print statement to work for other multidimensional arrays as well and want it to behave in a generalized way.
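Both suggestions can be checked side by side. This is only a sketch; the name row is made up for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])

try:
    a[0, :]                 # two indices on a 1D array
except IndexError as exc:
    print("IndexError:", exc)

print(a[0:])                # [1 2 3]

row = a[np.newaxis, :]      # shape (1, 3)
print(row[0, :])            # [1 2 3]
```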

Transpose a 1-dimensional array in Numpy without casting to matrix

My goal is to turn a row vector into a column vector and vice versa. The documentation for numpy.ndarray.transpose says:
For a 1-D array, this has no effect. (To change between column and row vectors, first cast the 1-D array into a matrix object.)
However, when I try this:
my_array = np.array([1,2,3])
my_array_T = np.transpose(np.matrix(my_array))
I do get the wanted result, albeit in matrix form (matrix([[66],[640],[44]])), but I also get this warning:
PendingDeprecationWarning: the matrix subclass is not the recommended way to represent matrices or deal with linear algebra (see https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html). Please adjust your code to use regular ndarray.
my_array_T = np.transpose(np.matrix(my_array))
How can I properly transpose an ndarray then?
Transposing a 1D array returns it unchanged, unlike in Matlab, where 1D arrays do not exist and every array is at least 2D.
What you want is to reshape it:
my_array.reshape(-1, 1)
Or:
my_array.reshape(1, -1)
Depending on what kind of vector you want (column or row vector).
The -1 tells reshape to infer that dimension's length from the number of elements, and the 1 creates the required second dimension.
If your array is my_array and you want to convert it to a column vector you can do:
my_array.reshape(-1, 1)
For a row vector you can use
my_array.reshape(1, -1)
Both of these can also be transposed and that would work as expected.
IIUC, use reshape
my_array.reshape(my_array.size, -1)
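The reshape variants from the answers above can be compared in one short sketch. The names column and row are made up for illustration, and the np.newaxis form is an equivalent spelling that does not appear in the answers.

```python
import numpy as np

my_array = np.array([1, 2, 3])

column = my_array.reshape(-1, 1)   # shape (3, 1)
row = my_array.reshape(1, -1)      # shape (1, 3)

print(my_array.T.shape)                    # (3,)  transposing a 1D array changes nothing
print(column.shape, row.shape)             # (3, 1) (1, 3)
print(np.array_equal(column.T, row))       # True
print(my_array[:, np.newaxis].shape)       # (3, 1)
print(my_array.reshape(my_array.size, -1).shape)  # (3, 1)
```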

Reshaping numpy (n, )

I am working with multiple numpy objects that are numpy lists where each list element contains a 2d array. Strangely, the shape attribute does not reflect this and returns only the overall number of samples.
x_train.shape, x_test.shape, x_test.iloc[0].shape
#((22507,), (5627,), (25, 100))
This code snippet accomplishes the task but I am wondering if there is a better/numpy way to accomplish this.
x = []
[x.append(item) for item in x_train]
np.array(x).shape
# (22507, 25, 100)
I have searched through stack overflow and although there are many reshape questions I have not seen one that can solve this problem efficiently.
A simple and efficient way to convert your 1D dtype=object array of 2D arrays into a single 3D array is:
np.stack(x_train)
But it would be more efficient to load the original data into a 3D array in the first place.
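A small sketch of that conversion. The x_train built here is a hypothetical stand-in with only 4 samples, assuming the real data is a 1D object array whose elements are equally shaped 2D arrays.

```python
import numpy as np

# Hypothetical stand-in for x_train: a 1D object array holding 2D arrays.
x_train = np.empty(4, dtype=object)
for i in range(4):
    x_train[i] = np.full((25, 100), float(i))

stacked = np.stack(x_train)

print(x_train.shape)   # (4,)
print(stacked.shape)   # (4, 25, 100)
print(stacked.dtype)   # float64
```

np.stack requires every element to have the same shape; it raises an error otherwise.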

Applying a multi-dimensional function over multi-dimensional array (Python, Numpy)

I have a question how to efficiently apply a function which takes an m-dimensional slice of a n-dimensional array as an input.
For example, I have a n-dimensional array of shape (i,j,k,l). And on the dimensions (j,l), I want to apply the function, which gives me back a matrix of shape (j,l). The resulting numpy array should again have the shape (i,j,k,l).
For example I want to apply the following, normalisation function
def norm(arr2d):
    return arr2d - np.mean(arr2d)
over the array
arrnd = np.arange(2*3*4*5).reshape(2,3,4,5) # Shape is (2,3,4,5)
on the slice (j,l).
The result I want to achieve I would get via a (slow?) Python list comprehension and moving axes.
result = np.asarray([ [ norm(arrnd[:,j,:,l]) for l in range(5) ] for j in range(3)]) # Shape is (3,5,2,4)
result = np.moveaxis(np.moveaxis(result,2,0),2,3) # Shape is (2,3,4,5) again
Is there any better, more "numpyic" way to achieve this, without any involved loops?
I already looked at np.apply_along_axis() and np.apply_over_axes(), but the former only works for 1-d functions, and the latter might only work if my function is implemented as a ufunc.
The example I provided is just a toy example. The solution should work for any python function.
(If normalising a slice were my specific problem, I could have circumvented the python loop and moveaxis by using the ufunc's axes=(..).)
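For the toy normalisation only, the loop-free shortcut mentioned in the last remark can be sketched as follows. It assumes the slices are the ones the loop actually takes, arrnd[:, j, :, l], so the mean runs over axes 0 and 2. It does not address the general case of an arbitrary Python function.

```python
import numpy as np

def norm(arr2d):
    return arr2d - np.mean(arr2d)

arrnd = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# Loop version from the question: one 2D slice per (j, l) pair.
looped = np.asarray([[norm(arrnd[:, j, :, l]) for l in range(5)] for j in range(3)])
looped = np.moveaxis(np.moveaxis(looped, 2, 0), 2, 3)   # back to (2, 3, 4, 5)

# Vectorized version: mean over axes 0 and 2, kept as length-1 axes
# so that the subtraction broadcasts against the original array.
vectorized = arrnd - arrnd.mean(axis=(0, 2), keepdims=True)

print(looped.shape)                      # (2, 3, 4, 5)
print(np.allclose(looped, vectorized))   # True
```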

Multiple Element Indexing in multi-dimensional array

I have a 3d Numpy array and would like to take the mean over one axis considering certain elements from the other two dimensions.
This is an example code depicting my problem:
import numpy as np
myarray = np.random.random((5,10,30))
yy = [1,2,3,4]
xx = [20,21,22,23,24,25,26,27,28,29]
mymean = [ np.mean(myarray[t,yy,xx]) for t in np.arange(5) ]
However, this results in:
ValueError: shape mismatch: objects cannot be broadcast to a single shape
Why does an indexing like e.g. myarray[:,[1,2,3,4],[1,2,3,4]] work, but not my code above?
This is how you fancy-index over more than one dimension:
>>> np.mean(myarray[np.arange(5)[:, None, None], np.array(yy)[:, None], xx],
...         axis=(-1, -2))
array([ 0.49482768, 0.53013301, 0.4485054 , 0.49516017, 0.47034123])
When you use fancy indexing, i.e. a list or array as an index, over more than one dimension, numpy broadcasts those arrays to a common shape, and uses them to index the array. You need to add those extra dimensions of length 1 at the end of the first indexing arrays, for the broadcast to work properly. Here are the rules of the game.
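The broadcasting described above can be checked on a small example. In this sketch the seeded generator and the names sub, rows and cols are made up for illustration, and np.ix_ is shown as an alternative way to build the same index arrays, not as part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)
myarray = rng.random((5, 10, 30))
yy = [1, 2, 3, 4]
xx = [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]

# Explicit broadcasting: index shapes (4, 1) and (10,) broadcast to (4, 10).
sub = myarray[:, np.array(yy)[:, None], xx]
print(sub.shape)                      # (5, 4, 10)

# np.ix_ builds index arrays of shape (4, 1) and (1, 10) for the same selection.
rows, cols = np.ix_(yy, xx)
sub_ix = myarray[:, rows, cols]
print(np.array_equal(sub, sub_ix))    # True

mymean = sub.mean(axis=(1, 2))        # one mean per index along the first axis
print(mymean.shape)                   # (5,)
```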
Since you use consecutive elements you can use a slice:
import numpy as np
myarray = np.random.random((5,10,30))
yy = slice(1,5)
xx = slice(20, 30)
mymean = [np.mean(myarray[t, yy, xx]) for t in np.arange(5)]
To answer your question about why it doesn't work: when you use lists/arrays as indices, Numpy uses a different set of indexing semantics than it does if you use slices. You can see the full story in the documentation and, as that page says, it "can be somewhat mind-boggling".
If you want to do it for nonconsecutive elements, you must grok that complex indexing mechanism.
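The difference between the two indexing modes shows up in the result shapes. The sketch below uses np.arange data and made-up names (paired, block, failed) for illustration. The exception type for mismatched index lists has varied between NumPy versions, so both are caught.

```python
import numpy as np

myarray = np.arange(5 * 10 * 30).reshape(5, 10, 30)

# Equal-length lists are paired element by element: (1, 1), (2, 2), ...
paired = myarray[:, [1, 2, 3, 4], [1, 2, 3, 4]]
print(paired.shape)          # (5, 4)

# Lists of length 4 and 10 cannot be broadcast together, so indexing fails.
try:
    myarray[0, [1, 2, 3, 4], list(range(20, 30))]
    failed = False
except (IndexError, ValueError):
    failed = True
print(failed)                # True

# Slices select the full 4 x 10 block instead.
block = myarray[:, 1:5, 20:30]
print(block.shape)           # (5, 4, 10)
```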
