Instead of an n-dimensional array, let's take a 3D array to illustrate my question:
>>> import numpy as np
>>> arr = np.ones(24).reshape(2, 3, 4)
So I have an array of shape (2, 3, 4). I would like to concatenate/fuse the 2nd and 3rd axes together to get an array of shape (2, 12).
I wrongly thought I could do it easily with np.concatenate:
>>> np.concatenate(arr, axis=1).shape
(3, 8)
I found a way to do it by a combination of np.rollaxis and np.concatenate but it is increasingly ugly as the array goes up in dimension:
>>> np.rollaxis(np.concatenate(np.rollaxis(arr, 0, 3), axis=0), 0, 2).shape
(2, 12)
Is there any simple way to accomplish this? It seems very trivial, so there must exist some function, but I cannot seem to find it.
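EDIT: Indeed I could use np.reshape, but that means computing the dimensions of the axes first. Is it possible without accessing/computing the shape beforehand?
On recent Python versions (3.5+, which allow several * unpackings in one call) you can do the following, where k is the index of the first of the two axes to fuse: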
anew = a.reshape(*a.shape[:k], -1, *a.shape[k+2:])
I recommend against directly assigning to .shape since it doesn't work on sufficiently noncontiguous arrays.
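To make the caveat about non-contiguous arrays concrete, here is a small sketch. The array names are made up for the example, and the AttributeError is what I have seen NumPy raise in this situation, so treat the exact exception type as an assumption for your version.

```python
import numpy as np

base = np.arange(24).reshape(2, 3, 4)
t = base.transpose(0, 2, 1)   # non-contiguous view, shape (2, 4, 3)

# Fusing the last two axes of t cannot be expressed with strides alone,
# so in-place shape assignment is refused (assumed: AttributeError).
try:
    t.shape = (2, -1)
    assign_ok = True
except AttributeError:
    assign_ok = False

# reshape falls back to a copy when a view is impossible, so it works.
fused = t.reshape(2, -1)
print(assign_ok, fused.shape)
```

The point is only that reshape is the safer default; it returns a view when it can and a copy when it must.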
Let's say that you have n dimensions in your array and that you want to fuse adjacent axes i and i+1:
shape = a.shape
new_shape = list(shape[:i]) + [-1] + list(shape[i+2:])
a.shape = new_shape
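The same idea can be wrapped in a small function that uses reshape instead of assigning to .shape. This is only a sketch: fuse_axes is a hypothetical helper name, not something NumPy provides.

```python
import numpy as np

def fuse_axes(a, i):
    """Merge axis i and axis i+1 of `a` into a single axis (hypothetical helper)."""
    # -1 lets NumPy infer the length of the fused axis
    new_shape = a.shape[:i] + (-1,) + a.shape[i + 2:]
    return a.reshape(new_shape)

arr = np.arange(24).reshape(2, 3, 4)
fused_last = fuse_axes(arr, 1)
fused_first = fuse_axes(arr, 0)
print(fused_last.shape)   # (2, 12)
print(fused_first.shape)  # (6, 4)
```

Because the shape tuple is sliced rather than indexed, the function never needs to know how many dimensions the array has.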
I'm using numpy and want to index a row without losing the dimension information.
import numpy as np
X = np.zeros((100,10))
X.shape # >> (100, 10)
xslice = X[10,:]
xslice.shape # >> (10,)
In this example xslice is now 1 dimension, but I want it to be (1,10).
In R, I would use X[10,:,drop=F]. Is there something similar in numpy? I couldn't find it in the documentation and didn't see a similar question asked.
Thanks!
Another solution is to do
X[[10],:]
or
I = np.array([10])
X[I,:]
The dimensionality of an array is preserved when indexing is performed by a list (or an array) of indexes. This is nice because it leaves you with the choice between keeping the dimension and squeezing.
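One thing worth knowing when choosing between list indexing and slicing: indexing with a list is advanced indexing and returns a copy, while a slice returns a view. A quick sketch (variable names are just for illustration):

```python
import numpy as np

X = np.arange(1000.0).reshape(100, 10)

by_list = X[[10], :]     # advanced indexing: shape (1, 10), a copy
by_slice = X[10:11, :]   # basic slicing: shape (1, 10), a view

print(by_list.shape, by_slice.shape)
print(np.shares_memory(X, by_list), np.shares_memory(X, by_slice))

# squeeze drops the singleton axis again when it is not wanted
squeezed = by_list.squeeze(axis=0)
print(squeezed.shape)
```

So writing into by_slice changes X, while writing into by_list does not.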
It's probably easiest to do x[None, 10, :] or equivalently (but more readable) x[np.newaxis, 10, :]. None or np.newaxis increases the dimension of the array by 1, so that you're back to the original after the slicing eliminates a dimension.
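Where the None goes in the index decides where the new axis ends up. A minimal sketch, using the X from the question:

```python
import numpy as np

X = np.zeros((100, 10))

front = X[np.newaxis, 10, :]    # new axis first
middle = X[10, np.newaxis, :]   # same result here, since axis 0 was consumed
back = X[10, :, np.newaxis]     # new axis last: a column

print(front.shape)   # (1, 10)
print(middle.shape)  # (1, 10)
print(back.shape)    # (10, 1)
```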
As far as why it's not the default, personally, I find that constantly having arrays with singleton dimensions gets annoying very quickly. I'd guess the numpy devs felt the same way.
Also, numpy handles broadcasting arrays very well, so there's usually little reason to retain the dimension of the array the slice came from. If you did, then things like:
a = np.zeros((100,100,10))
b = np.zeros((100,10))
a[0,:,:] = b
either wouldn't work or would be much more difficult to implement.
(Or at least that's my guess at the numpy devs' reasoning behind dropping dimension info when slicing)
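As a rough illustration of the broadcasting point, here is a sketch where a 2D block, a 1D row and a kept-dimension row are all assigned into a slice of a 3D array. The fill values are arbitrary.

```python
import numpy as np

a = np.zeros((100, 100, 10))
b = np.ones((100, 10))

a[0, :, :] = b        # shapes match exactly: (100, 10) into (100, 10)

row = b[0, :]         # shape (10,), dimension dropped
a[1, :, :] = row      # broadcast across all 100 rows

kept = b[0:1, :]      # shape (1, 10), dimension kept
a[2, :, :] = kept     # broadcasts as well

print(a[0].sum(), a[1].sum(), a[2].sum())
```

Either form of the row works on the right-hand side, which is part of why the dropped dimension is rarely a problem in practice.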
I found a few reasonable solutions.
1) use numpy.take(X,[10],0)
2) use this strange indexing X[10:11:, :]
Ideally, this should be the default. I never understood why dimensions are ever dropped. But that's a discussion for numpy...
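For reference, a quick check that both options give the same (1, 10) result. Note that X[10:11:, :] is just X[10:11, :] with an empty step. The test data is made up.

```python
import numpy as np

X = np.arange(1000).reshape(100, 10)

taken = np.take(X, [10], axis=0)   # list of indices keeps the axis
sliced = X[10:11:, :]              # empty step, same as X[10:11, :]

print(taken.shape, sliced.shape)
print(np.array_equal(taken, sliced))
```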
Here's an alternative I like better. Instead of indexing with a single number, index with a range. That is, use X[10:11,:]. (Note that 10:11 does not include 11).
import numpy as np
X = np.zeros((100,10))
X.shape # >> (100, 10)
xslice = X[10:11,:]
xslice.shape # >> (1,10)
This makes it easy to understand with more dimensions too, no None juggling and figuring out which axis to use which index. Also no need to do extra bookkeeping regarding array size, just i:i+1 for any i that you would have used in regular indexing.
b = np.ones((2, 3, 4))
b.shape # >> (2, 3, 4)
b[1:2,:,:].shape # >> (1, 3, 4)
b[:, 2:3, :].shape # >> (2, 1, 4)
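One caveat with i:i+1 that is easy to miss: it breaks for i = -1, because -1:0 is an empty range. A small sketch of the problem and two possible workarounds (the `or None` trick is just one idiom, not the only way):

```python
import numpy as np

X = np.arange(1000).reshape(100, 10)
i = -1

empty = X[i:i + 1, :]                 # -1:0 selects nothing
via_list = X[[i], :]                  # list indexing handles negatives
via_none = X[i:(i + 1) or None, :]    # an end of 0 becomes None, i.e. "to the end"

print(empty.shape)     # (0, 10)
print(via_list.shape)  # (1, 10)
print(via_none.shape)  # (1, 10)
```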
To add to the solution involving indexing by lists or arrays by gnebehay, it is also possible to use tuples:
X[(10,),:]
Dropping the dimension is especially annoying if you're indexing by an array that might be length 1 at runtime. For that case, there's np.ix_:
some_array[np.ix_(row_index,column_index)]
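A short sketch of what np.ix_ does with a length-1 row index. The concrete index values are invented for the example.

```python
import numpy as np

some_array = np.arange(1000).reshape(100, 10)
row_index = np.array([10])           # happens to have length 1
column_index = np.array([0, 3, 7])

# np.ix_ builds an open mesh, so the result is always
# (len(row_index), len(column_index)), here (1, 3)
block = some_array[np.ix_(row_index, column_index)]
print(block.shape)
print(block)
```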
I've been using np.reshape to achieve the same as shown below
import numpy as np
X = np.zeros((100,10))
X.shape # >> (100, 10)
xslice = X[10,:].reshape(1, -1)
xslice.shape # >> (1, 10)
I have a 'row' vector cast as a numpy ndarray. I would simply like to make it a 'column' vector (I don't care too much about the type as long as it is compatible with matplotlib). Here is an example of what I'm trying:
import numpy as np
a = np.ndarray(shape=(1,4), dtype=float, order='F')
print(a.shape)
a.T #I think this performs the transpose?
print(a.shape)
The output looks like this:
(1, 4)
(1, 4)
I was hoping to get:
(1, 4)
(4, 1)
Can someone point me in the right direction? I have seen that the transpose in numpy doesn't do anything to a 1D array. But is this a 1D array?
Transposing an array does not happen in place. Writing a.T creates a view of the transpose of the array a, but this view is then lost immediately since no variable is assigned to it. a remains unchanged.
You need to write a = a.T to bind the name a to the transpose:
>>> a = a.T
>>> a.shape
(4, 1)
In your example a is indeed a 2D array. Transposing a 1D array (with shape (n,)) does not change that array at all.
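To see the difference between the 1D and 2D cases side by side, here is a minimal sketch (the names v, col and row are just for illustration):

```python
import numpy as np

v = np.arange(4)            # 1D, shape (4,)
print(v.T.shape)            # (4,): transposing a 1D array changes nothing

col = v[:, np.newaxis]      # explicit column, shape (4, 1)
col2 = v.reshape(-1, 1)     # same thing via reshape

row = v[np.newaxis, :]      # 2D row, shape (1, 4)
print(row.T.shape)          # (4, 1): now the transpose does something
```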
You can alter the shape 'in place', which gives the same result as a.T for a (1, 4) array, but see the comment by Mr E on whether it's needed, i.e.
...
print(a.shape)
a.shape = (4, 1)
print(a.shape)
You probably don't want or need the singular dimension, unless you are trying to force a broadcasting operation.
You can treat rank-1 arrays as either row or column vectors. dot(A,v)
treats v as a column vector, while dot(v,A) treats v as a row vector.
This can save you having to type a lot of transposes.
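A small worked example of that behaviour, with made-up values:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
v = np.arange(3)                 # [0, 1, 2], shape (3,)
w = np.arange(2)                 # [0, 1], shape (2,)

Av = np.dot(A, v)   # v acts as a column vector: result has shape (2,)
wA = np.dot(w, A)   # w acts as a row vector: result has shape (3,)

print(Av)  # [ 5 14]
print(wA)  # [3 4 5]
```

In both cases the result is again a plain 1D array, with no transposes needed.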
Provided a 1D array as a:
a=np.arange(8)
I would like it to be reproduced in a 3D scheme in order to have such shape (n1, len(a), n3).
Is there any working way to obtain this via np.tile? It seems trivial, but trying:
np.shape( np.tile(a, (n1,1,n3)) )
or
np.shape( np.tile( np.tile(a, (n1,1)), (1,1,n3) ) )
I never obtain what I need: the resulting shapes are (n1, 1, len(a)*n3) or (1, n1, len(a)*n3).
Maybe it is me not understanding how tile works ...
What's happening is that a is being made a 1x1x8 array before the tiling is applied. You'll need to make a a 1x8x1 array and then call tile.
As the documentation for tile notes:
If A.ndim < d, A is promoted to be d-dimensional by prepending
new axes. So a shape (3,) array is promoted to (1, 3) for 2-D
replication, or shape (1, 1, 3) for 3-D replication. If this is not
the desired behavior, promote A to d-dimensions manually before
calling this function.
The easiest way to get the result you're after is to slice a with None (or equivalently, np.newaxis) to make it the correct shape.
As a quick example:
import numpy as np
a = np.arange(8)
result = np.tile(a[None, :, None], (4, 1, 5))
print(result.shape)  # (4, 8, 5)
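If a read-only result is enough, np.broadcast_to can give the same shape without copying the data. This is a swapped-in alternative to np.tile, not what the question asked for, and the sizes 4 and 5 are just the ones from the example above.

```python
import numpy as np

a = np.arange(8)
n1, n3 = 4, 5

tiled = np.tile(a[None, :, None], (n1, 1, n3))              # real copy
view = np.broadcast_to(a[None, :, None], (n1, a.size, n3))  # read-only view

print(tiled.shape, view.shape)
print(np.array_equal(tiled, view))
print(view.flags.writeable)
```

Use np.tile (or call .copy() on the view) if you need to write into the result.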
I generally use MATLAB and Octave, and I have recently been switching to Python and numpy.
In numpy when I define an array like this
>>> a = np.array([[2,3],[4,5]])
it works great and size of the array is
>>> a.shape
(2, 2)
which is the same as in MATLAB.
But when I extract the entire first column and check the size
>>> b = a[:,0]
>>> b.shape
(2,)
I get size (2,). What is this? I expected the size to be (2,1). Perhaps I misunderstood the basic concept. Can anyone clarify this for me?
A 1D numpy array* is literally 1D - it has no size in any second dimension, whereas in MATLAB, a '1D' array is actually 2D, with a size of 1 in its second dimension.
If you want your array to have size 1 in its second dimension you can use its .reshape() method:
a = np.zeros(5,)
print(a.shape)
# (5,)
# explicitly reshape to (5, 1)
print(a.reshape(5, 1).shape)
# (5, 1)
# or use -1 in the first dimension, so that its size in that dimension is
# inferred from its total length
print(a.reshape(-1, 1).shape)
# (5, 1)
Edit
As Akavall pointed out, I should also mention np.newaxis as another method for adding a new axis to an array. Although I personally find it a bit less intuitive, one advantage of np.newaxis over .reshape() is that it allows you to add multiple new axes in an arbitrary order without explicitly specifying the shape of the output array, which is not possible with the .reshape(-1, ...) trick:
a = np.zeros((3, 4, 5))
print(a[np.newaxis, :, np.newaxis, ..., np.newaxis].shape)
# (1, 3, 1, 4, 5, 1)
np.newaxis is just an alias of None, so you could do the same thing a bit more compactly using a[None, :, None, ..., None].
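Applied to the array from the question, here are a few ways to end up with a (2, 1) column. This is only a sketch of options that I know give the same result, not an exhaustive list.

```python
import numpy as np

a = np.array([[2, 3], [4, 5]])

col_newaxis = a[:, 0][:, np.newaxis]          # add the axis back after slicing
col_list = a[:, [0]]                          # index with a list to keep the axis
col_slice = a[:, 0:1]                         # slice instead of a single integer
col_expand = np.expand_dims(a[:, 0], axis=1)  # explicit function call

for col in (col_newaxis, col_list, col_slice, col_expand):
    print(col.shape)   # (2, 1) each time
```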
* An np.matrix, on the other hand, is always 2D, and will give you the indexing behavior you are familiar with from MATLAB:
a = np.matrix([[2, 3], [4, 5]])
print(a[:, 0].shape)
# (2, 1)
For more info on the differences between arrays and matrices, see here.
Typing help(np.shape) gives some insight into what is going on here. For starters, you can get a 2D result by wrapping the slice in a list:
b = np.array([a[:,0]])  # shape (1, 2); append .T to get (2, 1)
Basically numpy defines things a little differently than MATLAB. In the numpy environment, a vector only has one dimension, and an array is a vector of vectors, so it can have more. In your first example, your array is a vector of two vectors, i.e.:
a = np.array([[vec1], [vec2]])
So a has two dimensions, and in your example the number of elements in both dimensions is the same, 2. Your array is therefore 2 by 2. When you take a slice out of this, you are reducing the number of dimensions that you have by one. In other words, you are taking a vector out of your array, and that vector only has one dimension, which also has 2 elements, but that's it. Your vector is now 2 by _. There is nothing in the second spot because the vector is not defined there.
You could think of it in terms of spaces too. Your first array is in the space R^(2x2) and your second vector is in the space R^(2). This means that the array is defined on a different (and bigger) space than the vector.
That was a lot to basically say that you took a slice out of your array, and unlike MATLAB, numpy does not represent vectors (1 dimensional) in the same way as it does arrays (2 or more dimensions).
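The difference shows up directly in the ndim attribute. A minimal sketch with the array from the question (np.atleast_2d is included only as one way to get back to two dimensions):

```python
import numpy as np

a = np.array([[2, 3], [4, 5]])
b = a[:, 0]

print(a.ndim, a.shape)   # 2 (2, 2)
print(b.ndim, b.shape)   # 1 (2,)

# atleast_2d prepends an axis, so the result is a row, not a column
b2 = np.atleast_2d(b)
print(b2.shape)          # (1, 2)
print(b2.T.shape)        # (2, 1)
```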