Given a NumPy array:
array = np.random.randn(1,1,2)
print(array.shape)
# (1, 1, 2)
calling tuple() on it eats the first dimension:
tupled_array = tuple(array)
print(tupled_array[0].shape)
# (1, 2) <- why?
I am curious why this happens.
If we wrap the NumPy array with a list:
tupled_list_array = tuple([array])
print(tupled_list_array[0].shape)
# (1, 1, 2)
tuple() iterates over the array's first axis. np.random.randn(1,1,2) is a 1x1x2 array, so tuple() turns it into a one-element tuple: (1x2 array,)
On the other hand, if you use np.random.randn(2,1,1), tuple() turns it into a two-element tuple: (1x1 array, 1x1 array)
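A minimal sketch of the behaviour described above, assuming only that tuple() consumes the array through ordinary iteration. The names as_tuple, by_index and other are illustrative and not from the original post:

```python
import numpy as np

array = np.random.randn(1, 1, 2)

# tuple() iterates over the array, and iterating a NumPy array walks its first axis.
as_tuple = tuple(array)
# Equivalent spelled-out form: index the first axis one position at a time.
by_index = tuple(array[i] for i in range(array.shape[0]))

print(len(as_tuple))       # 1
print(as_tuple[0].shape)   # (1, 2)

# With a first axis of length 2, the tuple has two elements instead.
other = np.random.randn(2, 1, 1)
print(len(tuple(other)))        # 2
print(tuple(other)[0].shape)    # (1, 1)
```

Wrapping the array in a list first, as in the question, makes the list the thing being iterated, so the array survives whole as the single tuple element.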
I have 2 arrays. Call them 'A' and 'B'.
The shape of array 'A' is (10,10,3), and the shape of array 'B' is (10,10).
Now, I have acquired the coordinates of certain elements of array 'B' as a list of tuples. Let's call the list 'TupleList'.
I want to now make the values of all elements in array 'A' equal to 0, except for those elements present at the coordinates in 'TupleList'.
Imagine the arrays are image arrays, where A, being an RGB image, has the 3rd dimension.
How can I do so?
I am having trouble doing this because of the extra 3rd dimension that array 'A' has. Otherwise, it is quite straightforward with the use of np.where(), given that I know the limited number of values that array 'B' can take.
Here is a solution that transposes the list of index tuples into one sequence of indices for the first dimension and another for the second dimension:
import numpy as np
nrows, ncols = 5, 5
arr = np.arange(nrows * ncols * 3, dtype=float).reshape(nrows, ncols, 3)
idxs = [(0, 1), (1, 3), (3, 2), (4, 2)]
idxs_dim0, idxs_dim1 = zip(*idxs)
res = np.zeros(arr.shape)
res[idxs_dim0, idxs_dim1] = arr[idxs_dim0, idxs_dim1]
Checking the result:
from itertools import product
for idx_dim0, idx_dim1 in product(range(nrows), range(ncols)):
    value = res[idx_dim0, idx_dim1]
    if (idx_dim0, idx_dim1) in idxs:
        # listed coordinates keep their original RGB values
        assert np.all(value == arr[idx_dim0, idx_dim1])
    else:
        # every other pixel is zeroed out
        assert np.all(value == (0, 0, 0))
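For comparison, here is an alternative sketch (not the method used in the answer above): build a 2D boolean mask from the coordinates and let np.where broadcast it across the colour axis. The name keep is an assumption introduced for this example:

```python
import numpy as np

nrows, ncols = 5, 5
arr = np.arange(nrows * ncols * 3, dtype=float).reshape(nrows, ncols, 3)
idxs = [(0, 1), (1, 3), (3, 2), (4, 2)]

# Mark the (row, col) positions to keep in a 2D boolean mask.
rows, cols = zip(*idxs)
keep = np.zeros((nrows, ncols), dtype=bool)
keep[list(rows), list(cols)] = True

# keep[:, :, None] has shape (5, 5, 1), so it broadcasts over the 3 channels.
res = np.where(keep[:, :, None], arr, 0.0)

print(res.shape)   # (5, 5, 3)
print(res[0, 1])   # [3. 4. 5.]
print(res[0, 0])   # [0. 0. 0.]
```

The mask form can be handy when the coordinates come from a condition on the 2D array 'B' in the first place, since a comparison on 'B' already yields a mask of the right shape.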
I want to create a 3D np.array named output of varying size. An array of size (5,a,b); with a and b varying (b decreasing):
(a,b) = (1000,20)
(a,b) = (1000,19)
(a,b) = (1000,18)
(a,b) = (1000,17)
(a,b) = (1000,16)
I could create an array of arrays to do so, but later on I want to get the first column of all the arrays (without a loop), and then I cannot use:
output[:,:,0]
Concatenating them won't work either, since it requires the arrays to have the same size...
Any alternatives to be able to have a varying single array instead of an array of arrays?
Thanks!
Like @Divakar said, create an empty array with dtype object and assign the different sized arrays to their respective indices.
import numpy as np
arrs = [np.ones((5, i, 10 - i)) for i in range(10)]
arrs[0].shape
(5, 0, 10)
arrs[1].shape
(5, 1, 9)
out = np.empty(len(arrs), dtype=object)
out[:] = arrs
out[0].shape
(5, 0, 10)
out[1].shape
(5, 1, 9)
Maybe you could make a list and add these 5 arrays to it.
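A rough sketch of how the first column of every sub-array could be collected from such a container, using the shapes from the question. The fill values and the names shapes, arrs and first_cols are assumptions made for illustration. Note that the comprehension is still a Python-level loop, but it only runs once per sub-array (5 times here), not once per element:

```python
import numpy as np

# Five arrays of shape (a, b) with a fixed and b decreasing, as in the question.
shapes = [(1000, b) for b in (20, 19, 18, 17, 16)]
arrs = [np.full(shape, float(k)) for k, shape in enumerate(shapes)]

# Object array holding the differently sized arrays, filled element by element.
out = np.empty(len(arrs), dtype=object)
for k, a in enumerate(arrs):
    out[k] = a

# The first columns all have length 1000, so they stack into a regular 2D array.
first_cols = np.stack([a[:, 0] for a in out])

print(out[1].shape)       # (1000, 19)
print(first_cols.shape)   # (5, 1000)
```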
Instead of an n-dimensional array, let's take a 3D array to illustrate my question:
>>> import numpy as np
>>> arr = np.ones(24).reshape(2, 3, 4)
So I have an array of shape (2, 3, 4). I would like to concatenate/fuse the 2nd and 3rd axis together to get an array of the shape (2, 12).
I wrongly thought I could do it easily with np.concatenate:
>>> np.concatenate(arr, axis=1).shape
(3, 8)
I found a way to do it by a combination of np.rollaxis and np.concatenate but it is increasingly ugly as the array goes up in dimension:
>>> np.rollaxis(np.concatenate(np.rollaxis(arr, 0, 3), axis=0), 0, 2).shape
(2, 12)
Is there any simple way to accomplish this? It seems very trivial, so there must exist some function, but I cannot seem to find it.
EDIT : Indeed I could use np.reshape, which means to compute the dimensions of the axis first. Is it possible without accessing/computing the shape beforehand?
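On recent Python versions (3.5+, which allow several unpackings in one call) you can do the following, where k is the index of the first of the two axes to fuse: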
anew = a.reshape(*a.shape[:k], -1, *a.shape[k+2:])
I recommend against directly assigning to .shape since it doesn't work on sufficiently noncontiguous arrays.
Let's say that you have n dimensions in your array and that you want to fuse the adjacent axes i and i+1:
shape = a.shape
new_shape = list(shape[:i]) + [-1] + list(shape[i+2:])
a.shape = new_shape
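The two answers above can be combined into a small helper, sketched here under the assumption that a reshaped result is acceptable (it is a view when the memory layout allows it, and a copy otherwise). The function name fuse_axes is hypothetical and not a NumPy function:

```python
import numpy as np

def fuse_axes(a, i):
    """Return `a` with axes i and i+1 merged into a single axis (hypothetical helper)."""
    if not 0 <= i < a.ndim - 1:
        raise ValueError(f"cannot fuse axes {i} and {i + 1} of a {a.ndim}-d array")
    # -1 lets reshape infer the fused length from the remaining dimensions.
    new_shape = a.shape[:i] + (-1,) + a.shape[i + 2:]
    return a.reshape(new_shape)

arr = np.arange(24).reshape(2, 3, 4)

print(fuse_axes(arr, 1).shape)   # (2, 12)
print(fuse_axes(arr, 0).shape)   # (6, 4)
```

Using reshape instead of assigning to .shape avoids the failure on non-contiguous arrays mentioned above, at the cost of a possible copy.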
I have a simple array like this:
x = np.array([1,2,3,4])
In [3]: x.shape
Out[3]: (4,)
But I don't want shape to return (4,), but (4,1). How can I achieve this?
Generally in NumPy you would declare a matrix or vector using two square brackets. It's a common misconception to use single square brackets for a single dimensional matrix or vector.
Here is an example:
a = np.array([[1,2,3,4], [5,6,7,8]])
a.shape # (2,4) -> Multi-Dimensional Matrix
Similarly, if I want a matrix with a single row or a single column, I just remove the extra data, not the outer square brackets.
a = np.array([[1,2,3,4]])
a.shape # (1,4) -> Row Matrix
b = np.array([[1], [2], [3], [4]])
b.shape # (4, 1) -> Column Matrix
When you use single square brackets, you get a one-dimensional array of shape (n,), which is neither a row matrix nor a column matrix.
To get a row or column matrix, always enclose your data within another pair of square brackets, just as you would when entering the data for a multi-dimensional matrix, but without the data for the extra dimension.
You could also always reshape:
x = np.array([1,2,3,4])
x = x.reshape(4,1)
x.shape # (4,1)
One Line:
x = np.array([1,2,3,4]).reshape(4,1)
x.shape # (4,1)
If you want a column vector use
x2 = x[:, np.newaxis]
x2.shape # (4, 1)
Alternatively, you could reshape the array yourself:
arr1 = np.array([1,2,3,4])
print(arr1.shape)
# (4,)
arr2 = arr1.reshape((4,1))
print(arr2.shape)
# (4, 1)
You could of course reshape the array when you create it:
arr1 = np.array([1,2,3,4]).reshape((4,1))
If you want to change the array in place, as suggested by @FHTMitchell in the comments:
arr1.resize((4, 1))
Below achieves what you want. However, I strongly suggest you look at why exactly you need shape to return (4, 1). Most matrix-type operations are possible without this explicit casting.
x = np.array([1,2,3,4])
y = np.matrix(x)
z = y.T
x.shape # (4,)
y.shape # (1, 4)
z.shape # (4, 1)
You can use zip to transpose at python (non-numpy) level:
>>> a = [1, 2, 3, 4]
>>>
>>> *zip(a),
((1,), (2,), (3,), (4,))
>>>
>>> import numpy as np
>>> np.array([*zip(a)])
array([[1],
       [2],
       [3],
       [4]])
Please note that while this is convenient in terms of keystrokes, it is a bit wasteful, given that a tuple object has to be constructed for every list element, whereas reshaping an array comes essentially for free. So do not use this on long lists.
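As a side-by-side sketch of the array-level options from this thread, the following compares three ways of getting shape (4, 1) without hard-coding the length. The variable names are illustrative, and np.expand_dims is an addition not mentioned in the answers:

```python
import numpy as np

x = np.array([1, 2, 3, 4])

col_reshape = x.reshape(-1, 1)          # -1 lets NumPy infer the length
col_newaxis = x[:, np.newaxis]          # insert a new axis after the first
col_expand = np.expand_dims(x, axis=1)  # same result via a function call

for col in (col_reshape, col_newaxis, col_expand):
    print(col.shape)   # (4, 1)

# For a contiguous 1-D array these are views, so no element data is copied.
print(np.shares_memory(x, col_reshape))   # True
```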
The NumPy indexing docs say that
Ellipsis expand to the number of : objects needed to make a selection tuple of the same length as x.ndim.
However, this seems to hold only when the other indexing arguments are ints and slice objects. For example, None doesn't seem to count towards the selection tuple length for the purposes of Ellipsis:
>>> import numpy
>>> numpy.zeros([2, 2]).shape
(2, 2)
>>> numpy.zeros([2, 2])[..., None].shape
(2, 2, 1)
>>> numpy.zeros([2, 2])[:, None].shape
(2, 1, 2)
>>> numpy.zeros([2, 2])[:, :, None].shape
(2, 2, 1)
Similar odd effects can be observed with boolean indexes, which may count as multiple tuple elements or none at all.
How does NumPy expand Ellipsis in the general case?
Ellipsis does expand to be equivalent to a number of :s, but that number is not always whatever makes the selection tuple length match the array's ndim. Rather, it expands to enough :s for the selection tuple to use every dimension of the array.
In most NumPy indexing, each element of the selection tuple matches up to some dimension of the original array. For example, in
>>> x = numpy.arange(9).reshape([3, 3])
>>> x[1, :]
array([3, 4, 5])
the 1 matches up to the first dimension of x, and the : matches up to the second dimension. The 1 and the : use those dimensions.
Indexing elements don't always use exactly one array dimension, though. If an indexing element corresponds to no input dimensions, or multiple input dimensions, that indexing element will use that many dimensions of the input. For example, None creates a new dimension in the output not corresponding to any dimension of the input. None doesn't use an input dimension, which is why
numpy.zeros([2, 2])[..., None]
expands to
numpy.zeros([2, 2])[:, :, None]
instead of numpy.zeros([2, 2])[:, None].
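The rule for None can be checked with a few shape comparisons. This is only a small sketch of the behaviour described above, with z as an illustrative name:

```python
import numpy as np

z = np.zeros((2, 2))

# None uses no input dimension, so ... still has to cover both axes of z.
assert z[..., None].shape == z[:, :, None].shape == (2, 2, 1)
assert z[None, ...].shape == z[None, :, :].shape == (1, 2, 2)
assert z[None, ..., None].shape == (1, 2, 2, 1)

# An integer uses one dimension, leaving a single axis for ... to cover.
assert z[0, ..., None].shape == z[0, :, None].shape == (2, 1)

print("all shape checks passed")
```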
Similarly, a boolean index uses a number of dimensions corresponding to the number of dimensions of the boolean index itself. For example, a boolean scalar index uses none:
>>> x[..., False].shape
(3, 3, 0)
>>> x[:, False].shape
(3, 0, 3)
>>> x[:, :, False].shape
(3, 3, 0)
And in the common case of a boolean array index with the same shape as the array it's indexing, the boolean array will use every dimension of the other array, and inserting a ... will do nothing:
>>> x.shape
(3, 3)
>>> (x < 5).shape
(3, 3)
>>> x[x<5]
array([0, 1, 2, 3, 4])
>>> x[..., x<5]
array([0, 1, 2, 3, 4])
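The same accounting applies when the boolean index has fewer dimensions than the array. In this sketch (the names y, mask2 and lead are assumptions for illustration), a 2D mask uses two dimensions of a 3D array, so ... expands to a single colon:

```python
import numpy as np

y = np.arange(24).reshape(2, 3, 4)

# A (3, 4) mask uses two dimensions, so ... stands for exactly one ":".
mask2 = np.zeros((3, 4), dtype=bool)
mask2[0, 1] = True
mask2[2, 3] = True

print(y[..., mask2].shape)   # (2, 2)
print(y[:, mask2].shape)     # (2, 2)
print(y[..., mask2])
# [[ 1 11]
#  [13 23]]

# A 1-D mask uses one dimension, so ... stands for two ":" here.
lead = np.array([True, False])
print(y[lead, ...].shape)    # (1, 3, 4)
```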
If you want to see the source code that handles ... expansion and the used-dimension calculation, it's in the NumPy GitHub repository in the prepare_index function under numpy/core/src/multiarray/mapping.c. Look for the used_ndim variable.