How to eliminate a dummy dimension of ndarray? - python

How can I eliminate a dummy dimension in a NumPy ndarray?
For example, suppose that A.shape = (0, 1325, 3).
How can I eliminate the '0' dimension so that A.shape = (1325, 3)?
Neither 'np.squeeze(A)' nor 'A.reshape(A.shape[1:])' works.

You can't eliminate that 0 dimension. A dimension of length 0 is not a "dummy" dimension; it really means length 0. np.squeeze only removes dimensions of length 1, which is why it leaves this array unchanged. Since the total number of elements in the array (which you can check with A.size) is the product of the entries of the shape attribute, an array with shape (0, 1325, 3) contains 0 elements, while an array with shape (1325, 3) contains 3975 elements. If there were a way to eliminate the 0 dimension, where would that data come from?
If your array is supposed to contain data, then you probably need to look at how that array was created in the first place.
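As a minimal sketch of the point above (the arrays `empty` and `dummy` are made up for illustration and are not from the question), compare a length-0 axis with a length-1 axis:

```python
import numpy as np

# An array with a length-0 axis holds no elements at all.
empty = np.zeros((0, 1325, 3))
print(empty.size)             # 0

# A length-1 axis is the kind of "dummy" dimension that squeeze can drop.
dummy = np.zeros((1, 1325, 3))
squeezed = np.squeeze(dummy, axis=0)
print(squeezed.shape)         # (1325, 3)
print(squeezed.size)          # 3975

# Reshaping the empty array to (1325, 3) fails:
# 0 elements cannot fill 3975 slots.
try:
    empty.reshape(empty.shape[1:])
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)         # True
```

If the real array was meant to look like `dummy` rather than `empty`, the place to look is probably the code that produced it.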

Related

Inserting an array inside the list of arrays

I have an issue with numpy arrays and I can't understand what I am doing wrong. I need to create a 100x100 matrix of random non-zero integers whose last row is the combination of all previous rows. Here is my code:
non_zero_m = np.random.randint(0,10,(99,100))
arr = non_zero_m.sum(axis=0)
singular_m = np.concatenate((non_zero_m, arr))
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 1 dimension(s)
I can't understand why Python says that the arrays have different numbers of dimensions.
The problem is that arr is a 1-dimensional array, and you are trying to concatenate it to a matrix (2-dimensional).
Just replace the second line with:
arr = non_zero_m.sum(axis=0).reshape(1, -1)
This reshapes arr to a 2-dimensional array whose first axis has length 1 (making arr effectively a row vector) and whose second axis has whatever length is needed to hold all of arr's elements (this is the meaning of -1 in this context).
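The fix can be sketched end to end as follows. This is an illustration under a few assumptions: the seeded generator `rng` and the range 1 to 9 (chosen so that no entry is zero) are not from the question, and `keepdims=True` is shown as one possible alternative to the reshape.

```python
import numpy as np

rng = np.random.default_rng(0)
# Integers from 1 to 9 inclusive, so no entry is zero (assumed range).
non_zero_m = rng.integers(1, 10, size=(99, 100))

# Option 1: reshape the 1-D sum into a single row.
row = non_zero_m.sum(axis=0).reshape(1, -1)

# Option 2: keep the reduced axis, which yields the same (1, 100) row.
row_kd = non_zero_m.sum(axis=0, keepdims=True)

singular_m = np.concatenate((non_zero_m, row))
print(row.shape, singular_m.shape)   # (1, 100) (100, 100)
```

Either way, both inputs to np.concatenate are 2-dimensional, so the ValueError no longer occurs.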

Appending matrices into a single matrix with numpy

I have a function in Python that returns a numpy.mat of shape (100, 1). I am calling this function 4 times in a loop and would like to take the resulting 4 matrices and create a matrix of shape (100, 4). I have looked for some time at numpy.append, numpy.concatenate, and numpy.insert but have not been able to get this working.
Here is a short SSCCE of my issue:
zeros = np.zeros(shape=(100, 4))
for i in range(1, 5):
    np.append(zeros, np.empty(shape=(100, 1)))
print(zeros)
Here zeros should end up as a matrix of shape (100, 4) filled with "junk" values from each of the calls to numpy.empty, not all zeros.
Do something along these lines -
zeros = np.zeros(shape=(100, 4))
for i in range(1, 5):
    data = np.random.rand(100,1) # func that returns (100,1) shaped array
    zeros[:,i-1] = data.ravel()
In place of ravel(), we could also use data[:,0] or np.squeeze(data). The basic idea is to supply a 1D array, because the left-hand side zeros[:,i-1] expects one.
As an alternative, inside the loop, we could also do -
zeros[:,[i-1]] = data
With the list [i-1] as the column index instead of the integer i-1, the left-hand side stays 2D with shape (100, 1), so data, which is also 2D, can be assigned without any change.
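One reason the code in the question leaves zeros untouched is that np.append returns a new array instead of modifying its argument, and the return value was discarded. As a different route from the column assignment above, the pieces can be collected in a list and joined once with np.hstack. This is only a sketch: `make_column` is a hypothetical stand-in for the function that returns a (100, 1) result.

```python
import numpy as np

def make_column(rng):
    # Hypothetical stand-in for the function returning a (100, 1) result.
    return rng.random((100, 1))

rng = np.random.default_rng(0)
columns = [make_column(rng) for _ in range(4)]

# np.hstack joins the (100, 1) pieces side by side.
result = np.hstack(columns)
print(result.shape)   # (100, 4)
```

Joining once at the end avoids preallocating, at the cost of holding the list of pieces in memory until the join.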

What is the best way to do multi-dimensional indexing with numpy?

I am trying to do some indexing on a 3D numpy array.
Basically I have an array phi which has shape (F,A,D); for example (5, 3, 7). Generated, for example as follows:
F=5; A=3; D=7; phi = np.random.random((F,A,D))
My goal is to be able to index over A and D with a 2D array such as [[0,1,2],[5,5,6]], which means: take the values indexed by 0 in the 3rd dimension for the first position in A, the values indexed by 1 in the 3rd dimension for the second position in A, and so on. The result should have shape (F,A,2) or (F,2,A).
This would be equivalent to manually cycling all the values of the "indexer array" such as:
phi[:,0,0]; phi[:,1,1]; phi[:,2,2]
phi[:,0,5]; phi[:,1,5]; phi[:,2,6]
Intuitively I would do something like phi[:,:,[[0,1,2],[3,3,3]]], but its shape ends up being (5, 3, 2, 3).
Any ideas on how to obtain the correct result?
I think this is what you want
phi[:,range(A),[[0,1,2],[5,5,6]]]
Your attempt
phi[:,:,[[0,1,2],[5,5,6]]]
takes the values along the third dimension for every value of the first two dimensions, therefore you end up with a shape of (5,3,2,3).
However, according to your example you want the index into the second dimension to increase in step with the columns of the index array, which is accomplished in my code by range(A) and numpy's broadcasting.
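To check that the indexing expression matches the manual slices from the question, here is a small sketch. The names `idx`, `picked`, and `manual` are introduced for illustration, and np.arange(A) is used in place of range(A).

```python
import numpy as np

F, A, D = 5, 3, 7
phi = np.random.random((F, A, D))
idx = np.array([[0, 1, 2], [5, 5, 6]])

# np.arange(A) has shape (A,) and broadcasts against idx of shape (2, A).
picked = phi[:, np.arange(A), idx]
print(picked.shape)   # (5, 2, 3)

# Same selection built by hand, one (F,) slice at a time.
manual = np.stack(
    [np.stack([phi[:, a, idx[k, a]] for a in range(A)], axis=-1)
     for k in range(idx.shape[0])],
    axis=1,
)
print(np.array_equal(picked, manual))   # True
```

The result has shape (F, 2, A); swapping its last two axes with picked.transpose(0, 2, 1) would give the (F, A, 2) layout instead.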

Element wise comparison between 1D and 2D array

I want to perform an element-wise comparison between a 1D and a 2D array. Each element of the 1D array needs to be compared (e.g. greater than) against the corresponding row of the 2D array to create a mask. Here is an example:
A = np.random.choice(np.arange(0, 10), (4,100)).astype(float)
B = np.array([5., 4., 8., 2. ])
I want to do
A<B
so that the first row of A will be compared against B[0], which is 5., and the result will be a boolean array.
If I try this I get:
operands could not be broadcast together with shapes (4,100) (4,)
Any ideas?
You need to insert an extra dimension into array B:
A < B[:, None]
This allows NumPy to properly match up the two shapes for broadcasting; B now has shape (4, 1) and the dimensions can be paired up:
(4, 100)
(4, 1)
The rule is that either the dimensions have the same length, or one of the lengths needs to be 1; here 100 can be paired with 1, and 4 can be paired with 4. Before the new dimension was inserted, NumPy tried to pair 100 with 4 which raised the error.
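A runnable sketch of the comparison, assuming a seeded generator `rng` in place of the question's np.random.choice call:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(0, 10, size=(4, 100)).astype(float)
B = np.array([5., 4., 8., 2.])

mask = A < B[:, None]          # B[:, None] adds a trailing length-1 axis
print(B[:, None].shape)        # (4, 1)
print(mask.shape, mask.dtype)  # (4, 100) bool

# Row i of the mask compares A[i] against the scalar B[i].
print(np.array_equal(mask[0], A[0] < B[0]))   # True
```

B.reshape(-1, 1) would produce the same (4, 1) column and can be used instead of B[:, None].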

Confusion in array operation in numpy

I generally use MATLAB and Octave, and I recently switched to Python and numpy.
In numpy when I define an array like this
>>> a = np.array([[2,3],[4,5]])
it works great and size of the array is
>>> a.shape
(2, 2)
which is the same as in MATLAB.
But when I extract the entire first column and check its size
>>> b = a[:,0]
>>> b.shape
(2,)
I get size (2,). What is this? I expected the size to be (2,1). Perhaps I have misunderstood a basic concept. Can anyone clarify this for me?
A 1D numpy array* is literally 1D - it has no size in any second dimension, whereas in MATLAB, a '1D' array is actually 2D, with a size of 1 in its second dimension.
If you want your array to have size 1 in its second dimension you can use its .reshape() method:
a = np.zeros(5,)
print(a.shape)
# (5,)
# explicitly reshape to (5, 1)
print(a.reshape(5, 1).shape)
# (5, 1)
# or use -1 in the first dimension, so that its size in that dimension is
# inferred from its total length
print(a.reshape(-1, 1).shape)
# (5, 1)
Edit
As Akavall pointed out, I should also mention np.newaxis as another method for adding a new axis to an array. Although I personally find it a bit less intuitive, one advantage of np.newaxis over .reshape() is that it allows you to add multiple new axes in an arbitrary order without explicitly specifying the shape of the output array, which is not possible with the .reshape(-1, ...) trick:
a = np.zeros((3, 4, 5))
print(a[np.newaxis, :, np.newaxis, ..., np.newaxis].shape)
# (1, 3, 1, 4, 5, 1)
np.newaxis is just an alias of None, so you could do the same thing a bit more compactly using a[None, :, None, ..., None].
* An np.matrix, on the other hand, is always 2D, and will give you the indexing behavior you are familiar with from MATLAB (note that the NumPy documentation no longer recommends this class for new code):
a = np.matrix([[2, 3], [4, 5]])
print(a[:, 0].shape)
# (2, 1)
For more info on the differences between arrays and matrices, see here.
Typing help(np.shape) gives some insight into what is going on here. For starters, you can get the shape you expect by wrapping the column in a list, which gives shape (1, 2), and then transposing it to (2, 1):
b = np.array([a[:,0]]).T
Basically numpy defines things a little differently than MATLAB. In the numpy environment, a vector only has one dimension, and an array is a vector of vectors, so it can have more. In your first example, your array is a vector of two vectors, i.e.:
a = np.array([vec1, vec2])
So a has two dimensions, and in your example the number of elements in both dimensions is the same, 2. Your array is therefore 2 by 2. When you take a slice out of this, you are reducing the number of dimensions that you have by one. In other words, you are taking a vector out of your array, and that vector only has one dimension, which also has 2 elements, but that's it. Your vector is now 2 by _. There is nothing in the second spot because the vector is not defined there.
You could think of it in terms of spaces too. Your first array is in the space R^(2x2) and your second vector is in the space R^(2). This means that the array is defined on a different (and bigger) space than the vector.
That was a lot to basically say that you took a slice out of your array, and unlike MATLAB, numpy does not represent vectors (1 dimensional) in the same way as it does arrays (2 or more dimensions).
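The difference between dropping and keeping the axis can be seen side by side in a short sketch that reuses the array from the question:

```python
import numpy as np

a = np.array([[2, 3], [4, 5]])

print(a[:, 0].shape)            # (2,)   an integer index drops the axis
print(a[:, 0:1].shape)          # (2, 1) a slice keeps the axis
print(a[:, [0]].shape)          # (2, 1) a list index keeps the axis
print(a[:, 0][:, None].shape)   # (2, 1) the axis is added back with None
```

All three (2, 1) results hold the same values, so which one to use is largely a matter of taste.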
