Testing Numpy array to see if it is in column form - python

In my function, I sometimes get a result that is a 1-dimensional numpy array in 2D form, so that its shape is nx1, i.e. (n,1). Other times I might get it as a 1xn array, with array.shape = (1,n).
Other times, I get just a numpy array whose shape is (n,).
When I run the following tests, I get an error on the one hand, and a false positive on the other (since the length of a shape attribute is always greater than 1, apparently):
y_predicted = forest.predict(testX)
if y_predicted.shape[1] != None:
    y_predicted = y_predicted.T[0]
and
y_predicted = forest.predict(testX)
if len(y_predicted.shape) > 1:
    y_predicted = y_predicted.T[0]
I just need to make sure the final shape of y is always in the form (n,) rather than (n,1) or (1,n)...

You should use numpy.squeeze:
numpy.squeeze(a) removes single-dimensional entries from the shape of an array.
Example:
>>> x = np.array([[1,2,3]])
>>> x.shape
(1, 3)
>>> np.squeeze(x)
array([1, 2, 3])
>>> np.squeeze(x).shape
(3,)
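One caveat: np.squeeze removes every length-1 axis, so a (1, 1) result collapses to a 0-dimensional array with shape (). If that case can occur, flattening with reshape(-1) may be the safer choice. A minimal sketch, assuming the result always holds a single row or column; the helper name to_1d is made up for illustration:

```python
import numpy as np

def to_1d(y):
    """Return y flattened to shape (n,); to_1d is a hypothetical helper name."""
    y = np.asarray(y)
    # reshape(-1) flattens any input, so (n,), (n, 1) and (1, n) all become (n,)
    return y.reshape(-1)

for shape in [(5,), (5, 1), (1, 5), (1, 1)]:
    y = np.zeros(shape)
    print(shape, "->", to_1d(y).shape, "| squeeze gives", np.squeeze(y).shape)
```

Note that reshape(-1) would also flatten a genuine (n, m) matrix, so this only fits when the input is known to be a single row or column.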

Related

What is the proper way to get the dot product of two N-D (3-D) matrices using numpy?

I want to get the dot product of two arrays along the batch dimension. np.dot gave a super weird result. Let's suppose I have a batch of size 2. What would be the proper way to get the results?
X = np.random.randn(2,3,4)
X_t = np.transpose(X,axes=[0,2,1]) # shape now is [2,4,3]
np.matmul(X,X_t) # shape is [2,3,3]
np.dot(X,X_t) # shape is [2,3,2,3] SUPER Weird
np.einsum('ijk,ikl->ijl',X,X_t) # Dimension as [2,3,3] Same as Matmul()
What is the correct way of matrix multiplication for conditions like these?
Use the @ operator. It keeps the first (0th) dimension as the batch dimension
and applies matmul over the other dims.
import numpy as np
x = np.random.randn(2, 3, 4)
x_t = np.transpose(x, axes=[0, 2, 1]) # shape now is [2,4,3]
wrong = np.dot(x, x_t) # shape is [2,3,2,3] SUPER Weird
res = x @ x_t # batched matrix product, same as np.matmul(x, x_t)
print(res.shape)
print(wrong.shape)
out:
(2, 3, 3)
(2, 3, 2, 3)
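To see why np.dot returns the 4-D result, it can help to compare both against an explicit loop over the batch. The following is only a sketch for checking the shapes and values; the variable names looped, full and diagonal are made up here:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 4))
x_t = np.transpose(x, axes=[0, 2, 1])  # shape (2, 4, 3)

batched = x @ x_t  # shape (2, 3, 3)

# Reference: multiply each pair of matrices in the batch separately
looped = np.stack([x[i] @ x_t[i] for i in range(x.shape[0])])

# np.dot pairs every matrix in x with every matrix in x_t:
# full[i, :, j, :] equals x[i] @ x_t[j]
full = np.dot(x, x_t)  # shape (2, 3, 2, 3)
diagonal = np.stack([full[i, :, i, :] for i in range(x.shape[0])])

print(np.allclose(batched, looped))
print(np.allclose(batched, diagonal))
```

So np.dot computes all cross-batch products, and the batched result is only the i == j part of it.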

python difference between array(10,1) array(10,)

I'm trying to load MNIST dataset into arrays.
When I use
(X_train, y_train), (X_test, y_test)= mnist.load_data()
I get an array y_test(10000,) but I want it to be in the shape of (10000,1).
What is the difference between array(10000,1) and array(10000,)?
How can I convert the first array to the second array?
Your first array, with shape (10000,), is a 1-dimensional np.ndarray.
The shape attribute of a numpy array is a tuple, and a tuple of length 1 needs a trailing comma, so the shape is (10000,) and not (10000) (which would be an int). So currently your data looks like this:
import numpy as np
a = np.arange(5) # >>> array([0, 1, 2, 3, 4])
print(a.shape) # >>> (5,)
What you want is a 2-dimensional array with a shape of (10000, 1).
Adding a dimension of length 1 doesn't require any additional data; it is basically an "empty" dimension. To add a dimension to an existing array you can use either np.expand_dims() or np.reshape().
Using np.expand_dims:
import numpy as np
b = np.array(np.arange(5)) # >>> array([0, 1, 2, 3, 4])
b = np.expand_dims(b, axis=1) # >>> array([[0],[1],[2],[3],[4]])
The function was specifically made for the purpose of adding empty dimensions to arrays. The axis keyword specifies which position the newly added dimension will occupy.
Using np.reshape:
import numpy as np
a = np.arange(5)
X_test_reshaped = np.reshape(a, (-1, 1)) # >>> array([[0],[1],[2],[3],[4]])
The (-1, 1) argument specifies what the new shape should look like after the reshape operation. NumPy replaces the -1 with whatever length fits the data.
Reshape is a more powerful function than expand_dims and can be used in many different ways. You can read more on its other uses in the numpy docs for numpy.reshape().
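As a quick check, the following sketch puts the common ways of adding a length-1 axis side by side and shows how the axis argument changes where the new dimension lands. The variable names are made up for the example:

```python
import numpy as np

a = np.arange(5)  # shape (5,)

col_expand = np.expand_dims(a, axis=1)  # new axis at position 1 -> (5, 1)
col_reshape = a.reshape(-1, 1)          # -1 is inferred as 5     -> (5, 1)
col_newaxis = a[:, np.newaxis]          # indexing form           -> (5, 1)
row_expand = np.expand_dims(a, axis=0)  # new axis at position 0 -> (1, 5)

print(col_expand.shape, col_reshape.shape, col_newaxis.shape, row_expand.shape)
print(np.array_equal(col_expand, col_reshape))
print(np.array_equal(col_expand, col_newaxis))
```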
An array with a shape of (10,1) is a 2D array with 10 rows and a single column.
An array with a shape of (10,) is a 1D array.
To convert (10,1) to (10,), you can simply select the only column. For example, take an array x with x.shape = (10,1). Indexing with x[:, 0] drops the column axis, so x[:, 0].shape = (10,).
To convert (10,) to (10,1), you can add a dimension with np.newaxis (after import numpy as np). Take an array y with y.shape = (10,), for example. Using y[:, np.newaxis], you get a new array with the shape (10,1).
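The difference matters in practice because the two shapes broadcast differently when combined. A small sketch of the usual pitfall, with made-up variable names:

```python
import numpy as np

y_flat = np.arange(3.0)        # shape (3,)
y_col = y_flat[:, np.newaxis]  # shape (3, 1)

# Broadcasting stretches (3,) and (3, 1) into a (3, 3) grid of pairwise differences
diff = y_flat - y_col
print(diff.shape)

# Bringing both operands to the same shape gives the elementwise result
same = y_flat - y_col[:, 0]
print(same.shape, same)
```

This is a common source of silent bugs when, for example, labels of shape (n,) are subtracted from predictions of shape (n,1).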

Numpy Append Matrix to Tensor

I am trying to build a list of matrices using numpy, but when I try to append a matrix to an empty tensor, I get the error:
ValueError: all the input arrays must have same number of dimensions
Concatenate and append both seem to fail. I tried calling:
tensor = np.concatenate((tensor, matrix), axis=0)
and
tensor = np.append(tensor, matrix, axis=0)
but I get the same error either way.
The tensor starts with a size of [0, h, w], and the matrix is of size [h, w]. The matrix is the correct shape in the direction I want to append to, but it won't seem to attach.
It seems matrix represents the incoming arrays, which you accumulate into tensor. To solve it, add a new leading axis to matrix with None/np.newaxis and then concatenate with tensor -
np.concatenate((tensor, matrix[None]),axis=0)
If you are accumulating, store it back into tensor.
Or use np.vstack((tensor, matrix[None])).
Sample run -
In [16]: h,w = 3,4
...: a = np.random.rand(0,h,w)
...: b = np.random.rand(h,w)
In [17]: np.concatenate((a, b[None]),axis=0).shape
Out[17]: (1, 3, 4)
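If many matrices are accumulated, each concatenate call copies the whole tensor again. One possible alternative, sketched here under the assumption that all matrices can be kept in a Python list first, is to stack them once at the end; the random matrices are only stand-ins for the real data:

```python
import numpy as np

h, w = 3, 4
rng = np.random.default_rng(0)

matrices = []
for _ in range(5):
    # stand-in for however each (h, w) matrix is actually produced
    matrices.append(rng.random((h, w)))

# np.stack adds the new leading axis and joins everything in one step
tensor = np.stack(matrices, axis=0)
print(tensor.shape)
```

np.stack needs at least one array, so an empty list has to be handled separately.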

How to get these shapes to line up for a numpy matrix

I'm trying to input vectors into a numpy matrix by doing:
eigvec[:,i] = null
However I keep getting the error:
ValueError: could not broadcast input array from shape (20,1) into shape (20)
I've tried using flatten and reshape, but nothing seems to work
The shapes in the error message are a good clue.
In [161]: x = np.zeros((10,10))
In [162]: x[:,1] = np.ones((1,10)) # or x[:,1] = np.ones(10)
In [163]: x[:,1] = np.ones((10,1))
...
ValueError: could not broadcast input array from shape (10,1) into shape (10)
In [166]: x[:,1].shape
Out[166]: (10,)
In [167]: x[:,[1]].shape
Out[167]: (10, 1)
In [168]: x[:,[1]] = np.ones((10,1))
When the shape of the destination matches the shape of the new value, the copy works. It also works in some cases where the new value can be 'broadcast' to fit. But it does not try more general reshaping. Also note that indexing with a scalar reduces the dimension.
I can guess that
eigvec[:,i] = null.flat
would work (however, null.flatten() should work too). In fact, it looks like NumPy complains because you are assigning a pseudo-1D array (shape (20, 1)) to a 1D array which is considered to be oriented differently (shape (1, 20), if you wish).
Another solution would be:
eigvec[:,i] = null.T
where you properly transpose the "vector" null.
The fundamental point here is that NumPy has "broadcasting" rules for converting between arrays with different numbers of dimensions. In the case of conversions between 2D and 1D, a 1D array of size n is broadcast into a 2D array of shape (1, n) (and not (n, 1)). More generally, missing dimensions are added to the left of the original dimensions.
The observed error message basically said that shapes (20,) and (20, 1) are not compatible: this is because (20,) becomes (1, 20) (and not (20, 1)). In fact, one is a column matrix, while the other is a row matrix.
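The rule can be checked directly with np.broadcast, which reports the shape two arrays would broadcast to. The sketch below uses stand-in arrays for eigvec and null (their real contents are not given in the question) and shows two assignments that avoid the error:

```python
import numpy as np

eigvec = np.zeros((20, 5))  # stand-in for the matrix being filled
null = np.ones((20, 1))     # stand-in column "vector", shape (20, 1)
i = 2

# Shapes are aligned from the right; a missing leading axis counts as length 1
row_case = np.broadcast(np.empty(20), np.empty((1, 20))).shape
col_case = np.broadcast(np.empty(20), np.empty((20, 1))).shape
print(row_case, col_case)

# Flatten the column so it matches the (20,) destination
eigvec[:, i] = null.ravel()

# Alternatively, keep the destination 2-D by indexing with a list
eigvec[:, [i]] = null
```

The two printed shapes are (1, 20) and (20, 20): a (20, 1) value would need a (20, 20) result, which cannot fit into a (20,) destination, hence the error.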

Force 2-dimensionality in vector

When I do p = np.zeros((3,1)) I get a matrix in the shape (3, 1).
Sometimes when I am working with NumPy arrays that are nx1, however, I get that their shape is (3,).
How can I make these (3,) shaped arrays into (3,1)?
i.e. here is a minimal runnable program:
>>> import numpy as np
>>> a = np.random.randn(3)
>>> a.shape
(3,)
I want it to be (3,1). I know I could just call with arguments 3,1 but this is just an example, sometimes I can't control the generative process but only manipulate the output.
Just check the shape and add another axis if needed:
if len(a.shape) == 1:
    a = a[..., np.newaxis]
# or this, if you need more generality (desired_dimensions is the target number of dimensions):
a = a.reshape(a.shape + (1,) * (desired_dimensions - len(a.shape)))
There's an np.atleast_2d function, but it would produce a 1-by-3 array instead of 3-by-1.
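Wrapped up as a function, the check might look like the sketch below. The name as_column is hypothetical, and the comparison with np.atleast_2d shows the row-versus-column difference mentioned above:

```python
import numpy as np

def as_column(a):
    """Return a 1-D input as shape (n, 1); as_column is a hypothetical helper name."""
    a = np.asarray(a)
    if a.ndim == 1:
        return a[:, np.newaxis]
    return a  # anything that is not 1-D is returned unchanged

a = np.random.randn(3)
print(as_column(a).shape)          # column form
print(np.atleast_2d(a).shape)      # row form
print(np.atleast_2d(a).T.shape)    # transposing the row form also gives a column
```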
