Is this behaviour of NDArray correct?

I feel the behaviour of the ndarray object is incorrect. I created one using the line of code below:
c = np.ones((5,0), dtype=np.int32)
Some of the commands and their outputs are given below:
print(c)
[]
c
array([], shape=(5, 0), dtype=int32)
c[0]
array([], dtype=int32)
print(c[0])
[]
It's like an empty array that contains empty arrays. I can assign a value, but the value is lost; it doesn't show up.
print(c)
[]
c.shape
(5, 0)
c[0]=10
print(c)
[]
print(c[0])
[]
What does (5,0) array mean? What is the difference between a and c?
a = np.ones((5,), dtype=np.int32)
c = np.ones((5,0), dtype=np.int32)
I am sorry, I am new to Python, so my knowledge is very basic.

Welcome to Python. There seems to be a misconception about the shape of an array, in particular the shape of a 1D array (shapes of the form (n,)). The shape (n,) corresponds to a 1-dimensional numpy array. If you are familiar with linear algebra, this 1D array is analogous to a row vector. It is NOT the same thing as (n,0).
What (n, m) represents is the shape of a 2D numpy array (which is analogous to a matrix in linear algebra). So an array of shape (n,0) has n rows, but each row has 0 columns, which is why you get back an empty array. If you do in fact want a vector of ones, you can type np.ones((5,)). Hope it helps. Comment if you require any further help.
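To make the distinction above concrete, here is a minimal sketch that compares the two arrays from the question. It only uses the standard ndim, shape and size attributes; the variable names follow the question.

```python
import numpy as np

a = np.ones((5,), dtype=np.int32)    # 1D: five elements
c = np.ones((5, 0), dtype=np.int32)  # 2D: five rows, zero columns

print(a.ndim, a.shape, a.size)  # 1 (5,) 5
print(c.ndim, c.shape, c.size)  # 2 (5, 0) 0
print(len(c), c[0].shape)       # 5 (0,)
```

So c really does have five rows; each row is simply an empty 1D array.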

In [43]: c = np.ones((5,0), dtype=np.int32)
In [44]: c
Out[44]: array([], shape=(5, 0), dtype=int32)
In [45]: c.size
Out[45]: 0
In [46]: np.ones(5).size
Out[46]: 5
The size, or number of elements, of an array is the product of its shape. For c that is 5*0 = 0. c is a 2d array that contains nothing.
If I try to assign a value to a column of c I get an error:
In [49]: c[:,0]=10
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-49-7baeeb61be4e> in <module>
----> 1 c[:,0]=10
IndexError: index 0 is out of bounds for axis 1 with size 0
Your assignment:
In [51]: c[0] = 10
is actually:
In [52]: c[0,:] = np.array(10)
That works because c[0,:].shape is (0,), and an array with shape () or (1,) can be 'broadcast' to that target. Since the target has no elements, nothing is actually written. That's a tricky case of broadcasting.
A more instructive case of assignment to c is where we try to assign 2 values:
In [57]: c[[0,1],:] = np.array([1,2])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-57-9ad1339293df> in <module>
----> 1 c[[0,1],:] = np.array([1,2])
ValueError: shape mismatch: value array of shape (2,) could not be broadcast to indexing result of shape (2,0)
The source array is 1d with shape (2,). The target is 2d with shape (2,0).
In general, arrays with a 0 in the shape are confusing and shouldn't be created deliberately. They sometimes arise when indexing other arrays. But don't make one with np.zeros((n,0)) except as an experiment.
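As an illustration of how such arrays can arise from indexing, here is a small sketch. The array m and the chosen slice and mask are made up for the example and are not taken from the question.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)

empty_cols = m[:, 4:]     # slice that starts past the last column
empty_mask = m[m > 100]   # boolean mask that selects nothing

print(empty_cols.shape)  # (3, 0)
print(empty_mask.shape)  # (0,)

# Assigning a scalar to an empty selection writes zero elements.
m[m > 100] = -1
```

After the last line m is unchanged, which mirrors the "lost" assignment c[0] = 10 in the question.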

Related

Padding new axis in 3d matrix returns error

What I need to do is to extend a 2D matrix to 3D and fill the 3rd axis with an arbitrary number of zeros. The error returned is:
all the input arrays must have same number of dimensions, but the
array at index 0 has 3 dimension(s) and the array at index 1 has 0
dimension(s)
What should I correct?
import numpy as np
kernel = np.ones((3,3)) / 9
kernel = kernel[..., None]
print(type(kernel))
print(np.shape(kernel))
print(kernel)
i = 1
for i in range(27):
    np.append(kernel, 0, axis = 2)
print(kernel)

What should I use instead of np.append()?
Use concatenate():
import numpy as np
kernel = np.ones((3,3)) / 9
kernel = kernel[..., None]
print(type(kernel))
print(np.shape(kernel))
print(kernel)
print('-----------------------------')
append_values = np.zeros((3,3))
append_values = append_values[..., None]
i = 1
for i in range(2):
    kernel = np.concatenate((kernel, append_values), axis=2)
print(kernel.shape)
print(kernel)
But best generate the append_values array already with the required shape in the third dimension to avoid looping:
append_values = np.zeros((3,3,2)) # or (3,3,27)
kernel = np.concatenate((kernel, append_values), axis=2)
print(kernel.shape)
print(kernel)

Look at the output and error - full error, not just a piece!
<class 'numpy.ndarray'>
(3, 3, 1)
...
In [94]: np.append(kernel, 0, axis = 2)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Input In [94], in <cell line: 1>()
----> 1 np.append(kernel, 0, axis = 2)
File <__array_function__ internals>:5, in append(*args, **kwargs)
File ~\anaconda3\lib\site-packages\numpy\lib\function_base.py:4817, in append(arr, values, axis)
4815 values = ravel(values)
4816 axis = arr.ndim-1
-> 4817 return concatenate((arr, values), axis=axis)
File <__array_function__ internals>:5, in concatenate(*args, **kwargs)
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 3 dimension(s) and the array at index 1 has 0 dimension(s)
As your shape shows, kernel is 3d, (3,3,1). np.append takes the scalar 0, makes an array, np.array(0), and calls concatenate. concatenate, if you take time to read its docs, requires matching numbers of dimensions.
But my main beef with your code is that you used np.append without capturing the result. Again, if you take time to read the docs, you'll realize that np.append does not work in-place. It does NOT modify kernel. When it works, it returns a new array. And doing that repeatedly in a loop is inefficient.
It looks like you took the list append model and applied it, without much thought, to arrays. That's not how to code numpy.
As the other answer shows, doing one concatenate with a (3,3,27) array of 0s is the way to go if you want to make a (3,3,28) array.
Alternatively, make a (3,3,28) array of 0s, and copy the one (3,3,1) array into the appropriate slice.
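The two approaches just described can be sketched side by side. This is a minimal illustration that assumes the kernel belongs in the first slice of the third axis; the names padded and prealloc are chosen here for the example.

```python
import numpy as np

kernel = (np.ones((3, 3)) / 9)[..., None]   # shape (3, 3, 1)

# Option 1: a single concatenate with a block of zeros.
padded = np.concatenate((kernel, np.zeros((3, 3, 27))), axis=2)

# Option 2: preallocate zeros, then copy the kernel into the first slice.
prealloc = np.zeros((3, 3, 28))
prealloc[:, :, :1] = kernel

print(padded.shape, prealloc.shape)  # (3, 3, 28) (3, 3, 28)
```

Both give the same array; neither needs a Python loop.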

Vstack of two arrays with same number of rows gives an error

I have a numpy array of shape (29, 10) and a list of 29 elements and I want to end up with an array of shape (29,11)
I am basically converting the list to a numpy array and trying to vstack, but it complains about the dimensions not being the same.
Toy example
a = np.zeros((29,10))
a.shape
(29,10)
b = np.array(['A']*29)
b.shape
(29,)
np.vstack((a, b))
ValueError: all the input array dimensions except for the concatenation axis must match exactly
Dimensions do actually match, why am I getting this error and how can I solve it?

I think you are looking for np.hstack.
np.hstack((a, b.reshape(-1,1)))
Moreover b must be 2-dimensional, that's why I used a reshape.

The problem is that you want to append a 1D array to a 2D array.
Also, for the dimension you've given for b, you are probably looking for hstack.
Try this:
a = np.zeros((29,10))
a.shape
(29,10)
b = np.array(['A']*29)[:,None] #to ensure 2D structure
b.shape
(29,1)
np.hstack((a, b))
If you do want to vertically stack, you'd need this:
a = np.zeros((29,10))
a.shape
(29,10)
b = np.array(['A']*10)[None,:] #to ensure 2D structure
b.shape
(1,10)
np.vstack((a, b))
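As a possible alternative that the answers above do not mention, np.column_stack accepts a 1D array and treats it as a column, so no explicit reshape is needed. The sketch below uses a numeric b as a stand-in for the list in the question, which is an assumption made to keep the result numeric.

```python
import numpy as np

a = np.zeros((29, 10))
b = np.arange(29)          # 1D, shape (29,); numeric stand-in for the list

via_hstack = np.hstack((a, b[:, None]))   # make b (29, 1) first
via_column = np.column_stack((a, b))      # 1D b is treated as a column

print(via_hstack.shape, via_column.shape)  # (29, 11) (29, 11)
```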

What is the difference between an array with shape (N,1) and one with shape (N)? And how to convert between the two?

Python newbie here coming from a MATLAB background.
I have a 1 column array and I want to move that column into the first column of a 3 column array. With a MATLAB background this is what I would do:
import numpy as np
A = np.zeros([150,3]) #three column array
B = np.ones([150,1]) #one column array which needs to replace the first column of A
#MATLAB-style solution:
A[:,0] = B
However this does not work because the "shape" of A is (150,3) and the "shape" of B is (150,1). And apparently the command A[:,0] results in a "shape" of (150).
Now, what is the difference between (150,1) and (150)? Aren't they the same thing: a column vector? And why isn't Python "smart enough" to figure out that I want to put the column vector, B, into the first column of A?
Is there an easy way to convert a 1-column vector with shape (N,1) to a 1-column vector with shape (N)?
I am new to Python and this seems like a really silly thing that MATLAB does much better...

Several things are different. In numpy, arrays may be 0d, 1d, or higher. In MATLAB, 2d is the smallest (and at one time the only) dimensionality. MATLAB readily expands dimensions at the end because it is Fortran ordered. numpy is by default C ordered, and most readily expands dimensions at the front.
In [1]: A = np.zeros([5,3])
In [2]: A[:,0].shape
Out[2]: (5,)
Simple indexing reduces a dimension, regardless of whether it's A[0,:] or A[:,0]. Contrast that with what happens to a 3d MATLAB matrix, A(1,:,:) vs A(:,:,1).
numpy does broadcasting, adjusting dimensions during operations like addition and assignment. One basic rule is that dimensions may be automatically expanded toward the start if needed:
In [3]: A[:,0] = np.ones(5)
In [4]: A[:,0] = np.ones([1,5])
In [5]: A[:,0] = np.ones([5,1])
...
ValueError: could not broadcast input array from shape (5,1) into shape (5)
It can change (5,) LHS to (1,5), but can't change it to (5,1).
Another broadcasting example, +:
In [6]: A[:,0] + np.ones(5);
In [7]: A[:,0] + np.ones([1,5]);
In [8]: A[:,0] + np.ones([5,1]);
Now the (5,) works with (5,1), but that's because it becomes (1,5), which together with (5,1) produces (5,5) - an outer product broadcasting:
In [9]: (A[:,0] + np.ones([5,1])).shape
Out[9]: (5, 5)
In Octave
>> x = ones(2,3,4);
>> size(x(1,:,:))
ans =
1 3 4
>> size(x(:,:,1))
ans =
2 3
>> size(x(:,1,1) )
ans =
2 1
>> size(x(1,1,:) )
ans =
1 1 4
To do the assignment that you want, adjust either side.
Index in a way that preserves the number of dimensions:
In [11]: A[:,[0]].shape
Out[11]: (5, 1)
In [12]: A[:,[0]] = np.ones([5,1])
transpose the (5,1) to (1,5):
In [13]: A[:,0] = np.ones([5,1]).T
flatten/ravel the (5,1) to (5,):
In [14]: A[:,0] = np.ones([5,1]).flat
In [15]: A[:,0] = np.ones([5,1])[:,0]
squeeze, ravel also work.
Some quick tests in Octave indicate that it is more forgiving when it comes to dimension mismatches. But numpy prioritizes consistency. Once the broadcasting rules are understood, the behavior makes sense.

Use squeeze method to eliminate the dimensions of size 1.
A[:,0] = B.squeeze()
Or just create B one-dimensional to begin with:
B = np.ones([150])
The fact that NumPy maintains a distinction between a 1D array and a 2D array with one of its dimensions being 1 is reasonable, especially when one begins working with n-dimensional arrays.
To answer the question in the title: there is an evident structural difference between an array of shape (3,) such as
[1, 2, 3]
and an array of shape (3, 1) such as
[[1], [2], [3]]
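For the "how to convert between the two" part of the question, the following sketch gathers the conversions in both directions. A and B follow the question; the other names are illustrative only.

```python
import numpy as np

A = np.zeros((150, 3))
B = np.ones((150, 1))

# (N, 1) -> (N,): each of these yields a 1D array of 150 elements
flat_a = B[:, 0]
flat_b = B.ravel()
flat_c = B.reshape(-1)

# (N,) -> (N, 1): add a trailing axis back
col_a = flat_a[:, None]
col_b = flat_a.reshape(-1, 1)

A[:, 0] = flat_b    # 1D value into a 1D target
A[:, [1]] = B       # 2D value into a 2D target (list index keeps the axis)

print(flat_a.shape, col_a.shape)  # (150,) (150, 1)
```

Either side of the assignment can be adjusted, as long as the two shapes end up matching or broadcastable.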

How to get these shapes to line up for a numpy matrix

I'm trying to input vectors into a numpy matrix by doing:
eigvec[:,i] = null
However I keep getting the error:
ValueError: could not broadcast input array from shape (20,1) into shape (20)
I've tried using flatten and reshape, but nothing seems to work

The shapes in the error message are a good clue.
In [161]: x = np.zeros((10,10))
In [162]: x[:,1] = np.ones((1,10)) # or x[:,1] = np.ones(10)
In [163]: x[:,1] = np.ones((10,1))
...
ValueError: could not broadcast input array from shape (10,1) into shape (10)
In [166]: x[:,1].shape
Out[166]: (10,)
In [167]: x[:,[1]].shape
Out[167]: (10, 1)
In [168]: x[:,[1]] = np.ones((10,1))
When the shape of the destination matches the shape of the new value, the copy works. It also works in some cases where the new value can be 'broadcasted' to fit. But it does not try more general reshaping. Also note that indexing with a scalar reduces the dimension.

I can guess that
eigvec[:,i] = null.flat
would work (null.flatten() should work too). In fact, it looks like NumPy complains because you are assigning a pseudo-1D array (shape (20, 1)) to a 1D array, which is considered to be oriented differently (shape (1, 20), if you wish).
Another solution would be:
eigvec[:,i] = null.T
where you properly transpose the "vector" null.
The fundamental point here is that NumPy has "broadcasting" rules for converting between arrays with different numbers of dimensions. In the case of conversions between 2D and 1D, a 1D array of size n is broadcast into a 2D array of shape (1, n) (and not (n, 1)). More generally, missing dimensions are added to the left of the original dimensions.
The observed error message basically said that shapes (20,) and (20, 1) are not compatible: this is because (20,) becomes (1, 20) (and not (20, 1)). In fact, one is a column matrix, while the other is a row matrix.
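The "missing dimensions are added to the left" rule can be checked directly with np.broadcast_to and np.broadcast. This is a small sketch with made-up arrays of the same shapes as in the error message.

```python
import numpy as np

row_like = np.ones(20)       # shape (20,)
column = np.ones((20, 1))    # shape (20, 1)

# A 1D array gains a new axis on the left: (20,) -> (1, 20).
print(np.broadcast_to(row_like, (1, 20)).shape)   # (1, 20)

# Combining (20,) with (20, 1) therefore gives (20, 20), not (20,).
print(np.broadcast(row_like, column).shape)       # (20, 20)

# Shrinking (20, 1) down to (20,) is not a broadcast, so this fails.
try:
    np.broadcast_to(column, (20,))
    fits = True
except ValueError:
    fits = False
print(fits)  # False
```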

Calculating Correlation Coefficient with Numpy

I have a list of values and a 1-d numpy array, and I would like to calculate the correlation coefficient using numpy.corrcoef(x,y,rowvar=0). I get the following error:
Traceback (most recent call last):
File "testLearner.py", line 25, in <module>
corr = np.corrcoef(valuesToCompare,queryOutput,rowvar=0)
File "/usr/local/lib/python2.6/site-packages/numpy/lib/function_base.py", line 2003, in corrcoef
c = cov(x, y, rowvar, bias, ddof)
File "/usr/local/lib/python2.6/site-packages/numpy/lib/function_base.py", line 1935, in cov
X = concatenate((X,y), axis)
ValueError: array dimensions must agree except for d_0
I printed out the shape for my numpy array and got (400,1). When I convert my list to an array with numpy.asarray(y) I get (400,)!
I believe this is the problem. I did an array.reshape to (400,1) and printed out the shape, and I still get (400,). What am I missing?
Thanks in advance.

I think you might have assumed that reshape modifies the value of the original array. It doesn't:
>>> a = np.random.randn(5)
>>> a.shape
(5,)
>>> b = a.reshape(5,1)
>>> b.shape
(5, 1)
>>> a.shape
(5,)
np.asarray treats a regular list as a 1d array, but your original numpy array that you said was 1d is actually 2d (because its shape is (400,1)). If you want to use your list like a 2d array, there are two easy approaches:
np.asarray(lst).reshape((-1, 1)): -1 means "however many it needs" for that dimension.
np.asarray([lst]).T: .T means array transpose, which switches from (1,5) to (5,1).
You could also reshape your original array to 1d via ary.reshape((-1,)).
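Putting this together for the correlation case, here is a minimal sketch with small made-up data. The names values and output are hypothetical stand-ins for the list and the (400,1) array in the question.

```python
import numpy as np

values = [1.0, 2.0, 3.0, 4.0]                     # plain list (made-up data)
output = np.array([[2.0], [4.0], [6.0], [8.0]])   # shape (4, 1)

# reshape returns a new array object; output itself keeps its shape.
reshaped = output.reshape(-1)
print(output.shape, reshaped.shape)  # (4, 1) (4,)

# With both inputs 1D, corrcoef returns a 2x2 matrix; [0, 1] is the coefficient.
corr = np.corrcoef(values, reshaped)[0, 1]
print(round(corr, 6))  # 1.0
```

The key point is to keep the result of reshape (here in reshaped) and pass that on, rather than expecting the original array to change.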
