Reshaping numpy array from list - python

I have the following problem with the shape of an ndarray:
out.shape = (20,)
reference.shape = (20,0)
norm = [out[i] / np.sum(out[i]) for i in range(len(out))]
# norm is a list now so I convert it to ndarray:
norm_array = np.array((norm))
norm_array.shape = (20,30)
# error: operands could not be broadcast together with shapes (20,30) (20,)
diff = np.fabs(norm_array - reference)
How can I change the shape of norm_array from (20,30) to (20,), or of reference to (20,30), so that I can subtract them?
EDIT: Can someone explain to me why they have different shapes, if I can access single elements of both with norm_array[0][0] and reference[0][0]?

I am not sure what you are trying to do exactly, but here is some information on numpy arrays.
A 1-d numpy array is a row vector with a shape that is a single-valued tuple:
>>> np.array([1,2,3]).shape
(3,)
You can create multidimensional arrays by passing in nested lists. Each sub-list is a 1-d row vector of length 1, and there are 3 of them.
>>> np.array([[1],[2],[3]]).shape
(3,1)
Here is the weird part. You can create the same array, but leave the lists empty. You end up with 3 row vectors of length 0.
>>> np.array([[],[],[]]).shape
(3,0)
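This is what you have for your reference array: an array with structure but no values. This brings me back to my original point:
You can't subtract an empty array. A (20,0) array holds zero elements, and a last dimension of length 0 cannot be broadcast against one of length 30.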

If I make 2 arrays with the shapes you describe, I get an error
In [1856]: norm_array=np.ones((20,30))
In [1857]: reference=np.ones((20,0))
In [1858]: norm_array-reference
...
ValueError: operands could not be broadcast together with shapes (20,30) (20,0)
But that message differs from yours. If I change the shape of reference, the error messages match.
In [1859]: reference=np.ones((20,))
In [1860]: norm_array-reference
...
ValueError: operands could not be broadcast together with shapes (20,30) (20,)
So your (20,0) is wrong. I don't know if you mistyped something or not.
But if I make reference 2d with 1 in the last dimension, broadcasting works, producing a difference that matches (20,30) in shape:
In [1861]: reference=np.ones((20,1))
In [1862]: norm_array-reference
If reference = np.zeros((20,)), then I could use reference[:,None] to add that singleton last dimension.
If reference is (20,), you can't do reference[0][0]. reference[0][0] only works with 2d arrays with at least 1 in the last dim. reference[0,0] is the preferred way of indexing a single element of a 2d array.
So far this is normal array dimensions and broadcasting; something you'll learn with use.
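To make the reference[:,None] suggestion concrete, here is a minimal sketch. It assumes reference is really 1-d with shape (20,), as the error message suggests; the ones/zeros arrays are only stand-ins for the real data.

```python
import numpy as np

norm_array = np.ones((20, 30))   # stand-in for the normalized data
reference = np.zeros((20,))      # assumed 1-d reference, one value per row

# reference[:, None] has shape (20, 1), which broadcasts against (20, 30)
reference_col = reference[:, None]
diff = np.fabs(norm_array - reference_col)

print(reference_col.shape)       # (20, 1)
print(diff.shape)                # (20, 30)
```

Each row of norm_array is compared against the single reference value for that row.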
===============
I'm puzzled about the shape of out. If it is (20,), how does norm_array end up as (20,30)? out must consist of 20 arrays or lists, each of which has 30 elements.
If out were a 2d array, we could normalize without iteration. Take a small example:
In [1869]: out=np.arange(12).reshape(3,4)
First, with the list comprehension:
In [1872]: [out[i]/np.sum(out[i]) for i in range(out.shape[0])]
Out[1872]:
[array([ 0. , 0.16666667, 0.33333333, 0.5 ]),
array([ 0.18181818, 0.22727273, 0.27272727, 0.31818182]),
array([ 0.21052632, 0.23684211, 0.26315789, 0.28947368])]
In [1873]: np.array(_) # and to array
Out[1873]:
array([[ 0. , 0.16666667, 0.33333333, 0.5 ],
[ 0.18181818, 0.22727273, 0.27272727, 0.31818182],
[ 0.21052632, 0.23684211, 0.26315789, 0.28947368]])
Instead, take the row sums and pass keepdims=True so the result stays 2d and broadcasts against out:
In [1876]: out.sum(axis=1,keepdims=True)
Out[1876]:
array([[ 6],
[22],
[38]])
Now divide:
In [1877]: out/out.sum(axis=1,keepdims=True)
Out[1877]:
array([[ 0. , 0.16666667, 0.33333333, 0.5 ],
[ 0.18181818, 0.22727273, 0.27272727, 0.31818182],
[ 0.21052632, 0.23684211, 0.26315789, 0.28947368]])
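If out really is a collection of 20 arrays with 30 elements each, as guessed above, something like the following sketch should work. The random input is hypothetical, and np.stack is my choice for building the 2d array; np.array(out) would do the same here.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical input: 20 separate arrays of 30 positive values each
out = [rng.random(30) + 0.1 for _ in range(20)]

out_2d = np.stack(out)                                    # shape (20, 30)
norm_array = out_2d / out_2d.sum(axis=1, keepdims=True)   # normalize each row

# Same result as the row-by-row list comprehension
loop_version = np.array([row / np.sum(row) for row in out])

print(norm_array.shape)                       # (20, 30)
print(np.allclose(norm_array, loop_version))  # True
```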

Related

convert 2D array into 1D array using np.newaxis

I am trying to convert a 2D array with one column into a 1D vector using np.newaxis. The result I have got so far is a 3D array instead of a 1D vector or 1D array.
The 2D array y1 is:
y1.shape
(506, 1)
y1
array([[0.42 ],
[0.36666667],
[0.66 ],
[0.63333333],
[0.69333333],
... ])
Now I'd like to convert it into 1D array
import numpy as np
y2=y1[np.newaxis,:]
y2.shape
(1, 506, 1)
As you can see, after using np.newaxis, y2 has become a 3D array, but I was expecting a 1D array of shape (506,).
What is the problem with my code above? Thanks
np.newaxis expands the dimensions, so 2D -> 3D. If you want to reduce the dimensions, 2D -> 1D, use squeeze:
>>> a
array([[0.42 ],
[0.36666667],
[0.66 ],
[0.63333333],
[0.69333333]])
>>> a.shape
(5, 1)
>>> a.squeeze()
array([0.42 , 0.36666667, 0.66 , 0.63333333, 0.69333333])
>>> a.squeeze().shape
(5,)
From the documentation:
Each newaxis object in the selection tuple serves to expand the dimensions of the resulting selection by one unit-length dimension. The added dimension is the position of the newaxis object in the selection tuple.
np.newaxis is used to increase the dimension of an array (it is an alias for None, not a function). It will not decrease the dimension. To decrease the dimension, you can use reshape():
y1 = np.array(y1).reshape(-1,)
print(y1.shape)
(506,)
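A small side-by-side sketch of the options mentioned in these answers. The y1 below is a stand-in with the same (506, 1) shape as in the question, not the original data.

```python
import numpy as np

y1 = np.linspace(0.0, 1.0, 506).reshape(506, 1)  # stand-in with shape (506, 1)

expanded = y1[np.newaxis, :]     # newaxis adds an axis
squeezed = y1.squeeze(axis=1)    # drops the length-1 axis
reshaped = y1.reshape(-1)        # flattens to 1D
column = y1[:, 0]                # plain indexing of the single column

print(expanded.shape)            # (1, 506, 1)
print(squeezed.shape)            # (506,)
print(reshaped.shape)            # (506,)
print(column.shape)              # (506,)
```

Passing axis=1 to squeeze is a bit safer than a bare squeeze(), since it raises an error if that axis does not have length 1 instead of silently removing other axes.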

Avoid using for loop. Python 3

I have an array of shape (3,2):
import numpy as np
arr = np.array([[0.,0.],[0.25,-0.125],[0.5,-0.125]])
I am trying to build a matrix (matrix) of shape (6,2) by stacking, for each row i, the outer product of arr[i] with itself. At the moment I am using a for loop such as:
size = np.shape(arr)
matrix = np.zeros((size[0]*size[1],size[1]))
for i in range(size[0]):
    # .T is a no-op on a 1-D row, so arr[i] can be passed twice
    prod = np.outer(arr[i],arr[i])
    matrix[size[1]*i:size[1]*(i+1),:] = prod
Resulting:
matrix =array([[ 0. , 0. ],
[ 0. , 0. ],
[ 0.0625 , -0.03125 ],
[-0.03125 , 0.015625],
[ 0.25 , -0.0625 ],
[-0.0625 , 0.015625]])
Is there any way to build this matrix without using a for loop (e.g. broadcasting)?
Extend the arrays to 3D with None/np.newaxis, keeping the first axis aligned while letting the second axes get multiplied pair-wise, then multiply using broadcasting and reshape to 2D -
matrix = (arr[:,None,:]*arr[:,:,None]).reshape(-1,arr.shape[1])
We can also use np.einsum -
matrix = np.einsum('ij,ik->ijk',arr,arr).reshape(-1,arr.shape[1])
The einsum string representation might be more intuitive, as it lets us visualize three things:
Axes that are aligned (axis=0 here).
Axes that are getting summed up (none here).
Axes that are kept i.e. element-wise multiplied (axis=1 here).
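As a quick sanity check, the sketch below compares both vectorized versions against the loop from the question. The names expected, via_broadcast and via_einsum are just for illustration.

```python
import numpy as np

arr = np.array([[0., 0.], [0.25, -0.125], [0.5, -0.125]])
n_rows, n_cols = arr.shape

# Reference result from the original loop
expected = np.zeros((n_rows * n_cols, n_cols))
for i in range(n_rows):
    expected[n_cols * i:n_cols * (i + 1), :] = np.outer(arr[i], arr[i])

# Broadcasting: (3, 2, 1) * (3, 1, 2) -> (3, 2, 2), then flatten to (6, 2)
via_broadcast = (arr[:, :, None] * arr[:, None, :]).reshape(-1, n_cols)
via_einsum = np.einsum('ij,ik->ijk', arr, arr).reshape(-1, n_cols)

print(np.allclose(via_broadcast, expected))  # True
print(np.allclose(via_einsum, expected))     # True
```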

Reshape numpy (n,) vector to (n,1) vector

So it is easier for me to think about vectors as column vectors when I need to do some linear algebra. Thus I prefer shapes like (n,1).
Is there a significant difference in memory usage between shapes (n,) and (n,1)?
What is the preferred way?
And how do I reshape an (n,) vector into an (n,1) vector? Somehow b.reshape((n,1)) doesn't do the trick.
a = np.random.random((10,1))
b = np.ones((10,))
b.reshape((10,1))
print(a)
print(b)
[[ 0.76336295]
[ 0.71643237]
[ 0.37312894]
[ 0.33668241]
[ 0.55551975]
[ 0.20055153]
[ 0.01636735]
[ 0.5724694 ]
[ 0.96887004]
[ 0.58609882]]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
A simpler way is to use
b.reshape(-1,1)
where NumPy automatically computes the correct length for the dimension given as -1.
ndarray.reshape() returns a new view when the memory layout allows it, and a copy otherwise. It does not modify the array in place.
b.reshape((10, 1))
on its own is effectively a no-op, since the created view/copy is not assigned to anything. The "fix" is simple:
b_new = b.reshape((10, 1))
The amount of memory used should not differ at all between the 2 shapes. Numpy arrays use the concept of strides, so the shapes (10,) and (10, 1) can both use the same buffer; only the step sizes used to move to the next row and column change.
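A short sketch that checks both claims, assuming a freshly created contiguous array, for which reshape returns a view. The name b_col is just for illustration.

```python
import numpy as np

b = np.ones((10,))
b_col = b.reshape(-1, 1)           # assign the result; b itself is unchanged

print(b.shape, b_col.shape)        # (10,) (10, 1)
print(np.shares_memory(b, b_col))  # True: both use the same buffer
print(b.nbytes == b_col.nbytes)    # True: same amount of data

b_col[0, 0] = 5.0                  # writing through the view also changes b
print(b[0])                        # 5.0
```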

Element wise comparison between 1D and 2D array

I want to perform an element-wise comparison between a 1D and a 2D array. Each element of the 1D array needs to be compared (e.g. greater than) against the corresponding row of the 2D array, and a mask will be created. Here is an example:
A = np.random.choice(np.arange(0, 10), (4,100)).astype(float)
B = np.array([5., 4., 8., 2. ])
I want to do
A<B
so that the first row of A is compared against B[0], which is 5., and so on, and the result is a boolean array.
If I try this I get:
operands could not be broadcast together with shapes (4,100) (4,)
Any ideas?
You need to insert an extra dimension into array B:
A < B[:, None]
This allows NumPy to properly match up the two shapes for broadcasting; B now has shape (4, 1) and the dimensions can be paired up:
(4, 100)
(4, 1)
The rule is that either the dimensions have the same length, or one of the lengths needs to be 1; here 100 can be paired with 1, and 4 can be paired with 4. Before the new dimension was inserted, NumPy tried to pair 100 with 4 which raised the error.
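A minimal sketch of the result, using random stand-in data like the question's. The seeded generator is only there to make the example repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(0, 10, size=(4, 100)).astype(float)
B = np.array([5., 4., 8., 2.])

mask = A < B[:, None]            # B[:, None] has shape (4, 1)
print(mask.shape)                # (4, 100)
print(mask.dtype)                # bool

# Row i of the mask is the comparison of A[i] against the scalar B[i]
rows_match = all(np.array_equal(mask[i], A[i] < B[i]) for i in range(len(B)))
print(rows_match)                # True
```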

derivative with numpy.diff problems

I have this problem:
I have an array of 7 elements:
vector = [array([ 76.27789424]), array([ 76.06870298]), array([ 75.85016864]), array([ 75.71155968]), array([ 75.16982466]), array([ 73.08832948]), array([ 68.59935515])]
(This array is the result of a lot of operations.)
Now I want to calculate the derivative with numpy.diff(vector), but I know that the input must be a numpy array.
So I type:
vector = numpy.array(vector)
If I print the vector now, the result is:
[[ 76.27789424]
[ 76.06870298]
[ 75.85016864]
[ 75.71155968]
[ 75.16982466]
[ 73.08832948]
[ 68.59935515]]
But if I try to calculate the derivative, the result is [].
Can you help me, please?
Thanks a lot!
vector is a list of arrays; to get a 1-D NumPy array, use a list comprehension and pass it to numpy.array:
>>> vector = numpy.array([x[0] for x in vector])
>>> numpy.diff(vector)
array([-0.20919126, -0.21853434, -0.13860896, -0.54173502, -2.08149518,
-4.48897433])
vector = numpy.array(vector);
gives you a two dimensional array with seven rows and one column
>>> vector.shape
(7, 1)
The shape reads like: (length axis 0, length axis 1, length axis 2, ...)
As you can see, the last axis is axis 1 and its length is 1.
From the docs:
numpy.diff(a, n=1, axis=-1)
...
axis : int, optional
The axis along which the difference is taken, default is the last axis.
There is no way to take the difference of a single value. So let's try the first axis, which has a length of 7. Since axis counting starts at zero, the first axis is 0:
>>> np.diff(vector, axis=0)
array([[-0.20919126],
[-0.21853434],
[-0.13860896],
[-0.54173502],
[-2.08149518],
[-4.48897433]])
Note that each order of differencing makes the result one element shorter along that axis, so the new shape is (7-1, 1), which is (6, 1). Let's verify that:
>>> np.diff(vector, axis=0).shape
(6, 1)
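Putting the two answers side by side, here is a small sketch using the values from the question. The names stacked, default_diff, column_diff and flat_diff are just for illustration.

```python
import numpy as np

values = [76.27789424, 76.06870298, 75.85016864, 75.71155968,
          75.16982466, 73.08832948, 68.59935515]
vector = [np.array([v]) for v in values]   # list of 1-element arrays, as in the question

stacked = np.array(vector)                 # shape (7, 1)

default_diff = np.diff(stacked)            # last axis has length 1, so nothing to difference
column_diff = np.diff(stacked, axis=0)     # difference down the rows
flat_diff = np.diff(stacked.ravel())       # flatten first, then difference

print(default_diff.shape)                  # (7, 0)
print(column_diff.shape)                   # (6, 1)
print(flat_diff.shape)                     # (6,)
```

The empty (7, 0) result is what prints as [] in the question.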
