Convert a list to a 3 dimensional numpy array - python

I have a list that consists of a collection of differently shaped 2 dimensional numpy arrays. It looks like this.
My goal is to convert this to a 3d numpy array so that its structure is something like this
[
[[ ]
[ ]
:
],
[[ ]
[ ]
:
],
:
]
or in words
[
two dimensional array 1,
two dimensional array 2,
two dimensional array 3,
:
:
]
I tried doing
arr = np.array(garbage)
That gives me an array, but it is not structured as I described. Its shape comes out as (40336,).
I have to pass the array to an RNN. Do I have to zero-pad all the inner 2 dimensional arrays so that they have the same shape, which would give the outer array the three dimensional shape that I want?
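One way to get a true 3 dimensional array is the zero-padding the question itself suggests. The following is a minimal sketch, not a definitive answer: the helper name pad_to_3d and the three small example arrays are made up for illustration, and it assumes every element of the list is a 2 dimensional array. Depending on the NumPy version, np.array on such ragged input either yields a 1-D object array (as in the question) or raises an error, so it cannot produce the desired shape by itself.

```python
import numpy as np

def pad_to_3d(arrays, fill_value=0.0):
    """Stack 2-D arrays of different shapes into one padded 3-D array.

    Hypothetical helper for illustration, not part of NumPy.
    """
    max_rows = max(a.shape[0] for a in arrays)
    max_cols = max(a.shape[1] for a in arrays)
    out = np.full((len(arrays), max_rows, max_cols), fill_value, dtype=float)
    for i, a in enumerate(arrays):
        # Copy each array into the top-left corner of its padded slot.
        out[i, :a.shape[0], :a.shape[1]] = a
    return out

# Stand-in for the list from the question (made-up shapes).
garbage = [np.ones((2, 3)), np.ones((4, 3)), np.ones((1, 2))]
padded = pad_to_3d(garbage)
print(padded.shape)  # (3, 4, 3)
```

For an RNN, the padded positions usually need to be masked or otherwise ignored; how to do that depends on the framework.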

Related

convert 2D array into 1D array using np.newaxis

I am trying to convert a 2D array with one column into a 1D vector using np.newaxis. The result I get so far is a 3D array instead of a 1D vector or 1D array.
The 2D array y1 is:
y1.shape
(506, 1)
y1
array([[0.42 ],
[0.36666667],
[0.66 ],
[0.63333333],
[0.69333333],
... ])
Now I'd like to convert it into a 1D array
import numpy as np
y2=y1[np.newaxis,:]
y2.shape
(1, 506, 1)
You can see that after using np.newaxis, y2 becomes a 3D array, whereas I am expecting a 1D array of shape (506,).
What is the problem with my code above? Thanks
np.newaxis expands the dimensions, so 2D -> 3D. If you want to reduce the dimensions, 2D -> 1D, use squeeze:
>>> a
array([[0.42 ],
[0.36666667],
[0.66 ],
[0.63333333],
[0.69333333]])
>>> a.shape
(5, 1)
>>> a.squeeze()
array([0.42 , 0.36666667, 0.66 , 0.63333333, 0.69333333])
>>> a.squeeze().shape
(5,)
From the documentation:
Each newaxis object in the selection tuple serves to expand the dimensions of the resulting selection by one unit-length dimension. The added dimension is the position of the newaxis object in the selection tuple.
np.newaxis is used to increase the dimension of the array (it is an alias for None used in indexing, not a function, so it is written without parentheses). It will not decrease the dimension. In order to decrease the dimension, you can use:
reshape()
y1 = np.array(y1).reshape(-1,)
print(y1.shape)
>>> (506,)
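As a quick comparison, the sketch below applies the approaches mentioned above to a small stand-in for y1 (the five values are taken from the question; the real array has 506 rows). Indexing with [:, 0] is an additional option that the answers do not mention.

```python
import numpy as np

# Small stand-in for the (506, 1) array from the question.
y1 = np.array([[0.42], [0.36666667], [0.66], [0.63333333], [0.69333333]])

expanded = y1[np.newaxis, :]    # adds a leading axis
squeezed = y1.squeeze(axis=1)   # drops the length-1 axis
reshaped = y1.reshape(-1)       # flattens to 1-D
column = y1[:, 0]               # selects the only column

print(expanded.shape)  # (1, 5, 1)
print(squeezed.shape)  # (5,)
print(reshaped.shape)  # (5,)
print(column.shape)    # (5,)
```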

Issue with numpy matrix multiplication

I'm trying to multiply two matrices of dimensions (17,2) by transposing one of the matrices
Here is an example matrix p1
p1 = [[ 0.15520622 -0.92034567]
[ 0.43294367 -1.05921439]
[ 0.7569707 -1.15179354]
[ 1.08099772 -1.15179354]
[ 1.35873517 -0.96663524]
[-1.51121847 -0.64260822]
[-1.32606018 -0.87405609]
[-1.00203315 -0.96663524]
[-0.67800613 -0.96663524]
[-0.3539791 -0.87405609]
[ 0.89583942 1.02381648]
[ 0.66439155 1.3478435 ]
[ 0.3866541 1.48671223]
[ 0.15520622 1.5330018 ]
[-0.07624165 1.5330018 ]
[-0.3539791 1.44042265]
[-0.58542698 1.20897478]]
here is another example matrix p2
p2 = [[ 0.20932473 -0.90029958]
[ 0.53753779 -1.03849455]
[ 0.88302521 -1.10759204]
[ 1.24578701 -1.02122018]
[ 1.47035383 -0.77937898]
[-1.46628927 -0.69300713]
[-1.29354556 -0.9521227 ]
[-0.96533251 -1.03849455]
[-0.63711946 -1.00394581]
[-0.3089064 -0.90029958]
[ 0.86575084 1.06897874]
[ 0.55481216 1.37991742]
[ 0.26114785 1.50083802]
[ 0.03658102 1.51811239]
[-0.1879858 1.50083802]
[-0.46437574 1.37991742]
[-0.74076568 1.08625311]]
I'm trying to multiply them using numpy
import numpy
print(p1.T * p2)
But I'm getting the following error
operands could not be broadcast together with shapes (2,17) (17,2)
This is the expected matrix multiplication output
[[11.58117944 2.21072324]
[-0.51754442 22.28728876]]
Where exactly am I going wrong?
Matrix multiplication is done with np.dot(p1.T, p2), because for numpy arrays
A * B means element-wise multiplication.
So you should use np.dot:
p1.T.dot(p2)
Sorry for the vague question. Initially, I was getting the p1 and p2 values from a numpy matrix. I later stored them in a json file as lists for optimization, using the
.tolist()
method, and was reading them back as numpy arrays using the
numpy.array()
method, which turned out to be the cause: for a plain array, * multiplies element-wise. I changed my code to read the data using the
numpy.matrix()
method, for which * is matrix multiplication, and that seems to solve the issue. Hope this helps someone.
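To see the difference between the two operators on plain arrays, here is a small sketch with made-up 3x2 inputs (a and b are placeholders, not the p1 and p2 above). For 2-D arrays the @ operator, available since Python 3.5, gives the same result as np.dot.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # shape (3, 2)
b = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # shape (3, 2)

try:
    a.T * b  # element-wise: shapes (2, 3) and (3, 2) do not broadcast
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

product = a.T @ b  # matrix product, shape (2, 2)
print(broadcast_failed)  # True
print(product.shape)     # (2, 2)
```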

Distance with array of different sizes

I have an array with dimensions as such:
pos = np.array([[ 1.72, 2.56],
[ 0.24, 5.67],
[ -1.24, 5.45],
[ -3.17, -0.23],
[ 1.17, -1.23],
[ 1.12, 1.08]])
and I want to find the distance from each row of the array to a reference point, which would be
ref = np.array([1.22, 1.18])
I would thus have an array with 4 elements as an answer, but I'm really confused about how to approach this with only numpy. I've tried many ways, yet the size of the ref array presents a challenge. Thanks for the help.
The expected answer is an array with 6 elements. The elements are approximately:
[ 1.468, 4.596, 4.928 , 4.611, 2.410, 0.141 ]
Using numpy and assuming Euclidean metric:
import numpy as np
np.linalg.norm(pos - ref, axis=1)
If you need a Python list (instead of numpy array), add .tolist() to the previous line:
np.linalg.norm(pos - ref, axis=1).tolist()
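The one-liner works because broadcasting subtracts ref from every row of pos before the norm is taken along axis 1. A sketch that spells out the same computation by hand, using the arrays from the question:

```python
import numpy as np

pos = np.array([[1.72, 2.56], [0.24, 5.67], [-1.24, 5.45],
                [-3.17, -0.23], [1.17, -1.23], [1.12, 1.08]])
ref = np.array([1.22, 1.18])

diff = pos - ref                           # (6, 2) - (2,) broadcasts to (6, 2)
manual = np.sqrt((diff ** 2).sum(axis=1))  # Euclidean distance per row
via_norm = np.linalg.norm(diff, axis=1)    # same result via np.linalg.norm

print(manual.shape)  # (6,)
```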

Slicing a numpy 3-d matrix into 2-d matrix

I have a 3d numpy matrix t as follows, generated randomly:
t = np.random.rand(2,2,2)
array([[[ 0.80351862, 0.25631294],
[ 0.7971346 , 0.29468456]],
[[ 0.33771957, 0.91776256],
[ 0.6018604 , 0.55290615]]])
I want to extract a 2-d matrix such that the result is sliced along the columns of the 3-d matrix. Something like:
array([[ 0.25631294 , 0.91776256],
[ 0.29468456, 0.55290615]])
How can I slice in such a way?
Thanks for the help.
That's just taking index 1 along the last dim, with a transpose:
>>> t[:,:,1].T
array([[ 0.25631294, 0.91776256],
[ 0.29468456, 0.55290615]])
You could do a combination of a slice, a reshape and a transpose, like so:
t[:, :, 1:].reshape((2, 2)).T
I hope it helps
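To check both answers on an array whose values are easy to follow, the sketch below uses np.arange instead of random numbers (an assumption made only so that the result is predictable).

```python
import numpy as np

t = np.arange(8).reshape(2, 2, 2)

a = t[:, :, 1].T                   # integer index drops the last axis
b = t[:, :, 1:].reshape((2, 2)).T  # slice keeps it as length 1, so a reshape is needed

print(t[:, :, 1:].shape)  # (2, 2, 1)
print(a)
# [[1 5]
#  [3 7]]
```

Both expressions give the same 2-d result; the integer index is shorter because it removes the axis directly.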

Reshaping numpy array from list

I have the following problem with shape of ndarray:
out.shape = (20,)
reference.shape = (20,0)
norm = [out[i] / np.sum(out[i]) for i in range(len(out))]
# norm is a list now so I convert it to ndarray:
norm_array = np.array((norm))
norm_array.shape = (20,30)
# error: operands could not be broadcast together with shapes (20,30) (20,)
diff = np.fabs(norm_array - reference)
How can I change the shape of norm_array from (20,30) to (20,), or of reference to (20,30), so I can subtract them?
EDIT: Can someone explain to me why they have different shapes, if I can access single elements of both with norm_array[0][0] and reference[0][0]?
I am not sure what you are trying to do exactly, but here is some information on numpy arrays.
A 1-d numpy array is a row vector with a shape that is a single-valued tuple:
>>> np.array([1,2,3]).shape
(3,)
You can create multidimensional arrays by passing in nested lists. Each sub-list is a 1-d row vector of length 1, and there are 3 of them.
>>> np.array([[1],[2],[3]]).shape
(3,1)
Here is the weird part. You can create the same array, but leave the lists empty. You end up with 3 row vectors of length 0.
>>> np.array([[],[],[]]).shape
(3,0)
This is what you have for your reference array, an array with structure but no values. This brings me back to my original point:
You can't subtract an empty array.
If I make 2 arrays with the shapes you describe, I get an error
In [1856]: norm_array=np.ones((20,30))
In [1857]: reference=np.ones((20,0))
In [1858]: norm_array-reference
...
ValueError: operands could not be broadcast together with shapes (20,30) (20,0)
But the message is different from yours. If I change the shape of reference, the error messages match.
In [1859]: reference=np.ones((20,))
In [1860]: norm_array-reference
...
ValueError: operands could not be broadcast together with shapes (20,30) (20,)
So your (20,0) is wrong. I don't know if you mistyped something or not.
But if I make reference 2d with 1 in the last dimension, broadcasting works, producing a difference that matches (20,30) in shape:
In [1861]: reference=np.ones((20,1))
In [1862]: norm_array-reference
If reference = np.zeros((20,)), then I could use reference[:,None] to add that singleton last dimension.
If reference is (20,), you can't do reference[0][0]. reference[0][0] only works with 2d arrays with at least 1 in the last dim. reference[0,0] is the preferred way of indexing a single element of a 2d array.
So far this is normal array dimensions and broadcasting; something you'll learn with use.
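Putting the broadcasting fix together, a minimal sketch with placeholder data (the ones and zeros stand in for the real norm_array and reference, which the question does not show):

```python
import numpy as np

norm_array = np.ones((20, 30))  # placeholder values
reference = np.zeros((20,))     # placeholder values

# reference[:, None] has shape (20, 1), which broadcasts against (20, 30)
diff = np.fabs(norm_array - reference[:, None])
print(reference[:, None].shape)  # (20, 1)
print(diff.shape)                # (20, 30)
```

This subtracts one reference value from each row of norm_array; whether that is the intended comparison depends on what reference actually holds.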
===============
I'm puzzled about the shape of out. If it is (20,), how does norm_array end up as (20,30)? out must consist of 20 arrays or lists, each of which has 30 elements.
If out were a 2d array, we could normalize without iteration. Take, for example:
In [1869]: out=np.arange(12).reshape(3,4)
Normalizing row by row with the list comprehension:
In [1872]: [out[i]/np.sum(out[i]) for i in range(out.shape[0])]
Out[1872]:
[array([ 0. , 0.16666667, 0.33333333, 0.5 ]),
array([ 0.18181818, 0.22727273, 0.27272727, 0.31818182]),
array([ 0.21052632, 0.23684211, 0.26315789, 0.28947368])]
In [1873]: np.array(_) # and to array
Out[1873]:
array([[ 0. , 0.16666667, 0.33333333, 0.5 ],
[ 0.18181818, 0.22727273, 0.27272727, 0.31818182],
[ 0.21052632, 0.23684211, 0.26315789, 0.28947368]])
Instead, take the row sums and tell sum to keep the result 2d (keepdims=True) for ease of further use
In [1876]: out.sum(axis=1,keepdims=True)
Out[1876]:
array([[ 6],
[22],
[38]])
now divide
In [1877]: out/out.sum(axis=1,keepdims=True)
Out[1877]:
array([[ 0. , 0.16666667, 0.33333333, 0.5 ],
[ 0.18181818, 0.22727273, 0.27272727, 0.31818182],
[ 0.21052632, 0.23684211, 0.26315789, 0.28947368]])
