Slicing a numpy 3-d matrix into 2-d matrix - python

I have a 3d numpy matrix t as follows, generated randomly:
t = np.random.rand(2,2,2)
array([[[ 0.80351862, 0.25631294],
[ 0.7971346 , 0.29468456]],
[[ 0.33771957, 0.91776256],
[ 0.6018604 , 0.55290615]]])
I want to extract a 2-d matrix such that the result is sliced along the columns of the 3-d matrix. Something like:
array([[ 0.25631294 , 0.91776256],
[ 0.29468456, 0.55290615]])
How can I slice in such a way?
Thanks for the help.

That's just taking the last dim, with a transpose:
>>> t[:,:,1].T
array([[ 0.25631294, 0.91776256],
[ 0.29468456, 0.55290615]])

You could do a combination of a slice, a reshape and a transpose, like so:
t[:, :, 1:].reshape((2, 2)).T
I hope it helps

Related

Issue with numpy matrix multiplication

I'm trying to multiply two matrices of dimensions (17,2) by transposing one of the matrices
Here is example p1
p1 = [[ 0.15520622 -0.92034567]
[ 0.43294367 -1.05921439]
[ 0.7569707 -1.15179354]
[ 1.08099772 -1.15179354]
[ 1.35873517 -0.96663524]
[-1.51121847 -0.64260822]
[-1.32606018 -0.87405609]
[-1.00203315 -0.96663524]
[-0.67800613 -0.96663524]
[-0.3539791 -0.87405609]
[ 0.89583942 1.02381648]
[ 0.66439155 1.3478435 ]
[ 0.3866541 1.48671223]
[ 0.15520622 1.5330018 ]
[-0.07624165 1.5330018 ]
[-0.3539791 1.44042265]
[-0.58542698 1.20897478]]
here is another example matrix p2
p2 = [[ 0.20932473 -0.90029958]
[ 0.53753779 -1.03849455]
[ 0.88302521 -1.10759204]
[ 1.24578701 -1.02122018]
[ 1.47035383 -0.77937898]
[-1.46628927 -0.69300713]
[-1.29354556 -0.9521227 ]
[-0.96533251 -1.03849455]
[-0.63711946 -1.00394581]
[-0.3089064 -0.90029958]
[ 0.86575084 1.06897874]
[ 0.55481216 1.37991742]
[ 0.26114785 1.50083802]
[ 0.03658102 1.51811239]
[-0.1879858 1.50083802]
[-0.46437574 1.37991742]
[-0.74076568 1.08625311]]
I'm trying to multiply them using numpy
import numpy
print(p1.T * p2)
But I'm getting the following error
operands could not be broadcast together with shapes (2,17) (17,2)
This is the expected matrix multiplication output
[[11.58117944 2.21072324]
[-0.51754442 22.28728876]]
Where exactly am I going wrong
Matrix multiplication is done with np.dot(p1.T,p2), because
A * B means matrix elements-wise multiply.
So you should use np.dot:
p1.T.dot(p2)
Sorry for a vague question. Initially, I was getting p1 and p2 values from numpy matrix. I later stored them in json file as list for optimization by using
.tolist()
method and was reading it back as numpy array using
numpy.array()
method which is apparently wrong..I changed my code to read the numpy array using
numpy.matrix()
method which seems to solve the issue. Hope this helps someone

Avoid using for loop. Python 3

I have an array of shape (3,2):
import numpy as np
arr = np.array([[0.,0.],[0.25,-0.125],[0.5,-0.125]])
I was trying to build a matrix (matrix) of dimensions (6,2), with the results of the outer product of the elements i,i of arr and arr.T. At the moment I am using a for loop such as:
size = np.shape(arr)
matrix = np.zeros((size[0]*size[1],size[1]))
for i in range(np.shape(arr)[0]):
prod = np.outer(arr[i],arr[i].T)
matrix[size[1]*i:size[1]+size[1]*i,:] = prod
Resulting:
matrix =array([[ 0. , 0. ],
[ 0. , 0. ],
[ 0.0625 , -0.03125 ],
[-0.03125 , 0.015625],
[ 0.25 , -0.0625 ],
[-0.0625 , 0.015625]])
Is there any way to build this matrix without using a for loop (e.g. broadcasting)?
Extend arrays to 3D with None/np.newaxis keeping the first axis aligned, while letting the second axis getting pair-wise multiplied, perform multiplication leveraging broadcasting and reshape to 2D -
matrix = (arr[:,None,:]*arr[:,:,None]).reshape(-1,arr.shape[1])
We can also use np.einsum -
matrix = np.einsum('ij,ik->ijk',arr,arr).reshape(-1,arr.shape[1])
einsum string representation might be more intuitive as it lets us visualize three things :
Axes that are aligned (axis=0 here).
Axes that are getting summed up (none here).
Axes that are kept i.e. element-wise multiplied (axis=1 here).

Reshape numpy (n,) vector to (n,1) vector

So it is easier for me to think about vectors as column vectors when I need to do some linear algebra. Thus I prefer shapes like (n,1).
Is there significant memory usage difference between shapes (n,) and (n,1)?
What is preferred way?
And how to reshape (n,) vector into (n,1) vector. Somehow b.reshape((n,1)) doesn't do the trick.
a = np.random.random((10,1))
b = np.ones((10,))
b.reshape((10,1))
print(a)
print(b)
[[ 0.76336295]
[ 0.71643237]
[ 0.37312894]
[ 0.33668241]
[ 0.55551975]
[ 0.20055153]
[ 0.01636735]
[ 0.5724694 ]
[ 0.96887004]
[ 0.58609882]]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
More simpler way with python syntax sugar is to use
b.reshape(-1,1)
where the system will automatically compute the correct shape instead "-1"
ndarray.reshape() returns a new view, or a copy (depends on the new shape). It does not modify the array in place.
b.reshape((10, 1))
as such is effectively no-operation, since the created view/copy is not assigned to anything. The "fix" is simple:
b_new = b.reshape((10, 1))
The amount of memory used should not differ at all between the 2 shapes. Numpy arrays use the concept of strides and so the dimensions (10,) and (10, 1) can both use the same buffer; the amounts to jump to next row and column just change.

Reshaping numpy array from list

I have the following problem with shape of ndarray:
out.shape = (20,)
reference.shape = (20,0)
norm = [out[i] / np.sum(out[i]) for i in range(len(out))]
# norm is a list now so I convert it to ndarray:
norm_array = np.array((norm))
norm_array.shape = (20,30)
# error: operands could not be broadcast together with shapes (20,30) (20,)
diff = np.fabs(norm_array - reference)
How can I change shape of norm_array from (20,30) into (20,) or reference to (20,30), so I can substract them?
EDIT: Can someone explain me, why they have different shape, if I can access both single elements with norm_array[0][0] and reference[0][0] ?
I am not sure what you are trying to do exactly, but here is some information on numpy arrays.
A 1-d numpy array is a row vector with a shape that is a single-valued tuple:
>>> np.array([1,2,3]).shape
(3,)
You can create multidimensional arrays by passing in nested lists. Each sub-list is a 1-d row vector of length 1, and there are 3 of them.
>>> np.array([[1],[2],[3]]).shape
(3,1)
Here is the weird part. You can create the same array, but leave the lists empty. You end up with 3 row vectors of length 0.
>>> np.array([[],[],[]]).shape
(3,0)
This is what you have for you reference array, an array with structure but no values. This brings me back to my original point:
You can't subtract an empty array.
If I make 2 arrays with the shapes you describe, I get an error
In [1856]: norm_array=np.ones((20,30))
In [1857]: reference=np.ones((20,0))
In [1858]: norm_array-reference
...
ValueError: operands could not be broadcast together with shapes (20,30) (20,0)
But it's different from yours. But if I change the shape of reference, the error messages match.
In [1859]: reference=np.ones((20,))
In [1860]: norm_array-reference
...
ValueError: operands could not be broadcast together with shapes (20,30) (20,)
So your (20,0) is wrong. I don't know if you mistyped something or not.
But if I make reference 2d with 1 in the last dimension, broadcasting works, producing a difference that matches (20,30) in shape:
In [1861]: reference=np.ones((20,1))
In [1862]: norm_array-reference
If reference = np.zeros((20,)), then I could use reference[:,None] to add that singleton last dimension.
If reference is (20,), you can't do reference[0][0]. reference[0][0] only works with 2d arrays with at least 1 in the last dim. reference[0,0] is the preferred way of indexing a single element of a 2d array.
So far this is normal array dimensions and broadcasting; something you'll learn with use.
===============
I'm puzzled about the shape of out. If it is (20,), how does norm_array end up as (20,30). out must consist of 20 arrays or lists, each of which has 30 elements.
If out was 2d array, we could normalize without iteration
In [1869]: out=np.arange(12).reshape(3,4)
with the list comprehension:
In [1872]: [out[i]/np.sum(out[i]) for i in range(out.shape[0])]
Out[1872]:
[array([ 0. , 0.16666667, 0.33333333, 0.5 ]),
array([ 0.18181818, 0.22727273, 0.27272727, 0.31818182]),
array([ 0.21052632, 0.23684211, 0.26315789, 0.28947368])]
In [1873]: np.array(_) # and to array
Out[1873]:
array([[ 0. , 0.16666667, 0.33333333, 0.5 ],
[ 0.18181818, 0.22727273, 0.27272727, 0.31818182],
[ 0.21052632, 0.23684211, 0.26315789, 0.28947368]])
Instead take row sums, and tell it to keep it 2d for ease of further use
In [1876]: out.sum(axis=1,keepdims=True)
Out[1876]:
array([[ 6],
[22],
[38]])
now divide
In [1877]: out/out.sum(axis=1,keepdims=True)
Out[1877]:
array([[ 0. , 0.16666667, 0.33333333, 0.5 ],
[ 0.18181818, 0.22727273, 0.27272727, 0.31818182],
[ 0.21052632, 0.23684211, 0.26315789, 0.28947368]])

numpy get std between datasets

I have a dataset array A. A is nĂ—2. It can be plotted on the x and y axis.
A[:,1] gets me all of the Y values ans A[:,0] gets me all the x values.
Now, I have a few other dataset arrays that are similar to A. X values are the same for these similar arrays. How do I calculate the standard deviation of the datasets? There should be a std value for each X. In the end my result std should have a length of n.
I can do this the manual way with loops but I'm not sure how to do this using NumPy in a pythonic and simple manner.
here are some sample data:
A=[[0,2.54],[1,254.5],[2,-43]]
B=[[0,3.34],[1,154.5],[2,-93]]
std_Array=[std(2.54,3.54),std(254.5,154.5),std(-43,-93)]
Suppose your arrays are all the same shape and they are in a list. Then to get the standard deviation of the first column of each you can do
arrays = [np.random.rand(10, 2) for _ in range(8)]
np.dstack(arrays).std(axis=0)[0]
This stacks the 2-D arrays into a 3-D array an then takes the std along the first axis, giving a 2 X 8 (the number of arrays). The first row of the result is the std. devs. of the 8 sets of x-values.
If you post some sample data perhaps we could help more.
Is this pythonic enough?
std_Array = numpy.std((A,B), axis = 0)[:,1]
li_arr = [np.array(x)[: , 1] for x in [A , B]]
This will produce numpy arrays with specifi columns you want to add the result will be
[array([ 2.54, 254.5 , -43. ]), array([ 3.34, 154.5 , -93. ])]
then you stack the values using column_stack
arr = np.column_stack(li_arr)
this will be the result stacking
array([[ 2.54, 3.34],
[ 254.5 , 154.5 ],
[ -43. , -93. ]])
and then finally
np.std(arr , axis = 1)

Categories