Related
rows is a 343x30 matrix of real numbers. I'm trying to append row vectors from rows to true_rows and false_rows, but it only adds the first row and does nothing afterwards. I've tried vstack, and I also tried wrapping example as a 2-D array ([example]), but that crashed my PyCharm. What can I do?
true_rows = []
true_labels = []
false_rows = []
false_labels = []
i = 0
for example in rows:
    if question.match(example):
        true_rows = np.append(true_rows, example, axis=0)
        true_labels.append(labels[i])
    else:
        # false_rows = np.vstack(false_rows, example_t)
        false_rows = np.append(false_rows, example, axis=0)
        false_labels.append(labels[i])
    i += 1
You can use a plain Python list to append your rows, and then transform that list into a NumPy array, for example:
exemple1 = np.array([1,2,3,4,5])
exemple2 = np.array([6,7,8,9,10])
exemple3 = np.array([11,12,13,14,15])
true_rows = []
true_rows.append(exemple1)
true_rows.append(exemple2)
true_rows.append(exemple3)
true_rows = np.array(true_rows)
You will get this result:
true_rows = array([[ 1,  2,  3,  4,  5],
                   [ 6,  7,  8,  9, 10],
                   [11, 12, 13, 14, 15]])
You can also use np.concatenate if you want a one-dimensional array, like this:
true_rows = np.concatenate(true_rows, axis=0)
You will get this result:
true_rows = array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
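Applied to your original loop, a minimal sketch of this list-append approach might look like the following (it assumes rows, labels and question.match behave as in your code):

true_rows, true_labels = [], []
false_rows, false_labels = [], []
for example, label in zip(rows, labels):    # rows is your (343, 30) array
    if question.match(example):             # question.match as in your code
        true_rows.append(example)
        true_labels.append(label)
    else:
        false_rows.append(example)
        false_labels.append(label)
true_rows = np.array(true_rows)     # shape (n_true, 30)
false_rows = np.array(false_rows)   # shape (n_false, 30)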
Your use of [] and np.append suggests you are trying to imitate the common list-append model with arrays. You at least read enough of the np.append docs to know you need to use axis, and that it returns a new array (the docs are quite clear that this is a copy).
But did you test this idea with a small example, and actually look at the results (step by step)?
In [326]: rows = []
In [327]: rows = np.append(rows, np.arange(3), axis=0)
In [328]: rows
Out[328]: array([0., 1., 2.])
In [329]: rows.shape
Out[329]: (3,)
The first append doesn't do anything - the result is the same as arange(3).
In [330]: rows = np.append(rows, np.arange(3), axis=0)
In [331]: rows
Out[331]: array([0., 1., 2., 0., 1., 2.])
In [332]: rows.shape
Out[332]: (6,)
Do you understand why? We joined two 1-d arrays on axis 0, producing another 1-d array.
Using [] as a starting point is the same as starting with this array:
In [333]: np.array([])
Out[333]: array([], dtype=float64)
In [334]: np.array([]).shape
Out[334]: (0,)
And with axis, np.append is just a call to concatenate:
In [335]: np.concatenate(( [], np.arange(3)), axis=0)
Out[335]: array([0., 1., 2.])
np.append sort of looks like list append, but it is not a clone. It's really just a poorly named way to call concatenate, and you can't use it properly without actually understanding dimensions. The np.append docs include an example with an error much like the one you got with concatenate.
Repeated use of these array concatenations in a loop is not a good idea. It's hard to get the dimensions right, as you found. And even when it works, it is slow, since each step makes a copy (which grows with each iteration).
That's why the other answer sticks with list append.
vstack is like concatenate with axis 0, but it makes sure all arguments are 2d. If the number of columns differs, it raises an error:
In [336]: np.vstack(( [],np.arange(3)))
Traceback (most recent call last):
  File "<ipython-input-336-22038d6ef0f7>", line 1, in <module>
    np.vstack(( [],np.arange(3)))
  File "<__array_function__ internals>", line 180, in vstack
  File "/usr/local/lib/python3.8/dist-packages/numpy/core/shape_base.py", line 282, in vstack
    return _nx.concatenate(arrs, 0)
  File "<__array_function__ internals>", line 180, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 0 and the array at index 1 has size 3
In [337]: np.vstack(( [0,0,0],np.arange(3)))
Out[337]:
array([[0, 0, 0],
       [0, 1, 2]])
If all you are joining are rows of a (n,30) array, then you do know the column size of the result.
In [338]: res = np.zeros((0,3))
In [339]: np.vstack(( res, np.arange(3)))
Out[339]: array([[0., 1., 2.]])
If you pay attention to the shape details, it is possible to create an array iteratively.
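For your case, a minimal sketch of that iterative vstack approach (assuming 30 columns, as in your rows array) could be:

true_rows = np.zeros((0, 30))    # start with the right number of columns
false_rows = np.zeros((0, 30))
for example in rows:
    if question.match(example):              # question.match as in your code
        true_rows = np.vstack((true_rows, example))
    else:
        false_rows = np.vstack((false_rows, example))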
But instead of collecting rows one by one, why not create a boolean mask and do the whole collection at once? Roughly:
mask = np.array([question.match(example) for example in rows])
true_rows = rows[mask]
false_rows = rows[~mask]
This still requires a Python-level iteration to build the mask, but overall it should be faster.
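The same mask splits the labels in one step as well; a short sketch, assuming labels has one entry per row of rows:

labels = np.asarray(labels)
true_labels, false_labels = labels[mask], labels[~mask]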
Suppose x = np.array([[30,60,70],[100,20,80]]) and I wish to remove all elements that are < 60. That is, the resulting array should be x = np.array([[60,70],[100,80]]).
I use np.where(x < 60) to find the indices of the elements to remove, and I get indices = (array([0, 1]), array([0, 1])). However, when I try to delete those elements via np.delete(x, indices), I get array([ 70, 100, 20, 80]) rather than what I was hoping for.
What can I do to achieve the desired result?
import numpy as np

x = np.array([[30, 60, 70],
              [100, 20, 80]])
new_x = np.array([(np.delete(i, np.where(i < 60)[0])) for i in x])
print(new_x)
I got it this way, but I don't know if it would be too slow for large arrays:
import numpy as np

d = np.array([
    [30, 60, 70],
    [100, 20, 80]
])
f = lambda x: x >= 60   # keep elements >= 60, i.e. drop everything below 60
a = np.array([a[f(a)] for a in d])
print(a)
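If every row loses the same number of elements (as in this example), a boolean mask plus a single reshape avoids the Python-level loop entirely; a minimal sketch:

import numpy as np

x = np.array([[30, 60, 70],
              [100, 20, 80]])
kept = x[x >= 60]                        # 1-d array of the surviving elements
new_x = kept.reshape(x.shape[0], -1)     # only valid if each row keeps the same count
print(new_x)                             # [[ 60  70]
                                         #  [100  80]]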
To solve a problem that can only be handled element by element, I need to combine NumPy's tuple indexing with an explicit slice.
def f(shape, n):
    """
    :param shape: any shape of an array
    :type shape: tuple
    :type n: int
    """
    x = numpy.zeros( (n,) + shape )
    for i in numpy.ndindex(shape):  # i = (k, l, ...)
        x[:, k, l, ...] = numpy.random.random(n)
x[:, *i] results in a SyntaxError, and x[:, i] is interpreted as numpy.array([ x[:, k] for k in i ]). Unfortunately it's not possible to put the n-dimension last (x = numpy.zeros(shape + (n,)) with x[i] = numpy.random.random(n)) because of the further usage of x.
EDIT: Here is an example, as requested in the comments.
>>> n, shape = 2, (3,4)
>>> x = np.arange(24).reshape((n,)+(3,4))
>>> x
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],
       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
>>> i = (1,2)
>>> x[ ??? ]   # the question: how to express '???' using i of any length
array([ 6, 18])
If I understand the question correctly, you have a multi-dimensional numpy array and want to index it by combining a : slice with some number of other indices from a tuple i.
The index to a numpy array is a tuple, so you can basically just combine those 'partial' indices into one tuple and use that as the index. A naive approach might look like this
x[ (:,) + i ] = numpy.random.random(n) # does not work
but this will give a syntax error. Instead of :, you have to use the slice builtin.
x[ (slice(None),) + i ] = numpy.random.random(n)
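Put back into the function from the question, a minimal sketch might look like this (np.ndindex already yields the index as a tuple, so it can be concatenated with (slice(None),) directly):

import numpy as np

def f(shape, n):
    x = np.zeros((n,) + shape)
    for i in np.ndindex(shape):                   # i is a tuple such as (k, l, ...)
        x[(slice(None),) + i] = np.random.random(n)
    return x

x = f((3, 4), 2)    # every x[:, k, l] is filled with a length-2 random vector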
Suppose array_1 and array_2 are two arrays of matrices of the same sizes. Is there any vectorised way of multiplying them pairwise, element by element (where the multiplication of each pair of elements is well defined)?
The dummy code:
def mat_multiply(array_1, array_2):
    size = np.shape(array_1)[0]
    result = np.array([])
    for i in range(size):
        result = np.append(result, np.dot(array_1[i], array_2[i]), axis=0)
    return np.reshape(result, (size, 2))
example input:
a=[[[1,2],[3,4]],[[1,2],[3,4]]]
b=[[1,3],[4,5]]
output:
[[ 7. 15.]
 [14. 32.]]
Contrary to your first sentence, a and b are not the same size. But let's focus on your example.
So you want this - 2 dot products, one for each row of a and b
np.array([np.dot(x,y) for x,y in zip(a,b)])
or to avoid appending
X = np.zeros((2,2))
for i in range(2):
    X[i,...] = np.dot(a[i], b[i])
the dot product can be expressed with einsum (matrix index notation) as
[np.einsum('ij,j->i',x,y) for x,y in zip(a,b)]
so the next step is to index that first dimension:
np.einsum('kij,kj->ki',a,b)
I'm quite familiar with einsum, but it still took a bit of trial and error to figure out what you want. Now that the problem is clear I can compute it in several other ways
A, B = np.array(a), np.array(b)
np.multiply(A,B[:,np.newaxis,:]).sum(axis=2)
(A*B[:,None,:]).sum(2)
np.dot(A,B.T)[0,...]
np.tensordot(b,a,(-1,-1))[:,0,:]
I find it helpful to work with arrays that have different sizes. For example if A were (2,3,4) and B (2,4), it would be more obvious the dot sum has to be on the last dimension.
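For instance, a quick sketch with those mismatched shapes (the values are arbitrary and only for illustration) makes it explicit that the sum has to run over the last axis of both:

A2 = np.arange(24).reshape(2, 3, 4)     # hypothetical: 2 matrices of shape (3, 4)
B2 = np.arange(8).reshape(2, 4)         # hypothetical: 2 vectors of length 4
np.einsum('kij,kj->ki', A2, B2).shape   # (2, 3)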
Another numpy iteration tool is np.nditer. einsum uses this (in C).
http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html
it = np.nditer([A, B, None], flags=['external_loop'],
               op_axes=[[0,1,2], [0,-1,1], None])
for x, y, w in it:
    # x, y are shape (2,)
    w[...] = np.dot(x, y)
it.operands[2][..., 0]
Avoiding that [...,0] step, requires a more elaborate setup.
C = np.zeros((2,2))
it = np.nditer([A, B, C], flags=['external_loop', 'reduce_ok'],
               op_axes=[[0,1,2], [0,-1,1], [0,1,-1]],
               op_flags=[['readonly'], ['readonly'], ['readwrite']])
for x, y, w in it:
    w[...] = np.dot(x, y)
    # w[...] += x*y
print(C)
# array([[ 7., 15.],[ 14., 32.]])
There's one more option that @hpaulj left out of his extensive and comprehensive list of options:
>>> a = np.array(a)
>>> b = np.array(b)
>>> from numpy.core.umath_tests import matrix_multiply
>>> matrix_multiply.signature
'(m,n),(n,p)->(m,p)'
>>> matrix_multiply(a, b[..., np.newaxis])
array([[[ 7],
        [15]],
       [[14],
        [32]]])
>>> matrix_multiply(a, b[..., np.newaxis]).shape
(2L, 2L, 1L)
>>> np.squeeze(matrix_multiply(a, b[..., np.newaxis]), axis=-1)
array([[ 7, 15],
       [14, 32]])
The nice thing about matrix_multiply is that, it being a gufunc, it will work not only with 1D arrays of matrices, but also with broadcastable arrays. As an example, if instead of multiplying the first matrix with the first vector, and the second matrix with the second vector, you wanted to compute all possible multiplications, you could simply do:
>>> a = np.arange(8).reshape(2, 2, 2) # to have different matrices
>>> np.squeeze(matrix_multiply(a[...,np.newaxis, :, :],
... b[..., np.newaxis]), axis=-1)
array([[[ 3, 11],
        [ 5, 23]],
       [[19, 27],
        [41, 59]]])
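In more recent NumPy versions the same stacked matrix-vector product is available through np.matmul (the @ operator), which broadcasts over the leading dimensions in the same gufunc style; a brief sketch with the original a and b:

a = np.array([[[1, 2], [3, 4]], [[1, 2], [3, 4]]])
b = np.array([[1, 3], [4, 5]])
np.squeeze(a @ b[..., np.newaxis], axis=-1)
# array([[ 7, 15],
#        [14, 32]])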
I'm trying to slice and iterate over a multidimensional array at the same time. I have a solution that's functional, but it's kind of ugly, and I bet there's a slick way to do the iteration and slicing that I don't know about. Here's the code:
import numpy as np
x = np.arange(64).reshape(4,4,4)
y = [x[i:i+2,j:j+2,k:k+2] for i in range(0,4,2)
for j in range(0,4,2)
for k in range(0,4,2)]
y = np.array(y)
z = np.array([np.min(u) for u in y]).reshape(y.shape[1:])
Your last reshape doesn't work while y is still a plain list, because a list has no shape defined. Without it you get:
>>> x = np.arange(64).reshape(4,4,4)
>>> y = [x[i:i+2,j:j+2,k:k+2] for i in range(0,4,2)
... for j in range(0,4,2)
... for k in range(0,4,2)]
>>> z = np.array([np.min(u) for u in y])
>>> z
array([ 0, 2, 8, 10, 32, 34, 40, 42])
But despite that, what you probably want is reshaping your array to 6 dimensions, which gets you the same result as above:
>>> xx = x.reshape(2, 2, 2, 2, 2, 2)
>>> zz = xx.min(axis=-1).min(axis=-2).min(axis=-3)
>>> zz
array([[[ 0,  2],
        [ 8, 10]],
       [[32, 34],
        [40, 42]]])
>>> zz.ravel()
array([ 0, 2, 8, 10, 32, 34, 40, 42])
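Equivalently, on NumPy versions where min accepts a tuple of axes, the three chained calls collapse into one; a small sketch:

>>> x.reshape(2, 2, 2, 2, 2, 2).min(axis=(1, 3, 5)).ravel()
array([ 0,  2,  8, 10, 32, 34, 40, 42])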
It's hard to tell exactly what you want from that last step, but you can use stride_tricks to get a "slicker" way. It's rather tricky.
import numpy.lib.stride_tricks

# This returns a view with custom strides; x2[i,j,k] matches y[4*i+2*j+k]
x2 = numpy.lib.stride_tricks.as_strided(
    x, shape=(2,2,2,2,2,2),
    strides=tuple(numpy.array([32,8,2,16,4,1]) * x.dtype.itemsize))
z2 = x2.min(axis=-1).min(axis=-2).min(axis=-3)
Still, I can't say this is much more readable. (Or efficient, as each min call will make temporaries.)
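A slightly safer variant of the same idea derives the byte strides from x.strides instead of hand-computing them; a sketch:

from numpy.lib.stride_tricks import as_strided

s0, s1, s2 = x.strides               # per-axis byte strides of the original (4,4,4) array
x2 = as_strided(x, shape=(2, 2, 2, 2, 2, 2),
                strides=(2*s0, 2*s1, 2*s2, s0, s1, s2))
z2 = x2.min(axis=-1).min(axis=-2).min(axis=-3)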
Note, my answer differs from Jaime's because I tried to match your elements of y. You can tell if you replace the min with max.