Slicing a 3-D array using a 2-D array - python

Assume we have two arrays:
x = np.random.randint(10, size=(2, 3, 3))
idx = np.random.randint(3, size=(2, 3))
I want to pick elements of x using idx, like this:
dim1 = x[0, range(0,3), idx[0]] # slicing x[0] using idx[0]
dim2 = x[1, range(0,3), idx[1]]
res = np.vstack((dim1, dim2))
Is there a neat way to do this?

You can index it directly; the index arrays just have to broadcast against each other to the shape of idx. That's what the .reshape calls are for:
x[np.array([0,1]).reshape(idx.shape[0], -1),
np.array([0,1,2]).reshape(-1,idx.shape[1]),
idx]
Out[29]:
array([[ 0.10786251, 0.2527514 , 0.11305823],
[ 0.67264076, 0.80958292, 0.07703623]])

Here's another way to do it with reshaping -
x.reshape(-1,x.shape[2])[np.arange(idx.size),idx.ravel()].reshape(idx.shape)
Sample run -
In [2]: x
Out[2]:
array([[[5, 0, 9],
[3, 0, 7],
[7, 1, 2]],
[[5, 3, 5],
[8, 6, 1],
[7, 0, 9]]])
In [3]: idx
Out[3]:
array([[2, 1, 2],
[1, 2, 0]])
In [4]: x.reshape(-1,x.shape[2])[np.arange(idx.size),idx.ravel()].reshape(idx.shape)
Out[4]:
array([[9, 0, 2],
[3, 1, 7]])
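A further option, not taken from the answers above and assuming NumPy 1.15 or later, is np.take_along_axis. The sketch below reuses the sample x and idx from the run above and also shows the equivalent broadcast index arrays; the names res, res_fancy, i and j are only illustrative.

```python
import numpy as np

# Sample data copied from the run above.
x = np.array([[[5, 0, 9], [3, 0, 7], [7, 1, 2]],
              [[5, 3, 5], [8, 6, 1], [7, 0, 9]]])
idx = np.array([[2, 1, 2],
                [1, 2, 0]])

# Give idx a trailing axis so it lines up with x's last axis,
# pick one element per (i, j) position, then drop that axis again.
res = np.take_along_axis(x, idx[..., None], axis=2)[..., 0]

# The same selection with explicit index arrays that broadcast to idx.shape.
i = np.arange(x.shape[0])[:, None]   # shape (2, 1)
j = np.arange(x.shape[1])[None, :]   # shape (1, 3)
res_fancy = x[i, j, idx]

print(res)
```

Both forms avoid the intermediate reshape of x, which may read more clearly when x has more than three dimensions.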

Related

How to subtract along axis=0 in python?

I have a matrix mat like below;
mat = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
and a list s = [1, 2, 5].
I have to subtract along axis=1. I did it as follows and it works:
mat - s
array([[ 0, 0, -2],
[ 3, 3, 1],
[ 6, 6, 4]])
However, if I try to subtract along axis=0, i.e.
mat - s[:,None]
I get an error:
TypeError: list indices must be integers or slices, not tuple
Here's a little hack:
s = np.array([1,2,5])
(mat.T - s).T
Output:
array([[0, 1, 2],
[2, 3, 4],
[2, 3, 4]])
Edit: transposing s itself (s.T) does not change anything when s is 1-D, so only mat needs to be transposed.
You were on the right track with the use of [:,None], but your definition of s was wrong.
In [128]: mat = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
In [129]: s = [1, 2, 5]
In [130]: mat - s
Out[130]:
array([[ 0, 0, -2],
[ 3, 3, 1],
[ 6, 6, 4]])
In this subtraction, s is automatically converted to a NumPy array.
Indexing with [:, None] does not work with a list:
In [131]: s[:,None]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-131-bafcfb7b67c1> in <module>
----> 1 s[:,None]
TypeError: list indices must be integers or slices, not tuple
The tuple in this error comes from the comma expression: s[:, None] is the same as s[(slice(None), None)]. Python passes that tuple to the s.__getitem__ method. NumPy arrays handle tuples (multidimensional indexing); lists don't.
If we start with an array, we can apply the reshape and perform the desired subtraction:
In [132]: sa = np.array(s)
In [133]: sa
Out[133]: array([1, 2, 5])
In [134]: sa[:,None]
Out[134]:
array([[1],
[2],
[5]])
In [135]: mat - sa[:,None]
Out[135]:
array([[0, 1, 2],
[2, 3, 4],
[2, 3, 4]])
sa is 1d, so transpose doesn't change anything:
In [136]: sa.T
Out[136]: array([1, 2, 5])
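To put the two cases side by side, here is a minimal sketch that converts the list once with np.asarray and then subtracts along each axis. The names per_column and per_row are only illustrative.

```python
import numpy as np

mat = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
s = np.asarray([1, 2, 5])   # convert the list once, up front

# s has shape (3,): it lines up with the last axis, so s[j] is
# subtracted from every element of column j.
per_column = mat - s

# s[:, None] has shape (3, 1): s[i] is subtracted from every
# element of row i.
per_row = mat - s[:, None]

print(per_column)
print(per_row)
```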

How does "Fancy Indexing with Broadcasting and Boolean Masking" work?

I came across this snippet of code in Jake Vanderplas's Data Science Handbook. The concept of using Broadcasting along with Fancy Indexing here wasn't clear to me. Please explain.
In[5]: X = np.arange(12).reshape((3, 4))
X
Out[5]: array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
In[6]: row = np.array([0, 1, 2])
col = np.array([2, 1, 3])
In[7]: X[row[:, np.newaxis], col]
Out[7]: array([[ 2, 1, 3],
[ 6, 5, 7],
[10, 9, 11]])
It says: "Here, each row value is matched with each column vector, exactly as we saw in broadcasting of arithmetic operations. For example:"
In[8]: row[:, np.newaxis] * col
Out[8]: array([[0, 0, 0],
[2, 1, 3],
[4, 2, 6]])
If you use an integer array to index another array, you basically loop over the given indices, pick the respective elements (which may still be arrays) along the axis you are indexing, and stack them together.
arr55 = np.arange(25).reshape((5, 5))
# array([[ 0, 1, 2, 3, 4],
# [ 5, 6, 7, 8, 9],
# [10, 11, 12, 13, 14],
# [15, 16, 17, 18, 19],
# [20, 21, 22, 23, 24]])
arr53 = arr55[:, [3, 3, 4]]
# pick the elements at (arr55[:, 3], arr55[:, 3], arr55[:, 4])
# array([[ 3, 3, 4],
# [ 8, 8, 9],
# [13, 13, 14],
# [18, 18, 19],
# [23, 23, 24]])
So if you index an (n, m) array with a row index of length k (or a col index of length l), the resulting shape is:
A_nm[row, :] -> A_km
A_nm[:, col] -> A_nl
If, however, you use two arrays row and col to index an array, you loop over both indices simultaneously and stack the elements (which may still be arrays) at the respective positions together.
Here row and col must have the same length.
A_nm[row, col] -> A_k
arr3 = arr55[[0, 2, 4], [3, 3, 4]]
# pick the elements at (arr55[0, 3], arr55[2, 3], arr55[4, 4])
# array([ 3, 13, 24])
Now finally for your question: it is possible to use broadcasting while indexing arrays. Sometimes you do not want only the elements
(arr55[0, 3], arr55[2, 3], arr55[4, 4])
but rather the expanded version:
(arr55[0, [3, 3, 4]], arr55[2, [3, 3, 4]], arr55[4, [3, 3, 4]])
# each row value is matched with each column vector
This matching/broadcasting works exactly as in arithmetic operations.
The book's example may be misleading, though, in that the result of the multiplication shown is not what matters for the indexing.
What matters is the combinations and the resulting shape:
row * col
# performs an element-wise multiplication resulting in 3 numbers
row[:, np.newaxis] * col
# performs a multiplication where each row value is *matched* with each column vector
The example wanted to emphasize this matching of row and col.
We can have a look and play around with the different possibilities:
n = 3
m = 4
X = np.arange(n*m).reshape((n, m))
row = np.array([0, 1, 2]) # k = 3
col = np.array([2, 1, 3]) # l = 3
X[row, :] # A_nm[row, :] -> A_km
# array([[ 0, 1, 2, 3],
# [ 4, 5, 6, 7],
# [ 8, 9, 10, 11]])
X[:, col] # A_nm[:, col] -> A_nl
# array([[ 2, 1, 3],
# [ 6, 5, 7],
# [10, 9, 11]])
X[row, col] # A_nm[row, col] -> A_l == A_k
# array([ 2, 5, 11])
X[row, :][:, col] # A_nm[row, :][:, col] -> A_km[:, col] -> A_kl
# == X[:, col][row, :]
# == X[row[:, np.newaxis], col] # A_nm[row[:, np.newaxis], col] -> A_kl
# array([[ 2, 1, 3],
# [ 6, 5, 7],
# [10, 9, 11]])
X[row, col[:, np.newaxis]]
# == X[row[:, np.newaxis], col].T
# array([[ 2, 6, 10],
# [ 1, 5, 9],
# [ 3, 7, 11]])
I came here looking for an answer to this question, and hpaulj's comment helped me. I'm going to expand on it.
In the following snippet,
import numpy as np
X = np.arange(12).reshape((3, 4))
row = np.array([0, 1, 2])
col = np.array([2, 1, 3])
Y = X[row.reshape(-1, 1), col]
the indices we're passing to X are broadcast against each other.
The code below, which follows the numpy broadcasting rules but uses far more memory, accomplishes the same slicing:
# Make the row and column indices 'conformable'
R = np.repeat(row.reshape(-1, 1), 3, axis=1) # repeat row index across columns
C = np.repeat(col.reshape(1, -1), 3, axis=0) # repeat column index across rows
Y = X[R, C] # Y[i, j] = X[R[i, j], C[i, j]]
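As a rough check of that equivalence, the sketch below compares the broadcast index with two alternatives that are not in the answers above: np.broadcast_arrays, which expands the index arrays without the copies that np.repeat makes, and np.ix_, which builds the (3, 1) and (1, 3) index arrays for you. The names Y_explicit and Y_ix are only illustrative.

```python
import numpy as np

X = np.arange(12).reshape((3, 4))
row = np.array([0, 1, 2])
col = np.array([2, 1, 3])

# Index arrays of shape (3, 1) and (3,) broadcast to (3, 3).
Y = X[row[:, np.newaxis], col]

# Expand both index arrays to the common (3, 3) shape explicitly.
R, C = np.broadcast_arrays(row[:, np.newaxis], col)
Y_explicit = X[R, C]        # Y_explicit[i, j] == X[row[i], col[j]]

# np.ix_ returns a tuple of index arrays shaped (3, 1) and (1, 3).
Y_ix = X[np.ix_(row, col)]

print(Y)
```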

How do I calculate xi^j in a matrix in Numpy

I am trying to build a matrix from an input array.
Given
a = [0,1,2]
in Python, I would like to use NumPy to produce the matrix whose entry at row i and column j is x_i^j.
For example, for the input
a = [0,1,2]
the output should be
[[1,0,0],
[1,1,1],
[1,2,4]]
and I have used the following code
xij = np.matrix([np.power(xi,j) for j in x for xi in x]).reshape(3,3)
[[ 1, 2, 3],
[ 1, 4, 9],
[ 1, 8, 27]]
I assume I'm using the wrong formula for Numpy,
please could you assist me in this to solve the problem.
Thanks in advance
You need to use range(len(a)) for the exponents and put the for loops in the right order (values in the outer loop, exponents in the inner loop):
a = [0,1,2]
xij = np.matrix([np.power(xi,j) for xi in a for j in range(len(a))]).reshape(3,3)
# matrix([[1, 0, 0],
# [1, 1, 1],
# [1, 2, 4]])
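With array broadcasting (note that the exponent varies along the rows here, so the first result is the transpose of the matrix asked for):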
In [823]: np.array([0,1,2])**np.arange(3)[:,None]
Out[823]:
array([[1, 1, 1],
[0, 1, 2],
[0, 1, 4]])
In [825]: np.array([1,2,3])**np.arange(1,4)[:,None]
Out[825]:
array([[ 1, 2, 3],
[ 1, 4, 9],
[ 1, 8, 27]])
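To get x_i^j at row i and column j, as the question asks, the newaxis can go on the values instead of the exponents. The sketch below shows that variant and, as an alternative not mentioned in the answers, np.vander with increasing=True; the names powers, xij and vand are only illustrative.

```python
import numpy as np

a = np.array([0, 1, 2])
powers = np.arange(len(a))      # exponents 0, 1, 2

# a[:, None] has shape (3, 1) and powers has shape (3,), so the
# result has shape (3, 3) with xij[i, j] == a[i] ** powers[j].
xij = a[:, None] ** powers

# np.vander builds the same Vandermonde matrix; increasing=True
# orders the columns as x**0, x**1, x**2.
vand = np.vander(a, increasing=True)

print(xij)
```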

Put numpy arrays split with np.split() back together

I have split a numpy array like so:
x = np.random.randn(10,3)
x_split = np.split(x,5)
which splits x equally into five numpy arrays each with shape (2,3) and puts them in a list. What is the best way to combine a subset of these back together (e.g. x_split[:k] and x_split[k+1:]) so that the resulting shape is similar to the original x i.e. (something,3)?
I found that for k > 0 this is possible if you do:
np.vstack((np.vstack(x_split[:k]),np.vstack(x_split[k+1:])))
but this does not work when k = 0 as x_split[:0] = [] so there must be a better and cleaner way. The error message I get when k = 0 is:
ValueError: need at least one array to concatenate
The comment by Paul Panzer is right on target, but since NumPy now gently discourages vstack, here is the concatenate version:
x = np.random.randn(10, 3)
x_split = np.split(x, 5, axis=0)
k = 0
np.concatenate(x_split[:k] + x_split[k+1:], axis=0)
Note the explicit axis argument passed both times (it has to be the same); this makes it easy to adapt the code to work for other axes if needed. E.g.,
x_split = np.split(x, 3, axis=1)
k = 0
np.concatenate(x_split[:k] + x_split[k+1:], axis=1)
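The concatenate approach can be wrapped in a small function. This is only a sketch: drop_chunk is a hypothetical helper name, and it assumes chunks is a Python list, as returned by np.split.

```python
import numpy as np

def drop_chunk(chunks, k, axis=0):
    """Concatenate every array in the list `chunks` except chunks[k].

    Hypothetical helper; `chunks` must be a list so that `+` joins
    the two slices instead of adding arrays element-wise.
    """
    return np.concatenate(chunks[:k] + chunks[k + 1:], axis=axis)

x = np.random.randn(10, 3)
x_split = np.split(x, 5, axis=0)

first_dropped = drop_chunk(x_split, 0)    # works even though x_split[:0] is []
middle_dropped = drop_chunk(x_split, 2)   # drops rows 4 and 5 of x

print(first_dropped.shape, middle_dropped.shape)
```

Because the two list slices are joined before concatenating, an empty slice on either side is harmless as long as at least one chunk remains.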
np.r_ can turn several slices into a list of indices.
In [20]: np.r_[0:3, 4:5]
Out[20]: array([0, 1, 2, 4])
In [21]: np.vstack([xsp[i] for i in _])
Out[21]:
array([[9, 7, 5],
[6, 4, 3],
[9, 8, 0],
[1, 2, 2],
[3, 3, 0],
[8, 1, 4],
[2, 2, 5],
[4, 4, 5]])
In [22]: np.r_[0:0, 1:5]
Out[22]: array([1, 2, 3, 4])
In [23]: np.vstack([xsp[i] for i in _])
Out[23]:
array([[9, 8, 0],
[1, 2, 2],
[3, 3, 0],
[8, 1, 4],
[3, 2, 0],
[0, 3, 8],
[2, 2, 5],
[4, 4, 5]])
Internally np.r_ has a lot of ifs and loops to handle the slices and their boundaries, but it hides it all from us.
If xsp (your x_split) were an array, we could do xsp[np.r_[...]], but since it is a list we have to iterate. We could also hide that iteration with an operator.itemgetter object.
In [26]: operator.itemgetter(*Out[22])
Out[26]: operator.itemgetter(1, 2, 3, 4)
In [27]: np.vstack(operator.itemgetter(*Out[22])(xsp))
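The session above does not show how xsp was created, so here is a self-contained sketch of the same idea. The data (np.arange(30)) and the names keep, picked and res are assumptions made for illustration only.

```python
import operator
import numpy as np

x = np.arange(30).reshape(10, 3)   # stand-in data, not the values shown above
xsp = np.split(x, 5)               # list of five (2, 3) arrays

k = 0
# Indices of every chunk except chunk k; an empty slice such as 0:0
# simply contributes nothing.
keep = np.r_[0:k, k + 1:len(xsp)]

# itemgetter called with several indices returns a tuple of the selected items.
picked = operator.itemgetter(*keep)(xsp)
res = np.vstack(picked)

print(keep, res.shape)
```

Note that itemgetter called with a single index returns the item itself rather than a 1-tuple, so this form assumes at least two chunks are kept.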

Numpy assignment like 'numpy.take'

Is it possible to assign to a numpy array along the lines of how the take functionality works?
E.g. if I have an array a, a list of indices inds, and a desired axis, I can use take as follows:
import numpy as np
a = np.arange(12).reshape((3, -1))
inds = np.array([1, 2])
print(np.take(a, inds, axis=1))
[[ 1 2]
[ 5 6]
[ 9 10]]
This is extremely useful when the indices / axis needed may change at runtime.
However, numpy does not let you do this:
np.take(a, inds, axis=1) = 0
print(a)
It looks like there is some limited (1-D) support for this via numpy.put, but I was wondering if there was a cleaner way to do this?
In [222]: a = np.arange(12).reshape((3, -1))
...: inds = np.array([1, 2])
...:
In [223]: np.take(a, inds, axis=1)
Out[223]:
array([[ 1, 2],
[ 5, 6],
[ 9, 10]])
In [225]: a[:,inds]
Out[225]:
array([[ 1, 2],
[ 5, 6],
[ 9, 10]])
Construct an indexing tuple:
In [226]: idx=[slice(None)]*a.ndim
In [227]: axis=1
In [228]: idx[axis]=inds
In [229]: a[tuple(idx)]
Out[229]:
array([[ 1, 2],
[ 5, 6],
[ 9, 10]])
In [230]: a[tuple(idx)] = 0
In [231]: a
Out[231]:
array([[ 0, 0, 0, 3],
[ 4, 0, 0, 7],
[ 8, 0, 0, 11]])
Or for a[inds,:]:
In [232]: idx=[slice(None)]*a.ndim
In [233]: idx[0]=inds
In [234]: a[tuple(idx)]
Out[234]:
array([[ 4, 0, 0, 7],
[ 8, 0, 0, 11]])
In [235]: a[tuple(idx)]=1
In [236]: a
Out[236]:
array([[0, 0, 0, 3],
[1, 1, 1, 1],
[1, 1, 1, 1]])
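The indexing-tuple steps above can be collected into one function whose axis is chosen at runtime. This is a minimal sketch; assign_along_axis is a hypothetical name, not a NumPy function.

```python
import numpy as np

def assign_along_axis(a, inds, value, axis=0):
    """Assign `value` to a.take(inds, axis=axis), in place (hypothetical helper)."""
    idx = [slice(None)] * a.ndim   # start with a[:, :, ...]
    idx[axis] = inds               # replace one slot with the index array
    a[tuple(idx)] = value          # must be a tuple, not a list
    return a

a = np.arange(12).reshape((3, -1))
assign_along_axis(a, np.array([1, 2]), 0, axis=1)
print(a)
```

Since idx is an ordinary list, a negative axis such as axis=-1 works here without extra handling.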
PP's suggestion:
def put_at(inds, axis=-1, slc=(slice(None),)):
    return (axis<0)*(Ellipsis,) + axis*slc + (inds,) + (-1-axis)*slc
To be used as in a[put_at(ind_list,axis=axis)]
I've seen both styles in numpy functions. This one looks like the style used in expand_dims; mine was used in apply_along_axis and apply_over_axes.
earlier thoughts
In a recent take question we figured out that it was equivalent to arr.flat[ind] for some raveled index. I'll have to look that up.
There is an np.put that is equivalent to assignment to the flat:
Signature: np.put(a, ind, v, mode='raise')
Docstring:
Replaces specified elements of an array with given values.
The indexing works on the flattened target array. `put` is roughly
equivalent to:
a.flat[ind] = v
Its docs also mention place and putmask (and copyto).
In the question "numpy multidimensional indexing and the function 'take'" I commented that take (without axis) is equivalent to:
lut.flat[np.ravel_multi_index(arr.T, lut.shape)].T
with ravel:
In [257]: a = np.arange(12).reshape((3, -1))
In [258]: IJ=np.ix_(np.arange(a.shape[0]), inds)
In [259]: np.ravel_multi_index(IJ, a.shape)
Out[259]:
array([[ 1, 2],
[ 5, 6],
[ 9, 10]], dtype=int32)
In [260]: np.take(a,np.ravel_multi_index(IJ, a.shape))
Out[260]:
array([[ 1, 2],
[ 5, 6],
[ 9, 10]])
In [261]: a.flat[np.ravel_multi_index(IJ, a.shape)] = 100
In [262]: a
Out[262]:
array([[ 0, 100, 100, 3],
[ 4, 100, 100, 7],
[ 8, 100, 100, 11]])
and to use put:
In [264]: np.put(a, np.ravel_multi_index(IJ, a.shape), np.arange(1,7))
In [265]: a
Out[265]:
array([[ 0, 1, 2, 3],
[ 4, 3, 4, 7],
[ 8, 5, 6, 11]])
Use of ravel is unnecessary in this case but might be useful in others.
I have given an example of using numpy.take in 2 dimensions. Perhaps you can adapt that to your problem.
You can just use indexing in this way:
a[:,[1,2]]=0
