Is it possible to perform min/max in-place assignment with NumPy multi-dimensional arrays without an extra copy?
Say a and b are two 2D NumPy arrays, and I would like a[i,j] = min(a[i,j], b[i,j]) for all i and j.
One way to do this is:
a = numpy.minimum(a, b)
But according to the documentation, numpy.minimum creates and returns a new array:
numpy.minimum(x1, x2[, out])
Element-wise minimum of array elements.
Compare two arrays and returns a new array containing the element-wise minima.
So the code above will create a new temporary array (the element-wise min of a and b), bind the name a to it, and throw away the original array, right?
Is there any way to do something like a.min_with(b) so that the min-result is assigned back to a in-place?
numpy.minimum() takes an optional third argument, which is the output array. You can specify a there to have it modified in place:
In [9]: a = np.array([[1, 2, 3], [2, 2, 2], [3, 2, 1]])

In [10]: b = np.array([[3, 2, 1], [1, 2, 1], [1, 2, 1]])

In [11]: np.minimum(a, b, a)
Out[11]:
array([[1, 2, 1],
       [1, 2, 1],
       [1, 2, 1]])

In [12]: a
Out[12]:
array([[1, 2, 1],
       [1, 2, 1],
       [1, 2, 1]])
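Equivalently, the output array can be passed by keyword, which reads a little more explicitly. A minimal sketch using the same arrays as above:

import numpy as np

a = np.array([[1, 2, 3], [2, 2, 2], [3, 2, 1]])
b = np.array([[3, 2, 1], [1, 2, 1], [1, 2, 1]])

# Write the element-wise minimum straight back into a;
# no separate output array is allocated.
np.minimum(a, b, out=a)
print(a)
# [[1 2 1]
#  [1 2 1]
#  [1 2 1]]

# The max case works the same way: np.maximum(a, b, out=a)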
Say I have two matrices, A and B:
A = np.array([[1, 3, 2],
              [2, 2, 3],
              [3, 1, 1]])

B = np.array([[0, 1, 0],
              [1, 1, 0],
              [1, 1, 1]])
I want to take one column in A and multiply it by each column in B element-wise, then proceed to the next column in A. So, using just one column as an example, I will use A[:,0] (values 1,2,3), and multiply it by each column in B to get this:
array([[0, 1, 0],
       [2, 2, 0],
       [3, 3, 3]])
I've implemented this using np.einsum like so:
np.einsum('i,ij->ij',A[:,0],B)
I then want to generate a 3D matrix with the depth dimension corresponding to the multiplication by each column in A, which I implemented using a for loop:
np.stack([np.einsum('i,ij->ij',A[:,i],B) for i in range(0,A.shape[1])])
This returns my desired array:
array([[[0, 1, 0],
        [2, 2, 0],
        [3, 3, 3]],

       [[0, 3, 0],
        [2, 2, 0],
        [1, 1, 1]],

       [[0, 2, 0],
        [3, 3, 0],
        [1, 1, 1]]])
How would I go about doing this without the loop? Can this be done purely with np.einsum? Is there another function in NumPy that will do this more simply?
Here's a simple way:
A.T[:,:,None]*B
Adding the trailing None to the indexing creates a new axis, which is then used to broadcast the element-wise multiplication.
How about this code?
A.T.reshape(3, 3, 1) * B
Reshaping an ndarray lets you do many things like this...
Keeping with your usage of einsum:
np.einsum('ij,ik->jik', A, B)
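All three suggestions produce the same array as the original loop. A quick self-contained check, using the A and B from the question:

import numpy as np

A = np.array([[1, 3, 2],
              [2, 2, 3],
              [3, 1, 1]])
B = np.array([[0, 1, 0],
              [1, 1, 0],
              [1, 1, 1]])

# Loop-based reference result from the question.
ref = np.stack([np.einsum('i,ij->ij', A[:, i], B) for i in range(A.shape[1])])

# Broadcasting: (3, 3, 1) * (3, 3) broadcasts to (3, 3, 3).
out_broadcast = A.T[:, :, None] * B

# Single einsum call, no explicit loop.
out_einsum = np.einsum('ij,ik->jik', A, B)

assert np.array_equal(ref, out_broadcast)
assert np.array_equal(ref, out_einsum)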
I have a list of arrays, say
List = [A,B,C,D,E,...]
where each A,B,C etc. is an nxn array.
I wish to have the most efficient algorithm to find the unique nxn arrays in the list. That is, say if all entries of A and B are equal, then we discard one of them and generate the list
UniqueList = [A,C,D,E,...]
Not sure if there is a faster way, but this should be pretty fast: use NumPy's built-in np.unique with axis=0 to look for unique nxn arrays (see the NumPy docs for more detail):
[i for i in np.unique(np.array(List),axis=0)]
Example:
A = np.array([[1,1],[1,1]])
B = np.array([[1,1],[1,2]])
List = [A,B,A]
[array([[1, 1],
        [1, 1]]),
 array([[1, 1],
        [1, 2]]),
 array([[1, 1],
        [1, 1]])]
Output:
[array([[1, 1],
        [1, 1]]),
 array([[1, 1],
        [1, 2]])]
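Note that np.unique returns the unique arrays in sorted order. If the original order of List matters, one possible variation (a sketch using return_index, which reports where each unique array first occurs) is:

import numpy as np

A = np.array([[1, 1], [1, 1]])
B = np.array([[1, 1], [1, 2]])
List = [A, B, A]

stacked = np.array(List)
# first_idx holds the position of the first occurrence of each unique array.
_, first_idx = np.unique(stacked, axis=0, return_index=True)

# Re-emit the unique arrays in their original order of appearance.
UniqueList = [stacked[i] for i in sorted(first_idx)]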
How can the rows of an array be sorted without changing the values within each row?
Furthermore, how do I get the indices of this sorting process?
input:
a = np.array([[4,3],[0,3],[3,0],[1,3],[1,2],[2,0]])
required sorting array:
b = np.array([1,4,3,5,2,0])
a = a[b]
output:
a = np.array([[0,3],[1,2],[1,3],[2,0],[3,0],[4,3]])
How do I get the array b ?
You need lexsort here:
b = np.lexsort((a[:, 1], a[:, 0]))
# array([1, 4, 3, 5, 2, 0], dtype=int64)
And applied to your initial array:
>>> a[b]
array([[0, 3],
       [1, 2],
       [1, 3],
       [2, 0],
       [3, 0],
       [4, 3]])
As @miradulo pointed out, you may also use:
b = np.lexsort(np.fliplr(a).T)
Which is less verbose than explicitly stating the columns to sort on.
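As a quick sanity check of the key order (np.lexsort treats the last key as the primary sort key):

import numpy as np

a = np.array([[4, 3], [0, 3], [3, 0], [1, 3], [1, 2], [2, 0]])

# Primary key last: sort by column 0 first, then column 1 to break ties.
b = np.lexsort((a[:, 1], a[:, 0]))
print(b)  # [1 4 3 5 2 0]

# np.fliplr(a).T yields the rows (a[:, 1], a[:, 0]), i.e. the same key order.
assert np.array_equal(b, np.lexsort(np.fliplr(a).T))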
Inspired by this other question, I'm trying to wrap my mind around advanced indexing in NumPy and build up more intuitive understanding of how it works.
I've found an interesting case. Here's an array:
>>> y = np.arange(10)
>>> y
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
if I index it with a scalar, I get a scalar, of course:
>>> y[4]
4
with a 1D array of integers, I get another 1D array:
>>> idx = [4, 3, 2, 1]
>>> y[idx]
array([4, 3, 2, 1])
so if I index it with a 2D array of integers, I get... what do I get?
>>> idx = [[4, 3], [2, 1]]
>>> y[idx]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: too many indices for array
Oh no! The symmetry is broken. I have to index with a 3D array to get a 2D array!
>>> idx = [[[4, 3], [2, 1]]]
>>> y[idx]
array([[4, 3],
       [2, 1]])
What makes numpy behave this way?
To make this more interesting, I noticed that indexing with numpy arrays (instead of lists) behaves how I'd intuitively expect, and 2D gives me 2D:
>>> idx = np.array([[4, 3], [2, 1]])
>>> y[idx]
array([[4, 3],
       [2, 1]])
This looks inconsistent from where I'm at. What's the rule here?
The reason is how lists are interpreted as indices for NumPy arrays: lists are treated like tuples, and indexing with a tuple is interpreted by NumPy as multidimensional indexing.
Just like arr[1, 2] returns the element arr[1][2], arr[[[4, 3], [2, 1]]] is identical to arr[[4, 3], [2, 1]] and will, according to the rules of multidimensional indexing, return the elements arr[4, 2] and arr[3, 1].
By adding one more list you do tell NumPy that you want slicing along the first dimension, because the outermost list is effectively interpreted as if you only passed in one "list of indices for the first dimension": arr[[[[4, 3], [2, 1]]]].
From the documentation:
Example
From each row, a specific element should be selected. The row index is just [0, 1, 2] and the column index specifies the element to choose for the corresponding row, here [0, 1, 0]. Using both together the task can be solved using advanced indexing:
>>> x = np.array([[1, 2], [3, 4], [5, 6]])
>>> x[[0, 1, 2], [0, 1, 0]]
array([1, 4, 5])
and:
Warning
The definition of advanced indexing means that x[(1,2,3),] is fundamentally different than x[(1,2,3)]. The latter is equivalent to x[1,2,3] which will trigger basic selection while the former will trigger advanced indexing. Be sure to understand why this occurs.
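To illustrate that warning on a hypothetical 4x4x4 array (not the 1-D y from the question), the trailing comma alone changes which kind of indexing is triggered:

import numpy as np

x = np.arange(64).reshape(4, 4, 4)

x[(1, 2, 3)]    # basic indexing, same as x[1, 2, 3]: the scalar 27
x[(1, 2, 3), ]  # advanced indexing along the first axis: selects
                # x[1], x[2] and x[3], giving an array of shape (3, 4, 4)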
In such cases it's probably better to use np.take:
>>> y.take([[4, 3], [2, 1]]) # 2D array
array([[4, 3],
       [2, 1]])
This function [np.take] does the same thing as “fancy” indexing (indexing arrays using arrays); however, it can be easier to use if you need elements along a given axis.
Or convert the indices to an array. That way NumPy interprets it (array is special cased!) as fancy indexing instead of as "multidimensional indexing":
>>> y[np.asarray([[4, 3], [2, 1]])]
array([[4, 3],
       [2, 1]])
For example, I have NumPy arrays like this:
a =
array([[1, 2, 3],
       [4, 3, 2]])
and an index like this that selects the max values:
max_idx =
array([[0, 2],
       [1, 0]])
How can I access those positions at the same time, in order to modify them?
Something like a[max_idx] = 0, to get the following:
array([[1, 2, 0],
       [0, 3, 2]])
Simply use subscripted-indexing -
a[max_idx[:,0],max_idx[:,1]] = 0
If you are working with higher dimensional arrays and don't want to type out slices of max_idx for each axis, you can use linear-indexing to assign zeros, like so -
a.ravel()[np.ravel_multi_index(max_idx.T,a.shape)] = 0
Sample run -
In [28]: a
Out[28]:
array([[1, 2, 3],
       [4, 3, 2]])

In [29]: max_idx
Out[29]:
array([[0, 2],
       [1, 0]])

In [30]: a[max_idx[:,0],max_idx[:,1]] = 0

In [31]: a
Out[31]:
array([[1, 2, 0],
       [0, 3, 2]])
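The linear-indexing variant gives the same result on this sample; a quick check:

import numpy as np

a = np.array([[1, 2, 3],
              [4, 3, 2]])
max_idx = np.array([[0, 2],
                    [1, 0]])

# Convert each (row, col) pair into a flat index into a.ravel():
# (0, 2) -> 2 and (1, 0) -> 3 for a 2x3 array.
flat = np.ravel_multi_index(max_idx.T, a.shape)
a.ravel()[flat] = 0
print(a)
# [[1 2 0]
#  [0 3 2]]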
NumPy supports advanced indexing like this:
a[b[:, 0], b[:, 1]] = 0
The code above fits your requirement. If a has more than two dimensions (so b has more than two columns), a better way is:
a[tuple(np.split(b, b.shape[1], axis=1))] = 0
np.split breaks the index array b into one column per axis of a, and wrapping the resulting list in tuple() makes NumPy treat each column as the index for the corresponding axis.
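For completeness, a runnable version of the split-based assignment on the sample data:

import numpy as np

a = np.array([[1, 2, 3],
              [4, 3, 2]])
b = np.array([[0, 2],
              [1, 0]])

# Split b into one (N, 1) column per axis of a, then index with the tuple.
a[tuple(np.split(b, b.shape[1], axis=1))] = 0
print(a)
# [[1 2 0]
#  [0 3 2]]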