I want to extract a complete slice from a 3D numpy array using ndenumerate or something similar.
arr = np.random.rand(4, 3, 3)
I want to extract all possible arr[:, x, y] where x, y range from 0 to 2
ndindex is a convenient way of generating the indices corresponding to a shape:
In [33]: arr = np.arange(36).reshape(4,3,3)
In [34]: for xy in np.ndindex((3,3)):
    ...:     print(xy, arr[:,xy[0],xy[1]])
    ...:
(0, 0) [ 0 9 18 27]
(0, 1) [ 1 10 19 28]
(0, 2) [ 2 11 20 29]
(1, 0) [ 3 12 21 30]
(1, 1) [ 4 13 22 31]
(1, 2) [ 5 14 23 32]
(2, 0) [ 6 15 24 33]
(2, 1) [ 7 16 25 34]
(2, 2) [ 8 17 26 35]
It uses nditer, but doesn't have any speed advantages over a nested pair of for loops.
In [35]: for x in range(3):
    ...:     for y in range(3):
    ...:         print((x,y), arr[:,x,y])
ndenumerate uses arr.flat as the iterator, but using it as in
In [38]: for xy, _ in np.ndenumerate(arr[0,:,:]):
    ...:     print(xy, arr[:,xy[0],xy[1]])
does the same thing, iterating over the elements of a 3x3 subarray. As with ndindex, it generates the indices. The element it yields is a scalar, not the size-4 array that you want, so I ignored it.
A different approach is to flatten the later axes, transpose, and then just iterate on the (new) first axis:
In [43]: list(arr.reshape(4,-1).T)
Out[43]:
[array([ 0, 9, 18, 27]),
array([ 1, 10, 19, 28]),
array([ 2, 11, 20, 29]),
array([ 3, 12, 21, 30]),
array([ 4, 13, 22, 31]),
array([ 5, 14, 23, 32]),
array([ 6, 15, 24, 33]),
array([ 7, 16, 25, 34]),
array([ 8, 17, 26, 35])]
or with the print as before:
In [45]: for a in arr.reshape(4,-1).T:print(a)
Why not just
[arr[:, x, y] for x in range(3) for y in range(3)]
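As a quick sanity check, the approaches in this thread can be compared side by side. The following is only a minimal sketch; the variable names by_ndindex, by_reshape and by_loops are illustrative and do not come from the answers above.

```python
import numpy as np

arr = np.arange(36).reshape(4, 3, 3)

# Approach 1: generate (x, y) index pairs with ndindex.
by_ndindex = [arr[:, x, y] for x, y in np.ndindex(arr.shape[1:])]

# Approach 2: flatten the trailing axes, transpose, iterate over rows.
by_reshape = list(arr.reshape(arr.shape[0], -1).T)

# Approach 3: plain nested comprehension.
by_loops = [arr[:, x, y] for x in range(3) for y in range(3)]

# All three produce the same nine length-4 slices, in the same order.
for s1, s2, s3 in zip(by_ndindex, by_reshape, by_loops):
    assert np.array_equal(s1, s2) and np.array_equal(s1, s3)

print(by_ndindex[1])  # [ 1 10 19 28]
```

If the slices are only needed as rows of a 2D array, arr.reshape(4, -1).T already holds them all without any Python-level loop.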
I am new to numpy and only recently started learning it. I am working on a practice problem and not getting the result I expect.
The task is to replace all even elements in the array by -1.
import numpy as np
np.random.seed(123)
array6 = np.random.randint(1,50,20)
slicing_array6 = array6[array6 %2==0]
print(slicing_array6)
slicing_array6[:]= -1
print(slicing_array6)
print("Answer is:")
print(array6)
The output I get is:
[46 18 20 34 48 10 48 26 20]
[-1 -1 -1 -1 -1 -1 -1 -1 -1]
Answer is:
[46 3 29 35 39 18 20 43 23 34 33 48 10 33 47 33 48 26 20 15]
My question is: why is the original array not modified?
Thank you in advance for help
In [12]: np.random.seed(123)
...: array6 = np.random.randint(1,50,20)
...: slicing_array6 = array6[array6 %2==0]
In [13]: array6.shape
Out[13]: (20,)
In [14]: slicing_array6.shape
Out[14]: (9,)
slicing_array6 is not a view; it's a copy. It does not use or reference the array6 data:
In [15]: slicing_array6.base
Modifying this copy does not change array6:
In [16]: slicing_array6[:] = -1
In [17]: slicing_array6
Out[17]: array([-1, -1, -1, -1, -1, -1, -1, -1, -1])
In [18]: array6
Out[18]:
array([46, 3, 29, 35, 39, 18, 20, 43, 23, 34, 33, 48, 10, 33, 47, 33, 48,
26, 20, 15])
But if the indexing and modification occurs in the same step:
In [19]: array6[array6 %2==0] = -1
In [20]: array6
Out[20]:
array([-1, 3, 29, 35, 39, -1, -1, 43, 23, -1, 33, -1, -1, 33, 47, 33, -1,
-1, -1, 15])
slicing_array6 = array6[array6 %2==0] has actually done
array6.__getitem__(array6%2==0)
array6[array6 %2==0] = -1 does
array6.__setitem__(array6 %2==0, -1)
A __setitem__ applied to a view does change the original, the base.
An example with view that works:
In [32]: arr = np.arange(10)
In [33]: arr
Out[33]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [34]: x = arr[::3] # basic indexing, with a slice
In [35]: x
Out[35]: array([0, 3, 6, 9])
In [36]: x.base # it's a `view` of arr
Out[36]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [37]: x[:] = -1
In [38]: arr
Out[38]: array([-1, 1, 2, -1, 4, 5, -1, 7, 8, -1])
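Besides inspecting .base, np.shares_memory gives a direct way to check whether an indexing result is a view or a copy. This is a small sketch using a fresh arr; the names view and copy are chosen for illustration only.

```python
import numpy as np

arr = np.arange(10)

view = arr[::3]            # basic indexing with a slice -> view
copy = arr[arr % 2 == 0]   # boolean (advanced) indexing -> copy

print(np.shares_memory(arr, view))  # True
print(np.shares_memory(arr, copy))  # False

view[:] = -1   # writes through to arr
copy[:] = -5   # leaves arr untouched
print(arr)     # [-1  1  2 -1  4  5 -1  7  8 -1]
```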
Explanation
Let's move step by step. We start with
array6 = np.array([46, 3, 29, 35, 39, 18, 20, 43, 23, 34, 33, 48, 10, 33, 47, 33, 48, 26, 20, 15])
print(array6 %2==0)
# array([ True, False, False, False, False, True, True, False, False,
# True, False, True, True, False, False, False, True, True,
# True, False])
You have just built a boolean mask (array6 % 2 == 0) from array6. Then you apply the mask to the array and assign the result to a new variable:
slicing_array6 = array6[array6 %2==0]
print(slicing_array6)
# [46 18 20 34 48 10 48 26 20]
Note that this returns a new array:
print(id(array6))
# 2643833531968
print(id(slicing_array6))
# 2643833588112
print(array6)
# [46 3 29 35 39 18 20 43 23 34 33 48 10 33 47 33 48 26 20 15]
# unchanged !!
As a final step, you set all elements of slicing_array6 to -1, which changes only that new array:
slicing_array6[:]= -1
print(slicing_array6)
# [-1 -1 -1 -1 -1 -1 -1 -1 -1]
Solution
Instead of assigning the masked array to a new variable, assign the new value directly to the original array through the mask:
array6[array6 %2==0] = -1
print(array6)
print(id(array6))
# [-1 3 29 35 39 -1 -1 43 23 -1 33 -1 -1 33 47 33 -1 -1 -1 15]
# 2643833531968
# same id !!
Assigning through an index expression (array[index] = value) modifies the array in place, but the assignment has to target the original array itself, not a separately indexed result.
So as not to spoil the exercise completely, here is an example similar to what you are trying to achieve:
import numpy as np
a = np.arange(10)**3
print(f"start: {a}")
#start: [ 0 1 8 27 64 125 216 343 512 729]
# to assign every 3rd item as -99
a[2::3] = -99
print(f"answer: {a}")
#answer: [ 0 1 -99 27 64 -99 216 343 -99 729]
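If the original array should stay untouched, np.where builds a new array with the replacement applied. This swaps in a different technique from the in-place assignment shown above; it is a minimal sketch, and the names original and replaced are illustrative.

```python
import numpy as np

np.random.seed(123)
array6 = np.random.randint(1, 50, 20)
original = array6.copy()

# Where the mask is True take -1, otherwise keep the existing value.
replaced = np.where(array6 % 2 == 0, -1, array6)

# array6 itself is unchanged; only `replaced` carries the -1 values.
assert np.array_equal(array6, original)
print(replaced)
```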
I want to extract some members from many large numpy arrays. A simple example is
A = np.arange(36).reshape(6,6)
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]])
I want to extract the members in a shifted window in each row, with minimum stride 2 and maximum stride 4. For example, in the first row, I would like to have
[2, 3, 4] # A[i,i+2:i+4+1] where i == 0
In the second row, I want to have
[9, 10, 11] # A[i,i+2:i+4+1] where i == 1
In the third row, I want to have
[16, 17, 0]
The expected result is
[[2, 3, 4],
[9, 10, 11],
[16, 17, 0],
[23, 0, 0]]
I want to know efficient ways to do this. Thanks.
You can extract values from an array by providing a list of indices for each dimension. For example, if you want the second diagonal (the one just above the main diagonal), you can use arr[np.arange(0, len(arr)-1), np.arange(1, len(arr))]
For your example, I would do something like the code below, although I did not account for different strides. To account for strides, change how the index lists are created. If you struggle with adding the stride functionality, comment and I'll edit this answer.
import numpy as np
def slide(arr, window_len = 3, start = 0):
    # pad array to get zeros out of bounds
    arr_padded = np.zeros((arr.shape[0], arr.shape[1]+window_len-1))
    arr_padded[:arr.shape[0], :arr.shape[1]] = arr
    # compute the number of window moves
    repeats = min(arr.shape[0], arr.shape[1]-start)
    # create index lists
    idx0 = np.arange(repeats).repeat(window_len)
    idx1 = np.concatenate(
        [np.arange(start+i, start+i+window_len)
         for i in range(repeats)])
    return arr_padded[idx0, idx1].reshape(-1, window_len)
A = np.arange(36).reshape(6,6)
print(f'A =\n{A}')
print(f'B =\n{slide(A, start = 2)}')
output:
A =
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]
[24 25 26 27 28 29]
[30 31 32 33 34 35]]
B =
[[ 2. 3. 4.]
[ 9. 10. 11.]
[16. 17. 0.]
[23. 0. 0.]]
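The same result can also be built with broadcast index arrays instead of concatenated lists, which also keeps the integer dtype. The sketch below is one possible variant, not the answer's method: the function name shifted_windows and its start/stop parameters (the minimum and maximum offsets from the question) are assumptions made for illustration.

```python
import numpy as np

def shifted_windows(arr, start=2, stop=4):
    """Row i yields arr[i, i+start : i+stop+1], zero-padded past the edge."""
    n_rows, n_cols = arr.shape
    width = stop - start + 1
    # Only rows whose window starts inside the array produce output.
    n_out = min(n_rows, n_cols - start)
    # Pad columns on the right so out-of-bounds reads return 0.
    padded = np.pad(arr, ((0, 0), (0, width)), mode="constant")
    rows = np.arange(n_out)[:, None]          # shape (n_out, 1)
    cols = rows + start + np.arange(width)    # shape (n_out, width)
    return padded[rows, cols]

A = np.arange(36).reshape(6, 6)
print(shifted_windows(A))
# [[ 2  3  4]
#  [ 9 10 11]
#  [16 17  0]
#  [23  0  0]]
```

Here rows and cols broadcast against each other, so each output element is picked by one (row, column) pair without any Python-level loop.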
So I have an array:
array([[[27, 27, 28],
[27, 14, 28]],
[[14, 5, 4],
[ 5, 6, 14]]])
How can I iterate through it and get the [a, b, c] values on each iteration? I tried this:
for v in np.nditer(a):
    print(v)
but it just prints
27
27
28
27
14
28
14
5
4
5
6
I need:
[27 27 28]
[27 14 28]...
b = a.reshape(-1, 3)
for triplet in b:
    ...
Apparently you want to iterate on the first 2 dimensions of the array, returning the 3rd (as 1d array).
In [242]: y = np.array([[[27, 27, 28],
...: [27, 14, 28]],
...:
...: [[14, 5, 4],
...: [ 5, 6, 14]]])
Double loops are fine, as is reshaping to a (4,3) array and iterating.
nditer isn't usually needed, or encouraged as an iteration mechanism (its documentation needs a stronger disclaimer). It's really meant for C level code. It isn't used much in Python level code. One exception is the np.ndindex function, which can be useful in this case:
In [244]: for ij in np.ndindex(y.shape[:2]):
     ...:     print(ij, y[ij])
     ...:
(0, 0) [27 27 28]
(0, 1) [27 14 28]
(1, 0) [14 5 4]
(1, 1) [ 5 6 14]
ndindex uses nditer in multi_index mode on a temp array of the specified shape.
Where possible try to work without iteration. Iteration, with any of these tricks, is relatively slow.
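To illustrate working without iteration, here is a small sketch in which a per-triplet operation (a sum, picked only as an example) is done once with a loop over ndindex and once as a single reduction over the last axis. The names loop_sums and vec_sums are illustrative.

```python
import numpy as np

y = np.array([[[27, 27, 28], [27, 14, 28]],
              [[14, 5, 4], [5, 6, 14]]])

# Loop version: one Python-level iteration per triplet.
loop_sums = np.empty(y.shape[:2], dtype=y.dtype)
for ij in np.ndindex(y.shape[:2]):
    loop_sums[ij] = y[ij].sum()

# Vectorized version: reduce over the last axis in one call.
vec_sums = y.sum(axis=-1)

print(vec_sums)
# [[82 69]
#  [23 25]]
```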
You could do something ugly such as
for i in range(len(your_array)):
    for j in range(len(your_array[i])):
        print(your_array[i][j])
Think of it as having arrays within an array. So within array v you have array a which in turn contains the triplets b
import numpy as np
na = np.array
v=na([[[27, 27, 28], [27, 14, 28]], [[14, 5, 4],[ 5, 6, 14]]])
for a in v:
    for b in a:
        print(b)
Output:
[27 27 28]
[27 14 28]
[14 5 4]
[ 5 6 14]
Alternatively you could do the following,
v2 = [b for a in v for b in a]
Now all your triplets are stored in v2
[array([27, 27, 28]),
array([27, 14, 28]),
array([14, 5, 4]),
array([ 5, 6, 14])]
...and you can access them like a 1D array, e.g.
print(v2[0])
gives
[27 27 28]
Another alternative (useful for arbitrary dimensionality of the array containing the n-tuples):
a_flat = a.ravel()
n = 3
m = len(a_flat) // n  # number of n-tuples
[a_flat[i*n:(i+1)*n] for i in range(m)]
or in one line (slower):
[a.ravel()[i*n:(i+1)*n] for i in range(len(a.ravel()) // n)]
or for further usage within a loop:
for i in range(len(a.ravel()) // n):
    print(a.ravel()[i*n:(i+1)*n])
Reshape the array A (of shape (n1, n2, 3)) into an array B (of shape (n1 * n2, 3)) and iterate through B. Note that, for a contiguous array such as this one, B is just a view of A: A and B share the same data block in memory but have different array headers, which record their shapes, so changing values in B also changes the values in A. The code is below:
a = np.array([[[27, 27, 28],[27, 14, 28]],
              [[14, 5, 4],[ 5, 6, 14]]])
b = a.reshape((-1, 3))
for last_d in b:
    # use x, y, z so the loop does not overwrite the arrays a and b
    x, y, z = last_d
    # do something with last_d or x, y, z
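Whether reshape returns a view depends on the memory layout of its input. The following minimal sketch checks this for the array above and contrasts it with a transposed (non-contiguous) array, where NumPy has to copy; the name c is introduced only for the illustration.

```python
import numpy as np

a = np.array([[[27, 27, 28], [27, 14, 28]],
              [[14, 5, 4], [5, 6, 14]]])

b = a.reshape(-1, 3)
print(b.base is a)             # True: b is a view of a

b[0, 0] = -1                   # writing through the view...
print(a[0, 0, 0])              # -1  ...changes a as well

# Reshaping a transposed array cannot reuse the buffer, so it copies.
c = a.T.reshape(-1, 3)
print(np.shares_memory(a, c))  # False
```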
What is the Pythonic way to get a list of diagonal elements in a matrix passing through entry (i,j)?
For example, given a matrix like:
[1 2 3 4 5]
[6 7 8 9 10]
[11 12 13 14 15]
[16 17 18 19 20]
[21 22 23 24 25]
and an entry, say, (1,3) (representing element 9), how can I get the elements on the diagonals passing through 9 in a Pythonic way? Basically, both [3,9,15] and [5,9,13,17,21].
Using np.diagonal with a little offset logic.
import numpy as np
lst = np.array([[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25]])
i, j = 1, 3
major = np.diagonal(lst, offset=(j - i))
print(major)
# [ 3  9 15]
minor = np.diagonal(np.rot90(lst), offset=-lst.shape[1] + (j + i) + 1)
print(minor)
# [ 5  9 13 17 21]
The indices i and j are the row and column. By specifying the offset, numpy knows where to begin selecting elements for the diagonal.
For the major diagonal, you want to start collecting from 3 in the first row. So you subtract the current row index from the current column index to find the correct column index in the 0th row. The minor diagonal works similarly, except that the array is first rotated by 90° and the same process is applied.
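The offset logic can be wrapped in a small helper. This is a sketch under stated assumptions rather than the answer's exact code: the name diagonals_through is hypothetical, and it uses np.fliplr instead of np.rot90 for the anti-diagonal, which makes the offset (n_cols - 1 - j) - i.

```python
import numpy as np

def diagonals_through(mat, i, j):
    """Return (major, minor) diagonals of a 2D array passing through (i, j)."""
    mat = np.asarray(mat)
    n_cols = mat.shape[1]
    # Major diagonal: constant (column - row) = j - i.
    major = np.diagonal(mat, offset=j - i)
    # After a left-right flip, column j becomes n_cols - 1 - j,
    # so the anti-diagonal turns into an ordinary diagonal.
    minor = np.diagonal(np.fliplr(mat), offset=(n_cols - 1 - j) - i)
    return major, minor

lst = np.arange(1, 26).reshape(5, 5)
major, minor = diagonals_through(lst, 1, 3)
print(major)  # [ 3  9 15]
print(minor)  # [ 5  9 13 17 21]
```

Because np.diagonal accepts non-square inputs, the same helper should also work for (n, m) matrices.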
An alternative method, which ravels the array and works for a square matrix of shape (n, n):
array = np.array([[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25]])
x, y = 1, 3
a_mod = array.ravel()
size = array.shape[0]
if y >= x:
    diag = a_mod[y-x:(x+size-y)*size:size+1]
else:
    diag = a_mod[(x-y)*size::size+1]
if x-(size-1-y) >= 0:
    reverse_diag = array[:, ::-1].ravel()[(x-(size-1-y))*size::size+1]
else:
    reverse_diag = a_mod[x:x*size+1:size-1]
# diag --> [ 3 9 15]
# reverse_diag --> [ 5 9 13 17 21]
The correctness of the resulting arrays should be checked further. This can be extended to handle matrices of other shapes, e.g. (n, m).
In python/numpy, how can I subset a multidimensional array where another one, of the same shape, is maximum along some axis (e.g. the first one)?
Suppose I have two 3*2*4 arrays, a and b. I want to obtain a 2*4 array containing the values of b at the locations where a has its maximal values along the first axis.
import numpy as np
np.random.seed(7)
a = np.random.rand(3*2*4).reshape((3,2,4))
b = np.random.rand(3*2*4).reshape((3,2,4))
print(a)
#[[[ 0.07630829 0.77991879 0.43840923 0.72346518]
# [ 0.97798951 0.53849587 0.50112046 0.07205113]]
#
# [[ 0.26843898 0.4998825 0.67923 0.80373904]
# [ 0.38094113 0.06593635 0.2881456 0.90959353]]
#
# [[ 0.21338535 0.45212396 0.93120602 0.02489923]
# [ 0.60054892 0.9501295 0.23030288 0.54848992]]]
print(a.argmax(axis=0)) #(I would like b at these locations along axis0)
#[[1 0 2 1]
# [0 2 0 1]]
I can do this really ugly manual subsetting:
index = list(zip(a.argmax(axis=0).flatten(),
                 [0]*a.shape[2]+[1]*a.shape[2], # a.shape[2] = 4 here
                 list(range(a.shape[2]))*2))
# [(1, 0, 0), (0, 0, 1), (2, 0, 2), (1, 0, 3),
# (0, 1, 0), (2, 1, 1), (0, 1, 2), (1, 1, 3)]
Which would allow me to obtain my desired result:
b_where_a_is_max_along0 = np.array([b[i] for i in index]).reshape(2,4)
# For verification:
print(a.max(axis=0) == np.array([a[i] for i in index]).reshape(2,4))
#[[ True True True True]
# [ True True True True]]
What is the smart, numpy way to achieve this? Thanks :)
Use advanced-indexing -
m,n = a.shape[1:]
b_out = b[a.argmax(0),np.arange(m)[:,None],np.arange(n)]
Sample run -
Setup input array a and get its argmax along first axis -
In [185]: a = np.random.randint(11,99,(3,2,4))
In [186]: idx = a.argmax(0)
In [187]: idx
Out[187]:
array([[0, 2, 1, 2],
[0, 1, 2, 0]])
In [188]: a
Out[188]:
array([[[49*, 58, 13, 69], # * are the max positions
[94*, 28, 55, 86*]],
[[34, 17, 57*, 50],
[48, 73*, 22, 80]],
[[19, 89*, 42, 71*],
[24, 12, 66*, 82]]])
Verify results with b -
In [193]: b
Out[193]:
array([[[18*, 72, 35, 51], # Mark * at the same positions in b
[74*, 57, 50, 84*]], # and verify
[[58, 92, 53*, 65],
[51, 95*, 43, 94]],
[[85, 23*, 13, 17*],
[17, 64, 35*, 91]]])
In [194]: b[a.argmax(0),np.arange(2)[:,None],np.arange(4)]
Out[194]:
array([[18, 23, 53, 17],
[74, 95, 35, 84]])
You could use ogrid
>>> x = np.random.random((2,3,4))
>>> x
array([[[ 0.87412737, 0.11069105, 0.86951092, 0.74895912],
[ 0.48237622, 0.67502597, 0.11935148, 0.44133397],
[ 0.65169681, 0.21843482, 0.52877862, 0.72662927]],
[[ 0.48979028, 0.97103611, 0.36459645, 0.80723839],
[ 0.90467511, 0.79118429, 0.31371856, 0.99443492],
[ 0.96329039, 0.59534491, 0.15071331, 0.52409446]]])
>>> y = np.argmax(x, axis=1)
>>> y
array([[0, 1, 0, 0],
[2, 0, 0, 1]])
>>> i, j = np.ogrid[:2,:4]
>>> x[i ,y, j]
array([[ 0.87412737, 0.67502597, 0.86951092, 0.74895912],
[ 0.96329039, 0.97103611, 0.36459645, 0.99443492]])
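Another option, assuming NumPy 1.15 or newer, is np.take_along_axis, which pairs directly with argmax. The sketch below applies it to the original question (maximum along the first axis) and checks it against the advanced-indexing answer; the random data and the names idx, picked and expected are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
a = rng.random((3, 2, 4))
b = rng.random((3, 2, 4))

idx = a.argmax(axis=0)  # shape (2, 4)

# take_along_axis needs indices with the same ndim as b, so add a
# length-1 leading axis and drop it again afterwards.
picked = np.take_along_axis(b, idx[None, ...], axis=0)[0]

# Same result as the advanced-indexing approach.
m, n = a.shape[1:]
expected = b[idx, np.arange(m)[:, None], np.arange(n)]
assert np.array_equal(picked, expected)
print(picked.shape)  # (2, 4)
```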