I have got two arrays like those below:
first_array = np.array([2,2,2,10,10,15,20,20,20,20])
second_array = np.array([15,5,10,78,2,44,20,2,66,1,10,15,40,85,71,23,20,45,29,20,1])
I want to search for each element of the first array inside the second one to end up with a 2D array which includes the indexes of the second array and the values searched.
The code below works and gives me what I want, but it seems inefficient. There must be an indexing operation or some other approach that avoids searching element by element in a loop.
out = []
for i in first_array:
    index = np.argwhere(second_array == i)
    out.append(np.array([*index.T,np.ones(len(index))*i]))
np.hstack(out).T
Here is the desired output for those who don't want to run the code:
desired_output = np.array([[4,7,4,7,4,7,2,10,2,10,0,11,6,16,19,6,16,19,6,16,19,6,16,19],
[2,2,2,2,2,2,10,10,10,10,15,15,20,20,20,20,20,20,20,20,20,20,20,20]]).T
Thanks in advance!
It is easy to think of two solutions:
Broadcast comparison, although this is still an O(mn) algorithm (m = len(first_array) and n = len(second_array)):
>>> i, j = (first_array[:, None] == second_array).nonzero()
>>> np.array([j, first_array[i]]).T
array([[ 4, 2],
[ 7, 2],
[ 4, 2],
[ 7, 2],
[ 4, 2],
[ 7, 2],
[ 2, 10],
[10, 10],
[ 2, 10],
[10, 10],
[ 0, 15],
[11, 15],
[ 6, 20],
[16, 20],
[19, 20],
[ 6, 20],
[16, 20],
[19, 20],
[ 6, 20],
[16, 20],
[19, 20],
[ 6, 20],
[16, 20],
[19, 20]], dtype=int64)
Use a dictionary to build the mapping between the elements of second_array and their indices. If you don't count the final concatenation step, this is an O(m + n) algorithm. However, in the current example, it performs worse than the first solution:
>>> mp = {}
>>> for i, val in enumerate(second_array):
...     mp.setdefault(val, []).append(i)
...
>>> np.concatenate([(indices := mp[val], [val] * len(indices))
... for val in first_array], -1).T
array([[ 4, 2],
[ 7, 2],
[ 4, 2],
[ 7, 2],
[ 4, 2],
[ 7, 2],
[ 2, 10],
[10, 10],
[ 2, 10],
[10, 10],
[ 0, 15],
[11, 15],
[ 6, 20],
[16, 20],
[19, 20],
[ 6, 20],
[16, 20],
[19, 20],
[ 6, 20],
[16, 20],
[19, 20],
[ 6, 20],
[16, 20],
[19, 20]])
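The dictionary approach can also be wrapped in a small function and checked against the broadcast version. This is only a sketch: the name match_indices is made up for illustration, and values of first_array that do not occur in second_array are simply skipped, which is an assumption about the desired behaviour.

```python
import numpy as np

def match_indices(first, second):
    """Return rows of (index in `second`, value) for every value in `first`."""
    # Map each value of `second` to the list of positions where it occurs.
    positions = {}
    for idx, val in enumerate(second.tolist()):
        positions.setdefault(val, []).append(idx)

    # Emit one (index, value) row per occurrence, in the order of `first`.
    rows = []
    for val in first.tolist():
        for idx in positions.get(val, []):  # assumption: skip missing values
            rows.append((idx, val))
    return np.array(rows, dtype=np.int64).reshape(-1, 2)

first_array = np.array([2, 2, 2, 10, 10, 15, 20, 20, 20, 20])
second_array = np.array([15, 5, 10, 78, 2, 44, 20, 2, 66, 1, 10,
                         15, 40, 85, 71, 23, 20, 45, 29, 20, 1])

result = match_indices(first_array, second_array)

# Cross-check against the broadcast comparison.
i, j = (first_array[:, None] == second_array).nonzero()
broadcast_result = np.array([j, first_array[i]]).T
print(np.array_equal(result, broadcast_result))
```

Both versions produce the rows in the same order, because nonzero() walks the comparison matrix row by row.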
I have an array of 64 numbers arranged as 8x8, created with x = np.arange(64).reshape(8, 8), which I would like to reshape into a 4x4 array of 2x2 sub-arrays.
I.e. this original array should become this.
Simply reshaping the array using x = x.reshape(4,4,2,2) results in this outcome
How do I overcome this?
Thanks
You need to do a little rearranging of the axes to match your expected outcome. In this case you've split each 8-length side into 4 blocks of 2, but then want the blocks of 2 in the final dimensions. So:
np.arange(64).reshape(4,2,4,2).transpose(0,2,1,3)
Out[]:
array([[[[ 0, 1],
[ 8, 9]],
[[ 2, 3],
[10, 11]],
[[ 4, 5],
[12, 13]],
[[ 6, 7],
[14, 15]]],
[[[16, 17],
[24, 25]],
[[18, 19],
[26, 27]],
[[20, 21],
[28, 29]],
[[22, 23],
[30, 31]]],
[[[32, 33],
[40, 41]],
[[34, 35],
[42, 43]],
[[36, 37],
[44, 45]],
[[38, 39],
[46, 47]]],
[[[48, 49],
[56, 57]],
[[50, 51],
[58, 59]],
[[52, 53],
[60, 61]],
[[54, 55],
[62, 63]]]])
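A quick way to convince yourself that the reshape plus transpose really yields the 2x2 tiles is to compare one block with the corresponding slice of the original array, and to undo the operation. The variable names blocks and restored are only illustrative.

```python
import numpy as np

x = np.arange(64).reshape(8, 8)

# Split rows into (4 blocks, 2 rows) and columns into (4 blocks, 2 cols),
# then move the two "within-block" axes to the end.
blocks = x.reshape(4, 2, 4, 2).transpose(0, 2, 1, 3)

# blocks[i, j] is the tile x[2*i:2*i+2, 2*j:2*j+2]
print(np.array_equal(blocks[1, 2], x[2:4, 4:6]))

# The same transpose followed by a reshape restores the 8x8 layout.
restored = blocks.transpose(0, 2, 1, 3).reshape(8, 8)
print(np.array_equal(restored, x))
```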
Another way could be using the sliding_window:
y = np.lib.stride_tricks.sliding_window_view(x, (2,2))[::2, ::2]
y
array([[[[ 0, 1],
[ 8, 9]],
[[ 2, 3],
[10, 11]],
[[ 4, 5],
[12, 13]],
[[ 6, 7],
[14, 15]]],
[[[16, 17],
[24, 25]],
I have a multidimensional array of shape (n,x,y). For this example, take this array:
A = array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17],
[18, 19, 20],
[21, 22, 23]],
[[24, 25, 26],
[27, 28, 29],
[30, 31, 32],
[33, 34, 35]]])
I then have another array, Row_values, holding index values that I want to apply to the original array A. It has shape (z,2), and its values are row indices.
Row_values = array([[0,1],
[0,2],
[1,2],
[1,3]])
I want to apply all the index pairs in Row_values to each of the three arrays in A, so that I end up with a final array of shape (12,2,3):
Result = ([[[0,1,2],
[3,4,5]],
[[0,1,2],
[6,7,8]],
[[3,4,5],
[6,7,8]],
[[3,4,5],
[9,10,11]],
[[12,13,14],
[15,16,17]],
[[12,13,14],
[18,19,20]],
[[15,16,17],
[18,19,20]],
[[15,16,17],
[21,22,23]],
[[24,25,26],
[27,28,29]],
[[24,25,26],
[30,31,32]],
[[27,28,29],
[30,31,32]],
[[27,28,29],
[33,34,35]]])
I have tried using np.take() but haven’t been able to make it work. Not sure if there’s another numpy function that is easier to use
We can take advantage of NumPy's advanced indexing, combined with np.repeat and np.tile:
cidx = np.tile(Row_values, (A.shape[0], 1))
ridx = np.repeat(np.arange(A.shape[0]), Row_values.shape[0])
out = A[ridx[:, None], cidx]
# out.shape -> (12, 2, 3)
Using np.take
np.take(A, Row_values, axis=1).reshape((-1, 2, 3))
# Or
A[:, Row_values].reshape((-1, 2, 3))
Output:
array([[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 0, 1, 2],
[ 6, 7, 8]],
[[ 3, 4, 5],
[ 6, 7, 8]],
[[ 3, 4, 5],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17]],
[[12, 13, 14],
[18, 19, 20]],
[[15, 16, 17],
[18, 19, 20]],
[[15, 16, 17],
[21, 22, 23]],
[[24, 25, 26],
[27, 28, 29]],
[[24, 25, 26],
[30, 31, 32]],
[[27, 28, 29],
[30, 31, 32]],
[[27, 28, 29],
[33, 34, 35]]])
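The two approaches above can be put side by side in one script to check that they agree. The names via_take and via_index are just labels for this sketch, and A is rebuilt with np.arange to match the example array.

```python
import numpy as np

A = np.arange(36).reshape(3, 4, 3)
Row_values = np.array([[0, 1],
                       [0, 2],
                       [1, 2],
                       [1, 3]])

# np.take along axis 1 gives shape (3, 4, 2, 3): for each of the 3 arrays,
# 4 index pairs, each selecting 2 rows of length 3.
via_take = np.take(A, Row_values, axis=1).reshape(-1, 2, 3)

# Explicit advanced indexing: pair every array index with every row pair.
cidx = np.tile(Row_values, (A.shape[0], 1))                       # (12, 2)
ridx = np.repeat(np.arange(A.shape[0]), Row_values.shape[0])      # (12,)
via_index = A[ridx[:, None], cidx]                                # (12, 2, 3)

print(via_take.shape, np.array_equal(via_take, via_index))
```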
I am stuck on how np.argmax(arr, axis=0) works for a 3D array. I know how np.argmax(axis=0) works on 2D arrays, but the 3D case has really confused me.
My Code:
arr = np.array([[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]],
[[13, 14, 15],
[16, 17, 18],
[19, 20, 21],
[22, 23, 24]],
[[25, 26, 27],
[28, 29, 30],
[31, 32, 33],
[34, 35, 36]]])
Operation:
np.argmax(arr, axis = 0)
Output:
array([[2, 2, 2],
[2, 2, 2],
[2, 2, 2],
[2, 2, 2]], dtype=int64)
You need a clearer picture of what axis=0 means here. Think of it as the height level of a stack of rectangles, so your array consists of three levels:
level 0         level 1         level 2
[ 1,  2,  3]    [13, 14, 15]    [25, 26, 27]
[ 4,  5,  6]    [16, 17, 18]    [28, 29, 30]
[ 7,  8,  9]    [19, 20, 21]    [31, 32, 33]
[10, 11, 12]    [22, 23, 24]    [34, 35, 36]
For each cell, argmax then gives the index of the level at which the maximum value is attained. The maximum values are:
[25, 26, 27]
[28, 29, 30]
[31, 32, 33]
[34, 35, 36]
Every one of them sits on the topmost level (number 2),
so the argmax of every cell is 2.
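The "compare the levels cell by cell" reading can be checked with an explicit loop. This sketch rebuilds arr with np.arange, and the names manual and max_values are only illustrative.

```python
import numpy as np

arr = np.arange(1, 37).reshape(3, 4, 3)   # same values as the question's arr

idx = np.argmax(arr, axis=0)              # shape (4, 3): axis 0 is reduced away

# Equivalent explicit version: for each (row, col) cell, look at the three
# values stacked along axis 0 and record which level holds the largest.
manual = np.empty(arr.shape[1:], dtype=np.intp)
for r in range(arr.shape[1]):
    for c in range(arr.shape[2]):
        manual[r, c] = np.argmax(arr[:, r, c])

# Use the indices to pull the maxima back out of the array.
max_values = np.take_along_axis(arr, idx[None], axis=0)[0]

print(np.array_equal(idx, manual))
print(np.array_equal(max_values, arr[2]))
```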
I am working with images through numpy. I want to set a chunk of the image to its average color. I am able to do this, but I have to re-index the array, when I would like to use the original view to do this. In other words, I would like to use that 4th line of code, but I'm stuck with the 3rd one.
I have read a few posts about the as_strided function, but it is confusing to me, and I was hoping there might be a simpler solution. So is there a way to slightly modify that last line of code to do what I want?
box = im[x-dx:x+dx, y-dy:y+dy, :]
avg = block(box) #returns a 1D numpy array with 3 values
im[x-dx:x+dx, y-dy:y+dy, :] = avg[None,None,:] #sets box to average color
#box = avg[None,None,:] #does not affect original array
box = blah
just reassigns the box variable. The array that the box variable previously referred to is unaffected. This is not what you want.
box[:] = blah
is a slice assignment. It modifies the contents of the array. This is what you want.
Note that slice assignment is dependent on the syntactic form of the statement. The fact that box was assigned by box = im[stuff] does not make further assignments to box slice assignments. This is similar to how if you do
l = [1, 2, 3]
b = l[2]
b = 0
the assignment to b doesn't affect l.
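Applied to the image case, the difference looks roughly like this. It is a sketch under assumptions: the question's block function is not shown, so a plain per-channel mean stands in for it, and the array size and the values of x, y, dx, dy are made up.

```python
import numpy as np

im = np.arange(6 * 6 * 3, dtype=float).reshape(6, 6, 3)
x, y, dx, dy = 3, 3, 1, 1                      # hypothetical box centre and half-size

view = im[x - dx:x + dx, y - dy:y + dy, :]     # basic slicing returns a view of im
avg = view.mean(axis=(0, 1))                   # stand-in for block(box): 3 values

view[:] = avg                                  # slice assignment writes through to im
print(np.shares_memory(view, im))
print(im[x - dx:x + dx, y - dy:y + dy, :])

box = view
box = avg                                      # rebinding: im is not touched by this
```

Because avg has shape (3,), broadcasting fills every pixel of the box, so the avg[None,None,:] reshaping is not strictly required.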
Gray-scale Images
This will set a chunk of an array to its average (mean) value:
im[2:4, 2:4] = im[2:4, 2:4].mean()
For example:
In [9]: im
Out[9]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [10]: im[2:4, 2:4] = im[2:4, 2:4].mean()
In [11]: im
Out[11]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 12, 12],
[12, 13, 12, 12]])
Color Images
Suppose that we want to average over each component of color separately:
In [22]: im = np.arange(48).reshape((4,4,3))
In [23]: im
Out[23]:
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17],
[18, 19, 20],
[21, 22, 23]],
[[24, 25, 26],
[27, 28, 29],
[30, 31, 32],
[33, 34, 35]],
[[36, 37, 38],
[39, 40, 41],
[42, 43, 44],
[45, 46, 47]]])
In [24]: im[2:4, 2:4, :] = im[2:4, 2:4, :].mean(axis=0).mean(axis=0)[np.newaxis, np.newaxis, :]
In [25]: im
Out[25]:
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17],
[18, 19, 20],
[21, 22, 23]],
[[24, 25, 26],
[27, 28, 29],
[37, 38, 39],
[37, 38, 39]],
[[36, 37, 38],
[39, 40, 41],
[37, 38, 39],
[37, 38, 39]]])
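As a possible shortening, mean also accepts a tuple of axes, and broadcasting makes the np.newaxis indexing optional. The sketch below uses the same im as above, with patch as an illustrative name. Note that im has an integer dtype, so the fractional part of the averages is dropped on assignment.

```python
import numpy as np

im = np.arange(48).reshape((4, 4, 3))
patch = im[2:4, 2:4, :]

# Averaging over both spatial axes at once matches the chained version.
avg_chained = patch.mean(axis=0).mean(axis=0)
avg_tuple = patch.mean(axis=(0, 1))
print(np.allclose(avg_chained, avg_tuple))

# A shape-(3,) array broadcasts over the (2, 2, 3) block; the float
# averages are truncated because im stores integers.
im[2:4, 2:4, :] = avg_tuple
print(im[2:4, 2:4, :])
```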