I need some help detecting all positions (coordinates) in a 2D array that satisfy a specific condition.
I have already asked a similar question, but now I have masked specific values that don't interest me.
Last time, someone suggested using zip(*np.where(test2D < 5000.)).
For example:
import numpy as np
test2D = np.array([[ 3051.11, 2984.85, 3059.17],
[ 3510.78, 3442.43, 3520.7 ],
[ 4045.91, 3975.03, 4058.15],
[ 4646.37, 4575.01, 4662.29],
[ 5322.75, 5249.33, 5342.1 ],
[ 6102.73, 6025.72, 6127.86],
[ 6985.96, 6906.81, 7018.22],
[ 7979.81, 7901.04, 8021. ],
[ 9107.18, 9021.98, 9156.44],
[ 10364.26, 10277.02, 10423.1 ],
[ 11776.65, 11682.76, 11843.18]])
So I can get all positions where the value is < 5000:
positions=zip(*np.where(test2D < 5000.))
Now I want to reject some values that are useless to me (this is an array of coordinates):
rejectedvalues = np.array([[0, 0], [2, 2], [3, 1], [10, 2]])
i, j = rejectedvalues.T
mask = np.zeros(test2D.shape, bool)
mask[i,j] = True
m = np.ma.array(test2D, mask=mask)
positions2=zip(*np.where(m < 5000.))
But positions2 gives me the same as positions...
np.ma.where respects the mask -- it does not return indices in the condition (e.g. m < 5000.) that are masked.
In [58]: np.asarray(np.column_stack(np.ma.where(m < 5000.)))
Out[58]:
array([[0, 1],
[0, 2],
[1, 0],
[1, 1],
[1, 2],
[2, 0],
[2, 1],
[3, 0],
[3, 2]])
Compare that with the analogous expression using np.where:
In [57]: np.asarray(np.column_stack(np.where(m < 5000.)))
Out[57]:
array([[0, 0],
[0, 1],
[0, 2],
[1, 0],
[1, 1],
[1, 2],
[2, 0],
[2, 1],
[2, 2],
[3, 0],
[3, 1],
[3, 2]])
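As a minimal sketch of the difference described above, the snippet below builds a shortened version of the question's array (only the first five rows, an assumption made to keep the example small) and collects the positions as a list of tuples. The second variant, which combines the condition with the inverted mask on the plain array, is an alternative not taken from the answer.

```python
import numpy as np

# Shortened copy of the question's data (first five rows only).
test2D = np.array([[3051.11, 2984.85, 3059.17],
                   [3510.78, 3442.43, 3520.7],
                   [4045.91, 3975.03, 4058.15],
                   [4646.37, 4575.01, 4662.29],
                   [5322.75, 5249.33, 5342.1]])

# Coordinates to reject, restricted to the rows kept above.
rejected = np.array([[0, 0], [2, 2], [3, 1]])
mask = np.zeros(test2D.shape, dtype=bool)
mask[rejected[:, 0], rejected[:, 1]] = True
m = np.ma.array(test2D, mask=mask)

# np.ma.where skips masked entries of the condition.
positions_ma = list(zip(*np.ma.where(m < 5000.)))

# Alternative: apply the inverted mask to the condition on the plain array.
positions_plain = list(zip(*np.where((test2D < 5000.) & ~mask)))
```

Both lists contain the nine unmasked positions below 5000, so either form can replace positions2 in the question.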
Here is the problem I am trying to solve:
coord = [[0, 0], [1, 0], [1, 1], [2, 0], [2, 1], [2, 1], [2, 2] ..]
new_arr = [[[0, 0], 1], [[1, 0], 1], [[1, 1], 1], [[2, 0], 1], [[2, 1], 2], [[2, 2], 1] ..]
This is the target I am trying to map to:
[0, 0][0, 1][0, 2]
[1, 0][1, 1][1, 2]
[2, 0][2, 1][2, 2]
The final output would be the counts for each of the coordinates:
1 0 0
1 1 0
1 2 1
------ clarifications --------
The goal is to generate this square of numbers (counts), where each count is the second element of an entry in new_arr. E.g. [[0, 0], 1], [[1, 0], 1] can be interpreted as the value 1 for the coordinate [0, 0] and the value 1 for the coordinate [1, 0].
The first list (coord) is simply a map of the coordinates. The goal is to get the corresponding value (from new_arr) and display it in the form of a square. I hope this clarifies things. The output will be a grid of the format
1 0 0
1 1 0
1 2 1
As for the question about N: I just took a sample value of 3. In the actual use case the user enters an integer, say 6, and the result is a 6 x 6 square. The counts are chess-move computations of the number of ways to reach a specific cell, with only two moves allowed, (i+1, j) and (i+1, j+1), starting from (0, 0).
The logic is not fully clear, but it looks like you want to map the values of new_arr onto the Cartesian product of coordinates:
N = 3 # how this is determined is unclear
d = {tuple(l):x for l, x in new_arr}
# {(0, 0): 1, (1, 0): 1, (1, 1): 1, (2, 0): 1, (2, 1): 2, (2, 2): 1}
out = [d.get((i,j), 0) for i in range(N) for j in range(N)]
# [1, 0, 0, 1, 1, 0, 1, 2, 1]
# 2D variant
out2 = [[d.get((i,j), 0) for j in range(N)] for i in range(N)]
# [[1, 0, 0],
# [1, 1, 0],
# [1, 2, 1]]
Alternative with numpy:
import numpy as np
N = 3
a = np.zeros((N,N), dtype=int)
# get indices and values
idx, val = zip(*new_arr)
# assign values (option 1)
a[tuple(zip(*idx))] = val
# assign values (option 2)
a[tuple(np.array(idx).T.tolist())] = val
print(a)
output:
[[1 0 0]
 [1 1 0]
 [1 2 1]]
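If the counts in new_arr are simply the number of times each pair appears in coord (as the sample data suggests, though the question does not state it outright), the grid can also be built straight from coord. This is a sketch under that assumption; np.add.at is used because plain fancy-index assignment would not accumulate the repeated [2, 1] entry.

```python
import numpy as np

coord = [[0, 0], [1, 0], [1, 1], [2, 0], [2, 1], [2, 1], [2, 2]]
N = 3  # grid size, assumed known as in the answer above

counts = np.zeros((N, N), dtype=int)
rows, cols = np.array(coord).T

# Unbuffered in-place add: repeated (row, col) pairs are each counted.
np.add.at(counts, (rows, cols), 1)

print(counts)
# [[1 0 0]
#  [1 1 0]
#  [1 2 1]]
```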
Use numpy:
import numpy as np
coord = [[0, 0], [1, 0], [1, 1], [2, 0], [2, 1], [2, 1], [2, 2]]
new_arr = [[[0, 0], 1], [[1, 0], 1], [[1, 1], 1], [[2, 0], 1], [[2, 1], 2], [[2, 2], 1]]
# grid size is taken from the last coordinate in coord
result = np.zeros([coord[-1][0] + 1, coord[-1][1] + 1])
for i in new_arr:
    for j in coord:
        # copy the value when the coordinate of the entry matches
        if i[0] == j:
            result[j[0],j[1]]= i[1]
print(result)
Output:
[[1. 0. 0.]
[1. 1. 0.]
[1. 2. 1.]]
I have two ragged arrays, Ii0 and Iv0. I sort the entries of Ii0 by increasing j (the second element of each pair), as Ii01 shows, and I want Iv01 to reflect the same sorting order. In general, I want the code to be able to handle many different shapes of Ii0 and Iv0. The current and desired outputs are shown below.
import numpy as np
Ii0=np.array([[[0, 1],
[0, 3],
[1, 2]],
[[0, 3],
[2,5],
[0, 1]]])
Iv0 = np.array([[[10],
[20],
[30]],
[[100],
[200],
[300]]])
Ii01 = np.array([sorted(i, key = lambda x : x[1]) for i in Ii0])
print("Ii01 =",Ii01)
Iv01 = np.array([sorted(i, key = lambda x : x[0]) for i in Iv0])
print("Iv01 =",Iv01)
The current output is
Ii01 = [array([[[0, 1],
[1, 2],
[0, 3]],
[[0, 1],
[0, 3],
[2, 5]]])]
Iv01 = [array([[[ 10],
[ 20],
[ 30]],
[[100],
[200],
[300]]])]
The expected output is
Ii01 = [array([[[0, 1],
[1, 2],
[0, 3]],
[[0, 1],
[0, 3],
[2, 5]]])]
Iv01 = [array([[[ 10],
[ 30],
[ 20]],
[[300],
[100],
[200]]])]
import numpy as np
Ii0 = np.array([[
[0, 1],
[0, 3],
[1, 2]],
[[0, 3],
[2, 5],
[0, 1]]])
Iv0 = np.array([[[10],
[20],
[30]],
[[100],
[200],
[300]]])
Ii01 = np.array([Ii[np.argsort(Ii[:, 1])] for Ii in Ii0])
print("Ii01 =", Ii01)
Iv01 = np.array([Iv[np.argsort(Ii[:, 1])] for Ii,Iv in zip(Ii0,Iv0)])
print("Iv01 =", Iv01)
We sort the first array with np.argsort, which returns the indices that would sort the array, and then use those indices to reorder the second array. (The second array is not sorted by its own values; its positions are rearranged to follow the sort order of the first array.)
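For arrays that are rectangular, as in the sample data, the same argsort idea can be written without the Python-level loop. This is only a sketch: it assumes every block has the same number of rows, so it would not apply to genuinely ragged input, and the name order is introduced here for illustration.

```python
import numpy as np

Ii0 = np.array([[[0, 1], [0, 3], [1, 2]],
                [[0, 3], [2, 5], [0, 1]]])
Iv0 = np.array([[[10], [20], [30]],
                [[100], [200], [300]]])

# Sort order of the second element (j) within each block, shape (2, 3).
order = np.argsort(Ii0[:, :, 1], axis=1)

# Apply the same row order to both arrays; the trailing axis of the
# index array broadcasts over the last dimension.
Ii01 = np.take_along_axis(Ii0, order[:, :, None], axis=1)
Iv01 = np.take_along_axis(Iv0, order[:, :, None], axis=1)

print(Iv01.tolist())
# [[[10], [30], [20]], [[300], [100], [200]]]
```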
I'm encountering a problem that I hope you can help me solve.
I have a 2D numpy array which I want to divide into bins by value. Then I need to know the exact initial indices of all the numbers in each bin.
For example, consider the matrix
[[1,2,3], [4,5,6], [7,8,9]]
and the bin array
[0,2,4,6,8,10].
Then the first element ([0,0]) should be stored in one bin, the next two elements ([0,1],[0,2]) in another bin, and so on. The desired output looks like this:
[[[0,0]],[[0,1],[0,2]],[[1,0],[1,1]],[[1,2],[2,0]],[[2,1],[2,2]]]
Even though I tried several numpy functions, I'm not able to do this in an elegant way. The best attempt might be
>>> a = [[1,2,3], [4,5,6], [7,8,9]]
>>> bins = [0,2,4,6,8,10]
>>> bin_in_mat = np.digitize(a, bins, right=False)
>>> bin_in_mat
array([[1, 2, 2],
[3, 3, 4],
[4, 5, 5]])
>>> indices = np.argwhere(bin_in_mat)
>>> indices
array([[0, 0],
[0, 1],
[0, 2],
[1, 0],
[1, 1],
[1, 2],
[2, 0],
[2, 1],
[2, 2]])
but this doesn't solve my problem. Any suggestions?
You need to leave numpy and use a loop for this - it's not capable of representing your result:
bin_in_mat = np.digitize(a, bins, right=False)
bin_contents = [np.argwhere(bin_in_mat == i) for i in range(len(bins))]
>>> for b in bin_contents:
...     print(repr(b))
array([], shape=(0, 2), dtype=int64)
array([[0, 0]], dtype=int64)
array([[0, 1],
[0, 2]], dtype=int64)
array([[1, 0],
[1, 1]], dtype=int64)
array([[1, 2],
[2, 0]], dtype=int64)
array([[2, 1],
[2, 2]], dtype=int64)
Note that digitize is a bad choice for large integer input (until NumPy 1.15); the faster and more correct equivalent of digitize(a, bins, right=False) is bin_in_mat = np.searchsorted(bins, a, side='right')
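The following sketch checks, on the question's data, that np.searchsorted with side='right' assigns the same bin numbers as np.digitize with right=False (the equivalence given in the NumPy documentation for increasing bins), and then gathers the indices per bin. Storing the result in a dict keyed by bin number is a choice made here, not part of the answer.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
bins = np.array([0, 2, 4, 6, 8, 10])

by_digitize = np.digitize(a, bins, right=False)
by_searchsorted = np.searchsorted(bins, a, side='right')

# Collect the original indices of every element, grouped by bin number.
# Bin 0 (values below bins[0]) is empty here and is skipped.
bin_contents = {
    i: np.argwhere(by_searchsorted == i).tolist()
    for i in range(1, len(bins))
}

print(bin_contents[2])
# [[0, 1], [0, 2]]
```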
Consider the array a
np.random.seed([3,1415])
a = np.random.randint(0, 10, (10, 2))
a
array([[0, 2],
[7, 3],
[8, 7],
[0, 6],
[8, 6],
[0, 2],
[0, 4],
[9, 7],
[3, 2],
[4, 3]])
What is a vectorized way to get the cumulative argmax?
array([[0, 0], <-- both start off as max position
[1, 1], <-- 7 > 0 so 1st col = 1, 3 > 2 2nd col = 1
[2, 2], <-- 8 > 7 1st col = 2, 7 > 3 2nd col = 2
[2, 2], <-- 0 < 8 1st col stays the same, 6 < 7 2nd col stays the same
[2, 2],
[2, 2],
[2, 2],
[7, 2], <-- 9 is new max of 1st col, argmax is now 7
[7, 2],
[7, 2]])
Here is a non-vectorized way to do it.
Notice that as the window expands, argmax applies to the growing window.
pd.DataFrame(a).expanding().apply(np.argmax).astype(int).values
array([[0, 0],
[1, 1],
[2, 2],
[2, 2],
[2, 2],
[2, 2],
[2, 2],
[7, 2],
[7, 2],
[7, 2]])
Here's a vectorized pure NumPy solution that performs pretty snappily:
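def cumargmax(a):
    # running maximum down each column
    m = np.maximum.accumulate(a)
    # candidate row index for every element
    x = np.repeat(np.arange(a.shape[0])[:, None], a.shape[1], axis=1)
    # zero out the index wherever the running maximum did not increase
    x[1:] *= m[:-1] < m[1:]
    # carry the index of the last increase forward
    np.maximum.accumulate(x, axis=0, out=x)
    return x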
Then we have:
>>> cumargmax(a)
array([[0, 0],
[1, 1],
[2, 2],
[2, 2],
[2, 2],
[2, 2],
[2, 2],
[7, 2],
[7, 2],
[7, 2]])
Some quick testing on arrays with thousands to millions of values suggests that this is anywhere between 10-50 times faster than looping at the Python level (either implicitly or explicitly).
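One way to gain confidence in the vectorized version is to compare it with a straightforward loop that calls np.argmax on each growing prefix. The sketch below repeats cumargmax from above and adds a reference function, cumargmax_loop, which is a name introduced here for illustration. The array is typed in by hand from the question rather than regenerated from the seed.

```python
import numpy as np

def cumargmax(a):
    m = np.maximum.accumulate(a)
    x = np.repeat(np.arange(a.shape[0])[:, None], a.shape[1], axis=1)
    x[1:] *= m[:-1] < m[1:]
    np.maximum.accumulate(x, axis=0, out=x)
    return x

def cumargmax_loop(a):
    # Reference implementation: argmax of every prefix a[:i + 1].
    return np.array([np.argmax(a[:i + 1], axis=0) for i in range(len(a))])

a = np.array([[0, 2], [7, 3], [8, 7], [0, 6], [8, 6],
              [0, 2], [0, 4], [9, 7], [3, 2], [4, 3]])

print(np.array_equal(cumargmax(a), cumargmax_loop(a)))
# True
```

Both versions keep the first occurrence on ties: the strict comparison m[:-1] < m[1:] ignores a repeated maximum, and np.argmax returns the earliest index.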
I can't think of a way to vectorize this over both columns easily, but if the number of columns is small relative to the number of rows, that shouldn't be an issue and a for loop should suffice for that axis:
import numpy as np
import numpy_indexed as npi
a = np.random.randint(0, 10, (10))
max = np.maximum.accumulate(a)
idx = npi.indices(a, max)
print(idx)
I would like to make a function that computes cumulative argmax for 1d array and then apply it to all columns. This is the code:
import numpy as np
np.random.seed([3,1415])
a = np.random.randint(0, 10, (10, 2))
def cumargmax(v):
    uargmax = np.frompyfunc(lambda i, j: j if v[j] > v[i] else i, 2, 1)
    return uargmax.accumulate(np.arange(0, len(v)), 0, dtype=object).astype(v.dtype)
np.apply_along_axis(cumargmax, 0, a)
The reason for converting to the object dtype and then converting back is a workaround for Numpy 1.9, as mentioned in "generalized cumulative functions in NumPy/SciPy?"
I want to plot a 2D figure which is a plane-cut from a 4D array.
for example:
In[1]:
x = [0, 1, 2]
y = [3, 4, 5]
z = [6, 7, 8]
f = [9, 10, 11]
X, Y, Z, F = meshgrid(x, y, z, f) #create 4D grid
Out[1]:
array([[[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]],
[[1, 1, 1],
[1, 1, 1],
[1, 1, 1]],
[[2, 2, 2],
[2, 2, 2],
[2, 2, 2]]],
[[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]],
[[1, 1, 1],
[1, 1, 1],
[1, 1, 1]],
[[2, 2, 2],
[2, 2, 2],
[2, 2, 2]]],
[[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]],
[[1, 1, 1],
[1, 1, 1],
[1, 1, 1]],
[[2, 2, 2],
[2, 2, 2],
[2, 2, 2]]]])
In[2]:
A = X + 1j*Y + Z + 1j* F
Out[2]:
array([[[[ 6.+12.j, 6.+13.j, 6.+14.j],
[ 7.+12.j, 7.+13.j, 7.+14.j],
[ 8.+12.j, 8.+13.j, 8.+14.j]],
[[ 7.+12.j, 7.+13.j, 7.+14.j],
[ 8.+12.j, 8.+13.j, 8.+14.j],
[ 9.+12.j, 9.+13.j, 9.+14.j]],
[[ 8.+12.j, 8.+13.j, 8.+14.j],
[ 9.+12.j, 9.+13.j, 9.+14.j],
[ 10.+12.j, 10.+13.j, 10.+14.j]]],
[[[ 6.+13.j, 6.+14.j, 6.+15.j],
[ 7.+13.j, 7.+14.j, 7.+15.j],
[ 8.+13.j, 8.+14.j, 8.+15.j]],
[[ 7.+13.j, 7.+14.j, 7.+15.j],
[ 8.+13.j, 8.+14.j, 8.+15.j],
[ 9.+13.j, 9.+14.j, 9.+15.j]],
[[ 8.+13.j, 8.+14.j, 8.+15.j],
[ 9.+13.j, 9.+14.j, 9.+15.j],
[ 10.+13.j, 10.+14.j, 10.+15.j]]],
[[[ 6.+14.j, 6.+15.j, 6.+16.j],
[ 7.+14.j, 7.+15.j, 7.+16.j],
[ 8.+14.j, 8.+15.j, 8.+16.j]],
[[ 7.+14.j, 7.+15.j, 7.+16.j],
[ 8.+14.j, 8.+15.j, 8.+16.j],
[ 9.+14.j, 9.+15.j, 9.+16.j]],
[[ 8.+14.j, 8.+15.j, 8.+16.j],
[ 9.+14.j, 9.+15.j, 9.+16.j],
[ 10.+14.j, 10.+15.j, 10.+16.j]]]])
Now the shape of A is
(3, 3, 3, 3)
My question is how to plot a 2D figure from this 4D array for the plane (Y = 0 and F = 0), and whether this is the right way to plot a plane cut from a 4D figure.
A plane cut of shape n x m from a spatial figure on a 3D grid (surf plot) may be plotted using matplotlib.pyplot.imshow. If it is not a surf plot, however, you may use plot, but you have to calculate the contour of your figure from the desired dimensions.
from matplotlib.pyplot import imshow, show
imshow(variable, interpolation='none')
show()
The shape may also be adjusted as follows:
# Given |variable| is an object of type numpy.array:
var_reshaped = variable.reshape(int(variable.size/2), -1)
This reshapes the data into variable.size/2 rows and 2 columns. You should be vigilant when reshaping an array to ensure that the integrity of the shape is not compromised. That is, you must make sure the m x n shape of your reshaped array is identical to the two dimensions you're extracting your plane from (any pair of x, y or z).
Also, you cannot display a plane directly from 4D data. You must reduce it to a 3D array first, and then to a 2D one. For instance, imagine an MRI with 10 exposures (4D: 10 x 50 x 50 x 50). You must first extract one exposure (3D: 50 x 50 x 50), and then display a slice of it (2D: 50 x 50).
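The two-step reduction can be sketched with NumPy indexing on the grid from the question. Two assumptions are made here: indexing='ij' is passed to meshgrid so that the axes are ordered (x, y, z, f), which differs from the default used in the question, and "Y = 0 and F = 0" is read as the first index along y and f. Fixing both indices in a single expression gives the same 2D array.

```python
import numpy as np

x = [0, 1, 2]
y = [3, 4, 5]
z = [6, 7, 8]
f = [9, 10, 11]

# indexing='ij' keeps the axis order (x, y, z, f); this is an assumption,
# the question uses the default 'xy' ordering.
X, Y, Z, F = np.meshgrid(x, y, z, f, indexing='ij')
A = X + 1j * Y + Z + 1j * F

# Step 1: fix the first y index, leaving a 3D array over (x, z, f).
volume = A[:, 0, :, :]
# Step 2: fix the first f index, leaving a 2D plane over (x, z).
plane = volume[:, :, 0]

# The same plane, with both indices fixed at once.
plane_direct = A[:, 0, :, 0]

print(plane.shape)
# (3, 3)
```

Since A is complex, a real-valued view such as plane.real or np.abs(plane) would be needed before passing it to imshow.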