I am currently training an LSTM which classifies frames. What I am trying to do is compare two 2D NumPy arrays to measure the accuracy between my prediction and target. I have looked around for non-naive ways to solve this problem using NumPy / SciPy.
I am aware of np.testing.assert_array_equal(x, y), which uses an assertion to report the result. I am looking for a way to solve this with NumPy / SciPy so I can store the result rather than getting an assertion printout:
Arrays are not equal
(mismatch 14.285714285714292%)
x: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
y: array([0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0])
x = np.asarray([[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]])
y = np.asarray([[0, 0, 0], [0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 0, 0], [0, 0, 0]])
try:
    np.testing.assert_array_equal(x, y)
    res = True
except AssertionError as err:
    res = False
    print(err)
I am looking for a way to store the mismatch between these two arrays without resorting to a naive approach (two nested comparison loops):
accuracy = thisFunction(x,y)
I am sure there is something in NumPy that can solve this, but I've had no luck searching for a built-in function.
As hpaulj noted in the comments, you can use numpy.allclose() to check array equality up to some acceptable tolerance (see below or the NumPy docs).
Here is a small illustration with two simple float arrays.
In [7]: arr1 = np.array([1.3, 1.4, 1.5, 3.4])
In [8]: arr2 = np.array([1.299999, 1.4, 1.4999999, 3.3999999999])
In [9]: np.allclose(arr1, arr2)
Out[9]: True
numpy.allclose returns True if all corresponding elements in the arrays are equal within the tolerance values; otherwise it returns False. NumPy's defaults for the relative and absolute tolerances are rtol=1e-05 and atol=1e-08, respectively.
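For illustration, here is a minimal sketch (with arbitrarily chosen values) of how the tolerance arguments change the outcome:

import numpy as np

a = np.array([1.0, 2.0])
b = np.array([1.0, 2.001])

# with the defaults (rtol=1e-05, atol=1e-08) a difference of 1e-3 is too large
print(np.allclose(a, b))             # False
# loosening the absolute tolerance makes the arrays "close"
print(np.allclose(a, b, atol=1e-2))  # True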
Having said that, if you only want to compare int arrays, then you'd be better off with numpy.array_equal(), which is several times faster than numpy.allclose (see the timings below).
In [17]: arr1 = np.random.randint(0, 23045, size=1000)   # int arrays; size chosen arbitrarily for the benchmark
In [18]: arr2 = np.random.randint(0, 23045, size=1000)
In [19]: %timeit np.allclose(arr1, arr2)
22.9 µs ± 471 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [20]: %timeit np.array_equal(arr1, arr2)
3.99 µs ± 68.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
np.array_equal(x, y) is roughly equivalent to (x == y).all(). You can use this to compute the discrepancies:
def array_comp(x, y):
    """
    Return the status of the comparison and the discrepancy.

    For arrays of the same shape, the discrepancy is the ratio of mismatches
    to the total size. For arrays of different shapes or sizes, the
    discrepancy is a message indicating the mismatch.
    """
    if x.shape != y.shape:
        return False, 'shape'
    count = x.size - np.count_nonzero(x == y)
    return count == 0, count / x.size
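Used on the arrays from the question (assuming the x and y defined above), this gives the 14.29% mismatch directly:

status, mismatch = array_comp(x, y)
print(status)             # False
print(mismatch)           # 0.14285714285714285 (3 of 21 elements differ)
accuracy = 1 - mismatch   # 0.8571..., if you want an accuracy score instead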
Given a bool tensor in pytorch, I would like to have a "lockout period" of N values after each True value along each row. More specifically, in the example below, moving from left to right on any given row I would like to ensure that after each True the following N values are all False.
e.g.
N = 3
input = tensor([[0, 0, 0, 0, 1, 1, 0, 1, 0, 1],
                [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]])

# should output
tensor([[0, 0, 0, 0, 1, 0, 0, 0, 0, 1],
        [1, 0, 0, 0, 1, 0, 0, 0, 0, 1]])
I can solve this with a double loop e.g.
for row in input:
    for element in row:
        # if the sum of the previous N entries > 0, set input[row, element] = 0
However, I would like to solve this either (a) without looping at all or (b) with just a single loop (e.g. for column in input). Is there a way to achieve preferably (a), or otherwise (b)? I cannot assume the input tensor will be sparse or have any particular distribution.
Thanks Naga for the (Edit: since deleted) answer, but I need a pytorch solution and I am not sure that solution is O(n) as stated.
I found that the following solution (looping over columns) seems to do the trick.
input = input.to(torch.bool)

for i, col in enumerate(input.t()):
    # zero out the N entries that follow a True in column i (left-to-right lockout)
    input[:, i+1:i+1+N] = torch.mul(~col, input[:, i+1:i+1+N].t()).t()
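For reference, here is a quick self-contained check of this loop against the example in the question (a small sketch; the tensor is renamed inp to avoid shadowing the builtin input):

import torch

N = 3
inp = torch.tensor([[0, 0, 0, 0, 1, 1, 0, 1, 0, 1],
                    [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]]).to(torch.bool)

for i, col in enumerate(inp.t()):
    inp[:, i+1:i+1+N] = torch.mul(~col, inp[:, i+1:i+1+N].t()).t()

print(inp.to(torch.int64))
# tensor([[0, 0, 0, 0, 1, 0, 0, 0, 0, 1],
#         [1, 0, 0, 0, 1, 0, 0, 0, 0, 1]])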
Additionally, based on a quick comparison at the array sizes I'm interested in, it seems to be around 5x faster than the NumPy double-loop version below.
def a():
    input = torch.randint(high=2, size=(200, 100))
    input = input.to(torch.bool)
    N = 10
    for i, col in enumerate(input.t()):
        input[:, i+1:i+1+N] = torch.mul(~col, input[:, i+1:i+1+N].t()).t()

def b():
    N = 10
    a = np.random.randint(0, high=2, size=(200, 100), dtype=int)
    inds = np.where(a == 1)
    for r, c in np.nditer(inds):
        if a[r, c] == 1:
            a[r, c+1:c+N] = 0

%timeit a()
# 100 loops, best of 5: 2.47 ms per loop
%timeit b()
# 100 loops, best of 5: 12.8 ms per loop
Set values for a window of size n of an array based on the current value of another array
Ignore values that the window overrides
Need to be able to change the window size (n) for different runs
This code works but it is very slow.
n = 3

def signal(arr):
    signal = pd.Series(data=0, index=arr.index)
    i = 0
    while i < len(arr) - 1:
        s = arr.iloc[i]
        if s in [-1, 1]:
            j = i + n
            signal.iloc[i:j] = s
            i = i + n
        else:
            i += 1
    return signal
arr = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, -1, -1, 0, 0, 0, 0]
signal = [0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, -1, -1, -1, 0, 0, 0]
Don't make arr a pandas Series object; use a plain NumPy array instead.
Try this:
import numpy as np

def signal(arr, n):
    size = len(arr)
    out = np.zeros(size)
    i = 0
    while i < size:
        s = arr[i]
        if s in (-1, 1):
            out[i:i + n] = s
            i += n      # jump past the window that was just filled
        else:
            i += 1
    return out

arr = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, -1, -1, 0, 0, 0, 0]
n = 3
signal(arr, n)
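For the example above this reproduces the expected signal from the question (a quick check, assuming arr, n and the signal function as defined here):

expected = np.array([0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, -1, -1, -1, 0, 0, 0])
print(np.array_equal(signal(arr, n), expected))   # True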
I benchmarked the two solutions and the new one is much faster:
Original: 738 µs ± 21.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
New: 9.56 µs ± 778 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
I want to avoid using for loops in the following code, for performance reasons. Is vectorization suitable for this kind of problem?
a = np.array([[0, 1, 2, 3, 4],
              [5, 6, 7, 8, 9],
              [0, 1, 2, 3, 4],
              [5, 6, 7, 8, 9],
              [0, 1, 2, 3, 4]], dtype=np.float32)
temp_a = np.copy(a)
for i in range(1, a.shape[0]-1):
    for j in range(1, a.shape[1]-1):
        if a[i, j] > 3:
            temp_a[i+1, j] += a[i, j] / 5.
            temp_a[i-1, j] += a[i, j] / 5.
            temp_a[i, j+1] += a[i, j] / 5.
            temp_a[i, j-1] += a[i, j] / 5.
            temp_a[i, j] -= a[i, j] * 4. / 5.
a = np.copy(temp_a)
You are basically doing convolution, with some special treatment for borders.
Try the following:
from scipy.signal import convolve2d

# define your filter
f = np.array([[0.0,  0.2, 0.0],
              [0.2, -0.8, 0.2],
              [0.0,  0.2, 0.0]])

# select parts of 'a' to be used for convolution
b = (a * (a > 3))[1:-1, 1:-1]

# convolve, padding with zeros ('same' mode)
c = convolve2d(b, f, mode='same')

# add the convolved result to 'a', excluding borders
a[1:-1, 1:-1] += c

# treat the special cases of the borders
a[0, 1:-1] += .2 * b[0, :]
a[-1, 1:-1] += .2 * b[-1, :]
a[1:-1, 0] += .2 * b[:, 0]
a[1:-1, -1] += .2 * b[:, -1]
It gives the following result, which is the same as your nested loops.
[[ 0. 2.2 3.4 4.6 4. ]
[ 6.2 2.6 4.2 3. 10.6]
[ 0. 3.4 4.8 6.2 4. ]
[ 6.2 2.6 4.2 3. 10.6]
[ 0. 2.2 3.4 4.6 4. ]]
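As a quick sanity check (assuming the snippet above has just been run on the example a from the question), the in-place result can be compared against that matrix:

expected = np.array([[0. ,  2.2, 3.4, 4.6,  4. ],
                     [6.2,  2.6, 4.2, 3. , 10.6],
                     [0. ,  3.4, 4.8, 6.2,  4. ],
                     [6.2,  2.6, 4.2, 3. , 10.6],
                     [0. ,  2.2, 3.4, 4.6,  4. ]])
print(np.allclose(a, expected))   # True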
My attempt uses 3 filters, np.rot90, np.where, np.multiply, and a sum over the matched filters. I am not sure which way of benchmarking is more reasonable; if you do not take the time to create the filters into account, it is roughly 4 times faster.
# Each filter basically does what `op` tries to achieve in a loop
filter1 = np.array([[0,  1,  0, 0, 0],
                    [1, -4,  1, 0, 0],
                    [0,  1,  0, 0, 0],
                    [0,  0,  0, 0, 0],
                    [0,  0,  0, 0, 0]]) / 5.

filter2 = np.array([[0, 0,  1,  0, 0],
                    [0, 1, -4,  1, 0],
                    [0, 0,  1,  0, 0],
                    [0, 0,  0,  0, 0],
                    [0, 0,  0,  0, 0]]) / 5.

filter3 = np.array([[0, 0,  0,  0, 0],
                    [0, 0,  1,  0, 0],
                    [0, 1, -4,  1, 0],
                    [0, 0,  1,  0, 0],
                    [0, 0,  0,  0, 0]]) / 5.

# only loop over the center of the matrix `a`
center = np.array([[0, 0, 0, 0, 0],
                   [0, 1, 1, 1, 0],
                   [0, 1, 1, 1, 0],
                   [0, 1, 1, 1, 0],
                   [0, 0, 0, 0, 0]])
filter1 and filter2 can each be rotated to represent 4 filters.
filter1_90_rot = np.rot90(filter1, k=1)
filter1_180_rot = np.rot90(filter1, k=2)
filter1_270_rot = np.rot90(filter1, k=3)

filter2_90_rot = np.rot90(filter2, k=1)
filter2_180_rot = np.rot90(filter2, k=2)
filter2_270_rot = np.rot90(filter2, k=3)

# map an index in `a` to its corresponding filter
filter_dict = {
    (1, 1): filter1,
    (3, 1): filter1_90_rot,
    (3, 3): filter1_180_rot,
    (1, 3): filter1_270_rot,
    (1, 2): filter2,
    (2, 1): filter2_90_rot,
    (3, 2): filter2_180_rot,
    (2, 3): filter2_270_rot,
    (2, 2): filter3,
}
Main function
def get_new_a(a):
    # find the (row, col) pairs in the center that satisfy the condition
    x, y = np.where(((a > 3) * center) > 0)
    # add the scaled filter for every matching position
    return a + sum(np.multiply(filter_dict[i, j], a[i, j])
                   for (i, j) in zip(x, y))
Note: there seem to be small numerical errors, so np.equal() returns False for some entries of my result vs. the OP's, while np.isclose() returns True everywhere.
Timing results
def op():
    temp_a = np.copy(a)
    for i in range(1, a.shape[0]-1):
        for j in range(1, a.shape[1]-1):
            if a[i, j] > 3:
                temp_a[i+1, j] += a[i, j] / 5.
                temp_a[i-1, j] += a[i, j] / 5.
                temp_a[i, j+1] += a[i, j] / 5.
                temp_a[i, j-1] += a[i, j] / 5.
                temp_a[i, j] -= a[i, j] * 4. / 5.
    a2 = np.copy(temp_a)

%timeit op()
167 µs ± 2.72 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit get_new_a(a)
37.2 µs ± 2.68 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Note again that we ignore the time to create the filters, as I think it would be a one-time cost. If you do include the time to create the filters, it is roughly two times faster. You might think the comparison is unfair because op's method contains two np.copy calls, but I think the bottleneck of op's method is the nested for loop.
Reference:
np.multiply does element-wise multiplication of two matrices.
np.rot90 rotates a matrix for us; the parameter k decides how many 90-degree rotations to apply.
np.isclose checks whether two matrices are close within an error tolerance that you can define.
I came up with this solution:
a = np.array([[0, 0, 0, 0, 0],
              [0, 6, 2, 8, 0],
              [0, 1, 5, 3, 0],
              [0, 6, 7, 8, 0],
              [0, 0, 0, 0, 0]], dtype=np.float32)

def new_a(a):
    c = np.copy(a)
    c[c <= 3] = 0                  # only values > 3 spread to their neighbours
    # fresh shift buffers on every call so repeated calls don't accumulate
    up = np.zeros_like(a)
    down = np.zeros_like(a)
    left = np.zeros_like(a)
    right = np.zeros_like(a)
    up[:-2, 1:-1] += c[1:-1, 1:-1] / 5.
    down[2:, 1:-1] += c[1:-1, 1:-1] / 5.
    left[1:-1, :-2] += c[1:-1, 1:-1] / 5.
    right[1:-1, 2:] += c[1:-1, 1:-1] / 5.
    a[1:-1, 1:-1] -= c[1:-1, 1:-1] * 4. / 5.
    a += up + down + left + right  # note: modifies `a` in place
    return a
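A quick way to check this against the nested loops from the question (a small sketch; op_loops is just a hypothetical helper wrapping the original double loop):

def op_loops(a):
    temp = np.copy(a)
    for i in range(1, a.shape[0] - 1):
        for j in range(1, a.shape[1] - 1):
            if a[i, j] > 3:
                temp[i+1, j] += a[i, j] / 5.
                temp[i-1, j] += a[i, j] / 5.
                temp[i, j+1] += a[i, j] / 5.
                temp[i, j-1] += a[i, j] / 5.
                temp[i, j] -= a[i, j] * 4. / 5.
    return temp

print(np.allclose(new_a(np.copy(a)), op_loops(a)))   # expect True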
So let's say I have an array that looks similar to this:
array([[0, 0, 0, 0, 0],
       [0, 1, 1, 1, 0],
       [0, 1, 1, 1, 0],
       [0, 1, 1, 1, 0],
       [0, 0, 0, 0, 0]])
I would like to return the location of the center of the n*n square with the largest sum of values. So in this case it would be (2, 2) if n = 3; if I let n = 4 the result would be the same.
Does numpy have a method for finding this location?
Approach #1: We can use SciPy's 2D convolution to get the sums over sliding windows of shape (n, n), pick the window with the biggest sum using argmax, and translate that flat index to row, col indices with np.unravel_index, like so -
from scipy.signal import convolve2d as conv2

def largest_sum_pos_app1(a, n):
    idx = conv2(a, np.ones((n, n), dtype=int), 'same').argmax()
    return np.unravel_index(idx, a.shape)
Sample run -
In [558]: a
Out[558]:
array([[0, 0, 0, 0, 0],
       [0, 1, 1, 1, 0],
       [0, 1, 1, 1, 0],
       [0, 1, 1, 1, 0],
       [0, 0, 0, 0, 0]])

In [559]: largest_sum_pos_app1(a, n=3)
Out[559]: (2, 2)
Approach #1S (super-charged): We can boost it further by using a uniform filter; it computes the window mean rather than the sum, but since the mean is just the sum scaled by a constant, the argmax is unchanged. Like so -
from scipy.ndimage import uniform_filter as unif2D

def largest_sum_pos_app1_mod1(a, n):
    idx = unif2D(a.astype(float), size=n, mode='constant').argmax()
    return np.unravel_index(idx, a.shape)
Approach #2: Another one, based on scikit-image's sliding-window tool view_as_windows: we create sliding windows of shape (n, n), giving a 4D array whose last two axes (of shape (n, n)) correspond to the search window. We then sum along those two axes, take the argmax, and translate it back to the actual row, col position of the window center.
Hence, the implementation would be -
from skimage.util.shape import view_as_windows

def largest_sum_pos_app2(a, n):
    h = (n - 1) // 2  # half window size
    idx = view_as_windows(a, (n, n)).sum((-2, -1)).argmax()
    return tuple(np.array(np.unravel_index(idx, np.array(a.shape) - n + 1)) + h)
As also mentioned in the comments, a search square with an even n would be confusing given that it won't have its center at any element coordinate.
Runtime test
In [741]: np.random.seed(0)
In [742]: a = np.random.randint(0,1000,(1000,1000))
In [743]: largest_sum_pos_app1(a, n= 5)
Out[743]: (966, 403)
In [744]: largest_sum_pos_app1_mod1(a, n= 5)
Out[744]: (966, 403)
In [745]: largest_sum_pos_app2(a, n= 5)
Out[745]: (966, 403)
In [746]: %timeit largest_sum_pos_app1(a, n= 5)
...: %timeit largest_sum_pos_app1_mod1(a, n= 5)
...: %timeit largest_sum_pos_app2(a, n= 5)
...:
10 loops, best of 3: 57.6 ms per loop
100 loops, best of 3: 10.1 ms per loop
10 loops, best of 3: 47.7 ms per loop
I want to find the minimum-sized 2-dimensional ndarray within an ndarray that contains all values meeting a condition.
For example:
Let's say I have the array
x = np.array([[1, 1, 5, 3, 11, 1],
              [1, 2, 15, 19, 21, 33],
              [1, 8, 17, 22, 21, 31],
              [3, 5, 6, 11, 23, 19]])
and call f(x, x % 2 == 0)
Then the return value of the program would be the array
[[2, 15, 19],
 [8, 17, 22],
 [5, 6, 11]]
Because it is the smallest rectangular array that includes all the even numbers (the condition).
I've found a way to get all the indices where the condition is true using np.argwhere and then slicing the original array from the minimum to the maximum indices; I've also done it with a for loop, but I was wondering whether there is a more efficient way using NumPy or SciPy.
My current method:
def f(arr, cond_arr):
    indices = np.argwhere(cond_arr)
    min = np.amin(indices, axis=0)  # get first row, col meeting cond
    max = np.amax(indices, axis=0)  # get last row, col meeting cond
    return arr[min[0]:max[0] + 1, min[1]:max[1] + 1]
The function is pretty efficient already - but you can do better.
Instead of checking the condition for every row/column and then finding the minimum and maximum, we can collapse the condition onto each axis (reducing with the logical OR) and find the first/last True indices:
def f2(arr, cond_arr):
    c0 = np.where(np.logical_or.reduce(cond_arr, axis=0))[0]  # columns containing a match
    c1 = np.where(np.logical_or.reduce(cond_arr, axis=1))[0]  # rows containing a match
    return arr[c1[0]:c1[-1] + 1, c0[0]:c0[-1] + 1]
How it works:
With the example data, cond_arr looks like this:

>>> (x % 2 == 0).astype(int)
array([[0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0],
       [0, 1, 0, 1, 0, 0],
       [0, 0, 1, 0, 0, 0]])

These are the column conditions:

>>> np.logical_or.reduce(cond_arr, axis=0).astype(int)
array([0, 1, 1, 1, 0, 0])

And these are the row conditions:

>>> np.logical_or.reduce(cond_arr, axis=1).astype(int)
array([0, 1, 1, 1])
Now we only need to find the first/last nonzero element for each of the two arrays.
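Putting the two together for the example (a small sketch, assuming x and cond_arr = x % 2 == 0 from above):

c0 = np.where(np.logical_or.reduce(cond_arr, axis=0))[0]  # columns -> array([1, 2, 3])
c1 = np.where(np.logical_or.reduce(cond_arr, axis=1))[0]  # rows    -> array([1, 2, 3])
# rows are sliced with c1, columns with c0
print(x[c1[0]:c1[-1] + 1, c0[0]:c0[-1] + 1])
# [[ 2 15 19]
#  [ 8 17 22]
#  [ 5  6 11]]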
Is it really faster?
%timeit f(x, x%2 == 0) # 10000 loops, best of 3: 24.6 µs per loop
%timeit f2(x, x%2 == 0) # 100000 loops, best of 3: 12.6 µs per loop
Well, a little bit... but it really shines with larger arrays:
x = np.random.randn(1000, 1000)
c = np.zeros((1000, 1000), dtype=bool)
c[400:600, 400:600] = True
%timeit f(x,c) # 100 loops, best of 3: 5.28 ms per loop
%timeit f2(x,c) # 1000 loops, best of 3: 225 µs per loop
Finally, this version has slightly more overhead but is generic over the number of dimensions:
def f3(arr, cond_arr):
    s = []
    for a in range(arr.ndim):
        # collapse the condition onto axis `a` by reducing over all other axes
        other_axes = tuple(ax for ax in range(arr.ndim) if ax != a)
        c = np.where(np.logical_or.reduce(cond_arr, axis=other_axes))[0]
        s.append(slice(c[0], c[-1] + 1))
    return arr[tuple(s)]
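A quick check with the example from the question (assuming x as defined above):

print(f3(x, x % 2 == 0))
# [[ 2 15 19]
#  [ 8 17 22]
#  [ 5  6 11]]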