Mask outside of an interval of Two 2D array? - python

I have one 3D array, i.e. param:
param.shape = (20, 50, 50)
I want to mask its first axis outside of one interval, i.e. two 2D arrays, bot and top:
bot.shape = (50, 50)
top.shape = (50, 50)
What I have tried is:
bot_n = np.broadcast_to(bot[0, :, :], param.shape)
top_n = np.broadcast_to(top[0, :, :], param.shape)
output = np.ma.masked_outside(param, bot_n, top_n)
But I got the following error:
if v2 < v1:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
In fact, I want to extract the value of param which is between bot and top values.

You could construct the mask yourself:
output = np.ma.array(param, mask=(param < bot) | (param > top))

The code for masked_outside is quite simple:
if v2 < v1:
    (v1, v2) = (v2, v1)
xf = filled(x)
condition = (xf < v1) | (xf > v2)
return masked_where(condition, x, copy=copy)
The condition expression works with array limits such as your bot_n, but the if v2 < v1 test only works with scalar limits. The function author was thinking of a simple [3, 9] interval, not your more general 2D one.
So, yes, write your own mask.
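A minimal sketch of writing the mask by hand, assuming bot and top are plain (50, 50) arrays. The random param and the constant limits are made-up sample data. NumPy broadcasts the 2D limits along the first axis of param, so no explicit np.broadcast_to is needed:

```python
import numpy as np

# Hypothetical sample data with the shapes from the question
rng = np.random.default_rng(0)
param = rng.random((20, 50, 50))
bot = np.full((50, 50), 0.25)
top = np.full((50, 50), 0.75)

# The (50, 50) limits broadcast against (20, 50, 50) along the first axis
outside = (param < bot) | (param > top)

# Mask every element that falls outside [bot, top]
output = np.ma.masked_where(outside, param)
```

output.compressed() then returns only the values of param that lie between bot and top.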

Related

How to get indices from tensor where the scores (values) satisfies a condition after using torch.topk

I have a tensor of shape m x m which is basically the similarity (inner product) of two tensors. I want to get only the values that are above 0.5. How could I do this? (NumPy operations would do too.)
Example:
x = torch.randn((9052, 512))
similarities = x @ x.T
scores, indices = torch.topk(similarities, x.shape[0]) # topk == all the values, sorted
I have tried
mask = torch.ones(scores.size()[0])
mask = 1 - mask.diag()
sim_vec = torch.nonzero((scores >= 0.5)*mask)
Gives me a tensor of shape [39672595, 2]
I've also tried
(scores > 0.5 ).nonzero(as_tuple=True)[0]
It gives me a tensor of shape [51152826]
Expected Result Pseudo Code:
result = []
for i, row in enumerate(scores):
    temp = []
    for j, value in enumerate(row):
        if value > 0.5:
            temp.append(indices[i][j].item())
    result.append(temp)
Update:
The code below gives the upper triangle, which shows how close each element is to every other one. The main problem of filtering and getting the indices still persists:
import numpy as np
import pandas as pd
matrix = pd.DataFrame(scores.numpy().astype(np.float32))
upper_tri = matrix.where(np.triu(np.ones(matrix.shape), k=1).astype(bool))
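One possible way to get the per-row index lists from the pseudo code without Python loops over every element, sketched in NumPy since the question allows it. The small random x is a stand-in for the real data. Unlike the topk version, the indices in each row come out in column order rather than sorted by score:

```python
import numpy as np

# Small hypothetical stand-in for the (9052, 512) tensor
rng = np.random.default_rng(0)
x = rng.standard_normal((6, 4))
similarities = x @ x.T

# Boolean matrix of entries above the threshold
above = similarities > 0.5

# np.nonzero walks the matrix row by row, so `rows` is already sorted
rows, cols = np.nonzero(above)

# Split the column indices into one list per row
counts = np.bincount(rows, minlength=len(similarities))
result = [c.tolist() for c in np.split(cols, np.cumsum(counts)[:-1])]
```

Calling np.fill_diagonal(above, False) before np.nonzero would drop each row's match with itself, if that is what the diagonal mask in the attempt above was meant to do.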

Slicing 2D numpy array periodically

I have a numpy array of 300x300 where I want to keep all elements periodically. Specifically, for both axes I want to keep the first 5 elements, then discard 15, keep 5, discard 15, etc. This should result in an array of 75x75 elements. How can this be done?
You can create a 1D mask that carries out the keep/discard function, then repeat the mask and apply it to the array. Here is an example.
import numpy as np
size = 300
array = np.arange(size).reshape((size, 1)) * np.arange(size).reshape((1, size))
mask = np.concatenate((np.ones(5), np.zeros(15))).astype(bool)
period = len(mask)
mask = np.repeat(mask.reshape((1, period)), repeats=size // period, axis=0)
mask = np.concatenate(mask, axis=0)
result = array[mask][:, mask]
print(result.shape)
You can view the array as a series of 20x20 blocks, of which you want to keep the upper-left 5x5 portion. Let's say you have
keep = 5
discard = 15
This only works if
assert all(s % (keep + discard) == 0 for s in arr.shape)
First compute the shape of the view and use it:
block = keep + discard
shape1 = (arr.shape[0] // block, block, arr.shape[1] // block, block)
view = arr.reshape(shape1)[:, :keep, :, :keep]
The following operation will create a copy of the data because the view creates a non-contiguous buffer:
shape2 = (shape1[0] * keep, shape1[2] * keep)
result = view.reshape(shape2)
You can compute shape1 and shape2 in a more general manner with something like
shape1 = tuple(
    np.stack((np.array(arr.shape) // block,
              np.full(arr.ndim, block)), -1).ravel())
shape2 = tuple(np.array(shape1[::2]) * keep)
I would recommend packaging this into a function.
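A sketch of what such a function might look like. The name keep_periodic and its defaults are made up for illustration. It follows the reshape approach above and works for any number of dimensions, as long as every axis length is a multiple of keep + discard:

```python
import numpy as np

def keep_periodic(arr, keep=5, discard=15):
    """Keep the first `keep` elements of every (keep + discard) block along each axis."""
    block = keep + discard
    if any(s % block for s in arr.shape):
        raise ValueError("every axis length must be a multiple of keep + discard")
    # Split each axis into (number of blocks, block length)
    shape1 = []
    for s in arr.shape:
        shape1.extend((s // block, block))
    view = arr.reshape(shape1)
    # Take all blocks, but only the first `keep` entries inside each block
    index = (slice(None), slice(0, keep)) * arr.ndim
    # Merge the block axes back; this step copies the data
    shape2 = tuple(s // block * keep for s in arr.shape)
    return view[index].reshape(shape2)

arr = np.arange(300 * 300).reshape(300, 300)
result = keep_periodic(arr)
```

For the 300x300 input from the question, result has shape (75, 75).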
Here is my first thought of a solution. Will update later if I think of one with fewer lines. This should work even if the input is not square:
output = []
for i in range(len(arr)):
    tmp = []
    if i % (15+5) < 5:  # keep first 5, then discard next 15
        for j in range(len(arr[i])):
            if j % (15+5) < 5:  # keep first 5, then discard next 15
                tmp.append(arr[i,j])
        output.append(tmp)
Update:
Building off of Yang's answer, here is another way which uses np.tile, which repeats an array a given number of times along each axis. This relies on the input array being square in dimension.
import numpy as np
# Define one instance of the keep/discard box
keep, discard = 5, 15
mask = np.concatenate([np.ones(keep), np.zeros(discard)])
mask_2d = mask.reshape((keep+discard,1)) * mask.reshape((1,keep+discard))
# Tile it out -- overshoot, then trim to match size
count = len(arr)//len(mask_2d) + 1
tiled = np.tile(mask_2d, [count,count]).astype('bool')
tiled = tiled[:len(arr), :len(arr)]
# Apply the mask to the input array
dim = sum(tiled[0])
output = arr[tiled].reshape((dim,dim))
Another option using meshgrid and a modulo:
# MyArray = 300x300 numpy array
r = np.r_[0:300] # Indices 0..299
xv, yv = np.meshgrid(r, r) # x and y grid
mask = ((xv%20)<5) & ((yv%20)<5) # We create the boolean mask
result = MyArray[mask].reshape((75,75)) # We apply the mask and reshape the final output

How to fix 'need at least one array to concatenate' error?

I have read through the various posts on ValueError but have not found a satisfactory solution. Can anyone help me see what I am doing wrong?
Code:
assert(type(images) == list)
# assert(type(images[0]) == np.ndarray)
# assert(len(images[0].shape) == 3)
# assert(np.max(images[0]) > 10)
# assert(np.min(images[0]) >= 0.0)
inps = []
for img in images:
    img = img.astype(np.float32)
    inps.append(np.expand_dims(img, 0))
bs = 100
with tf.Session() as sess:
    preds = []
    n_batches = int(math.ceil(float(len(inps)) / float(bs)))
    for i in range(n_batches):
        sys.stdout.write(".")
        sys.stdout.flush()
        inp = inps[(i * bs):min((i + 1) * bs, len(inps))]
        inp = np.concatenate(inp, 0)
        pred = sess.run(softmax, {'ExpandDims:0': inp})
        preds.append(pred)
    preds = np.concatenate(preds, 0)
    scores = []
    for i in range(splits):
        part = preds[(i * preds.shape[0] // splits):((i + 1) * preds.shape[0] // splits), :]
        kl = part * (np.log(part) - np.log(np.expand_dims(np.mean(part, 0), 0)))
        kl = np.mean(np.sum(kl, 1))
        scores.append(np.exp(kl))
    return np.mean(scores), np.std(scores)
Error:
File "/content/Inception-Score/inception_score.py", line 45, in get_inception_score
    preds = np.concatenate(preds, 0)
ValueError: need at least one array to concatenate
np.concatenate(preds, 0) raises this error when the sequence it is given is empty, hence "need at least one array to concatenate". Here preds is only filled inside the batch loop, so an empty preds means the loop never ran: n_batches was 0 because images (and therefore inps) was empty.
np.concatenate() requires at least one array in the sequence passed as its first argument, as detailed in the documentation. It looks like preds is an empty list when it reaches that line, so check how images is being loaded.
The problem seems to be in np.concatenate, which expects a sequence of arrays as its first argument and is not getting one:
#syntax
numpy.concatenate((a1, a2, ...), axis=0, out=None)
Parameters:
a1, a2, … : sequence of array_like The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default).
axis : int, optional The axis along which the arrays will be joined. If axis is None, arrays are flattened before use. Default is 0.
out : ndarray, optional If provided, the destination to place the result. The shape must be correct, matching that of what concatenate would have returned if no out argument were specified.
Returns: ndarray The concatenated array.
Check what preds contains before it is passed to np.concatenate.
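A small sketch that reproduces the message with an empty list and shows one way to fail earlier with a clearer error. The helper name concat_batches is made up for illustration and is not part of the Inception-Score code:

```python
import numpy as np

def concat_batches(batches):
    """Concatenate per-batch arrays along axis 0, with a clearer error for an empty list."""
    if len(batches) == 0:
        raise ValueError("no batches to concatenate: was the input image list empty?")
    return np.concatenate(batches, 0)

# An empty list triggers the error from the question
try:
    np.concatenate([], 0)
except ValueError as err:
    message = str(err)

# A non-empty list of arrays with matching trailing shapes works as expected
joined = concat_batches([np.zeros((2, 3)), np.ones((1, 3))])
```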

OpenCV Python cv2.perspectiveTransform

I'm currently trying to do video stabilization using OpenCV and Python.
I use the following function to calculate rotation:
def accumulate_rotation(src, theta_x, theta_y, theta_z, timestamps, prev, current, f, gyro_delay=None, gyro_drift=None, shutter_duration=None):
    if prev == current:
        return src
    pts = []
    pts_transformed = []
    for x in range(10):
        current_row = []
        current_row_transformed = []
        pixel_x = x * (src.shape[1] / 10)
        for y in range(10):
            pixel_y = y * (src.shape[0] / 10)
            current_row.append([pixel_x, pixel_y])
            if shutter_duration:
                y_timestamp = current + shutter_duration * (pixel_y - src.shape[0] / 2)
            else:
                y_timestamp = current
            transform = getAccumulatedRotation(src.shape[1], src.shape[0], theta_x, theta_y, theta_z, timestamps, prev,
                                               current, f, gyro_delay, gyro_drift)
            output = cv2.perspectiveTransform(np.array([[pixel_x, pixel_y]], dtype="float32"), transform)
            current_row_transformed.append(output)
        pts.append(current_row)
        pts_transformed.append(current_row_transformed)
    o = utilities.meshwarp(src, pts_transformed)
    return o
I get the following error when it gets to output = cv2.perspectiveTransform(np.array([[pixel_x, pixel_y]], dtype="float32"), transform):
cv2.error: /Users/travis/build/skvark/opencv-python/opencv/modules/core/src/matmul.cpp:2271: error: (-215) scn + 1 == m.cols in function perspectiveTransform
Any help or suggestions would really be appreciated.
This implementation really needs to be changed in a future version, or the docs should be more clear.
From the OpenCV docs for perspectiveTransform():
src – input two-channel (...) floating-point array
(Emphasis on "two-channel" is mine.)
>>> A = np.array([[0, 0]], dtype=np.float32)
>>> A.shape
(1, 2)
So we see from here that A is just a single-channel matrix, that is, two-dimensional. One row, two cols. You instead need a two-channel image, i.e., a three-dimensional matrix where the length of the third dimension is 2 or 3 depending on if you're sending in 2D or 3D points.
Long story short, you need to add one more set of brackets to make the set of points you're sending in three-dimensional, where the x values are in the first channel, and the y values are in the second channel.
>>> A = np.array([[[0, 0]]], dtype=np.float32)
>>> A.shape
(1, 1, 2)
Also, as suggested in the comments:
If you have an array points of shape (n_points, dimension) (i.e. dimension is 2 or 3), a nice way to re-format it for this use-case is points[np.newaxis]
It's not intuitive, and though it's documented, it's not very explicit on that point. That's all you need. I've answered an identical question before, but for the cv2.transform() function.
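A short sketch of the reshaping only, so it runs without OpenCV installed. The sample points are made up, and the cv2 call is left as a comment because it needs the asker's transform matrix:

```python
import numpy as np

# Hypothetical sample points, one (x, y) pair per row: shape (n_points, 2)
points = np.array([[0, 0], [10, 0], [10, 20]], dtype=np.float32)

# Add a leading axis so the array becomes (1, n_points, 2),
# i.e. one row of two-channel points
points_2ch = points[np.newaxis]

# A single point needs the same extra set of brackets
single = np.array([[[5, 7]]], dtype=np.float32)  # shape (1, 1, 2)

# With OpenCV available and a 3x3 float matrix `transform`, the call would be:
# out = cv2.perspectiveTransform(points_2ch, transform)
```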

Pyplot truth value of an array with more than one element is ambiguous

I am trying to implement a knn 1D estimate:
# nearest neighbors estimate
def nearest_n(x, k, data):
    # Order dataset
    #data = np.sort(data, kind='mergesort')
    nnb = []
    # iterate over all data and get k nearest neighbours around x
    for n in data:
        if nnb.__len__()<k:
            nnb.append(n)
        else:
            for nb in np.arange(0,k):
                if np.abs(x-n) < np.abs(x-nnb[nb]):
                    nnb[nb] = n
                    break
    nnb = np.array(nnb)
    # get volume(distance) v of k nearest neighbours around x
    v = nnb.max() - nnb.min()
    v = k/(data.__len__()*v)
    return v
interval = np.arange(-4.0, 8.0, 0.1)
plt.figure()
for k in (2,8,35):
    plt.plot(interval, nearest_n(interval, k,train_data), label=str(k))
plt.legend()
plt.show()
Which throws:
File "x", line 55, in nearest_n
if np.abs(x-n) < np.abs(x-nnb[nb]):
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I know the error comes from the array input in plot(), but I am not sure how to avoid this in a function with operators >/==/<
'data' comes from a 1D txt file containing floats.
I tried using vectorize:
nearest_n = np.vectorize(nearest_n)
which results in:
line 50, in nearest_n
for n in data:
TypeError: 'numpy.float64' object is not iterable
Here is an example, let's say:
data = [0.5,1.7,2.3,1.2,0.2,2.2]
k = 2
nearest_n(1.5, k, data) should then lead to
nnb=[1.2,1.7]
v = 0.5
and return 2/(6*0.5) = 2/3
The function runs for a scalar x, for example nearest_n(2.0,4,data) gives 0.0741586011463
You're passing in np.arange(-4.0, 8.0, 0.1) as your x, which is an array of values. So x - n is an array of the same length as x, in this case 120 elements, since subtracting a scalar from an array is done element-wise. The same goes for x - nnb[nb]. So the result of your comparison there is a 120-element boolean array, where each element says whether that element of np.abs(x-n) is less than the corresponding element of np.abs(x-nnb[nb]). This can't be used directly as a condition; you would need to reduce these values to a single boolean (using all() or any()) or rethink your code so that x is a scalar.
plt.figure()
X = np.arange(-4.0,8.0,0.1)
for k in [2,8,35]:
    Y = []
    for n in X:
        Y.append(nearest_n(n,k,train_data))
    plt.plot(X,Y,label=str(k))
plt.show()
is working fine. I thought pyplot.plot would do this exact thing for me already, but I guess it does not...
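The np.vectorize attempt from the question failed because data was vectorized along with x. A sketch of one way around that, assuming a scalar-only estimator: the knn_density function below is a sorting-based stand-in written for this example, not the loop from the question, and excluded=[2] tells np.vectorize to pass the third argument through untouched:

```python
import numpy as np

def knn_density(x, k, data):
    """k-nearest-neighbour density estimate at one scalar x (sorting-based sketch)."""
    data = np.asarray(data, dtype=float)
    # Pick the k data points closest to x
    nearest = data[np.argsort(np.abs(data - x))[:k]]
    width = nearest.max() - nearest.min()
    return k / (len(data) * width)

# Example data from the question
data = [0.5, 1.7, 2.3, 1.2, 0.2, 2.2]
single = knn_density(1.5, 2, data)

# Evaluate on a grid one scalar at a time; `data` (argument 2) is not vectorized
grid = np.arange(-4.0, 8.0, 0.1)
density = np.vectorize(knn_density, excluded=[2])(grid, 2, data)
```

grid and density can then be passed to plt.plot directly.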
