pairwise/rowwise comparison of pytorch tensor - python

I have a 2D tensor representing integer coordinates on a grid.
And I would like to check my tensor for any occurrences of a specific coordinate (x, y).
A pseudo-code example:
positions = torch.arange(20).repeat(2).view(-1,2)
xy_dst1 = torch.tensor((5,7))
xy_dst2 = torch.tensor((4,5))
positions == xy_dst1 # should give none
positions == xy_dst2 # should give index 2 and 12
My only solution so far is to convert the tensors to lists or tuples and then iterate through them, but with the conversions back and forth and the explicit iteration, that can't be a very good solution.
Does anyone know of a better solution that stays in the tensor framework?

Try
def check(positions, xy):
    return (positions == xy.view(1, 2)).all(dim=1).nonzero()
print(check(positions, xy_dst1))
# Output: tensor([], size=(0, 1), dtype=torch.int64)
print(check(positions, xy_dst2))
# Output:
# tensor([[ 2],
#         [12]])
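The `xy.view(1, 2)` makes the row-wise intent explicit; broadcasting also lets you look up several target coordinates in one shot. A minimal sketch (the `targets` tensor and variable names are just for illustration):

import torch

positions = torch.arange(20).repeat(2).view(-1, 2)   # (20, 2) grid coordinates
targets = torch.tensor([[5, 7], [4, 5]])              # (K, 2) coordinates to look up

# Compare every position against every target: (20, 1, 2) vs (1, K, 2) -> (20, K) bool
matches = (positions.unsqueeze(1) == targets.unsqueeze(0)).all(dim=2)
rows, cols = matches.nonzero(as_tuple=True)
print(rows, cols)  # rows index into positions, cols tell you which target matched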

Related

Resizing array - possibly an environment issue, seems to not like 0

I have a class task to resize an array to [0,1], i.e. so that the smallest number becomes 0 and the largest 1.
It seems to not like the 0: whenever there is a 0 in the code it spits out an empty array, but doing e.g. [5,1] works. The output is this for [0,1]:
array([], shape=(0, 1), dtype=int64)
Is there any way to make it work? Profs have said it's right and are unsure why it's not working. Colab is the environment.
import numpy as np

test = [0, 1, 2, 3, 4, 5]
arr1 = np.array(test)

def rescale(a):
    """Return the rescaled version of a on the [0,1] interval."""
    a = np.resize(a, [0, 1])
    return a
    print(a)

rescale(arr1)
If I understand correctly what you really want is to normalize the values of the array, so the title is a bit misleading.
np.resize() changes the shape of the array but it does not change the values.
When any dimension in the shape passed to resize() is zero, the array becomes empty: the number of elements in the array is the product of the dimensions, and if one of them is zero, the product is zero. That explains your output.
So if you'd like to normalize the values, check this post:
How to normalize a NumPy array to within a certain range?
You need to compute the increment per unit, multiply it by each value's offset from the minimum, and add the start of your interval:
test = [0, 1, 2, 3, 4, 5]

def rescale(a, interval):
    """Return the values of a rescaled onto the given interval."""
    incr = (interval[1] - interval[0]) / (max(a) - min(a))
    a = [(i - min(a)) * incr + interval[0] for i in a]
    return a

rescale(test, [0, 1])
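If the input is already a NumPy array, the same rescaling can be done without a Python loop. A minimal vectorized sketch (the name rescale_np is just for illustration):

import numpy as np

def rescale_np(a, interval=(0.0, 1.0)):
    """Linearly rescale the values of a onto [interval[0], interval[1]]."""
    a = np.asarray(a, dtype=float)
    lo, hi = interval
    return lo + (a - a.min()) * (hi - lo) / (a.max() - a.min())

print(rescale_np([0, 1, 2, 3, 4, 5]))  # [0.  0.2 0.4 0.6 0.8 1. ]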

Efficient mapping of values from one NumPy array to the closest value in another

I have two arrays: the 1st is a nested array of floats (which we will call the value array) and the 2nd is a 1d array of floats (which we will call the key array). The goal is to map each element within the value array to the numerically closest value on the key array.
To give some background, I am trying to map the weights of a CNN to discrete weights as part of a simulation project. The shape of the weights is dependent on the layer and network definition. In this particular case, I am working with a tf.keras.applications.ResNet50V2 network with CIFAR-10 as the dataset, which has weights that go from 3D to 1D. The weights are returned as nested lists with each index indicating the layer. The number of elements within the value array is very large when completely flattened.
I currently have a working solution which I have included below, but I am wondering if anyone could think of any further optimizations. I keep getting warnings about the callback class taking longer than the actual training. This is a function that should be executed at the end of each training batch, so a little optimization can go a long way.
for ii in range(valueArray.size):
    # Flatten array to 1D
    flatArr = valueArray[ii].flatten()
    # Using searchsorted since our discrete values have been sorted
    idx = np.searchsorted(keyArray, flatArr, side="left")
    # Clip any values that exceed array indices
    np.clip(idx, 0, keyArray.size - 1, out=idx)
    flatMinVal = keyArray[idx]
    # Get bool array of idx that have values to the left (idx > 0)
    hasValLeft = idx > 0
    # Ensures that closerOnLeft is a bool array of the same size as the original
    closerOnLeft = hasValLeft
    # Check if the abs distance to the right key is greater than to the left (acts on values with idx > 0)
    closerOnLeftSub = np.abs(flatArr[hasValLeft] - keyArray[idx[hasValLeft]]) > \
                      np.abs(flatArr[hasValLeft] - keyArray[idx[hasValLeft] - 1])
    # Only assign values that have a value on the left, else always False
    closerOnLeft[hasValLeft] = closerOnLeftSub
    # If the left element is closer, use that as the state
    flatMinVal[closerOnLeft] = keyArray[idx[closerOnLeft] - 1]
    # Return reshaped values
    valueArray[ii] = np.reshape(flatMinVal, valueArray[ii].shape)
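One possible further optimization, sketched here under the same assumption the code above already makes (keyArray is sorted), is to search against the midpoints between consecutive keys: searchsorted then returns the index of the nearest key directly, so the left/right comparison disappears.

import numpy as np

def map_to_nearest(valueArray, keyArray):
    """Map every element of each per-layer array in valueArray to the nearest value in sorted keyArray."""
    # Values below a midpoint are closer to the key on its left, values above it to the key on its right.
    midpoints = (keyArray[:-1] + keyArray[1:]) / 2.0
    for ii in range(valueArray.size):
        flat = valueArray[ii].ravel()
        nearest = keyArray[np.searchsorted(midpoints, flat)]
        valueArray[ii] = nearest.reshape(valueArray[ii].shape)
    return valueArray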

List of simple arrays with pyplot.plot

I have some trouble understanding how pyplot.plot works.
Let me take a simple example: I want to plot pyplot.plot(lst2, lst2), where lst2 is a list.
The difficulty comes from the fact that each element of lst2 is an array of shape (1,1). If the elements were floats and not arrays, there would be no problem.
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
V2 = np.array([[1]])
W2 = np.array([[2]])
print('The shape of V2 is', V2.shape)
print('The shape of W2 is', W2.shape)
lst2 = [V2, W2]
plt.plot(lst2, lst2)
plt.show()
Below is the end of the error message I got:
~\Anaconda3\lib\site-packages\matplotlib\axes\_base.py in _xy_from_xy(self, x, y)
    245         if x.ndim > 2 or y.ndim > 2:
    246             raise ValueError("x and y can be no greater than 2-D, but have "
--> 247                              "shapes {} and {}".format(x.shape, y.shape))
    248
    249         if x.ndim == 1:

ValueError: x and y can be no greater than 2-D, but have shapes (2, 1, 1) and (2, 1, 1)
What surprised me in the error message is the mention of an array of dimension (2,1,1). It seems like the array np.array([V2,W2]) is built when we call pyplot.plot.
My question is then: what happens behind the scenes when we call pyplot.plot(x, y) with x and y lists? It seems like an array with the elements of x is built (and the same for y), and these arrays must have at most 2 axes. Am I correct?
I know that if I used numpy.squeeze on V2 and W2, it would work. But I would like to understand what is happening inside pyplot.plot in the example I gave.
Take a closer look at what you're doing:
V2 = np.array([[1]])
W2 = np.array([[2]])
lst2 = [V2, W2]
plt.plot(lst2, lst2)
For some odd reason you're defining your arrays to be of shape (1,1) by using a nested pair of brackets. When you construct lst2, you stack your arrays along a new leading dimension. This has nothing to do with pyplot; this is numpy.
Numpy arrays are rectangular, and they are compatible with lists of lists of ... of lists. The level of nesting determines the number of dimensions of an array. Look at a simple 2d example:
>>> M = np.arange(2*3).reshape(2,3)
>>> print(repr(M))
array([[0, 1, 2],
       [3, 4, 5]])
You can for all intents and purposes think of this 2x3 matrix as two row vectors: M[0] is the same as M[0,:] and is the first row, and M[1] is the same as M[1,:] and is the second row. You could then also construct this array from the two rows in the following way:
row1 = [0, 1, 2]
row2 = [3, 4, 5]
lst = [row1, row2]
np.array(lst)
My point is that we took two flat lists of length 3 (which are compatible with 1d numpy arrays of shape (3,)), and concatenated them in a list. The result was compatible with a 2d array of shape (2,3). The "2" is due to the fact that we put 2 lists into lst, and the "3" is due to the fact that both lists had a length of 3.
So, when you create lst2 above, you're doing something that is equivalent to this:
lst2 = [ [[1]], [[2]] ]
You put two nested sublists into an array-compatible list, and both sublists are compatible with shape (1,1). This implies that you'll end up with a 3d array (in accordance with the fact that you have three opening brackets at the deepest level of nesting), with shape (2,1,1). Again the 2 comes from the fact that you have two arrays inside, and the trailing dimensions come from the contents.
The real question is what you're trying to do. For one, your data shouldn't really be of shape (1,1). In the most straightforward application of pyplot.plot you have 1d datasets: one for the x and one for the y coordinates of your plot. For this you can use a simple (flat) list or 1d array for both x and y. What matters is that they are of the same length.
Then when you plot the two against each other, you pass the x coordinates first, then the y coordinates second. You presumably meant something like
plt.plot(V2,W2)
In which case you'd pass 2d arrays to plot, and you wouldn't see the error caused by passing a 3d-array-like. However, the behaviour of pyplot.plot is non-trivial for 2d inputs (columns of both datasets will get plotted against one another), and you have to make sure that you really want to pass 2d arrays as inputs. But you almost never want to pass the same object as the first two arguments to pyplot.plot.
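For the original example, the simplest fix is therefore to flatten the data to 1d before plotting. A minimal sketch:

import numpy as np
import matplotlib.pyplot as plt

V2 = np.array([[1]])
W2 = np.array([[2]])

# Collapse the list of (1,1) arrays into a flat 1d array before plotting.
xs = np.array([V2, W2]).ravel()  # shape (2,)
plt.plot(xs, xs)
plt.show()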

Mapping RGB values in an image to a corresponding ID using a dictionary

I am working on a segmentation problem where given an image, each RGB value corresponds to a class label. The problem I have is to efficiently map RGB values from an image (numpy array) to a corresponding class label image.
Let's provide the following simplified example:
color2IdMap
{(100,0,100): 0, (0,200,0): 2}

labelOld
array([[[100,   0, 100],
        [  0, 200,   0]],

       [[100,   0, 100],
        [  0, 200,   0]]], dtype=uint8)

(In a real example color2IdMap will have about 20 entries and labelOld will be an array of shape (1024, 512, 3).)
Now I want the result to be the following mapped array, with shape (1024, 512):
labelNew
array([[0, 2],
       [0, 2]])
I attempted to do this with loops and list comprehensions, but both methods are quite slow (about 10 seconds per image, which is a lot for 250K images), and I am wondering if there is a faster way of doing it.
Attempted method 1:
labelNew = np.empty((1052, 1914), dtype=np.uint8)
for i in range(1052):
    for j in range(1914):
        labelNew[i, j] = color2IdMap[tuple(labelOld[i, j])]
Attempted method 2:
labelNew = [[color2IdMap[tuple(x)] for x in y] for y in labelOld]
So, my question is if there is any faster and more efficient way of doing this?
Here's one approach based on dimensionality-reduction -
# Get keys and values
k = np.array(list(color2IdMap.keys()))
v = np.array(list(color2IdMap.values()))
# Setup scale array for dimensionality reduction
s = 256**np.arange(3)
# Reduce k to 1D
k1D = k.dot(s)
# Get sorted k1D and correspondingly re-arrange the values array
sidx = k1D.argsort()
k1Ds = k1D[sidx]
vs = v[sidx]
# Reduce image to 2D
labelOld2D = np.tensordot(labelOld, s, axes=((-1),(-1)))
# Get the positions of the 1D sorted keys and get the corresponding values by
# indexing into re-arranged values array
out = vs[np.searchsorted(k1Ds, labelOld2D)]
Alternatively, we could use sidx as sorter input arg for np.searchsorted to get the final output -
out = v[sidx[np.searchsorted(k1D, labelOld2D, sorter=sidx)]]
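Putting the pieces together, the whole mapping can be wrapped in a small helper. A sketch (the name rgb_to_id is just for illustration); on the example data from the question it returns [[0, 2], [0, 2]]:

import numpy as np

def rgb_to_id(labelOld, color2IdMap):
    """Map an (H, W, 3) uint8 label image to an (H, W) array of class ids using color2IdMap."""
    k = np.array(list(color2IdMap.keys()))    # (N, 3) RGB keys
    v = np.array(list(color2IdMap.values()))  # (N,) class ids
    s = 256 ** np.arange(3)                   # scale factors for dimensionality reduction
    k1D = k.dot(s)                            # each RGB key reduced to a single integer
    sidx = k1D.argsort()
    labelOld2D = np.tensordot(labelOld, s, axes=((-1), (-1)))  # reduce image to (H, W) integers
    return v[sidx[np.searchsorted(k1D, labelOld2D, sorter=sidx)]]

color2IdMap = {(100, 0, 100): 0, (0, 200, 0): 2}
labelOld = np.array([[[100, 0, 100], [0, 200, 0]],
                     [[100, 0, 100], [0, 200, 0]]], dtype=np.uint8)
print(rgb_to_id(labelOld, color2IdMap))  # [[0 2]
                                         #  [0 2]]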
Assume the RGB value map is like this (stored in a Python dict):
color_dict = {(128,128,128): (255,255,255),
              (128,256,128): (255,128,255),
              }
The RGB value remap operation can be done using np.where() and np.all():
cvt_img = np.zeros_like(img)
for rgb_src, rgb_dst in color_dict.items():
    rgb_src = np.array(rgb_src)
    rgb_dst = np.array(rgb_dst)
    idx = np.where(np.all(img == rgb_src, axis=-1))
    cvt_img[idx] = rgb_dst
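The same loop idea can also produce class ids directly instead of remapped colors. A sketch reusing color2IdMap and the example image from the question:

import numpy as np

color2IdMap = {(100, 0, 100): 0, (0, 200, 0): 2}
img = np.array([[[100, 0, 100], [0, 200, 0]],
                [[100, 0, 100], [0, 200, 0]]], dtype=np.uint8)

label_ids = np.zeros(img.shape[:2], dtype=np.uint8)
for rgb, class_id in color2IdMap.items():
    # Boolean mask of pixels whose RGB triple matches this key
    mask = np.all(img == np.array(rgb), axis=-1)
    label_ids[mask] = class_id
print(label_ids)  # [[0 2]
                  #  [0 2]]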

Sinusoidal embedding - Attention is all you need

In Attention Is All You Need, the authors implement a positional embedding (which adds information about where a word is in a sequence). For this, they use a sinusoidal embedding:
PE(pos,2i) = sin(pos/10000**(2*i/hidden_units))
PE(pos,2i+1) = cos(pos/10000**(2*i/hidden_units))
where pos is the position and i is the dimension. It must result in an embedding matrix of shape [max_length, embedding_size], i.e., given a position in a sequence, it returns the tensor of PE[position,:].
I found Kyubyong's implementation, but I do not fully understand it.
I tried to implement it in numpy the following way:
import numpy as np
import matplotlib.pyplot as plt

hidden_units = 100  # Dimension of embedding
vocab_size = 10     # Maximum sentence length

# Matrix of [[0, ..., 99], [0, ..., 99], ...]
i = np.tile(np.expand_dims(range(hidden_units), 0), [vocab_size, 1])
# Matrix of [[0, ..., 0], [1, ..., 1], ...]
pos = np.tile(np.expand_dims(range(vocab_size), 1), [1, hidden_units])

# Apply the intermediate functions
pos = np.multiply(pos, 1/10000.0)
i = np.multiply(i, 2.0/hidden_units)
matrix = np.power(pos, i)

# Apply the sine function to the even columns
matrix[:, 1::2] = np.sin(matrix[:, 1::2])  # even
# Apply the cosine function to the odd columns
matrix[:, ::2] = np.cos(matrix[:, ::2])  # odd

# Plot
im = plt.imshow(matrix, cmap='hot', aspect='auto')
I don't understand how this matrix can give information on the position of inputs. Could someone first tell me if this is the right way to compute it and second what is the rationale behind it?
Thank you.
I found the answer in a pytorch implementation:
# keep dim 0 for padding token position encoding zero vector
position_enc = np.array([
    [pos / np.power(10000, 2*i/d_pos_vec) for i in range(d_pos_vec)]
    if pos != 0 else np.zeros(d_pos_vec) for pos in range(n_position)])

position_enc[1:, 0::2] = np.sin(position_enc[1:, 0::2])  # dim 2i
position_enc[1:, 1::2] = np.cos(position_enc[1:, 1::2])  # dim 2i+1
return torch.from_numpy(position_enc).type(torch.FloatTensor)
where d_pos_vec is the embedding dimension and n_position the max sequence length.
EDIT:
In the paper, the authors say that this representation of the embedding matrix allows "the model to extrapolate to sequence lengths longer than the ones encountered during training".
The only difference between two positions is the pos variable; plotting the resulting matrix (e.g. with plt.imshow, as in the question) gives a graphical representation.
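For reference, a compact vectorized NumPy version of the same encoding (a sketch, without the zeroed padding row used in the snippet above) could look like this:

import numpy as np

def sinusoidal_encoding(n_position, d_model):
    """Return an (n_position, d_model) matrix of sinusoidal position encodings."""
    pos = np.arange(n_position)[:, None]   # (n_position, 1)
    i = np.arange(d_model)[None, :]        # (1, d_model)
    angles = pos / np.power(10000, (2 * (i // 2)) / d_model)
    enc = np.zeros((n_position, d_model))
    enc[:, 0::2] = np.sin(angles[:, 0::2])  # even dimensions: sine
    enc[:, 1::2] = np.cos(angles[:, 1::2])  # odd dimensions: cosine
    return enc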
