I am implementing a YOLO network with a custom loss.
Say there are two tensors, GT and PD (ground truth and predictions). Both are 2-D matrices of shape 4x4.
Assume GT is:
0,0,0,0
0,1,0,0
0,0,1,0
0,0,0,0
PD has the same shape and holds arbitrary values.
I need to compute the mean squared error (MSE) separately for two groups of cells:
one MSE over the cells where GT is 1, and another over the cells where GT is 0.
I would prefer to use a mask to hide the unrelated elements, so that each calculation only involves the relevant ones. I have already implemented this in NumPy, but I don't know how to do it in TensorFlow (v1.14):
import numpy as np
import numpy.ma as ma
conf = y_true[...,0]
conf = np.expand_dims(conf,-1)
conf_pred = y_pred[...,0]
conf_pred = np.expand_dims(conf_pred,-1)
noobj_conf = ma.masked_equal(conf,1) #cover grid with objects
obj_conf = ma.masked_equal(conf,0) #cover grid without objects
loss_obj = np.sum(np.square(obj_conf - conf_pred))
loss_noobj = np.sum(np.square(noobj_conf - conf_pred))
Any suggestions about how to implement this in tensorflow?
If I understand you correctly, you want to calculate mean square errors of 0's and 1's separately.
You can do something like below:
import tensorflow as tf  # TensorFlow 1.x API (the question targets v1.14)
y_true = tf.constant([[0,0,0,0],[0,1,0,0],[0,0,1,0],[0,0,0,0]], dtype=tf.float32)
y_pred = tf.random.uniform([4, 4], minval=0, maxval=1)
# find indices where 0 is present in y_true
indices0 = tf.where(tf.equal(y_true, tf.zeros([1.])))
# find indices where 1 is present in y_true
indices1 = tf.where(tf.equal(y_true, tf.ones([1.])))
# find all values in y_pred which are present at indices0
y_pred_indices0 = tf.gather_nd(y_pred, indices0)
# find all values in y_pred which are present at indices1
y_pred_indices1 = tf.gather_nd(y_pred, indices1)
# mse loss calculations
mse0 = tf.losses.mean_squared_error(labels=tf.gather_nd(y_true, indices0), predictions=y_pred_indices0)
mse1 = tf.losses.mean_squared_error(labels=tf.gather_nd(y_true, indices1), predictions=y_pred_indices1)
# mse0 = tf.reduce_sum(tf.squared_difference(tf.gather_nd(y_true, indices0), y_pred_indices0))
# mse1 = tf.reduce_sum(tf.squared_difference(tf.gather_nd(y_true, indices1), y_pred_indices1))
with tf.Session() as sess:
    y_, loss0, loss1 = sess.run([y_pred, mse0, mse1])
    print(y_)
    print(loss0, loss1)
output:
[[0.12770343 0.43467927 0.9362457 0.09105921]
[0.46243036 0.8838414 0.92655015 0.9347118 ]
[0.14018488 0.14527774 0.8395766 0.14391887]
[0.1209656 0.7793218 0.70543754 0.749542 ]]
0.341359 0.019614244
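Since the question asked for a mask-based version, here is a minimal NumPy sketch of the same arithmetic using multiplicative 0/1 masks instead of gathered indices. It reuses the y_pred values printed above; the names obj_mask, noobj_mask and sq_err are illustrative and not from the original answer. The same pattern should carry over to TensorFlow with tf.cast, tf.equal and tf.reduce_sum, though that translation is not tested here.

```python
import numpy as np

# Ground truth from the question and the y_pred values printed above.
y_true = np.array([[0, 0, 0, 0],
                   [0, 1, 0, 0],
                   [0, 0, 1, 0],
                   [0, 0, 0, 0]], dtype=np.float64)
y_pred = np.array([[0.12770343, 0.43467927, 0.9362457, 0.09105921],
                   [0.46243036, 0.8838414, 0.92655015, 0.9347118],
                   [0.14018488, 0.14527774, 0.8395766, 0.14391887],
                   [0.1209656, 0.7793218, 0.70543754, 0.749542]])

obj_mask = (y_true == 1).astype(np.float64)  # 1 where GT is 1
noobj_mask = 1.0 - obj_mask                  # 1 where GT is 0

sq_err = np.square(y_true - y_pred)

# Masked mean: sum the selected squared errors, divide by the number selected.
mse_noobj = np.sum(sq_err * noobj_mask) / np.sum(noobj_mask)
mse_obj = np.sum(sq_err * obj_mask) / np.sum(obj_mask)

print(mse_noobj, mse_obj)
```

Because the masks multiply rather than index, every tensor keeps its static 4x4 shape, which can be convenient inside a graph-mode loss. If a mask can be all zeros, the division needs a guard.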
Related
Why does this TensorFlow code behave differently when the two marked lines are swapped? Broadcasting would happen after the mask is created, so I don't think tf.math.equal is the cause. The resulting sums are close but not identical in the two cases.
def get_loss(mask_value):
    mask_value = tf.Variable(mask_value, dtype=tf.float32)
    def masked_seq_exact_orientation_loss(y_true, y_pred):
        # find out which timesteps in `y_true` are not the padding character
        mask = tf.reduce_all(tf.math.equal(y_true, mask_value), axis=-1)
        mask = 1.0 - tf.cast(mask, tf.float32)
        # These two lines should do the same but don't!!!!!!!! ----------
        mask = tf.tile(tf.expand_dims(mask, axis=-1), [1, 1, y_pred.shape[-1]])
        #mask = tf.expand_dims(mask, axis=-1)
        # ----------------------------------------------------------------
        # multiply loss with the mask
        loss = tf.math.abs(y_true[:,:,0:2] - y_pred) * mask
        tf.print(tf.math.reduce_sum(loss))
        # take average w.r.t. the number of unmasked entries
        return tf.math.reduce_sum(loss) / tf.math.reduce_sum(mask)
    return masked_seq_exact_orientation_loss
Thanks!
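One difference can be read directly off the code: the masked numerator is the same for both variants, but tf.math.reduce_sum(mask) is not. The tiled mask counts every unmasked timestep once per channel, while the expanded mask counts it once. The NumPy sketch below reproduces only that arithmetic with made-up shapes and random data (all names are illustrative). Whether this fully explains the difference the poster observed is not confirmed by the thread.

```python
import numpy as np

rng = np.random.RandomState(0)
batch, steps, channels = 2, 5, 2

# Stand-in for |y_true[:, :, 0:2] - y_pred|.
abs_err = rng.uniform(size=(batch, steps, channels))
# 1.0 marks a real timestep, 0.0 marks padding.
keep = (rng.uniform(size=(batch, steps)) > 0.3).astype(np.float64)
keep[0, 0] = 1.0  # make sure at least one timestep is kept

mask_expanded = keep[..., np.newaxis]                  # shape (batch, steps, 1)
mask_tiled = np.tile(mask_expanded, (1, 1, channels))  # shape (batch, steps, channels)

# Numerators agree: broadcasting the expanded mask equals tiling it.
sum_tiled = np.sum(abs_err * mask_tiled)
sum_expanded = np.sum(abs_err * mask_expanded)

# Denominators differ by a factor of `channels`.
loss_tiled = sum_tiled / np.sum(mask_tiled)
loss_expanded = sum_expanded / np.sum(mask_expanded)

print(sum_tiled, sum_expanded)
print(loss_tiled, loss_expanded)
```

With two channels the returned loss, and therefore the gradient scale, differs by a factor of two between the variants, so two training runs would not follow the same trajectory.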
I'm trying to create an output tensor with dimensionality 32 x 576 x 2 from an operation between matrices M and X, with the following shapes:
M.shape: (576, 2, 2048)
X.shape: (32, 2048)
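The operation I'm defining is an element-wise cosine similarity, given by the following equation (reconstructed here from the code below):
$$\cos(x, M_{j,k}) = \frac{M_{j,k} \cdot x}{\lVert M_{j,k} \rVert_2 \, \lVert x \rVert_2}$$
which represents the cosine similarity between the feature vector x and the vector M_{j,k}.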
This is how I've implemented it in code (incorrectly), where BATCH_SIZE=32, C=576, V=2:
@tf.function
def call(self, X):
    M = self.kernel
    norm_M = tf.norm(M, ord=2, axis=2)
    norm_X = tf.norm(X, ord=2, axis=1)
    l_r = ...  # some scalar value, separate to this question
    # Compute cosine similarity between X and M
    # as a matrix with dimensionality:
    # BATCH_SIZE x C x V
    feature_batch_size = tf.shape(X)[0]
    c = tf.shape(M)[0]
    v = tf.shape(M)[1]
    output_matrix = tf.zeros([feature_batch_size, c, v])
    output_matrix = tf.Variable(output_matrix, trainable=False)
    for row in tf.range(feature_batch_size):
        for column in tf.range(c):
            for channel in tf.range(v):
                a = tf.tensordot(M[column][channel], X[row], 1)
                b = norm_M[column][channel] * norm_X[row]
                output_matrix[row][column][channel] = a / b
    return [output_matrix, l_r]
This fails on the line output_matrix[row][column][channel] = a / b because it's unhappy with an assignment to an individual row:column:channel of a tf.Variable.
Is there a better way to do this operation over these two matrices to create the desired output matrix so that it can be done without these three nested for loops and maintain compatibility with the tf.Function graph functionality?
If not, what can I do to assign variables to individual elements on a tf.Variable as I'm unsuccessfully attempting to do here?
Extra information:
norm_M.shape: (576, 2)
norm_X.shape: (32,)
You can replace the loops entirely with vectorized operations:
num = tf.einsum('ij,klj->ikl',X,M)
denom = tf.einsum('i,jk->ijk',norm_X, norm_M)
output_matrix = num/denom
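As a sanity check, the einsum expressions can be compared against the triple loop from the question. The sketch below does this in NumPy with small stand-in sizes (B, C, V, D are illustrative replacements for 32, 576, 2, 2048); np.einsum accepts the same subscript strings as tf.einsum.

```python
import numpy as np

rng = np.random.RandomState(0)
B, C, V, D = 3, 4, 2, 8  # small stand-ins for 32, 576, 2, 2048
X = rng.randn(B, D)
M = rng.randn(C, V, D)

norm_X = np.linalg.norm(X, axis=1)  # shape (B,)
norm_M = np.linalg.norm(M, axis=2)  # shape (C, V)

# Vectorized: dot products over the shared feature axis, then outer product of norms.
num = np.einsum('ij,klj->ikl', X, M)            # shape (B, C, V)
denom = np.einsum('i,jk->ijk', norm_X, norm_M)  # shape (B, C, V)
vectorized = num / denom

# Reference: the nested loops from the question, written for NumPy arrays.
looped = np.zeros((B, C, V))
for row in range(B):
    for column in range(C):
        for channel in range(V):
            a = np.dot(M[column, channel], X[row])
            b = norm_M[column, channel] * norm_X[row]
            looped[row, column, channel] = a / b

print(np.allclose(vectorized, looped))
```

If any row of X or any vector in M can be all zeros, the denominator needs a small epsilon to avoid division by zero.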
I need to calculate the covariance matrix for RGB values across an image dataset, and then apply Cholesky decomposition to the final result.
The covariance matrix for RGB values is a 3x3 matrix M, where M_(i, i) is the variance of channel i and M_(i, j) is the covariance between channels i and j.
The end result should be something like this:
([[0.26, 0.09, 0.02],
[0.27, 0.00, -0.05],
[0.27, -0.09, 0.03]])
I'd prefer to stick to PyTorch functions even though NumPy has a cov function.
I attempted to recreate the numpy Cov function in PyTorch here based on other cov implementations and clones:
def pytorch_cov(tensor, tensor2=None, rowvar=True):
    if tensor2 is not None:
        tensor = torch.cat((tensor, tensor2), dim=0)
    tensor = tensor.view(1, -1) if tensor.dim() < 2 else tensor
    tensor = tensor.t() if not rowvar and tensor.size(0) != 1 else tensor
    tensor = tensor - torch.mean(tensor, dim=1, keepdim=True)
    return 1 / (tensor.size(1) - 1) * tensor.mm(tensor.t())
def cov_vec(x):
    c = x.size(0)
    m1 = x - torch.sum(x, dim=[1],keepdims=True)/ c
    out = torch.einsum('ijk,ilk->ijl',m1,m1) / (c - 1)
    return out
The dataset loading would be like this:
dataset = torchvision.datasets.ImageFolder(data_path)
loader = torch.utils.data.DataLoader(dataset)
for images, _ in loader:
    batch_size = images.size(0)
    ...
For the moment I'm just experimenting with images created with torch.randn(batch_size, 3, height, width).
Edit:
I'm attempting to replicate the matrix from Tensorflow's Lucid here, and somewhat explained on distill.pub here.
Second Edit:
In order to make the output resemble the example one, you have to do this instead of using Cholesky:
rgb_cov_tensor = rgb_cov_tensor / len(loader.dataset)
U,S,V = torch.svd(rgb_cov_tensor)
epsilon = 1e-10
svd_sqrt = U @ torch.diag(torch.sqrt(S + epsilon))
The resulting matrix can then be used to perform color decorrelation, which is useful for visualizing features (DeepDream). I've implemented it in my project here.
Here is a function, rgb_cov, that computes the (unbiased) sample covariance matrix of a 3-channel image. The Cholesky decomposition is then straightforward with torch.linalg.cholesky (torch.cholesky in older PyTorch releases):
import torch
def rgb_cov(im):
    '''
    Assuming im is a torch.Tensor of shape (H, W, 3).
    '''
    im_re = im.reshape(-1, 3)
    # Subtract out of place: reshape can return a view, so -= would modify im.
    im_re = im_re - im_re.mean(0, keepdim=True)
    return 1 / (im_re.shape[0] - 1) * im_re.T @ im_re
# Test:
im = torch.randn(50, 50, 3)
cov = rgb_cov(im)
L_cholesky = torch.linalg.cholesky(cov)
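The question asks for the covariance across a whole dataset rather than a single image. One way to do that without holding every pixel in memory is to accumulate running sums per batch and combine them at the end. The sketch below is written in NumPy with random data standing in for the loader, assumes channel-first batches of shape (batch, 3, H, W) as in the question, and uses illustrative names throughout. The same accumulation should translate to torch tensors, but that is not shown here.

```python
import numpy as np

rng = np.random.RandomState(0)
# Hypothetical stand-in for a DataLoader: 4 batches of shape (batch, 3, H, W).
batches = [rng.randn(2, 3, 8, 8) for _ in range(4)]

n = 0                         # total number of pixels seen
sum_x = np.zeros(3)           # running per-channel sum
sum_outer = np.zeros((3, 3))  # running sum of pixel outer products

for images in batches:
    # Move channels last and flatten to one row per pixel: (num_pixels, 3).
    pixels = images.transpose(0, 2, 3, 1).reshape(-1, 3)
    n += pixels.shape[0]
    sum_x += pixels.sum(axis=0)
    sum_outer += pixels.T @ pixels

mean = sum_x / n
# Unbiased sample covariance from the accumulated sums.
cov = (sum_outer - n * np.outer(mean, mean)) / (n - 1)
L = np.linalg.cholesky(cov)

# Reference: covariance computed from all pixels at once.
all_pixels = np.concatenate(batches).transpose(0, 2, 3, 1).reshape(-1, 3)
reference = np.cov(all_pixels, rowvar=False)
print(np.allclose(cov, reference))
```

The sum-of-outer-products formula can lose precision when channel means are large relative to their spread; accumulating in float64 helps in that case.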
It's known that when the number of variables (p) is larger than the number of samples (n), the least squares estimator is not uniquely defined.
In sklearn I get these values:
In [30]: lm = LinearRegression().fit(xx,y_train)
In [31]: lm.coef_
Out[31]:
array([[ 0.20092363, -0.14378298, -0.33504391, ..., -0.40695124,
0.08619906, -0.08108713]])
In [32]: xx.shape
Out[32]: (1097, 3419)
Call [30] should return an error. How does sklearn work when p>n like in this case?
EDIT:
It seems that the matrix is filled with some values
if n > m:
    # need to extend b matrix as it will be filled with
    # a larger solution matrix
    if len(b1.shape) == 2:
        b2 = np.zeros((n, nrhs), dtype=gelss.dtype)
        b2[:m,:] = b1
    else:
        b2 = np.zeros(n, dtype=gelss.dtype)
        b2[:m] = b1
    b1 = b2
When the linear system is underdetermined, then the sklearn.linear_model.LinearRegression finds the minimum L2 norm solution, i.e.
argmin_w l2_norm(w) subject to Xw = y
This is always well defined and obtainable by applying the pseudoinverse of X to y, i.e.
w = np.linalg.pinv(X).dot(y)
The specific implementation of scipy.linalg.lstsq, which is used by LinearRegression uses get_lapack_funcs(('gelss',), ... which is precisely a solver that finds the minimum norm solution via singular value decomposition (provided by LAPACK).
Check out this example
import numpy as np
rng = np.random.RandomState(42)
X = rng.randn(5, 10)
y = rng.randn(5)
from sklearn.linear_model import LinearRegression
lr = LinearRegression(fit_intercept=False)
coef1 = lr.fit(X, y).coef_
coef2 = np.linalg.pinv(X).dot(y)
print(coef1)
print(coef2)
And you will see that coef1 and coef2 agree up to floating-point error. (Note that fit_intercept=False is specified in the constructor of the sklearn estimator, because otherwise it would subtract the mean of each feature before fitting the model, yielding different coefficients.)
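To make the "minimum norm via SVD" statement concrete, the sketch below builds the solution by hand from the singular value decomposition and compares it with np.linalg.pinv and np.linalg.lstsq. It also constructs a second exact solution by adding a null-space component, to show that the SVD solution is the shorter one. It assumes X has full row rank, which holds for this random data; the variable names are illustrative.

```python
import numpy as np

rng = np.random.RandomState(42)
X = rng.randn(5, 10)  # 5 samples, 10 variables: underdetermined
y = rng.randn(5)

# Thin SVD: X = U @ diag(s) @ Vt, with U (5, 5), s (5,), Vt (5, 10).
U, s, Vt = np.linalg.svd(X, full_matrices=False)
# Minimum-norm solution; dividing by s assumes no singular value is zero.
w_svd = Vt.T @ ((U.T @ y) / s)

w_pinv = np.linalg.pinv(X) @ y
w_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]

# Any vector in the null space of X can be added without changing X @ w.
null_component = (np.eye(10) - np.linalg.pinv(X) @ X) @ rng.randn(10)
w_other = w_svd + null_component

print(np.allclose(X @ w_svd, y), np.allclose(X @ w_other, y))
print(np.linalg.norm(w_svd), np.linalg.norm(w_other))
```

Both vectors fit the training data exactly, which is why no error is raised when p > n; the solver just returns the solution with the smallest L2 norm.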
I am currently using a modified version of the U-Net (https://arxiv.org/pdf/1505.04597.pdf) to segment cell organelles in microscopy images. Since I am using Keras, I took the code from https://github.com/zhixuhao/unet. However, in this version no weight map is implemented to force the network to learn the border pixels.
The results that I have obtained so far are quite good, but the network fails to separate objects that are close to each other. So I want to try and make use of the weight map mentioned in the paper. I have been able to generate the weight map (based on the given formula) for each label image, but I was unable to find out how to use this weight map to train my network and thus solve the above mentioned problem.
Do weight maps and label images have to be combined somehow, or is there a Keras function that will allow me to make use of the weight maps? I am a biologist who only recently started working with neural networks, so my understanding is still limited. Any help or advice would be greatly appreciated.
In case it is still relevant: I needed to solve this recently. You can paste the code below into a Jupyter notebook to see how it works.
%matplotlib inline
import numpy as np
from skimage.io import imshow
from skimage.measure import label
from scipy.ndimage import distance_transform_edt
def generate_random_circles(n = 100, d = 256):
    circles = np.random.randint(0, d, (n, 3))
    x = np.zeros((d, d), dtype=int)
    f = lambda x, y: ((x - x0)**2 + (y - y0)**2) <= (r/d*10)**2
    for x0, y0, r in circles:
        x += np.fromfunction(f, x.shape)
    x = np.clip(x, 0, 1)
    return x
def unet_weight_map(y, wc=None, w0 = 10, sigma = 5):
    """
    Generate weight maps as specified in the U-Net paper
    for boolean mask.
    "U-Net: Convolutional Networks for Biomedical Image Segmentation"
    https://arxiv.org/pdf/1505.04597.pdf
    Parameters
    ----------
    y: Numpy array
        2D array of shape (image_height, image_width) representing binary mask
        of objects.
    wc: dict
        Dictionary of weight classes.
    w0: int
        Border weight parameter.
    sigma: int
        Border width parameter.
    Returns
    -------
    Numpy array
        Training weights. A 2D array of shape (image_height, image_width).
    """
    labels = label(y)
    no_labels = labels == 0
    label_ids = sorted(np.unique(labels))[1:]
    if len(label_ids) > 1:
        distances = np.zeros((y.shape[0], y.shape[1], len(label_ids)))
        for i, label_id in enumerate(label_ids):
            distances[:,:,i] = distance_transform_edt(labels != label_id)
        distances = np.sort(distances, axis=2)
        d1 = distances[:,:,0]
        d2 = distances[:,:,1]
        w = w0 * np.exp(-1/2*((d1 + d2) / sigma)**2) * no_labels
    else:
        w = np.zeros_like(y)
    if wc:
        class_weights = np.zeros_like(y)
        for k, v in wc.items():
            class_weights[y == k] = v
        w = w + class_weights
    return w
y = generate_random_circles()
wc = {
0: 1, # background
1: 5 # objects
}
w = unet_weight_map(y, wc)
imshow(w)
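The code above only generates the weight map. To show how such a map might enter the training loss, here is a minimal NumPy sketch of a pixel-wise weighted binary cross-entropy. The function name, the tiny example arrays and the normalization by the sum of weights are assumptions made for illustration; the original answer does not specify them. In Keras, a common pattern is to feed the weight map alongside the labels and compute a loss of this form in a custom loss function, but that wiring is not shown here.

```python
import numpy as np

def weighted_binary_cross_entropy(y_true, p_pred, weights, eps=1e-7):
    """Per-pixel cross-entropy scaled by a weight map, averaged by total weight."""
    p = np.clip(p_pred, eps, 1.0 - eps)  # avoid log(0)
    per_pixel = -(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))
    return np.sum(weights * per_pixel) / np.sum(weights)

# Tiny illustrative example: 2x2 label image and predicted foreground probabilities.
y_true = np.array([[0.0, 1.0],
                   [1.0, 0.0]])
p_pred = np.array([[0.1, 0.8],
                   [0.6, 0.3]])

uniform = np.ones_like(y_true)    # every pixel counts equally
emphasis = np.array([[1.0, 1.0],
                     [10.0, 1.0]])  # up-weight the worst-predicted pixel

loss_uniform = weighted_binary_cross_entropy(y_true, p_pred, uniform)
loss_emphasis = weighted_binary_cross_entropy(y_true, p_pred, emphasis)
print(loss_uniform, loss_emphasis)
```

With uniform weights this reduces to the ordinary mean cross-entropy; a map such as the output of unet_weight_map raises the cost of mistakes on the narrow background gaps between touching objects.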
I think you want to use class_weight in Keras. This is actually simple to introduce in your model if you have already calculated the class weights.
Create a dictionary with your class labels and their associated weights. For example
class_weight = {0: 10.9,
1: 20.8,
2: 1.0,
3: 50.5}
Or create a 1D Numpy array of the same length as your number of classes. For example
class_weight = [10.9, 20.8, 1.0, 50.5]
Pass this parameter during training in your model.fit or model.fit_generator
model.fit(x, y, batch_size=batch_size, epochs=num_epochs, verbose=1, class_weight=class_weight)
You can look up the Keras documentation for more details here.