Concatenate identity matrix to each vector - python

I want to modify my input by appending several different suffixes to the input vector. For example, if the (single) input is [1, 5, 9, 3], I want to create three vectors (stored as a matrix) like this:
[[1, 5, 9, 3, 1, 0, 0],
[1, 5, 9, 3, 0, 1, 0],
[1, 5, 9, 3, 0, 0, 1]]
Of course, this is just one observation so the input to the model is (None, 4) in this case. The simple way is to prepare the input data somewhere else (numpy most probably) and adjust the shape of input accordingly. That I can do but I would prefer doing it inside TensorFlow/Keras.
I have isolated the problem into this code:
import keras.backend as K
from keras import Input, Model
from keras.layers import Lambda
def build_model(dim_input: int, dim_eye: int):
    input = Input((dim_input,))
    concat = Lambda(lambda x: concat_eye(x, dim_input, dim_eye))(input)
    return Model(inputs=[input], outputs=[concat])
def concat_eye(x, dim_input, dim_eye):
    x = K.reshape(x, (-1, 1, dim_input))
    x = K.repeat_elements(x, dim_eye, axis=1)
    eye = K.expand_dims(K.eye(dim_eye), axis=0)
    eye = K.tile(eye, (-1, 1, 1))
    out = K.concatenate([x, eye], axis=2)
    return out
def main():
    import numpy as np
    n = 100
    dim_input = 20
    dim_eye = 3
    model = build_model(dim_input, dim_eye)
    model.compile(optimizer='sgd', loss='mean_squared_error')
    x_train = np.zeros((n, dim_input))
    y_train = np.zeros((n, dim_eye, dim_eye + dim_input))
    model.fit(x_train, y_train)
if __name__ == '__main__':
    main()
The problem seems to be in the -1 in shape argument in tile function. I tried to replace it with 1 and None. Each has its own error:
-1: error during model.fit
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected multiples[0] >= 0, but got -1
1: error during model.fit
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [32,3,20] vs. shape[1] = [1,3,3]
None: error during build_model:
Failed to convert object of type <class 'tuple'> to Tensor. Contents: (None, 1, 1). Consider casting elements to a supported type.

You need to use K.shape() to get the symbolic (runtime) shape of the input tensor. The batch size is None while the graph is being built, so passing K.int_shape(x)[0], None, or -1 as the first multiple in the second argument of K.tile() does not work:
eye = K.tile(eye, (K.shape(x)[0], 1, 1))
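As a sanity check for the fixed layer, here is a minimal NumPy sketch of the intended output. It mirrors the backend calls (reshape, repeat, tile, concatenate) but reads the batch size from the concrete array, which is roughly the role K.shape(x)[0] plays at run time in the graph. The name concat_eye_reference is made up for this illustration and is not part of Keras.

```python
import numpy as np

def concat_eye_reference(x, dim_eye):
    """NumPy reference: append each row of the identity matrix to every input vector."""
    batch_size, dim_input = x.shape
    # (batch, 1, dim_input) -> (batch, dim_eye, dim_input)
    x = np.repeat(x.reshape(batch_size, 1, dim_input), dim_eye, axis=1)
    # (1, dim_eye, dim_eye) -> (batch, dim_eye, dim_eye), using the actual batch size
    eye = np.tile(np.eye(dim_eye)[np.newaxis], (batch_size, 1, 1))
    return np.concatenate([x, eye], axis=2)

out = concat_eye_reference(np.array([[1.0, 5.0, 9.0, 3.0]]), dim_eye=3)
print(out.shape)  # (1, 3, 7)
print(out[0])
```

Comparing model.predict(x) against this function on a small batch should show whether the Lambda layer behaves as intended.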

Proper way to input a scalar into a Tensorflow 2 model

In my Tensorflow 2 model, I want my batch size to be parametric, such that I can build tensors which have appropriate batch size dynamically. I have the following code:
import tensorflow as tf
batch_size_param = 128
tf_batch_size = tf.keras.Input(shape=(), name="tf_batch_size", dtype=tf.int32)
batch_indices = tf.range(0, tf_batch_size, 1)
md = tf.keras.Model(inputs={"tf_batch_size": tf_batch_size}, outputs=[batch_indices])
res = md(inputs={"tf_batch_size": batch_size_param})
The code throws an error in tf.range:
ValueError: Shape must be rank 0 but is rank 1
for 'limit' for '{{node Range}} = Range[Tidx=DT_INT32](Range/start, tf_batch_size, Range/delta)' with input shapes: [], [?], []
I think the problem is that tf.keras.Input automatically adds a batch dimension in front: it expects the shape of a single sample, without the batch size, and infers the batch size from the input array, which in my case is a scalar. I could feed the scalar into tf.range as a constant integer, but then I would not be able to change it after the model graph has been built.
Interestingly, I failed to find a proper way to input only a scalar into a TF-2 model even though I checked the documentation, too. So, what would be the best way to handle such a case?
Don't use tf.keras.Input and just define the model by subclassing.
import tensorflow as tf
class ScalarModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
    def call(self, x):
        return tf.range(0, x, 1)
print(ScalarModel()(10))
# tf.Tensor([0 1 2 3 4 5 6 7 8 9], shape=(10,), dtype=int32)
I'm not sure if this is actually a good idea, but you could use tf.squeeze like this:
import tensorflow as tf
from tensorflow import keras
inp = keras.Input(shape=(), dtype=tf.int32)
batch_indices = tf.range(tf.squeeze(inp))
model = keras.Model(inputs=inp, outputs=batch_indices)
so that
model(6)
gives
<tf.Tensor: shape=(6,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5])>
Edit:
Depending on what you want to achieve, it might also be worth looking into ragged tensors:
inp = keras.Input(shape=(), dtype=tf.int32)
batch_indices = tf.ragged.range(inp)
model = keras.Model(inputs=inp, outputs=batch_indices)
would make
model(np.array([6,7]))
return
<tf.RaggedTensor [[0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 5, 6]]>

Why is pytorch softmax function not working?

This is my code:
import torch.nn.functional as F
import torch
inputs = [1,2,3]
input = torch.tensor(inputs)
output = F.softmax(input, dim=1)
print(output)
Is the code failing because of the dim argument?
Here is the error:
File "c:\Users\user\Desktop\AI\pytorch_jovian\linear_reg.py", line 19, in <module>
output = F.softmax(input, dim=1)
File "C:\Users\user\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\functional.py", line 1583, in softmax
ret = input.softmax(dim)
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
Apart from the dim argument (a 1-D tensor only has dimension 0, so use dim=0 or dim=-1), there is another issue in your code. Softmax doesn't work on a long (integer) tensor, so the input should be converted to a float or double tensor first:
>>> input = torch.tensor([1, 2, 3])
>>> input
tensor([1, 2, 3])
>>> F.softmax(input.float(), dim=0)
tensor([0.0900, 0.2447, 0.6652])
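The numbers above can be checked without PyTorch. The following is a minimal plain-Python sketch of softmax over a flat list, which corresponds to the dim=0 case for a 1-D tensor. The helper name softmax is just for this illustration.

```python
import math

def softmax(values):
    """Softmax over a flat list of numbers (the dim=0 case for a 1-D tensor)."""
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1, 2, 3])
print([round(p, 4) for p in probs])  # [0.09, 0.2447, 0.6652]
```

There is only one axis to normalize over here, which is why dim=1 is out of range for this input.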

Explicit broadcasting of variable batch-size tensor

I'm trying to implement a custom Keras Layer in Tensorflow 2.0RC and need to concatenate a [None, Q] shaped tensor onto a [None, H, W, D] shaped tensor to produce a [None, H, W, D + Q] shaped tensor. It is assumed that the two input tensors have the same batch size even though it is not known beforehand. Also, none of H, W, D, and Q are known at write-time but are evaluated in the layer's build method when the layer is first called. The issue that I'm experiencing is when broadcasting the [None, Q] shaped tensor up to a [None, H, W, Q] shaped tensor in order to concatenate.
Here is an example of trying to create a Keras Model using the Functional API that performs variable-batch broadcasting from shape [None, 3] to shape [None, 5, 5, 3]:
import tensorflow as tf
import tensorflow.keras.layers as kl
import numpy as np
x = tf.keras.Input([3]) # Shape [None, 3]
y = kl.Reshape([1, 1, 3])(x) # Need to add empty dims before broadcasting
y = tf.broadcast_to(y, [-1, 5, 5, 3]) # Broadcast to shape [None, 5, 5, 3]
model = tf.keras.Model(inputs=x, outputs=y)
print(model(np.random.random(size=(8, 3))).shape)
Tensorflow produces the error:
InvalidArgumentError: Dimension -1 must be >= 0
And then when I change -1 to None it gives me:
TypeError: Failed to convert object of type <class 'list'> to Tensor. Contents: [None, 5, 5, 3]. Consider casting elements to a supported type.
How can I perform the specified broadcasting?
You need to use the dynamic shape of y to determine the batch size. The dynamic shape of a tensor y is given by tf.shape(y) and is a tensor op representing the shape of y evaluated at runtime. The modified example demonstrates this by selecting between the old shape, [None, 1, 1, 3], and the new shape using tf.where.
import tensorflow as tf
import tensorflow.keras.layers as kl
import numpy as np
x = tf.keras.Input([3]) # Shape [None, 3]
y = kl.Reshape([1, 1, 3])(x) # Need to add empty dims before broadcasting
# Retain the batch and depth dimensions, but broadcast along H and W
broadcast_shape = tf.where([True, False, False, True],
                           tf.shape(y), [0, 5, 5, 0])
y = tf.broadcast_to(y, broadcast_shape) # Broadcast to shape [None, 5, 5, 3]
model = tf.keras.Model(inputs=x, outputs=y)
print(model(np.random.random(size=(8, 3))).shape)
# prints: "(8, 5, 5, 3)"
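For comparison, here is a small NumPy analogy (not TensorFlow code). NumPy arrays always have concrete shapes, so y.shape[0] plays the role that tf.shape(y)[0] plays in the graph. The sketch also shows that tiling only along H and W gives the same result without mentioning the batch size at all. tf.tile takes its multiples argument in the same way, so tiling with [1, 5, 5, 1] may be another option worth trying, though I have not run that against 2.0RC.

```python
import numpy as np

x = np.random.random(size=(8, 3))
y = x.reshape(-1, 1, 1, 3)  # shape (8, 1, 1, 3)

# Option 1: read the batch size from the runtime shape, then broadcast.
a = np.broadcast_to(y, (y.shape[0], 5, 5, 3))
# Option 2: tile only H and W; a multiple of 1 leaves batch and depth untouched.
b = np.tile(y, (1, 5, 5, 1))

print(a.shape, b.shape)      # (8, 5, 5, 3) (8, 5, 5, 3)
print(np.array_equal(a, b))  # True
```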
References:
"TensorFlow: Shapes and dynamic dimensions"

Trying to understand cross_entropy loss in PyTorch

This is a very newbie question but I'm trying to wrap my head around cross_entropy loss in Torch so I created the following code:
x = torch.FloatTensor([
[1.,0.,0.]
,[0.,1.,0.]
,[0.,0.,1.]
])
print(x.argmax(dim=1))
y = torch.LongTensor([0,1,2])
loss = torch.nn.functional.cross_entropy(x, y)
print(loss)
which outputs the following:
tensor([0, 1, 2])
tensor(0.5514)
What I don't understand is given my input matches the expected output why is the loss not 0?
That is because the input to the cross-entropy function is interpreted as logits, not as probabilities. The logits are turned into probabilities with the softmax formula:
probas = np.exp(logits) / np.sum(np.exp(logits), axis=1, keepdims=True)
So the matrix of probabilities PyTorch will use in your case is:
[0.5761168847658291, 0.21194155761708547, 0.21194155761708547]
[0.21194155761708547, 0.5761168847658291, 0.21194155761708547]
[0.21194155761708547, 0.21194155761708547, 0.5761168847658291]
torch.nn.functional.cross_entropy combines log_softmax (softmax followed by a logarithm) and nll_loss (negative log-likelihood loss) in a single function, i.e. it is equivalent to F.nll_loss(F.log_softmax(x, 1), y).
Code:
import torch
import torch.nn.functional as F
x = torch.FloatTensor([[1., 0., 0.],
                       [0., 1., 0.],
                       [0., 0., 1.]])
y = torch.LongTensor([0, 1, 2])
print(torch.nn.functional.cross_entropy(x, y))
print(F.softmax(x, 1).log())
print(F.log_softmax(x, 1))
print(F.nll_loss(F.log_softmax(x, 1), y))
output:
tensor(0.5514)
tensor([[-0.5514, -1.5514, -1.5514],
[-1.5514, -0.5514, -1.5514],
[-1.5514, -1.5514, -0.5514]])
tensor([[-0.5514, -1.5514, -1.5514],
[-1.5514, -0.5514, -1.5514],
[-1.5514, -1.5514, -0.5514]])
tensor(0.5514)
Read more in the documentation for torch.nn.functional.cross_entropy.
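To see where 0.5514 comes from without any framework, here is a minimal plain-Python sketch of the same computation: log-softmax per row, then the mean negative log-likelihood of the target class. The helper names log_softmax and cross_entropy are illustrative and are not the PyTorch functions themselves.

```python
import math

def log_softmax(row):
    """Log-softmax of one row of logits, using the log-sum-exp trick."""
    shift = max(row)
    log_sum = shift + math.log(sum(math.exp(v - shift) for v in row))
    return [v - log_sum for v in row]

def cross_entropy(logits, targets):
    """Mean negative log-likelihood of the target class over all rows."""
    losses = [-log_softmax(row)[t] for row, t in zip(logits, targets)]
    return sum(losses) / len(losses)

logits = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0]]
targets = [0, 1, 2]
loss = cross_entropy(logits, targets)
print(round(loss, 4))  # 0.5514
```

Each row contributes ln(e + 2) - 1, so the loss is only close to 0 when the logit of the correct class is much larger than the others, for example 100 instead of 1.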
Complete, copy/paste runnable example showing an example categorical cross-entropy loss calculation via:
-paper+pencil+calculator
-NumPy
-PyTorch
Other than minor rounding differences all 3 come out to be the same:
import torch
import torch.nn.functional as F
import numpy as np
def main():
    ### paper + pencil + calculator calculation #################
    """
    predictions before softmax:
                        columns
                     (4 categories)
        rows           1, 4, 1, 1
     (3 samples)       5, 1, 2, 1
                       1, 2, 5, 1
    ground truths (NOT one hot encoded)
        1, 0, 2
    preds softmax calculation:
        (e^1/(e^1+e^4+e^1+e^1)), (e^4/(e^1+e^4+e^1+e^1)), (e^1/(e^1+e^4+e^1+e^1)), (e^1/(e^1+e^4+e^1+e^1))
        (e^5/(e^5+e^1+e^2+e^1)), (e^1/(e^5+e^1+e^2+e^1)), (e^2/(e^5+e^1+e^2+e^1)), (e^1/(e^5+e^1+e^2+e^1))
        (e^1/(e^1+e^2+e^5+e^1)), (e^2/(e^1+e^2+e^5+e^1)), (e^5/(e^1+e^2+e^5+e^1)), (e^1/(e^1+e^2+e^5+e^1))
    preds after softmax:
        0.04332, 0.87005, 0.04332, 0.04332
        0.92046, 0.01686, 0.04583, 0.01686
        0.01686, 0.04583, 0.92046, 0.01686
    categorical cross-entropy loss calculation:
        (-ln(0.87005) + -ln(0.92046) + -ln(0.92046)) / 3 = 0.10166
    Note the loss ends up relatively low because all 3 predictions are correct
    """
    ### calculation via NumPy ###################################
    # predictions from model (just made up example data in this case)
    # rows = 3 samples, cols = 4 categories
    preds = np.array([[1, 4, 1, 1],
                      [5, 1, 2, 1],
                      [1, 2, 5, 1]], dtype=np.float32)
    # ground truths, NOT one hot encoded
    gndTrs = np.array([1, 0, 2], dtype=np.int64)
    preds = softmax(preds)
    loss = calcCrossEntropyLoss(preds, gndTrs)
    print('\n' + 'NumPy loss = ' + str(loss) + '\n')
    ### calculation via PyTorch #################################
    # predictions from model (just made up example data in this case)
    # rows = 3 samples, cols = 4 categories
    preds = torch.tensor([[1, 4, 1, 1],
                          [5, 1, 2, 1],
                          [1, 2, 5, 1]], dtype=torch.float32)
    # ground truths, NOT one hot encoded
    gndTrs = torch.tensor([1, 0, 2], dtype=torch.int64)
    loss = F.cross_entropy(preds, gndTrs)
    print('PyTorch loss = ' + str(loss) + '\n')
# end function
def softmax(x: np.ndarray) -> np.ndarray:
    numSamps = x.shape[0]
    for i in range(numSamps):
        x[i] = np.exp(x[i]) / np.sum(np.exp(x[i]))
    # end for
    return x
# end function
def calcCrossEntropyLoss(preds: np.ndarray, gndTrs: np.ndarray) -> np.ndarray:
    assert len(preds.shape) == 2
    assert len(gndTrs.shape) == 1
    assert preds.shape[0] == gndTrs.shape[0]
    numSamps = preds.shape[0]
    mySum = 0.0
    for i in range(numSamps):
        # Note: in numpy, "log" is actually natural log (ln)
        mySum += -1 * np.log(preds[i, gndTrs[i]])
    # end for
    crossEntLoss = mySum / numSamps
    return crossEntLoss
# end function
if __name__ == '__main__':
    main()
program output:
NumPy loss = 0.10165966302156448
PyTorch loss = tensor(0.1017)

tensorflow: Strange result from convolution compared to theano (not flipping, though)

I am trying to implement a deep neural network with tensorflow, but I already have a problem at the first step.
When I type the following using theano.tensor.nnet.conv2d, I get the expected result:
import theano.tensor as T
import theano
import numpy as np
# Theano expects input of shape (batch_size, channels, height, width)
# and filters of shape (out_channel, in_channel, height, width)
x = T.tensor4()
w = T.tensor4()
c = T.nnet.conv2d(x, w, filter_flip=False)
f = theano.function([x, w], [c], allow_input_downcast=True)
base = np.array([[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]).T
i = base[np.newaxis, np.newaxis, :, :]
print(f(i, i)) # -> results in 3 as expected because np.sum(i*i) = 3
However, when I do what is presumably the same thing with tf.nn.conv2d, my result is different:
import tensorflow as tf
import numpy as np
# TF expects input of (batch_size, height, width, channels)
# and filters of shape (height, width, in_channel, out_channel)
x = tf.placeholder(tf.float32, shape=(1, 4, 3, 1), name="input")
w = tf.placeholder(tf.float32, shape=(4, 3, 1, 1), name="weights")
c = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='VALID')
with tf.Session() as sess:
    base = np.array([[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]).T
    i = base[np.newaxis, :, :, np.newaxis]
    weights = base[:, :, np.newaxis, np.newaxis]
    res = sess.run(c, feed_dict={x: i, w: weights})
    print(res) # -> results in -5.31794233e+37
The layout of the convolution operation in tensorflow is a little different from theano, which is why the input looks slightly different.
However, since the stride (subsample) in Theano defaults to (1, 1) and a valid convolution is the default, too, this should be the exact same input.
Furthermore, tensorflow does not flip the kernel (implements cross-correlation).
Do you have any idea why this is not giving the same result?
Thanks in advance,
Roman
Okay, I found a solution, even though it is not really one because I do not understand it myself.
First, it seems that for the task that I was trying to solve, Theano and Tensorflow use different convolutions.
The task at hand is a "1.5 D convolution" which means sliding a kernel in only one direction over the input (here DNA sequences).
In Theano, I solved this with the conv2d operation that had the same amount of rows as the kernels and it was working fine.
However, Tensorflow (probably correctly) wants me to use conv1d for that, interpreting the rows as channels.
So, the following should work but didn't in the beginning:
import tensorflow as tf
import numpy as np
# TF expects input of (batch_size, height, width, channels)
# and filters of shape (height, width, in_channel, out_channel)
x = tf.placeholder(tf.float32, shape=(1, 4, 3, 1), name="input")
w = tf.placeholder(tf.float32, shape=(4, 3, 1, 1), name="weights")
x_star = tf.reshape(x, [1, 4, 3])
w_star = tf.reshape(w, [4, 3, 1])
c = tf.nn.conv1d(x_star, w_star, stride=1, padding='VALID')
with tf.Session() as sess:
    base = np.array([[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]).T
    i = base[np.newaxis, :, :, np.newaxis]
    weights = base[:, :, np.newaxis, np.newaxis]
    res = sess.run(c, feed_dict={x: i, w: weights})
    print(res) # -> produces 3 after updating tensorflow
This code produced NaN until I updated Tensorflow to version 1.0.1 and since then, it produces the expected output.
To summarize, my problem was partly solved by using 1D convolution instead of 2D convolution but still required the update of the framework. For the second part, I have no idea at all what might have caused wrong behavior in the first place.
EDIT: The code I posted in my original question is now working fine, too. So the different behavior came only from an old (maybe corrupt) version of TF.
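For reference, the expected value of 3 can be reproduced without either framework. The following is a minimal NumPy sketch of a 2-D 'VALID' cross-correlation (no kernel flip, stride 1). The function name cross_correlate_valid is made up for this illustration, and the loop is written for clarity rather than speed.

```python
import numpy as np

def cross_correlate_valid(image, kernel):
    """2-D 'VALID' cross-correlation (no kernel flip), stride 1."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w), dtype=np.result_type(image, kernel))
    for r in range(out_h):
        for c in range(out_w):
            # Elementwise product of the kernel with the window it covers.
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

base = np.array([[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]).T  # shape (4, 3)
result = cross_correlate_valid(base, base)
print(result.shape)  # (1, 1)
print(result[0, 0])  # 3
```

Because the kernel has the same size as the input, there is exactly one valid position, so the output is the single value np.sum(base * base).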
