I'm trying to create a variational autoencoder, which means I need a custom loss function. The problem is that the loss function combines two different losses, an MSE term and a KL-divergence term, and the MSE is a Tensor while the divergence is a KerasTensor (because it is built from the mu and sigma tensors coming out of the encoder). I get this error:
TypeError: Cannot convert a symbolic Keras input/output to a numpy
array. This error may indicate that you're trying to pass a symbolic
value to a NumPy call, which is not supported. Or, you may be trying
to pass Keras symbolic inputs/outputs to a TF API that does not
register dispatching, preventing Keras from automatically converting
the API call to a lambda layer in the Functional Model.
So here is my architecture:
import tensorflow.keras as keras
from keras.layers import Input, Dense, Flatten, Reshape
from keras.layers import Conv2D, MaxPooling2D, UpSampling2D, Conv2DTranspose
from keras.models import Model
import tensorflow as tf
import keras.backend as K
encoded_dim = 2
class Sampling(keras.layers.Layer):
    """Uses (z_mean, z_log_var) to sample z, the vector encoding a digit."""
    def call(self, inputs):
        z_mean, z_log_var = inputs
        batch = tf.shape(z_mean)[0]
        dim = tf.shape(z_mean)[1]
        epsilon = K.random_normal(shape=(batch, dim))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon
img = Input((28,28,1), name='img')
x = Conv2D(32, (3,3), padding='same', activation='relu')(img)
x = MaxPooling2D()(x)
x = Conv2D(64, (3,3), padding='same', activation='relu')(x)
x = MaxPooling2D()(x)
x = Flatten()(x)
x = Dense(16, activation='relu')(x)
mu = Dense(encoded_dim, name='mu')(x)
sigma = Dense(encoded_dim, name='sigma')(x)
z = Sampling()([mu,sigma])
# print(mu)
xx = Input((encoded_dim,))
x = Dense(7*7*64, activation='relu')(xx)
x = Reshape((7,7,64))(x)
x = Conv2DTranspose(64, 3, activation="relu", strides=2, padding="same")(x)
x = Conv2DTranspose(32, 3, activation="relu", strides=2, padding="same")(x)
out = Conv2DTranspose(1, 3, activation="sigmoid", padding="same")(x)
encoder = Model(img,z, name='encoder')
decoder = Model(xx,out,name='decoder')
autoencoder = Model(img,decoder(encoder(img)),name='autoencoder')
And the loss function:
def vae_loss(x, y):
    loss = tf.reduce_mean(K.square(x - y))
    kl_loss = -0.5 * tf.reduce_mean(1 + sigma - tf.square(mu) - tf.exp(sigma))
    print(type(loss))
    print(type(kl_loss))
    return loss + kl_loss
autoencoder.compile(optimizer='adam',
loss = vae_loss)
autoencoder.fit(train,train,
epochs=1,
batch_size=60,
shuffle=True,
verbose = 2)
Types of loss and kl_loss:
<class 'tensorflow.python.framework.ops.Tensor'>
<class 'tensorflow.python.keras.engine.keras_tensor.KerasTensor'>
You need to pass mu and sigma to your loss function. vae_loss now accepts 4 inputs:
def vae_loss(x, y, mu, sigma):
    loss = tf.reduce_mean(K.square(x - y))
    kl_loss = -0.5 * tf.reduce_mean(1 + sigma - tf.square(mu) - tf.exp(sigma))
    return loss + kl_loss
You can use it inside your model simply with autoencoder.add_loss.
It's also important that the encoder returns not only z but also mu and sigma.
z, mu, sigma = encoder(img)
out = decoder(z)
autoencoder = Model(img, out, name='autoencoder')
autoencoder.add_loss(vae_loss(img, out, mu, sigma)) # <======= add_loss
autoencoder.compile(loss=None, optimizer='adam')
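For the unpacking z, mu, sigma = encoder(img) to work, the encoder from the question (defined as Model(img, z)) has to expose all three tensors; a minimal sketch of that one-line change:
encoder = Model(img, [z, mu, sigma], name='encoder')  # expose mu and sigma alongside z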
Here is the running notebook: https://colab.research.google.com/drive/1r5lMZ2Dc_Lq4KJDqiirXla1qfDVmdwxW?usp=sharing
I am trying to use a custom loss function in my Keras sequential model (TensorFlow 2.6.0). This custom loss will (ideally) calculate the data loss plus the residual of a physical equation (say, the diffusion equation, Navier-Stokes, etc.). This residual error is based on the derivative of the model output with respect to its inputs, and I want to use GradientTape.
In this MWE, I removed the data loss term and other equation losses, and just used the derivative of the output wrt its input. The dataset can be found here.
from numpy import loadtxt
from keras.models import Sequential
from keras.layers import Dense
import tensorflow as tf #tf.__version__ = '2.6.0'
# load the dataset
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:8] #X.shape = (768, 8)
y = dataset[:,8]
X = tf.convert_to_tensor(X, dtype=tf.float32)
y = tf.convert_to_tensor(y, dtype=tf.float32)
def customLoss(y_true, y_pred):
    x_tensor = tf.convert_to_tensor(model.input, dtype=tf.float32)
    # x_tensor = tf.cast(x_tensor, tf.float32)
    with tf.GradientTape() as t:
        t.watch(x_tensor)
        output = model(x_tensor)
    DyDX = t.gradient(output, x_tensor)
    dy_t = DyDX[:, 5:6]
    R_pred = dy_t
    # loss_data = tf.reduce_mean(tf.square(yTrue - yPred), axis=-1)
    loss_PDE = tf.reduce_mean(tf.square(R_pred))
    return loss_PDE
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(12, activation='relu'))
model.add(Dense(12, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss=customLoss, optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=15)
After execution, I get this ValueError:
ValueError: Passed in object of type <class 'keras.engine.keras_tensor.KerasTensor'>, not tf.Tensor
When I change loss=customLoss to loss='mse', the model starts training, but using that customLoss is the whole point. Any ideas?
The problem seems to come from model.input in the loss function. If I understand your code correctly, you can use this loss instead:
def custom_loss_pass(model, x_tensor):
    def custom_loss(y_true, y_pred):
        with tf.GradientTape() as t:
            t.watch(x_tensor)
            output = model(x_tensor)
        DyDX = t.gradient(output, x_tensor)
        dy_t = DyDX[:, 5:6]
        R_pred = dy_t
        # loss_data = tf.reduce_mean(tf.square(yTrue - yPred), axis=-1)
        loss_PDE = tf.reduce_mean(tf.square(R_pred))
        return loss_PDE
    return custom_loss
And then:
model.compile(loss=custom_loss_pass(model, X), optimizer='adam', metrics=['accuracy'])
I am not sure it does what you want but at least it works!
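If you also want the data-fit term that the question removed for the MWE, a minimal sketch (the combined weighting and the explicit reshape are my assumptions, not part of the original answer) is to compute it inside the same closure:
import tensorflow as tf

def custom_loss_pass(model, x_tensor, pde_weight=1.0):
    def custom_loss(y_true, y_pred):
        with tf.GradientTape() as t:
            t.watch(x_tensor)
            output = model(x_tensor)
        DyDX = t.gradient(output, x_tensor)
        dy_t = DyDX[:, 5:6]
        # data loss on the current batch; reshape avoids accidental broadcasting
        # if y_true arrives as shape (batch,) while y_pred is (batch, 1)
        loss_data = tf.reduce_mean(
            tf.square(tf.reshape(y_true, tf.shape(y_pred)) - y_pred))
        # PDE residual over x_tensor
        loss_PDE = tf.reduce_mean(tf.square(dy_t))
        return loss_data + pde_weight * loss_PDE
    return custom_loss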
I am trying to write a custom loss function for a Keras NN model, but it seems like the loss function is outputting the wrong value. My loss function is:
def tangle_loss3(input_tensor):
    def custom_loss(y_true, y_pred):
        true_diff = y_true - input_tensor
        pred_diff = y_pred - input_tensor
        normalized_diff = K.abs(tf.math.divide(pred_diff, true_diff))
        normalized_diff = tf.reduce_mean(normalized_diff)
        return normalized_diff
    return custom_loss
Then I use it in this simple feed-forward network:
input_layer = Input(shape=(384,), name='input')
hl_1 = Dense(64, activation='elu', name='hl_1')(input_layer)
hl_2 = Dense(32, activation='elu', name='hl_2')(hl_1)
hl_3 = Dense(32, activation='elu', name='hl_3')(hl_2)
output_layer = Dense(384, activation=None, name='output')(hl_3)
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
model = tf.keras.models.Model(input_layer, output_layer)
model.compile(loss=tangle_loss3(input_layer), optimizer=optimizer)
Then, to test whether the loss function is working, I created a random input and target vector and computed in NumPy what I expect, but this does not seem to match the result from Keras.
X = np.random.rand(1, 384)
y = np.random.rand(1, 384)
np.mean(np.abs((model.predict(X) - X)/(y - X)))
# returns some number
model.test_on_batch(X, y)
# always returns 0.0
Why does my loss function always return zero? And should these answers match?
I misunderstood your issue and have updated my method; it should work now. I stack the input layer and the output layer into a new tensor that I use as the model output.
def tangle_loss3(y_true, y_pred):
    true_diff = y_true - y_pred[0]
    pred_diff = y_pred[1] - y_pred[0]
    normalized_diff = tf.abs(tf.math.divide(pred_diff, true_diff))
    normalized_diff = tf.reduce_mean(normalized_diff)
    return normalized_diff
input_layer = Input(shape=(384,), name='input')
hl_1 = Dense(64, activation='elu', name='hl_1')(input_layer)
hl_2 = Dense(32, activation='elu', name='hl_2')(hl_1)
hl_3 = Dense(32, activation='elu', name='hl_3')(hl_2)
output_layer = Dense(384, activation=None, name='output')(hl_3)
out = tf.stack([input_layer, output_layer])
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
model = tf.keras.models.Model(input_layer, out)
model.compile(loss=tangle_loss3, optimizer=optimizer)
And now when I calculate the loss, it works:
X = np.random.rand(1, 384)
y = np.random.rand(1, 384)
np.mean(np.abs((model.predict(X)[1] - X)/(y - X)))
# returns some number
model.test_on_batch(X, y)
Note that I have to use model.predict(X)[1] as we get two outputs, both the input and output layers' results. This is just one hacky solution but it works.
The custom loss works well as a single non-nested custom_loss(y_true, y_pred). You can add a Keras Subtract layer to the output and then use a new label, new_label = label - input, right before you add it to the training pipeline.
Then use only the custom loss, as sketched below.
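A minimal sketch of that Subtract-layer idea (the layer name 'pred_minus_input' and the fit call are illustrative assumptions, not code from the thread): the model outputs prediction - input, the labels are shifted to y - X, and the loss becomes a plain non-nested function of y_true and y_pred.
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Subtract
from tensorflow.keras.models import Model

def custom_loss(y_true, y_pred):
    # y_true = y - X and y_pred = model(X) - X, so this is the same normalized ratio as before
    return tf.reduce_mean(tf.abs(tf.math.divide(y_pred, y_true)))

input_layer = Input(shape=(384,), name='input')
hl_1 = Dense(64, activation='elu', name='hl_1')(input_layer)
hl_2 = Dense(32, activation='elu', name='hl_2')(hl_1)
hl_3 = Dense(32, activation='elu', name='hl_3')(hl_2)
output_layer = Dense(384, activation=None, name='output')(hl_3)
diff = Subtract(name='pred_minus_input')([output_layer, input_layer])

model = Model(input_layer, diff)
model.compile(loss=custom_loss,
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.001))
# model.fit(X, y - X, ...)  # labels are shifted the same way before training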
I am using a custom loss function in addition to the mean squared error loss function in my Keras model. The code for the custom loss function is given below:
import numpy as np
import tensorflow as tf
import keras.backend as K

def grad1(matrix):
    dx = 1.0
    u_x = np.gradient(matrix, dx, axis=0)
    u_xx = np.gradient(u_x, dx, axis=0)
    return u_xx

def artificial_diffusion(y_true, y_pred):
    u_xxt = tf.py_func(grad1, [y_true], tf.float32)
    u_xxp = tf.py_func(grad1, [y_pred], tf.float32)
    lap_mse = tf.losses.mean_squared_error(u_xxt, u_xxp) + K.epsilon()
    return lap_mse
I have this 1D CNN model:
input_img = Input(shape=(n_states,n_features))
x = Conv1D(32, kernel_size=5, activation='relu', padding='same')(input_img)
x = Conv1D(32, kernel_size=5, activation='relu', padding='same')(x)
x = Conv1D(32, kernel_size=5, activation='relu', padding='same')(x)
decoded1 = Conv1D(n_outputs, kernel_size=3, activation='linear', padding='same',
name='regression')(x)
decoded2 = Conv1D(n_outputs, kernel_size=3, activation='linear', padding='same',
name='diffusion')(x)
model = Model(inputs=input_img, outputs=[decoded1,decoded2])
model.compile(loss=['mse',artificial_diffusion],
loss_weights=[1, 1],
optimizer='adam',metrics=[coeff_determination])
When I compile and run the model, I get the error An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval. If I create the model as model = Model(inputs=input_img, outputs=[decoded1,decoded1]), then there is no error, but then I can't monitor the two losses separately. Am I making a mistake while constructing the model?
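One possible workaround, sketched under assumptions rather than taken from an accepted answer: tf.py_func has no registered gradient, so the second derivative can be expressed with differentiable tensor ops instead of np.gradient. The sketch assumes the derivative is meant along the n_states axis (axis 1 of the batched (batch, n_states, n_outputs) tensors); note that grad1 above differentiates along axis 0, which on a batched tensor is the batch axis.
import tensorflow as tf

def second_difference(y):
    # central second difference with unit spacing along the n_states axis
    return y[:, 2:, :] - 2.0 * y[:, 1:-1, :] + y[:, :-2, :]

def artificial_diffusion_diff(y_true, y_pred):
    u_xxt = second_difference(y_true)
    u_xxp = second_difference(y_pred)
    return tf.reduce_mean(tf.square(u_xxt - u_xxp))

model.compile(loss=['mse', artificial_diffusion_diff],
              loss_weights=[1, 1],
              optimizer='adam')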
I am working on generalizing the inputs to the sample variational autoencoder in the Keras repository, but seem to have made some elementary mistakes. In particular, only certain batch sizes work for the model below:
from keras.layers import Lambda, Input, Dense, Reshape
from keras.models import Model
from keras.losses import mse
from keras import backend as K
import numpy as np
# reparameterization trick
# instead of sampling from Q(z|X), sample epsilon = N(0,I)
# z = z_mean + sqrt(var) * epsilon
def sampling(args):
    z_mean, z_log_var = args
    batch = K.shape(z_mean)[0]
    dim = K.int_shape(z_mean)[1]
    # by default, random_normal has mean = 0 and std = 1.0
    epsilon = K.random_normal(shape=(batch, dim))
    return z_mean + K.exp(0.5 * z_log_var) * epsilon
# network parameters
original_dim = 45
input_shape = (original_dim, )
intermediate_dim = 512
latent_dim = 2
# VAE model = encoder + decoder
# build encoder model
inputs = Input(shape=input_shape, name='encoder_input')
x = Reshape((original_dim,))(inputs)
x = Dense(intermediate_dim, activation='relu')(x)
z_mean = Dense(latent_dim, name='z_mean')(x)
z_log_var = Dense(latent_dim, name='z_log_var')(x)
z = Lambda(sampling, output_shape=(latent_dim,), name='z')([z_mean, z_log_var])
encoder = Model(inputs, [z_mean, z_log_var, z], name='encoder')
# build decoder model
latent_inputs = Input(shape=(latent_dim,), name='z_sampling')
x = Dense(intermediate_dim, activation='relu')(latent_inputs)
x = Dense(original_dim, activation='sigmoid')(x)
outputs = Reshape(input_shape)(x)
decoder = Model(latent_inputs, outputs, name='decoder')
# instantiate VAE model
outputs = decoder(encoder(inputs)[2])
vae = Model(inputs, outputs, name='vae_mlp')
vae.add_loss(mse(inputs, outputs))
vae.compile(optimizer='adam')
x_train = np.random.rand(1000, 45)
vae.fit(x_train, epochs=100, batch_size=10) # works, while 23 fails
Can anyone help me understand why some batch sizes fail (e.g. 23)? I'd be grateful for any insights others can offer on this question.
You currently get unequal batch sizes when data % batch_size != 0. You can solve your problem by changing your code to:
x_train = np.random.rand(1000, 45)
batch_size = 23
vae.fit(x_train, epochs=100, batch_size=batch_size,
        steps_per_epoch=len(x_train) // batch_size)
This results in all batches having the same size; see the documentation of fit and its arguments.
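An alternative sketch based on the same diagnosis (drop the final partial batch so every batch is full) is to truncate the training array to a multiple of the batch size:
batch_size = 23
n_full = (len(x_train) // batch_size) * batch_size  # 989 of the 1000 samples
vae.fit(x_train[:n_full], epochs=100, batch_size=batch_size)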
I am following the tutorial here. My model part is:
input_img = keras.Input(shape=img_shape)
x = layers.Conv2D(32, (3, 3),
padding='same', activation='relu')(input_img)
...
x = layers.Conv2D(64, (3, 3),
padding='same', activation='relu')(x)
shape_before_flattening = K.int_shape(x)
x = layers.Flatten()(x)
x = layers.Dense(32, activation='relu')(x)
z_mean = layers.Dense(latent_dim)(x)
z_log_var = layers.Dense(latent_dim)(x)
def sampling(args):
...
z = layers.Lambda(sampling)([z_mean, z_log_var])
decoder_input = layers.Input(K.int_shape(z)[1:])
x = layers.Dense(np.prod(shape_before_flattening[1:]),
activation='relu')(decoder_input)
x = layers.Reshape(shape_before_flattening[1:])(x)
x = layers.Conv2DTranspose(32, 3,
padding='same', activation='relu',
strides=(2, 2))(x)
x = layers.Conv2D(1, 3,
padding='same', activation='sigmoid')(x)
# This is our decoder model from latent space to reconstructed images
decoder = Model(decoder_input, x)
# We then apply it to `z` to recover the decoded `z`.
z_decoded = decoder(z)
def vae_loss(self, x, z_decoded):
...
# Fit the end-to-end model
vae = Model(input_img, z_decoded) # vae = Model(input_img, x)
vae.compile(optimizer='rmsprop', loss=vae_loss)
vae.summary()
My question is: is the end-to-end model vae = Model(input_img, z_decoded) or vae = Model(input_img, x)? Should we compute the loss between input_img and z_decoded, or between input_img and x? Thanks.
x changes throughout the model; with x = layers.Conv2D(1, 3, padding='same', activation='sigmoid')(x) you set x to be the last layer of your decoder model.
When doing z_decoded = decoder(z) you chain your decoder straight after the encoder; z_decoded is the output layer of your decoder, and thus the same layer as the earlier x. This also creates the link between the actual input and the output.
Computing the loss would yield the same results on both (as they both represent the same layer).
In short: both vae = Model(input_img, z_decoded) and vae = Model(input_img, x) represent the end-to-end model; I would suggest using the z_decoded version, for readability.