I am trying to write a custom loss function using the outputs of each neuron of the last layer, and the function may not be linear. Here is what I am working on:
## some previous layers##
## my last dense layer##
dense1 = Dense(4, activation="relu", name="dense_layer1")(previous_layer)
dense11 = Dense(1, activation="sigmoid", name="dense11")(dense1)
dense12 = Dense(1, activation="sigmoid", name="dense12")(dense1)
dense13 = Dense(1, activation="sigmoid", name="dense13")(dense1)
dense14 = Dense(1, activation="sigmoid", name="dense14")(dense1)
## custom loss function ##
def custom_layer(tensor):
    return tensor[1]*2 + tensor[2] + tensor[3]/(tensor[4]*2)  # some nonlinear function like this
lambda_layer = Lambda(custom_layer, name="lambda_layer")([dense11, dense12, dense13, dense14])
model = Model(inputs=Input, outputs=lambda_layer)  # "Input" is defined in the previous layers, not shown here
model.compile(loss='mse', optimizer='adam')
model.fit(X_train, Y_train, epochs=2, batch_size=512, verbose=1)
My Y_train is n*1 (n is the sample size).
So I am basically applying a nonlinear transformation to those final four neurons' outputs, which is equivalent to constructing a new loss function. After the transformation, the predicted y-hat should also be an n*1 vector.
But the code keeps failing. I think it is due to the lambda_layer or the custom_layer function. I also tried defining a new loss function instead (so there would be no "lambda_layer"), but that didn't work either. I have no idea what's wrong with it. (headache!)
Any ideas or suggestions are appreciated!! Thanks a lot! (I'm using Python 3.7 with TensorFlow version 2.0.0.)
Solved, thanks!
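For reference, the most likely culprit is the indexing inside custom_layer: the Lambda layer receives a plain Python list of four tensors, so the valid indices are 0 through 3, and tensor[4] raises an out-of-range error. A minimal runnable sketch of the corrected pattern (the input shape and training data below are placeholders, not the original model):

import numpy as np
from tensorflow.keras.layers import Input, Dense, Lambda
from tensorflow.keras.models import Model

inputs = Input(shape=(10,))  # placeholder for the previous layers
dense1 = Dense(4, activation="relu", name="dense_layer1")(inputs)
dense11 = Dense(1, activation="sigmoid", name="dense11")(dense1)
dense12 = Dense(1, activation="sigmoid", name="dense12")(dense1)
dense13 = Dense(1, activation="sigmoid", name="dense13")(dense1)
dense14 = Dense(1, activation="sigmoid", name="dense14")(dense1)

def custom_layer(tensor):
    # `tensor` is a list of four (batch, 1) tensors; indices run 0-3
    return tensor[0]*2 + tensor[1] + tensor[2]/(tensor[3]*2)

lambda_layer = Lambda(custom_layer, name="lambda_layer")([dense11, dense12, dense13, dense14])
model = Model(inputs=inputs, outputs=lambda_layer)
model.compile(loss="mse", optimizer="adam")
model.fit(np.random.random((512, 10)), np.random.random((512, 1)), epochs=2, batch_size=512, verbose=1)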
I have 5 LSTM layers and 2 MLPs which must be concatenated together into another MLP that produces the final output. Here is the code I wrote using the functional API approach, which works fine:
lstm_input = Input(shape=(X_dynamic_LSTM.shape[1], X_dynamic_LSTM.shape[2]))
x = LSTM(70, activation='tanh', return_sequences=True)(lstm_input)
x = Dropout(0.3)(x)
x = Dense(1, activation='tanh')(x)
mlp_input = Input(shape=(X_static_MLP.shape[1],))
mlp = Dense(30, activation='relu')(mlp_input)
mlp = Dense(10, activation='relu')(mlp)
merge = Concatenate()([x, mlp])
hidden1 = Dense(5, activation='relu')(merge)
mlp_out = Dense(1, activation='relu')(hidden1)
model = Model(inputs=[lstm_input, mlp_input], outputs=mlp_out)
model.compile(loss='mae', optimizer='Adam')
history = model.fit([X_dynamic_LSTM, X_static_MLP], y_train, batch_size=20,
                    epochs=10, validation_split=0.2)
If I want to convert this format to one similar to below:
x = Sequential()
x.add(LSTM(70, return_sequences=True))
x.add(Dropout(0.3))
x.add(Dense(1, activation='tanh'))
Can anyone help me with how I should define the MLP, the Concatenate, and the part regarding "model = Model(inputs=[lstm_input, mlp_input], outputs=mlp_out)"?
My main problem is that I want to add an Embedding layer to the LSTM. When I add the following code to the non-API approach, the model works perfectly.
x.add(Embedding(X_dynamic_LSTM.shape[0], 1,mask_zero=True))
But when instead I used
lstm_input = Embedding(X_dynamic_LSTM.shape[0], 1, mask_zero=True)
It gave me the error TypeError: Inputs to a layer should be tensors, so I had to stick with the non-API approach.
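A hedged sketch of one way to get both: keep the Sequential style for each branch, then wire the branches together with the functional API, since a pure Sequential model cannot take two inputs. The TypeError happens because Embedding is a layer, so it must be called on an Input tensor rather than assigned in its place. Layer sizes mirror the code above; the exact wiring (and using return_sequences=False so the branches can be concatenated) is an assumption:

from tensorflow.keras.layers import Input, LSTM, Dropout, Dense, Embedding, Concatenate
from tensorflow.keras.models import Model, Sequential

# LSTM branch as a Sequential sub-model, with the Embedding as its first layer
lstm_branch = Sequential([
    Embedding(X_dynamic_LSTM.shape[0], 1, mask_zero=True),
    LSTM(70, activation='tanh'),  # return_sequences=False, so the output is 2-D
    Dropout(0.3),
    Dense(1, activation='tanh'),
])

# MLP branch as a Sequential sub-model
mlp_branch = Sequential([
    Dense(30, activation='relu'),
    Dense(10, activation='relu'),
])

# functional glue: Sequential models are callable on tensors, just like layers
lstm_input = Input(shape=(X_dynamic_LSTM.shape[1],))
mlp_input = Input(shape=(X_static_MLP.shape[1],))
merge = Concatenate()([lstm_branch(lstm_input), mlp_branch(mlp_input)])
hidden1 = Dense(5, activation='relu')(merge)
mlp_out = Dense(1, activation='relu')(hidden1)
model = Model(inputs=[lstm_input, mlp_input], outputs=mlp_out)
model.compile(loss='mae', optimizer='Adam')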
If I add, e.g. kernel_regularizer=tf.keras.regularizers.L1(0.01) to a layer, do I need to add something to my loss description when I compile, or is it automatically added to my normal loss?
Using tf.keras.regularizers.L1(0.01) will automatically add a penalty to your loss function. You can observe the change in the loss with and without the penalty using this simple example:
import tensorflow as tf
tf.random.set_seed(1)
x_input = tf.keras.layers.Input((1,))
x = tf.keras.layers.Dense(3, kernel_regularizer=tf.keras.regularizers.L1(0.01))(x_input)
x_output = tf.keras.layers.Dense(1, activation='sigmoid')(x)
model = tf.keras.Model(x_input, x_output)
model.compile(optimizer='adam', loss=tf.keras.losses.BinaryCrossentropy())
x = tf.random.normal((1, 1))
y = tf.random.uniform((1, 1), maxval=2, dtype=tf.int32)
model.fit(x, y, epochs=1)
If you were to use a custom training loop, you would have to manually add the penalties defined in those layers to your loss yourself; Keras collects them in model.losses, as sketched below.
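A minimal sketch of that custom-loop case, reusing the model, x, and y from the example above (the optimizer choice is a placeholder):

optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.BinaryCrossentropy()

with tf.GradientTape() as tape:
    y_pred = model(x, training=True)
    loss = loss_fn(y, y_pred)
    loss += tf.add_n(model.losses)  # add the L1 penalty (and any other layer losses)
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))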
I am trying to replace all ReLU activation functions in MobileNetV2 with some custom activation function (abs, swish, leaky ReLU, etc.). I would like to start with abs.
I saw a few similar posts but they were not really helpful for my problem -> see this one.
backbone = tf.keras.applications.mobilenet_v2.MobileNetV2(
    input_shape=IN_SHAPE, include_top=False, weights=None)
x = tf.keras.layers.GlobalAveragePooling2D()(backbone.output)
x = tf.keras.layers.Dropout(dropout_rate)(x)
kernel_initializer = tf.random_normal_initializer(mean=0.0, stddev=0.02)
bias_initializer = tf.constant_initializer(value=0.0)
x = tf.keras.layers.Dense(
    NUM_CLASSES, activation='sigmoid', name='Logits',
    kernel_initializer=kernel_initializer,
    bias_initializer=bias_initializer)(x)
model = tf.keras.models.Model(inputs=backbone.input, outputs=x)
for layer in model.layers:
    layer.trainable = True
smooth = 0.1
loss = tf.keras.losses.BinaryCrossentropy(label_smoothing=smooth)
model = replace_relu_with_abs(model)  # the function I would like to call to replace the activation functions
model.compile(
    optimizer=optimizer,
    loss=loss,
    metrics=['accuracy'])
model.save("mobilenetv2-abs")
model = tf.keras.models.load_model("mobilenetv2-abs")
print(model.summary())
# Here is the function I would like to implement
def replace_relu_with_abs(model):
    for layer in model.layers:
        pass  # do something
    return model
I tried to implement the solution from the link above, but it didn't work: after debugging I saw that MobileNetV2 has a linear activation in its Conv2D layers, and the ReLU activation is a separate layer that comes after the BatchNormalization layer.
Does anyone have a tip or a solution for how to replace the ReLU activation functions with custom ones (here they are actually separate ReLU activation layers)?
I am using TensorFlow version 2.6.
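Since the ReLU6 activations in MobileNetV2 are standalone tf.keras.layers.ReLU layers rather than activation arguments, one hedged option is tf.keras.models.clone_model with a clone_function that swaps those layers out. This is a sketch, assuming abs is applied via a Lambda layer; swish or leaky ReLU would slot in the same way:

import tensorflow as tf

def replace_relu_with_abs(model):
    def clone_fn(layer):
        # MobileNetV2's ReLU6 activations appear as separate ReLU layers
        if isinstance(layer, tf.keras.layers.ReLU):
            return tf.keras.layers.Lambda(tf.abs, name=layer.name + '_abs')
        return layer.__class__.from_config(layer.get_config())

    new_model = tf.keras.models.clone_model(model, clone_function=clone_fn)
    # ReLU and Lambda both have no weights, so the weight lists line up
    new_model.set_weights(model.get_weights())
    return new_model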
I want to specify 2 loss functions, 1 for the object class (cross-entropy) and the other for the bounding box (mean squared error). How do I specify, in model.compile, the corresponding loss function for each output?
model = Sequential()
model.add(Dense(128, activation='relu'))
out_last_dense = model.add(Dense(128, activation='relu'))
object_type = model.add(Dense(1, activation='softmax'))(out_last_dense)
object_coordinates = model.add(Dense(4, activation='softmax'))(out_last_dense)
# here is the problem: I want to specify a loss function for the object type and the coordinates
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer='sgd', metrics=['accuracy'])
First of all, you can't use the Sequential API here since your model has two output layers (i.e. what you have written is all wrong and would raise an error). Instead you must use the Keras functional API:
inp = Input(shape=...)
x = Dense(128, activation='relu')(inp)
x = Dense(128, activation='relu')(x)
object_type = Dense(1, activation='sigmoid', name='type')(x)
object_coordinates = Dense(4, activation='linear', name='coord')(x)
Now, you can specify a loss function (as well as metric) for each output layer based on their names given above and using a dictionary:
model.compile(loss={'type': 'binary_crossentropy', 'coord': 'mse'},
              optimizer='sgd', metrics={'type': 'accuracy', 'coord': 'mae'})
Further, note that you are using softmax as the activation function, and I have changed it to sigmoid and linear above. That's because: 1) using softmax on a layer with one unit does not make sense (if there are more than 2 classes then you should use softmax), and 2) the other layer predicts coordinates, so using softmax is not suitable at all (unless the problem formulation lets you do so).
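To complete the picture, fit then takes the targets as a matching dictionary keyed by the same layer names, and the two losses can optionally be weighted. A small hedged sketch: the Model line reuses the tensors above, while X_train, y_type, y_coord, and the weight values are placeholders:

model = Model(inputs=inp, outputs=[object_type, object_coordinates])
model.compile(loss={'type': 'binary_crossentropy', 'coord': 'mse'},
              loss_weights={'type': 1.0, 'coord': 0.5},  # optional relative weighting
              optimizer='sgd', metrics={'type': 'accuracy', 'coord': 'mae'})
model.fit(X_train, {'type': y_type, 'coord': y_coord}, epochs=10)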
I am interested in building reinforcement learning models with the simplicity of the Keras API. Unfortunately, I am unable to extract the gradient of the output (not error) with respect to the weights. I found the following code that performs a similar function (Saliency maps of neural networks (using Keras))
get_output = theano.function([model.layers[0].input], model.layers[-1].output, allow_input_downcast=True)
fx = theano.function([model.layers[0].input], T.jacobian(model.layers[-1].output.flatten(), model.layers[0].input), allow_input_downcast=True)
grad = fx([trainingData])
Any ideas on how to calculate the gradient of the model output with respect to the weights for each layer would be appreciated.
To get the gradients of model output with respect to weights using Keras you have to use the Keras backend module. I created this simple example to illustrate exactly what to do:
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras import backend as k
model = Sequential()
model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
To calculate the gradients, we first need to find the output tensor. For the output of the model (what my initial question asked about) we simply call model.output. We can also find the gradients of the outputs of other layers by calling model.layers[index].output:
outputTensor = model.output #Or model.layers[index].output
Then we need to choose the variables that are in respect to the gradient.
listOfVariableTensors = model.trainable_weights
#or variableTensors = model.trainable_weights[0]
We can now calculate the gradients. It is as easy as the following:
gradients = k.gradients(outputTensor, listOfVariableTensors)
To actually evaluate the gradients for a given input, we need to use a bit of TensorFlow (this is TF1-style session code; under TensorFlow 2.x you would use the tf.compat.v1 equivalents with eager execution disabled):

import numpy as np
import tensorflow as tf

trainingExample = np.random.random((1, 8))
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())  # initialize_all_variables is deprecated
evaluated_gradients = sess.run(gradients, feed_dict={model.input: trainingExample})
And that's it!
The answer below uses the cross-entropy loss; feel free to change it to your own function.
import keras

outputTensor = model.output
listOfVariableTensors = model.trainable_weights
bce = keras.losses.BinaryCrossentropy()
loss = bce(labels, outputTensor)  # Keras losses take (y_true, y_pred) in that order
gradients = k.gradients(loss, listOfVariableTensors)
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
evaluated_gradients = sess.run(gradients, feed_dict={model.input: training_data1})
print(evaluated_gradients)
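For anyone on TensorFlow 2.x with eager execution (as in the question at the top of this page), the same gradients can be computed without sessions using tf.GradientTape; a minimal sketch assuming the model built above and a random placeholder input:

import numpy as np
import tensorflow as tf

x = tf.convert_to_tensor(np.random.random((1, 8)), dtype=tf.float32)
with tf.GradientTape() as tape:
    output = model(x, training=False)
# gradients of the model output with respect to every trainable weight
grads = tape.gradient(output, model.trainable_weights)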