I am struggling to train a model in Keras by minimizing the loss between the correct data and "input * output", and I do not know how to approach it.
Given that
X: model input (training data)
Y: model output
T: correct data
model = Model(inputs=X, outputs=Y)
Then, in my understanding,
model.fit(X,T) trains the model to minimize the distance between Y(=model(X)) and T, according to the user-defined loss function.
My question is:
What if I want to minimize the distance between Y*X and T?
I thought writing something like "model.fit(X * model.predict(X), T)" would work, but it did not.
I wonder how to write the code to do that.
Thank you in advance for any advice.
Make a functional API model:
inputs = Input(input_shape)
outputs = SomeLayer(...)(inputs)
outputs = SomeLayer(...)(outputs)
outputs = SomeLayer(...)(outputs)
....
outputs = Multiply()([inputs, outputs])
model = Model(inputs, outputs)
model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
model.fit(X, T, ...)
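To make that concrete, here is a minimal runnable sketch of the same idea; the layer sizes, toy data, and "mse" loss below are illustrative choices, not from the question:

import numpy as np
from tensorflow.keras.layers import Input, Dense, Multiply
from tensorflow.keras.models import Model

input_shape = (8,)  # illustrative feature dimension

inputs = Input(shape=input_shape)
hidden = Dense(16, activation="relu")(inputs)
y = Dense(input_shape[0], activation="linear")(hidden)  # Y, same shape as X

# The model's final output is X * Y, so fitting against T
# minimizes loss(T, X * Y) directly.
outputs = Multiply()([inputs, y])

model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")

# Dummy data, just to show the call signature.
X = np.random.rand(32, 8).astype("float32")
T = np.random.rand(32, 8).astype("float32")
model.fit(X, T, epochs=1, verbose=0)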
I want to train a classification model with two losses as follows:
adam = tf.keras.optimizers.Adam()
model.compile(optimizer=adam)

@tf.function
def train(model, inputs_data_1, inputs_data_2, y):
    with tf.GradientTape() as tape:
        logits1, features1 = model(inputs_data_1)  # logits: output of the fully-connected layer
        logits2, features2 = model(inputs_data_2)  # features: output of the feature extractor
        loss_fn1 = cross_entropy(y, logits1)                # e.g. a categorical cross-entropy loss
        loss_fn2 = euclidean_dist(features1 - features2)    # user-defined distance on the feature difference
        losses = loss_fn1 + loss_fn2
    adam.apply_gradients(zip(tape.gradient(losses, model.trainable_variables), model.trainable_variables))
When I try this, it just stops without an error.
I didn't change the input data using tf.split or tf.reshape.
How can I compile the model and train it with two losses?
Please give me some opinions or reference code for this problem. Thank you.
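Since the snippet above is incomplete, a self-contained sketch of one way to train with two losses in a custom training step may help; the toy backbone that returns (logits, features), the layer sizes, and the loss choices here are all illustrative assumptions:

import tensorflow as tf
from tensorflow.keras import layers, Model

# Hypothetical backbone returning (logits, features), mirroring the question.
inp = layers.Input(shape=(32,))
features = layers.Dense(16, activation="relu", name="features")(inp)
logits = layers.Dense(10, name="logits")(features)
model = Model(inp, [logits, features])

optimizer = tf.keras.optimizers.Adam()
cce = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

@tf.function
def train_step(inputs_data_1, inputs_data_2, y):
    with tf.GradientTape() as tape:
        logits1, features1 = model(inputs_data_1, training=True)
        logits2, features2 = model(inputs_data_2, training=True)
        loss1 = cce(y, logits1)
        # Euclidean distance between the two feature batches.
        loss2 = tf.reduce_mean(tf.norm(features1 - features2, axis=-1))
        losses = loss1 + loss2
    grads = tape.gradient(losses, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return losses

# Dummy batch, just to show the call.
x1 = tf.random.normal((8, 32))
x2 = tf.random.normal((8, 32))
y = tf.random.uniform((8,), maxval=10, dtype=tf.int32)
print(train_step(x1, x2, y))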
I have a question regarding the evaluation of an LSTM model. I have trained an LSTM model and stored it with model.save(...). Now I want to load the model with load_model and evaluate it on the validation dataset. Since neural networks are stochastic, I run the evaluation several times and compute the mean and the variance of the different metrics I am interested in.
Now I am surprised that after the first run, all consecutive runs have the same performance on every metric. I don't think that is right, but I don't know where the error occurs.
So my question is:
What is my mistake in setting up the validation of my model?
And how can I fix it?
Here are the code snippets that should explain what I am doing:
Compile and fit the Model
def compile_and_fit(hparams, MAX_EPOCHS, model_path):
    window = WindowGenerator(input_width=hparams[HP_WINDOW_SIZE],
                             label_width=hparams[HP_WINDOW_SIZE], shift=1,
                             label_columns=['q_MARI'], batch_size=hparams[HP_BATCH_SIZE])

    model = tf.keras.models.Sequential([
        tf.keras.layers.LSTM(hparams[HP_NUM_UNITS], return_sequences=True, name="LSTM_1"),
        tf.keras.layers.Dropout(hparams[HP_DROPOUT], name="Dropout_1"),
        tf.keras.layers.LSTM(hparams[HP_NUM_UNITS], return_sequences=True, name="LSTM_2"),
        tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1))
    ])

    learning_rate = hparams[HP_LEARNING_RATE]
    model.compile(loss=tf.losses.MeanSquaredError(),
                  optimizer=tf.optimizers.Adam(learning_rate=learning_rate),
                  metrics=get_metrics())

    history = model.fit(window.train,
                        epochs=MAX_EPOCHS,
                        validation_data=window.val,
                        callbacks=get_callbacks(model_path))

    _, a, _, _, _, _ = model.evaluate(window.val)
    return a, model, history
Train and save it
a, model, history = compile_and_fit( hparams = hparams, MAX_EPOCHS = MAX_EPOCHS, model_path = run_path)
model.save(run_path)
Load and evaluate it
model = tf.keras.models.load_model(os.path.join(hparam_path, model_name),
                                   custom_objects={"max_error": max_error,
                                                   "median_absolute_error": median_absolute_error,
                                                   "rev_metric": rev_metric,
                                                   "nse_metric": nse_metric})
model.compile(loss=tf.losses.MeanSquaredError(), optimizer="adam", metrics=get_metrics())

metric_values = np.empty(shape=(nr_runs, len(metrics)), dtype=float)
for j in range(nr_runs):
    # i presumably comes from an enclosing loop over hyperparameter values (not shown in the snippet)
    window = WindowGenerator(input_width=hparam_vals[i], label_width=hparam_vals[i], shift=1,
                             label_columns=['q_MARI'])
    metric_values[j] = np.array(model.evaluate(window.val))

means = metric_values.mean(axis=0)
varis = metric_values.var(axis=0)
print(f'means: {means}, varis: {varis}')
The results I am getting
For setting up the training I followed these two guides:
https://www.tensorflow.org/tutorials/structured_data/time_series
https://www.tensorflow.org/tensorboard/hyperparameter_tuning_with_hparams
An LSTM is not stochastic at inference time. Evaluation results should be the same for the same data.
There are two steps: when you train the model, randomness (weight initialization, data shuffling, dropout) influences the model you end up with. However, once you have saved that model, its predictions will be the same every time you use the same model on the same data.
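In other words, to see run-to-run variance you would have to retrain a fresh model for each run instead of re-evaluating the same saved model. A rough sketch, assuming the compile_and_fit helper and WindowGenerator from the question are available:

import numpy as np

window = WindowGenerator(input_width=hparams[HP_WINDOW_SIZE],
                         label_width=hparams[HP_WINDOW_SIZE], shift=1,
                         label_columns=['q_MARI'])

val_metrics = []
for j in range(nr_runs):
    # Each run trains from scratch, so initialization, shuffling and
    # dropout randomness differ between runs.
    _, model, _ = compile_and_fit(hparams=hparams,
                                  MAX_EPOCHS=MAX_EPOCHS,
                                  model_path=f"{run_path}_run{j}")
    val_metrics.append(model.evaluate(window.val, verbose=0))

val_metrics = np.array(val_metrics)
print(f'means: {val_metrics.mean(axis=0)}, varis: {val_metrics.var(axis=0)}')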
I am trying to implement an autoencoder in Keras that not only minimizes the reconstruction error, but whose constructed features also maximize a measure I define. I don't really have an idea of how to do this.
Here's a snippet of what I have so far:
corrupt_data = self._corrupt(self.data, 0.1)

# define encoder-decoder network structure
# create input layer
input_layer = Input(shape=(corrupt_data.shape[1], ))
encoded = Dense(self.encoding_dim, activation="relu")(input_layer)
decoded = Dense(self.data.shape[1], activation="sigmoid")(encoded)

# create autoencoder
dae = Model(input_layer, decoded)

# define custom multitask loss with wlm measure
def multitask_loss(y_true, y_pred):
    # extract learned features from hidden layer
    learned_fea = Model(input_layer, encoded).predict(self.data)
    # additional measure I want to optimize from an external function
    wlm_measure = wlm.measure(learned_fea, self.labels)
    cross_entropy = losses.binary_crossentropy(y_true, y_pred)
    return wlm_measure + cross_entropy

# create optimizer
dae.compile(optimizer=self.optimizer, loss=multitask_loss)
dae.fit(corrupt_data, self.data,
        epochs=self.epochs, batch_size=20, shuffle=True,
        callbacks=[tensorboard])

# separately create an encoder model
encoder = Model(input_layer, encoded)
Currently this does not work properly... When I view the training history, the model seems to ignore the additional measure and train only on the cross-entropy loss. Also, if I change the loss function to consider only the wlm measure, I get the error "'numpy.float64' object has no attribute 'get_shape'" (I don't know if changing my wlm function's return type to a tensor will help).
There are a few places that I think may have gone wrong. I don't know if I am extracting the outputs of the hidden layer correctly in my custom loss function. Also, I don't know if my wlm.measure function is outputting correctly, i.e. whether it should output a numpy.float32 or a 1-dimensional tensor of type float32.
Basically, a conventional loss function only cares about the output layer's predicted labels and the true labels. In my case, I also need to consider the hidden layer's output (activations), which is not that straightforward to implement in Keras.
Thanks for the help!
You don't want to define your learned_fea Model inside your custom loss function. Rather, you could define a single model up front with two outputs: the output of the decoder (the reconstruction) and the output of the encoder (the feature representation):
multi_output_model = Model(inputs=input_layer, outputs=[decoded, encoded])
Now you can write a custom loss function that only applies to the output of the encoder:
def custom_loss(y_true, y_pred):
    return wlm.measure(y_pred, y_true)
Upon compiling the model, you pass a list of loss functions (or a dictionary if you name your tensors):
model.compile(loss=['binary_crossentropy', custom_loss], optimizer=...)
And fit the model by passing a list of outputs:
model.fit(x=X, y=[data_to_be_reconstructed, labels_for_wlm_measure])
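Putting the pieces together, here is a minimal sketch of that multi-output setup; the layer sizes are illustrative and the wlm-style measure is replaced by a generic tensor-valued stand-in, since the real wlm.measure would need to be written in tensor ops for gradients to flow:

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

input_dim, encoding_dim = 64, 16  # illustrative sizes

input_layer = Input(shape=(input_dim,))
encoded = Dense(encoding_dim, activation="relu", name="encoded")(input_layer)
decoded = Dense(input_dim, activation="sigmoid", name="decoded")(encoded)

multi_output_model = Model(input_layer, [decoded, encoded])

# Stand-in for the wlm measure: it has to be written in tensor ops (not NumPy)
# so Keras can backpropagate through it; the minus sign turns "maximize the
# measure" into a quantity to minimize.
def custom_loss(y_true, y_pred):
    return -tf.reduce_mean(tf.square(y_pred - y_true))

multi_output_model.compile(
    optimizer="adam",
    loss={"decoded": "binary_crossentropy", "encoded": custom_loss},
    loss_weights={"decoded": 1.0, "encoded": 0.1})

# multi_output_model.fit(corrupt_data,
#                        {"decoded": clean_data, "encoded": labels_for_wlm_measure},
#                        epochs=10, batch_size=20)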
I am creating a custom loss in Keras. Let's assume that we have the following:
def a_loss(X):
    a, b = X
    loss = ...
    return loss

def mean_loss(y_true, y_pred):
    return K.mean(y_pred - 0 * y_true)
And the model goes something like:
# ... (earlier layers elided)
z1 = Dense(shape1, activation="linear")(conv_something)
z2 = Dense(shape1, activation="linear")(conv_something2)

loss = a_loss([z1, z2])

model = Model(
    inputs=[input1, input2, ...],
    outputs=[loss])
model.compile(loss=mean_loss, optimizer=Adam())
Now this hypothetical model compiles normally. But when I have to use the trained model to predict something, I use:
model.predict(X_dictionary)
I am assuming that the output of the above is the loss (the output of the a_loss function). Right? If not, correct me.
What I want model.predict to output is z2. Searching the API, you can use multiple outputs:
model = Model(
    inputs=[sequence_input_desc, sequence_input_title_positive, sequence_input_title_negative],
    outputs=[loss, z2]
)
But the above will train to minimize both loss and z2. What I want is to train only to minimize loss, and for the predict function to output z2. One way, according to the docs, is to use loss_weights=[1.0, 0.0] in compile, but it doesn't work: it raises the error "The model expects 2 target arrays, but only received one array. Found: array with shape ..".
Any idea how to do it?
After training is done you can simply create a new model that uses the same layers but has a different output:
model = Model(
    inputs=[input1, input2, ..],
    outputs=[z2])
It will re-use the learned weights as they are stored in the layers, not in the model (it is just a container).
You can then use model.predict to get the results as you would normally.
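A compact sketch of the whole pattern, with hypothetical layers and a Lambda-based stand-in for a_loss filling in the parts elided in the question:

import tensorflow.keras.backend as K
from tensorflow.keras.layers import Input, Dense, Lambda
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Hypothetical two-branch network standing in for the question's model.
input1 = Input(shape=(10,))
input2 = Input(shape=(10,))
z1 = Dense(4, activation="linear")(Dense(8, activation="relu")(input1))
z2 = Dense(4, activation="linear")(Dense(8, activation="relu")(input2))

# The loss is computed as a tensor inside the graph (stand-in for a_loss).
loss = Lambda(lambda t: K.sum(K.square(t[0] - t[1]), axis=-1, keepdims=True))([z1, z2])

# Training model: its single output is the loss tensor, trained with mean_loss.
train_model = Model(inputs=[input1, input2], outputs=[loss])
train_model.compile(loss=lambda y_true, y_pred: K.mean(y_pred - 0 * y_true),
                    optimizer=Adam())
# train_model.fit([X1, X2], np.zeros((len(X1), 1)), ...)

# Inference model: same layers, hence the trained weights, but a different output.
predict_model = Model(inputs=[input1, input2], outputs=[z2])
# z2_values = predict_model.predict([X1, X2])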
I'm wondering if it's possible to add a custom model to a loss function in keras. For example:
def model_loss(y_true, y_pred):
    inp = Input(shape=(128, 128, 1))
    x = Dense(2)(inp)
    x = Flatten()(x)
    model = Model(inputs=[inp], outputs=[x])

    a = model(y_pred)
    b = model(y_true)

    # calculate MSE
    mse = K.mean(K.square(a - b))
    return mse
This is a simplified example. I'll actually be using a VGG net in the loss, so just trying to understand the mechanics of keras.
The usual way of doing that is appending your VGG to the end of your model, making sure all its layers have trainable=False before compiling.
Then you recalculate your Y_train.
Suppose you have these models:
mainModel - the one to which you want to apply a loss function
lossModel - the one that is part of the loss function you want
Create a new model appending one to another:
from keras.models import Model
lossOut = lossModel(mainModel.output) #you pass the output of one model to the other
fullModel = Model(mainModel.input,lossOut) #you create a model for training following a certain path in the graph.
This model will have exactly the same weights as mainModel and lossModel, and training it will affect those models as well.
Make sure lossModel is not trainable before compiling:
lossModel.trainable = False
for l in lossModel.layers:
    l.trainable = False

fullModel.compile(loss='mse', optimizer=....)
Now adjust your data for training:
fullYTrain = lossModel.predict(originalYTrain)
And finally do the training:
fullModel.fit(xTrain, fullYTrain, ....)
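For instance, a sketch of this recipe with Keras' built-in VGG16 as the lossModel and a toy image-to-image mainModel (shapes, layers, and the random data are only illustrative; in practice you would also apply VGG's preprocess_input):

import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Input, Conv2D
from tensorflow.keras.models import Model

# Toy mainModel: maps an image to an image of the same size.
inp = Input(shape=(224, 224, 3))
out = Conv2D(3, 3, padding="same", activation="sigmoid")(inp)
mainModel = Model(inp, out)

# lossModel: VGG16 without its classification head, frozen.
lossModel = VGG16(include_top=False, input_shape=(224, 224, 3))
lossModel.trainable = False
for l in lossModel.layers:
    l.trainable = False

# Append the loss model and train against VGG features of the targets.
lossOut = lossModel(mainModel.output)
fullModel = Model(mainModel.input, lossOut)
fullModel.compile(loss="mse", optimizer="adam")

# originalYTrain would be your target images; random arrays here just show the shapes.
xTrain = np.random.rand(4, 224, 224, 3).astype("float32")
originalYTrain = np.random.rand(4, 224, 224, 3).astype("float32")
fullYTrain = lossModel.predict(originalYTrain)
fullModel.fit(xTrain, fullYTrain, epochs=1, verbose=0)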
This is old but I'm going to answer it because no one did directly. You definitely can call another model in a custom loss, and I actually think it's much easier than adding the model to the end of your main model and creating a whole new one and a whole new set of training labels.
Here is an example that calls both a model and an outside function that we define:
def normalize_tensor(in_feat):
    norm_factor = tf.math.sqrt(tf.keras.backend.sum(in_feat**2, axis=-1, keepdims=True))
    return in_feat / (norm_factor + 1e-10)

def VGGLoss(y_true, y_pred):
    true = vgg(preprocess_input(y_true * 255))
    pred = vgg(preprocess_input(y_pred * 255))
    t = normalize_tensor(true)
    p = normalize_tensor(pred)
    vggLoss = tf.math.reduce_mean(tf.math.square(t - p))
    return vggLoss
vgg() just calls the vgg16 model with no head.
preprocess_input is a keras function that normalizes inputs to be used in the vgg model (here we are assuming your model outputs an image in 0-1 range, then we multiply by 255 to get 0-255 range for vgg).
normalize_tensor takes the vgg activations and makes them have a magnitude of 1 for each channel, otherwise your loss will be massive.
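For completeness, a sketch of how the vgg feature extractor referenced above might be set up and the loss plugged into compile; the generator model here is just a placeholder:

import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input

# Headless, frozen VGG16 used purely as a fixed feature extractor.
vgg = VGG16(include_top=False, input_shape=(224, 224, 3))
vgg.trainable = False

# Any image-to-image model producing outputs in the 0-1 range.
generator = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(3, 3, padding="same", activation="sigmoid"),
])

# VGGLoss (defined above) can be passed to compile like any other loss function.
generator.compile(optimizer="adam", loss=VGGLoss)
# generator.fit(x_images, y_images, ...)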