I'm struggling to implement a custom metric in Keras (2.4.3, with the TensorFlow backend) so that I can trigger an early stopping mechanism. Essentially, I want Keras to stop training a model if there is too big a decrease in the training loss. To do this, I am using the following code:
def custom_metric(y_true, y_pred):
    y = keras.losses.CategoricalCrossentropy(y_true, y_pred)
    z = 1.0 / (1.0 - y.numpy())
    return z
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['categorical_accuracy', custom_metric])
custom_stop = EarlyStopping(monitor='custom_metric', min_delta=0, patience=2,
                            verbose=1, mode='min', restore_best_weights=True)
I'm getting errors along the lines of AttributeError: 'CategoricalCrossentropy' object has no attribute 'numpy', which I understand is due to the definition of z, but I can't get something equivalent to work by replacing the floats in the definition of z with tf.constants or anything like that. Does anyone have any suggestions?
Thanks a lot
Use this instead, and mind the spelling: the lowercase categorical_crossentropy is a function, whereas CategoricalCrossentropy is a class that has to be instantiated first.
keras.losses.categorical_crossentropy(y_true,y_pred)
This should work:
def custom_metric(y_true, y_pred):
    y = keras.losses.categorical_crossentropy(y_true, y_pred)
    z = 1.0 / (1.0 - y)
    return z
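For reference, here is a minimal end-to-end sketch of wiring this metric into EarlyStopping. The toy data, model, and optimizer are placeholders of my own; only the metric and the callback mirror the setup above:
import numpy as np
from tensorflow import keras
from tensorflow.keras.callbacks import EarlyStopping

def custom_metric(y_true, y_pred):
    # per-sample crossentropy; Keras averages the returned tensor over the batch
    y = keras.losses.categorical_crossentropy(y_true, y_pred)
    return 1.0 / (1.0 - y)

# toy data and model, purely illustrative
x_train = np.random.rand(256, 20).astype("float32")
y_train = keras.utils.to_categorical(np.random.randint(0, 4, size=256), num_classes=4)

model = keras.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(20,)),
    keras.layers.Dense(4, activation="softmax"),
])
model.compile(loss="categorical_crossentropy",
              optimizer="adam",
              metrics=["categorical_accuracy", custom_metric])

# EarlyStopping looks the metric up by its function name
custom_stop = EarlyStopping(monitor="custom_metric", min_delta=0, patience=2,
                            verbose=1, mode="min", restore_best_weights=True)
model.fit(x_train, y_train, epochs=20, batch_size=32, callbacks=[custom_stop])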
I am interested in GANs.
I tried to adjust the DCGAN's discriminator using the method below:
https://github.com/vasily789/adaptive-weighted-gans/blob/main/aw_loss.py
which is called the aw method.
So I found a DCGAN implementation on Kaggle (https://www.kaggle.com/vatsalmavani/deep-convolutional-gan-in-pytorch) and tried to edit the discriminator to use the aw_loss class.
Here is my code:
https://colab.research.google.com/drive/1AsZztd0Af0UMzBXXkI9QKQZhAUoK01bk?usp=sharing
It seems like I cannot use the aw_loss class correctly, because the discriminator's loss is still 0 when I train.
Can anyone help me? Please!
In the code you provided, it does display the correct error when trying to use aw_method(): you should first instantiate the class as shown below, after which you should be able to call the method.
aw_instance = aw_method()
aw_loss = aw_instance.aw_loss(D_real_loss, D_fake_loss, D_opt, D)
Notice that we are using the default parameters for the class; I'm not familiar enough with the aw loss to tell you whether you should tweak them.
Regarding your discriminator's loss, the correct code relies on aw_cost to work. It doesn't seem like you're providing both the real and the fake losses, so the discriminator is only learning to output 1's or 0's (which can easily be verified by printing those values or monitoring with wandb or similar tools). Again, I didn't go deep enough into the aw loss algorithm, so check this specifically.
You could also try a linear combination with your normal losses, e.g. D_loss = (D_fake_loss + D_real_loss + aw_loss) / 3, as sketched below.
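For illustration only, here is a rough self-contained sketch of that combination. The toy discriminator, optimizer, and image batches are placeholders standing in for the notebook's objects, aw_method is assumed to be imported from aw_loss.py, and the aw_loss call signature is taken from the snippet above; check aw_loss.py to see whether it already backpropagates internally before combining:
import torch
import torch.nn as nn

# Toy stand-ins so the sketch is self-contained; in the notebook, D, D_opt and
# the image batches come from the DCGAN training loop, and aw_method from aw_loss.py.
D = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 1), nn.Sigmoid())
D_opt = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

real_images = torch.rand(8, 3, 64, 64)   # placeholder real batch
fake_images = torch.rand(8, 3, 64, 64)   # placeholder generator output

real_preds = D(real_images)
fake_preds = D(fake_images.detach())

D_real_loss = bce(real_preds, torch.ones_like(real_preds))
D_fake_loss = bce(fake_preds, torch.zeros_like(fake_preds))

# Instantiate the class once, then call the method (signature as shown above).
aw_instance = aw_method()
aw_loss = aw_instance.aw_loss(D_real_loss, D_fake_loss, D_opt, D)

# If aw_loss does not already backpropagate internally, the linear combination
# suggested above could be tried:
D_loss = (D_fake_loss + D_real_loss + aw_loss) / 3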
I want to use the external optimizer interface within TensorFlow so I can use Newton-type optimizers, as tf.train only has first-order gradient descent optimizers. At the same time, I want to build my network using tf.keras.layers, as it is much easier than using tf.Variables when building large, complex networks. I will illustrate my issue with the following simple 1D linear regression example:
import tensorflow as tf
from tensorflow.keras import backend as K
import numpy as np
#generate data
no = 100
data_x = np.linspace(0,1,no)
data_y = 2 * data_x + 2 + np.random.uniform(-0.5,0.5,no)
data_y = data_y.reshape(no,1)
data_x = data_x.reshape(no,1)
# Make model using keras layers and train
x = tf.placeholder(dtype=tf.float32, shape=[None,1])
y = tf.placeholder(dtype=tf.float32, shape=[None,1])
output = tf.keras.layers.Dense(1, activation=None)(x)
loss = tf.losses.mean_squared_error(data_y, output)
optimizer = tf.contrib.opt.ScipyOptimizerInterface(loss, method="L-BFGS-B")
sess = K.get_session()
sess.run(tf.global_variables_initializer())
tf_dict = {x : data_x, y : data_y}
optimizer.minimize(sess, feed_dict = tf_dict, fetches=[loss], loss_callback=lambda x: print("Loss:", x))
When running this, the loss does not change at all. When using any other optimizer from tf.train, it works fine. Also, when using tf.layers.Dense() instead of tf.keras.layers.Dense(), it does work with the ScipyOptimizerInterface. So really the question is: what is the difference between tf.keras.layers.Dense() and tf.layers.Dense()? I saw that the variables created by tf.layers.Dense() are of type tf.float32_ref, while the variables created by tf.keras.layers.Dense() are of type tf.float32. As far as I know, _ref indicates that the tensor is mutable. So maybe that's the issue? But then again, any other optimizer from tf.train works fine with Keras layers.
Thanks
After a lot of digging I was able to find a possible explanation.
ScipyOptimizerInterface uses feed_dicts to simulate the updates of your variables during the optimization process. It only does an assign operation at the very end. In contrast, tf.train optimizers always do assign operations. The code of ScipyOptimizerInterface is not that complex, so you can verify this easily.
Now the problem is that assigning variables with feed_dict mostly works by accident. Here is a link where I learnt about this. In other words, assigning variables via feed_dict, which is what ScipyOptimizerInterface does, is a hacky way of doing updates.
Now this hack mostly works, except when it does not. tf.keras.layers.Dense uses ResourceVariables to model the weights of the model. This is an improved version of simple Variables with cleaner read/write semantics. The problem is that under the new semantics, the feed_dict update happens after the loss calculation. The link above gives some explanations.
Now tf.layers is currently a thin wrapper around tf.keras.layers, so I am not sure why it would work. Maybe there is some compatibility check somewhere in the code.
The solutions to address this are somewhat simple.
Either avoid using components that use ResourceVariables. This can be kind of difficult.
Patch ScipyOptimizerInterface to always do assign operations for the variables. This is relatively easy since all the required code is in one file.
There was also some effort to make the interface work with eager execution (which uses ResourceVariables by default). Check out this link
I think the problem is with the line
output = tf.keras.layers.Dense(1, activation=None)(x)
In this format, output is not a layer but rather the output of a layer, which might prevent the wrapper from collecting the weights and biases of the layer and feeding them to the optimizer. Try writing it in two lines, e.g.
output = tf.keras.layers.Dense(1, activation=None)
res = output(x)
If you want to keep the original format, then you might have to manually collect all trainables and feed them to the optimizer via the var_list option:
optimizer = tf.contrib.opt.ScipyOptimizerInterface(loss, var_list = [Trainables], method="L-BFGS-B")
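For example (a sketch for the TF 1.x graph API; tf.trainable_variables() collects every trainable variable in the default graph, so narrow the list down if your graph holds more than this one layer):
optimizer = tf.contrib.opt.ScipyOptimizerInterface(
    loss,
    var_list=tf.trainable_variables(),  # or the Dense layer object's .trainable_weights
    method="L-BFGS-B")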
Hope this helps.
I am training a CNN-LSTM model using Keras, and after the training was done, I tried to evaluate the model on the testing data like I did when I fine-tuned my CNN; however, an error appears this time.
After training was done, I tried the following piece of code to evaluate on my testing set:
x, y = zip(*(testgenerator[i] for i in range(len(testgenerator))))
x_test, y_test = np.vstack(x), np.vstack(y)
loss, acc = Bi_LSTM.evaluate(x_test, y_test, batch_size=9)
print("Accuracy: " ,acc)
print("Loss: ", loss)
I have used this code before to evaluate my fine-tuned model and it had no issue, but now I get the following error:
TypeError: object of type 'generator' has no len()
I have tried a few solutions online, like using len(list(generator)), but they did not work. Is it because I am using a custom generator? What can I do to evaluate the model in this case?
I think this line is the problem
x, y = zip(*(testgenerator[i] for i in range(len(testgenerator))))
because you call len on a generator object, and plain Python generators have no length.
One solution may be to keep your own counter instead of calling len(); or, since a plain generator cannot be indexed with testgenerator[i] either, simply iterate over it and accumulate the batches, as in the sketch below.
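For instance, a rough sketch that consumes the generator directly instead of calling len() (testgenerator and Bi_LSTM are the objects from the question, and this assumes the generator is finite, i.e. it stops after the last test batch):
import numpy as np

x_batches, y_batches = [], []
for x_batch, y_batch in testgenerator:   # iterate instead of len()/indexing
    x_batches.append(x_batch)
    y_batches.append(y_batch)

x_test, y_test = np.vstack(x_batches), np.vstack(y_batches)
loss, acc = Bi_LSTM.evaluate(x_test, y_test, batch_size=9)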
The way I solved this is by using a different method. In this case I do not need to extract the values for x and y. Note that evaluate_generator takes the number of batches via steps rather than a batch_size (the batch size is whatever the generator yields):
loss, acc = Bi_LSTM.evaluate_generator(testgenerator, steps=test_steps)  # test_steps = number of test batches
I am quite new to TensorFlow, and in order to learn to use it I am currently trying to implement a very simple DNNRegressor that predicts the movement of an object in 2D, but I can't seem to get the predict function to work.
For this purpose I have some input data: the x and y coordinates of the object at a number of previous time steps. I want the output to be a reasonable estimate of the object's location if it continues to move in the same direction at the same speed.
I am using tensorflow version 1.8.0
My regressor is defined like this:
CSV_COLUMN_NAMES = ['X_0', 'X_1', 'X_2', 'X_3', 'X_4', 'Y_0', 'Y_1', 'Y_2', 'Y_3', 'Y_4', 'Y_5']
my_feature_columns = []
for key in columnNames:
    my_feature_columns.append(tf.feature_column.numeric_column(key=key))

regressor = estimator.DNNRegressor(feature_columns=my_feature_columns,
                                   label_dimension=1,
                                   hidden_units=hidden_layers,
                                   model_dir=MODEL_PATH,
                                   dropout=dropout,
                                   config=test_config)
My input is, like the one in the TensorFlow tutorial on premade estimators, a dict with the column names as keys.
An example for this input can be seen here.
regressor.train(arguments) and regressor.evaluate(arguments) seem to work just fine, but predict does not.
Parallel to the code on the TensorFlow site, I tried to do this:
y_pred = regressor.predict(input_fn=eval_input_fn(X_test, labels=None, batch_size=1))
and it seems like that works as well.
The problem I'm facing now is that I can't get anything from that y_pred object.
When I enter print(y_pred) I get <generator object Estimator.predict at 0x7fd9e8899888>, which suggests that I should be able to iterate over it, but
for elem in y_pred:
    print(elem)
results in TypeError: unsupported callable
Again, I'm quite new to this and I am sorry if the answer is obvious but I would be very grateful if someone could tell me what I'm doing wrong here.
The input_fn to regressor.predict should be a function. See the definition:
input_fn: A function that constructs the features.
You need to pass a function (or a lambda that calls your eval_input_fn), not the result of calling it:
y_pred = regressor.predict(input_fn=lambda: eval_input_fn(X_test, labels=None, batch_size=1))
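Each element yielded by predict is then a dict; for a DNNRegressor the value you want is under the 'predictions' key:
for pred in y_pred:
    print(pred['predictions'])  # numpy array of shape [label_dimension]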
To compute the loss while training a model, we can use the cntk.squared_error() function, like this:
loss = cntk.squared_error(z, l)
But I am interested in finding the loss in terms of absolute error. The below code doesn't work:
loss = cntk.absolute_error(z, l)
It gives this error:
AttributeError: module 'cntk' has no attribute 'absolute_error'
Is there any built-in function in the CNTK toolkit to compute the absolute error? I am new to deep learning so I don't know much. Thanks for the help!
There's no out-of-the-box L1 loss function in CNTK, but you can provide a custom one:
def absolute_error(z, l):
    return cntk.reduce_mean(cntk.abs(z - l))
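If it helps, here is a rough usage sketch that plugs this loss into a Trainer on a tiny made-up model (the model, learner settings, and random data below are placeholders; replace them with whatever you already pass to squared_error):
import numpy as np
import cntk

def absolute_error(z, l):
    return cntk.reduce_mean(cntk.abs(z - l))

# tiny illustrative regression model
features = cntk.input_variable(2)
l = cntk.input_variable(1)
z = cntk.layers.Dense(1)(features)

loss = absolute_error(z, l)                  # train on L1 error
metric = cntk.squared_error(z, l)            # report squared error as before
lr = cntk.learning_rate_schedule(0.01, cntk.UnitType.minibatch)
trainer = cntk.Trainer(z, (loss, metric), [cntk.sgd(z.parameters, lr)])

x_batch = np.random.rand(8, 2).astype(np.float32)
y_batch = np.random.rand(8, 1).astype(np.float32)
trainer.train_minibatch({features: x_batch, l: y_batch})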