I have a dataset containing a matrix of features X and a matrix of labels y of size N where each element y_i belongs to [0,1]. I have the following loss function
where g(.) is a function that depends on the input matrix X.
I know that Keras custom loss function has to be of the form customLoss(y_true,y_predicted), however, I'm having difficulties incorporating the term g(X) in the loss function since this depends on the input matrix.
For each data point in my dataset, my input is of the form X_i = (H, P) where these two parameters are matrices and the function g is defined for each data point as g(X_i) = H x P. Can I pass a = (H, P) in the loss function since this depends on each example or do I need to pass all the matrices at once by concatenating them?
Edit (based on Daniel's answer):
original_model_inputs = keras.layers.Input(shape=X_train.shape[1])
y_true_inputs = keras.layers.Input(shape=y_train.shape[1])
hidden1 = keras.layers.Dense(256, activation="relu")(original_model_inputs)
hidden2 = keras.layers.Dense(128, activation="relu")(hidden1)
output = keras.layers.Dense(K)(hidden2)
def lambdaLoss(x):
yTrue, yPred, alpha = x
return (K.log(yTrue) - K.log(yPred))**2+alpha*yPred
loss = Lambda(lambdaLoss)(y_true_inputs, output, a)
model = Keras.Model(inputs=[original_model_inputs, y_true_inputs], outputs=[output], loss)
def dummyLoss(true, pred):
return pred
model.compile(loss = dummyLoss, optimizer=Adam())
train_model = model.fit([X_train, y_train], None, batch_size = 32,
epochs = 50,
validation_data = ([X_valid, y_valid], None),
callbacks=callbacks)
Fixing the understanding of my answer:
original_model_inputs = keras.layers.Input(shape=X_train.shape[1:]) #must be a tuple, not an int
y_true_inputs = keras.layers.Input(shape=y_train.shape[1:]) #must be a tuple, not an int
hidden1 = keras.layers.Dense(256, activation="relu")(original_model_inputs)
hidden2 = keras.layers.Dense(128, activation="relu")(hidden1)
output = keras.layers.Dense(K)(hidden2)
You need something to do g(X), I have no idea of what it is, but you need to do it somewhere.
And yes, you need to pass the whole tensor at once, you cannot make x_i and everything else.
def g(x):
return something
gResults = Lambda(g)(original_model_inputs)
Continuing my answer:
def lambdaLoss(x):
yTrue, yPred, G = x
.... #wait.... where is Y_true in your loss formula?
loss = Lambda(lambdaLoss)([y_true_inputs, output, gResults]) #must be a list of inputs including G
You need a model for training and another to get the outputs, because we're doing a frankenstein model because of the different loss.
training_model = keras.Model(inputs=[original_model_inputs, y_true_inputs], outputs=loss)
prediction_model = keras.Model(original_model_inputs, output)
Only the training model must be compiled:
def dummyLoss(true, pred):
return pred
training_model.compile(loss = dummyLoss, optimizer=Adam())
training_model = model.fit([X_train, y_train], None, batch_size = 32,
epochs = 50,
validation_data = ([X_valid, y_valid], None),
callbacks=callbacks)
Use the other model to get result data:
results = prediction_model.predict(some_x)
Looks like a GAN of some sort. I will refer to (x) as "x_input", Two methods:
Method 1) Inherit from tf.keras.model class and write your own (not recommended, not shown)
Method 2) Inherit from tf.keras.losses.Loss class. and return tuple of (custom) tf.keras.losses.Loss instance and tf.keras.layers.Layer that does nothing more than act as shell to grab and save a copy of the x_input (x). This layer instance can then be added as the top layer in model. The (custom) tf.keraslosses. Loss instance can then access the input on demand. This method also has best future support throughout the life of Tensorflow.
First, create a custom layer and custom loss class:
class Acrylic_Layer(tf.keras.layers.Layer):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.x_input = None
def build(self, *args, **kwargs):
pass
def call(self, input):
self.x_input = input
return input # Pass input directly through to next layer
class Custom_Loss(tf.keras.losses.Loss):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.input_thief = Acrylic_Layer() # <<< Magic, python is pass by reference!
def __call__(self, y_true, y_pred, sample_weight=None):
x_input = self.input_thief.x_input # <<< x_input pulled from model
Second, add layer and loss function to model
loss_fn = Custom_Loss(*args, **kwargs)
input_thief = loss_fn.input_thief
model = tf.keras.models.Sequential([
input_thief, # <<< transparent layer
Other_layers,
])
model.fit(loss=loss_fn) # <<< loss function
Lastly, I'm the market looking for a ML/python role, giving a shout out.
Related
I've created the following Keras custom model:
import tensorflow as tf
from tensorflow.keras.layers import Layer
class MyModel(tf.keras.Model):
def __init__(self, num_classes):
super(MyModel, self).__init__()
self.dense_layer = tf.keras.layers.Dense(num_classes,activation='softmax')
self.lambda_layer = tf.keras.layers.Lambda(lambda x: tf.math.argmax(x, axis=-1))
def call(self, inputs):
x = self.dense_layer(inputs)
x = self.lambda_layer(x)
return x
# A convenient way to get model summary
# and plot in subclassed api
def build_graph(self, raw_shape):
x = tf.keras.layers.Input(shape=(raw_shape))
return tf.keras.Model(inputs=[x],
outputs=self.call(x))
The task is multi-class classification.
Model consists of a dense layer with softmax activation and a lambda layer as a post-processing unit that converts the dense output vector to a single value (predicted class).
The train targets are a one-hot encoded matrix like so:
[
[0,0,0,0,1]
[0,0,1,0,0]
[0,0,0,1,0]
[0,0,0,0,1]
]
It would be nice if I could define a categorical_crossentropy loss over the dense layer and ignore the lambda layer while still maintaining the functionality and outputting a single value when I call model.predict(x).
Please note
My workspace environment doesn't allow me to use a custom training loop as suggested by #alonetogether excellent answer.
You can try using a custom training loop, which is pretty straightforward IMO:
import tensorflow as tf
from tensorflow.keras.layers import Layer
class MyModel(tf.keras.Model):
def __init__(self, num_classes):
super(MyModel, self).__init__()
self.dense_layer = tf.keras.layers.Dense(num_classes,activation='softmax')
self.lambda_layer = tf.keras.layers.Lambda(lambda x: tf.math.argmax(x, axis=-1))
def call(self, inputs):
x = self.dense_layer(inputs)
x = self.lambda_layer(x)
return x
# A convenient way to get model summary
# and plot in subclassed api
def build_graph(self, raw_shape):
x = tf.keras.layers.Input(shape=(raw_shape))
return tf.keras.Model(inputs=[x],
outputs=self.call(x))
n_classes = 5
model = MyModel(n_classes)
labels = tf.keras.utils.to_categorical(tf.random.uniform((50, 1), maxval=5, dtype=tf.int32))
train_dataset = tf.data.Dataset.from_tensor_slices((tf.random.normal((50, 1)), labels)).batch(2)
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.CategoricalCrossentropy()
epochs = 2
for epoch in range(epochs):
print("\nStart of epoch %d" % (epoch,))
for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):
with tf.GradientTape() as tape:
logits = model.layers[0](x_batch_train)
loss_value = loss_fn(y_batch_train, logits)
grads = tape.gradient(loss_value, model.trainable_weights)
optimizer.apply_gradients(zip(grads, model.trainable_weights))
And prediction:
print(model.predict(tf.random.normal((1, 1))))
[3]
I think there is a Model.predict_classes function that would replace the need for that lambda layer. But if it doesn't work:
There doesn't seem to be a way to do that without using one of these hacks:
Two inputs (one is the groud truth values Y)
Two outputs
Two models
I'm quite convinced there is no other workaround for this.
So, I believe the "two models" version is the best for your case where you seem to "need" a model with single input, single output and fit.
Then I'd do this:
inputs = tf.keras.layers.Input(input_shape_without_batch_size)
loss_outputs = tf.keras.layers.Dense(num_classes,activation='softmax')(inputs)
final_outputs = tf.keras.layers.Lambda(lambda x: tf.math.argmax(x, axis=-1))(loss_outputs)
training_model = tf.keras.models.Model(inputs, loss_outputs)
final_model = tf.keras.models.Model(inputs, final_outputs)
training_model.compile(.....)
training_model.fit(....)
results = final_model.predict(...)
I am trying to use a custom loss-function which depends on some arguments that the model does not have.
The model has two inputs (mel_specs and pred_inp) and expects a labels tensor for training:
def to_keras_example(example):
# Preparing inputs
return (mel_specs, pred_inp), labels
# Is a tf.train.Dataset for model.fit(train_data, ...)
train_data = load_dataset(fp, 'train).map(to_keras_example).repeat()
In my loss function I need to calculate the lengths of mel_specs and pred_inp. This means my loss looks like this:
def rnnt_loss_wrapper(y_true, y_pred, mel_specs_inputs_):
input_lengths = get_padded_length(mel_specs_inputs_[:, :, 0])
label_lengths = get_padded_length(y_true)
return rnnt_loss(
acts=y_pred,
labels=tf.cast(y_true, dtype=tf.int32),
input_lengths=input_lengths,
label_lengths=label_lengths
)
However, no matter which approach I choose, I am facing some issue.
Option 1) Setting the loss-function in model.compile()
If I actually wrap the loss function s.t. it returns a function which takes y_true and y_pred like this:
def rnnt_loss_wrapper(mel_specs_inputs_):
def inner_(y_true, y_pred):
input_lengths = get_padded_length(mel_specs_inputs_[:, :, 0])
label_lengths = get_padded_length(y_true)
return rnnt_loss(
acts=y_pred,
labels=tf.cast(y_true, dtype=tf.int32),
input_lengths=input_lengths,
label_lengths=label_lengths
)
return inner_
model = create_model(hparams)
model.compile(
optimizer=optimizer,
loss=rnnt_loss_wrapper(model.inputs[0]
)
Here I get a _SymbolicException after calling model.fit():
tensorflow.python.eager.core._SymbolicException: Inputs to eager execution function cannot be Keras symbolic tensors, but found [...]
Option 2) Using model.add_loss()
The documentation of add_loss() states:
[Adds a..] loss tensor(s), potentially dependent on layer inputs.
..
Arguments:
losses: Loss tensor, or list/tuple of tensors. Rather than tensors, losses
may also be zero-argument callables which create a loss tensor.
inputs: Ignored when executing eagerly. If anything ...
So I tried to do the following:
def rnnt_loss_wrapper(y_true, y_pred, mel_specs_inputs_):
input_lengths = get_padded_length(mel_specs_inputs_[:, :, 0])
label_lengths = get_padded_length(y_true)
return rnnt_loss(
acts=y_pred,
labels=tf.cast(y_true, dtype=tf.int32),
input_lengths=input_lengths,
label_lengths=label_lengths
)
model = create_model(hparams)
model.add_loss(
rnnt_loss_wrapper(
y_true=model.inputs[2],
y_pred=model.outputs[0],
mel_specs_inputs_=model.inputs[0],
),
inputs=True
)
model.compile(
optimizer=optimizer
)
However, calling model.fit() throws a ValueError:
ValueError: No gradients provided for any variable: [...]
Is any of the above options supposed to work?
I have used the add_loss method as follow:
def custom_loss(y_true, y_pred, input_):
# custom loss function
y_estim = input_[...,0]*y_pred
shape = tf.cast(tf.shape(y_true)[1], dtype='float32')
return tf.reduce_mean(1/shape*tf.reduce_sum(tf.pow(y_true-y_estim, 2), axis=1))
mix_input = layers.Input(shape=(301, 257, 4)) # input 1
ref_input = layers.Input(shape=(301, 257, 1)) # input 2
target = layers.Input(shape=(301, 257)) # output target
smss_model = Model(inputs=[mix_input, ref_input], outputs=smss) # my model that accept two inputs
model = Model(inputs=[mix_input, ref_input, target], outputs=smss) # this one used just to train the model, with the additional paramters
model.add_loss(custom_loss(target, smss, mix_input)) # the add_loss where to pass the custom loss function
model.summary()
model.compile(loss=None, optimizer='sgd')
model.fit([mix, ref, y], epochs=1, batch_size=1, verbose=1)
even do I have used this method and works, I still looking for another method, that not involve creating a training model
Did using lambda function work? (https://www.w3schools.com/python/python_lambda.asp)
loss = lambda x1, x2: rnnt_loss(x1, x2, acts, labels, input_lengths,
label_lengths, blank_label=0)
In this way your loss function should be a function accepting parameters x1 and x2, but rnnt_loss can also be aware of acts, labels, input_lengths, label_lengths and blank_label
I am trying to implement the OSME MAMC model describe in article https://arxiv.org/abs/1806.05372.
I'm stuck where I have to add a cost that doesn't depend on y_true and y_pred but on hidden layers and y_true.
It can't be right as tensorflow custom loss, for which we need y_true and y_pred.
I wrote the model into class, then tried to use gradient tape to add NPairLoss to Softmax output loss, but gradient is NaN during training.
I think my approach isn't good, but I have no idea how to design / write it.
Here my model :
class OSME_network(tf.keras.Model):
def __init__(self, nbrclass=10, weight="imagenet",input_tensor=(32,32,3)):
super(OSME_network, self).__init__()
self.nbrclass = nbrclass
self.weight = weight
self.input_tensor=input_tensor
self.Resnet_50=ResNet50(include_top=False, weights=self.weight, input_shape=self.input_tensor)
self.Resnet_50.trainable=False
self.split=Lambda(lambda x: tf.split(x,num_or_size_splits=2,axis=-1))
self.s_1=OSME_Layer(ch=1024,ratio=16)
self.s_2=OSME_Layer(ch=1024,ratio=16)
self.fl1=tf.keras.layers.Flatten()
self.fl2=tf.keras.layers.Flatten()
self.d1=tf.keras.layers.Dense(1024, name='fc1')
self.d2=tf.keras.layers.Dense(1024,name='fc2')
self.fc=Concatenate()
self.preds=tf.keras.layers.Dense(self.nbrclass,activation='softmax')
#tf.function
def call(self,x): #set à construire le model sequentiellement
x=self.Resnet_50(x)
x_1,x_2=self.split(x)
xx_1 = self.s_1(x_1)
xx_2 = self.s_2(x_2)
xxx_1 = self.d1(xx_1)
xxx_2 = self.d2(xx_2)
xxxx_1 = self.fl1(xxx_1)
xxxx_2 = self.fl2(xxx_2)
fc = self.fc([xxxx_1,xxxx_2]) #fc1 + fc2
ret=self.preds(fc)
return xxxx_1,xxxx_2,ret
class OSME_Layer(tf.keras.layers.Layer):
def __init__(self,ch,ratio):
super(OSME_Layer,self).__init__()
self.GloAvePool2D=GlobalAveragePooling2D()
self.Dense1=Dense(ch//ratio,activation='relu')
self.Dense2=Dense(ch,activation='sigmoid')
self.Mult=Multiply()
self.ch=ch
def call(self,inputs):
squeeze=self.GloAvePool2D(inputs)
se_shape = (1, 1, self.ch)
se = Reshape(se_shape)(squeeze)
excitation=self.Dense1(se)
excitation=self.Dense2(excitation)
scale=self.Mult([inputs,excitation])
return scale
class NPairLoss():
def __init__(self):
self._inputs = None
self._y=None
#tf.function
def __call__(self,inputs,y):
targets=tf.argmax(y, axis=1)
b, p, _ = inputs.shape
n = b * p
inputs=tf.reshape(inputs, [n, -1])
targets = tf.repeat(targets,repeats=p)
parts = tf.tile(tf.range(p),[b])
prod=tf.linalg.matmul(inputs,inputs,transpose_a=False,transpose_b=True)
same_class_mask = tf.math.equal(tf.broadcast_to(targets,[n, n]),tf.transpose(tf.broadcast_to(targets,(n, n))))
same_atten_mask = tf.math.equal(tf.broadcast_to(parts,[n, n]),tf.transpose(tf.broadcast_to(parts,(n, n))))
s_sasc = same_class_mask & same_atten_mask
s_sadc = (~same_class_mask) & same_atten_mask
s_dasc = same_class_mask & (~same_atten_mask)
s_dadc = (~same_class_mask) & (~same_atten_mask)
loss_sasc = 0
loss_sadc = 0
loss_dasc = 0
for i in range(n):
#loss_sasc
pos = prod[i][s_sasc[i]]
neg = prod[i][s_sadc[i] | s_dasc[i] | s_dadc[i]]
n_pos=tf.shape(pos)[0]
n_neg=tf.shape(neg)[0]
pos = tf.transpose(tf.broadcast_to(pos,[n_neg,n_pos]))
neg = tf.broadcast_to(neg,[n_pos,n_neg])
exp=tf.clip_by_value(tf.math.exp(neg - pos),clip_value_min=0,clip_value_max=9e6) # need to clip value, else inf
loss_sasc += tf.reduce_sum(tf.math.log(1 + tf.reduce_sum(exp,axis=1)))
#loss_sadc
pos = prod[i][s_sadc[i]]
neg = prod[i][s_dadc[i]]
n_pos = tf.shape(pos)[0]
n_neg = tf.shape(neg)[0]
pos = tf.transpose(tf.broadcast_to(pos,[n_neg,n_pos])) #np.transpose(np.tile(pos,[n_neg,1]))
neg = tf.broadcast_to(neg,[n_pos,n_neg])#np.tile(neg,[n_pos,1])
exp=tf.clip_by_value(tf.math.exp(neg - pos),clip_value_min=0,clip_value_max=9e6)
loss_sadc += tf.reduce_sum(tf.math.log(1 + tf.reduce_sum(exp,axis=1)))
#loss_dasc
pos = prod[i][s_dasc[i]]
neg = prod[i][s_dadc[i]]
n_pos = tf.shape(pos)[0]
n_neg = tf.shape(neg)[0]
pos = tf.transpose(tf.broadcast_to(pos,[n_neg,n_pos])) #np.transpose(np.tile(pos,[n_neg,1]))
neg = tf.broadcast_to(neg,[n_pos,n_neg])#np.tile(neg,[n_pos,1])
exp=tf.clip_by_value(tf.math.exp(neg - pos),clip_value_min=0,clip_value_max=9e6)
loss_dasc += tf.reduce_sum(tf.math.log(1 + tf.reduce_sum(exp,axis=1)))
return (loss_sasc + loss_sadc + loss_dasc) / n
then, for training :
#tf.function
def train_step(x,y):
with tf.GradientTape() as tape:
fc1,fc2,y_pred=model(x,training=True)
stacked=tf.stack([fc1,fc2],axis=1)
layerLoss=npair(stacked,y)
loss=cce(y, y_pred) +0.001*layerLoss
grads=tape.gradient(loss,model.trainable_variables)
opt.apply_gradients(zip(grads,model.trainable_variables))
return loss
model=OSME_network(weight="imagenet",nbrclass=10,input_tensor=(32, 32, 3))
model.compile(optimizer=opt, loss=categorical_crossentropy, metrics=["acc"])
model.build(input_shape=(None,32,32,3))
cce = tf.keras.losses.CategoricalCrossentropy(from_logits=True,name='categorical_crossentropy')
npair=NPairLoss()
for each batch :
x=tf.Variable(x_train[start:end])
y=tf.Variable(y_train[start:end])
train_loss=train_step(x,y)
Thanks for any help :)
You can use tensorflow's add_loss.
model.compile() loss functions in Tensorflow always take two parameters y_true and y_pred. Using model.add_loss() has no such restriction and allows you to write much more complex losses that depend on many other tensors, but it has the inconvenience of being more dependent on the model, whereas the standard loss functions work with just any model.
You can find the official documentation of add_loss here. Add loss tensor(s), potentially dependent on layer inputs. This method can be used inside a subclassed layer or model's call function, in which case losses should be a Tensor or list of Tensors. There are few example in the documentation to explain the add_loss.
This method can also be called directly on a Functional Model during construction. In this case, any loss Tensors passed to this Model must be symbolic and be able to be traced back to the model's Inputs. These losses become part of the model's topology and are tracked in get_config.
Example :
inputs = tf.keras.Input(shape=(10,))
x = tf.keras.layers.Dense(10)(inputs)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
# Activity regularization.
model.add_loss(tf.abs(tf.reduce_mean(x)))
You can call self.add_loss(loss_value) from inside the call method of a custom layer. Here's a simple example that adds activity regularization.
Example:
class ActivityRegularizationLayer(layers.Layer):
def call(self, inputs):
self.add_loss(tf.reduce_sum(inputs) * 0.1)
return inputs # Pass-through layer.
inputs = keras.Input(shape=(784,), name='digits')
x = layers.Dense(64, activation='relu', name='dense_1')(inputs)
# Insert activity regularization as a layer
x = ActivityRegularizationLayer()(x)
x = layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = layers.Dense(10, name='predictions')(x)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True))
# The displayed loss will be much higher than before
# due to the regularization component.
model.fit(x_train, y_train,
batch_size=64,
epochs=1)
You can find good example using add_loss here and here with explanations.
Hope this answers your question. Happy Learning.
I want to write a custom loss function in Keras which depends on an attribute of a (custom) layer in the network.
The idea is the following:
I have a custom layer which modifies the input in each epoch based on a random variable
The output labels should be modified based on the same variable
Some example code to make it more clear:
import numpy as np
from keras import losses, layers, models
class MyLayer(layers.Layer):
def call(self, x):
a = np.random.rand()
self.a = a # <-- does this work as expected?
return x+a
def my_loss(layer):
def modified_loss(y_true, y_pred):
a = layer.a
y_true = y_true + a
return losses.mse(y_true, y_pred)
input_layer = layers.Input()
my_layer = MyLayer(input_layer, name="my_layer")
output_layer = layers.Dense(4)(my_layer)
model = models.Model(inputs=input_layer, outputs=output_layer)
model.compile('adam', my_loss(model.get_layer("my_layer")))
I expect that a is changing for every batch and that the same a is used in the layer and loss function.
Right now, it is not working the way I intended. It seems like the a in the loss function is never updated (and maybe not even in the layer).
How do I change the attribute/value of a in the layer at every call and access it in the loss function?
Not quite sure I am following the purpose on this (and I am bothered by the call to np inside the call() of your custom layer - could you not use the tf.random functions instead?) but you can certainly access the a property inside your loss function.
Perhaps something like:
class MyLayer(layers.Layer):
def call(self, x):
a = np.random.rand() # FIXME --> use tf.random
self.a = a
return x+a
input_layer = layers.Input()
my_layer = MyLayer(input_layer, name="my_layer")
output_layer = layers.Dense(4)(my_layer)
model = models.Model(inputs=input_layer, outputs=output_layer)
def my_loss(y_true, y_pred):
y_true = y_true + my_layer.a
return losses.mse(y_true, y_pred)
model.compile('adam', loss=my_loss)
I have an autoencoder set up in Keras. I want to be able to weight the features of the input vector according to a predetermined 'precision' vector. This continuous valued vector has the same length as the input, and each element lies in the range [0, 1], corresponding to the confidence in the corresponding input element, where 1 is completely confident and 0 is no confidence.
I have a precision vector for every example.
I have defined a loss that takes into account this precision vector. Here, reconstructions of low-confidence features are down-weighted.
def MAEpw_wrapper(y_prec):
def MAEpw(y_true, y_pred):
return K.mean(K.square(y_prec * (y_pred - y_true)))
return MAEpw
My issue is that the precision tensor y_prec depends on the batch. I want to be able to update y_prec according to the current batch so that each precision vector is correctly associated with its observation.
I have the done the following:
global y_prec
y_prec = K.variable(P[:32])
Here P is a numpy array containing all precision vectors with the indices corresponding to the examples. I initialize y_prec to have the correct shape for a batch size of 32. I then define the following DataGenerator:
class DataGenerator(Sequence):
def __init__(self, batch_size, y, shuffle=True):
self.batch_size = batch_size
self.y = y
self.shuffle = shuffle
self.on_epoch_end()
def on_epoch_end(self):
self.indexes = np.arange(len(self.y))
if self.shuffle == True:
np.random.shuffle(self.indexes)
def __len__(self):
return int(np.floor(len(self.y) / self.batch_size))
def __getitem__(self, index):
indexes = self.indexes[index * self.batch_size: (index+1) * self.batch_size]
# Set precision vector.
global y_prec
new_y_prec = K.variable(P[indexes])
y_prec = K.update(y_prec, new_y_prec)
# Get training examples.
y = self.y[indexes]
return y, y
Here I am aiming to update y_prec in the same function that generates the batch. This seems to be updating y_prec as expected. I then define my model architecture:
dims = [40, 20, 2]
model2 = Sequential()
model2.add(Dense(dims[0], input_dim=64, activation='relu'))
model2.add(Dense(dims[1], input_dim=dims[0], activation='relu'))
model2.add(Dense(dims[2], input_dim=dims[1], activation='relu', name='bottleneck'))
model2.add(Dense(dims[1], input_dim=dims[2], activation='relu'))
model2.add(Dense(dims[0], input_dim=dims[1], activation='relu'))
model2.add(Dense(64, input_dim=dims[0], activation='linear'))
And finally, I compile and run:
model2.compile(optimizer='adam', loss=MAEpw_wrapper(y_prec))
model2.fit_generator(DataGenerator(32, digits.data), epochs=100)
Where digits.data is a numpy array of observations.
However, this ends up defining separate graphs:
StopIteration: Tensor("Variable:0", shape=(32, 64), dtype=float32_ref) must be from the same graph as Tensor("Variable_4:0", shape=(32, 64), dtype=float32_ref).
I've scoured SO for a solution to my problem but nothing I've found works. Any help on how to do this properly is appreciated.
This autoencoder can be easily implemented using the Keras functional API. This will allow to have an additional input placeholder y_prec_input, which will be fed with the "precision" vector. The full source code can be found here.
Data generator
First, let's reimplement your data generator as follows:
class DataGenerator(Sequence):
def __init__(self, batch_size, y, prec, shuffle=True):
self.batch_size = batch_size
self.y = y
self.shuffle = shuffle
self.prec = prec
self.on_epoch_end()
def on_epoch_end(self):
self.indexes = np.arange(len(self.y))
if self.shuffle:
np.random.shuffle(self.indexes)
def __len__(self):
return int(np.floor(len(self.y) / self.batch_size))
def __getitem__(self, index):
indexes = self.indexes[index * self.batch_size: (index + 1) * self.batch_size]
y = self.y[indexes]
y_prec = self.prec[indexes]
return [y, y_prec], y
Note that I got rid of the global variable. Now, instead, the precision vector P is provided as input argument (prec), and the generator yields an additional input that will be fed to the precision placeholder y_prec_input (see model definition).
Model
Finally, your model can be defined and trained as follows:
y_input = Input(shape=(input_dim,))
y_prec_input = Input(shape=(1,))
h_enc = Dense(dims[0], activation='relu')(y_input)
h_enc = Dense(dims[1], activation='relu')(h_enc)
h_enc = Dense(dims[2], activation='relu', name='bottleneck')(h_enc)
h_dec = Dense(dims[1], activation='relu')(h_enc)
h_dec = Dense(input_dim, activation='relu')(h_dec)
model2 = Model(inputs=[y_input, y_prec_input], outputs=h_dec)
model2.compile(optimizer='adam', loss=MAEpw_wrapper(y_prec_input))
# Train model
model2.fit_generator(DataGenerator(32, digits.data, P), epochs=100)
where input_dim = digits.data.shape[1]. Note that I also changed the output dimension of the decoder to input_dim, since it must match the input dimension.
Try to test your code with worker=0 when you call fit_generator, if it works normally then threading is your problem.
If threading is the cause, try this:
# In the code that executes on the main thread
graph = tf.get_default_graph()
# In code that executes in other threads(e.g. your generator)
with graph.as_default():
...
...
new_y_prec = K.variable(P[indexes])
y_prec = K.update(y_prec, new_y_prec)