Using Model as a Layer in another Model, First model not training - python

I built a Keras model that uses another model as a layer, but the weights in the inner model are not training. How do I get around this?
For more details, I am using a transformer to encode sentences individually, then combining the set of sentences with another transformer.
Here is the pseudo code:
Class:
    def build_context_encoder(self):
        a = Input(sentences shape)
        # function stuff
        b = # transformer structure
        context_encoder = Model(inputs=[a], outputs=b)
        return context_encoder

    def build_model(self):
        list_of_contexts = Input(list of contexts shape)
        context_embs = Lambda(lambda x: K.map_fn(fn=self.context_encoder, elems=x, dtype=tf.float32))(list_of_contexts)
        c = # rest of the model (context_embs)
        model = Model(inputs=[list_of_contexts], outputs=c)
        model.compile(loss='mean_squared_error', optimizer='adam', metrics=[])
        return model

    def __init__(self):
        self.context_encoder = self.build_context_encoder()
        self.model = self.build_model()
Why don't the weights in context_encoder update when I call fit? Is it due to the map_fn, or because I'm calling the model? How do I fix this?
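A likely cause is the Lambda wrapper: Keras does not track the variables of a model that is only referenced inside a Lambda function, so the optimizer never sees them. The sketch below shows one possible workaround, assuming each sample is a fixed-length list of sentences: wrap the inner model in TimeDistributed instead of Lambda plus map_fn. The Dense layer standing in for the transformer, the pooling step, and all shapes are hypothetical placeholders, not the asker's architecture.

```python
import numpy as np
import tensorflow as tf

def build_context_encoder(sentence_len, emb_dim):
    # Hypothetical stand-in for the sentence transformer.
    a = tf.keras.Input(shape=(sentence_len,))
    b = tf.keras.layers.Dense(emb_dim, activation="tanh")(a)
    return tf.keras.Model(inputs=a, outputs=b, name="context_encoder")

def build_model(context_encoder, n_contexts, sentence_len):
    contexts = tf.keras.Input(shape=(n_contexts, sentence_len))
    # TimeDistributed applies the inner model to every sentence and
    # registers the inner model's weights with the outer model.
    context_embs = tf.keras.layers.TimeDistributed(context_encoder)(contexts)
    # Placeholder for "rest of the model".
    pooled = tf.keras.layers.GlobalAveragePooling1D()(context_embs)
    c = tf.keras.layers.Dense(1)(pooled)
    model = tf.keras.Model(inputs=contexts, outputs=c)
    model.compile(loss="mean_squared_error", optimizer="adam")
    return model

encoder = build_context_encoder(sentence_len=8, emb_dim=4)
model = build_model(encoder, n_contexts=3, sentence_len=8)

# Train for one epoch on random data and check that the encoder moved.
before = [w.copy() for w in encoder.get_weights()]
x = np.random.rand(16, 3, 8).astype("float32")
y = np.random.rand(16, 1).astype("float32")
model.fit(x, y, epochs=1, verbose=0)
after = encoder.get_weights()
encoder_changed = any(not np.array_equal(b, a) for b, a in zip(before, after))
print("encoder weights updated:", encoder_changed)
```

A quick way to diagnose the original setup is to compare len(model.trainable_weights) with what you expect; if the encoder's variables are missing from that list, fit cannot update them.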

Related

Access to layer weights from a tf.keras model

I am trying to replicate a TensorFlow subclassed model, but I'm having problems accessing the weights of a layer included in the model. Here's a summarized definition of the model:
class model():
    def __init__(self, dims, size):
        self.dims = dims
        self.size = size
        self.autoencoder = None
        self.encoder = None
        self.decoder = None
        self.model = None

    def initialize(self):
        self.autoencoder, self.encoder, self.decoder = mlp_autoencoder(self.dims)
        output = MyLayer(self.size, name='MyLayer')(self.encoder.output)
        self.model = Model(inputs=self.autoencoder.input,
                           outputs=[self.autoencoder.output, output])
mlp_autoencoder defines as many encoder and decoder layers as introduced in dims.
MyLayer's trainable weights are learnt in the encoder's latent space and are then used to return the second output.
There are no issues accessing the autoencoder weights; the problem appears when trying to get MyLayer's weights. The first place it crashes is the following part of the code:
@property
def layer_weights(self):
    return self.model.get_layer(name='MyLayer').get_weights()
# ValueError: No such layer: MyLayer.
By building the model this way a different TFOpLambda Layer is created for each transformation made to the encoder.output in the custom layer. I tried getting the weights through the last TFOpLambda layer (the second output of the model) but get_weights returns an empty list. In summary, these weights are never stored in the model.
I checked if MyLayer is well defined by using it separately, and it creates and stores the variables just fine, I had no issues accessing them. The problem appears when trying to use this layer in model.
Can someone more knowledgeable in subclassing tell me if there is something wrong in the definition of the model? I've considered using build and call, as it seems to be the 'standard' way, but there's got to be a simpler way...
I can provide more details of the program if needed.
Thanks in advance!
A (not very elegant) way to solve it is to create the custom layer in the __init__ method. By doing this, the layer is stored as an attribute of the class, making its weights accessible.
def __init__(self, dims, size):
    self.dims = dims
    self.size = size
    self.autoencoder = None
    self.encoder = None
    self.decoder = None
    self.model = None
    self.custom_layer = MyLayer(self.size, name='MyLayer')

def initialize(self):
    self.autoencoder, self.encoder, self.decoder = mlp_autoencoder(self.dims)
    h = self.custom_layer(self.encoder.output)
    self.model = Model(inputs=self.autoencoder.input,
                       outputs=[self.autoencoder.output, h])
Getting weights:
def layer_weights(self):
    return self.custom_layer.get_weights()[0]
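For comparison, here is a minimal sketch of the build/call route the question mentions. It assumes MyLayer can be written as a subclass of tf.keras.layers.Layer that creates its variable with add_weight; the distance computation, the weight name "centers", and the tiny encoder and decoder are hypothetical stand-ins for the real code. Written this way, the layer shows up in the functional model under its own name, so get_layer can find it.

```python
import tensorflow as tf

class MyLayer(tf.keras.layers.Layer):
    def __init__(self, size, **kwargs):
        super().__init__(**kwargs)
        self.size = size

    def build(self, input_shape):
        # One trainable matrix living in the encoder's latent space.
        self.centers = self.add_weight(
            name="centers",
            shape=(self.size, int(input_shape[-1])),
            initializer="glorot_uniform",
            trainable=True,
        )

    def call(self, inputs):
        # Squared distance from each latent vector to each center: (batch, size).
        diff = tf.expand_dims(inputs, axis=1) - self.centers
        return tf.reduce_sum(tf.square(diff), axis=-1)

# Hypothetical stand-in for mlp_autoencoder: a tiny encoder and decoder.
inputs = tf.keras.Input(shape=(10,))
latent = tf.keras.layers.Dense(4, name="encoder")(inputs)
reconstruction = tf.keras.layers.Dense(10, name="decoder")(latent)
second_output = MyLayer(3, name="MyLayer")(latent)

model = tf.keras.Model(inputs=inputs, outputs=[reconstruction, second_output])

# The layer is now part of the model graph and can be looked up by name.
layer_weights = model.get_layer(name="MyLayer").get_weights()
print([w.shape for w in layer_weights])
```

The design point is that the variable is owned by a Layer object, so Keras records one named layer instead of a chain of anonymous op layers.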

How do I get this model to predict the multi label classification value?

How do I get this model to predict the multi label classification value for a test input after training on the train input? There are 3 classes: good, bad, and ugly. train_input is a dictionary that holds the training dataset. test_input holds the value 241.43, for which one of 'good', 'bad', or 'ugly' should be predicted; in this case the predicted value should probably be 'bad'.
from keras.layers import Input, Dense
from keras.models import Model

''' 3 multi label classification using deep learning '''
classification_labels_3 = ['good', 'bad', 'ugly']
train_input = {100.23:'good', 234.76:'bad', 500.90:'ugly'}
test_input = 241.43

'''
# Define the keras architecture of your model in 'build_model' and return it. Compilation must be
done in 'compile_model'.
# input_shapes - dictionary of shapes per input as defined train_input dictionary
# n_classes - For classification, number of target classes
'''
def build_model(input_shapes, n_classes=None):
    '''
    # This input will receive all the train_input features
    # sent to 'main'
    '''
    input_main = Input(shape=input_shapes['main'], name='main')
    x = Dense(64, activation='relu')(input_main)
    x = Dense(64, activation='relu')(x)
    predictions = Dense(n_classes, activation='softmax')(x)
    '''
    # The 'inputs' parameter of your model must contain the
    # full list of inputs used in the architecture
    '''
    model = Model(inputs=[input_main], outputs=predictions)
    return model

'''
# Compile your model and return it
# model - model defined in 'build_model'
'''
def compile_model(model):
    '''
    # The loss function depends on the type of problem you solve.
    # 'categorical_crossentropy' is appropriate for a multiclass classification.
    '''
    model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
    return model

'''
#print(build_model(input_shapes=,n_classes=))
#print(compile(source, filename, mode))
'''
To predict one of several classes, you have to use the softmax activation function in the output layer. I can see that you are already using it, so you only have to set the number of classes for that layer:
predictions = Dense(n_classes, activation='softmax')(x)
As you can see, n_classes is defined as a variable of the function build_model, so just adjust this variable when you call the function and you will build a model with the outputs you specified.
But first, you have to get the dataset into a DataFrame. Try something like this:
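import pandas as pd

# Collect the numeric inputs (the dictionary keys)
val = []
for e in train_input:
    val.append(e)
# Collect the matching class labels (the dictionary values)
clas = []
for e in train_input:
    clas.append(train_input[e])
df = pd.DataFrame(val, columns=["values"])
df["clas"] = clas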
You will obtain a dataframe like this:
   values  clas
0  100.23  good
1  234.76   bad
2  500.90  ugly
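Since the model is compiled with 'categorical_crossentropy', the string labels still need to become one-hot vectors before training. A minimal sketch of that step, using NumPy only and assuming the class order given in classification_labels_3; the names label_to_index, x_train, y_train, and decode are hypothetical helpers, not part of the original code:

```python
import numpy as np

classification_labels_3 = ['good', 'bad', 'ugly']
train_input = {100.23: 'good', 234.76: 'bad', 500.90: 'ugly'}

# Map each label to its position in the label list.
label_to_index = {label: i for i, label in enumerate(classification_labels_3)}

# Features: one column holding the numeric values (the dictionary keys).
x_train = np.array(list(train_input.keys()), dtype="float32").reshape(-1, 1)

# Targets: one-hot rows, one column per class.
class_ids = np.array([label_to_index[v] for v in train_input.values()])
y_train = np.eye(len(classification_labels_3), dtype="float32")[class_ids]

def decode(probabilities):
    # Turn a softmax output row back into its label.
    return classification_labels_3[int(np.argmax(probabilities))]

print(x_train.shape, y_train.shape)
print(decode([0.1, 0.7, 0.2]))
```

With arrays in this shape, the model could be built with input_shapes={'main': (1,)} and n_classes=3, and decode would be applied to each row returned by model.predict.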

LSTM Model not having any variance during evaluation

I have a question regarding the evaluation of an LSTM model. I have trained an LSTM model and stored it with model.save(...). Now I want to load it with load_model and evaluate it on the validation dataset. Since neural networks are stochastic, I run it several times and compute the mean and the variance of the different metrics I am interested in.
Now I am shocked that after the first run all consecutive runs have the same performance on every metric. I don't think that is right, but I don't know where the error occurs.
So my question is:
what is my mistake in setting up the validation of my model?
and how can I fix that?
Here are the code snippets that should explain what I am doing:
Compile and fit the Model
def compile_and_fit(hparams, MAX_EPOCHS, model_path):
    window = WindowGenerator(input_width=hparams[HP_WINDOW_SIZE],
                             label_width=hparams[HP_WINDOW_SIZE], shift=1,
                             label_columns=['q_MARI'], batch_size=hparams[HP_BATCH_SIZE])
    model = tf.keras.models.Sequential([
        tf.keras.layers.LSTM(hparams[HP_NUM_UNITS], return_sequences=True, name="LSTM_1"),
        tf.keras.layers.Dropout(hparams[HP_DROPOUT], name="Dropout_1"),
        tf.keras.layers.LSTM(hparams[HP_NUM_UNITS], return_sequences=True, name="LSTM_2"),
        tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1))
    ])
    learning_rate = hparams[HP_LEARNING_RATE]
    model.compile(loss=tf.losses.MeanSquaredError(),
                  optimizer=tf.optimizers.Adam(learning_rate=learning_rate),
                  metrics=get_metrics())
    history = model.fit(window.train,
                        epochs=MAX_EPOCHS,
                        validation_data=window.val,
                        callbacks=get_callbacks(model_path))
    _, a, _, _, _, _ = model.evaluate(window.val)
    return a, model, history
Train and save it
a, model, history = compile_and_fit( hparams = hparams, MAX_EPOCHS = MAX_EPOCHS, model_path = run_path)
model.save(run_path)
Load and evaluate it
model = tf.keras.models.load_model(os.path.join(hparam_path, model_name),
                                   custom_objects={"max_error": max_error, "median_absolute_error": median_absolute_error, "rev_metric": rev_metric, "nse_metric": nse_metric})
model.compile(loss=tf.losses.MeanSquaredError(), optimizer="adam", metrics=get_metrics())
metric_values = np.empty(shape=(nr_runs, len(metrics)), dtype=float)
for j in range(nr_runs):
    window = WindowGenerator(input_width=hparam_vals[i], label_width=hparam_vals[i], shift=1,
                             label_columns=['q_MARI'])
    metric_values[j] = np.array(model.evaluate(window.val))
means = metric_values.mean(axis=0)
varis = metric_values.var(axis=0)
print(f'means: {means}, varis: {varis}')
The results I am getting
For setting up the Training I follow those two guides:
https://www.tensorflow.org/tutorials/structured_data/time_series
https://www.tensorflow.org/tensorboard/hyperparameter_tuning_with_hparams
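An LSTM is not stochastic at evaluation time. Evaluation results should be the same for the same data and the same weights.
There are two steps. While you train the model, randomness (weight initialization, data shuffling, dropout) influences the model you end up with. Once you have saved that model, however, the prediction results are the same every time you load and evaluate it, because layers such as Dropout are inactive during evaluation.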
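A small sketch of that behaviour, using a hypothetical, untrained model shaped like the one in the question (layer sizes and input shape are made up). Calling the model in inference mode twice gives the same output, while forcing training mode re-samples the dropout mask on each call:

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(8, return_sequences=True),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1)),
])
x = np.random.rand(4, 10, 3).astype("float32")

# Inference mode (what evaluate and predict use): dropout is a no-op.
eval_a = model(x, training=False).numpy()
eval_b = model(x, training=False).numpy()

# Training mode: a new random dropout mask is drawn on every call.
train_a = model(x, training=True).numpy()
train_b = model(x, training=True).numpy()

print("inference runs identical:", np.allclose(eval_a, eval_b))
print("training-mode runs identical:", np.allclose(train_a, train_b))
```

So a variance of zero across repeated evaluate calls is expected. To measure run-to-run variance, the training itself would have to be repeated, with each trained model evaluated once.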

Loading a keras model with custom loss based on input

I have a custom loss which uses one of the inputs to the model.
def closs(labels, latent_dim):
    def loss(y_true, y_pred):
        return metric_learning.contrastive_loss(labels=labels,
                                                embeddings_anchor=y_pred[:, :latent_dim],
                                                embeddings_positive=y_pred[:, latent_dim:])
    return loss
Where labels is an input to the model. The model architecture is:
def build_model():
    left_input = Input(shape=(2900, 1))
    right_input = Input(shape=(2900, 1))
    label = Input(shape=(1,))
    encoder = build_encoder()
    left_embed = encoder(left_input)
    right_embed = encoder(right_input)
    embeds = Concatenate()([left_embed, right_embed])
    model = Model(inputs=[left_input, right_input, label], outputs=[embeds])
    return model, label
Then I use the returned "label" to compile the model:
model,label = build_model()
model.compile(optimizer='adam',loss=closs(label,256))
But when I try to load the model, I have to pass this loss as a custom_object, so something like this:
model = load_model('model/cl_model.h5',custom_objects={'loss':closs(xyz,256)})
The issue is that I'm loading the model in a different python script, and so I don't have the "label" input object.
How can I overcome this?
Are you using the weights to retrain the model or just to predict on new data?
If you only need predictions, you can define the model architecture in the new script and then call
model.load_weights('model/cl_model.h5')
You won't have to pass the loss function, since the loss is not needed for prediction.
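Another option for the prediction-only case is the compile argument of load_model: with compile=False, Keras restores the architecture and weights but skips the loss and optimizer, so no custom_objects entry for the loss is required. The sketch below is an illustration under assumptions: the model is a tiny hypothetical stand-in, closs here takes a made-up margin parameter instead of the label input, and the file is written to a temporary directory.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

def closs(margin):
    # Hypothetical stand-in for the custom loss factory in the question.
    def loss(y_true, y_pred):
        return tf.reduce_mean(tf.maximum(margin - y_pred, 0.0))
    return loss

inputs = tf.keras.Input(shape=(5,))
outputs = tf.keras.layers.Dense(2)(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer="adam", loss=closs(1.0))

path = os.path.join(tempfile.mkdtemp(), "cl_model.h5")
model.save(path)

# compile=False skips the saved loss/optimizer, so the custom loss is not needed.
restored = tf.keras.models.load_model(path, compile=False)

x = np.random.rand(3, 5).astype("float32")
original_pred = model.predict(x, verbose=0)
restored_pred = restored.predict(x, verbose=0)
print("same predictions:", np.allclose(original_pred, restored_pred))
```

If the model is to be trained further, it can be compiled again after loading, at which point the loss has to be supplied anyway.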

Is there a simple way to repeatedly call LSTM model instead of passing a sequence in Tensorflow?

I have a simple Keras model which is designed to be a bot for a game.
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(144, activation='sigmoid'),
    tf.keras.layers.LSTM(144, return_sequences=True),
    tf.keras.layers.LSTM(3, activation='softmax')
])
Its input is the current situation in the game and its output is a move to make. The model has LSTM layers inside of it because it needs a short-term memory. The problem is that Keras LSTM layers are designed to take a whole series of data as an argument, but in my case the next input depends on the previous output. Currently I am appending inputs to a list and calling the model on the whole list at each step, but it seems inefficient.
def __init__(self, model, loss_object, optimizer):
    #...
    self.model = model
    self.inputs = []

def __call__(self, encoded_input):
    encoded_input = encode(board, direction)
    self.inputs.append(encoded_input)
    input_data = np.array(self.inputs)
    input_data = tf.convert_to_tensor(input_data.reshape((1, input_data.shape[0], input_data.shape[1])))
    output = self.model(input_data)
Is there a simple way in Tensorflow 2.0 to call the model repeatedly and keep its short-term memory between calls instead of constructing a whole series of data?
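One possible approach, sketched under assumptions: an LSTM layer exposes its underlying cell as layer.cell, and the cell can be called on a single time step together with the previous [h, c] state, which it returns updated. Carrying that state between calls avoids re-running the whole history. The sizes below are made up, and the check at the end compares the stepwise result with a single full-sequence call on the same random data.

```python
import numpy as np
import tensorflow as tf

units, features, steps = 3, 4, 5
lstm = tf.keras.layers.LSTM(units)

x = np.random.rand(1, steps, features).astype("float32")

# Reference: process the whole sequence at once (this also builds the layer).
full_output = lstm(tf.constant(x)).numpy()

# Stepwise: feed one time step at a time and carry the state [h, c] forward.
cell = lstm.cell
state = [tf.zeros((1, units)), tf.zeros((1, units))]
for t in range(steps):
    step_output, state = cell(tf.constant(x[:, t, :]), state)
step_output = step_output.numpy()

print("stepwise matches full sequence:", np.allclose(full_output, step_output, atol=1e-5))
```

In the bot, the state list would be stored on the agent object and reset to zeros at the start of each game. Keras also documents a stateful=True option on recurrent layers, which keeps the state inside the layer between calls at the cost of a fixed batch size.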
