Keras Model with 2 inputs during training, but only 1 during inferencing - python

A similar question has been asked before, but the answer was not satisfactory.
Given a model made using the Functional API in Keras:
During training the model has two inputs and one output. One input is an image. The other input is an array of costs that is needed for the custom loss function.
During inference, however, we will get only the image as input and no costs. Hence only one input and one output.
How do we adapt the same model, which has been trained on two inputs, for inference?
The model during training is somewhat like this:
input1 = Input(shape=(64,64,3))  # RGB image
input2 = Input(shape=(4,))  # costs associated with the image, input to the custom loss function
conv1 = Conv2D(16, 3, padding='same', activation='relu')(input1)
# Other layers
output = Dense(6)(x)  # last layer gives the classification output
model = Model(inputs=[input1, input2], outputs=output)
model.compile(loss=custom_loss_function(input2), optimizer='adam')
This is the model during training.
What should be done during inference, when only the image input is present and there are no cost inputs?

Just use a "dummy" input if it doesn't affect the forward pass

You can wrap your inference model in a training model.
In pseudo-code:
def make_model():
    input1 = Input(...)
    conv1 = Conv2D(16, 3, padding='same', activation='relu')(input1)
    # Other layers, ending in some tensor x
    output = Dense(6)(x)  # last layer gives the classification output
    return keras.Model(input1, output)

def make_train_model():
    input1 = Input(...)
    input2 = Input(...)
    m_inner = make_model()
    output = m_inner(input1)
    model = keras.Model([input1, input2], output)
    model.compile(...)
    return model, m_inner
You can train using model and save the inner model for inference.
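For completeness, here is a minimal runnable sketch of this pattern, assuming tf.keras (TF 2.x) and a hypothetical cost-weighted cross-entropy. Since a loss closed over a symbolic input tensor in compile() does not work under eager TF 2, the sketch feeds the labels in as a third input and attaches the cost-aware loss with add_loss:

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Input, Conv2D, Flatten, Dense

def make_model():
    input1 = Input(shape=(64, 64, 3))  # RGB image
    x = Conv2D(16, 3, padding='same', activation='relu')(input1)
    x = Flatten()(x)
    output = Dense(6)(x)  # classification logits
    return keras.Model(input1, output)

def make_train_model():
    input1 = Input(shape=(64, 64, 3))  # image
    input2 = Input(shape=(4,))         # per-sample costs, used only by the loss
    labels = Input(shape=(6,))         # ground-truth labels fed as an input
    m_inner = make_model()
    output = m_inner(input1)
    model = keras.Model([input1, input2, labels], output)
    # hypothetical cost-weighted cross-entropy, written with raw TF ops;
    # replace with whatever the real custom loss does with the costs
    ce = -tf.reduce_sum(labels * tf.nn.log_softmax(output), axis=-1)
    model.add_loss(tf.reduce_mean(ce * tf.reduce_mean(input2, axis=-1)))
    model.compile(optimizer='adam')  # the loss comes entirely from add_loss
    return model, m_inner

model, m_inner = make_train_model()
images = np.random.rand(8, 64, 64, 3).astype('float32')  # dummy data for illustration
costs = np.random.rand(8, 4).astype('float32')
y = keras.utils.to_categorical(np.random.randint(0, 6, 8), 6)
model.fit([images, costs, y], epochs=1)   # training uses image, costs and labels
preds = m_inner.predict(images)           # inference needs only the image

The key point is that m_inner shares its weights with the wrapper, so whatever training does to model is immediately reflected in the single-input inference model.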

Related

Image classification Using CNN [closed]

I am working on breast cancer classification. I found this code online to train my pre-processed outputs on. The results were awful, but I didn't understand the code; I want to train my own model, but I don't know how to replace my own code with this one.
Any help would be appreciated.
in_model = tf.keras.applications.DenseNet121(input_shape=(224,224,3),
                                             include_top=False,
                                             weights='imagenet', classes=2)
in_model.trainable = False
inputs = tf.keras.Input(shape=(224,224,3))
x = in_model(inputs)
flat = Flatten()(x)
dense_1 = Dense(4096, activation='relu')(flat)
dense_2 = Dense(4096, activation='relu')(dense_1)
prediction = Dense(2, activation='softmax')(dense_2)
in_pred = Model(inputs=inputs, outputs=prediction)
This is a deep learning model using Keras. The CNN model:
in_model = tf.keras.applications.DenseNet121(input_shape=(224,224,3),
                                             include_top=False,
                                             weights='imagenet', classes=2)
First of all, this creates a DenseNet121 CNN with pre-trained ImageNet weights. input_shape specifies the shape of the input images to the model. include_top=False specifies that we don't want to include the last fully-connected layer, because we want to replace it with our own layers for our specific task. weights='imagenet' specifies that we want to use pre-trained weights from the ImageNet dataset. Finally, classes=2 specifies the number of output classes for our specific task (it only takes effect when include_top=True).
in_model.trainable = False
This freezes the weights of the pre-trained model so they will not be updated during training, because we only want to train the new layers that we add to the model.
inputs = tf.keras.Input(shape=(224,224,3))
This defines the input layer of the model; shape=(224,224,3) specifies the shape of the input images.
x = in_model(inputs)
Now the pre-trained model is applied to the input images to extract features.
flat = Flatten()(x)
This flattens the output of the pre-trained model into a 1-dimensional array so it can be used as input to the fully-connected layers added next. The next two lines add two fully-connected layers with 4096 units each and ReLU activations; these layers learn more complex features from the flattened output of the pre-trained model.
dense_1 = Dense(4096, activation='relu')(flat)
dense_2 = Dense(4096, activation='relu')(dense_1)
The next step adds the output layer of the model: a fully-connected layer with 2 units (one for each output class) and a softmax activation. This layer outputs the predicted class probabilities for each input image.
prediction = Dense(2, activation='softmax')(dense_2)
Finally, you create the final model by defining the input and output layers. inputs and prediction are the layers defined earlier. The resulting in_pred model is a Keras Model object that can be trained on data for a specific classification task.
in_pred = Model(inputs=inputs, outputs=prediction)
You forgot to call preprocess_input:
Note: each Keras Application expects a specific kind of input preprocessing. For DenseNet, call tf.keras.applications.densenet.preprocess_input on your inputs before passing them to the model.
inputs = tf.keras.Input(shape=(224,224,3))
x = tf.keras.applications.densenet.preprocess_input(inputs) # HERE
x = in_model(x)
You can also try to use the default top network by setting include_top=True, or create the same top network yourself:
x = layers.GlobalAveragePooling2D(name="avg_pool")(x)
x = layers.Dense(2, activation='softmax', name='predictions')(x)
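Putting the pieces together, a minimal sketch of the corrected pipeline could look like the following (the training call is commented out; train_images and train_labels are placeholders, not from the original post):

import tensorflow as tf
from tensorflow.keras import layers, Model

in_model = tf.keras.applications.DenseNet121(input_shape=(224, 224, 3),
                                             include_top=False,
                                             weights='imagenet')
in_model.trainable = False

inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.applications.densenet.preprocess_input(inputs)  # scale pixels as DenseNet expects
x = in_model(x, training=False)  # keep batch-norm layers in inference mode
x = layers.GlobalAveragePooling2D(name="avg_pool")(x)
outputs = layers.Dense(2, activation='softmax', name='predictions')(x)
model = Model(inputs, outputs)

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(train_images, train_labels, epochs=5)  # placeholders for your data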

How do I interpret model summary of merged models in keras?

I want to build a model with many smaller models' outputs merged into one. I want 146 networks, each taking 17 inputs and giving a probability as output. The outputs of all these networks need to be merged and used as a single unit, for which I did something like this:
def build(layer_str, actv):
    # take the input layer structure and convert it into a list
    layers = layer_str.split("-")
    #print(layers)
    # convert the strings in the list to integers
    layers = list(map(int, layers))
    # let's build our model
    model = tf.keras.Sequential()
    # we add the first layer and the input layer to our network
    model.add(Dense(layers[1], input_shape=(layers[0],), activation=actv[0]))
    # we add the hidden layers
    for (x, i) in enumerate(layers):
        if x > 1 and x != (len(layers) - 1):
            model.add(Dense(i, activation=actv[x]))
    # then add the final layer
    model.add(Dense(layers[-1], activation=actv[-1]))
    # return the constructed model
    return model
Then I merged the models like this:
def Merge_model(layer, act, data, label, lr, epochs, batch_size):
    model_list = []
    for i in range(146):
        model = nn.build(layer, act)
        model_list.append(model)
    merged_layers = concatenate([model_list[i].output for i in range(146)])
    x = merged_layers
    out = Activation('sigmoid')(x)
    merged_model = Model([model_list[i].input for i in range(146)], [out])
    print(merged_model.summary())
    merged_model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    result, predictions = nn.train_eval(data, label, merged_model, lr, epochs, batch_size)

data = np.random.rand(10, 146, 17)
data = [d for d in data]
label = np.random.randint(0, 1, (10, 146, 1))
label = [lb for lb in label]
print(len(label[0]))
lr = 0.01
epochs = 100
batch_size = 16
Merge_model("17-7-1", ["relu", "sigmoid"], data, label, lr, epochs, batch_size)
I get the model summary (linked below), but I do not understand what to make of it. What are my training data and the layers' shapes supposed to be?
https://drive.google.com/file/d/1juffdLY0i9f9rgldKfHG_MYXCK8wBV09/view?usp=sharing
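For reference, the expected shapes can be read off the code above: the merged model takes a list of 146 input arrays, one per sub-network, each of shape (batch, 17), and the concatenated sigmoid output has shape (batch, 146). A sketch of data that fits (the shapes are inferred from the code above, not from the original post):

import numpy as np

n_nets, n_feats, n_samples = 146, 17, 10
data = [np.random.rand(n_samples, n_feats) for _ in range(n_nets)]  # one (10, 17) array per sub-network
label = np.random.randint(0, 2, (n_samples, n_nets))                # one binary label per sub-network
# merged_model.fit(data, label, batch_size=16, epochs=100)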

Extract Keras concatenated layer of 3 embedding layers, but it's an empty list

I am constructing a Keras Classification model with Multiple Inputs (3 actually) to predict one single output. Specifically, my 3 inputs are:
Actors
Plot Summary
Relevant Movie Features
Output:
Genre tags
Python code (creating the multiple-input Keras model)
def kera_multy_classification_model():
    sentenceLength_actors = 15
    vocab_size_frequent_words_actors = 20001
    sentenceLength_plot = 23
    vocab_size_frequent_words_plot = 17501
    sentenceLength_features = 69
    vocab_size_frequent_words_features = 20001

    model = keras.Sequential(name='Multy-Input Keras Classification model')

    actors = keras.Input(shape=(sentenceLength_actors,), name='actors_input')
    plot = keras.Input(shape=(sentenceLength_plot,), name='plot_input')
    features = keras.Input(shape=(sentenceLength_features,), name='features_input')

    emb1 = layers.Embedding(input_dim=vocab_size_frequent_words_actors + 1,
                            # per the Keras docs, input_dim: int > 0, size of the vocabulary, i.e. maximum integer index + 1
                            output_dim=Keras_Configurations_model1.EMB_DIMENSIONS,
                            # int >= 0, dimension of the dense embedding
                            embeddings_initializer='uniform',
                            # initializer for the embeddings matrix
                            mask_zero=False,
                            input_length=sentenceLength_actors,
                            name="actors_embedding_layer")(actors)
    encoded_layer1 = layers.LSTM(100)(emb1)

    emb2 = layers.Embedding(input_dim=vocab_size_frequent_words_plot + 1,
                            output_dim=Keras_Configurations_model2.EMB_DIMENSIONS,
                            embeddings_initializer='uniform',
                            mask_zero=False,
                            input_length=sentenceLength_plot,
                            name="plot_embedding_layer")(plot)
    encoded_layer2 = layers.LSTM(100)(emb2)

    emb3 = layers.Embedding(input_dim=vocab_size_frequent_words_features + 1,
                            output_dim=Keras_Configurations_model3.EMB_DIMENSIONS,
                            embeddings_initializer='uniform',
                            mask_zero=False,
                            input_length=sentenceLength_features,
                            name="features_embedding_layer")(features)
    encoded_layer3 = layers.LSTM(100)(emb3)

    merged = layers.concatenate([encoded_layer1, encoded_layer2, encoded_layer3])
    layer_1 = layers.Dense(Keras_Configurations_model1.BATCH_SIZE, activation='relu')(merged)
    output_layer = layers.Dense(Keras_Configurations_model1.TARGET_LABELS, activation='softmax')(layer_1)

    model = keras.Model(inputs=[actors, plot, features], outputs=output_layer)
    print(model.output_shape)
    print(model.summary())
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['sparse_categorical_accuracy'])
Model's Structure
My problem:
After successfully fitting and training the model on some training data, I would like to extract the embeddings of this model for later use. My main approach before using a multiple-input Keras model was to train 3 different Keras models and extract 3 different embedding layers of shape 100. Now that I have the multiple-input Keras model, I want to extract the concatenated embedding layer with output shape (None, 300).
However, when I try to use this Python command:
embeddings = model_4.layers[9].get_weights()
print(embeddings)
or
embeddings = model_4.layers[9].get_weights()[0]
print(embeddings)
I get either an empty list (1st code sample) or an IndexError: list index out of range (2nd code sample).
Thank you in advance for any advice or help on this matter. Feel free to ask on the comments any additional information that I may have missed, to make this question more complete.
Note: the Python code and the model's structure have also been presented in this previously answered question.
A Concatenate layer does not have any weights (it has no trainable parameters, as you can see from your model summary), hence your get_weights() output is empty. Concatenation is an operation, not a parameterized layer.
In your case you can get the weights of your individual embedding layers after training:
model.layers[3].get_weights() # similarly for layers 4 and 5
Alternatively, if you want to store the embeddings as a single array, you can use numpy to concatenate the weights (note this only works if the embedding matrices have compatible shapes):
out_concat = np.concatenate([model.layers[3].get_weights()[0], model.layers[4].get_weights()[0], model.layers[5].get_weights()[0]], axis=-1)
You can, however, get the output tensor of the concatenate layer:
out_tensor = model.layers[9].output
# <tf.Tensor 'concatenate_3_1/concat:0' shape=(?, 300) dtype=float32>
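If what you actually want is the 300-dimensional concatenated representation of each sample (rather than layer weights), a common sketch is to build a sub-model that ends at the concatenate layer and run predict on it (actors_data, plot_data and features_data below are placeholders for your own arrays):

feature_model = keras.Model(inputs=model_4.inputs,
                            outputs=model_4.layers[9].output)  # ends at the concatenate layer
concat_features = feature_model.predict([actors_data, plot_data, features_data])
print(concat_features.shape)  # (num_samples, 300)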

Extract the output of cnn

I have trained a CNN model to classify images of dogs and cats.
It is giving 98% accuracy.
But I want to visualize the output of the CNN layers, i.e. the features from which my CNN predicts whether it is a dog or a cat.
Is there any way to visualize the output of the CNN?
You can divide your model into two models.
Previous Model:
input = Input(...)
# your layers, ending in some tensor t
output = Dense(1)(t)
old_model = Model(inputs=[input], outputs=[output])
New Model:
input = Input(...)
# add the first layers and the CNN here
cnn_layer = Conv2D(...)(input)
feature_extraction_model = Model(inputs=[input], outputs=[cnn_layer])
input_cnn = Input(...)  # the shape of your CNN output
# add the classification layers here
output = Dense(1)(input_cnn)
classifier_model = Model(inputs=[input_cnn], outputs=[output])
Now you define the new model as a combination of feature_extraction_model and classifier_model:
new_model = Model(inputs=[input], outputs=classifier_model(feature_extraction_model(input)))
# train the model
new_model.fit(x, y)
Now you can access the CNN layer's output after training:
cnn_output = feature_extraction_model.predict(x)
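As a concrete sketch of the same idea on an already trained model, you can also build a sub-model that ends at the convolution layer you want to inspect and plot its feature maps (trained_model, the layer name 'conv2d' and images are placeholder assumptions, not from the original post):

from tensorflow import keras
import matplotlib.pyplot as plt

layer = trained_model.get_layer('conv2d')  # pick the conv layer to inspect
viz_model = keras.Model(trained_model.input, layer.output)
feature_maps = viz_model.predict(images[:1])  # shape (1, H, W, n_filters)

for i in range(min(16, feature_maps.shape[-1])):  # show the first few filters
    plt.subplot(4, 4, i + 1)
    plt.imshow(feature_maps[0, :, :, i], cmap='viridis')
    plt.axis('off')
plt.show()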

Make fixed timestep length LSTM Keras model free timestep length

I have a Keras LSTM multitask model that performs two tasks. One is a sequence tagging task (so I predict a label per token). The other is a global classification task over the whole sequence using a CNN that is stacked on the hidden states of the LSTM.
In my setup (don't ask why) I only need the CNN task during training; the labels it predicts have no use in the final product. In Keras, one can train an LSTM model without specifying the input sequence length, like this:
l_input = Input(shape=(None,), dtype="int32", name=input_name)
However, if I add the CNN stacked on the LSTM hidden states I need to set a fixed sequence length for the model.
l_input = Input(shape=(timesteps_size,), dtype="int32", name=input_name)
The problem is that once I have trained the model with a fixed timesteps_size I can no longer use it to predict longer sequences.
In other frameworks this is not a problem. But in Keras, I cannot get rid of the CNN and change the expected input shape of the model once it has been trained.
Here is a simplified version of the model
l_input = Input(shape=(timesteps_size,), dtype="int32")
l_embs = Embedding(len(input.keys()), 100)(l_input)
l_blstm = Bidirectional(GRU(300, return_sequences=True))(l_embs)
# Sequential output
l_out1 = TimeDistributed(Dense(len(labels.keys()),
                               activation="softmax"))(l_blstm)
# Global output
conv1 = Conv1D(filters=5, kernel_size=10)(l_embs)
conv1 = Flatten()(MaxPooling1D(pool_size=2)(conv1))
conv2 = Conv1D(filters=5, kernel_size=8)(l_embs)
conv2 = Flatten()(MaxPooling1D(pool_size=2)(conv2))
conv = Concatenate()([conv1, conv2])
conv = Dense(50, activation="relu")(conv)
l_out2 = Dense(len(global_labels.keys()), activation='softmax')(conv)
model = Model(inputs=l_input, outputs=[l_out1, l_out2])
optimizer = Adam()
model.compile(optimizer=optimizer,
              loss="categorical_crossentropy",
              metrics=["accuracy"])
I would like to know if anyone here has faced this issue, whether there are any solutions for deleting layers from a model after training and, more importantly, for reshaping input layer sizes after training.
Thanks
Variable timestep length is a problem not because of the convolution layers (actually, the good thing about convolution layers is that they do not depend on the input size). Rather, the Flatten layers cause the problem here, since they need an input of a specified size. Instead, you can use global pooling layers. Further, I think stacking convolution and pooling layers on top of each other might give a better result than using two separate convolution layers and merging them (although this depends on the specific problem and dataset you are working on). Considering these two points, it might be better to write your model like this:
# Global output
conv1 = Conv1D(filters=16, kernel_size=5)(l_embs)
conv1 = MaxPooling1D(pool_size=2)(conv1)
conv2 = Conv1D(filters=32, kernel_size=5)(conv1)
conv2 = MaxPooling1D(pool_size=2)(conv2)
gpool = GlobalAveragePooling1D()(conv2)
x = Dense(50, activation="relu")(gpool)
l_out2 = Dense(len(global_labels.keys()), activation='softmax')(x)
model = Model(inputs=l_input, outputs=[l_out1, l_out2])
You may need to tune the number of conv+maxpool layers, number of filters, kernel size and even add dropout or batch normalization layers.
As a side note, using TimeDistributed on a Dense layer is redundant as the Dense layer is applied on the last axis.
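To illustrate, here is a toy sketch of the resulting architecture with a free timestep dimension; the vocabulary and label sizes are placeholders, not from the original post. Because no layer depends on a fixed length anymore, the same model accepts sequences of any length:

import numpy as np
from tensorflow.keras.layers import (Input, Embedding, Bidirectional, GRU, Dense,
                                     Conv1D, MaxPooling1D, GlobalAveragePooling1D)
from tensorflow.keras.models import Model

l_input = Input(shape=(None,), dtype="int32")      # free timestep length
l_embs = Embedding(1000, 100)(l_input)             # vocab size 1000 is a placeholder
l_blstm = Bidirectional(GRU(300, return_sequences=True))(l_embs)
l_out1 = Dense(10, activation="softmax")(l_blstm)  # 10 per-token labels, a placeholder

conv1 = Conv1D(filters=16, kernel_size=5)(l_embs)
conv1 = MaxPooling1D(pool_size=2)(conv1)
gpool = GlobalAveragePooling1D()(conv1)
l_out2 = Dense(5, activation="softmax")(gpool)     # 5 global labels, a placeholder

model = Model(inputs=l_input, outputs=[l_out1, l_out2])
model.predict(np.zeros((1, 50), dtype="int32"))    # works
model.predict(np.zeros((1, 200), dtype="int32"))   # also works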
