failing to load weights into model XCeption-CNN - python

I build the same model and then load the weights. This had worked before, but now it does not. Maybe I am being stupid, but I thought I checked everything.
from tensorflow.keras import Input, Model
from tensorflow.keras.applications import Xception
from tensorflow.keras.layers import Dense, Dropout, Flatten

def make_model(trainable=False):
    xception = Xception(include_top=False, weights='imagenet', input_shape=input_shape)
    xception.trainable = trainable
    inputs = Input(shape=input_shape, name='xception_input')
    x = xception(inputs, training=False)
    x = Flatten()(x)
    x = Dense(256, activation='swish', name='xception_int', trainable=True)(x)
    x = Dropout(0.6)(x)
    outputs = Dense(17, activation='softmax', name='xception_output')(x)
    model = Model(inputs, outputs)
    return model
Then I load the weights:
model.load_weights('models/xception2_weights.h5')
The error message ends differently depending on the platform. On Mac it says:
ValueError: Shapes (32,) and (3, 3, 32, 64) are incompatible
on Windows:
ValueError: axes don't match array
From what it says on the Mac, I guess it has something to do with the Xception part. I had used load_weights in the same way successfully before, so I don't know why it fails now.
If anyone can help, that would be great.

I found that a model built around tf.keras.applications.Xception with the functional API has a different weight layout than one that wraps tf.keras.applications.Xception inside a Sequential model.
The same mismatch happens if you remove the top layer, add a new classifier, and then return to the original Xception shape with a Dense(1000) output.
If your initial model was generated with a different classifier, and you added more layers and then went back, it will also mismatch. Check the original model against the new one, layer by layer, and compare the shapes.
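To pin down where the mismatch starts, it can help to dump both sets of shapes side by side. Here is a minimal diagnostic sketch (assuming TF 2.x and the standard Keras HDF5 weight layout; make_model is the function from the question):

import h5py

model = make_model()

print('--- shapes in the rebuilt model ---')
for w in model.weights:
    print(w.name, w.shape)

print('--- shapes stored in the file ---')
with h5py.File('models/xception2_weights.h5', 'r') as f:
    for layer_name in f.attrs['layer_names']:
        layer_name = layer_name.decode() if isinstance(layer_name, bytes) else layer_name
        g = f[layer_name]
        for weight_name in g.attrs['weight_names']:
            weight_name = weight_name.decode() if isinstance(weight_name, bytes) else weight_name
            print(layer_name, weight_name, g[weight_name].shape)

The first place where the two listings disagree is where load_weights trips up.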

Related

Error Pop Out while Add Layers from Pre-Trained Model

I am trying to fine-tune a pre-trained time-series model that takes 12-dimensional physiological signals as inputs, but my dataset has only one dimension. Therefore, I built the first Conv1D layer and set its weights from the pre-trained model, then added the remaining parts of the pre-trained model. However, an error pops up saying the dimensions mismatch.
Here are the details of my code and error:
# Load the whole model and the weights of the first Conv1D layer
BaseModel = keras.models.load_model('model.hdf5')
input_w = BaseModel.layers[1].get_weights()[0]
input_w_1_lead = input_w[:, 1, :].reshape(input_w.shape[0], 1, input_w.shape[2])

# Create the input and first Conv1D layer
model = tf.keras.Sequential([
    keras.Input(shape=(4096, 1), dtype=np.float32, name='signal'),
    keras.layers.Conv1D(64, 16, padding='same',
                        use_bias=False, name='conv1d_1'),
])

# Set the weights from the pre-trained model
model.layers[0].set_weights([input_w_1_lead])

# Add the remaining parts of the pre-trained model
for layer in BaseModel.layers[2:]:
    model.add(layer)
And here is the error:
ValueError: Exception encountered when calling layer "add_1" (type Add).
A merge layer should be called on a list of inputs. Received: inputs=Tensor("Placeholder:0", shape=(None, 256, 128), dtype=float32) (not a list of tensors)
Call arguments received by layer "add_1" (type Add):
• inputs=tf.Tensor(shape=(None, 256, 128), dtype=float32)
Then I looked into the details of my code and found out that I could only add layers up to the 11th one.
test_model = tf.keras.Sequential()
for layer in BaseModel.layers[:11]:
    # does not work if I set BaseModel.layers[:12]
    test_model.add(layer)
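Judging by the error, the 12th layer is the add_1 merge layer, and a merge layer must be called on a list of tensors, which a Sequential model can never provide. A tiny illustration of that constraint (hypothetical shapes, for demonstration only):

import tensorflow as tf

a = tf.keras.Input(shape=(256, 128))
b = tf.keras.Input(shape=(256, 128))
summed = tf.keras.layers.Add()([a, b])  # OK: called on a list of two tensors
# tf.keras.layers.Add()(a)  # raises: a merge layer should be called on a list of inputs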
With model.summary(), dimension information seems missing after the pooling layer. Here are the outputs for both BaseModel.summary() and test_model.summary():
(screenshots of the BaseModel and test_model summaries omitted)
However, I couldn't find the solution.
Dear All,
I figured out an alternative solution for my task: I rebuilt the model's input and output layers from the source code, then set the weights from the source model.
Here is the solution code:
'Load Pre-Trained Model to Get Weights'
BaseModel = keras.models.load_model('model.hdf5')
input_w = BaseModel.layers[1].get_weights()[0]
input_w_1_lead = input_w[:, 1, :].reshape(input_w.shape[0], 1, input_w.shape[2])
BaseModel.summary()

'Build the Fine-Tuned Model'
model = get_model(1)
model.layers[1].set_weights([input_w_1_lead])

'Load Weights from Source Model'
for i in range(2, len(BaseModel.layers) - 1):
    model.layers[i].set_weights(BaseModel.layers[i].get_weights())

'Check the Weights Equal the Source Model'
for i in range(2, len(BaseModel.layers) - 1):
    if not np.allclose(
            BaseModel.layers[i].get_weights(),
            model.layers[i].get_weights()):
        print('this way is not correct')
Ta-da! It works, and nothing is printed to the console.

How to edit functional model in keras?

I am using the tf.keras.applications.efficientnet_v2.EfficientNetV2L model, and I want to edit its last layers so that the model has both a regression head and a classification head. However, I am unsure how to edit this model because it is not a linear sequential model, so I cannot do:

for layer in model.layers[:-2]:
    model.add(layer)

as certain layers of the model have multiple inputs. Is there a way of preserving the model except the last layer, so the model diverges before the last layer?
efficentnet[:-2]
|
|
/ \
/ \
/ \
output1 output2
To enable a functional model to have a classification layer and a regression layer, you can change the model as follows. Note, there are various ways to achieve this, and this is one of them.
import tensorflow as tf
from tensorflow import keras

prev_model = keras.applications.EfficientNetV2B0(
    input_tensor=keras.Input(shape=(224, 224, 3)),
    include_top=False
)
Next, we will write our expected head layers, shown below.
neck_branch = keras.Sequential(
    [
        # we can add more layers here, e.g. batch norm, etc.
        keras.layers.GlobalAveragePooling2D()
    ],
    name='neck_head'
)

classification_head = keras.Sequential(
    [
        keras.layers.Dense(10, activation='softmax')
    ],
    name='classification_head'
)

regression_head = keras.Sequential(
    [
        keras.layers.Dense(1, activation=None)
    ],
    name='regression_head'
)
Now, we can build the desired model.
x = neck_branch(prev_model.output)
output_a = classification_head(x)
output_b = regression_head(x)
final_model = keras.Model(prev_model.inputs, [output_a, output_b])
Test
keras.utils.plot_model(final_model, expand_nested=True)
# OK
final_model(tf.ones(shape=(1, 224, 224, 3)))
# OK
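Since there are two heads, you would typically give each output its own loss when compiling. A hedged sketch (not part of the original answer; the loss choices are placeholders):

final_model.compile(
    optimizer='adam',
    loss={
        'classification_head': 'sparse_categorical_crossentropy',  # assumes integer class labels
        'regression_head': 'mse',
    },
    loss_weights={'classification_head': 1.0, 'regression_head': 1.0},
)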
Update
Based on your comments,
how you would tackle the problem if the previous model was imported from a h5 file since there I cannot declare the top layer not to be included?
If I understand your query, you have a saved model (in .h5 format) with its top layers. In that case, you don't have an include_top param to exclude the top branch. So what you can do is remove the top branch of your saved model first. Here is how:
# a saved model with top layers
prev_model = keras.models.load_model('model.h5')

prev_model_with_top_remove = keras.Model(
    prev_model.input,
    prev_model.layers[-4].output
)
prev_model_with_top_remove.summary()
This prev_model.layers[-4].output will remove the top branch. In the end, you will get output similar to what you would get with include_top=False. Check the model summary to inspect visually.
Keras' functional API works by linking Keras tensors (hereafter called KTensors) and not your everyday TF tensors.
Therefore, the first thing you need to do is feed KTensors (created using tf.keras.Input) of proper shapes to the original model. This triggers the forward chain, prompting the model's layers to produce their own output KTensors that are properly linked to the input KTensors. After the forward pass,
The layers will store their received/produced KTensors in their input and output attributes.
The model itself will also store the KTensors you fed to it and the corresponding final output KTensors in its inputs and outputs attributes (note the s).
Like so,
>>> from tensorflow.keras import Input
>>> from tensorflow.keras.layers import Dense
>>> from tensorflow.keras.models import Sequential, Model
>>> seq_model = Sequential()
>>> seq_model.add(Dense(1))
>>> seq_model.add(Dense(2))
>>> seq_model.add(Dense(3))
>>> hasattr(seq_model.layers[0], 'output')
False
>>> seq_model.inputs is None
True
>>> _ = seq_model(Input(shape=(10,))) # <--- Feed input KTensor to the model
>>> seq_model.layers[0].output
<KerasTensor: shape=(None, 1) dtype=float32 (created by layer 'dense')>
>>> seq_model.inputs
[<KerasTensor: shape=(None, 10) dtype=float32 (created by layer 'dense_input')>]
Once you've obtained these internal KTensors, everything becomes trivial. To extract the KTensor right before the last two layers and forward it to two different branches to form a new functional model, do
>>> intermediate_ktensor = seq_model.layers[-3].output
>>> branch_1_output = Dense(20)(intermediate_ktensor)
>>> branch_2_output = Dense(30)(intermediate_ktensor)
>>> branched_model = Model(inputs=seq_model.inputs, outputs=[branch_1_output, branch_2_output])
Note that the shapes of the KTensors you fed at the very first step must conform to the shape requirements of the layers that receive them. In my example, the input KTensor would be fed to Dense(1) layer. As Dense requires the input shape to be defined in the last dimension, the input KTensor could be of shapes, e.g., (10,) or (None,10) but not (None,) or (10, None).
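To see that shape requirement in action, here is a small illustration (assumed, not from the original answer):

from tensorflow.keras import Input
from tensorflow.keras.layers import Dense

ok = Dense(1)(Input(shape=(10,)))   # fine: the last dimension is 10
# Dense(1)(Input(shape=(None,)))    # raises: the last dimension of the inputs
#                                   # to a Dense layer must be defined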

How to stack Keras models horizontally?

Is it possible to do something like this in Keras (diagram omitted), except Models A, B, and C are all stacked horizontally into one model? I've seen some solutions that utilize an input layer, but whenever I use an input layer, I seem to get an error when I try to load a model.
Is there a way to load all the models, concatenate them, and save as a single, new, larger model?
EDIT: I already have all the models trained. I want to combine them after the fact.
Here is my idea; let's assume you have these models to stack:
model_1 = tf.keras.models.Model(inputs = model_1.input, outputs = model_1_out)
model_2 = tf.keras.models.Model(inputs = model_2.input, outputs = model_2_out)
model_3 = tf.keras.models.Model(inputs = model_3.input, outputs = model_3_out)
If you want to stack the models rather than concatenate their outputs:
models = [model_3 , model_2 , model_1]
stacked_model_input = tf.keras.Input(shape=(x, x, x))
model_outputs = [model(stacked_model_input) for model in models]
stacked_model = tf.keras.models.Model(inputs=stacked_model_input, outputs=model_outputs)
model_outputs gives (each model here was built with 3 output units):
[<KerasTensor: shape=(None, 3) dtype=float32 (created by layer 'model_2')>,
<KerasTensor: shape=(None, 3) dtype=float32 (created by layer 'model_1')>,
<KerasTensor: shape=(None, 3) dtype=float32 (created by layer 'model')>]
This produces the stacked model (diagram omitted).
To save the stacked model:

from tensorflow.keras.models import save_model

save_model(stacked_model, 'model.h5')
I am not sure how you would use their separate outputs, but that's how you can stack them.
Edit: You can use their outputs by defining separate losses, etc. Or, since they are stacked and the input is shared, you can take each model's output to create a new model with its weights. I don't know if you can cut them out of the stacked model, which is why I suggested taking each output.
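If you do want a single concatenated output instead of three separate ones, here is a minimal sketch (assuming the per-model outputs have compatible shapes, as above):

concat_output = tf.keras.layers.Concatenate()(model_outputs)
concat_model = tf.keras.models.Model(inputs=stacked_model_input, outputs=concat_output)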
Yes, this is possible in keras, but it would require some advanced knowledge of the API. In particular, you need to think about how you want to compute the loss of each output with respect to the input.
I'd suggest checking out the developer guides, perhaps starting with the functional API and custom training loops.
Below is a sketch of how you'd create this type of network with the functional API.
from tensorflow import keras
from tensorflow.keras import layers

input_shape: int = 100
inputs = keras.Input(shape=(input_shape,))

units: int = 64
dense1 = layers.Dense(units)
dense2 = layers.Dense(units)
dense3 = layers.Dense(units)

out1 = dense1(inputs)
out2 = dense2(inputs)
out3 = dense3(inputs)
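To turn the sketch into something trainable, you would wrap the branch outputs in a Model and give each output a loss. A hedged completion (the mse losses are placeholders):

model = keras.Model(inputs=inputs, outputs=[out1, out2, out3])
model.compile(optimizer='adam', loss=['mse', 'mse', 'mse'])  # placeholder losses, one per output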

Get decoder from trained autoencoder model in Keras

I am training a deep autoencoder to map human faces to a 128-dimensional latent space and then decode them back to the original 128x128x3 format.
I was hoping that after training the autoencoder, I would somehow be able to 'slice' the second half of the autoencoder, i.e. the decoder network responsible for mapping the latent space (128,) back to the image space (128, 128, 3), by using the functional Keras API and autoenc_model.get_layer().
Here are the relevant layers of my model:
INPUT_SHAPE = (128, 128, 3)
input_img = Input(shape=INPUT_SHAPE, name='enc_input')

#1
x = Conv2D(64, (3, 3), padding='same', activation='relu')(input_img)
x = BatchNormalization()(x)
# ... many Conv2D, BatchNormalization, MaxPooling layers ...

#Flatten
fc_input = Flatten(name='enc_output')(x)
y = Dropout(DROP_RATE)(fc_input)
y = Dense(128, activation='relu')(y)
y = Dropout(DROP_RATE)(y)
fc_output = Dense(128, activation='linear')(y)

#Reshape
decoder_input = Reshape((8, 8, 2), name='decoder_input')(fc_output)

#Decoder part
#UnPooling-1
z = UpSampling2D()(decoder_input)
# ... many Conv2D, BatchNormalization, UpSampling2D layers ...

#16
decoder_output = Conv2D(3, (3, 3), padding='same', activation='linear', name='decoder_output')(z)

autoenc_model = Model(input_img, decoder_output)
Here is the notebook containing the entire model architecture.
To get the decoder network from the trained autoencoder, I have tried using:
dec_model = Model(inputs=autoenc_model.get_layer('decoder_input').input, outputs=autoenc_model.get_layer('decoder_output').output)
and
dec_model = Model(autoenc_model.get_layer('decoder_input'), autoenc_model.get_layer('decoder_output'))
neither of which seems to work.
I need to extract the decoder layers out of the autoencoder because I want to train the entire autoencoder first, then use the encoder and the decoder independently.
I could not find a satisfactory answer anywhere else. The Keras blog article on building autoencoders only covers how to extract the decoder for a 2-layer autoencoder.
The decoder input/output shapes should be (128,) and (128, 128, 3), which are the input shape of the 'decoder_input' layer and the output shape of the 'decoder_output' layer, respectively.
A couple of changes are needed:
z = UpSampling2D()(decoder_input)
to
direct_input = Input(shape=(8,8,2), name='d_input')
#UnPooling-1
z = UpSampling2D()(direct_input)
and
autoenc_model = Model(input_img, decoder_output)
to
dec_model = Model(direct_input, decoder_output)
autoenc_model = Model(input_img, dec_model(decoder_input))
Now you can train the autoencoder and predict using the decoder.
import numpy as np
autoenc_model.fit(np.ones((5,128,128,3)), np.ones((5,128,128,3)))
dec_model.predict(np.ones((1,8,8,2)))
You can also refer this self-contained example:
https://github.com/keras-team/keras/blob/master/examples/variational_autoencoder.py
My solution isn't very elegant, and there are probably better solutions out there, but since no one has replied yet, I'll post it (I was actually hoping someone would, so I could improve my own implementation, as you'll see below).
So what I did was build a network that can take a secondary input directly into the latent space.
Unfortunately, both inputs are obligatory, so I end up with a network that requires dummy arrays full of zeros for the 'unwanted' input (you'll see in a second).
Using Keras functional API:
image_input = Input(shape=image_shape)
conv1 = Conv2D(..., activation='relu')(image_input)
...
dense_encoder = Dense(...)(<layer>)
z_input = Input(shape=n_latent)
decoder_entry = Dense(..., activation='relu')(Add()([dense_encoder, z_input]))
...
decoder_output = Conv2DTranspose(...)(<layer>)

model = Model(inputs=[image_input, z_input], outputs=decoder_output)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

encoder = Model(inputs=image_input, outputs=dense_encoder)
decoder = Model(inputs=[z_input, image_input], outputs=decoder_output)
Note that you shouldn't compile the encoder and decoder.
(Some code is either omitted or left as ... for you to fill in for your specific needs.)
Finally, to train, you'll have to provide a dummy array. So, to train the entire autoencoder (images is X in this context):

model.fit([images, np.zeros((len(images), n_latent))], images)  # zeros are the dummy z input
And then you can get the latent features using:
latent_features = encoder.predict(images)
Or use the decoder with latent input and dummy variables (note the order of inputs above):
decoder.predict([Z_inputs,np.zeros(shape=images.shape)])
Finally, another solution I haven't tried is to build two parallel models with the same architecture, one being the full autoencoder and the second only the decoder part, and then use:
decoder_layer.set_weights(model_layer.get_weights())
It should work, but I haven't confirmed it. It does have the disadvantage of having to copy the weights again every time you train the autoencoder model.
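A minimal sketch of that weight-copying idea (untested, as noted; decoder_only is a hypothetical standalone model whose layers mirror the decoder tail of the trained autoencoder, in the same order):

n_dec = len(decoder_only.layers) - 1  # skip decoder_only's Input layer
for dec_layer, ae_layer in zip(decoder_only.layers[1:], model.layers[-n_dec:]):
    dec_layer.set_weights(ae_layer.get_weights())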
So, to conclude: I am aware of the many problems here, but again, I only posted this because I saw no one else had replied, and I was hoping this would still be of some use to you.
Please comment if something is not clear.
An option is to define a function which uses get_layer and then reconstructs the decoder part in there. For example, consider a simple autoencoder with the architecture [n_inputs, 500, 100, 500, n_outputs], where you want to run some inputs through the second half (i.e., run 100-unit bottleneck values through the 500-unit layer and the n_outputs layer):
# Function to get outputs from a given set of bottleneck inputs
def bottleneck_to_outputs(bottleneck_inputs, autoencoder):
    # Run bottleneck_inputs (e.g. 100 units) through the decoder layer (e.g. 500 units)
    x = autoencoder.get_layer('decoder')(bottleneck_inputs)
    # Run x (e.g. 500 units) through the output layer (n units = n features)
    x = autoencoder.get_layer('output')(x)
    return x
For your example, this function should work (assuming you have given your layers the names referenced here).
def decoder_part(autoenc_model, image):
    #UnPooling-1
    z = autoenc_model.get_layer('upsampling1')(image)
    #9
    z = autoenc_model.get_layer('conv2d1')(z)
    z = autoenc_model.get_layer('batchnorm1')(z)
    #10
    z = autoenc_model.get_layer('conv2d2')(z)
    z = autoenc_model.get_layer('batchnorm2')(z)
    #UnPooling-2
    z = autoenc_model.get_layer('upsampling2')(z)
    #11
    z = autoenc_model.get_layer('conv2d3')(z)
    z = autoenc_model.get_layer('batchnorm3')(z)
    #12
    z = autoenc_model.get_layer('conv2d4')(z)
    z = autoenc_model.get_layer('batchnorm4')(z)
    #UnPooling-3
    z = autoenc_model.get_layer('upsampling3')(z)
    #13
    z = autoenc_model.get_layer('conv2d5')(z)
    z = autoenc_model.get_layer('batchnorm5')(z)
    #14
    z = autoenc_model.get_layer('conv2d6')(z)
    z = autoenc_model.get_layer('batchnorm6')(z)
    #UnPooling-4
    z = autoenc_model.get_layer('upsampling4')(z)
    #15
    z = autoenc_model.get_layer('conv2d7')(z)
    z = autoenc_model.get_layer('batchnorm7')(z)
    #16
    decoder_output = autoenc_model.get_layer('decoder_output')(z)
    return decoder_output
Given this function, it would make sense to also have a way to test if it is working correctly. In order to do this, define another model which gets you from inputs to the bottleneck (latent space), such as:
bottleneck_layer = Model(inputs=input_img, outputs=decoder_input)
Then, as a test, run a vector of ones through the first part of the model and obtain the latent space:
import numpy as np
ones_image = np.ones((128,128,3))
bottleneck_ones = bottleneck_layer(ones_image.reshape(1,128,128,3))
Then run that latent vector through the function defined above to create a variable which you will test against the output of the full network:
decoded_test = decoder_part(autoenc_model, bottleneck_ones)
Now, run the ones_image through the whole network and verify that you get the same results:
model_test = autoenc_model.predict(ones_image.reshape(1,128,128,3))
tf.debugging.assert_equal(model_test, decoded_test, message='Tensors are not equivalent')
If the assert_equal line does not throw an error, your decoder is working correctly.

Constructing a keras model

I don't understand what's happening in this code:
def construct_model(use_imagenet=True):
    # line 1: how do we keep all layers of this model?
    model = keras.applications.InceptionV3(include_top=False,
                                           input_shape=(IMG_SIZE, IMG_SIZE, 3),
                                           weights='imagenet' if use_imagenet else None)
    new_output = keras.layers.GlobalAveragePooling2D()(model.output)
    new_output = keras.layers.Dense(N_CLASSES, activation='softmax')(new_output)
    model = keras.engine.training.Model(model.inputs, new_output)
    return model
Specifically, my confusion is: when we call the last constructor
model = keras.engine.training.Model(model.inputs, new_output)
we specify the input layer and output layer, but how does it know we want all the other layers to stay?
In other words, we append the new_output layer to the pre-trained model we load in line 1, and then in the final constructor (final line) we just create and return a model with specified input and output layers. But how does it know what other layers we want in between?
Side question 1): What is the difference between keras.engine.training.Model and keras.models.Model?
Side question 2): What exactly happens when we do new_layer = keras.layers.Dense(...)(prev_layer)? Does the () operation return a new layer? What does it do exactly?
This model was created using the functional API (Model).
Basically, it works like this (perhaps it gets clearer if you read "side question 2" below before this):
You have an input tensor (you can see it as "input data" too)
You create (or reuse) a layer
You pass the input tensor to a layer (you "call" a layer with an input)
You get an output tensor
You keep working with these tensors until you have created the entire graph.
But this hasn't created a "model" yet (one you can train and use for other things).
All you have is a graph telling which tensors go where.
To create a model, you define its start and end points.
In the example:
They take an existing model: model = keras.applications.InceptionV3(...)
They want to expand this model, so they get its output tensor: model.output
They pass this tensor as the input of a GlobalAveragePooling2D layer
They get this layer's output tensor as new_output
They pass this as input to yet another layer: Dense(N_CLASSES, ....)
And get its output as new_output (this var was replaced as they are not interested in keeping its old value...)
But, as it works with the functional API, we don't have a model yet, only a graph. In order to create a model, we use Model defining the input tensor and the output tensor:
new_model = Model(old_model.inputs, new_output)
Now you have your model.
If you assign it to another var, as I did (new_model), the old model will still exist in model. And these models share the same layers, such that whenever you train one of them, the other gets updated as well.
Question: how does it know what other layers we want in between?
When you do:
outputTensor = SomeLayer(...)(inputTensor)
you have a connection between the input and output. (Keras uses the underlying TensorFlow mechanisms to add these tensors and nodes to the graph.) The output tensor cannot exist without the input. The entire InceptionV3 model is connected from start to end. Its input tensor goes through all the layers to yield an output tensor. There is only one possible way for the data to follow, and the graph is the way.
When you get the output of this model and use it to get further outputs, all your new outputs are connected to this, and thus to the first input of the model.
Probably the attribute _keras_history that is added to the tensors is closely related to how it tracks the graph.
So, doing Model(old_model.inputs, new_output) will naturally follow the only way possible: the graph.
If you try doing this with tensors that are not connected, you will get an error.
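For example, this is the kind of error you get when the output tensor is not computed from the input you pass (a small assumed illustration, not from the original answer):

from tensorflow import keras

inp = keras.Input(shape=(4,))
unrelated = keras.Input(shape=(4,))
out = keras.layers.Dense(2)(unrelated)

try:
    keras.Model(inp, out)  # 'out' does not depend on 'inp'
except ValueError as e:
    print(e)  # Graph disconnected: cannot obtain value for tensor ...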
Side question 1
Prefer to import from keras.models. Basically, this module imports from the other module:
https://github.com/keras-team/keras/blob/master/keras/models.py
Notice that the file keras/models.py imports Model from keras.engine.training. So, it's the same thing.
Side question 2
It's not new_layer = keras.layers.Dense(...)(prev_layer).
It is output_tensor = keras.layers.Dense(...)(input_tensor).
You're doing two things in the same line:
Creating a layer - with keras.layers.Dense(...)
Calling the layer with an input tensor to get an output tensor
If you wanted to use the same layer with different inputs:
denseLayer = keras.layers.Dense(...) #creating a layer
output1 = denseLayer(input1) #calling a layer with an input and getting an output
output2 = denseLayer(input2) #calling the same layer on another input
output3 = denseLayer(input3) #again
Bonus - Creating a functional model that is equal to a sequential model
If you create this sequential model:
model = Sequential()
model.add(Layer1(...., input_shape=some_shape))
model.add(Layer2(...))
model.add(Layer3(...))
You're doing exactly the same as:
inputTensor = Input(some_shape)
outputTensor = Layer1(...)(inputTensor)
outputTensor = Layer2(...)(outputTensor)
outputTensor = Layer3(...)(outputTensor)
model = Model(inputTensor,outputTensor)
What is the difference?
Well, functional API models are totally free to be built any way you want. You can create branches:
out1 = Layer1(..)(inputTensor)
out2 = Layer2(..)(inputTensor)
You can join tensors:
joinedOut = Concatenate()([out1,out2])
With this, you can create anything you want with all kinds of fancy stuff, branches, gates, concatenations, additions, etc., which you can't do with a sequential model.
In fact, a Sequential model is also a Model, but created for a quick use in models without branches.
There's this way of building a model from a pretrained one that you may build upon.
See https://keras.io/applications/#fine-tune-inceptionv3-on-a-new-set-of-classes:
base_model = InceptionV3(weights='imagenet', include_top=False)

x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(200, activation='softmax')(x)

model = Model(inputs=base_model.input, outputs=predictions)

for layer in base_model.layers:
    layer.trainable = False

model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
Each time a layer is added by an op like x = Dense(...)(...), information about the computational graph is updated. You can type this interactively to see what it contains:
x.graph.__dict__
You can see there are all kinds of attributes, including ones about previous and next layers. These are internal implementation details and may change over time.
