I'm trying to create a function that splits a Keras model on a user specified layer. I have the following code:
def return_split_models(model, layer):
    model_f, model_h = Sequential(), Sequential()
    for current_layer in range(0, layer+1):
        model_f.add(model.layers[current_layer])
    for current_layer in range(layer+1, len(model.layers)):
        model_h.add(model.layers[current_layer])
    return model_f, model_h
However, when I return model_h and call summary() on it, I get a ValueError saying the model has never been called. From looking at other posts it seems this has to do with the inputs for model_h, but I cannot find examples that generalize to any specified layer. Does anyone have any guidance?
You need to add an InputLayer to model_h.
from keras.layers import InputLayer

def return_split_models(model, layer):
    model_f, model_h = Sequential(), Sequential()
    for current_layer in range(0, layer+1):
        model_f.add(model.layers[current_layer])
    # add input layer
    model_h.add(InputLayer(input_shape=model.layers[layer+1].input_shape[1:]))
    for current_layer in range(layer+1, len(model.layers)):
        model_h.add(model.layers[current_layer])
    return model_f, model_h
An example:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(50, input_shape=(100,)))
model.add(Dense(40))
model.add(Dense(30))
model.add(Dense(20))
model.add(Dense(10))
model_f, model_h = return_split_models(model, 2)
print(model_f.summary())
print(model_h.summary())
# print
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 50) 5050
_________________________________________________________________
dense_2 (Dense) (None, 40) 2040
_________________________________________________________________
dense_3 (Dense) (None, 30) 1230
=================================================================
Total params: 8,320
Trainable params: 8,320
Non-trainable params: 0
_________________________________________________________________
None
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_4 (Dense) (None, 20) 620
_________________________________________________________________
dense_5 (Dense) (None, 10) 210
=================================================================
Total params: 830
Trainable params: 830
Non-trainable params: 0
_________________________________________________________________
None
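As a quick sanity check (a minimal sketch, assuming the model and split from the example above), you can verify that chaining the two halves reproduces the original model's predictions:
import numpy as np
# dummy batch matching the (100,) input of the example model
x = np.random.rand(8, 100).astype("float32")
full_out = model.predict(x)
split_out = model_h.predict(model_f.predict(x))
print(np.allclose(full_out, split_out))  # expected: True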
In the example from 3b1b's video about neural networks (the video), the model has 784 "neurons" in the input layer, followed by two 16-neuron dense layers and a 10-neuron dense layer. (Please refer to the screenshot of the video provided below.) This makes sense, because, for example, the first neuron in the input layer has 16 weights (as in x*w), so the number of weights is 784 * 16, followed by 16 * 16 and 16 * 10. There are also biases, one per neuron in each dense layer.
Then I made the same model in Tensorflow, and the model.summary() shows the following:
Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 784, 1)] 0
dense_8 (Dense) (None, 784, 16) 32
dense_9 (Dense) (None, 784, 16) 272
dense_10 (Dense) (None, 784, 10) 170
=================================================================
Total params: 474
Trainable params: 474
Non-trainable params: 0
_________________________________________________________________
Code used to produce the above:
#I'm using Keras through Julia so the code may look different?
input_shape = (784,1)
inputs = layers.Input(input_shape)
outputs = layers.Dense(16)(inputs)
outputs = layers.Dense(16)(outputs)
outputs = layers.Dense(10)(outputs)
model = keras.Model(inputs, outputs)
model.summary()
This does not reflect the input shape at all. So I made another model with input_shape=(1,1), and I get the same total params:
Model: "model_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_10 (InputLayer) [(None, 1, 1)] 0
dense_72 (Dense) (None, 1, 16) 32
dense_73 (Dense) (None, 1, 16) 272
dense_74 (Dense) (None, 1, 10) 170
=================================================================
Total params: 474
Trainable params: 474
Non-trainable params: 0
_________________________________________________________________
I don't think it's a bug, but I probably just don't understand what these mean / how Params are calculated.
Any help would be much appreciated.
A Dense layer is applied to the last dimension of your input. In your case it is 1, instead of 784. What you actually want is:
import tensorflow as tf
input_shape = (784, )
inputs = tf.keras.layers.Input(input_shape)
outputs = tf.keras.layers.Dense(16)(inputs)
outputs = tf.keras.layers.Dense(16)(outputs)
outputs = tf.keras.layers.Dense(10)(outputs)
model = tf.keras.Model(inputs, outputs)
model.summary()
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) [(None, 784)] 0
dense_3 (Dense) (None, 16) 12560
dense_4 (Dense) (None, 16) 272
dense_5 (Dense) (None, 10) 170
=================================================================
Total params: 13,002
Trainable params: 13,002
Non-trainable params: 0
_________________________________________________________________
From the TF docs:
Note: If the input to the layer has a rank greater than 2, then Dense
computes the dot product between the inputs and the kernel along the
last axis of the inputs and axis 0 of the kernel (using tf.tensordot).
For example, if input has dimensions (batch_size, d0, d1), then we
create a kernel with shape (d1, units), and the kernel operates along
axis 2 of the input, on every sub-tensor of shape (1, 1, d1) (there
are batch_size * d0 such sub-tensors). The output in this case will
have shape (batch_size, d0, units).
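To see where these totals come from, here is a minimal sketch (assuming only the standard Dense parameter count, input_dim * units + units) that reproduces both the 474 from the (784, 1) model and the 13,002 from the corrected (784,) model:
def dense_params(input_dim, units):
    # kernel (input_dim * units) plus bias (units)
    return input_dim * units + units

# with shape (784, 1) the Dense layers only see the trailing dimension of size 1
print(dense_params(1, 16) + dense_params(16, 16) + dense_params(16, 10))    # 474
# with shape (784,) the first Dense sees all 784 features
print(dense_params(784, 16) + dense_params(16, 16) + dense_params(16, 10))  # 13002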
I want to use keras.layers.Embedding in a customized sub-model, but the output shape shows up as 'multiple'. So I wrote a demo to test it.
The results of the two methods are different, and I'm not sure if it's my problem.
model1 gives me a clear output shape,
but model2 gives me 'multiple'.
Here is the full demo code:
import tensorflow as tf
import tensorflow.keras as keras
import numpy as np
myPath = "./data/"
embedding_matrix = np.load(myPath + "embeddings_matrix.npy",allow_pickle=True)
model1 = keras.Sequential([
    keras.layers.InputLayer(5),
    keras.layers.Embedding(len(embedding_matrix), 100, weights=[embedding_matrix], trainable=False, mask_zero=True, name="embedding_layer"),
    keras.layers.Dense(200)
], name="model1")
model1.build((None,5))
model1.summary()
class myModel(keras.Model):
    def __init__(self,
                 embedding_matrix,
                 embedding_out=100,
                 **kwargs):
        super(myModel, self).__init__(name='model2')
        self.input_x = keras.layers.InputLayer(input_shape=5)
        self.embedding_layer = keras.layers.Embedding(len(embedding_matrix), embedding_out, weights=[embedding_matrix], trainable=False, mask_zero=True, name="embedding", input_length=5)
        self.dense = keras.layers.Dense(200)

    def call(self, x):
        x = self.input_x(x)
        y = self.embedding_layer(x)
        y = self.dense(y)
        return y
model2 = myModel(embedding_matrix)
model2.build((None,5))
model2.summary()
Here is the model.summary() info:
Model: "model1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_layer (Embedding) (None, 5, 100) 5000400
_________________________________________________________________
dense (Dense) (None, 5, 200) 20200
=================================================================
Total params: 5,020,600
Trainable params: 20,200
Non-trainable params: 5,000,400
_________________________________________________________________
Model: "model2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) [(None, 5)] 0
_________________________________________________________________
embedding (Embedding) multiple 5000400
_________________________________________________________________
dense_1 (Dense) multiple 20200
=================================================================
Total params: 5,020,600
Trainable params: 20,200
Non-trainable params: 5,000,400
_________________________________________________________________
I want to know the right way to use keras.layers.Embedding in a customized sub-model.
If I have:
self.model.add(LSTM(lstm1_size, input_shape=(seq_length, feature_dim), return_sequences=True))
self.model.add(BatchNormalization())
self.model.add(Dropout(0.2))
then my seq_length specifies how many slices of data I want to process at once. If it matters, my model is a sequence-to-sequence (same size).
But if I have:
self.model.add(Bidirectional(LSTM(lstm1_size, input_shape=(seq_length, feature_dim), return_sequences=True)))
self.model.add(BatchNormalization())
self.model.add(Dropout(0.2))
then is that doubling the sequence size? Or at each time step, is it getting seq_length / 2 before and after that timestep?
Using a bidirectional LSTM layer has no effect on the sequence length.
I tested this with the following code:
from keras.models import Sequential
from keras.layers import Bidirectional,LSTM,BatchNormalization,Dropout,Input
model = Sequential()
lstm1_size = 50
seq_length = 128
feature_dim = 20
model.add(Bidirectional(LSTM(lstm1_size, input_shape=(seq_length, feature_dim), return_sequences=True)))
model.add(BatchNormalization())
model.add(Dropout(0.2))
batch_size = 32
model.build(input_shape=(batch_size,seq_length, feature_dim))
model.summary()
This resulted in the following output for the bidirectional model:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
bidirectional_1 (Bidirection (32, 128, 100) 28400
_________________________________________________________________
batch_normalization_1 (Batch (32, 128, 100) 400
_________________________________________________________________
dropout_1 (Dropout) (32, 128, 100) 0
=================================================================
Total params: 28,800
Trainable params: 28,600
Non-trainable params: 200
No bidirectional layer:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_1 (LSTM) (None, 128, 50) 14200
_________________________________________________________________
batch_normalization_1 (Batch (None, 128, 50) 200
_________________________________________________________________
dropout_1 (Dropout) (None, 128, 50) 0
=================================================================
Total params: 14,400
Trainable params: 14,300
Non-trainable params: 100
_________________________________________________________________
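What does double is the parameter count and the output feature dimension (100 vs. 50), not the sequence length, which stays at 128 in both summaries. A rough check (a sketch using the usual LSTM parameter formula, 4 * (units * (input_dim + units) + units)) reproduces the two totals above:
def lstm_params(input_dim, units):
    # four gates, each with a kernel, a recurrent kernel, and a bias
    return 4 * (units * (input_dim + units) + units)

print(lstm_params(20, 50))      # 14200 -> plain LSTM
print(2 * lstm_params(20, 50))  # 28400 -> Bidirectional wraps two LSTMs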
I'm trying to select the last Conv2D layer for a given (general) model. model.summary() provides a list of layers with their type, but how can I access this to find the layer of that type?
Output from model.summary():
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 224, 224, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
...
_________________________________________________________________
predictions (Dense) (None, 1000) 4097000
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
I think what you are looking for is this:
layer.__class__.__name__
where layer is an element of model.layers.
You can iterate over model.layers in reverse order and check layer types via isinstance:
next(x for x in model.layers[::-1] if isinstance(x, keras.layers.Conv2D))
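For example (a small sketch, assuming a VGG16 from keras.applications, which matches the summary shown in the question):
from tensorflow import keras

model = keras.applications.VGG16(weights=None)  # weights=None just to skip the download
# walk the layers from the end and take the first Conv2D encountered
last_conv = next(layer for layer in model.layers[::-1] if isinstance(layer, keras.layers.Conv2D))
print(last_conv.name)  # 'block5_conv3' for VGG16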
Here I have a GoogleNet model for Keras. Is there any possible way to block the changing of individual layers of the network? I want to block the first two layers of the pretrained model from changes.
By 'block the changing of the individual layers' I assume you mean you don't want to train those layers, that is, you don't want to modify the loaded weights (possibly learnt in previous training).
If so, you can pass trainable=False to the layer and its parameters won't be used in the training update rule.
Example:
from keras.models import Sequential
from keras.layers import Dense, Activation
model = Sequential([
    Dense(32, input_dim=100),
    Dense(output_dim=10),
    Activation('sigmoid'),
])
model.summary()
model2 = Sequential([
    Dense(32, input_dim=100, trainable=False),
    Dense(output_dim=10),
    Activation('sigmoid'),
])
model2.summary()
You can see in the summary of the second model that the parameters of the frozen layer are counted as non-trainable.
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
dense_1 (Dense) (None, 32) 3232 dense_input_1[0][0]
____________________________________________________________________________________________________
dense_2 (Dense) (None, 10) 330 dense_1[0][0]
____________________________________________________________________________________________________
activation_1 (Activation) (None, 10) 0 dense_2[0][0]
====================================================================================================
Total params: 3,562
Trainable params: 3,562
Non-trainable params: 0
____________________________________________________________________________________________________
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
dense_3 (Dense) (None, 32) 3232 dense_input_2[0][0]
____________________________________________________________________________________________________
dense_4 (Dense) (None, 10) 330 dense_3[0][0]
____________________________________________________________________________________________________
activation_2 (Activation) (None, 10) 0 dense_4[0][0]
====================================================================================================
Total params: 3,562
Trainable params: 330
Non-trainable params: 3,232
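Applied to an already loaded pretrained model (a minimal sketch, assuming the loaded network is in a variable named model), you can also flip the flag on the first two layers after loading and then recompile so the change takes effect:
# freeze the first two layers; their weights won't be updated during training
for layer in model.layers[:2]:
    layer.trainable = False

# changes to trainable only take effect after compiling
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.summary()  # the frozen parameters now show up under "Non-trainable params"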