I am not sure how to get output of an intermediate layer in Keras. I have read the other questions on stackoverflow but they seem to be functions with a single sample as input. I want to get output features(at intermediate layer) in batches as well. Here is my model:
model = Sequential()
model.add(ResNet50(include_top = False, pooling = RESNET50_POOLING_AVERAGE, weights = resnet_weights_path)) #None
model.add(Dense(784, activation = 'relu'))
model.add(Dense(NUM_CLASSES, activation = DENSE_LAYER_ACTIVATION))
model.layers[0].trainable = True
After training the model, in my code I want to get the output after the first dense layer (784 dimensional). Is this the right way to do it?
pred = model.layers[1].predict_generator(data_generator, steps = len(data_generator), verbose = 1)
I am new to Keras so I am a little unsure. Do I need to compile the model again after training?
No, you don't need to compile again after training.
Based on your Sequential model.
Layer 0 :: model.add(ResNet50(include_top = False, pooling = RESNET50_POOLING_AVERAGE, weights = resnet_weights_path)) #None
Layer 1 :: model.add(Dense(784, activation = 'relu'))
Layer 2 :: model.add(Dense(NUM_CLASSES, activation = DENSE_LAYER_ACTIVATION))
Accessing the layers, may differ if used Functional API approach.
Using Tensorflow 2.1.0, you could try this approach when you want to access intermediate outputs.
model_dense_784 = Model(inputs=model.input, outputs = model.layers[1].output)
pred_dense_784 = model_dense_784.predict(train_data_gen, steps = 1) # predict_generator is deprecated
print(pred_dense_784.shape) # Use this to check Output Shape
It is highly advisable to use the model.predict() method, rather than model.predict_generator() as it is already deprecated.
You could also use shape() method to check whether the output generated is the same as indicated on the model.summary().
Related
I am using the tf.keras.applications.efficientnet_v2.EfficientNetV2L model and I want to edit the last layers of the model to make the model a regression and classification layer. However, I am unsure of how to edit this model because it is not a linear sequential model, and thus I cannot do:
for layer in model.layers[:-2]:
model.add(layer)
as certain layers of the model have multiple inputs. Is there a way of preserving the model except the last layer so the model will diverge before the last layer?
efficentnet[:-2]
|
|
/ \
/ \
/ \
output1 output2
To enable a functional model to have a classification layer and a regression layer, you can change the model as follows. Note, there are various ways to achieve this, and this is one of them.
import tensorflow as tf
from tensorflow import keras
prev_model = keras.applications.EfficientNetV2B0(
input_tensor=keras.Input(shape=(224, 224, 3)),
include_top=False
)
Next, we will write our expected head layers, shown below.
neck_branch = keras.Sequential(
[
# we can add more layers i.e. batch norm, etc.
keras.layers.GlobalAveragePooling2D()
],
name='neck_head'
)
classification_head = keras.Sequential(
[
keras.layers.Dense(10, activation='softmax')
],
name='classification_head'
)
regression_head = keras.Sequential(
[
keras.layers.Dense(1, activation=None)
],
name='regression_head'
)
Now, we can build the desired model.
x = neck_branch(prev_model.output)
output_a = classification_head(x)
output_b = regression_head(x)
final_model = keras.Model(prev_model.inputs, [output_a, output_b])
Test
keras.utils.plot_model(final_model, expand_nested=True)
# OK
final_model(tf.ones(shape=(1, 224, 224, 3)))
# OK
Update
Based on your comments,
how you would tackle the problem if the previous model was imported from a h5 file since there I cannot declare the top layer not to be included?
If I understand your query, you have a saved model (in .h5 format) with top layers. In that case, you don't have include_top params to exclude the top branch. So, what you can do is remove the top branch of your saved model first. Here is how,
# a saved model with top layers
prev_model = keras.models.load_model('model.h5')
prev_model_with_top_remove = keras.Model(
prev_model.input ,
prev_model.layers[-4].output
)
prev_model_with_top_remove.summary()
This prev_model.layers[-4].output will remove the top branch. In the end, you will give similar output as we can get with include_top=True. Check the model summary to visually inspect.
Keras' functional API works by linking Keras tensors (hereby called KTensor) and not your everyday TF tensors.
Therefore, the first thing you need to do is feeding KTensors (created using tf.keras.Input) of proper shapes to the original model. This will trigger the forward chain, prompting the model's layers to produce their own output KTensors that are properly linked to the input KTensors. After the forward pass,
The layers will store their received/produced KTensors in their input and output attributes.
The model itself will also store the KTensors you fed to it and the corresponding final output KTensors in its inputs and outputs attributes (note the s).
Like so,
>>> from tensorflow.keras import Input
>>> from tensorflow.keras.layers import Dense
>>> from tensorflow.keras.models import Sequential, Model
>>> seq_model = Sequential()
>>> seq_model.add(Dense(1))
>>> seq_model.add(Dense(2))
>>> seq_model.add(Dense(3))
>>> hasattr(seq_model.layers[0], 'output')
False
>>> seq_model.inputs is None
True
>>> _ = seq_model(Input(shape=(10,))) # <--- Feed input KTensor to the model
>>> seq_model.layers[0].output
<KerasTensor: shape=(None, 1) dtype=float32 (created by layer 'dense')>
>>> seq_model.inputs
[<KerasTensor: shape=(None, 10) dtype=float32 (created by layer 'dense_input')>]
Once you've obtained these internal KTensors, everything becomes trivial. To extract the KTensor right before the last two layers and forward it to two different branches to form a new functional model, do
>>> intermediate_ktensor = seq_model.layers[-3].output
>>> branch_1_output = Dense(20)(intermediate_ktensor)
>>> branch_2_output = Dense(30)(intermediate_ktensor)
>>> branched_model = Model(inputs=seq_model.inputs, outputs=[branch_1_output, branch_2_output])
Note that the shapes of the KTensors you fed at the very first step must conform to the shape requirements of the layers that receive them. In my example, the input KTensor would be fed to Dense(1) layer. As Dense requires the input shape to be defined in the last dimension, the input KTensor could be of shapes, e.g., (10,) or (None,10) but not (None,) or (10, None).
I want to use the Segmentation_Models UNet (with ResNet34 Backbone) for uncertainty estimation, so i want to add some Dropout Layers into the upsampling part. The Model is not Sequential, so i think i have to reconnect some outputs to the new Dropout Layers and the following layer inputs to the output of Dropout.
I'm not sure, whats the right way to do this. I'm currently trying this:
# create model
model = sm.Unet('resnet34', classes=1, activation='sigmoid', encoder_weights='imagenet')
# define optimizer, loss and metrics
optim = tf.keras.optimizers.Adam(0.001)
total_loss = sm.losses.binary_focal_dice_loss # or sm.losses.categorical_focal_dice_loss
metrics = ['accuracy', sm.metrics.IOUScore(threshold=0.5), sm.metrics.FScore(threshold=0.5)]
# get input layer
updated_model_layers = model.layers[0]
# iterate over old model and add Dropout after given Convolutions
for layer in model.layers[1:]:
# take old layer and add to new Model
updated_model_layers = layer(updated_model_layers.output)
# after some convolutions, add Dropout
if layer.name in ['decoder_stage0b_conv', 'decoder_stage0a_conv', 'decoder_stage1a_conv', 'decoder_stage1b_conv', 'decoder_stage2a_conv',
'decoder_stage2b_conv', 'decoder_stage3a_conv', 'decoder_stage3b_conv', 'decoder_stage4a_conv']:
if (uncertain):
# activate dropout in predictions
next_layer = Dropout(0.1) (updated_model_layers, training=True)
else:
# add dropout layer
next_layer = Dropout(0.1) (updated_model_layers)
# add reconnected Droput Layer
updated_model_layers = next_layer
model = Model(model.layers[0], updated_model_layers)
This throws the following Error: AttributeError: 'KerasTensor' object has no attribute 'output'
But I think I'm doing something wrong. Does anybody have a Solution for this?
There is a problem with the Resnet model you are using. It is complex and has Add and Concatenate layers (residual layers, I guess), which take as input a list of tensors from several "subnetworks". In other words, the network is not linear, so you can't walk through the model with a simple loop.
Regarding your error, in the loop of your code: layer is a layer and updated_model_layers is a tensor (functional API). Therefore, updated_model_layers.output does not exist. You confuse the two a bit
In building a model that uses TensorFlow 2.0 Attention I followed the example given in the TF docs. https://www.tensorflow.org/api_docs/python/tf/keras/layers/Attention
The last line in the example is
input_layer = tf.keras.layers.Concatenate()(
[query_encoding, query_value_attention])
Then the example has the comment
# Add DNN layers, and create Model.
# ...
So it seemed logical to do this
model = tf.keras.Sequential()
model.add(input_layer)
This produces the error
TypeError: The added layer must be an instance of class Layer.
Found: Tensor("concatenate/Identity:0", shape=(None, 200), dtype=float32)
UPDATE (after #thushv89 response)
What I am trying to do in the end is add an attention layer in the following model which works well (or convert it to an attention model).
model = tf.keras.Sequential()
model.add(layers.Embedding(vocab_size, embedding_nodes, input_length=max_length))
model.add(layers.LSTM(20))
#add attention here?
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_error', metrics=['accuracy'])
My data looks like this
4912,5059,5079,0
4663,5145,5146,0
4663,5145,5146,0
4840,5117,5040,0
Where the first three columns are the inputs and the last column is binary and the goal is classification. The data was prepared similarly to this example with a similar purpose, binary classification. https://machinelearningmastery.com/use-word-embedding-layers-deep-learning-keras/
So, first thing is Keras has three APIs when it comes to creating models.
Sequential - (Which is what you're doing here)
Functional - (Which is what I'm using in the solution)
Subclassing - Creating Python classes to represent custom models/layers
The way the model created in the tutorial is not to be used with sequential models but a model from the Functional API. So you got to do the following. Note that, I've taken the liberty of defining the dense layers with arbitrary parameters (e.g. number of output classes, which you can change as needed).
import tensorflow as tf
# Variable-length int sequences.
query_input = tf.keras.Input(shape=(None,), dtype='int32')
value_input = tf.keras.Input(shape=(None,), dtype='int32')
# ... the code in the middle
# Concatenate query and document encodings to produce a DNN input layer.
input_layer = tf.keras.layers.Concatenate()(
[query_encoding, query_value_attention])
# Add DNN layers, and create Model.
# ...
dense_out = tf.keras.layers.Dense(50, activation='relu')(input_layer)
pred = tf.keras.layers.Dense(10, activation='softmax')(dense_out)
model = tf.keras.models.Model(inputs=[query_input, value_input], outputs=pred)
model.summary()
I have a Keras LSTM multitask model that performs two tasks. One is a sequence tagging task (so I predict a label per token). The other is a global classification task over the whole sequence using a CNN that is stacked on the hidden states of the LSTM.
In my setup (don't ask why) I only need the CNN task during training, but the labels it predicts have no use on the final product. So, on Keras, one can train a LSTM model without especifiying the input sequence lenght. like this:
l_input = Input(shape=(None,), dtype="int32", name=input_name)
However, if I add the CNN stacked on the LSTM hidden states I need to set a fixed sequence length for the model.
l_input = Input(shape=(timesteps_size,), dtype="int32", name=input_name)
The problem is that once I have trained the model with a fixed timestep_size I can no longer use it to predict longer sequences.
In other frameworks this is not a problem. But in Keras, I cannot get rid of the CNN and change the expected input shape of the model once it has been trained.
Here is a simplified version of the model
l_input = Input(shape=(timesteps_size,), dtype="int32")
l_embs = Embedding(len(input.keys()), 100)(l_input)
l_blstm = Bidirectional(GRU(300, return_sequences=True))(l_embs)
# Sequential output
l_out1 = TimeDistributed(Dense(len(labels.keys()),
activation="softmax"))(l_blstm)
# Global output
conv1 = Conv1D( filters=5 , kernel_size=10 )( l_embs )
conv1 = Flatten()(MaxPooling1D(pool_size=2)( conv1 ))
conv2 = Conv1D( filters=5 , kernel_size=8 )( l_embs )
conv2 = Flatten()(MaxPooling1D(pool_size=2)( conv2 ))
conv = Concatenate()( [conv1,conv2] )
conv = Dense(50, activation="relu")(conv)
l_out2 = Dense( len(global_labels.keys()) ,activation='softmax')(conv)
model = Model(input=input, output=[l_out1, l_out2])
optimizer = Adam()
model.compile(optimizer=optimizer,
loss="categorical_crossentropy",
metrics=["accuracy"])
I would like to know if anyone here has faced this issue, and if there are any solutions to delete layers from a model after training and, more important, how to reshape input layer sizes after training.
Thanks
Variable timesteps length makes a problem not because of using convolution layers (actually the good thing about convolution layers is that they do not depend on the input size). Rather, using Flatten layers cause the problem here since they need an input with specified size. Instead, you can use Global Pooling layers. Further, I think stacking convolution and pooling layers on top of each other might give a better result instead of using two separate convolution layers and merging them (although this depends on the specific problem and dataset you are working on). So considering these two points it might be better to write your model like this:
# Global output
conv1 = Conv1D(filters=16, kernel_size=5)(l_embs)
conv1 = MaxPooling1D(pool_size=2)(conv1)
conv2 = Conv1D(filters=32, kernel_size=5)(conv1)
conv2 = MaxPooling1D(pool_size=2)(conv2)
gpool = GlobalAveragePooling1D()(conv2)
x = Dense(50, activation="relu")(gpool)
l_out2 = Dense(len(global_labels.keys()), activation='softmax')(x)
model = Model(inputs=l_input, outputs=[l_out1, l_out2])
You may need to tune the number of conv+maxpool layers, number of filters, kernel size and even add dropout or batch normalization layers.
As a side note, using TimeDistributed on a Dense layer is redundant as the Dense layer is applied on the last axis.
Imagine a fully-connected neural network with its last two layers of the following structure:
[Dense]
units = 612
activation = softplus
[Dense]
units = 1
activation = sigmoid
The output value of the net is 1, but I'd like to know what the input x to the sigmoidal function was (must be some high number, since sigm(x) is 1 here).
Folllowing indraforyou's answer I managed to retrieve the output and weights of Keras layers:
outputs = [layer.output for layer in model.layers[-2:]]
functors = [K.function( [model.input]+[K.learning_phase()], [out] ) for out in outputs]
test_input = np.array(...)
layer_outs = [func([test_input, 0.]) for func in functors]
print layer_outs[-1][0] # -> array([[ 1.]])
dense_0_out = layer_outs[-2][0] # shape (612, 1)
dense_1_weights = model.layers[-1].weights[0].get_value() # shape (1, 612)
dense_1_bias = model.layers[-1].weights[1].get_value()
x = np.dot(dense_0_out, dense_1_weights) + dense_1_bias
print x # -> -11.7
How can x be a negative number? In that case the last layers output should be a number closer to 0.0 than 1.0. Are dense_0_out or dense_1_weights the wrong outputs or weights?
Since you're using get_value(), I'll assume that you're using Theano backend. To get the value of the node before the sigmoid activation, you can traverse the computation graph.
The graph can be traversed starting from outputs (the result of some computation) down to its inputs using the owner field.
In your case, what you want is the input x of the sigmoid activation op. The output of the sigmoid op is model.output. Putting these together, the variable x is model.output.owner.inputs[0].
If you print out this value, you'll see Elemwise{add,no_inplace}.0, which is an element-wise addition op. It can be verified from the source code of Dense.call():
def call(self, inputs):
output = K.dot(inputs, self.kernel)
if self.use_bias:
output = K.bias_add(output, self.bias)
if self.activation is not None:
output = self.activation(output)
return output
The input to the activation function is the output of K.bias_add().
With a small modification of your code, you can get the value of the node before activation:
x = model.output.owner.inputs[0]
func = K.function([model.input] + [K.learning_phase()], [x])
print func([test_input, 0.])
For anyone using TensorFlow backend: use x = model.output.op.inputs[0] instead.
I can see a simple way just changing a little the model structure. (See at the end how to use the existing model and change only the ending).
The advantages of this method are:
You don't have to guess if you're doing the right calculations
You don't need to care about the dropout layers and how to implement a dropout calculation
This is a pure Keras solution (applies to any backend, either Theano or Tensorflow).
There are two possible solutions below:
Option 1 - Create a new model from start with the proposed structure
Option 2 - Reuse an existing model changing only its ending
Model structure
You could just have the last dense separated in two layers at the end:
[Dense]
units = 612
activation = softplus
[Dense]
units = 1
#no activation
[Activation]
activation = sigmoid
Then you simply get the output of the last dense layer.
I'd say you should create two models, one for training, the other for checking this value.
Option 1 - Building the models from the beginning:
from keras.models import Model
#build the initial part of the model the same way you would
#add the Dense layer without an activation:
#if using the functional Model API
denseOut = Dense(1)(outputFromThePreviousLayer)
sigmoidOut = Activation('sigmoid')(denseOut)
#if using the sequential model - will need the functional API
model.add(Dense(1))
sigmoidOut = Activation('sigmoid')(model.output)
Create two models from that, one for training, one for checking the output of dense:
#if using the functional API
checkingModel = Model(yourInputs, denseOut)
#if using the sequential model:
checkingModel = model
trainingModel = Model(checkingModel.inputs, sigmoidOut)
Use trianingModel for training normally. The two models share weights, so training one is training the other.
Use checkingModel just to see the outputs of the Dense layer, using checkingModel.predict(X)
Option 2 - Building this from an existing model:
from keras.models import Model
#find the softplus dense layer and get its output:
softplusOut = oldModel.layers[indexForSoftplusLayer].output
#or should this be the output from the dropout? Whichever comes immediately after the last Dense(1)
#recreate the dense layer
outDense = Dense(1, name='newDense', ...)(softPlusOut)
#create the new model
checkingModel = Model(oldModel.inputs,outDense)
It's important, since you created a new Dense layer, to get the weights from the old one:
wgts = oldModel.layers[indexForDense].get_weights()
checkingModel.get_layer('newDense').set_weights(wgts)
In this case, training the old model will not update the last dense layer in the new model, so, let's create a trainingModel:
outSigmoid = Activation('sigmoid')(checkingModel.output)
trainingModel = Model(checkingModel.inputs,outSigmoid)
Use checkingModel for checking the values you want with checkingModel.predict(X). And train the trainingModel.
So this is for fellow googlers, the working of the keras API has changed significantly since the accepted answer was posted. The working code for extracting a layer's output before activation (for tensorflow backend) is:
model = Your_Keras_Model()
the_tensor_you_need = model.output.op.inputs[0] #<- this is indexable, if there are multiple inputs to this node then you can find it with indexing.
In my case, the final layer was a dense layer with activation softmax, so the tensor output I needed was <tf.Tensor 'predictions/BiasAdd:0' shape=(?, 1000) dtype=float32>.
(TF backend)
Solution for Conv layers.
I had the same question, and to rewrite a model's configuration was not an option.
The simple hack would be to perform the call function manually. It gives control over the activation.
Copy-paste from the Keras source, with self changed to layer. You can do the same with any other layer.
def conv_no_activation(layer, inputs, activation=False):
if layer.rank == 1:
outputs = K.conv1d(
inputs,
layer.kernel,
strides=layer.strides[0],
padding=layer.padding,
data_format=layer.data_format,
dilation_rate=layer.dilation_rate[0])
if layer.rank == 2:
outputs = K.conv2d(
inputs,
layer.kernel,
strides=layer.strides,
padding=layer.padding,
data_format=layer.data_format,
dilation_rate=layer.dilation_rate)
if layer.rank == 3:
outputs = K.conv3d(
inputs,
layer.kernel,
strides=layer.strides,
padding=layer.padding,
data_format=layer.data_format,
dilation_rate=layer.dilation_rate)
if layer.use_bias:
outputs = K.bias_add(
outputs,
layer.bias,
data_format=layer.data_format)
if activation and layer.activation is not None:
outputs = layer.activation(outputs)
return outputs
Now we need to modify the main function a little. First, identify the layer by its name. Then retrieve activations from the previous layer. And at last, compute the output from the target layer.
def get_output_activation_control(model, images, layername, activation=False):
"""Get activations for the input from specified layer"""
inp = model.input
layer_id, layer = [(n, l) for n, l in enumerate(model.layers) if l.name == layername][0]
prev_layer = model.layers[layer_id - 1]
conv_out = conv_no_activation(layer, prev_layer.output, activation=activation)
functor = K.function([inp] + [K.learning_phase()], [conv_out])
return functor([images])
Here is a tiny test. I'm using VGG16 model.
a_relu = get_output_activation_control(vgg_model, img, 'block4_conv1', activation=True)[0]
a_no_relu = get_output_activation_control(vgg_model, img, 'block4_conv1', activation=False)[0]
print(np.sum(a_no_relu < 0))
> 245293
Set all negatives to zero to compare with the results retrieved after an embedded in VGG16 ReLu operation.
a_no_relu[a_no_relu < 0] = 0
print(np.allclose(a_relu, a_no_relu))
> True
easy way to define new layer with new activation function:
def change_layer_activation(layer):
if isinstance(layer, keras.layers.Conv2D):
config = layer.get_config()
config["activation"] = "linear"
new = keras.layers.Conv2D.from_config(config)
elif isinstance(layer, keras.layers.Dense):
config = layer.get_config()
config["activation"] = "linear"
new = keras.layers.Dense.from_config(config)
weights = [x.numpy() for x in layer.weights]
return new, weights
I had the same problem but none of the other answers worked for me. Im using a newer version of Keras with Tensorflow so some answers dont work now. Also the structure of the model is given so i can't change it easely. The general idea is to create a copy of the original model that will work exactly like the original one but spliting the activation from the outputs layers. Once this is done we can easely access the outputs values before the activation is applied.
First we will create a copy of the original model but with no activation on the outputs layers. This will be done using Keras clone_model function (See Docs).
from tensorflow.keras.models import clone_model
from tensorflow.keras.layers import Activation
original_model = get_model()
def f(layer):
config = layer.get_config()
if not isinstance(layer, Activation) and layer.name in original_model.output_names:
config.pop('activation', None)
layer_copy = layer.__class__.from_config(config)
return layer_copy
copy_model = clone_model(model, clone_function=f)
This alone will only make a clone with new weights so we must copy the original_model weights to the new one:
copy_model.build(original_model.input_shape)
copy_model.set_weights(original_model.get_weights())
Now we will add the activations layers:
from tensorflow.keras.models import Model
old_outputs = [ original_model.get_layer(name=name) for name in copy_model.output_names ]
new_outputs = [ Activation(old_output.activation)(output) if old_output.activation else output
for output, old_output in zip(copy_model.outputs, old_outputs) ]
copy_model = Model(copy_model.inputs, new_outputs)
Finally we could create a new model whose evaluation will be the outputs with no activation applied:
no_activation_outputs = [ copy_model.get_layer(name=name).output for name in original_model.output_names ]
no_activation_model = Model(copy.inputs, no_activation_outputs)
Now we could use copy_model like the original_model and no_activation_model to access pre-activation outputs. Actually you could even modify the code to split a custom set of layers instead of the outputs.