Let us say I have a simple NN model classifying MNIST like so:
model = models.Sequential([
tf.keras.layers.Flatten(input_shape=(28, 28)),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(10)
])
And I extract the output tensor from the 2nd layer, i.e. Dense layer (after activation relu) like so:
mid_layer = model.get_layer("dense")
get_output = K.function([model.layers[0].input], [mid_layer.output])
For some particular data input x, I extract the output tensor activation values now:
mid_layer_outputs = get_output([test_images[0:1]])
And make some changes to it:
mid_layer_outputs = ...
And now, I would like the model to continue from this layer's modified output values and predict results accordingly. How would I do that?
I tried to construct another K.function([mid_layer_outputs], model.layers[-1].output) starting from this layer to the end, but I got the following error:
AttributeError: 'numpy.ndarray' object has no attribute 'op'
which is understandable as it cannot continue predictions with a NP array object instead of a model layer object. How do I do this?
I figured it out - I need to specify the next layer inputs in the second K.function and then pass in the modified outputs to it:
get_pred = K.function([model.layers[2].input], model.layers[-1].output)
pred = get_pred([mid_layer_outputs])
According to this answer, K.function runs the computation graph we created in the code, taking input from the first parameter and extracting the number of outputs as per the layers mentioned. So K.function([mid_layer_outputs], model.layers[-1].output) was not working earlier because instead of passing the layer for K.function to work with and connect to the output, I was passing in the values directly.
Related
I am using the tf.keras.applications.efficientnet_v2.EfficientNetV2L model and I want to edit the last layers of the model to make the model a regression and classification layer. However, I am unsure of how to edit this model because it is not a linear sequential model, and thus I cannot do:
for layer in model.layers[:-2]:
model.add(layer)
as certain layers of the model have multiple inputs. Is there a way of preserving the model except the last layer so the model will diverge before the last layer?
efficentnet[:-2]
|
|
/ \
/ \
/ \
output1 output2
To enable a functional model to have a classification layer and a regression layer, you can change the model as follows. Note, there are various ways to achieve this, and this is one of them.
import tensorflow as tf
from tensorflow import keras
prev_model = keras.applications.EfficientNetV2B0(
input_tensor=keras.Input(shape=(224, 224, 3)),
include_top=False
)
Next, we will write our expected head layers, shown below.
neck_branch = keras.Sequential(
[
# we can add more layers i.e. batch norm, etc.
keras.layers.GlobalAveragePooling2D()
],
name='neck_head'
)
classification_head = keras.Sequential(
[
keras.layers.Dense(10, activation='softmax')
],
name='classification_head'
)
regression_head = keras.Sequential(
[
keras.layers.Dense(1, activation=None)
],
name='regression_head'
)
Now, we can build the desired model.
x = neck_branch(prev_model.output)
output_a = classification_head(x)
output_b = regression_head(x)
final_model = keras.Model(prev_model.inputs, [output_a, output_b])
Test
keras.utils.plot_model(final_model, expand_nested=True)
# OK
final_model(tf.ones(shape=(1, 224, 224, 3)))
# OK
Update
Based on your comments,
how you would tackle the problem if the previous model was imported from a h5 file since there I cannot declare the top layer not to be included?
If I understand your query, you have a saved model (in .h5 format) with top layers. In that case, you don't have include_top params to exclude the top branch. So, what you can do is remove the top branch of your saved model first. Here is how,
# a saved model with top layers
prev_model = keras.models.load_model('model.h5')
prev_model_with_top_remove = keras.Model(
prev_model.input ,
prev_model.layers[-4].output
)
prev_model_with_top_remove.summary()
This prev_model.layers[-4].output will remove the top branch. In the end, you will give similar output as we can get with include_top=True. Check the model summary to visually inspect.
Keras' functional API works by linking Keras tensors (hereby called KTensor) and not your everyday TF tensors.
Therefore, the first thing you need to do is feeding KTensors (created using tf.keras.Input) of proper shapes to the original model. This will trigger the forward chain, prompting the model's layers to produce their own output KTensors that are properly linked to the input KTensors. After the forward pass,
The layers will store their received/produced KTensors in their input and output attributes.
The model itself will also store the KTensors you fed to it and the corresponding final output KTensors in its inputs and outputs attributes (note the s).
Like so,
>>> from tensorflow.keras import Input
>>> from tensorflow.keras.layers import Dense
>>> from tensorflow.keras.models import Sequential, Model
>>> seq_model = Sequential()
>>> seq_model.add(Dense(1))
>>> seq_model.add(Dense(2))
>>> seq_model.add(Dense(3))
>>> hasattr(seq_model.layers[0], 'output')
False
>>> seq_model.inputs is None
True
>>> _ = seq_model(Input(shape=(10,))) # <--- Feed input KTensor to the model
>>> seq_model.layers[0].output
<KerasTensor: shape=(None, 1) dtype=float32 (created by layer 'dense')>
>>> seq_model.inputs
[<KerasTensor: shape=(None, 10) dtype=float32 (created by layer 'dense_input')>]
Once you've obtained these internal KTensors, everything becomes trivial. To extract the KTensor right before the last two layers and forward it to two different branches to form a new functional model, do
>>> intermediate_ktensor = seq_model.layers[-3].output
>>> branch_1_output = Dense(20)(intermediate_ktensor)
>>> branch_2_output = Dense(30)(intermediate_ktensor)
>>> branched_model = Model(inputs=seq_model.inputs, outputs=[branch_1_output, branch_2_output])
Note that the shapes of the KTensors you fed at the very first step must conform to the shape requirements of the layers that receive them. In my example, the input KTensor would be fed to Dense(1) layer. As Dense requires the input shape to be defined in the last dimension, the input KTensor could be of shapes, e.g., (10,) or (None,10) but not (None,) or (10, None).
Suppose that I have a Keras Input Layer "input_layer_A" (in Python)
input_layer_A = tf.keras.layers.Input(shape=(10, ), name="input_A") # shape = (None, 10)
I also have a dense layer
dense_layer_1 = tf.keras.layers.Dense(units=1, activation='tanh', name="dense_layer_1")
Then I create an output as:
out1 = dense_layer_1(input_layer_A)
I want to modify/edit a copy of "input_layer_A" (as an example) as follows: (syntax on how to edit the Input Layer properly is a part of the question)
input_layer_B = input_layer_A #Copy the input_layer_A to create new input_layer_B
input_layer_B[0][5] = input_layer_B[0][4] #Replace 5th element with its 4th element
input_layer_B[0][4] = out1 #Replace the 4th element with previous output out1
Is it the correct way to edit/modify the input Layer? If not what is the proper way?
Then I feed “input_layer_B” to “dense_layer_1” to create a new output “out2” (kind of a recursion and it can be applied in a for loop as well)
out2 = dense_layer_1(input_layer_B)
Finally create the model
model = Model(inputs=[input_layer_A], outputs = [out1, out2])
When I run the model I get an error in the form of (related with out2: this is where the graph becomes disconnected, no problem for out1, which means there is something wrong about how I modify the "input_layer_B")
ValueError: Graph disconnected: cannot obtain value for tensor
What is the correct way to modify/edit an Input Layer so that I can still use it to feed successfully to "dense_layer_1" to end up with a connected and proper graph?
Any help, suggestion or an alternative method to accomplish what I want is appreciated.
Thank you
PS: My motivation for building such a recursive network is to model a NARX Neural Network with feedback to estimate multi-step ahead output of a dynamical system. So there can be n-many outputs in theory
Try this code:
import tensorflow as tf
input_layer_A = tf.keras.layers.Input(shape=(10, ), name="input_A") # shape = (None, 10)
dense_layer_1 = tf.keras.layers.Dense(units=1, activation='tanh', name="dense_layer_1")
out1 = dense_layer_1(input_layer_A)
out2 = tf.concat([input_layer_A[:, :4], out1, input_layer_A[:, 4][..., tf.newaxis], input_layer_A[:, 6:]], -1)
out2 = dense_layer_1(out2)
model = tf.keras.Model(inputs=[input_layer_A], outputs = [out1, out2])
inp = tf.random.uniform((1, 10))
pred = model(inp)
Note, that I've changed dimension of Dense layer (it should be 1 to get the following op possible):
input_layer_B[0][4] = out1
In building a model that uses TensorFlow 2.0 Attention I followed the example given in the TF docs. https://www.tensorflow.org/api_docs/python/tf/keras/layers/Attention
The last line in the example is
input_layer = tf.keras.layers.Concatenate()(
[query_encoding, query_value_attention])
Then the example has the comment
# Add DNN layers, and create Model.
# ...
So it seemed logical to do this
model = tf.keras.Sequential()
model.add(input_layer)
This produces the error
TypeError: The added layer must be an instance of class Layer.
Found: Tensor("concatenate/Identity:0", shape=(None, 200), dtype=float32)
UPDATE (after #thushv89 response)
What I am trying to do in the end is add an attention layer in the following model which works well (or convert it to an attention model).
model = tf.keras.Sequential()
model.add(layers.Embedding(vocab_size, embedding_nodes, input_length=max_length))
model.add(layers.LSTM(20))
#add attention here?
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_error', metrics=['accuracy'])
My data looks like this
4912,5059,5079,0
4663,5145,5146,0
4663,5145,5146,0
4840,5117,5040,0
Where the first three columns are the inputs and the last column is binary and the goal is classification. The data was prepared similarly to this example with a similar purpose, binary classification. https://machinelearningmastery.com/use-word-embedding-layers-deep-learning-keras/
So, first thing is Keras has three APIs when it comes to creating models.
Sequential - (Which is what you're doing here)
Functional - (Which is what I'm using in the solution)
Subclassing - Creating Python classes to represent custom models/layers
The way the model created in the tutorial is not to be used with sequential models but a model from the Functional API. So you got to do the following. Note that, I've taken the liberty of defining the dense layers with arbitrary parameters (e.g. number of output classes, which you can change as needed).
import tensorflow as tf
# Variable-length int sequences.
query_input = tf.keras.Input(shape=(None,), dtype='int32')
value_input = tf.keras.Input(shape=(None,), dtype='int32')
# ... the code in the middle
# Concatenate query and document encodings to produce a DNN input layer.
input_layer = tf.keras.layers.Concatenate()(
[query_encoding, query_value_attention])
# Add DNN layers, and create Model.
# ...
dense_out = tf.keras.layers.Dense(50, activation='relu')(input_layer)
pred = tf.keras.layers.Dense(10, activation='softmax')(dense_out)
model = tf.keras.models.Model(inputs=[query_input, value_input], outputs=pred)
model.summary()
I don't understand what's happening in this code:
def construct_model(use_imagenet=True):
# line 1: how do we keep all layers of this model ?
model = keras.applications.InceptionV3(include_top=False, input_shape=(IMG_SIZE, IMG_SIZE, 3),
weights='imagenet' if use_imagenet else None) # line 1: how do we keep all layers of this model ?
new_output = keras.layers.GlobalAveragePooling2D()(model.output)
new_output = keras.layers.Dense(N_CLASSES, activation='softmax')(new_output)
model = keras.engine.training.Model(model.inputs, new_output)
return model
Specifically, my confusion is, when we call the last constructor
model = keras.engine.training.Model(model.inputs, new_output)
we specify input layer and output layer, but how does it know we want all the other layers to stay?
In other words, we append the new_output layer to the pre-trained model we load in line 1, that is the new_output layer, and then in the final constructor (final line), we just create and return a model with a specified input and output layers, but how does it know what other layers we want in between?
Side question 1): What is the difference between keras.engine.training.Model and keras.models.Model?
Side question 2): What exactly happens when we do new_layer = keras.layers.Dense(...)(prev_layer)? Does the () operation return new layer, what does it do exactly?
This model was created using the Functional API Model
Basically it works like this (perhaps if you go to the "side question 2" below before reading this it may get clearer):
You have an input tensor (you can see it as "input data" too)
You create (or reuse) a layer
You pass the input tensor to a layer (you "call" a layer with an input)
You get an output tensor
You keep working with these tensors until you have created the entire graph.
But this hasn't created a "model" yet. (One you can train and use other things).
All you have is a graph telling which tensors go where.
To create a model, you define it's start end end points.
In the example.
They take an existing model: model = keras.applications.InceptionV3(...)
They want to expand this model, so they get its output tensor: model.output
They pass this tensor as the input of a GlobalAveragePooling2D layer
They get this layer's output tensor as new_output
They pass this as input to yet another layer: Dense(N_CLASSES, ....)
And get its output as new_output (this var was replaced as they are not interested in keeping its old value...)
But, as it works with the functional API, we don't have a model yet, only a graph. In order to create a model, we use Model defining the input tensor and the output tensor:
new_model = Model(old_model.inputs, new_output)
Now you have your model.
If you use it in another var, as I did (new_model), the old model will still exist in model. And these models are sharing the same layers, in a way that whenever you train one of them, the other gets updated as well.
Question: how does it know what other layers we want in between?
When you do:
outputTensor = SomeLayer(...)(inputTensor)
you have a connection between the input and output. (Keras will use the inner tensorflow mechanism and add these tensors and nodes to the graph). The output tensor cannot exist without the input. The entire InceptionV3 model is connected from start to end. Its input tensor goes through all the layers to yield an ouptut tensor. There is only one possible way for the data to follow, and the graph is the way.
When you get the output of this model and use it to get further outputs, all your new outputs are connected to this, and thus to the first input of the model.
Probably the attribute _keras_history that is added to the tensors is closely related to how it tracks the graph.
So, doing Model(old_model.inputs, new_output) will naturally follow the only way possible: the graph.
If you try doing this with tensors that are not connected, you will get an error.
Side question 1
Prefer to import from "keras.models". Basically, this module will import from the other module:
https://github.com/keras-team/keras/blob/master/keras/models.py
Notice that the file keras/models.py imports Model from keras.engine.training. So, it's the same thing.
Side question 2
It's not new_layer = keras.layers.Dense(...)(prev_layer).
It is output_tensor = keras.layers.Dense(...)(input_tensor).
You're doing two things in the same line:
Creating a layer - with keras.layers.Dense(...)
Calling the layer with an input tensor to get an output tensor
If you wanted to use the same layer with different inputs:
denseLayer = keras.layers.Dense(...) #creating a layer
output1 = denseLayer(input1) #calling a layer with an input and getting an output
output2 = denseLayer(input2) #calling the same layer on another input
output3 = denseLayer(input3) #again
Bonus - Creating a functional model that is equal to a sequential model
If you create this sequential model:
model = Sequential()
model.add(Layer1(...., input_shape=some_shape))
model.add(Layer2(...))
model.add(Layer3(...))
You're doing exactly the same as:
inputTensor = Input(some_shape)
outputTensor = Layer1(...)(inputTensor)
outputTensor = Layer2(...)(outputTensor)
outputTensor = Layer3(...)(outputTensor)
model = Model(inputTensor,outputTensor)
What is the difference?
Well, functional API models are totally free to be build anyway you want. You can create branches:
out1 = Layer1(..)(inputTensor)
out2 = Layer2(..)(inputTensor)
You can join tensors:
joinedOut = Concatenate()([out1,out2])
With this, you can create anything you want with all kinds of fancy stuff, branches, gates, concatenations, additions, etc., which you can't do with a sequential model.
In fact, a Sequential model is also a Model, but created for a quick use in models without branches.
There's this way of building a model from a pretrained one that you may build upon.
See https://keras.io/applications/#fine-tune-inceptionv3-on-a-new-set-of-classes:
base_model = InceptionV3(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(200, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)
for layer in base_model.layers:
layer.trainable = False
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
Each time a layer is added by an op like "x=Dense(...", information about the computational graph is updated. You can type this interactively to see what it contains:
x.graph.__dict__
You can see there's all kinds of attributes, including about previous and next layers. These are internal implementation details and possibly change over time.
I have an ANN model and i am trying to get the activation values of all the hidden layers. I have trained the network with a 90dim matrix and i have 1 hidden layers which is 150dim. My model structure is one 90dim input layer, one hidden layer of 150dim and one output of 90dim. I have trained and tested the data. After that i m using the .predict() function to predict output using my test dataset. I am feeding the predicted output as the next input and so on. Now i want to get the activation value of the hidden layers of the predict function. I am using the following code to achieve it but its not working:
write_predict_data = pd.ExcelWriter("/home/workstation/ANN/prediction_data_2.xlsx",engine="xlsxwriter")
write_activations_data = pd.ExcelWriter("/home/rianzaman/Downloads/activition_of_hidden_node_2.xlsx",engine="xlsxwriter")
for i in range(0, 200):
print("Predicting ...",)
next_prediction = my_model.predict(X_test, 1,)
output_file_data = pd.DataFrame(next_prediction)
output_file_data.to_excel(write_predict_data, sheet_name='Sheet1')
#To get activation
get_activations = theano.function([my_model.layers[0].input], my_model.layers[1].get_output(train=False),
allow_input_downcast=True)
activations = get_activations(next_prediction)
output_file_data_activation = pd.DataFrame(activations)
output_file_data_activation.to_excel(write_activations_data, sheet_name='Sheet1')
X_test = next_prediction
write_predict_data.save()
When i m running the code i m getting a 90dim output which basically i think is the output layers dataset.
Can anybody tell me what's wrong with the code?
get_activations(next_prediction) should be get_activations(X_test) - you want to pass inputs to get_activations, not labels.