I'm new in Keras and Neural Networks. I'm writing a thesis and trying to create a SimpleRNN in Keras as it is illustrated below:
As it is shown in the picture, I need to create a model with 4 inputs + 2 outputs and with any number of neurons in the hidden layer.
This is my code:
model = Sequential()
model.add(SimpleRNN(4, input_shape=(1, 4), activation='sigmoid', return_sequences=True))
model.add(Dense(2))
model.compile(loss='mean_absolute_error', optimizer='adam')
model.fit(data, target, epochs=5000, batch_size=1, verbose=2)
predict = model.predict(data)
1) Does my model implement the graph?
2) Is it possible to specify connections between neurons Input and Hidden layers or Output and Input layers?
Explanation:
I am going to use backpropagation to train my network.
I have input and target values
Input is a 10*4 array and target is a 10*2 array which I then reshape:
input = input.reshape((10, 1, 4))
target = target.reshape((10, 1, 2))
It is crucial for to able to specify connections between neurons as they can be different. For instance, here you can have an example:
1) Not really. But I'm not sure about what exactly you want in that graph. (Let's see how Keras recurrent layers work below)
2) Yes, it's possible to connect every layer to every layer, but you can't use Sequential for that, you must use Model.
This answer may not be what you're looking for. What exactly do you want to achieve? What kind of data you have, what output you expect, what is the model supposed to do? etc...
1 - How does a recurrent layer work?
Documentation
Recurrent layers in keras work with an "input sequence" and may output a single result or a sequence result. It's recurrency is totally contained in it and doesn't interact with other layers.
You should have inputs with shape (NumberOrExamples, TimeStepsInSequence, DimensionOfEachStep). This means input_shape=(TimeSteps,Dimension).
The recurrent layer will work internally with each time step. The cycles happen from step to step and this behavior is totally invisible. The layer seems to work just like any other layer.
This doesn't seem to be what you want. Unless you have a "sequence" to input. The only way I know if using recurrent layers in Keras that is similar to you graph is when you have a segment of a sequence and want to predict the next step. If that's the case, see some examples by searching for "predicting the next element" in Google.
2 - How to connect layers using Model:
Instead of adding layers to a sequential model (which will always follow a straight line), start using the layers independently, starting from an input tensor:
from keras.layers import *
from keras.models import Model
inputTensor = Input(shapeOfYourInput) #it seems the shape is "(2,)", but we must see your data.
#A dense layer with 2 outputs:
myDense = Dense(2, activation=ItsAGoodIdeaToUseAnActivation)
#The output tensor of that layer when you give it the input:
denseOut1 = myDense(inputTensor)
#You can do as many cycles as you want here:
denseOut2 = myDense(denseOut1)
#you can even make a loop:
denseOut = Activation(ItsAGoodIdeaToUseAnActivation)(inputTensor) #you may create a layer and call it with the input tensor in just one line if you're not going to reuse the layer
#I'm applying this activation layer here because since we defined an activation for the dense layer and we're going to cycle it, it's not going to behave very well receiving huge values in the first pass and small values the next passes....
for i in range(n):
denseOut = myDense(denseOut)
This kind of usage allows you to create any kind of model, with branches, alternative ways, connections from anywhere to anywhere, provided you respect the shape rules. For a cycle like that, inputs and outputs must have the same shape.
At the end, you must define a model from one or many inputs to one or many outputs (you must have training data to match all inputs and outputs you choose):
model = Model(inputTensor,denseOut)
But notice that this model is static. If you want to change the number of cycles, you will have to create a new model.
In this case, it would be as simple as repeating the loop step denseOut = myDense(denseOut) and creating another model2=Model(inputTensor,denseOut).
3 - Trying to create something like the image below:
I am supposing C and F will participate in all iterations. If not,
Since there are four actual inputs, and we are going to treat them all separately, let's create 4 inputs instead, all like (1,).
Your input array should be divided in 4 arrays, all being (10,1).
from keras.models import Model
from keras.layers import *
inputA = Input((1,))
inputB = Input((1,))
inputC = Input((1,))
inputF = Input((1,))
Now the layers N2 and N3, that will be used only once, since C and F are constant:
outN2 = Dense(1)(inputC)
outN3 = Dense(1)(inputF)
Now the recurrent layer N1, without giving it the tensors yet:
layN1 = Dense(1)
For the loop, let's create outA and outB. They start as actual inputs and will be given to the layer N1, but in the loop they will be replaced
outA = inputA
outB = inputB
Now in the loop, let's do the "passes":
for i in range(n):
#unite A and B in one
inputAB = Concatenate()([outA,outB])
#pass through N1
outN1 = layN1(inputAB)
#sum results of N1 and N2 into A
outA = Add()([outN1,outN2])
#this is constant for all the passes except the first
outB = outN3 #looks like B is never changing in your image....
Now the model:
finalOut = Concatenate()([outA,outB])
model = Model([inputA,inputB,inputC,inputF], finalOut)
Related
I am trying to test many ML models using keras.models.Sequential.
My idea is that once I have an iterator that looks like [num_layers, num_units_per_layers], for example [(1, 64),(2, (64,128))], to create a script using a kind of for loop running the iterator to be able to create a keras sequential model with the number of layers and units in each step of the iterator.
This is what I am trying:
it = [[(1, 128),(2, (64,128)), (3,(128,64,256))]]
for layers, units in it:
model = keras.Sequential([
layers.Dense(units[0])
#How to get another layers here when layers > 1.
])
But I am stuck when adding new layers automatically. To sum up, what I want in each step of the iterator is the keras model represented by its values.
Is there any way to do this?
For example, when layers = 2 and units = (64,128) the code should look like:
model = keras.Sequential([
layers.Dense(64),
layers.Dense(128)
])
If layers = 1 and units = 128 the code must be:
model = keras.Sequential([
layers.Dense(128)
])
Well the first issue is the way you set up it. The way you're doing it makes it a single list, where you want a list of n lists (here n is 3). If you define it as follows, you can extract layers, units the way you are looking for.
it = [[1,[128]],[2,(64,128)],[3,(128,64,256)]]
If you want a model with one layer, you need to put the number of units in brackets, or it won't work well with the other architectures (because of indexing). Next, there are some necessary tweaks to the code that I would suggest. First I would use a different way to build a Sequential model (shown below). Then, you would need to define your input shape otherwise the model will not know how to build. Finally, just create an output layer for each model outside the hidden layer generator loop.
I wrote this toy problem to fit your idea of iterating through models for 10 training samples and one input dimension and one output dimension.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import numpy as np
x = np.random.rand(10,1)
y = np.random.rand(10,1)
it = [[1,[128]],[2,(64,128)],[3,(128,64,256)]]
for layers, units in it:
model = Sequential()
for i in range(layers):
model.add(Dense(units[i],input_shape=(1,)))
model.add(Dense(1))
model.summary()
model.compile(loss='mse',optimizer='Adam')
model.fit(x,y,batch_size=1,epochs=1)
Is there a way to use the already trained RNN (SimpleRNN or LSTM) model to generate new sequences in Keras?
I'm trying to modify an exercise from the Coursera Deep Learning Specialization - Sequence Models course, where you train an RNN to generate dinosaurus's names. In the exercise you build the RNN using only numpy, but I want to use Keras.
One of the problems is different lengths of the sequences (dino names), so I used padding and set sequence length to the max size appearing in the dataset (I padded with 0, which is also the code for '\n').
My question is how to generate the actual sequence once training is done? In the numpy version of the exercise you take the softmax output of the previous cell and use it as a distribution to sample a new input for the next cell. But is there a way to connect the output of the previous cell as the input of the next cell in Keras, during testing/generation time?
Also - some additional side-question:
Since I'm using padding, I suspect the accuracy is way too optimistic. Is there a way to tell Keras not to include the padding values in its accuracy calculations?
Am I even doing this right? Is there a better way to use Keras with sequences of different lengths?
You can check my (WIP) code here.
Inferring from a model that has been trained on a sequence
So it's a pretty common thing to do in RNN models and in Keras the best way (at least from what I know) is to create two different models.
One model for training (which uses sequences instead of individual items)
Another model for predicting (which uses a single element instead of a sequence)
So let's see an example. Suppose you have the following model.
from tensorflow.keras import models, layers
n_chars = 26
timesteps = 10
inp = layers.Input(shape=(timesteps, n_chars))
lstm = layers.LSTM(100, return_sequences=True)
out1 = lstm(inp)
dense = layers.Dense(n_chars, activation='softmax')
out2 = layers.TimeDistributed(dense)(out1)
model = models.Model(inp, out2)
model.summary()
Now to infer from this model, you create another model which looks like the one below.
inp_infer = layers.Input(shape=(1, n_chars))
# Inputs to feed LSTM states back in
h_inp_infer = layers.Input(shape=(100,))
c_inp_infer = layers.Input(shape=(100,))
# We need return_state=True so we are creating a new layer
lstm_infer = layers.LSTM(100, return_state=True, return_sequences=True)
out1_infer, h, c = lstm_infer(inp_infer, initial_state=[h_inp_infer, c_inp_infer])
out2_infer = layers.TimeDistributed(dense)(out1_infer)
# Our model takes the previous states as inputs and spits out new states as outputs
model_infer = models.Model([inp_infer, h_inp_infer, c_inp_infer], [out2_infer, h, c])
# We are setting the weights from the trained model
lstm_infer.set_weights(lstm.get_weights())
model_infer.summary()
So what's different. You see that we have defined a new input layer which accepts an input which has only one timestep (or in other words, just a single item). Then the model outputs an output which has a single timestep (technically we don't need the TimeDistributedLayer. But I've kept that for consistency). Other than that we take the previous LSTM state output as an input and produces the new state as the output. More specifically we have the following inference model.
Input: [(None, 1, n_chars) (None, 100), (None, 100)] list of tensor
Output: [(None, 1, n_chars), (None, 100), (None, 100)] list of Tensor
Note that I'm updating the weights of the new layers from the trained model or using the existing layers from the training model. It will be a pretty useless model if you don't reuse the trained layers and weights.
Now we can write inference logic.
import numpy as np
x = np.random.randint(0,2,size=(1, 1, n_chars))
h = np.zeros(shape=(1, 100))
c = np.zeros(shape=(1, 100))
seq_len = 10
for _ in range(seq_len):
print(x)
y_pred, h, c = model_infer.predict([x, h, c])
y_pred = x[:,0,:]
y_onehot = np.zeros(shape=(x.shape[0],n_chars))
y_onehot[np.arange(x.shape[0]),np.argmax(y_pred,axis=1)] = 1.0
x = np.expand_dims(y_onehot, axis=1)
This part starts with an initial x, h, c. Gets the prediction y_pred, h, c and convert that to an input in the following lines and assign it back to x, h, c. So you keep going for n iterations of your choice.
About masking zeros
Keras does offer a Masking layer which can be used for this purpose. And the second answer in this question seems to be what you're looking for.
I don't understand what's happening in this code:
def construct_model(use_imagenet=True):
# line 1: how do we keep all layers of this model ?
model = keras.applications.InceptionV3(include_top=False, input_shape=(IMG_SIZE, IMG_SIZE, 3),
weights='imagenet' if use_imagenet else None) # line 1: how do we keep all layers of this model ?
new_output = keras.layers.GlobalAveragePooling2D()(model.output)
new_output = keras.layers.Dense(N_CLASSES, activation='softmax')(new_output)
model = keras.engine.training.Model(model.inputs, new_output)
return model
Specifically, my confusion is, when we call the last constructor
model = keras.engine.training.Model(model.inputs, new_output)
we specify input layer and output layer, but how does it know we want all the other layers to stay?
In other words, we append the new_output layer to the pre-trained model we load in line 1, that is the new_output layer, and then in the final constructor (final line), we just create and return a model with a specified input and output layers, but how does it know what other layers we want in between?
Side question 1): What is the difference between keras.engine.training.Model and keras.models.Model?
Side question 2): What exactly happens when we do new_layer = keras.layers.Dense(...)(prev_layer)? Does the () operation return new layer, what does it do exactly?
This model was created using the Functional API Model
Basically it works like this (perhaps if you go to the "side question 2" below before reading this it may get clearer):
You have an input tensor (you can see it as "input data" too)
You create (or reuse) a layer
You pass the input tensor to a layer (you "call" a layer with an input)
You get an output tensor
You keep working with these tensors until you have created the entire graph.
But this hasn't created a "model" yet. (One you can train and use other things).
All you have is a graph telling which tensors go where.
To create a model, you define it's start end end points.
In the example.
They take an existing model: model = keras.applications.InceptionV3(...)
They want to expand this model, so they get its output tensor: model.output
They pass this tensor as the input of a GlobalAveragePooling2D layer
They get this layer's output tensor as new_output
They pass this as input to yet another layer: Dense(N_CLASSES, ....)
And get its output as new_output (this var was replaced as they are not interested in keeping its old value...)
But, as it works with the functional API, we don't have a model yet, only a graph. In order to create a model, we use Model defining the input tensor and the output tensor:
new_model = Model(old_model.inputs, new_output)
Now you have your model.
If you use it in another var, as I did (new_model), the old model will still exist in model. And these models are sharing the same layers, in a way that whenever you train one of them, the other gets updated as well.
Question: how does it know what other layers we want in between?
When you do:
outputTensor = SomeLayer(...)(inputTensor)
you have a connection between the input and output. (Keras will use the inner tensorflow mechanism and add these tensors and nodes to the graph). The output tensor cannot exist without the input. The entire InceptionV3 model is connected from start to end. Its input tensor goes through all the layers to yield an ouptut tensor. There is only one possible way for the data to follow, and the graph is the way.
When you get the output of this model and use it to get further outputs, all your new outputs are connected to this, and thus to the first input of the model.
Probably the attribute _keras_history that is added to the tensors is closely related to how it tracks the graph.
So, doing Model(old_model.inputs, new_output) will naturally follow the only way possible: the graph.
If you try doing this with tensors that are not connected, you will get an error.
Side question 1
Prefer to import from "keras.models". Basically, this module will import from the other module:
https://github.com/keras-team/keras/blob/master/keras/models.py
Notice that the file keras/models.py imports Model from keras.engine.training. So, it's the same thing.
Side question 2
It's not new_layer = keras.layers.Dense(...)(prev_layer).
It is output_tensor = keras.layers.Dense(...)(input_tensor).
You're doing two things in the same line:
Creating a layer - with keras.layers.Dense(...)
Calling the layer with an input tensor to get an output tensor
If you wanted to use the same layer with different inputs:
denseLayer = keras.layers.Dense(...) #creating a layer
output1 = denseLayer(input1) #calling a layer with an input and getting an output
output2 = denseLayer(input2) #calling the same layer on another input
output3 = denseLayer(input3) #again
Bonus - Creating a functional model that is equal to a sequential model
If you create this sequential model:
model = Sequential()
model.add(Layer1(...., input_shape=some_shape))
model.add(Layer2(...))
model.add(Layer3(...))
You're doing exactly the same as:
inputTensor = Input(some_shape)
outputTensor = Layer1(...)(inputTensor)
outputTensor = Layer2(...)(outputTensor)
outputTensor = Layer3(...)(outputTensor)
model = Model(inputTensor,outputTensor)
What is the difference?
Well, functional API models are totally free to be build anyway you want. You can create branches:
out1 = Layer1(..)(inputTensor)
out2 = Layer2(..)(inputTensor)
You can join tensors:
joinedOut = Concatenate()([out1,out2])
With this, you can create anything you want with all kinds of fancy stuff, branches, gates, concatenations, additions, etc., which you can't do with a sequential model.
In fact, a Sequential model is also a Model, but created for a quick use in models without branches.
There's this way of building a model from a pretrained one that you may build upon.
See https://keras.io/applications/#fine-tune-inceptionv3-on-a-new-set-of-classes:
base_model = InceptionV3(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(200, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)
for layer in base_model.layers:
layer.trainable = False
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
Each time a layer is added by an op like "x=Dense(...", information about the computational graph is updated. You can type this interactively to see what it contains:
x.graph.__dict__
You can see there's all kinds of attributes, including about previous and next layers. These are internal implementation details and possibly change over time.
Imagine a fully-connected neural network with its last two layers of the following structure:
[Dense]
units = 612
activation = softplus
[Dense]
units = 1
activation = sigmoid
The output value of the net is 1, but I'd like to know what the input x to the sigmoidal function was (must be some high number, since sigm(x) is 1 here).
Folllowing indraforyou's answer I managed to retrieve the output and weights of Keras layers:
outputs = [layer.output for layer in model.layers[-2:]]
functors = [K.function( [model.input]+[K.learning_phase()], [out] ) for out in outputs]
test_input = np.array(...)
layer_outs = [func([test_input, 0.]) for func in functors]
print layer_outs[-1][0] # -> array([[ 1.]])
dense_0_out = layer_outs[-2][0] # shape (612, 1)
dense_1_weights = model.layers[-1].weights[0].get_value() # shape (1, 612)
dense_1_bias = model.layers[-1].weights[1].get_value()
x = np.dot(dense_0_out, dense_1_weights) + dense_1_bias
print x # -> -11.7
How can x be a negative number? In that case the last layers output should be a number closer to 0.0 than 1.0. Are dense_0_out or dense_1_weights the wrong outputs or weights?
Since you're using get_value(), I'll assume that you're using Theano backend. To get the value of the node before the sigmoid activation, you can traverse the computation graph.
The graph can be traversed starting from outputs (the result of some computation) down to its inputs using the owner field.
In your case, what you want is the input x of the sigmoid activation op. The output of the sigmoid op is model.output. Putting these together, the variable x is model.output.owner.inputs[0].
If you print out this value, you'll see Elemwise{add,no_inplace}.0, which is an element-wise addition op. It can be verified from the source code of Dense.call():
def call(self, inputs):
output = K.dot(inputs, self.kernel)
if self.use_bias:
output = K.bias_add(output, self.bias)
if self.activation is not None:
output = self.activation(output)
return output
The input to the activation function is the output of K.bias_add().
With a small modification of your code, you can get the value of the node before activation:
x = model.output.owner.inputs[0]
func = K.function([model.input] + [K.learning_phase()], [x])
print func([test_input, 0.])
For anyone using TensorFlow backend: use x = model.output.op.inputs[0] instead.
I can see a simple way just changing a little the model structure. (See at the end how to use the existing model and change only the ending).
The advantages of this method are:
You don't have to guess if you're doing the right calculations
You don't need to care about the dropout layers and how to implement a dropout calculation
This is a pure Keras solution (applies to any backend, either Theano or Tensorflow).
There are two possible solutions below:
Option 1 - Create a new model from start with the proposed structure
Option 2 - Reuse an existing model changing only its ending
Model structure
You could just have the last dense separated in two layers at the end:
[Dense]
units = 612
activation = softplus
[Dense]
units = 1
#no activation
[Activation]
activation = sigmoid
Then you simply get the output of the last dense layer.
I'd say you should create two models, one for training, the other for checking this value.
Option 1 - Building the models from the beginning:
from keras.models import Model
#build the initial part of the model the same way you would
#add the Dense layer without an activation:
#if using the functional Model API
denseOut = Dense(1)(outputFromThePreviousLayer)
sigmoidOut = Activation('sigmoid')(denseOut)
#if using the sequential model - will need the functional API
model.add(Dense(1))
sigmoidOut = Activation('sigmoid')(model.output)
Create two models from that, one for training, one for checking the output of dense:
#if using the functional API
checkingModel = Model(yourInputs, denseOut)
#if using the sequential model:
checkingModel = model
trainingModel = Model(checkingModel.inputs, sigmoidOut)
Use trianingModel for training normally. The two models share weights, so training one is training the other.
Use checkingModel just to see the outputs of the Dense layer, using checkingModel.predict(X)
Option 2 - Building this from an existing model:
from keras.models import Model
#find the softplus dense layer and get its output:
softplusOut = oldModel.layers[indexForSoftplusLayer].output
#or should this be the output from the dropout? Whichever comes immediately after the last Dense(1)
#recreate the dense layer
outDense = Dense(1, name='newDense', ...)(softPlusOut)
#create the new model
checkingModel = Model(oldModel.inputs,outDense)
It's important, since you created a new Dense layer, to get the weights from the old one:
wgts = oldModel.layers[indexForDense].get_weights()
checkingModel.get_layer('newDense').set_weights(wgts)
In this case, training the old model will not update the last dense layer in the new model, so, let's create a trainingModel:
outSigmoid = Activation('sigmoid')(checkingModel.output)
trainingModel = Model(checkingModel.inputs,outSigmoid)
Use checkingModel for checking the values you want with checkingModel.predict(X). And train the trainingModel.
So this is for fellow googlers, the working of the keras API has changed significantly since the accepted answer was posted. The working code for extracting a layer's output before activation (for tensorflow backend) is:
model = Your_Keras_Model()
the_tensor_you_need = model.output.op.inputs[0] #<- this is indexable, if there are multiple inputs to this node then you can find it with indexing.
In my case, the final layer was a dense layer with activation softmax, so the tensor output I needed was <tf.Tensor 'predictions/BiasAdd:0' shape=(?, 1000) dtype=float32>.
(TF backend)
Solution for Conv layers.
I had the same question, and to rewrite a model's configuration was not an option.
The simple hack would be to perform the call function manually. It gives control over the activation.
Copy-paste from the Keras source, with self changed to layer. You can do the same with any other layer.
def conv_no_activation(layer, inputs, activation=False):
if layer.rank == 1:
outputs = K.conv1d(
inputs,
layer.kernel,
strides=layer.strides[0],
padding=layer.padding,
data_format=layer.data_format,
dilation_rate=layer.dilation_rate[0])
if layer.rank == 2:
outputs = K.conv2d(
inputs,
layer.kernel,
strides=layer.strides,
padding=layer.padding,
data_format=layer.data_format,
dilation_rate=layer.dilation_rate)
if layer.rank == 3:
outputs = K.conv3d(
inputs,
layer.kernel,
strides=layer.strides,
padding=layer.padding,
data_format=layer.data_format,
dilation_rate=layer.dilation_rate)
if layer.use_bias:
outputs = K.bias_add(
outputs,
layer.bias,
data_format=layer.data_format)
if activation and layer.activation is not None:
outputs = layer.activation(outputs)
return outputs
Now we need to modify the main function a little. First, identify the layer by its name. Then retrieve activations from the previous layer. And at last, compute the output from the target layer.
def get_output_activation_control(model, images, layername, activation=False):
"""Get activations for the input from specified layer"""
inp = model.input
layer_id, layer = [(n, l) for n, l in enumerate(model.layers) if l.name == layername][0]
prev_layer = model.layers[layer_id - 1]
conv_out = conv_no_activation(layer, prev_layer.output, activation=activation)
functor = K.function([inp] + [K.learning_phase()], [conv_out])
return functor([images])
Here is a tiny test. I'm using VGG16 model.
a_relu = get_output_activation_control(vgg_model, img, 'block4_conv1', activation=True)[0]
a_no_relu = get_output_activation_control(vgg_model, img, 'block4_conv1', activation=False)[0]
print(np.sum(a_no_relu < 0))
> 245293
Set all negatives to zero to compare with the results retrieved after an embedded in VGG16 ReLu operation.
a_no_relu[a_no_relu < 0] = 0
print(np.allclose(a_relu, a_no_relu))
> True
easy way to define new layer with new activation function:
def change_layer_activation(layer):
if isinstance(layer, keras.layers.Conv2D):
config = layer.get_config()
config["activation"] = "linear"
new = keras.layers.Conv2D.from_config(config)
elif isinstance(layer, keras.layers.Dense):
config = layer.get_config()
config["activation"] = "linear"
new = keras.layers.Dense.from_config(config)
weights = [x.numpy() for x in layer.weights]
return new, weights
I had the same problem but none of the other answers worked for me. Im using a newer version of Keras with Tensorflow so some answers dont work now. Also the structure of the model is given so i can't change it easely. The general idea is to create a copy of the original model that will work exactly like the original one but spliting the activation from the outputs layers. Once this is done we can easely access the outputs values before the activation is applied.
First we will create a copy of the original model but with no activation on the outputs layers. This will be done using Keras clone_model function (See Docs).
from tensorflow.keras.models import clone_model
from tensorflow.keras.layers import Activation
original_model = get_model()
def f(layer):
config = layer.get_config()
if not isinstance(layer, Activation) and layer.name in original_model.output_names:
config.pop('activation', None)
layer_copy = layer.__class__.from_config(config)
return layer_copy
copy_model = clone_model(model, clone_function=f)
This alone will only make a clone with new weights so we must copy the original_model weights to the new one:
copy_model.build(original_model.input_shape)
copy_model.set_weights(original_model.get_weights())
Now we will add the activations layers:
from tensorflow.keras.models import Model
old_outputs = [ original_model.get_layer(name=name) for name in copy_model.output_names ]
new_outputs = [ Activation(old_output.activation)(output) if old_output.activation else output
for output, old_output in zip(copy_model.outputs, old_outputs) ]
copy_model = Model(copy_model.inputs, new_outputs)
Finally we could create a new model whose evaluation will be the outputs with no activation applied:
no_activation_outputs = [ copy_model.get_layer(name=name).output for name in original_model.output_names ]
no_activation_model = Model(copy.inputs, no_activation_outputs)
Now we could use copy_model like the original_model and no_activation_model to access pre-activation outputs. Actually you could even modify the code to split a custom set of layers instead of the outputs.
a question concerning keras regression with multiple outputs:
Could you explain the difference beteween this net:
two inputs -> two outputs
input = Input(shape=(2,), name='bla')
hidden = Dense(hidden, activation='tanh', name='bla')(input)
output = Dense(2, activation='tanh', name='bla')(hidden)
and: two single inputs -> two single outputs:
input = Input(shape=(2,), name='speed_input')
hidden = Dense(hidden_dim, activation='tanh', name='hidden')(input)
output = Dense(1, activation='tanh', name='bla')(hidden)
input_2 = Input(shape=(1,), name='angle_input')
hidden_2 = Dense(hidden_dim, activation='tanh', name='hidden')(input_2)
output_2 = Dense(1, activation='tanh', name='bla')(hidden_2)
model = Model(inputs=[speed_input, angle_input], outputs=[speed_output, angle_output])
They behave very similar. Other when I completly seperate them, then the two nets behave like they re supposed to.
And is it normal that two single output nets behave much more intelligible than a bigger one with two outputs, I didnt think the difference could be huge like I experienced.
Thanks a lot :)
This goes back to how neural networks operate. In your first model, each hidden neuron receives 2 input values (as it is a 'Dense' layer, the input propagates to every neuron). In your second model, you have twice as many neurons, but each of these only receive either speed_input or angle_input, and only works with that data instead of the entire data.
So, if speed_input and angle_input are 2 completely unrelated attributes, you'll likely see better performance from splitting the 2 models, since neurons aren't receiving what is basically noise input (they don't know that your outputs correspond to your inputs, they can only try to optimize your loss function). Essentially, you're creating 2 seperate models.
But in most cases, you want to feed the model relevant attributes that combine to draw a prediction. So splitting the model then would not make sense, since you're just stripping neccessary information away.