Keras backend function: InvalidArgumentError - python

I can't get keras.backend.function to work properly. I'm trying to follow this post:
How to calculate prediction uncertainty using Keras?
In this post they create a function f:
f = K.function([model.layers[0].input],[model.layers[-1].output]) #(I actually simplified the function a little bit).
In my neural network I have 3 inputs. When I try to compute f([[3], [23], [0.0]]) I get this error:
InvalidArgumentError: You must feed a value for placeholder tensor 'input_3' with dtype float and shape [?,1]
[[{{node input_3}} = Placeholder[dtype=DT_FLOAT, shape=[?,1], _device="/job:localhost/replica:0/task:0/device:CPU:0"]
Now I know using [[3], [23], [0.0]] as an input in my model doesn't give me an error during the testing phase. Can anyone tell me where I'm going wrong?
This is what my model looks like if it matters:
home_in = Input(shape=(1,))
away_in = Input(shape=(1,))
time_in = Input(shape = (1,))
embed_home = Embedding(input_dim = in_dim, output_dim = out_dim, input_length = 1)
embed_away = Embedding(input_dim = in_dim, output_dim = out_dim, input_length = 1)
embedding_home = Flatten()(embed_home(home_in))
embedding_away = Flatten()(embed_away(away_in))
keras.backend.set_learning_phase(1) #this will keep dropout on during the testing phase
model_layers = Dense(units=2)\
(Dropout(0.3)\
(Dense(units=64, activation = "relu")\
(Dropout(0.3)\
(Dense(units=64, activation = "relu")\
(Dropout(0.3)\
(Dense(units=64, activation = "relu")\
(concatenate([embedding_home, embedding_away, time_in]))))))))
model = Model(inputs=[home_in, away_in, time_in], outputs=model_layers)

The function you defined uses only one of the input layers (i.e. model.layers[0].input) as its input, but the model cannot run unless all of its inputs are fed. The model has inputs and outputs attributes, which let you include every input and output with less verbosity:
f = K.function(model.inputs, model.outputs)
Update: The shape of all the input arrays must be (num_samples, 1). Therefore, you need to pass a list of lists (e.g. [[3]]) instead of a list (e.g. [3]):
outs = f([[[3]], [[23]], [[0.0]]])
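To make the shape requirement concrete, here is a small NumPy sketch. The helper name as_column is hypothetical (it is not part of Keras); it only turns per-sample values into arrays of shape (num_samples, 1), which is the form each of the three inputs expects:

```python
import numpy as np

def as_column(values):
    """Turn a flat list of per-sample values into a (num_samples, 1) array."""
    return np.asarray(values, dtype="float32").reshape(-1, 1)

# One sample per input: home team id, away team id, time.
home = as_column([3])
away = as_column([23])
time = as_column([0.0])

batch = [home, away, time]
print([a.shape for a in batch])  # [(1, 1), (1, 1), (1, 1)]
```

A list built this way could be passed to f in place of the nested lists, and the same helper works for more than one sample at a time.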

Related

How to use a batch_size of Keras tensor at the model building time?

I want to use an external program as a custom operation.
Because automatic differentiation is not available for it, I wrote code that supplies the gradients numerically. Since it has to compute one set of derivatives per sample in the batch, it reads batch_size from the shape of x.
The following example uses a numpy function as a stand-in for the external program:
f(x) = np.sum(x**2)
(In fact, this simple numpy function needs no loop over batch_size, but the code is written for a general external function.)
@tf.custom_gradient
def custom_op(x):
    # without using numpy, use external function
    # assume x shape = (batch_size,3)
    batch_size = x.shape[0]
    input_length = x.shape[1]
    # assert input_length==3
    yout = []  # shape should be (batch_size,1)
    gout = []  # shape should be (batch_size,3)
    for i in range(batch_size):
        inputs = x[i,:]  # shape (3,)
        y = np.sum(inputs**2)  # scalar
        yout.append(y)
        # compute differences
        dy = []
        for j in range(len(inputs)):
            delta = np.zeros_like(inputs)
            delta[j] = np.abs(inputs[j])*0.001
            yplus = np.sum((inputs + delta)**2)  # change only j-th input
            grad = (yplus-y)/delta[j]  # scalar
            dy.append(grad)
        gout.append(dy)
    yout = tf.convert_to_tensor(yout,dtype='float32')  # (batch_size,)
    yout = tf.reshape(yout,shape=(batch_size,1))  # (batch_size,1)
    gout = tf.convert_to_tensor(gout,dtype='float32')  # (batch_size,input_length)
    gout = tf.reshape(gout,shape=(batch_size,input_length))  # (batch_size,input_length)
    def grad(upstream):
        return upstream*gout
    return yout, grad
x = tf.Variable([[1.,2.,3.],[2.,3.,4.]],dtype='float32')
with tf.GradientTape() as tape:
    y = custom_op(x)
tape.gradient(y,x)
I ran this and it works.
However, when I try to use it in a Keras model, for example,
def construct_model():
    inputs = tf.keras.Input(shape=(3,)) #input array
    x = tf.keras.layers.Dense(1)(inputs)
    outputs = custom_op(x)
    model = tf.keras.Model(inputs=inputs, outputs=outputs)
    optimizer = 'adam'
    model.compile(loss='mean_squared_error',
                  optimizer=optimizer,
                  metrics=['mean_absolute_error', 'mean_squared_error'])
    return model
model = construct_model()
it raises an error, because the KerasTensor "inputs" has no specified batch_size.
I tried to specify the batch size with "tf.keras.Input(shape=(3,),batch_size=2)".
However, that also raises an error, because custom_op is still applied to a KerasTensor.
How should I change custom_op to make it compatible with Keras?
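For reference, the forward-difference scheme described in the question can be sketched in plain NumPy, independent of TensorFlow. This is only an illustration of the numerical gradient, not a fix for the Keras error. The names external_fn and forward_difference_grad are hypothetical, and the step rule differs slightly from the question's code: it uses max(|x|, 1) so that the step is never zero when an input is exactly 0.

```python
import numpy as np

def external_fn(v):
    # Stand-in for the external program: f(v) = sum(v**2).
    return float(np.sum(v ** 2))

def forward_difference_grad(fn, x, rel_step=1e-3):
    """Approximate d fn / d x row by row for x of shape (batch_size, n)."""
    x = np.asarray(x, dtype="float64")
    grads = np.zeros_like(x)
    for i in range(x.shape[0]):
        y0 = fn(x[i])
        for j in range(x.shape[1]):
            # Relative step, guarded so it cannot be zero when x[i, j] == 0.
            h = rel_step * max(abs(x[i, j]), 1.0)
            shifted = x[i].copy()
            shifted[j] += h  # perturb only the j-th input
            grads[i, j] = (fn(shifted) - y0) / h
    return grads

x = np.array([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
g = forward_difference_grad(external_fn, x)
# The analytic gradient of sum(v**2) is 2*v; the estimate is close to it.
print(np.allclose(g, 2 * x, rtol=1e-2))  # True
```

Comparing the estimate with the known gradient 2*x is a cheap sanity check before wiring such a function into a custom gradient.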

Create an LSTM layer with Attention in Keras for multi-label text classification neural network

Greetings, dear members of the community. I am creating a neural network to predict a multi-label y. Specifically, the neural network takes 5 inputs (list of actors, plot summary, movie features, movie reviews, title) and tries to predict the sequence of movie genres. In the neural network I use Embedding layers and Global Max Pooling layers.
However, I recently discovered recurrent layers with attention, which are a very interesting topic these days in machine translation. So I wondered whether I could use one of those layers, but only on the Plot Summary input. Note that I am not doing machine translation but text classification.
My neural network in its current state
def create_fit_keras_model(hparams,
                           version_data_control,
                           optimizer_name,
                           validation_method,
                           callbacks,
                           optimizer_version = None):
    sentenceLength_actors = X_train_seq_actors.shape[1]
    vocab_size_frequent_words_actors = len(actors_tokenizer.word_index)
    sentenceLength_plot = X_train_seq_plot.shape[1]
    vocab_size_frequent_words_plot = len(plot_tokenizer.word_index)
    sentenceLength_features = X_train_seq_features.shape[1]
    vocab_size_frequent_words_features = len(features_tokenizer.word_index)
    sentenceLength_reviews = X_train_seq_reviews.shape[1]
    vocab_size_frequent_words_reviews = len(reviews_tokenizer.word_index)
    sentenceLength_title = X_train_seq_title.shape[1]
    vocab_size_frequent_words_title = len(title_tokenizer.word_index)
    model = keras.Sequential(name='{0}_{1}dim_{2}batchsize_{3}lr_{4}decaymultiplier_{5}'.format(sequential_model_name,
        str(hparams[HP_EMBEDDING_DIM]),
        str(hparams[HP_HIDDEN_UNITS]),
        str(hparams[HP_LEARNING_RATE]),
        str(hparams[HP_DECAY_STEPS_MULTIPLIER]),
        version_data_control))
    actors = keras.Input(shape=(sentenceLength_actors,), name='actors_input')
    plot = keras.Input(shape=(sentenceLength_plot,), batch_size=hparams[HP_HIDDEN_UNITS], name='plot_input')
    features = keras.Input(shape=(sentenceLength_features,), name='features_input')
    reviews = keras.Input(shape=(sentenceLength_reviews,), name='reviews_input')
    title = keras.Input(shape=(sentenceLength_title,), name='title_input')
    emb1 = layers.Embedding(input_dim = vocab_size_frequent_words_actors + 2,
                            output_dim = 16, #hparams[HP_EMBEDDING_DIM], hyperparametered or fixed sized.
                            embeddings_initializer = 'uniform',
                            mask_zero = True,
                            input_length = sentenceLength_actors,
                            name="actors_embedding_layer")(actors)
    # encoded_layer1 = layers.GlobalAveragePooling1D(name="globalaveragepooling_actors_layer")(emb1)
    encoded_layer1 = layers.GlobalMaxPooling1D(name="globalmaxpooling_actors_layer")(emb1)
    emb2 = layers.Embedding(input_dim = vocab_size_frequent_words_plot + 2,
                            output_dim = hparams[HP_EMBEDDING_DIM],
                            embeddings_initializer = 'uniform',
                            mask_zero = True,
                            input_length = sentenceLength_plot,
                            name="plot_embedding_layer")(plot)
    # (Option 1)
    # encoded_layer2 = layers.GlobalMaxPooling1D(name="globalmaxpooling_plot_summary_Layer")(emb2)
    # (Option 2)
    emb2 = layers.Bidirectional(layers.LSTM(hparams[HP_EMBEDDING_DIM], return_sequences=True))(emb2)
    avg_pool = layers.GlobalAveragePooling1D()(emb2)
    max_pool = layers.GlobalMaxPooling1D()(emb2)
    conc = layers.concatenate([avg_pool, max_pool])
    # (Option 3)
    # emb2 = layers.Bidirectional(layers.LSTM(hparams[HP_EMBEDDING_DIM], return_sequences=True))(emb2)
    # emb2 = layers.Bidirectional(layers.LSTM(hparams[HP_EMBEDDING_DIM], return_sequences=True))(emb2)
    # emb2 = AttentionWithContext()(emb2)
    emb3 = layers.Embedding(input_dim = vocab_size_frequent_words_features + 2,
                            output_dim = hparams[HP_EMBEDDING_DIM],
                            embeddings_initializer = 'uniform',
                            mask_zero = True,
                            input_length = sentenceLength_features,
                            name="features_embedding_layer")(features)
    # encoded_layer3 = layers.GlobalAveragePooling1D(name="globalaveragepooling_movie_features_layer")(emb3)
    encoded_layer3 = layers.GlobalMaxPooling1D(name="globalmaxpooling_movie_features_layer")(emb3)
    emb4 = layers.Embedding(input_dim = vocab_size_frequent_words_reviews + 2,
                            output_dim = hparams[HP_EMBEDDING_DIM],
                            embeddings_initializer = 'uniform',
                            mask_zero = True,
                            input_length = sentenceLength_reviews,
                            name="reviews_embedding_layer")(reviews)
    # encoded_layer4 = layers.GlobalAveragePooling1D(name="globalaveragepooling_user_reviews_layer")(emb4)
    encoded_layer4 = layers.GlobalMaxPooling1D(name="globalmaxpooling_user_reviews_layer")(emb4)
    emb5 = layers.Embedding(input_dim = vocab_size_frequent_words_title + 2,
                            output_dim = hparams[HP_EMBEDDING_DIM],
                            embeddings_initializer = 'uniform',
                            mask_zero = True,
                            input_length = sentenceLength_title,
                            name="title_embedding_layer")(title)
    # encoded_layer5 = layers.GlobalAveragePooling1D(name="globalaveragepooling_movie_title_layer")(emb5)
    encoded_layer5 = layers.GlobalMaxPooling1D(name="globalmaxpooling_movie_title_layer")(emb5)
    merged = layers.concatenate([encoded_layer1, conc, encoded_layer3, encoded_layer4, encoded_layer5], axis=-1) #(Option 2)
    # merged = layers.concatenate([encoded_layer1, emb2, encoded_layer3, encoded_layer4, encoded_layer5], axis=-1) #(Option 3)
    dense_layer_1 = layers.Dense(hparams[HP_HIDDEN_UNITS],
                                 kernel_regularizer=regularizers.l2(neural_network_parameters['l2_regularization']),
                                 activation=neural_network_parameters['dense_activation'],
                                 name="1st_dense_hidden_layer_concatenated_inputs")(merged)
    layers.Dropout(neural_network_parameters['dropout_rate'])(dense_layer_1)
    output_layer = layers.Dense(neural_network_parameters['number_target_variables'],
                                activation=neural_network_parameters['output_activation'],
                                name='output_layer')(dense_layer_1)
    model = keras.Model(inputs=[actors, plot, features, reviews, title], outputs=output_layer, name='{0}_{1}dim_{2}batchsize_{3}lr_{4}decaymultiplier_{5}'.format(sequential_model_name,
        str(hparams[HP_EMBEDDING_DIM]),
        str(hparams[HP_HIDDEN_UNITS]),
        str(hparams[HP_LEARNING_RATE]),
        str(hparams[HP_DECAY_STEPS_MULTIPLIER]),
        version_data_control))
    print(model.summary())
    # pruning_schedule = tfmot.sparsity.keras.PolynomialDecay(initial_sparsity=0.0,
    #                                                         final_sparsity=0.4,
    #                                                         begin_step=600,
    #                                                         end_step=1000)
    # model_for_pruning = tfmot.sparsity.keras.prune_low_magnitude(model, pruning_schedule=pruning_schedule)
    if optimizer_name=="adam" and optimizer_version is None:
        optimizer = optimizer_adam_v2(hparams)
    elif optimizer_name=="sgd" and optimizer_version is None:
        optimizer = optimizer_sgd_v1(hparams, "no decay")
    elif optimizer_name=="rmsprop" and optimizer_version is None:
        optimizer = optimizer_rmsprop_v1(hparams)
    print("here: {0}".format(optimizer.lr))
    lr_metric = [get_lr_metric(optimizer)]
    if type(get_lr_metric(optimizer)) in (float, int):
        print("Learning Rate's type is Float or Integer")
        model.compile(optimizer=optimizer,
                      loss=neural_network_parameters['model_loss'],
                      metrics=neural_network_parameters['model_metric'] + lr_metric, )
    else:
        print("Learning Rate's type is not Float or Integer, but rather {0}".format(type(lr_metric)))
        model.compile(optimizer=optimizer,
                      loss=neural_network_parameters['model_loss'],
                      metrics=neural_network_parameters['model_metric'], ) #+ lr_metric
As you can see in the structure above, I have 5 input layers and 5 Embedding layers, and I apply a Bidirectional LSTM layer only to the Plot Summary input.
However, with the current bidirectional approach on the Plot Summary, I got the following error. My question is how to use attention in text classification, not how to solve the error below, so please don't post solutions to that error.
I am looking for suggestions on how to create a recurrent layer with attention for the plot summary (input 2). Also, feel free to mention in the comments any article that might help me achieve this in Keras.
I remain at your disposal if any additional information is required regarding the structure of the neural network.
If you find the above neural network complicated, I can make a simpler version of it. However, the above is my original neural network, so I want any proposals to be based on that network.
EDIT: 14.12.2020
Find here the Colab notebook with the code I want to execute. The code includes two answers: one proposed in the comments (from an already answered question), and the other written as an official answer to my question.
The first approach, proposed by @MarcoCerliani, works. However, I would also like the second approach, by @Allohvk, to work (both approaches are implemented in Runtime cell [21] of the attached Colab). It does not work at the moment. The latest error I get is:
ValueError: Input 0 of layer globalmaxpooling_plot_summary_Layer is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 100]
I solved the latest error of my edit by removing the globalmaxpooling_plot_summary_Layer from my neural network's structure.
Let me summarize the intent. You want to add attention to your code. Yours is a sequence classification task and not a seq-seq translator. You don't really care much about the way it is done, so you are OK with not debugging the error above and just need a working piece of code. Our main input here is the movie reviews, consisting of 'n' words, to which you want to add attention.
Assume you embed the reviews and pass it to an LSTM layer. Now you want to 'attend' to all the hidden states of the LSTM layer and then generate a classification (instead of just using the last hidden state of the encoder). So an attention layer needs to be inserted. A barebones implementation would look like this:
class peel_the_layer(tf.keras.layers.Layer):
    def __init__(self):
        ##Nothing special to be done here
        super(peel_the_layer, self).__init__()
    def build(self, input_shape):
        ##Define the shape of the weights and bias in this layer
        ##This is a 1 unit layer.
        units=1
        ##last index of the input_shape is the number of dimensions of the prev
        ##RNN layer. last but 1 index is the num of timesteps
        self.w=self.add_weight(name="att_weights", shape=(input_shape[-1], units), initializer="normal") #name property is useful for avoiding RuntimeError: Unable to create link.
        self.b=self.add_weight(name="att_bias", shape=(input_shape[-2], units), initializer="zeros")
        super(peel_the_layer,self).build(input_shape)
    def call(self, x):
        ##x is the input tensor..each word that needs to be attended to
        ##Below is the main processing done during training
        ##K is the Keras Backend import
        e = K.tanh(K.dot(x,self.w)+self.b)
        a = K.softmax(e, axis=1)
        output = x*a
        ##return the outputs. 'a' is the set of attention weights
        ##the second variable is the 'attention adjusted o/p state' or context
        return a, K.sum(output, axis=1)
Now call the above Attention layer after your LSTM and before your Dense output layer.
a, context = peel_the_layer()(lstm_out)
##context is the o/p which will be the input to your classification layer
##a is the set of attention weights and you may want to route them to a display
You can build on top of this, as you seem to want to use other features apart from the movie reviews to come up with the final sentiment. Attention largely applies to the reviews, and the benefits show up when the sentences are very long.
For more specific details, please refer to https://towardsdatascience.com/create-your-own-custom-attention-layer-understand-all-flavours-2201b5e8be9e
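To see what the layer computes and which shapes come out, the same arithmetic can be sketched in plain NumPy. This is an illustration under assumptions, not the Keras layer itself: attention_pool is a hypothetical name, and the random x, w and b stand in for the LSTM output and the trained weights.

```python
import numpy as np

def attention_pool(x, w, b):
    """x: (batch, timesteps, features); w: (features, 1); b: (timesteps, 1)."""
    e = np.tanh(x @ w + b)                      # scores, (batch, timesteps, 1)
    e = e - e.max(axis=1, keepdims=True)        # shift for numerical stability
    a = np.exp(e) / np.exp(e).sum(axis=1, keepdims=True)  # softmax over time
    context = (x * a).sum(axis=1)               # weighted sum, (batch, features)
    return a, context

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 5, 4))   # 2 samples, 5 timesteps, 4 features
w = rng.normal(size=(4, 1))
b = np.zeros((5, 1))

a, context = attention_pool(x, w, b)
print(a.shape, context.shape)  # (2, 5, 1) (2, 4)
```

The weights a sum to 1 along the time axis for every sample, and context has no time axis left, which is why it can feed a Dense layer directly and does not need a pooling layer afterwards.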

How to get logits from a sequential model in keras/tensorflow? [duplicate]

I have trained a binary classification model with CNN, and here is my code
model = Sequential()
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
border_mode='valid',
input_shape=input_shape))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
# (16, 16, 32)
model.add(Convolution2D(nb_filters*2, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters*2, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
# (8, 8, 64) = (2048)
model.add(Flatten())
model.add(Dense(1024))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(2)) # define a binary classification problem
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adadelta',
metrics=['accuracy'])
model.fit(x_train, y_train,
batch_size=batch_size,
nb_epoch=nb_epoch,
verbose=1,
validation_data=(x_test, y_test))
Now I want to get the output of each layer, just like in TensorFlow. How can I do that?
You can easily get the outputs of any layer by using: model.layers[index].output
For all layers use this:
from keras import backend as K
inp = model.input # input placeholder
outputs = [layer.output for layer in model.layers] # all layer outputs
functors = [K.function([inp, K.learning_phase()], [out]) for out in outputs] # evaluation functions
# Testing
test = np.random.random(input_shape)[np.newaxis,...]
layer_outs = [func([test, 1.]) for func in functors]
print(layer_outs)
Note: To simulate Dropout use learning_phase as 1. in layer_outs otherwise use 0.
Edit: (based on comments)
K.function creates Theano/TensorFlow tensor functions, which are later used to get the output from the symbolic graph given the input.
K.learning_phase() is required as an input because many Keras layers, such as Dropout and BatchNormalization, depend on it to change their behavior between training and test time.
So if you remove the dropout layer in your code you can simply use:
from keras import backend as K
inp = model.input # input placeholder
outputs = [layer.output for layer in model.layers] # all layer outputs
functors = [K.function([inp], [out]) for out in outputs] # evaluation functions
# Testing
test = np.random.random(input_shape)[np.newaxis,...]
layer_outs = [func([test]) for func in functors]
print(layer_outs)
Edit 2: More optimized
I just realized that the previous answer is not that optimized: for each function evaluation the data is transferred from CPU to GPU memory, and the tensor calculations for the lower layers are repeated over and over.
Instead this is a much better way as you don't need multiple functions but a single function giving you the list of all outputs:
from keras import backend as K
inp = model.input # input placeholder
outputs = [layer.output for layer in model.layers] # all layer outputs
functor = K.function([inp, K.learning_phase()], outputs ) # evaluation function
# Testing
test = np.random.random(input_shape)[np.newaxis,...]
layer_outs = functor([test, 1.])
print(layer_outs)
From https://keras.io/getting-started/faq/#how-can-i-obtain-the-output-of-an-intermediate-layer
One simple way is to create a new Model that will output the layers that you are interested in:
from keras.models import Model
model = ... # include here your original model
layer_name = 'my_layer'
intermediate_layer_model = Model(inputs=model.input,
outputs=model.get_layer(layer_name).output)
intermediate_output = intermediate_layer_model.predict(data)
Alternatively, you can build a Keras function that will return the output of a certain layer given a certain input, for example:
from keras import backend as K
# with a Sequential model
get_3rd_layer_output = K.function([model.layers[0].input],
[model.layers[3].output])
layer_output = get_3rd_layer_output([x])[0]
Based on all the good answers of this thread, I wrote a library to fetch the output of each layer. It abstracts all the complexity and has been designed to be as user-friendly as possible:
https://github.com/philipperemy/keract
It handles almost all the edge cases.
Hope it helps!
Following looks very simple to me:
model.layers[idx].output
Above is a tensor object, so you can modify it using operations that can be applied to a tensor object.
For example, to get the shape model.layers[idx].output.get_shape()
idx is the index of the layer and you can find it from model.summary()
This answer is based on: https://stackoverflow.com/a/59557567/2585501
To print the output of a single layer:
from tensorflow.keras import backend as K
layerIndex = 1
func = K.function([model.get_layer(index=0).input], model.get_layer(index=layerIndex).output)
layerOutput = func([input_data]) # input_data is a numpy array
print(layerOutput)
To print output of every layer:
from tensorflow.keras import backend as K
for layerIndex, layer in enumerate(model.layers):
    func = K.function([model.get_layer(index=0).input], layer.output)
    layerOutput = func([input_data]) # input_data is a numpy array
    print(layerOutput)
I wrote this function for myself (in Jupyter) and it was inspired by indraforyou's answer. It will plot all the layer outputs automatically. Your images must have a (x, y, 1) shape where 1 stands for 1 channel. You just call plot_layer_outputs(...) to plot.
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
from keras import backend as K
def get_layer_outputs():
    test_image = ...  # YOUR IMAGE GOES HERE, with a leading batch axis
    outputs = [layer.output for layer in model.layers] # all layer outputs
    comp_graph = [K.function([model.input]+ [K.learning_phase()], [output]) for output in outputs] # evaluation functions
    # Testing
    layer_outputs_list = [op([test_image, 1.]) for op in comp_graph]
    layer_outputs = []
    for layer_output in layer_outputs_list:
        print(layer_output[0][0].shape, end='\n-------------------\n')
        layer_outputs.append(layer_output[0][0])
    return layer_outputs
def plot_layer_outputs(layer_number):
    layer_outputs = get_layer_outputs()
    x_max = layer_outputs[layer_number].shape[0]
    y_max = layer_outputs[layer_number].shape[1]
    n = layer_outputs[layer_number].shape[2]
    L = []
    for i in range(n):
        L.append(np.zeros((x_max, y_max)))
    for i in range(n):
        for x in range(x_max):
            for y in range(y_max):
                L[i][x][y] = layer_outputs[layer_number][x][y][i]
    for img in L:
        plt.figure()
        plt.imshow(img, interpolation='nearest')
From: https://github.com/philipperemy/keras-visualize-activations/blob/master/read_activations.py
import keras.backend as K
def get_activations(model, model_inputs, print_shape_only=False, layer_name=None):
    print('----- activations -----')
    activations = []
    inp = model.input
    model_multi_inputs_cond = True
    if not isinstance(inp, list):
        # only one input! let's wrap it in a list.
        inp = [inp]
        model_multi_inputs_cond = False
    outputs = [layer.output for layer in model.layers if
               layer.name == layer_name or layer_name is None] # all layer outputs
    funcs = [K.function(inp + [K.learning_phase()], [out]) for out in outputs] # evaluation functions
    if model_multi_inputs_cond:
        list_inputs = []
        list_inputs.extend(model_inputs)
        list_inputs.append(0.)
    else:
        list_inputs = [model_inputs, 0.]
    # Learning phase. 0 = Test mode (no dropout or batch normalization)
    # layer_outputs = [func([model_inputs, 0.])[0] for func in funcs]
    layer_outputs = [func(list_inputs)[0] for func in funcs]
    for layer_activations in layer_outputs:
        activations.append(layer_activations)
        if print_shape_only:
            print(layer_activations.shape)
        else:
            print(layer_activations)
    return activations
Previous solutions were not working for me. I handled this issue as shown below.
layer_outputs = []
for i in range(1, len(model.layers)):
    tmp_model = Model(model.layers[0].input, model.layers[i].output)
    tmp_output = tmp_model.predict(img)[0]
    layer_outputs.append(tmp_output)
Wanted to add this as a comment (but don't have high enough rep.) to @indraforyou's answer to correct for the issue mentioned in @mathtick's comment. To avoid the InvalidArgumentError: input_X:Y is both fed and fetched. exception, simply replace the line outputs = [layer.output for layer in model.layers] with outputs = [layer.output for layer in model.layers][1:], i.e.
adapting indraforyou's minimal working example:
from keras import backend as K
inp = model.input # input placeholder
outputs = [layer.output for layer in model.layers][1:] # all layer outputs except first (input) layer
functor = K.function([inp, K.learning_phase()], outputs ) # evaluation function
# Testing
test = np.random.random(input_shape)[np.newaxis,...]
layer_outs = functor([test, 1.])
print(layer_outs)
p.s. my attempts trying things such as outputs = [layer.output for layer in model.layers[1:]] did not work.
Assuming you have:
1- A pre-trained Keras model.
2- An input x: an image or a set of images. The image resolution should be compatible with the dimensions of the input layer, for example 80*80*3 for a 3-channel (RGB) image.
3- The name of the output layer whose activations you want, for example the "flatten_2" layer. It must be one of the names in the layer_names variable, which holds the names of the layers of the given model.
4- batch_size, an optional argument.
Then you can use the get_activations function to get the activations of that layer for a given input x and pre-trained model:
import six
import numpy as np
import keras.backend as k
from numpy import float32
def get_activations(x, model, layer, batch_size=128):
    """
    Return the output of the specified layer for input `x`. `layer` is specified by layer index (between 0 and
    `nb_layers - 1`) or by name. The number of layers can be determined by counting the results returned by
    calling `layer_names`.
    :param x: Input for computing the activations.
    :type x: `np.ndarray`. Example: x.shape = (80, 80, 3)
    :param model: pre-trained Keras model. Including weights.
    :type model: keras.engine.sequential.Sequential. Example: model.input_shape = (None, 80, 80, 3)
    :param layer: Layer for computing the activations
    :type layer: `int` or `str`. Example: layer = 'flatten_2'
    :param batch_size: Size of batches.
    :type batch_size: `int`
    :return: The output of `layer`, where the first dimension is the batch size corresponding to `x`.
    :rtype: `np.ndarray`. Example: activations.shape = (1, 2000)
    """
    layer_names = [layer.name for layer in model.layers]
    if isinstance(layer, six.string_types):
        if layer not in layer_names:
            raise ValueError('Layer name %s is not part of the graph.' % layer)
        layer_name = layer
    elif isinstance(layer, int):
        if layer < 0 or layer >= len(layer_names):
            raise ValueError('Layer index %d is outside of range (0 to %d included).'
                             % (layer, len(layer_names) - 1))
        layer_name = layer_names[layer]
    else:
        raise TypeError('Layer must be of type `str` or `int`.')
    layer_output = model.get_layer(layer_name).output
    layer_input = model.input
    output_func = k.function([layer_input], [layer_output])
    # Apply preprocessing
    if x.shape == k.int_shape(model.input)[1:]:
        x_preproc = np.expand_dims(x, 0)
    else:
        x_preproc = x
    assert len(x_preproc.shape) == 4
    # Determine shape of expected output and prepare array
    output_shape = output_func([x_preproc[0][None, ...]])[0].shape
    activations = np.zeros((x_preproc.shape[0],) + output_shape[1:], dtype=float32)
    # Get activations with batching
    for batch_index in range(int(np.ceil(x_preproc.shape[0] / float(batch_size)))):
        begin, end = batch_index * batch_size, min((batch_index + 1) * batch_size, x_preproc.shape[0])
        activations[begin:end] = output_func([x_preproc[begin:end]])[0]
    return activations
If you run into either of the following:
- the error "InvalidArgumentError: input_X:Y is both fed and fetched"
- a model with multiple inputs
you need to make these changes:
- filter the input layers out of the outputs variable
- make a minor change to the functors loop
Minimal example:
from keras.engine.input_layer import InputLayer
inp = model.input
outputs = [layer.output for layer in model.layers if not isinstance(layer, InputLayer)]
functors = [K.function(inp + [K.learning_phase()], [x]) for x in outputs]
layer_outputs = [fun([x1, x2, xn, 1]) for fun in functors]
Well, other answers are very complete, but there is a very basic way to "see", not to "get" the shapes.
Just do a model.summary(). It will print all layers and their output shapes. "None" values will indicate variable dimensions, and the first dimension will be the batch size.
Generally, output size can be calculated as
[(W−K+2P)/S]+1
where
W is the input volume - in your case you have not given us this
K is the Kernel size - in your case 2 == "filter"
P is the padding - in your case 2
S is the stride - in your case 3
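As a quick check of the formula, here is a minimal sketch that assumes the square brackets denote rounding down (floor). The function name conv_output_size is hypothetical, and W = 28 is only an illustrative value, since the input size was not given:

```python
def conv_output_size(w, k, p, s):
    """Output length along one axis: floor((W - K + 2P) / S) + 1."""
    return (w - k + 2 * p) // s + 1

# Values from the answer (K=2, P=2, S=3) with an illustrative W=28.
print(conv_output_size(28, 2, 2, 3))  # 11

# A 3x3 kernel with no padding and stride 1 shrinks 32 to 30.
print(conv_output_size(32, 3, 0, 1))  # 30
```

The same function is applied once per spatial axis when the height and width differ.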

How to get attention weights in hierarchical model

Model :
sequence_input = Input(shape=(MAX_SENT_LENGTH,), dtype='int32')
words = embedding_layer(sequence_input)
h_words = Bidirectional(GRU(200, return_sequences=True,dropout=0.2,recurrent_dropout=0.2))(words)
sentence = Attention()(h_words) #with return true
#sentence = Dropout(0.2)(sentence)
sent_encoder = Model(sequence_input, sentence[0])
print(sent_encoder.summary())
document_input = Input(shape=(None, MAX_SENT_LENGTH), dtype='int32')
document_enc = TimeDistributed(sent_encoder)(document_input)
h_sentences = Bidirectional(GRU(100, return_sequences=True))(document_enc)
preds = Dense(7, activation='softmax')(h_sentences)
model = Model(document_input, preds)
Attention layer used:
https://gist.github.com/cbaziotis/6428df359af27d58078ca5ed9792bd6d
with return_attention=True
How can I visualise the attention weights for a new input once the model is trained?
What I am trying:
get_3rd_layer_output = K.function([model.layers[0].input,K.learning_phase()],
[model.layers[1].layer.layers[3].output])
and passing a new input, but it gives me an error.
Possible reason:
model.layers only gives me the last layers. I want to get the weights from the TimeDistributed part.
You can use the following to display all the layers in your model:
print(model.layers)
Once you know the index of your TimeDistributed layer, say 3, use the following to get the layer's config and weights.
g = model_name.layers[3].get_config()
h = model_name.layers[3].get_weights()
print(g)
print(h)

Iterate over a tensor dimension in Tensorflow

I am trying to develop a seq2seq model from a low-level perspective (creating all the needed tensors myself). I am trying to feed the model a sequence of vectors as a two-dimensional tensor, but I can't iterate over one dimension of the tensor to extract the vectors one by one. Does anyone know how I could feed a batch of vectors and later get them one by one?
This is my code:
batch_size = 100
hidden_dim = 5
input_dim = embedding_dim
time_size = 5
input_sentence = tf.placeholder(dtype=tf.float64, shape=[embedding_dim,None], name='input')
output_sentence = tf.placeholder(dtype=tf.float64, shape=[embedding_dim,None], name='output')
input_array = np.asarray(input_sentence)
output_array = np.asarray(output_sentence)
gru_layer1 = GRU(input_array, input_dim, hidden_dim) #This is a class created by myself
for i in range(input_array.shape[-1]):
    word = input_array[:,i]
    previous_state = gru_encoder.h_t
    gru_layer1.forward_pass(previous_state,word)
And this is the error that I get
TypeError: Expected binary or unicode string, got <tf.Tensor 'input_7:0' shape=(10, ?) dtype=float64>
Tensorflow does deferred execution.
You usually can't know how big the vector will be (words in a sentence, audio samples, etc.). The common approach is to cap the length at some reasonably large value and then pad the shorter sequences with an empty token.
Once you do this you can select the data for a time slice with the slice operator:
data = tf.placeholder(tf.float64, shape=(batch_size, max_size, number_of_inputs))
....
for i in range(max_size):
    time_data = data[:, i, :]
    DoStuff(time_data)
Also look up tf.transpose for swapping batch and time indices. It can help with performance in certain cases.
Alternatively consider something like tf.nn.static_rnn or tf.nn.dynamic_rnn to do the boilerplate stuff for you.
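The cap-and-pad step can be sketched in plain Python before any tensors are involved. The helper name pad_to_length and the pad token 0 are assumptions made for illustration:

```python
def pad_to_length(seqs, max_len, pad_token=0):
    """Clip every sequence to max_len and right-pad the short ones."""
    padded = []
    for seq in seqs:
        clipped = list(seq[:max_len])
        padded.append(clipped + [pad_token] * (max_len - len(clipped)))
    return padded

batch = pad_to_length([[5, 6], [7, 8, 9, 10, 11]], max_len=4)
print(batch)  # [[5, 6, 0, 0], [7, 8, 9, 10]]

# With a fixed length, a time slice is just one column of the batch.
time_slice = [row[1] for row in batch]
print(time_slice)  # [6, 8]
```

Once every sequence has the same length, the batch has a fully defined shape and the per-timestep loop has a known number of iterations.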
I finally found an approach that solves my problem. It works by using tf.scan() instead of a loop, which doesn't require the input tensor to have a defined size in the second dimension. Consequently, you have to prepare the input tensor beforehand so that tf.scan() parses it the way you want. In my case this is the code:
batch_size = 100
hidden_dim = 5
input_dim = embedding_dim
time_size = 5
input_sentence = tf.placeholder(dtype=tf.float64, shape=[embedding_dim,None], name='input')
output_sentence = tf.placeholder(dtype=tf.float64, shape=[embedding_dim,None], name='output')
input_array = np.asarray(input_sentence)
output_array = np.asarray(output_sentence)
x_t = tf.transpose(input_array, [1, 0], name='x_t')
h_0 = tf.convert_to_tensor(h_0, dtype=tf.float64)
h_t_transposed = tf.scan(forward_pass, x_t, h_0, name='h_t_transposed')
h_t = tf.transpose(h_t_transposed, [1, 0], name='h_t')
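To make the semantics concrete, here is a pure-Python emulation of what a scan does: it walks along the first dimension, threads a state through the step function, and collects every intermediate state. It is not TensorFlow's implementation, and the toy forward_pass below is a hypothetical stand-in for the real GRU step.

```python
def scan(fn, elems, initializer):
    """Apply fn(carry, elem) along elems and return every intermediate carry."""
    carry = initializer
    outputs = []
    for elem in elems:
        carry = fn(carry, elem)
        outputs.append(carry)
    return outputs

def forward_pass(h_prev, x_t):
    # Toy recurrent update: keep half of the old state and add the input.
    return [0.5 * h + x for h, x in zip(h_prev, x_t)]

xs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # (time, features)
h_0 = [0.0, 0.0]

states = scan(forward_pass, xs, h_0)
print(states)  # [[1.0, 2.0], [3.5, 5.0], [6.75, 8.5]]
```

The step function receives the previous state first and the current element second, which is the same argument order tf.scan uses, and the result has one state per time step. That is why the input is transposed to time-major order before the scan.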
