Keras Conv2D layer outputs array filled with NaN - python

I built a Keras model that takes an image as input, applies several convolutions and a pooling operation, and then applies a specialized convolution layer with pre-initialized weights. When run on an image, the model outputs an array of the correct shape, but every element is NaN.
The first part of the model is the first "block" of the pretrained VGG16 model for Keras. The specialized layer (keras.layers.Conv2D) takes its weights as a set of filters corresponding to certain features I want to extract from the image. It does not matter whether I flip the filters (to do cross-correlation) or change the image: the output is always NaN. Any ideas?
EDIT: here is the code. It takes a NumPy image array as input.
import numpy as np
from keras.applications.vgg16 import VGG16
from keras.layers import BatchNormalization, Conv2D
from keras.models import Model

def make_model(features, layer_name="block2_conv1"):
    vgg = VGG16(include_top=False)
    layer = vgg.get_layer(layer_name)
    x = layer.output
    num_chars, char_w, char_h, char_filters = features.shape
    filters = features.transpose((1, 2, 3, 0)).astype(int)
    filters = filters / np.sqrt(np.sum(np.square(filters), axis=(0, 1), keepdims=True))
    x = BatchNormalization()(x)
    specialized_layer = Conv2D(num_chars, (char_w, char_h))
    x = specialized_layer(x)
    biases = np.zeros((num_chars, ))
    specialized_layer.set_weights([filters, biases])
    model = Model(inputs=vgg.input, outputs=x)
    return model
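One thing worth checking, offered as a guess rather than a confirmed diagnosis: the normalization line divides by a norm taken over the two spatial axes, so any (channel, filter) slice that is entirely zero (for example after the astype(int) truncation) produces 0/0 = NaN in the weights, and a NaN weight propagates to the layer output. The sketch below uses a made-up filter bank; the eps guard and the helper name normalize_filters are assumptions, not part of the original code.

```python
import numpy as np

def normalize_filters(filters, eps=0.0):
    """L2-normalize over the two spatial axes, as in the question's code.

    With eps=0.0 this reproduces the original expression; a positive eps
    (an assumption, not part of the original) guards against zero norms.
    """
    norms = np.sqrt(np.sum(np.square(filters), axis=(0, 1), keepdims=True))
    with np.errstate(divide="ignore", invalid="ignore"):
        return filters / (norms + eps)

# Hypothetical filter bank: 3x3 kernels, 2 input channels, 2 output filters.
filters = np.ones((3, 3, 2, 2), dtype=float)
filters[:, :, 0, 1] = 0.0  # one (channel, filter) slice is entirely zero

unsafe = normalize_filters(filters)
safe = normalize_filters(filters, eps=1e-8)

print(np.isnan(unsafe).any())  # True
print(np.isnan(safe).any())    # False
```

Running np.isnan(filters).any() on the real filters just before set_weights would show whether this applies to your data.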

Related

How to get the latent vector as an output from a cnn model before training to the fully connected layer?

I am working on a CNN model using the TensorFlow framework in Google Colab. I am unable to extract the latent vectors from the convolutional layers. I want the output of the convolutional layers, i.e. the layers before the fully connected layer.
I have tried the following code:
a = dropout()(classifier_model.output)
print(a)
I am unable to understand the solution suggested in the linked Stack Overflow answer on printing the value of a TensorFlow object after applying a conv-pool layer.
Does anyone have a suggestion?
You can use the get_layer method of the Model class to get a layer by its name. Below is an example with a dummy 1D CNN and a binary classifier:
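import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv1D, ReLU, MaxPool1D, Flatten, Dense
from tensorflow.keras.models import Model

timesteps = 100
nfeatures = 2
# build the model using the functional API
# example of a 1D CNN inspired by your Stack Overflow link, but using a model instead of successive *raw* layers
# the values of the Conv1D filters and kernels are different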
input = Input((timesteps, nfeatures))
p = Conv1D(filters=16, kernel_size=10)(input)
p = ReLU()(p)
p = MaxPool1D(pool_size=2)(p)
p = Conv1D(filters=32, kernel_size=10)(p)
p = ReLU()(p)
p = MaxPool1D(pool_size=2)(p)
p = Conv1D(filters=64, kernel_size=10)(p)
p = ReLU()(p)
p = MaxPool1D(pool_size=2, name='conv1Dfeat')(p) # give a name to the CNN output
# fully connected part
p = Flatten()(p)
p = Dense(10)(p)
# could add a dropout layer to ease optimization
finaloutput = Dense(1, activation='sigmoid')(p)
# full model
model = Model(inputs=input, outputs=finaloutput)
# compile network, i.e. define optimizer, loss and metrics
model.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
You need to train the model using the fit method with some data. Then you can get the output of the layer named conv1Dfeat (the last layer of the convolutional part) by defining a second model:
modelCNN = Model(inputs=input, outputs=model.get_layer('conv1Dfeat').output)
modelCNN.summary()
If you want the output of the convolutional part for a single NumPy input array of shape (timesteps, nfeatures), you can use the predict method of the Model class on batched data:
data = np.random.normal(size=(timesteps, nfeatures)) # dummy data
data_tf = tf.expand_dims(data, axis=0) # convert to TF tensor and add batch dimension at the same time
cnn_out_np = modelCNN.predict(data_tf)
cnn_out_np = np.squeeze(cnn_out_np, axis=0) # remove batch dimension
print(cnn_out_np.shape)
(4, 64)
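The printed shape can be checked by hand. A minimal sketch, assuming the Keras defaults used above (stride-1 convolutions with 'valid' padding, and pooling with a stride equal to pool_size); the two helper functions are illustrative names, not Keras APIs:

```python
def conv1d_valid_length(length, kernel_size):
    """Output length of a stride-1 Conv1D with 'valid' padding."""
    return length - kernel_size + 1

def pool1d_length(length, pool_size):
    """Output length of a max pool whose stride equals pool_size ('valid' padding)."""
    return length // pool_size

length = 100  # timesteps
for kernel_size, pool_size in [(10, 2), (10, 2), (10, 2)]:
    length = pool1d_length(conv1d_valid_length(length, kernel_size), pool_size)

# 100 -> 91 -> 45 -> 36 -> 18 -> 9 -> 4, with 64 filters in the last Conv1D
print((length, 64))  # (4, 64)
```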

Image + float array as input in a Keras model

I have an image as input to my model, but I also need to input some floats as supporting information about the image. I don't want them to go through all the convolutions; I want them to go directly to my dense layers as extra information for training. I know about the concatenate layer, but I don't know how to use it with the inputs, or whether that is how it should be done.
Assume you have a backbone, which can be any convolutional neural net (VGG, ResNet, etc.). Before the dense layer, you usually have a Flatten() layer (or, in modern networks, a pooling layer like GAP or GeM) that prepares a 1D vector as input to your Dense layer. That's where you can concatenate your floats.
Code example using model subclassing:
import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self, num_output_classes):
        super().__init__()
        self.backbone = tf.keras.applications.ResNet50(
            input_shape=(224, 224, 3), include_top=False)
        self.pool = tf.keras.layers.GlobalAveragePooling2D()
        self.concat = tf.keras.layers.Concatenate(axis=-1)
        # The activation is set on the layer itself, not passed at call time
        self.dense = tf.keras.layers.Dense(num_output_classes, activation='softmax')

    def call(self, inputs):
        # Unpack the inputs. `additional_floats` should be 1D per example,
        # i.e. have shape (batch_size, num_floats)
        image, additional_floats = inputs
        # Run image through backbone and get a feature vector
        x = self.backbone(image)
        x = self.pool(x)
        # Concatenate with your additional floats
        x = self.concat([x, additional_floats])
        # Classification, or whatever you might need on top
        return self.dense(x)
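To make the shapes concrete, here is a NumPy-only sketch of what the pooling and concatenation steps do. The random arrays are stand-ins for the backbone output and the extra floats (ResNet50 on a 224x224 input yields 7x7x2048 feature maps); the batch size and the number of floats are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, num_floats = 4, 3

# Stand-in for a backbone output: (batch, height, width, channels).
feature_maps = rng.normal(size=(batch_size, 7, 7, 2048))
additional_floats = rng.normal(size=(batch_size, num_floats))

# Global average pooling: average over the two spatial axes.
pooled = feature_maps.mean(axis=(1, 2))            # (4, 2048)

# Concatenate along the last (feature) axis.
combined = np.concatenate([pooled, additional_floats], axis=-1)
print(combined.shape)  # (4, 2051)
```

The dense layer then sees one vector per example whose last num_floats entries are the untouched extra inputs.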

Keras: Share a layer of weights across Training Examples (Not between layers)

The problem is the following. I have a categorical prediction task of vocabulary size 25K. On one of them (input vocab 10K, output dim i.e. embedding 50), I want to introduce a trainable weight matrix for a matrix multiplication between the input embedding (shape 1,50) and the weights (shape(50,128)) (no bias) and the resulting vector score is an input for a prediction task along with other features.
The crux is, I think that the trainable weight matrix varies for each input, if I simply add it in. I want this weight matrix to be common across all inputs.
I should clarify - by input here I mean training examples. So all examples would learn some example specific embedding and be multiplied by a shared weight matrix.
After every so many epochs, I intend to do a batch update to learn these common weights (or use other target variables to do multiple output prediction)
LSTM? Is that something I should look into here?
With the exception of an Embedding layer, layers apply to all examples in the batch.
Take as an example a very simple network:
inp = Input(shape=(4,))
h1 = Dense(2, activation='relu', use_bias=False)(inp)
out = Dense(1)(h1)
model = Model(inp, out)
This is a simple network with one input layer, one hidden layer and an output layer. Take the hidden layer as an example: it has a weight matrix of shape (4, 2). At each iteration, the input data, a matrix of shape (batch_size, 4), is multiplied by the hidden layer weights (the feed-forward phase), so the same weights are applied to every sample in the batch. The loss is also computed per batch, and the output has shape (batch_size, 1). Because all samples in the batch pass through the same weights in the forward phase, they all contribute to the gradient update during backprop.
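A NumPy sketch of the hidden layer's forward pass may help here. It assumes random inputs and weights and only illustrates that one weight matrix serves the whole batch; it is not how Keras implements Dense internally.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size = 5
x = rng.normal(size=(batch_size, 4))   # a batch of inputs
w = rng.normal(size=(4, 2))            # hidden layer weights, no bias

# Forward pass for the whole batch: one matrix product plus ReLU.
h_batch = np.maximum(x @ w, 0.0)       # shape (batch_size, 2)

# Same result when each sample is pushed through the same weights one at a time.
h_single = np.stack([np.maximum(row @ w, 0.0) for row in x])

print(h_batch.shape)                   # (5, 2)
print(np.allclose(h_batch, h_single))  # True
```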
When dealing with text, the problem is often specified as predicting a label from a sequence of words. The input is modelled with a shape of (batch_size, sequence_length, word_index). Let's take a very basic example:
from tensorflow import keras
from tensorflow.keras.layers import *
from tensorflow.keras.models import Model

sequence_length = 80
emb_vec_size = 100
vocab_size = 10_000

def make_model():
    inp = Input(shape=(sequence_length, 1))
    emb = Embedding(vocab_size, emb_vec_size)(inp)
    emb = Reshape((sequence_length, emb_vec_size))(emb)
    h1 = Dense(64)(emb)
    recurrent = LSTM(32)(h1)
    output = Dense(1)(recurrent)
    model = Model(inp, output)
    model.compile('adam', 'mse')
    return model

model = make_model()
model.summary()
You can copy and paste this into Colab and see the summary.
What this example does:
Transforms a sequence of word indices into a sequence of word embedding vectors.
Applies a Dense layer called h1 to every example in the batch (and every element in the sequence); this layer reduces the dimension of the embedding vector. It is not a typical element of a text-processing network (in isolation), but it seemed to match your question.
Uses a recurrent layer to reduce the sequence to a single vector per example.
Predicts a single label from the "sentence" vector.
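The first step, the embedding lookup, can be sketched in plain NumPy. This is only an illustration of the idea (a table indexed by word id), with random values standing in for learned weights:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, emb_vec_size = 10_000, 100
batch_size, sequence_length = 2, 80

# The embedding "weights" are a lookup table with one row per word index.
embedding_table = rng.normal(size=(vocab_size, emb_vec_size))
word_indices = rng.integers(0, vocab_size, size=(batch_size, sequence_length))

# Looking up the indices replaces each word index with its vector.
embedded = embedding_table[word_indices]
print(embedded.shape)  # (2, 80, 100)
```

Each example only touches the rows for the words it contains, which is why the Embedding layer is the exception mentioned earlier.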
If I understand the problem correctly, you can reuse layers or even whole models inside another model.
Example with a Dense layer. Let's say you have 10 inputs:
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
X = 16  # placeholder input dimension (hypothetical value; use your own)
# defining 10 inputs in a list, each with shape (X,)
inputs = [Input(shape=(X,), name='input_{}'.format(k)) for k in range(10)]
# defining a common Dense layer
D = Dense(64, name='one_layer_to_rule_them_all')
nets = [D(inp) for inp in inputs]
model = Model(inputs=inputs, outputs=nets)
model.compile(optimizer='adam', loss='categorical_crossentropy')
This code is not going to work if the inputs have different shapes. The first call to D defines its properties. In this example, outputs are set directly to nets. But of course you can concatenate, stack, or whatever you want.
Now, if you have some trainable model, you can use it in place of D:
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
# X (the input dimension) and special_model (a Keras model that accepts
# inputs of shape (X,)) are assumed to be defined already
# defining 10 inputs in a list, each with shape (X,)
inputs = [Input(shape=(X,), name='input_{}'.format(k)) for k in range(10)]
# applying a shared model, with the same weights, to all inputs
nets = [special_model(inp) for inp in inputs]
model = Model(inputs=inputs, outputs=nets)
model.compile(optimizer='adam', loss='categorical_crossentropy')
The weights of this model are shared among all inputs.
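To see why a shared layer requires inputs of the same shape, here is a toy NumPy stand-in. LazyDense is a hypothetical class written for this illustration, not Keras code; it only mimics the behaviour described above, where the first call fixes the weight shape.

```python
import numpy as np

class LazyDense:
    """Toy stand-in for a shared layer (not Keras code): weights are created on first call."""

    def __init__(self, units, seed=0):
        self.units = units
        self.kernel = None
        self._rng = np.random.default_rng(seed)

    def __call__(self, x):
        if self.kernel is None:
            # The first input fixes the kernel shape, as the first call to D does above.
            self.kernel = self._rng.normal(size=(x.shape[-1], self.units))
        if x.shape[-1] != self.kernel.shape[0]:
            raise ValueError(
                f"expected last dimension {self.kernel.shape[0]}, got {x.shape[-1]}")
        return x @ self.kernel

shared = LazyDense(64)
out_a = shared(np.ones((3, 16)))
out_b = shared(np.ones((5, 16)))   # reuses the same kernel
print(out_a.shape, out_b.shape)    # (3, 64) (5, 64)

try:
    shared(np.ones((3, 8)))        # different feature size: rejected
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)           # True
```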

How to feed tensor to pre-trained model in the computational graph with keras?

I want to train a specific conditional GAN with some deterministic constraints at the end of my generator with Keras and to do so I need first to compute the embeddings of my Generator outputs with VGG-16 pre-trained model.
I'm using python 3.6.
In my computational Graph, I want to feed my Generator outputs img to a pre-trained VGG-16 model in order to get the embeddings.
My img is then a tensor of shape (None, 224, 224, 3), since I am in the computational graph. The problem is that when I compile the following, I get this error:
When feeding symbolic tensors to a model, we expect the tensors to
have a static batch size. Got tensor with shape: (None, 224, 224, 3)
self.vgg = self.build_vgg()

def build_vgg(self):
    vgg16_model = keras.applications.vgg16.VGG16()
    return Model(inputs=vgg16_model.input, outputs=vgg16_model.get_layer('fc2').output)

#-------------------------------
# Construct Computational Graph
# for Generator
#-------------------------------
# For the generator we freeze the critic's layers
self.critic.trainable = False
self.generator.trainable = True
self.vgg.trainable = False
# Sampled noise for input to generator
noise = Input(shape=(self.latent_dim,))
# Input Embedding:
embedding = Input(shape=(self.embedding,))
# Generate images based of noise
img = self.generator([noise,embedding])
# Discriminator determines validity
valid = self.critic(img)
# Get the embeddings from vgg-16:
X = self.vgg.predict(img)
Obviously, I can't loop along the first axis, since its size is None. I tried to apply a function to this 'img' tensor using the TensorFlow function 'tf.map_fn', like this:
def Embedding(self, img):
    fn = lambda x: self.vgg.predict(preprocess_input(np.expand_dims(x, axis=0))).flatten()
    embedding = tf.map_fn(fn, img, dtype=tf.float32)
    return embedding

#-------------------------------
# Construct Computational Graph
# for Generator
#-------------------------------
# For the generator we freeze the critic's layers
self.critic.trainable = False
self.generator.trainable = True
self.vgg.trainable = False
# Sampled noise for input to generator
noise = Input(shape=(self.latent_dim,))
# Input Embedding:
embedding = Input(shape=(self.embedding,))
# Generate images based of noise
img = self.generator([noise,embedding])
# Discriminator determines validity
valid = self.critic(img)
# Get the embeddings from VGG16
X = self.Embedding(img)
But I get the following error:
ValueError: setting an array element with a sequence.
To recap, I want to apply a pre-trained VGG-16 model to a tensor of shape (None, 224, 224, 3) along the batch axis (0) in the computational graph in Keras. What I described above is what I have already tried.
Does anyone have a suggestion?

Make fixed timestep length LSTM Keras model free timestep length

I have a Keras LSTM multitask model that performs two tasks. One is a sequence tagging task (so I predict a label per token). The other is a global classification task over the whole sequence using a CNN that is stacked on the hidden states of the LSTM.
In my setup (don't ask why) I only need the CNN task during training; the labels it predicts have no use in the final product. In Keras, one can train an LSTM model without specifying the input sequence length, like this:
l_input = Input(shape=(None,), dtype="int32", name=input_name)
However, if I add the CNN stacked on the LSTM hidden states I need to set a fixed sequence length for the model.
l_input = Input(shape=(timesteps_size,), dtype="int32", name=input_name)
The problem is that once I have trained the model with a fixed timesteps_size, I can no longer use it to predict longer sequences.
In other frameworks this is not a problem. But in Keras, I cannot get rid of the CNN and change the expected input shape of the model once it has been trained.
Here is a simplified version of the model
l_input = Input(shape=(timesteps_size,), dtype="int32")
l_embs = Embedding(len(input.keys()), 100)(l_input)
l_blstm = Bidirectional(GRU(300, return_sequences=True))(l_embs)
# Sequential output
l_out1 = TimeDistributed(Dense(len(labels.keys()),
                               activation="softmax"))(l_blstm)
# Global output
conv1 = Conv1D( filters=5 , kernel_size=10 )( l_embs )
conv1 = Flatten()(MaxPooling1D(pool_size=2)( conv1 ))
conv2 = Conv1D( filters=5 , kernel_size=8 )( l_embs )
conv2 = Flatten()(MaxPooling1D(pool_size=2)( conv2 ))
conv = Concatenate()( [conv1,conv2] )
conv = Dense(50, activation="relu")(conv)
l_out2 = Dense( len(global_labels.keys()) ,activation='softmax')(conv)
model = Model(inputs=l_input, outputs=[l_out1, l_out2])
optimizer = Adam()
model.compile(optimizer=optimizer,
              loss="categorical_crossentropy",
              metrics=["accuracy"])
I would like to know if anyone here has faced this issue, and if there is any way to delete layers from a model after training and, more importantly, to change the input layer size after training.
Thanks
The variable timestep length is a problem not because of the convolution layers (one good thing about convolution layers is that they do not depend on the input size) but because of the Flatten layers, which need an input of a specified size. You can use global pooling layers instead. Further, I think stacking convolution and pooling layers on top of each other might give better results than using two separate convolution layers and merging them (although this depends on the specific problem and dataset). Considering these two points, it might be better to write your model like this:
# Global output
conv1 = Conv1D(filters=16, kernel_size=5)(l_embs)
conv1 = MaxPooling1D(pool_size=2)(conv1)
conv2 = Conv1D(filters=32, kernel_size=5)(conv1)
conv2 = MaxPooling1D(pool_size=2)(conv2)
gpool = GlobalAveragePooling1D()(conv2)
x = Dense(50, activation="relu")(gpool)
l_out2 = Dense(len(global_labels.keys()), activation='softmax')(x)
model = Model(inputs=l_input, outputs=[l_out1, l_out2])
You may need to tune the number of conv+maxpool layers, number of filters, kernel size and even add dropout or batch normalization layers.
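The difference between Flatten and global pooling can be checked with a small NumPy sketch. The arrays below are stand-ins for the conv/pool output at two different sequence lengths; the helper names are illustrative only.

```python
import numpy as np

def flatten_size(feature_map):
    """Number of features per example after Flatten: timesteps * channels."""
    return feature_map.reshape(feature_map.shape[0], -1).shape[1]

def global_average_pool(feature_map):
    """Average over the time axis, leaving (batch, channels)."""
    return feature_map.mean(axis=1)

channels = 32
short = np.ones((1, 20, channels))   # 20 timesteps after conv/pool
long = np.ones((1, 50, channels))    # 50 timesteps after conv/pool

print(flatten_size(short), flatten_size(long))   # 640 1600
print(global_average_pool(short).shape, global_average_pool(long).shape)  # (1, 32) (1, 32)
```

The Dense layer that follows needs a fixed number of input features, which only the pooled version provides for every sequence length.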
As a side note, using TimeDistributed on a Dense layer is redundant as the Dense layer is applied on the last axis.
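The side note can be illustrated with NumPy: multiplying a 3D input by a kernel contracts only the last axis, which gives the same result as applying the kernel to each timestep separately. This is a sketch with random values, not the Keras implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, timesteps, features, units = 2, 7, 6, 3
x = rng.normal(size=(batch_size, timesteps, features))
kernel = rng.normal(size=(features, units))
bias = rng.normal(size=(units,))

# Dense on a 3D input: the kernel contracts the last axis only.
dense_out = x @ kernel + bias                      # (2, 7, 3)

# "Time-distributed" version: apply the same kernel to each timestep separately.
per_step = np.stack(
    [x[:, t, :] @ kernel + bias for t in range(timesteps)], axis=1)

print(dense_out.shape)                    # (2, 7, 3)
print(np.allclose(dense_out, per_step))   # True
```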
