Popping upper layers in keras - python

Suppose I have the following pretrained model:
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(3, activation='relu', input_dim=5))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
When I run it through the following data (X), I get the shape as expected:
import numpy as np
X = np.random.rand(20, 5)
model.predict(X).shape
giving the shape (20,1)
However, for transfer learning purposes I wish to pop the top layer and run it through the same data.
model.layers.pop()
model.summary()
>>>
Layer (type) Output Shape Param #
=================================================================
dense_3 (Dense) (None, 3) 18
=================================================================
Total params: 18
Trainable params: 18
Non-trainable params: 0
Looking at model.summary() after model.layers.pop(), the top layer appears to have been popped off. However, model.predict(X).shape still results in a (20, 1) shape and not (20, 3) as expected.
Question: How am I supposed to correctly pop off the last few layers? This is an artificial example; in my real case I need to delete the last 3 layers.

Found the answer here: https://github.com/keras-team/keras/issues/8909
The following is the answer that was needed. Unfortunately a second model has to be created, and for some reason @Eric's answer no longer seems to work, as noted in the other GitHub issue.
from keras.models import Model
model.layers.pop()
model2 = Model(model.input, model.layers[-1].output)
model2.predict(X).shape
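For the real case of removing the last 3 layers, the same idea extends naturally. A minimal sketch, assuming model and X are the objects from the question and that the model has at least 4 layers:
from keras.models import Model
# without calling pop(), layers[-4] is the last layer we want to keep when dropping the final 3
truncated = Model(inputs=model.input, outputs=model.layers[-4].output)
truncated.predict(X).shape  # (n_samples, units of layers[-4])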

loaded_model = keras.models.load_model(fname)
# remove the last 2 layers
sliced_loaded_model = Sequential(loaded_model.layers[:-2])
# set trainable=False for the layers from loaded_model
for layer in sliced_loaded_model.layers:
    layer.trainable = False
# add new layers
sliced_loaded_model.add(Dense(32, activation='relu')) # trainable=True is default
sliced_loaded_model.add(Dense(1))
# compile
sliced_loaded_model.compile(loss='mse', optimizer='adam', metrics=[])
# fit
...
Simply put, you can reconstruct the Sequential model.

Related

remove only last(dense) layer of an already trained model, keeping all the weights of the model intact, add a different dense layer

I want to remove only the last dense layer from an already saved model in .h5 file and add a new dense layer.
Information about the saved model:
I used transfer learning on the EfficientNet B0 model and added a dropout layer and 2 dense layers. The last dense layer had 3 nodes, equal to my number of classes, as shown below:
inputs = tf.keras.layers.Input(shape=(IMAGE_HEIGHT, IMAGE_WIDTH, 3))
x = img_augmentation(inputs)
model = tf.keras.applications.EfficientNetB0(include_top=False, input_tensor=x, weights="imagenet")
# Freeze the pretrained weights
model.trainable = False
# Rebuild top
x = tf.keras.layers.GlobalAveragePooling2D(name="avg_pool")(model.output)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(5, activation=tf.nn.relu)(x)
outputs = tf.keras.layers.Dense(len(class_names), activation="softmax", name="pred")(x)
After training, I saved my model as my_h5_model.h5
Main Task: I want to use the saved model architecture with its weights and replace only the last dense layer with a 4-node dense layer.
I tried many things suggested by the StackOverflow community, such as:
Iterate over all the layers except the last layer and add them to a separate already defined sequential model
new_model = Sequential()
for layer in (model.layers[:-1]):
    new_model.add(layer)
But it gives an error which states:
ValueError: Exception encountered when calling layer "block1a_se_excite" (type Multiply).
A merge layer should be called on a list of inputs. Received: inputs=Tensor("Placeholder:0", shape=(None, 1, 1, 32), dtype=float32) (not a list of tensors)
Call arguments received:
• inputs=tf.Tensor(shape=(None, 1, 1, 32), dtype=float32)
I also tried the functional approach as:
input_layer = model.input
for layer in (model.layers[:-1]):
    x = layer(input_layer)
which throws an error as mentioned below:
ValueError: Exception encountered when calling layer "stem_bn" (type BatchNormalization).
Dimensions must be equal, but are 3 and 32 for '{{node stem_bn/FusedBatchNormV3}} = FusedBatchNormV3[T=DT_FLOAT, U=DT_FLOAT, data_format="NHWC", epsilon=0.001, exponential_avg_factor=1, is_training=false](Placeholder, stem_bn/ReadVariableOp, stem_bn/ReadVariableOp_1, stem_bn/FusedBatchNormV3/ReadVariableOp, stem_bn/FusedBatchNormV3/ReadVariableOp_1)' with input shapes: [?,224,224,3], [32], [32], [32], [32].
Call arguments received:
• inputs=tf.Tensor(shape=(None, 224, 224, 3), dtype=float32)
• training=False
Lastly, I did something that came to my mind
inputs = tf.keras.layers.Input(shape=(IMAGE_HEIGHT, IMAGE_WIDTH, 3))
x = img_augmentation(inputs)
x = model.layers[:-1](x)
x = keras.layers.Dense(5, name="compress_1")(x)
which simply gave an error as:
'list' object is not callable
I did some more experiments and was able to remove the last layer and add a new dense layer:
# imported a pretained saved model
from tensorflow import keras
import tensorflow as tf
model = keras.models.load_model('/content/my_h5_model.h5')
# take the output of the second-to-last layer (i.e., everything except the last layer)
x = model.layers[-2].output
outputs = tf.keras.layers.Dense(4, activation="softmax", name="predictions")(x)
model = tf.keras.Model(inputs = model.input, outputs = outputs)
model.summary()
In the saved model, the last dense layer had 3 nodes, but in the current model I added one with 4 nodes. The last layers of the summary are shown below:
dropout_3 (Dropout) (None, 1280) 0 ['batch_normalization_4[0][0]']
dense_3 (Dense) (None, 5) 6405 ['dropout_3[0][0]']
predictions (Dense) (None, 4) 24 ['dense_3[0][0]']
==================================================================================================
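If the goal is to fine-tune only the new head, a possible follow-up (my own sketch, not part of the original answer, assuming model is the rebuilt model from the snippet above) is to freeze the reused layers before compiling:
# freeze everything except the newly added prediction layer
for layer in model.layers[:-1]:
    layer.trainable = False
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])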
Have you tried switching between import keras and tensorflow.keras in your import? This has worked in other issues.

How to ensure arrays have proper dimensions for softmax layer in tensorflow python

I am building my first neural net from scratch with tensorflow and am getting stuck. I am trying to solve a multiclass text sequence classification problem (3 classes). I have been following tf tutorials, but I am not sure what is going wrong.
My input data is as follows:
samples = list of strings of text sequences ranging from 1 to 295 words long
labels = list of 295 integers (one per sample) that are either 0, 1 or 2.
My code:
import tensorflow as tf
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.layers import Embedding
from keras import preprocessing
from keras.models import Sequential
from keras.layers import Flatten, Dense
from keras.utils import to_categorical
import numpy as np
# tokenize and vectorize text data to prepare for embedding
tokenizer = Tokenizer()
tokenizer.fit_on_texts(samples)
sequences = tokenizer.texts_to_sequences(samples)
word_index = tokenizer.word_index
print(f'Found {len(word_index)} unique tokens.')
# one_hot_results = tokenizer.texts_to_matrix(samples, mode='binary')
# setting variables
num_samples = len(word_index) # 1499
# Input_dim: This is the size of the vocabulary in the text data.
input_dim = num_samples # 1499
# output_dim: This is the size of the vector space in which words will be embedded.
output_dim = 32 # recommended by tutorial
# max_sequence_length: This is the length of input sequences
max_sequence_length = len(max(sequences, key=len)) # 295
# train/test index splice variable
training_samples = round(len(samples)*.8)
data = pad_sequences(sequences, maxlen=max_sequence_length)
# preprocess labels
labels = np.asarray(labels)
indices = np.arange(data.shape[0])
np.random.shuffle(indices)
data = data[indices] # shape (499, 295)
labels = to_categorical(labels, num_classes=None, dtype='float32')
# Create test/train data (80% train, 20% test)
x_train = data[:training_samples]
y_train = labels[:training_samples]
x_test = data[training_samples:]
y_test = labels[training_samples:]
# creating embedding layer with shape [amount of unique words in samples, length of longest sample in samples]
# embedding_layer = Embedding(len(word_index)+1, len(max(samples, key=len)))
model = Sequential()
model.add(Embedding(input_dim, output_dim, input_length=max_sequence_length))
model.add(Dense(32, activation='relu'))
model.add(Dense(3, activation='softmax'))
model.summary()
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train,
          y_train,
          epochs=10,
          batch_size=32,
          validation_data=(x_test, y_test))
Essentially, I am just trying to feed in my text data as tokens into an embedding layer, and dense layer, and then the 3 node softmax layer for classification. When I run this, I get the following error:
Found 1499 unique tokens.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:66: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:541: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4432: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_1 (Embedding) (None, 295, 32) 47968
_________________________________________________________________
dense_1 (Dense) (None, 295, 32) 1056
_________________________________________________________________
dense_2 (Dense) (None, 295, 3) 99
=================================================================
Total params: 49,123
Trainable params: 49,123
Non-trainable params: 0
_________________________________________________________________
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/optimizers.py:793: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3576: The name tf.log is deprecated. Please use tf.math.log instead.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-8-94cb754412aa> in <module>()
49 epochs=10,
50 batch_size=32,
---> 51 validation_data=(x_test, y_test))
2 frames
/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
129 ': expected ' + names[i] + ' to have ' +
130 str(len(shape)) + ' dimensions, but got array '
--> 131 'with shape ' + str(data_shape))
132 if not check_batch_axis:
133 data_shape = data_shape[1:]
ValueError: Error when checking target: expected dense_2 to have 3 dimensions, but got array with shape (399, 3)
I need more information to solve the problem you're facing. But I can tell you what's wrong.
You have a target set of size [399, 3]. However, your model summary looks like follows.
Model: "sequential_9"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_5 (Embedding) (None, 295, 50) 5000
_________________________________________________________________
dense_12 (Dense) (None, 295, 32) 1632
_________________________________________________________________
dense_13 (Dense) (None, 295, 3) 99
=================================================================
You see that even though you have a list of labels with 2 dimensions, your model expects a set of targets with 3 dimensions.
Now solving that, well... that's a different story. Let's see our options.
You have exactly 295 words in each sample and 399 samples like that.
If you are 100% certain that each sequence in your data is 295 elements long, you can simply solve this problem by adding a Flatten() layer. However, keep in mind that this will increase the number of parameters in the following Dense layer in proportion to the sequence length times the embedding size (here the Dense layer receives 295 × 32 = 9,440 inputs instead of 32).
model = Sequential()
model.add(Embedding(input_dim, output_dim, input_length=max_sequence_length, input_shape=(295,)))
#model.add(lambda x: tf.reduce_mean(x, axis=1))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dense(3, activation='softmax'))
model.summary()
You have 399 samples, but each sample can have any number of items between 1 and 295
Now your input can have varying lengths, so you need to define the input_shape argument of the first layer accordingly.
The second problem is that your Embedding layer produces variable-size outputs (depending on the sequence length), but the Dense layer cannot take variable-size inputs. Therefore you need some "squashing" transformation in between. One that usually works well with embeddings is taking the mean over the time axis. You can do this in the following way.
from keras.layers import Lambda
model = Sequential()
model.add(Embedding(input_dim, output_dim, input_length=max_sequence_length, input_shape=(None,)))
model.add(Lambda(lambda x: tf.reduce_mean(x, axis=1)))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dense(3, activation='softmax'))
model.summary()
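As a side note (my own suggestion, not part of the original answer), GlobalAveragePooling1D performs the same mean over the time axis and avoids the Lambda layer. A minimal sketch, reusing input_dim and output_dim from the question:
from keras.models import Sequential
from keras.layers import Embedding, GlobalAveragePooling1D, Dense
model = Sequential()
model.add(Embedding(input_dim, output_dim))   # accepts variable-length sequences
model.add(GlobalAveragePooling1D())           # mean over the time axis -> (batch, output_dim)
model.add(Dense(32, activation='relu'))
model.add(Dense(3, activation='softmax'))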
You have this Dense layer in your model which is supposed to be trained and generates the classes:
model.add(Dense(3, activation='softmax'))
So it expects targets with 3 values per sample. However, your dataset has single-integer labels taking 3 different values (0, 1, 2). You can easily convert your classes to categorical (one-hot) vectors. For example, you would generate these vectors for your dataset:
0 --- (1,0,0)
1 --- (0,1,0)
2 --- (0,0,1)
If you want to automate this process, you can use Keras to_categorical function:
keras.utils.to_categorical(y, num_classes=None, dtype='float32')
For more information about to_categorical, see the Keras documentation.
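A minimal sketch of that conversion, with example values of my own rather than the question's data:
from keras.utils import to_categorical
import numpy as np
labels = np.array([0, 1, 2, 1])
one_hot = to_categorical(labels, num_classes=3)
# one_hot is now:
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]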

how does input_shape in keras.applications work?

I have been through the Keras documentation but I am still unable to figure how does the input_shape parameter works and why it does not change the number of parameters for my DenseNet model when I pass it my custom input shape. An example:
import keras
from keras import applications
from keras.layers import Conv3D, MaxPool3D, Flatten, Dense
from keras.layers import Dropout, Input, BatchNormalization
from keras import Model
# define model 1
INPUT_SHAPE = (224, 224, 1) # used to define the input size to the model
n_output_units = 2
activation_fn = 'sigmoid'
densenet_121_model = applications.densenet.DenseNet121(include_top=False, weights=None, input_shape=INPUT_SHAPE, pooling='avg')
inputs = Input(shape=INPUT_SHAPE, name='input')
model_base = densenet_121_model(inputs)
output = Dense(units=n_output_units, activation=activation_fn)(model_base)
model = Model(inputs=inputs, outputs=output)
model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input (InputLayer) (None, 224, 224, 1) 0
_________________________________________________________________
densenet121 (Model) (None, 1024) 7031232
_________________________________________________________________
dense_1 (Dense) (None, 2) 2050
=================================================================
Total params: 7,033,282
Trainable params: 6,949,634
Non-trainable params: 83,648
_________________________________________________________________
# define model 2
INPUT_SHAPE = (512, 512, 1) # used to define the input size to the model
n_output_units = 2
activation_fn = 'sigmoid'
densenet_121_model = applications.densenet.DenseNet121(include_top=False, weights=None, input_shape=INPUT_SHAPE, pooling='avg')
inputs = Input(shape=INPUT_SHAPE, name='input')
model_base = densenet_121_model(inputs)
output = Dense(units=n_output_units, activation=activation_fn)(model_base)
model = Model(inputs=inputs, outputs=output)
model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input (InputLayer) (None, 512, 512, 1) 0
_________________________________________________________________
densenet121 (Model) (None, 1024) 7031232
_________________________________________________________________
dense_2 (Dense) (None, 2) 2050
=================================================================
Total params: 7,033,282
Trainable params: 6,949,634
Non-trainable params: 83,648
_________________________________________________________________
Ideally with an increase in the input shape the number of parameters should increase, however as you can see they stay exactly the same. My questions are thus:
Why do the number of parameters not change with a change in the input_shape?
I have only defined one channel in my input_shape, what would happen to my model training in this scenario? The documentation says the following:
input_shape: optional shape tuple, only to be specified if include_top
is False (otherwise the input shape has to be (224, 224, 3) (with
'channels_last' data format) or (3, 224, 224) (with 'channels_first'
data format). It should have exactly 3 inputs channels, and width and
height should be no smaller than 32. E.g. (200, 200, 3) would be one
valid value.
However, when I run the model with this configuration it runs without any problems. Could there be something that I am missing?
Using Keras 2.2.4 with Tensorflow 1.12.0 as backend.
1.
In the convolutional layers the input size does not influence the number of weights, because the number of weights is determined by the kernel dimensions. A larger input size leads to a larger output size, but not to more weights.
This means that the output size of the convolutional part of the second model will be larger than for the first model, which would increase the number of weights in a following dense layer. However, since the model is built with pooling='avg', there is a GlobalAveragePooling2D layer after all the convolutional layers, which averages all the values for each output channel. That's why the output of DenseNet will be of size 1024, whatever the input shape.
2.
Yes, the model will still work. I'm not entirely sure about that, but my guess is that the single channel will be broadcast (duplicated) to fill all three channels. That's at least how these things are usually handled (see for example tensorflow or numpy).
With include_top=False and global pooling, the DenseNet is composed of two parts: the convolutional part and the global pooling part.
The number of trainable weights in the convolutional part doesn't depend on the input shape.
Usually a classification network would employ fully connected layers to infer the classification, but in DenseNet as used here, global pooling is applied instead and doesn't bring any trainable weights.
Therefore, the input shape doesn't affect the number of weights of the entire network.
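A toy sketch of my own (not DenseNet) that illustrates the point: the parameter count of a convolution + global-average-pooling model is identical for two different input sizes.
from keras.models import Sequential
from keras.layers import Conv2D, GlobalAveragePooling2D, Dense
def toy_model(input_shape):
    m = Sequential()
    m.add(Conv2D(8, (3, 3), activation='relu', input_shape=input_shape))  # weights depend only on kernel size and channels
    m.add(GlobalAveragePooling2D())  # output is (None, 8) regardless of spatial size
    m.add(Dense(2, activation='sigmoid'))
    return m
print(toy_model((224, 224, 1)).count_params())  # 98
print(toy_model((512, 512, 1)).count_params())  # 98 as well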

Error when running ANN with recurrent layer

I have created the ANN below with 2 fully connected layers and one recurrent layer. However, when running it I get the error: Exception: Input 0 is incompatible with layer lstm_11: expected ndim=3, found ndim=2. Why is this happening?
from keras.models import Sequential
from keras.layers import Dense
from sklearn.cross_validation import train_test_split
import numpy
from sklearn.preprocessing import StandardScaler
from keras.layers import LSTM
seed = 7
numpy.random.seed(seed)
dataset = numpy.loadtxt("sorted_output.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:15]
scaler = StandardScaler(copy=True, with_mean=True, with_std=True ) #data normalization
X = scaler.fit_transform(X) #data normalization
Y = dataset[:,15]
# split into 67% for train and 33% for test
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=seed)
# create model
model = Sequential()
model.add(Dense(12, input_dim=15, init='uniform', activation='relu'))
model.add(LSTM(10, return_sequences=True))
model.add(Dense(15, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model
model.fit(X_train, y_train, validation_data=(X_test,y_test), nb_epoch=150, batch_size=10)
Based on the above, all of your other layers are Dense layers, and in the LSTM you are returning the sequence (return_sequences=True). You should set return_sequences=False, since all the following layers are Dense layers, and it should work.
The reason for this error is that LSTM expects the input to have a shape of 3 dimensions (for batch, sequence length and input dimension). But the Dense layer before it outputs a shape of 2 dimensions (for batch and output dimension).
You can see the output shape of the Dense layer by executing below lines of code
>>> model = Sequential()
>>> model.add(Dense(12, input_dim=15, init='uniform', activation='relu'))
>>> model.summary()
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
dense_4 (Dense) (None, 12) 192 dense_input_2[0][0]
====================================================================================================
Total params: 192
Trainable params: 192
Non-trainable params: 0
____________________________________________________________________________________________________
However, you don't explain your intention with the model, so I cannot give you further guidance for this issue. What is your input data? Do you expect the input to be a sequence?
If your input is a sequence, then I suggest you to remove the first Dense layer. But if your input is not a sequence, then I suggest you to remove the LSTM layer.
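A minimal sketch of the first suggestion, under the assumption (mine, not the asker's) that the 15 input columns really form a sequence of length 15 with one feature per step; X_train, X_test, y_train, y_test are the arrays from the question:
from keras.models import Sequential
from keras.layers import LSTM, Dense
# reshape 2-D (samples, 15) data into 3-D (samples, time_steps, features) for the LSTM
X_train_seq = X_train.reshape((-1, 15, 1))
X_test_seq = X_test.reshape((-1, 15, 1))
model = Sequential()
model.add(LSTM(10, input_shape=(15, 1)))   # LSTM first, so it receives the 3-D input directly
model.add(Dense(15, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train_seq, y_train, validation_data=(X_test_seq, y_test), epochs=150, batch_size=10)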

merging recurrent layers with dense layer in Keras

I want to build a neural network where the two first layers are feedforward and the last one is recurrent.
Here is my code:
model = Sequential()
model.add(Dense(150, input_dim=23,init='normal',activation='relu'))
model.add(Dense(80,activation='relu',init='normal'))
model.add(SimpleRNN(2,init='normal'))
adam =OP.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
model.compile(loss="mean_squared_error", optimizer="rmsprop")
and I get this error:
Exception: Input 0 is incompatible with layer simplernn_11: expected ndim=3, found ndim=2.
model.compile(loss='mse', optimizer=adam)
It is correct that in Keras, an RNN layer expects input of shape (nb_samples, time_steps, input_dim). However, if you want to add an RNN layer after a Dense layer, you can still do that by reshaping the input for the RNN layer. Reshape can be used both as a first layer and as an intermediate layer in a sequential model. Examples are given below:
Reshape as first layer in a Sequential model
model = Sequential()
model.add(Reshape((3, 4), input_shape=(12,)))
# now: model.output_shape == (None, 3, 4)
# note: `None` is the batch dimension
Reshape as an intermediate layer in a Sequential model
model.add(Reshape((6, 2)))
# now: model.output_shape == (None, 6, 2)
For example, if you change your code in the following way, then there will be no error. I have checked it and the model compiled without any error reported. You can change the dimension as per your need.
from keras.models import Sequential
from keras.layers import Dense, SimpleRNN, Reshape
from keras.optimizers import Adam
model = Sequential()
model.add(Dense(150, input_dim=23,init='normal',activation='relu'))
model.add(Dense(80,activation='relu',init='normal'))
model.add(Reshape((1, 80)))
model.add(SimpleRNN(2,init='normal'))
adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
model.compile(loss="mean_squared_error", optimizer="rmsprop")
In Keras, you cannot put a Recurrent layer directly after a Dense layer because the Dense layer gives output as (nb_samples, output_dim), while a Recurrent layer expects input as (nb_samples, time_steps, input_dim). So a Dense layer gives a 2-D output, but a Recurrent layer expects a 3-D input. However, you can do the reverse, i.e., put a Dense layer after a Recurrent layer.
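A minimal sketch of that reverse ordering (my own illustration with arbitrary layer sizes): the 3-D input goes into the recurrent layer first, and Dense layers follow its 2-D output.
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense
model = Sequential()
model.add(SimpleRNN(16, input_shape=(None, 23)))  # input: (nb_samples, time_steps, 23), output: (nb_samples, 16)
model.add(Dense(8, activation='relu'))
model.add(Dense(2))
model.compile(loss='mean_squared_error', optimizer='rmsprop')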
