I'm trying to replace the embedding layer in a Keras NLP model. I've trained the model for one language, but I would like to transfer it to another language for which I have comparable embeddings. I hope to achieve this by replacing the index-to-embedding mapping for the source language with the index-to-embedding mapping for the target language.
I've tried to do it like this:
from keras.layers import Embedding
from keras.models import load_model
filename = "my_model.h5"
model = load_model(filename)
new_embedding_layer = Embedding(1000, 300, weights=[my_new_embedding_matrix], trainable=False)
new_embedding_layer.build((None, None))
model.layers[0] = new_embedding_layer
When I print out the model summary, this seems to have worked: the new embedding layer has the correct number of parameters (1000*300=300,000):
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_85 (Embedding) multiple 300000
_________________________________________________________________
lstm_1 (LSTM) (None, 128) 219648
_________________________________________________________________
dense_1 (Dense) (None, 23) 2967
=================================================================
Total params: 522,615
Trainable params: 222,615
Non-trainable params: 300,000
However, when I use the new model to process new examples, nothing seems to have changed: it still accepts input sequences that have values larger than the new vocabulary size of 1000, and returns the same predictions as before.
import numpy as np

seq = np.array([10000])  # index above the new 1000-word vocabulary
model.predict([seq])
I also notice that the output shape of the new embedding layer is "multiple" rather than (None, None, 300). Maybe this is related to the problem?
Can anyone tell me what I'm missing?
If your Embedding layers have the same shape, then you can simply load your model as you did:
from keras.models import load_model
filename = "my_model.h5"
model = load_model(filename)
Then, rather than building a new embedding layer, you can simply set the weights of the old one (note that set_weights expects a list of arrays):
model.layers[idx_of_your_embedding_layer].set_weights([my_new_embedding_matrix])
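Putting it together, a minimal sketch of the whole flow, assuming (as in the question) that the embedding is the first layer and that my_new_embedding_matrix matches the original (vocab_size, embedding_dim) shape:
from keras.models import load_model

model = load_model("my_model.h5")
emb = model.layers[0]  # assumes the embedding is the first layer
old_weights = emb.get_weights()[0]
assert my_new_embedding_matrix.shape == old_weights.shape  # shapes must match
emb.set_weights([my_new_embedding_matrix])
emb.trainable = False  # optional: freeze the transferred embeddings (recompile before further training)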
I'm working on a project and I need my CNN to produce the output of the "Flatten" layer.
No classification, just a vector of features for the input photo, and I'm kind of lost... I know everything about CNN structure, but how can I start doing this with Python?
Another alternative is the following.
Imagine you have a tf.keras model (here I take a small one for the sake of simplicity).
>>> model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 128) 100480
_________________________________________________________________
dense_2 (Dense) (None, 64) 8256
_________________________________________________________________
dense_3 (Dense) (None, 32) 2080
_________________________________________________________________
dense_4 (Dense) (None, 1) 33
=================================================================
Total params: 110,849
Trainable params: 110,849
Non-trainable params: 0
_________________________________________________________________
Let's say you want a 32-long feature vector, corresponding to the layer dense_3.
Now you can create another model:
outputs = model.get_layer('dense_3').output
child_model = tf.keras.Model(inputs=model.inputs, outputs=outputs)
and your child model does what you want. You need to call
features = child_model.predict(image)
Important note: if you print model.layers and child_model.layers, you will see that they share the same layer objects at the same memory addresses. This means training one will also update the weights of the other.
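You can verify the sharing directly, using the dense_3 layer from the summary above:
print(model.get_layer('dense_3') is child_model.get_layer('dense_3'))  # True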
Are you using Keras? If you are, you can take a look here for examples (https://keras.io/api/applications/#extract-features-with-vgg16).
Basically, you do
import numpy as np

features = model.predict(x)
np.save(outfile, features)  # outfile is your desired output filename
You can load the file back using
features = np.load(outfile)
I am working on a gesture recognition problem, for which I have a training set. The training set consists of multiple folders, and each folder contains a series of 30 images from which the model is trained. I also have a CSV file that contains the class label of each folder. The class labels are: "Left Swipe", "Right Swipe", "Stop", "Thumbs Down" and "Thumbs Up". Those labels are stored in one np.array variable, train_class. Now, I have created a CNN model and am feeding it into a Sequential model.
The code is available at the GitHub location below:
https://github.com/subhrajyoti-ghosh/ML-and-Deep-Learning/blob/main/Gesture_Recognition.ipynb
But when I try to fit the model, I receive an error. Can you please help me understand the error and how to solve it?
You are trying to use a TimeDistributed layer on a 2D input (batch_size, 256), which will not work, because the layer needs at least a 3D tensor. You should try using tf.keras.layers.RepeatVector:
import tensorflow as tf

# ResNet50 backbone without its classification head
resnet = tf.keras.applications.ResNet50(include_top=False, weights='imagenet', input_shape=(224, 224, 3))
cnn = tf.keras.Sequential([resnet])
cnn.add(tf.keras.layers.Conv2D(64, (2, 2), strides=(1, 1)))
cnn.add(tf.keras.layers.Conv2D(16, (3, 3), strides=(1, 1)))
cnn.add(tf.keras.layers.Flatten())  # -> (batch, 256) feature vector
inputs = tf.keras.layers.Input(shape=(224, 224, 3))
x = cnn(inputs)
x = tf.keras.layers.RepeatVector(n=30)(x)  # (batch, 30, 256): repeat the features for each time step
x = tf.keras.layers.GRU(16, return_sequences=True)(x)
x = tf.keras.layers.GRU(8)(x)
outputs = tf.keras.layers.Dense(5, activation='softmax')(x)  # 5 gesture classes
model = tf.keras.Model(inputs, outputs)
dummy_x = tf.random.normal((1, 224, 224, 3))
print(model.summary())
print(model(dummy_x))
Model: "model_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_14 (InputLayer) [(None, 224, 224, 3)] 0
sequential_6 (Sequential) (None, 256) 24121296
repeat_vector_2 (RepeatVector) (None, 30, 256) 0
gru_5 (GRU) (None, 30, 16) 13152
gru_6 (GRU) (None, 8) 624
dense_7 (Dense) (None, 5) 45
=================================================================
Total params: 24,135,117
Trainable params: 24,081,997
Non-trainable params: 53,120
_________________________________________________________________
None
I'm trying to get some heatmaps from a computer vision model that is already working to classify images, but I'm running into some difficulties.
This is the model summary:
model.summary()
Model: "model_4"
Layer (type) Output Shape Param #
=================================================================
input_9 (InputLayer) [(None, 512, 512, 1)] 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 512, 512, 3) 30
_________________________________________________________________
densenet121 (Functional) (None, 1024) 7037504
_________________________________________________________________
dense_4 (Dense) (None, 100) 102500
_________________________________________________________________
dropout_4 (Dropout) (None, 100) 0
_________________________________________________________________
predictions (Dense) (None, 2) 202
=================================================================
Total params: 7,140,236
Trainable params: 7,056,588
Non-trainable params: 83,648
As part of the standard process to create a heatmap, I know I have to access the last convolutional layer in the model, which in this case I'd say is a layer inside the DenseNet121, but I cannot find a way to access all the layers belonging to densenet121.
Right now I've been using the conv2d_4 layer to run some tests, but I feel it is not the right one, because that layer comes before all the transfer-learning work from DenseNet.
Also, I looked for "Functional" layers in the official Keras documentation but couldn't find them, so I guess it's not a layer; it's the whole DenseNet model embedded there, but I cannot find a way to access it.
By the way, here I share the model construction because it may help to answer this:
from tensorflow.keras.applications.densenet import DenseNet121
from tensorflow.keras.layers import Input, Conv2D, Dense, Dropout
from tensorflow.keras.models import Model

num_classes = 2
IMG_SIZE = 512  # per the model summary above: input shape (None, 512, 512, 1)
input_tensor = Input(shape=(IMG_SIZE, IMG_SIZE, 1))
x = Conv2D(3, (3, 3), padding='same')(input_tensor)  # map 1 channel to the 3 that DenseNet expects
x = DenseNet121(include_top=False, classes=2, pooling="avg", weights="imagenet")(x)
x = Dense(100)(x)
x = Dropout(0.45)(x)
predictions = Dense(num_classes, activation='softmax', name="predictions")(x)
model = Model(inputs=input_tensor, outputs=predictions)
I found you can use .get_layer() twice to access layers inside the functional DenseNet model embedded in the "main" model.
In this case I can use model.get_layer('densenet121').summary() to check all the layers inside the embedded model, and then use them with this code: model.get_layer('densenet121').get_layer('xxxxx')
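For example, a minimal sketch for the heatmap use case; the layer name 'relu' below is the final activation of the stock DenseNet121, but confirm the exact name with the nested summary() first:
import tensorflow as tf

densenet = model.get_layer('densenet121')
# Feature extractor over the embedded sub-model, ending at its last activation
last_conv = densenet.get_layer('relu')  # verify this name via densenet.summary()
feature_model = tf.keras.Model(inputs=densenet.input, outputs=last_conv.output)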
I want to perform regression on a dataset where the input has multiple features and the output has multiple continuous targets.
I've been looking through the sklearn documentation, but the only multi-target examples I've found have either 1) a discrete set of target labels or 2) use a heuristic algorithm like KNN instead of an optimization-based algorithm like regression. Adding regularization would also be great, but I can't find a method even for simple least-squares. This is a really simple, smooth optimization problem so I'd be shocked if it wasn't already implemented somewhere. I'd appreciate it if someone could point me in the right direction!
You can find what you are looking for here.
https://machinelearningmastery.com/multi-output-regression-models-with-python/
But it would be better to try Keras if you have enough data (use an output layer without any activation):
from keras.layers import Dense, Input
from keras.models import Model
from keras.regularizers import l2

num_inputs = 10
num_outputs = 4
inp = Input((num_inputs,))
# Linear (no-activation) output layer with L2 weight regularization
out = Dense(num_outputs, kernel_regularizer=l2(0.01))(inp)
model = Model(inp, out)
model.compile(optimizer='sgd', loss='mse', metrics=['mse'])  # accuracy is not meaningful for continuous targets
model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_9 (InputLayer) (None, 10) 0
_________________________________________________________________
dense_7 (Dense) (None, 4) 44
=================================================================
Total params: 44
Trainable params: 44
Non-trainable params: 0
_________________________________________________________________
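A hypothetical fit call with random data, just to show the shapes involved (N samples, num_inputs features in, num_outputs continuous targets out):
import numpy as np

X = np.random.rand(256, num_inputs)   # 256 samples, 10 features each
y = np.random.rand(256, num_outputs)  # 256 samples, 4 continuous targets each
model.fit(X, y, epochs=10, batch_size=32)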
I have been through the Keras documentation, but I am still unable to figure out how the input_shape parameter works and why it does not change the number of parameters for my DenseNet model when I pass it my custom input shape. An example:
import keras
from keras import applications
from keras.layers import Conv3D, MaxPool3D, Flatten, Dense
from keras.layers import Dropout, Input, BatchNormalization
from keras import Model
# define model 1
INPUT_SHAPE = (224, 224, 1) # used to define the input size to the model
n_output_units = 2
activation_fn = 'sigmoid'
densenet_121_model = applications.densenet.DenseNet121(include_top=False, weights=None, input_shape=INPUT_SHAPE, pooling='avg')
inputs = Input(shape=INPUT_SHAPE, name='input')
model_base = densenet_121_model(inputs)
output = Dense(units=n_output_units, activation=activation_fn)(model_base)
model = Model(inputs=inputs, outputs=output)
model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input (InputLayer) (None, 224, 224, 1) 0
_________________________________________________________________
densenet121 (Model) (None, 1024) 7031232
_________________________________________________________________
dense_1 (Dense) (None, 2) 2050
=================================================================
Total params: 7,033,282
Trainable params: 6,949,634
Non-trainable params: 83,648
_________________________________________________________________
# define model 2
INPUT_SHAPE = (512, 512, 1) # used to define the input size to the model
n_output_units = 2
activation_fn = 'sigmoid'
densenet_121_model = applications.densenet.DenseNet121(include_top=False, weights=None, input_shape=INPUT_SHAPE, pooling='avg')
inputs = Input(shape=INPUT_SHAPE, name='input')
model_base = densenet_121_model(inputs)
output = Dense(units=n_output_units, activation=activation_fn)(model_base)
model = Model(inputs=inputs, outputs=output)
model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input (InputLayer) (None, 512, 512, 1) 0
_________________________________________________________________
densenet121 (Model) (None, 1024) 7031232
_________________________________________________________________
dense_2 (Dense) (None, 2) 2050
=================================================================
Total params: 7,033,282
Trainable params: 6,949,634
Non-trainable params: 83,648
_________________________________________________________________
Ideally, with an increase in the input shape the number of parameters should increase; however, as you can see, they stay exactly the same. My questions are thus:
Why do the number of parameters not change with a change in the input_shape?
I have only defined one channel in my input_shape, what would happen to my model training in this scenario? The documentation says the following:
input_shape: optional shape tuple, only to be specified if include_top
is False (otherwise the input shape has to be (224, 224, 3) (with
'channels_last' data format) or (3, 224, 224) (with 'channels_first'
data format). It should have exactly 3 inputs channels, and width and
height should be no smaller than 32. E.g. (200, 200, 3) would be one
valid value.
However, when I run the model with this configuration, it runs without any problems. Could there be something that I am missing?
Using Keras 2.2.4 with Tensorflow 1.12.0 as backend.
1.
In convolutional layers the input size does not influence the number of weights, because the number of weights is determined by the kernel dimensions. A larger input size leads to a larger output size, but not to a larger number of weights.
This means that the output size of the convolutional layers of the second model will be larger than for the first model, which would increase the number of weights in a following dense layer. However, if you take a look at the architecture of DenseNet, you'll notice that there is a global pooling layer after all the convolutional layers (GlobalAveragePooling2D here, since the models are built with pooling='avg'), which averages all the values for each output channel. That's why the output of DenseNet will be of size 1024, whatever the input shape.
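A quick numeric check of the first point: a Conv2D layer's parameter count is kernel_h * kernel_w * in_channels * filters + filters, independent of the spatial input size. A sketch:
import tensorflow as tf

# 3*3*1*64 + 64 = 640 parameters for both input sizes
for size in (224, 512):
    conv = tf.keras.layers.Conv2D(64, (3, 3))
    conv.build((None, size, size, 1))
    print(size, conv.count_params())  # prints 640 both times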
2.
Yes, the model will still work. I'm not entirely sure about that, but my guess is that the single channel will be broadcast (duplicated) to fill all three channels. That's at least how these things are usually handled (see for example TensorFlow or NumPy).
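For reference, this is the broadcasting behavior referred to, as NumPy implements it for a size-1 trailing axis:
import numpy as np

gray = np.zeros((224, 224, 1))
rgb_like = np.broadcast_to(gray, (224, 224, 3))  # the single channel is repeated 3 times
print(rgb_like.shape)  # (224, 224, 3)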
The DenseNet is composed of two parts, the convolution part, and the global pooling part.
The number of the convolution part's trainable weights doesn't depend on the input shape.
Usually, a classification network employs fully connected layers to infer the classification; in DenseNet, however, global pooling is used, which doesn't bring any trainable weights.
Therefore, the input shape doesn't affect the number of weights of the entire network.
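A quick check of that second point: global average pooling collapses any spatial size to a fixed-length vector and carries no weights. A sketch:
import tensorflow as tf

gap = tf.keras.layers.GlobalAveragePooling2D()
print(gap(tf.zeros((1, 7, 7, 1024))).shape)    # (1, 1024)
print(gap(tf.zeros((1, 16, 16, 1024))).shape)  # (1, 1024), regardless of spatial size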