I have a pre-trained Fasttext model and I want to embed it in Keras.
model = Sequential()
model.add(Embedding(MAX_NB_WORDS,
                    EMBEDDING_DIM,
                    # input_length=X.shape[1],
                    input_length=4,
                    weights=[embedding_matrix],
                    trainable=False))
But it didn't work.
I found that many people have the same problem loading a pre-trained model into a Keras embedding layer, and none of those questions has a solution.
It seems like weights and embeddings_initializer are deprecated.
Is there any alternative method to solve the problem?
Thanks in advance
The weights parameter of the Keras Embedding layer is deprecated.
With the current API, the embedding layer looks like this:
embedding_layer = Embedding(num_words,
                            EMBEDDING_DIM,
                            embeddings_initializer=Constant(embedding_matrix),
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False)
You can find the latest details of the embedding layer here: Keras Embedding Layer
You can find an example of pretrained word embeddings here: Pretrained Word Embedding
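As a minimal sketch of the approach above, the snippet below loads a matrix into a frozen Embedding layer through embeddings_initializer and checks that a lookup returns the matching rows. The small random embedding_matrix is a hypothetical stand-in for one built from real FastText vectors, and input_length is left out because it is optional.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.initializers import Constant
from tensorflow.keras.layers import Embedding

# Hypothetical stand-in for a matrix built from FastText vectors:
# row i holds the vector of the word whose tokenizer index is i.
num_words, embedding_dim = 6, 4
embedding_matrix = np.random.rand(num_words, embedding_dim).astype("float32")

embedding_layer = Embedding(
    num_words,
    embedding_dim,
    embeddings_initializer=Constant(embedding_matrix),
    trainable=False,  # keep the pre-trained vectors frozen
)

# Calling the layer on token ids builds it and returns the matching rows.
token_ids = np.array([[1, 3, 5]])
vectors = embedding_layer(token_ids).numpy()
print(vectors.shape)
```

If the lookup matches embedding_matrix[[1, 3, 5]], the pre-trained weights were picked up correctly.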
Related
I am using a pretrained ResNet50 (from the tensorflow.keras.applications package) and fine-tuning it for multilabel classification (with 2 classes), and I'd like to extract saliency maps from the fine-tuned model.
To make a classifier, I add 2 dense layers to the ResNet model, creating a new sequential model as follows:
self.model = tf.keras.Sequential([
    resnet50,
    layers.Dense(1024, activation='relu', name='hidden_layer'),
    layers.Dense(2, activation='sigmoid', name='output')
])
My problem is that resnet50 becomes a single layer: its individual layers are no longer accessible, and the model summary contains only 3 layers. I'd like to know if there is a way to add layers to a functional model without creating a sequential model, so that I can still access each layer of the ResNet model.
Thank you in advance,
To access the layers of the resnet50 model, first get the resnet50 layer, and then from it get the convolution layer for which you want to create the saliency map:
self.model.get_layer("resnet50").get_layer("conv5_block2_3_conv")
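An alternative that addresses the original question more directly is to attach the new layers with the functional API, so the ResNet layers stay at the top level of the model. The following is a rough sketch under assumptions not stated in the question: include_top=False with pooling="avg", a 224x224x3 input, and weights=None (only to avoid a download here; fine-tuning would use weights="imagenet").

```python
import tensorflow as tf
from tensorflow.keras import layers

# Assumed configuration; the question does not show how resnet50 was created.
resnet50 = tf.keras.applications.ResNet50(
    include_top=False, weights=None, input_shape=(224, 224, 3), pooling="avg"
)

# Build the head on top of the ResNet output tensor instead of wrapping
# the whole ResNet model as one layer of a Sequential model.
x = layers.Dense(1024, activation="relu", name="hidden_layer")(resnet50.output)
outputs = layers.Dense(2, activation="sigmoid", name="output")(x)
model = tf.keras.Model(inputs=resnet50.input, outputs=outputs)

# Every ResNet layer is now directly reachable by name.
conv_layer = model.get_layer("conv5_block2_3_conv")
print(len(model.layers), conv_layer.name)
```

With this construction, model.summary() lists each ResNet layer individually rather than a single resnet50 entry.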
I am trying to fine tune Universal Sentence Encoder and use the new encoder layer for something else.
import tensorflow as tf
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Dense, Dropout
import tensorflow_hub as hub
module_url = "universal-sentence-encoder"
model = Sequential([
    hub.KerasLayer(module_url, input_shape=[], dtype=tf.string, trainable=True, name="use"),
    Dropout(0.5, name="dropout"),
    Dense(256, activation="relu", name="dense"),
    Dense(len(y), activation="sigmoid", name="activation")
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, batch_size=256, epochs=30, validation_split=0.25)
This worked. Loss went down and accuracy was decent. Now I want to extract just Universal Sentence Encoder layer. However, here is what I get.
Do you know how I can fix this NaN issue? I expected the encoding to contain numeric values.
Is saving the tuned_use layer as a model, as this post recommends, the only option? Ideally, I want to save the tuned_use layer just like Universal Sentence Encoder, so that I can load and use it in exactly the same way, as hub.KerasLayer(tuned_use_location, input_shape=[], dtype=tf.string).
Hoping this will help someone: I ended up solving this by using universal-sentence-encoder-4 instead of universal-sentence-encoder-large-5. I spent quite a lot of time troubleshooting, but it was tough because there was no issue with the input data and the model trained successfully. The cause might be exploding gradients, but I could not add gradient clipping or Leaky ReLU to the original architecture.
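For readers who suspect exploding gradients in a similar setup, clipping can also be requested at the optimizer level, which does not require changing the encoder itself. The sketch below uses a tiny dense model and random data as hypothetical stand-ins for the real encoder and dataset; whether this would have removed the NaN values in the case above is untested.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Tiny stand-in model; the real model would start with the hub.KerasLayer.
model = tf.keras.Sequential([
    layers.Dense(16, activation="relu"),
    layers.Dense(3, activation="sigmoid"),
])

# clipnorm rescales each gradient whose norm exceeds 1.0 before the update.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0)
model.compile(optimizer=optimizer, loss="binary_crossentropy")

# Random stand-in data: 32 samples, 8 features, 3 independent binary labels.
X = np.random.rand(32, 8).astype("float32")
y = np.random.randint(0, 2, size=(32, 3)).astype("float32")
history = model.fit(X, y, epochs=1, verbose=0)
print(history.history["loss"])
```

The value 1.0 is only an illustrative threshold, and binary_crossentropy is just a choice that suits the stand-in's sigmoid outputs.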
I have been trying to train a bidirectional LSTM using TensorFlow v2 keras for text classification. Below is the architecture:
model1 = Sequential()
model1.add(Embedding(vocab, 128,input_length=maxlength))
model1.add(Bidirectional(LSTM(32,dropout=0.2,recurrent_dropout=0.2,return_sequences=True)))
model1.add(Bidirectional(LSTM(16,dropout=0.2,recurrent_dropout=0.2,return_sequences=True)))
model1.add(GlobalAveragePooling1D())
model1.add(Dense(5, activation='softmax'))
model1.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model1.summary()
The summary output is where I am confused.
My question is about the output shapes of the BiLSTM layers: why are they (283, 64) and (283, 32) when the number of units is 32 and 16 respectively for the 2 layers? Here, maxlength=283 and vocab=19479.
I believe the explanation is the bidirectional nature of the LSTM layers you have added to your network: the size of each layer is doubled so that it also learns the sequence backwards. If you have any questions, you can ask me in the comments.
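This is because of Bidirectional. If you remove it, you'll see that the output shapes are (283, 32) and (283, 16). Bidirectional runs a second copy of the LSTM over the reversed sequence and, by default, concatenates the two outputs, which doubles the last dimension.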
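The doubling can be checked directly on a dummy input. This sketch compares a plain LSTM, a Bidirectional LSTM with the default merge_mode="concat", and one with merge_mode="sum"; the all-zeros tensor is only a placeholder with the same sequence length (283) and embedding size (128) as in the question.

```python
import tensorflow as tf
from tensorflow.keras.layers import LSTM, Bidirectional

# Dummy batch: 1 sequence, 283 time steps, 128 embedding features.
x = tf.zeros((1, 283, 128))

# Plain LSTM: last axis equals the number of units (32).
plain = LSTM(32, return_sequences=True)(x)

# Default merge_mode="concat": 32 forward + 32 backward = 64 features.
concat = Bidirectional(LSTM(32, return_sequences=True))(x)

# merge_mode="sum" adds the two directions, so the last axis stays at 32.
summed = Bidirectional(LSTM(32, return_sequences=True), merge_mode="sum")(x)

print(tuple(plain.shape), tuple(concat.shape), tuple(summed.shape))
```

So the 64 and 32 in the summary are 2 x 32 and 2 x 16, one half per direction.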
Has anyone used CNN in Keras to fine-tune a pre-trained BERT?
I have been trying to design this, but the pre-trained model comes with masks (I think from the embedding layer), and when the fine-tuning architecture is built on the output of one of the encoder layers it gives this error: Layer conv1d_1 does not support masking, but was passed an input_mask
So far I have tried some suggested workarounds, such as using keras_trans_mask to remove the mask before the CNN and add it back later, but that leads to other errors too.
Is it possible to disable masking in the pre-trained model, or is there a workaround for it?
EDIT: This is the code I'm working with
inputs = model.inputs[:2]
layer_output = model.get_layer('Encoder-12-FeedForward-Norm').output
conv_layer = keras.layers.Conv1D(100, kernel_size=3, activation='relu',
                                 data_format='channels_first')(layer_output)
maxpool_layer = keras.layers.MaxPooling1D(pool_size=4)(conv_layer)
flat_layer= keras.layers.Flatten()(maxpool_layer)
outputs = keras.layers.Dense(units=3, activation='softmax')(flat_layer)
model = keras.models.Model(inputs, outputs)
model.compile(RAdam(learning_rate=LR), loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
So layer_output carries a mask, and Conv1D cannot accept it.
I'm using TensorFlow 2.0 and a pre-trained VGG16 model, and I want to activate dropout during prediction. So far I have tried the following without success:
model = tf.keras.applications.VGG16(input_shape=(224, 224, 3), weights='imagenet', is_training=True)
model = tf.keras.applications.VGG16(input_shape=(224, 224, 3), weights='imagenet', dropout_rate=0.5)
However, neither of these approaches worked. How can I enable dropout during the prediction phase?
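The VGG16 architecture in Keras does not contain a dropout layer by default. You would need to insert dropout layers into the model, and then call the model with training=True so that dropout stays active at prediction time.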
Here is a post I found useful to solve this:
Add dropout layers between pretrained dense layers in keras
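The idea from that post can be sketched roughly as follows: re-apply the existing layers one by one, add a Dropout layer after the chosen dense layers, and call the rebuilt model with training=True. To keep the example light, a small functional model stands in for VGG16; the helper names build_stand_in and add_dropout are hypothetical, and the layer names fc1 and fc2 mirror the names of VGG16's fully connected layers in Keras.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers


def build_stand_in():
    """Small stand-in for the pre-trained network (hypothetical)."""
    inputs = tf.keras.Input(shape=(8,))
    x = layers.Dense(64, activation="relu", name="fc1")(inputs)
    x = layers.Dense(64, activation="relu", name="fc2")(x)
    outputs = layers.Dense(3, activation="softmax", name="predictions")(x)
    return tf.keras.Model(inputs, outputs)


def add_dropout(base, rate, after=("fc1", "fc2")):
    """Rebuild a linear model, inserting Dropout after the named layers."""
    x = base.input
    for layer in base.layers[1:]:  # skip the InputLayer
        x = layer(x)  # reuses the layer, so its weights are shared
        if layer.name in after:
            x = layers.Dropout(rate)(x)
    return tf.keras.Model(base.input, x)


mc_model = add_dropout(build_stand_in(), rate=0.5)
batch = np.random.rand(4, 8).astype("float32")

# training=True keeps dropout active, so repeated calls give different outputs.
stochastic_a = mc_model(batch, training=True).numpy()
stochastic_b = mc_model(batch, training=True).numpy()

# training=False disables dropout, so the output is deterministic.
deterministic = mc_model(batch, training=False).numpy()
```

Note that this loop only works for models whose layers form a single chain, which is the case for VGG16; model.predict runs in inference mode, so the direct call with training=True is what keeps dropout switched on.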