How to save an embedding layer which was created outside model? - python

I created a word embedding layer outside model and used it as input before fitting my model. Now I need to predict new sentences by this model, how can I save the pre-trained embedding layer and apply it to my new sentences?
Code example:
Before input to model and fitting:
embedding_sentence = tf.keras.layers.Embedding(vocab_size, model_dimension, trainable=True)
embedded_sentence = embedding_sentence(vectorized_sentence)
Model fitting:
model = tf.keras.Sequential()
model.add(tf.keras.layers.GlobalAveragePooling1D())
...
Now I need to predict new sentences, how can I apply the trained embedding to them?

The above information is insufficient to answer this question accurately but still, I will give it a try. In tensorflow, you can use a function named get_weights to get the weights of a pre-train embedding layer and save it in a numpy/hd5 file which can be used later as an embedding layer in a new architecture.
weights = embedding_sentence.get_weights()
np.save('embedding_weights.npy', weights)
# Now load the weights to the embedding layer again
new_embedding_sentence = tf.keras.layers.Embedding(vocab_size, model_dimension, trainable=True)
new_embedding_sentence.build((None,)) # required to set the weights
new_embedding_sentence.set_weights(weights)
new_sentence = "This a dummy sentence"
new_sentence_embedding = new_embedding_sentence(new_sentence )
predictions = model(new_sentence_embedding)

Related

get outputs of classifier layers from '.pt' pretrained AlexNet model

I have a '.pt' file that includes alexnet model which trained on my dataset. How I can get "out_features" of classifier layers (layers 1 & 4) after running my model for different dataset.
I needs this data for inputs of SVM.
I have tried:
Model(inputs, outputs=model.classifier[1].out_features)
model.classifier[1].out_features(inputs)
model.classifier[1].parameters(torch.tensor(inputs))
but they didn't work
at first you have to load model
model = torch.load(model.pt)
after that you have to remove the last layer
features = list(model.classifier.children())[:-4] # Remove last layer
model.classifier = nn.Sequential(*features)
then you can have the weights of last layer by applying model for inputs
out = model(inputs)

Getting embeddings from wav2vec2 models in HuggingFace

I am trying to get the embeddings from pre-trained wav2vec2 models (e.g., from jonatasgrosman/wav2vec2-large-xlsr-53-german) using my own dataset.
My aim is to use these features for a downstream task (not specifically speech recognition). Namely, since the dataset is relatively small, I would train an SVM with these embeddings for the final classification.
So far I have tried this:
model_name = "facebook/wav2vec2-large-xlsr-53-german"
feature_extractor = Wav2Vec2Processor.from_pretrained(model_name)
model = Wav2Vec2Model.from_pretrained(model_name)
input_values = feature_extractor(train_dataset[:10]["speech"], return_tensors="pt", padding=True,
feature_size=1, sampling_rate=16000 ).input_values
Then, I am not sure whether the embeddings here correspond to the sequence of last_hidden_states:
hidden_states = model(input_values).last_hidden_state
or to the sequence of features of the last conv layer of the model:
features_last_cnn_layer = model(input_values).extract_features
Also, is this the correct way to extract features from a pre-trained model?
How one can get embeddings from a specific layer?
PD: Posting here as the HuggingFace's forum seems to be less active.
Just check the documentation:
last_hidden_state (torch.FloatTensor of shape (batch_size,
sequence_length, hidden_size)) – Sequence of hidden-states at the
output of the last layer of the model.
extract_features (torch.FloatTensor of shape (batch_size,
sequence_length, conv_dim[-1])) – Sequence of extracted feature
vectors of the last convolutional layer of the model.
The last_hidden_state vector represents so called contextualized embeddings (i.e. every feature (CNN output) has a vector representation that is to some extend influenced by the other tokens of the sequence).
The extract_features vector represents the embeddings of your input (after the CNNs).
.
Also, is this the correct way to extract features from a pre-trained
model?
Yes.
How one can get embeddings from a specific layer?
Set output_hidden_states=True:
o = model(input_values,output_hidden_states=True)
o.keys()
Output:
odict_keys(['last_hidden_state', 'extract_features', 'hidden_states'])
The hidden_states value contains the embeddings and the contextualized embeddings of each attention layer.
P.S.: jonatasgrosman/wav2vec2-large-xlsr-53-german model was trained with feat_extract_norm==layer. That means, you should also pass an attention mask to the model:
model_name = "facebook/wav2vec2-large-xlsr-53-german"
feature_extractor = Wav2Vec2Processor.from_pretrained(model_name)
model = Wav2Vec2Model.from_pretrained(model_name)
i= feature_extractor(train_dataset[:10]["speech"], return_tensors="pt", padding=True,
feature_size=1, sampling_rate=16000 )
model(**i)

PyTorch transfer learning with pre-trained ImageNet model

I want to create an image classifier using transfer learning on a model already trained on ImageNet.
How do I replace the final layer of a torchvision.models ImageNet classifier with my own custom classifier?
Get a pre-trained ImageNet model (resnet152 has the best accuracy):
from torchvision import models
# https://pytorch.org/docs/stable/torchvision/models.html
model = models.resnet152(pretrained=True)
Print out its structure so we can compare to the final state:
print(model)
Remove the last module (generally a single fully connected layer) from model:
classifier_name, old_classifier = model._modules.popitem()
Freeze the parameters of the feature detector part of the model so that they are not adjusted by back-propagation:
for param in model.parameters():
param.requires_grad = False
Create a new classifier:
classifier_input_size = old_classifier.in_features
classifier = nn.Sequential(OrderedDict([
('fc1', nn.Linear(classifier_input_size, hidden_layer_size)),
('activation', nn.SELU()),
('dropout', nn.Dropout(p=0.5)),
('fc2', nn.Linear(hidden_layer_size, output_layer_size)),
('output', nn.LogSoftmax(dim=1))
]))
The module name for our classifier needs to be the same as the one which was removed. Add our new classifier to the end of the feature detector:
model.add_module(classifier_name, classifier)
Finally, print out the structure of the new network:
print(model)

Extending the vocabulary on deployment

When doing training, I initialize my embedding matrix, using the pretrained embeddings picked for words in training set vocabulary.
import torchtext as tt
contexts = tt.data.Field(lower=True, sequential=True, tokenize=tokenizer, use_vocab=True)
contexts.build_vocab(data, vectors="fasttext.en.300d",
vectors_cache=config["vectors_cache"])
In my model I pass contexts.vocab as parameter and initialize embeddings:
embedding_dim = vocab.vectors.shape[1]
self.embeddings = nn.Embedding(len(vocab), embedding_dim)
self.embeddings.weight.data.copy_(vocab.vectors)
self.embeddings.weight.requires_grad=False
I train my model and during training I save its 'best' state via torch.save(model, f).
Then I want to test/create demo for model in separate file for evaluation. I load the model via torch.load. How do I extend the embedding matrix to contain test vocabulary? I tried to replace embedding matrix
# data is TabularDataset with test data
contexts.build_vocab(data, vectors="fasttext.en.300d",
vectors_cache=config["vectors_cache"])
model.embeddings = torch.nn.Embedding(len(contexts.vocab), contexts.vocab.vectors.shape[1])
model.embeddings.weight.data.copy_(contexts.vocab.vectors)
model.embeddings.weight.requires_grad = False
But the results are terrible (almost 0 accuracy). Model was doing good during training. What is the 'correct way' of doing this?

Delete last layer and insert three Conv2D layers in Keras

I have a model in Keras for classification that I trained on some dataset. Call that model "classification_model". That model is saved in "classification.h5". Model for detection is the same, except that we delete last convolutional layer, and add three Conv2D layers of size (3,3). So, our model for detection "detection_model" should look like this:
detection_model = classification_model[: last_conv_index] + Conv2d + Conv2d + Conv2d.
How can we implement that in Keras?
Well, load your classification model and use Keras functional API to construct your new model:
model = load_model("classification.h5")
last_conv_layer_output = model.layers[last_conv_index].output
conv = Conv2D(...)(last_conv_layer_output)
conv = Conv2D(...)(conv)
output = Conv2D(...)(conv)
new_model = Model(model.inputs, output)
# compile the new model and save it ...

Categories