Feature extraction from LSTM to Sklearn models - python

I have an LSTM model and I want to extract features from it to feed into a Random Forest or a logistic regression in scikit-learn.
model = tf.keras.Sequential()
inputs = tf.keras.Input(shape=(t+1, n_features))
x=tf.keras.layers.LSTM(128, dropout=0.1, return_sequences=True)(inputs)
x1=tf.keras.layers.LSTM(128, dropout=0.1, return_sequences=False)(x)
o=tf.keras.layers.Dense(3,activation='softmax')(x1)
model = tf.keras.Model(inputs = inputs, outputs = o)
So I want to use x1 as the input to my Random Forest.
Any ideas?
Thanks :)

Just create a model with the desired input/output tensors. For example:
feat_extractor = tf.keras.Model(inputs=inputs, outputs=x1)
# Then, assuming X is a batch of input patterns:
feats = feat_extractor.predict(X)
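The extracted features can then be passed to scikit-learn like any other 2-D array. The snippet below is only a sketch of that downstream step: `feats` is replaced by random data of the same shape as the output of the second LSTM, (n_samples, 128), so that it runs without the trained network, and the labels are made up. Note that scikit-learn classifiers expect integer class labels; if the Keras model was trained on one-hot targets, they would first need converting, for example with `np.argmax(y_onehot, axis=1)`.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Stand-in for feats = feat_extractor.predict(X): one 128-dim vector per sample.
feats = rng.normal(size=(200, 128))
# Stand-in integer class labels for the 3 classes of the softmax layer.
y = np.arange(200) % 3

# Fit the forest on the first 150 samples and predict on the remaining 50.
rf = RandomForestClassifier(n_estimators=50, random_state=0)
rf.fit(feats[:150], y[:150])
proba = rf.predict_proba(feats[150:])

print(proba.shape)  # one row per sample, one column per class
```

Swapping `RandomForestClassifier` for `sklearn.linear_model.LogisticRegression` should work the same way, since both take the feature matrix and a label vector in `fit`.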

Related

How to save an embedding layer which was created outside model?

I created a word embedding layer outside the model and applied it to the input before fitting my model. Now I need to predict new sentences with this model. How can I save the pre-trained embedding layer and apply it to my new sentences?
Code example:
Before feeding the input to the model and fitting:
embedding_sentence = tf.keras.layers.Embedding(vocab_size, model_dimension, trainable=True)
embedded_sentence = embedding_sentence(vectorized_sentence)
Model fitting:
model = tf.keras.Sequential()
model.add(tf.keras.layers.GlobalAveragePooling1D())
...
Now I need to predict new sentences. How can I apply the trained embedding to them?
The information above is insufficient to answer this question accurately, but I will give it a try. In TensorFlow, you can use the get_weights method to get the weights of a pre-trained embedding layer and save them to a NumPy or HDF5 file, which can later be loaded into an embedding layer in a new architecture.
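weights = embedding_sentence.get_weights()  # list holding one (vocab_size, model_dimension) array
np.save('embedding_weights.npy', weights[0])
# Now load the weights into a new embedding layer
new_embedding_sentence = tf.keras.layers.Embedding(vocab_size, model_dimension, trainable=True)
new_embedding_sentence.build((None,)) # required to set the weights
new_embedding_sentence.set_weights([np.load('embedding_weights.npy')])
new_sentence = "This is a dummy sentence"
# The embedding layer expects integer token ids, not raw text. Vectorize new_sentence
# exactly as the training sentences were vectorized (that step is not shown in the question);
# vectorized_new_sentence is a placeholder name for the result.
new_sentence_embedding = new_embedding_sentence(vectorized_new_sentence)
predictions = model(new_sentence_embedding)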
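To make the save/load round trip concrete, here is a NumPy-only sketch of what the restored weights are used for. It is an illustration under assumptions, not the asker's pipeline: the matrix is random instead of trained, the sizes and token ids are invented, and the lookup is written as plain array indexing, which is what an embedding layer computes.

```python
import os
import tempfile

import numpy as np

vocab_size, model_dimension = 10, 4  # hypothetical sizes
rng = np.random.default_rng(0)

# Stand-in for embedding_sentence.get_weights()[0] after training.
trained_matrix = rng.normal(size=(vocab_size, model_dimension))

# Save the matrix to disk and load it back.
path = os.path.join(tempfile.mkdtemp(), "embedding_weights.npy")
np.save(path, trained_matrix)
restored_matrix = np.load(path)

# Hypothetical token ids for one vectorized sentence (batch of size 1).
token_ids = np.array([[1, 5, 2, 0]])

# An embedding lookup is row indexing into the weight matrix.
embedded = restored_matrix[token_ids]   # shape (1, 4, model_dimension)
# Averaging over the token axis mirrors GlobalAveragePooling1D without a mask.
pooled = embedded.mean(axis=1)          # shape (1, model_dimension)

print(embedded.shape, pooled.shape)
```

The key point is that the new sentence has to be mapped to the same integer ids as during training; otherwise the restored rows correspond to the wrong words.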

How to get the latent vector as an output from a cnn model before training to the fully connected layer?

I am working on a CNN model using the TensorFlow framework in Google Colab. I am unable to extract the latent vectors from the convolutional layers. I want to extract the output of the convolutional layers, i.e. the layers before the fully connected layer.
I have tried with the following code
a = dropout()(classifier_model.output)
print(a)
I could not understand the solution suggested in the linked Stack Overflow answer about printing the value of a TensorFlow object after applying a conv-pool layer.
Any suggestions?
You can use the get_layer method of the Model class to get a layer by its name. Find below an example with a dummy 1D CNN and a binary classifier:
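import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv1D, ReLU, MaxPool1D, Flatten, Dense
from tensorflow.keras.models import Model
timesteps = 100
nfeatures = 2
# build the model using the functional API
# example of a 1D CNN inspired by your Stack Overflow link, but using a model instead of successive *raw* layers
# the values of the Conv1D filters and kernels are different
input = Input((timesteps, nfeatures))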
p = Conv1D(filters=16, kernel_size=10)(input)
p = ReLU()(p)
p = MaxPool1D(pool_size=2)(p)
p = Conv1D(filters=32, kernel_size=10)(p)
p = ReLU()(p)
p = MaxPool1D(pool_size=2)(p)
p = Conv1D(filters=64, kernel_size=10)(p)
p = ReLU()(p)
p = MaxPool1D(pool_size=2, name='conv1Dfeat')(p) # give a name to the CNN output
# fully connected part
p = Flatten()(p)
p = Dense(10)(p)
# could add a dropout layer to ease optimization
finaloutput = Dense(1, activation='sigmoid')(p)
# full model
model = Model(inputs=input, outputs=finaloutput)
# compile network, i.e. define optimizer, loss and metrics
model.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
You need to train the model using the fit method with some data. Then you can get the output of the layer named conv1Dfeat (the last layer of the convolutional part) by defining the model:
modelCNN = Model(inputs=input, outputs=model.get_layer('conv1Dfeat').output)
modelCNN.summary()
If you want to get the output of the convolutional part for, say, a single NumPy input array of shape (timesteps, nfeatures), you can use the predict method of the Model class on batched data:
data = np.random.normal(size=(timesteps, nfeatures)) # dummy data
data_tf = tf.expand_dims(data, axis=0) # convert to TF tensor and add batch dimension at the same time
cnn_out_np = modelCNN.predict(data_tf)
cnn_out_np = np.squeeze(cnn_out_np, axis=0) # remove batch dimension
print(cnn_out_np.shape)
(4, 64)
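The printed shape (4, 64) can be checked by hand. The sketch below assumes the Keras defaults used in the model above: Conv1D with padding='valid' and stride 1, and MaxPool1D with padding='valid' and a stride equal to the pool size. The helper function names are made up for this illustration.

```python
def conv1d_valid_len(n, kernel_size, stride=1):
    """Output length of a 'valid' 1D convolution over n time steps."""
    return (n - kernel_size) // stride + 1


def maxpool1d_valid_len(n, pool_size, stride=None):
    """Output length of a 'valid' 1D max pooling; stride defaults to pool_size."""
    stride = pool_size if stride is None else stride
    return (n - pool_size) // stride + 1


length = 100  # timesteps
lengths = []
for _ in range(3):  # three Conv1D(kernel_size=10) + MaxPool1D(pool_size=2) stages
    length = conv1d_valid_len(length, kernel_size=10)
    lengths.append(length)
    length = maxpool1d_valid_len(length, pool_size=2)
    lengths.append(length)

print(lengths)  # [91, 45, 36, 18, 9, 4]
```

The final length of 4, combined with the 64 filters of the last Conv1D, gives the (4, 64) feature map per input.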

Combine outputs of two Pre Trained models (trained on different dataset) and use some form of binary classifier to predict images

I have two Pre-Trained models.
Model_1 = Inception Model with Imagenet Dataset (1000 classes)
My_Model = Inception Model trained with a custom dataset (20 classes) via Transfer Learning and Fine-Tuning
I would like to combine the outputs of both models (Model_1 and My_Model) in a new layer.
The new layer should use some binary classifier to tell whether to use Model_1 or My_Model for prediction based on the input image.
For Example:
If I try to predict a "Dog" image, the binary classifier that combines both models should say that I need to use Model_1, since Model_1 was trained with dog images whereas My_Model was not.
Can anyone tell me how to achieve this? Some example implementation or code snippet will be helpful.
Thanks
To do this, you need to build a combined model and then train it on another custom dataset. Here is an example of what the combined model can look like. To make the dataset, take each image and decide which model you'd like to use for it; you can then train the combined model to output a positive value for one model and a negative value for the other. Hope it helps.
import numpy as np
import keras
from keras.layers import Input, Dense, Flatten, Concatenate
from keras.models import Model
from tensorflow.python.client import device_lib
# check for my gpu
print(device_lib.list_local_devices())
# making some models like the ones you have
input_shape = (10000, 3)
m1_input = Input(shape = input_shape, name = "m1_input")
fc = Flatten()(m1_input)
m1_output = Dense(1000, activation='sigmoid',name = "m1_output")(fc)
Model_1 = Model(m1_input,m1_output)
m2_input = Input(shape = input_shape, name = "m2_input")
fc = Flatten()(m2_input)
m2_output = Dense(20, activation='sigmoid',name = "m2_output")(fc)
My_Model = Model(m2_input,m2_output)
# set the trained models to be untrainable
for layer in Model_1.layers:
    layer.trainable = False
for layer in My_Model.layers:
    layer.trainable = False
#build a combined model
combined_model_input = Input(shape = input_shape, name = "combined_model_input")
m1_predict = Model_1(combined_model_input)
m2_predict = My_Model(combined_model_input)
combined = Concatenate()([m1_predict, m2_predict])
fc = Dense(500, activation='sigmoid',name = "fc1")(combined)
fc = Dense(100, activation='sigmoid',name = "fc2")(fc)
output_layer = Dense(1, activation='tanh',name = "fc3")(fc)
model = Model(combined_model_input, output_layer)
# check the number of parameters that are trainable
model.summary()
# pseudocode to show how to make a training set for the combined model
# (images, class_of and list_of_my_model_classes are placeholders):
combined_model_y = []
for im in images:
    if class_of(im) in list_of_my_model_classes:
        combined_model_y.append(1)
    else:
        combined_model_y.append(-1)
combined_model_y = np.array(combined_model_y)
# then train the combined model; hinge loss matches the tanh output and the -1/+1 labels
# (binary_crossentropy would expect 0/1 labels and outputs in [0, 1])
model.compile('adam', loss='hinge')
model.fit(images, combined_model_y, ....)
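Once the combined model is trained, its output can be used as a gate that decides which model's prediction to keep. The following NumPy sketch illustrates that routing step only; the function name `route_predictions` and the toy score arrays are assumptions made for the example, and a positive gate value is taken to mean "use My_Model", matching the +1 label in the pseudocode above.

```python
import numpy as np


def route_predictions(gate_scores, m1_probs, m2_probs):
    """Pick, per image, the class predicted by the model the gate selects.

    gate_scores: (n,) gate outputs in [-1, 1]; > 0 means "use My_Model".
    m1_probs:    (n, 1000) class scores from Model_1.
    m2_probs:    (n, 20) class scores from My_Model.
    """
    use_my_model = gate_scores > 0
    m1_choice = m1_probs.argmax(axis=1)
    m2_choice = m2_probs.argmax(axis=1)
    chosen_class = np.where(use_my_model, m2_choice, m1_choice)
    chosen_model = np.where(use_my_model, "My_Model", "Model_1")
    return chosen_model, chosen_class


# Toy scores for two images: the gate picks My_Model for the first, Model_1 for the second.
gate_scores = np.array([0.8, -0.6])
m1_probs = np.zeros((2, 1000))
m1_probs[0, 5] = 0.3
m1_probs[1, 207] = 0.9
m2_probs = np.zeros((2, 20))
m2_probs[0, 3] = 0.95
m2_probs[1, 7] = 0.2

models, classes = route_predictions(gate_scores, m1_probs, m2_probs)
print(list(zip(models.tolist(), classes.tolist())))
```

Class indices from the two models live in different label spaces (1000 ImageNet classes versus 20 custom classes), so the chosen model's name has to be kept alongside the index to interpret it.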

Extract the output of cnn

I have trained a CNN model to classify images of dogs and cats.
It gives 98% accuracy.
But I want to visualize the output of the CNN layers, i.e. the features from which my CNN predicts whether an image is a dog or a cat.
Is there any way to visualize the output of the CNN?
You can divide your model into two models:
Previous Model:
input = Input(...)
# Your layers, ending in a tensor h
output = Dense(1)(h)
old_model = Model(inputs=[input], outputs=output)
New Model:
input = Input(...)
# Add the first layers and the CNN here
cnn_layer = Conv2D(...)(input)
feature_extraction_model = Model(inputs=[input], outputs=cnn_layer)
input_cnn = Input(...) # The shape of your CNN output
# Add the classification layers here
output = Dense(1)(Flatten()(input_cnn))
classifier_model = Model(inputs=[input_cnn], outputs=output)
Now you define the new model as a combination of feature_extraction_model and classifier_model:
new_model = Model(inputs=[input], outputs=classifier_model(cnn_layer))
# Train the model
new_model.fit(x, y)
Now you can access the CNN layer's output after training:
cnn_output = feature_extraction_model.predict(x)
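A small runnable version of this split may help. It is a sketch under assumptions rather than the asker's network: the input size, layer sizes, random data and the single training epoch are all invented for illustration, and it assumes TensorFlow 2.x with `tf.keras`.

```python
import numpy as np
import tensorflow as tf

# Feature extractor: input image -> CNN feature map.
inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(8, 3, activation="relu")(inputs)
cnn_out = tf.keras.layers.MaxPooling2D(2)(x)
feature_extraction_model = tf.keras.Model(inputs, cnn_out)

# Classifier: CNN feature map -> dog/cat probability.
clf_in = tf.keras.Input(shape=cnn_out.shape[1:])
y = tf.keras.layers.Flatten()(clf_in)
y = tf.keras.layers.Dense(1, activation="sigmoid")(y)
classifier_model = tf.keras.Model(clf_in, y)

# Full model: the two parts chained, sharing their weights.
new_model = tf.keras.Model(inputs, classifier_model(feature_extraction_model(inputs)))
new_model.compile(optimizer="adam", loss="binary_crossentropy")

# Random stand-in data: 4 images with binary labels.
x_data = np.random.rand(4, 32, 32, 3).astype("float32")
y_data = np.array([[0.0], [1.0], [0.0], [1.0]], dtype="float32")
new_model.fit(x_data, y_data, epochs=1, verbose=0)

# Training new_model also trained the extractor, so its output can be read directly.
cnn_output = feature_extraction_model.predict(x_data, verbose=0)
print(cnn_output.shape)
```

Each of the 8 channels in `cnn_output` can then be plotted as a grayscale image (for instance with matplotlib's `imshow`) to see which regions the filters respond to.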

Understanding Tensorflow LSTM models input?

I have some trouble understanding LSTM models in TensorFlow.
I use tflearn as a wrapper, as it does all the initialization and other higher-level work automatically. For simplicity, let's consider this example program. Up to line 42, net = tflearn.input_data([None, 200]), it's pretty clear what happens. You load a dataset into variables and pad it to a standard length (in this case, 200). Both the input variables and the 2 classes are, in this case, converted to one-hot vectors.
How does the LSTM take the input? Across how many samples does it predict the output?
What does net = tflearn.embedding(net, input_dim=20000, output_dim=128) represent?
My goal is to replicate the activity recognition dataset in the paper. For example, I would like to input a 4096 vector as input to the LSTM, and the idea is to take 16 of such vectors, and then produce the classification result. I think the code would look like this, but I don't know how the input to the LSTM should be given.
from __future__ import division, print_function, absolute_import
import tflearn
from tflearn.data_utils import to_categorical, pad_sequences
from tflearn.datasets import imdb
train, val = something.load_data()
trainX, trainY = train #each X sample is a (16,4096) nd float64
valX, valY = val #each Y is a one hot vector of 101 classes.
net = tflearn.input_data([None, 16,4096])
net = tflearn.embedding(net, input_dim=4096, output_dim=256)
net = tflearn.lstm(net, 256)
net = tflearn.dropout(net, 0.5)
net = tflearn.lstm(net, 256)
net = tflearn.dropout(net, 0.5)
net = tflearn.fully_connected(net, 101, activation='softmax')
net = tflearn.regression(net, optimizer='adam',
                         loss='categorical_crossentropy')
model = tflearn.DNN(net, clip_gradients=0., tensorboard_verbose=3)
model.fit(trainX, trainY, validation_set=(valX, valY), show_metric=True,
          batch_size=128, n_epoch=2, snapshot_epoch=True)
Basically, the LSTM takes the size of your vector for one cell:
lstm = rnn_cell.BasicLSTMCell(lstm_size, forget_bias=1.0)
Then, how many time steps do you want to feed? That depends on the input you feed: the number of arrays in X_split determines the number of time steps:
X_split = tf.split(0, time_step_size, X)
outputs, states = rnn.rnn(lstm, X_split, initial_state=init_state)
In your example, I guess lstm_size is 256, since that is the vector size of one word. time_step_size would be the maximum word count in your training/test sentences.
Please see this example: https://github.com/nlintz/TensorFlow-Tutorials/blob/master/07_lstm.py
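For the activity recognition case in the question, where each sample is already a (16, 4096) float array, the input to the LSTM has shape (batch, time_steps, n_features) and no embedding lookup is involved, since an embedding maps integer ids to vectors. The NumPy sketch below shows how such an input is consumed one time step at a time. It is a simplified illustration with random weights and a single layer, not tflearn's implementation; the function name, the gate ordering and the weight layout are assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_forward(X, W, U, b):
    """Run one LSTM layer over X and return the last hidden state.

    X: (batch, time_steps, n_features)
    W: (n_features, 4 * hidden) input weights
    U: (hidden, 4 * hidden) recurrent weights
    b: (4 * hidden,) bias
    """
    batch, time_steps, _ = X.shape
    hidden = U.shape[0]
    h = np.zeros((batch, hidden))  # hidden state
    c = np.zeros((batch, hidden))  # cell state
    for t in range(time_steps):
        # One feature vector per sample enters the cell at each time step.
        z = X[:, t, :] @ W + h @ U + b
        i, f, g, o = np.split(z, 4, axis=1)  # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h


rng = np.random.default_rng(0)
batch, time_steps, n_features, hidden = 2, 16, 4096, 256
X = rng.normal(size=(batch, time_steps, n_features))
W = rng.normal(scale=0.01, size=(n_features, 4 * hidden))
U = rng.normal(scale=0.01, size=(hidden, 4 * hidden))
b = np.zeros(4 * hidden)

h_last = lstm_forward(X, W, U, b)
print(h_last.shape)  # (2, 256)
```

The last hidden state has one 256-dim vector per sample, which is what a final fully connected softmax layer over the 101 classes would take as input.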
