I am trying to fine tune Universal Sentence Encoder and use the new encoder layer for something else.
import tensorflow as tf
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Dense, Dropout
import tensorflow_hub as hub
module_url = "universal-sentence-encoder"
model = Sequential([
hub.KerasLayer(module_url, input_shape=[], dtype=tf.string, trainable=True, name="use"),
Dropout(0.5, name="dropout"),
Dense(256, activation="relu", name="dense"),
Dense(len(y), activation="sigmoid", name="activation")
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, batch_size=256, epochs=30, validation_split=0.25)
This worked. Loss went down and accuracy was decent. Now I want to extract just Universal Sentence Encoder layer. However, here is what I get.
Do you know how I can fix this nan issue? I expected to see encoding of numeric values.
Is it only possible to save tuned_use layer as a model as this post recommends? Ideally, I want to save tuned_use layer just like Universal Sentence Encoder so that I can open and use it exactly the same as hub.KerasLayer(tuned_use_location, input_shape=[], dtype=tf.string).
Hoping this will help someone, I ended up solving this by using universal-sentence-encoder-4 instead of universal-sentence-encoder-large-5. I spent quite a lot of time troubleshooting but it was tough as there was no issue with the input data and the model was trained successfully. This might be due to gradient exploding issue but could not add gradient clipping or Leaky ReLU into the original architecture.
Related
I have a multiclass(108 classes) classification model which I want to apply transfer learning to the classification layer. I want to deploy this model in a low computing resource device (Raspberry Pi) and I thought to implement the classification layer in pure numpy instead of using Keras or TF.
Below is my original model.
from tensorflow.keras.models import Sequential, Model, LSTM, Embedding
model = Sequential()
model.add(Embedding(108, 50, input_length=10))
model.add((LSTM(32, return_sequences=False)))
model.add(Dense(108, activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer=Adam(lr=0.001), metrics=['accuracy'])
model.summary()
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=3)
history = model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.5, callbacks=[es]).history
I split this model into two parts, encoder and decoder as follows. decoder is the classification layer which I want to convert into NumPy model and then do the on-device transfer learning later.
encoder = Sequential([
Embedding(108, 50, input_length=10),
GRU(32, return_sequences=False)
])
decoder = Sequential([
Dense(108, activation="softmax")
])
model = Model(inputs=encoder.input, outputs=decoder(encoder.output))
model.compile(loss="categorical_crossentropy", optimizer=Adam(lr=0.001), metrics=['accuracy'])
model.summary()
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=3)
history = model.fit(X_train, y_train, epochs=50, batch_size=32, validation_split=0.5, callbacks=[es]).history
I have a few questions related to this approach.
Only way i know to train this model is, first to train the encoder and decoder. then
train NumPy classification layer using trained encoder outputs.
Is there any way i can train the NumPy model at the same time when i train the encoder (without using the above Keras decoder part and Model)? I can't use Model as I can't use Keras or TF in raspberry Pi during the transfer learning.
If there is no any way to train encoder and Numpy model at the same time,
How to use learned decoder weights as the starting weights of the Numpy Model instead of starting from random weights?
What is the most efficient code (or way) to implement the Numpy classification layer (decoder)? It requires a highly efficient model as i do the transfer learning on Raspberry Pi for incoming streaming data.
Once i trained the model for reasonable data, i plan to convert the encoder into TFLite and do the inference
Highly appreciate any help or guidance to achieve this as I'm new to NumPy-based NN implementations.
Thanks in advance
I am experimenting with training NLTK classifier model with tensorflow and keras, would anyone know if this could be recreated with sklearn neural work MLP classifier? For what I am using ML for I don't think I need tensorflow but something simplier and easier to install/deploy.
Not a lot of wisdom on machine learning wisdom here, any tips greatly appreciated even describing this deep learning tensorflow keras model greatly appreciated.
So my tf keras model architecture looks like this:
training = []
random.shuffle(training)
training = np.array(training)
# create train and test lists. X - patterns, Y - intents
train_x = list(training[:,0])
train_y = list(training[:,1])
# Create model - 3 layers. First layer 128 neurons, second layer 64 neurons and 3rd output layer contains number of neurons
# equal to number of intents to predict output intent with softmax
model = Sequential()
model.add(Dense(128, input_shape=(len(train_x[0]),), activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(len(train_y[0]), activation='softmax'))
# Compile model. Stochastic gradient descent with Nesterov accelerated gradient gives good results for this model
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
# Fit the model
model.fit(np.array(train_x), np.array(train_y), epochs=200, batch_size=5, verbose=1)
SO the sklearn neural network, am I on track at all with this below? Can someone help me I understand what exactly the tensorflow model architecture is and what cannot be duplicated with sklearn. I sort of understand tensorflow is probabaly much more powerful that sklearn that is something more simple.
#Importing MLPClassifier
from sklearn.neural_network import MLPClassifier
model = MLPClassifier(hidden_layer_sizes=(128,64),activation ='relu',solver='sgd',random_state=1)
Just google converting a keras model to pytorch, there is quite a bit of tutorials out there for that... It doesnt look easy but maybe worth the effort for what ever you need it for...
Going down this road just using sklearn MLP neural network I can get good enough results with sklearn... without the hassle of getting tensorflow installed properly.
Also on a cloud linux instance tensorflow requires a LOT more memory and storage than a FREE account can do on pythonanywhere.com but free account seems just fine with sklearn
When experimenting with sklearn MLP NN for whatever reason better results just leaving architecture as default and playing around with the learning rate.
from sklearn.neural_network import MLPClassifier
model = MLPClassifier(learning_rate_init=0.0001,max_iter=9000,shuffle=True).fit(train_x, train_y)
I am working on a multi-target regression problem in Keras and I would like the predicted values in the last layer to be sorted. I am currently implementing something like this:
# Lambda layer
import tensorflow as tf
def sort_layer(tensor):
return tf.sort(tensor)
# Training model on train set
from keras.models import Sequential
from keras.layers import Dense,Lambda
model = Sequential()
model.add(Dense(100,input_dim=X_train.shape[1],activation="relu"))
model.add(Dense(150,activation="relu"))
model.add(Dense(50,activation="relu"))
model.add(Dense(y_train.shape[1],activation="linear"))
model.add(Lambda(sort_layer))
model.compile(loss="mse", optimizer="adam")
model.fit(X_train,y_train, epochs=100,batch_size=10, verbose=0)
This doesn't seem to be working every time as some predictions don't come out sorted. Can anyone explain what I am doing wrong and suggest a good fix?
Thank you!
I am trying to build a regression model that predicts the 'Ratings' for movies using the dataset https://www.kaggle.com/shubhammehta21/movie-lens-small-latest-dataset. However after training the model, predictions outputs the same value for all test features. I have read previous similar features that suggested adjusting learning rates, no. of features and checking that the model predicting is the same as the trained model. None of these has worked for me.
I load the data and process it:
links= pd.read_csv('../input/movie-lens-small-latest-dataset/links.csv')
movies=pd.read_csv('../input/movie-lens-small-latest-dataset/movies.csv')
...
dataset=movies.merge(ratings,on='movieId').merge(tags,on='movieId').merge(links,on='movieId')
to_drop='title','genres','timestamp_x','timestamp_y','userId_y','imdbId','tmdbId']
dataset.drop(columns=to_drop,inplace=True)
dataset=pd.get_dummies(dataset)
The code shows how I build the regression model. I have tried adjusting the number of neuron and layers, however, that has not influenced the output.
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import Adam
model = Sequential()
model.add(Dense(13, input_dim=1586, kernel_initializer='zero', activation='relu'))
model.add(Dense(6, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal',activation='linear'))
# Compile model
adam = Adam(lr=0.001)
model.compile(loss='mean_squared_error', optimizer=adam,metrics=['mse','mae'])
model.summary()
history = model.fit(train_dataset,train_labels,batch_size=30, epochs=10,verbose=1, validation_split=0.3)
score = model.evaluate(validation_dataset,validation_labels)
print("Test score:", score)
Whenever I try to predict the test dataset:
model.predict(test_dataset)
It predicts the value of
3.97
on all values. I am expecting a range of values between 0 - 5.
You should never (I mean, never) use kernel_initializer='zero' - to be honest, I am surprised that the option even exists in Keras!
Also, kernel_initializer='normal' is not recommended.
As a first step, remove all kernel_initializer arguments, so as to revert to the default and recommended one, kernel_initializer='glorot-uniform'; keep in mind that defaults are there for a reason (usually they work well), and you should change them only if you really have a reason to do so (which I trust you don't have here) and you know what you are doing.
If you still don't get what you would expect, experiment with other parameters (no. of layers/neurons, more epochs etc); you should leave the learning rate (lr) of Adam optimizer as is for starters (it's also one of these default values that seem to work nicely across cases).
I am learning how to create sequential models. I have a model:
*model = Sequential()*
I then went on to add pooling layers and convolution layers (which were fine). But when creating the dense layer:
*model.add(Dense(num_classes, activation = 'softmax'))*
the line returned:
*tf.nn.softmax(x, axis=axis)*
which caused an error as axis was not defined. Both Keras and TensorFlow documentation show that the default axis for softmax is None or -1.
Is this a bug with keras? And is there an easy workaround (If I were to set the axis I am not sure what the input tensor would be)?
-I can add the rest of the code if necessary, but it simply consists of the other layers and I don't think it will help much.
I believe your Keras and/or TensorFlow is not up to date, you should update it/them.
This was a known problem in Keras in the Summer of 2017 and was fixed in this commit. See more on this comment on the bug report.
Also axis was introduced as an argument on November 22, 2017 in TensorFlow's softmax() so if the TensorFlow version is 1.4.0 or less, that would also cause this error.
Which one exactly causes the error depends on the rank of the tensor processed if you review the source of Keras at the linked commit.
This code is fine with the current versions (tested on https://colab.research.google.com):
import keras
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.optimizers import SGD
print( keras.__version__ )
model = Sequential()
model.add( Dense(6, input_shape=(6,), activation = 'softmax' ) )
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
optimizer=sgd,
metrics=['accuracy'])
outputs
2.1.6
but more importantly, compiles the model with no error.