I have a multiclass (108 classes) classification model, and I want to apply transfer learning to its classification layer. I want to deploy this model on a low-compute device (Raspberry Pi), so I thought I would implement the classification layer in pure NumPy instead of using Keras or TF.
Below is my original model.
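from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Embedding, LSTM, GRU, Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping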
model = Sequential()
model.add(Embedding(108, 50, input_length=10))
model.add(LSTM(32, return_sequences=False))
model.add(Dense(108, activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer=Adam(learning_rate=0.001), metrics=['accuracy'])
model.summary()
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=3)
history = model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.5, callbacks=[es]).history
I split this model into two parts, an encoder and a decoder, as follows. The decoder is the classification layer, which I want to convert into a NumPy model so that I can do on-device transfer learning later.
encoder = Sequential([
Embedding(108, 50, input_length=10),
GRU(32, return_sequences=False)
])
decoder = Sequential([
Dense(108, activation="softmax")
])
model = Model(inputs=encoder.input, outputs=decoder(encoder.output))
model.compile(loss="categorical_crossentropy", optimizer=Adam(learning_rate=0.001), metrics=['accuracy'])
model.summary()
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=3)
history = model.fit(X_train, y_train, epochs=50, batch_size=32, validation_split=0.5, callbacks=[es]).history
I have a few questions related to this approach.
The only way I know to train this model is to first train the encoder and decoder together, and then train the NumPy classification layer on the trained encoder's outputs.
Is there any way I can train the NumPy model at the same time as the encoder (without using the Keras decoder and Model above)? I can't use Model because I can't use Keras or TF on the Raspberry Pi during transfer learning.
If there is no way to train the encoder and the NumPy model at the same time, how can I use the learned decoder weights as the starting weights of the NumPy model instead of starting from random weights?
What is the most efficient code (or approach) for implementing the NumPy classification layer (decoder)? It needs to be highly efficient because I will do the transfer learning on the Raspberry Pi for incoming streaming data.
Once I have trained the model on a reasonable amount of data, I plan to convert the encoder to TFLite and use it for inference.
I would highly appreciate any help or guidance on achieving this, as I'm new to NumPy-based NN implementations.
Thanks in advance
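For the questions about starting from the learned decoder weights and keeping the on-device layer cheap, here is a minimal sketch of a Dense + softmax layer in plain NumPy, trained with mini-batch SGD. It is only one possible approach, not a definitive implementation. The class name NumpySoftmaxLayer, the file name decoder_weights.npz, and the learning rate are assumptions for illustration; the random arrays stand in for the trained Keras weights and the encoder outputs.

```python
import numpy as np

class NumpySoftmaxLayer:
    """Dense + softmax classification layer with plain SGD, NumPy only."""

    def __init__(self, W, b, lr=0.01):
        # W: (n_features, n_classes), b: (n_classes,), the same layout
        # that a Keras Dense layer returns from get_weights().
        self.W = np.asarray(W, dtype=np.float32).copy()
        self.b = np.asarray(b, dtype=np.float32).copy()
        self.lr = lr

    def predict_proba(self, X):
        logits = X @ self.W + self.b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        e = np.exp(logits)
        return e / e.sum(axis=1, keepdims=True)

    def train_step(self, X, y):
        """One SGD step; y holds integer class ids. Returns the loss before the update."""
        n = X.shape[0]
        p = self.predict_proba(X)
        loss = -np.log(p[np.arange(n), y] + 1e-12).mean()
        # Gradient of softmax + cross-entropy w.r.t. the logits is (p - onehot) / n.
        p[np.arange(n), y] -= 1.0
        p /= n
        self.W -= self.lr * (X.T @ p)
        self.b -= self.lr * p.sum(axis=0)
        return float(loss)

# On the training machine (with Keras available) the weights could be exported as:
#   W, b = decoder.layers[0].get_weights()
#   np.savez("decoder_weights.npz", W=W, b=b)   # hypothetical file name
# and loaded on the Pi with np.load("decoder_weights.npz").
# Random stand-ins are used here so the sketch runs without Keras.
rng = np.random.default_rng(0)
W0 = rng.normal(scale=0.1, size=(32, 108)).astype(np.float32)
b0 = np.zeros(108, dtype=np.float32)

layer = NumpySoftmaxLayer(W0, b0, lr=0.1)

# Stand-in for a batch of encoder outputs (32 features) and their labels.
X = rng.normal(size=(64, 32)).astype(np.float32)
y = rng.integers(0, 108, size=64)

losses = [layer.train_step(X, y) for _ in range(20)]
probs = layer.predict_proba(X)
```

Because the Keras decoder is mathematically the same single layer, training it jointly with the encoder and then copying its weights gives the NumPy layer a trained starting point; on the device, train_step can then be called on each incoming batch of encoder outputs.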
Related
I am experimenting with training an NLTK classifier model with TensorFlow and Keras. Would anyone know whether this could be recreated with the sklearn neural network MLPClassifier? For what I am using ML for, I don't think I need TensorFlow, but rather something simpler and easier to install and deploy.
I don't have a lot of machine learning knowledge, so any tips are greatly appreciated, even just a description of this TensorFlow Keras model.
So my tf keras model architecture looks like this:
training = []
random.shuffle(training)
training = np.array(training)
# create train and test lists. X - patterns, Y - intents
train_x = list(training[:,0])
train_y = list(training[:,1])
# Create model - 3 layers. First layer 128 neurons, second layer 64 neurons and 3rd output layer contains number of neurons
# equal to number of intents to predict output intent with softmax
model = Sequential()
model.add(Dense(128, input_shape=(len(train_x[0]),), activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(len(train_y[0]), activation='softmax'))
# Compile model. Stochastic gradient descent with Nesterov accelerated gradient gives good results for this model
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
# Fit the model
model.fit(np.array(train_x), np.array(train_y), epochs=200, batch_size=5, verbose=1)
So, for the sklearn neural network, am I on the right track with the code below? Can someone help me understand what exactly the TensorFlow model architecture is and what cannot be duplicated with sklearn? I sort of understand that TensorFlow is probably much more powerful than sklearn, which is something simpler.
#Importing MLPClassifier
from sklearn.neural_network import MLPClassifier
model = MLPClassifier(hidden_layer_sizes=(128,64),activation ='relu',solver='sgd',random_state=1)
Just google converting a Keras model to PyTorch; there are quite a few tutorials out there for that. It doesn't look easy, but it may be worth the effort for whatever you need it for.
Going down this road with just the sklearn MLP neural network, I can get good enough results, without the hassle of getting TensorFlow installed properly.
Also, on a cloud Linux instance, TensorFlow requires a lot more memory and storage than a free account on pythonanywhere.com provides, but a free account seems just fine with sklearn.
When experimenting with the sklearn MLP, for whatever reason I got better results by leaving the architecture at its default and playing around with the learning rate.
from sklearn.neural_network import MLPClassifier
model = MLPClassifier(learning_rate_init=0.0001,max_iter=9000,shuffle=True).fit(train_x, train_y)
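As a rough sketch of how the Keras settings above could map onto MLPClassifier: the hidden layer sizes, ReLU activation, SGD with Nesterov momentum, initial learning rate, batch size and epoch count all have direct parameters. Dropout has no equivalent in MLPClassifier; the alpha value below is an L2 penalty used as a stand-in regularizer and is an assumption, not a match. MLPClassifier also expects class labels rather than one-hot rows for single-label problems, hence the argmax. The synthetic data is purely illustrative.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the bag-of-words features and one-hot intents.
rng = np.random.default_rng(0)
n_samples, n_features, n_intents = 120, 20, 3
labels = rng.integers(0, n_intents, size=n_samples)
train_x = rng.normal(size=(n_samples, n_features))
train_x[np.arange(n_samples), labels] += 3.0   # make the classes separable
train_y = np.eye(n_intents)[labels]            # one-hot, like the Keras targets

clf = MLPClassifier(
    hidden_layer_sizes=(128, 64),  # Dense(128) and Dense(64)
    activation="relu",
    solver="sgd",
    learning_rate_init=0.01,       # SGD learning rate 0.01
    momentum=0.9,
    nesterovs_momentum=True,       # nesterov=True
    batch_size=5,
    max_iter=200,                  # upper bound on epochs for the sgd solver
    alpha=1e-4,                    # L2 penalty; NOT equivalent to Dropout(0.5)
    random_state=1,
)

# Convert one-hot rows to class ids before fitting.
y_ids = np.argmax(train_y, axis=1)
clf.fit(train_x, y_ids)

train_acc = clf.score(train_x, y_ids)
proba = clf.predict_proba(train_x[:1])  # softmax-style probabilities per intent
```

The output layer and loss are chosen by MLPClassifier itself (softmax with log-loss for multiclass labels), so they do not need to be specified.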
I have been trying to save the weights of my neural network model so that I could use a few of its layers for another neural network model to be trained on another dataset.
pre-trained model:
model = Sequential()
model.add(tf.keras.layers.Dense(100, input_shape=(X_train_orig_sm.shape)))
model.add(tf.keras.layers.Activation('relu'))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(10))
model.add(tf.keras.layers.Activation('relu'))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(1))
model.add(tf.keras.layers.Activation('sigmoid'))
model.summary()
# need sparse otherwise shape is wrong. check why
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print('Fitting the data to the model')
batch_size = 20
epochs = 10
history = model.fit(X_train_orig_sm, Y_train_orig_sm, batch_size=batch_size, epochs=epochs, verbose=1, validation_split=0.2)
print('Evaluating the test data on the model')
How I saved the weights of neural network:
model.save_weights("dnn_model.h5")
How I try to load the weights of neural network:
dnn_model=model.load_weights("dnn_model.h5")
dnn_model.layers[5]
While trying to load the model, I get the following error:
AttributeError: 'NoneType' object has no attribute 'layers'
I don't understand why the layers of the neural network are not recognised, even though the network was trained before the weights were saved. Any advice, solution or direction will be highly appreciated. Thank you.
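When you call model.save_weights("dnn_model.h5"), you only save the "weights" of the model. You do not save the actual structure of the model. In addition, model.load_weights("dnn_model.h5") restores the weights into the existing model in place and returns None here, so dnn_model is None. That's why you cannot access the layers etc.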
To save the actual model, you can call the below.
# save
model.save('dnn_model') # save as pb
model.save('dnn_model.h5') # save as HDF5
# load
dnn_model = tf.keras.models.load_model('dnn_model') # load as pb
dnn_model = tf.keras.models.load_model('dnn_model.h5') # load as HDF5
Note: You do not need to add an extension to the name to save as pb.
Source: https://www.tensorflow.org/tutorials/keras/save_and_load
I am trying to fine tune Universal Sentence Encoder and use the new encoder layer for something else.
import tensorflow as tf
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Dense, Dropout
import tensorflow_hub as hub
module_url = "universal-sentence-encoder"
model = Sequential([
hub.KerasLayer(module_url, input_shape=[], dtype=tf.string, trainable=True, name="use"),
Dropout(0.5, name="dropout"),
Dense(256, activation="relu", name="dense"),
Dense(len(y), activation="sigmoid", name="activation")
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, batch_size=256, epochs=30, validation_split=0.25)
This worked. Loss went down and accuracy was decent. Now I want to extract just Universal Sentence Encoder layer. However, here is what I get.
Do you know how I can fix this nan issue? I expected to see encoding of numeric values.
Is it only possible to save tuned_use layer as a model as this post recommends? Ideally, I want to save tuned_use layer just like Universal Sentence Encoder so that I can open and use it exactly the same as hub.KerasLayer(tuned_use_location, input_shape=[], dtype=tf.string).
Hoping this will help someone: I ended up solving this by using universal-sentence-encoder-4 instead of universal-sentence-encoder-large-5. I spent quite a lot of time troubleshooting, but it was tough because there was no issue with the input data and the model trained successfully. This might be due to an exploding-gradient issue, but I could not add gradient clipping or Leaky ReLU to the original architecture.
I am writing a program for classifying images into two categories: "Wires" and "non-Wires". I have hand-labeled around 5000 microscope images, examples:
non-wire
wire
The neural network I am using is adapted from the chapter on convolutional networks in "Deep Learning with Python" (I don't think convolutional networks are necessary here because there are no obvious hierarchies; Dense networks should be more suitable):
model = models.Sequential()
model.add(layers.Dense(32, activation='relu',input_shape=(200,200,3)))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Dense(32, activation='relu'))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(2, activation='softmax'))
However, test accuracy after 10 epochs of training does not go over 92% when playing around with the parameters of the network. The training images are about 1/3 wires and 2/3 non-wires. My question: do you see any obvious mistakes in this neural network design that limit accuracy, or do you think I am limited by the image quality? I have about 4000 train and 1000 test images.
You might get some improvement by trying to handle the class imbalance using a weights dictionary. If the label of non wire is 0 and the label for wire is 1 then the weight dictionary would be
weight_dict= { 0:.5, 1:1}
in model.fit set
class_weight=weight_dict .
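As a small sketch of where such weights can come from, the helper below computes "balanced" weights as n_samples / (n_classes * class_count), the same heuristic scikit-learn uses for class_weight='balanced'. The function name balanced_class_weights and the label counts (roughly 2/3 non-wire, 1/3 wire out of 4000 training images) are assumptions for illustration.

```python
import numpy as np

def balanced_class_weights(labels):
    """Return {class_id: weight} with weight = n_samples / (n_classes * count)."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    weights = labels.size / (classes.size * counts)
    return {int(c): float(w) for c, w in zip(classes, weights)}

# Illustrative labels: 0 = non-wire (about 2/3), 1 = wire (about 1/3).
labels = np.array([0] * 2667 + [1] * 1333)
weight_dict = balanced_class_weights(labels)
# weight_dict could then be passed to model.fit(..., class_weight=weight_dict)
```

The resulting weights are about 0.75 and 1.5, the same 1:2 ratio as the dictionary above; what matters most for the imbalance is that ratio, while the overall scale also rescales the loss.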
Without seeing the results of training (training loss and validation loss) I can't tell what else to do. If you are overfitting, try adding some dropout layers. I also recommend you try an adjustable learning rate using the Keras callback ReduceLROnPlateau, and early stopping using the Keras callback EarlyStopping. Documentation is here. Set each callback to monitor validation loss. My suggested code is shown below:
reduce_lr=tf.keras.callbacks.ReduceLROnPlateau(
monitor="val_loss",factor=0.5, patience=2, verbose=1)
e_stop=tf.keras.callbacks.EarlyStopping( monitor="val_loss", patience=5,
verbose=0, restore_best_weights=True)
callbacks=[reduce_lr, e_stop]
In model.fit include
callbacks=callbacks
If you want to give a convolutional network a try, I recommend transfer learning using the MobileNet model. Documentation for that is here. My recommended code for that is below:
base_model=tf.keras.applications.mobilenet.MobileNet( include_top=False,
input_shape=(200,200,3), pooling='max', weights='imagenet',dropout=.4)
x=base_model.output
x=keras.layers.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001 )(x)
x = Dense(1024, activation='relu')(x)
x=Dropout(rate=.3, seed=123)(x)
output=Dense(2, activation='softmax')(x)
model=Model(inputs=base_model.input, outputs=output)
model.compile(Adamax(learning_rate=.001), loss='categorical_crossentropy', metrics=['accuracy'])
In model.fit include the callbacks as shown above.
I have two GPUs installed on two different machines. I want to build a cluster that lets me train a Keras model using the two GPUs together.
The Keras blog shows two snippets of code in its Distributed training section and links to the official TensorFlow documentation.
My problem is that I don't know how to train my model this way and put into practice what is described in the TensorFlow documentation.
For example, what should I do if I want to execute the following code on a cluster of multiple GPUs?
# For a single-input model with 2 classes (binary classification):
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
# Generate dummy data
import numpy as np
data = np.random.random((1000, 100))
labels = np.random.randint(2, size=(1000, 1))
# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)
In the first and second part of the blog he explains how to use keras models with tensorflow.
Also I found this example of keras with distributed training.
And here is another with horovod.
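As a hedged sketch of one way to set this up with TensorFlow's own tooling: tf.distribute.MultiWorkerMirroredStrategy reads the cluster layout from a TF_CONFIG environment variable that must be set on each machine before the strategy is created. The snippet below only builds that variable with the standard library; the IP addresses, port, and the helper name make_tf_config are hypothetical, and the TensorFlow part is left as comments because it needs both machines running.

```python
import json
import os

def make_tf_config(worker_addresses, task_index):
    """Build the TF_CONFIG JSON string for one worker in the cluster."""
    return json.dumps({
        "cluster": {"worker": list(worker_addresses)},
        "task": {"type": "worker", "index": task_index},
    })

# Hypothetical addresses of the two machines (host:port).
workers = ["192.168.0.10:12345", "192.168.0.11:12345"]

# On the first machine use task_index=0, on the second task_index=1.
os.environ["TF_CONFIG"] = make_tf_config(workers, task_index=0)

# With TensorFlow installed, the same training script is then started on
# every machine, with the model built and compiled inside the strategy scope:
#
#   import tensorflow as tf
#   strategy = tf.distribute.MultiWorkerMirroredStrategy()
#   with strategy.scope():
#       model = Sequential()
#       model.add(Dense(32, activation='relu', input_dim=100))
#       model.add(Dense(1, activation='sigmoid'))
#       model.compile(optimizer='rmsprop', loss='binary_crossentropy',
#                     metrics=['accuracy'])
#   model.fit(data, labels, epochs=10, batch_size=32)
```

Each worker waits for the others to start, so the script has to be launched on both machines with matching worker lists and different task indices.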