Let suppose I have a keras model and a function train_model(data) to train it on some data.
I would like to know if it is possible to combine/merge identical architecture, identical hyperparam models that have been trained separately/independently?
python train_model( data1 ) ### one one epoch
python train_model( data2 ) ### one one epoch
...
then
load( model1 )
load( model2 )
model3 = combine( model1, model2 )
### model3 equivalent to 2 epochs of learning.
I try to understand/find a way to distribute the learning.
Have you already tried this?
from keras.models import load_model
# load models - it is just the architecture
model1 = load_model('path/to/trained/model1.h5')
model2 = load_model('path/to/trained/model2.h5')
# load trained weights
model1.load_weights('path/to/weights/from/model1.hdf5')
model2.load_weights('path/to/weights/from/model2.hdf5')
# create a model that will merge both 1 and 2
model = Sequential()
model.add(Merge([model1, model2], mode = 'concat'))
model.add(Dense(1)) # for regression, use you last Dense layer here
model.compile(#your compiling parameters)
# use your merged model
model.predict(dataset_to_be_predicted)
The idea is from here
Related
I have a '.pt' file that includes alexnet model which trained on my dataset. How I can get "out_features" of classifier layers (layers 1 & 4) after running my model for different dataset.
I needs this data for inputs of SVM.
I have tried:
Model(inputs, outputs=model.classifier[1].out_features)
model.classifier[1].out_features(inputs)
model.classifier[1].parameters(torch.tensor(inputs))
but they didn't work
at first you have to load model
model = torch.load(model.pt)
after that you have to remove the last layer
features = list(model.classifier.children())[:-4] # Remove last layer
model.classifier = nn.Sequential(*features)
then you can have the weights of last layer by applying model for inputs
out = model(inputs)
I have a question regarding the evaluation of an LSTM Model. I have trained an LSTM Model and stored it with model.save(...). Now I want load_model and evaluate it on the validation set datasets. Since neural networks are stochastic, I run it several times and compute the mean and the variance of the different metrics I am interested in.
Now I am shocked that after the first run all consecutive runs have the same performance on every metric. I don't think that is right, but I don't know where the error occurs.
So my question is:
what is my mistake in setting up the validation of my model?
and how can I fix that?
Here are the code snippets that should explain what I am doing:
Compile and fit the Model
def compile_and_fit( hparams,
MAX_EPOCHS,
model_path ):
window = WindowGenerator( input_width= hparams[HP_WINDOW_SIZE],
label_width=hparams[HP_WINDOW_SIZE], shift=1,
label_columns=['q_MARI'], batch_size = hparams[HP_BATCH_SIZE])
model = tf.keras.models.Sequential([
tf.keras.layers.LSTM(hparams[HP_NUM_UNITS], return_sequences=True, name="LSTM_1"),
tf.keras.layers.Dropout(hparams[HP_DROPOUT], name="Dropout_1"),
tf.keras.layers.LSTM(hparams[HP_NUM_UNITS], return_sequences=True, name="LSTM_2"),
tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1))
])
learning_rate = hparams[HP_LEARNING_RATE]
model.compile(loss=tf.losses.MeanSquaredError(),
optimizer=tf.optimizers.Adam(learning_rate=learning_rate),
metrics=get_metrics())
history = model.fit(window.train,
epochs=MAX_EPOCHS,
validation_data=window.val,
callbacks= get_callbacks(model_path))
_, a,_,_,_,_ = model.evaluate(window.val)
return a, model, history
Train and safe it
a, model, history = compile_and_fit( hparams = hparams, MAX_EPOCHS = MAX_EPOCHS, model_path = run_path)
model.save(run_path)
Load and evaluate it
model = tf.keras.models.load_model(os.path.join(hparam_path, model_name),
custom_objects={"max_error": max_error, "median_absolute_error": median_absolute_error, "rev_metric": rev_metric, "nse_metric": nse_metric})
model.compile(loss=tf.losses.MeanSquaredError(), optimizer="adam", metrics=get_metrics())
metric_values = np.empty(shape = (nr_runs, len(metrics)), dtype=float)
for j in range(nr_runs):
window = WindowGenerator(input_width= hparam_vals[i], label_width=hparam_vals[i], shift=1,
label_columns=['q_MARI'])
metric_values[j]= np.array(model.evaluate(window.val))
means = metric_values.mean(axis=0)
varis = metric_values.var(axis=0)
print(f'means: {means}, varis: {varis}')
The results I am getting
For setting up the Training I follow those two guides:
https://www.tensorflow.org/tutorials/structured_data/time_series
https://www.tensorflow.org/tensorboard/hyperparameter_tuning_with_hparams
LSTM is not stochastic. Evaluation results should be the same for the same data.
There are two steps, when you train the model, randomness will influence the model you trained. However, after that, you saved the model, the prediction result would be same if you use the same model.
I'm beginner in Python and ML. I was practising this Iris Data set to create a ML model using tensor flow 2.0.
I parsed the csv and trained the model using the dataset. I'm able to get 90 % training accuracy and 91 % validation accuracy during my model creation.
import tensorflow as tf
import numpy as np
from sklearn import preprocessing
csv_data = np.loadtxt('iris_training.csv',delimiter=',')
target_all = csv_data[:,-1]
csv_data = csv_data[:,0:-1]
# Shuffling the input
shuffled_indices = np.arange(csv_data.shape[0])
np.random.shuffle(shuffled_indices)
shuffled_inputs = csv_data[shuffled_indices]
shuffled_targets = target_all[shuffled_indices]
# Standardize the Inputs
shuffled_inputs = preprocessing.scale(shuffled_inputs)
# Split date into train , validation and test
total_count = shuffled_inputs.shape[0]
train_data_count = int(0.8*total_count)
validation_data_count = int(0.1*total_count)
test_data_count = total_count - train_data_count - validation_data_count
train_inputs = shuffled_inputs[:train_data_count]
train_targets = shuffled_targets[:train_data_count]
validation_inputs = shuffled_inputs[train_data_count:train_data_count+validation_data_count]
validation_targets = shuffled_targets[train_data_count:train_data_count+validation_data_count]
test_inputs = shuffled_inputs[train_data_count+validation_data_count:]
test_targets = shuffled_targets[train_data_count+validation_data_count:]
print(len(train_inputs))
print(len(validation_inputs))
print(len(test_inputs))
# Model Creation
input_size = 4
hidden_layer_size = 100
output_size = 3
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(hidden_layer_size, input_dim=input_size, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(hidden_layer_size, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(output_size, activation=tf.nn.softmax))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',metrics=['accuracy'])
model.fit(train_inputs,train_targets, epochs=10, validation_data=(validation_inputs, validation_targets), verbose=2)
prediction = model.predict(test_inputs)
Point me if there is something in my code that i could do to improve the accuracy of my model for this simple Iris Dataset.
File Used for training my Model : Iris Csv
As for your model, you can try to do hyperparameters tuning,
Setting the learning rate to a lower value
Increase the epoch
Add more training dataset since you have a small set of the dataset.
The neural network shines when there is a good amount of data for the training.
You can also add more layers to the model, add dropouts to avoid overfitting
as well as using different activation functions.
These are the common factors that affect model performance.
In my TensorFlow model I have some data that I feed into a stack of CNNs before it goes into a few fully connected layers. I have implemented that with Keras' Sequential model. However, I now have some data that should not go into the CNN and instead be fed directly into the first fully connected layer because that data contains some values and labels that are part of the input data but that data should not undergo convolutions as it is not image data.
Is such a thing possible with tensorflow.keras or should I do that with tensorflow.nn instead? As far as I understand Keras' sequential models is that the input goes in one end and comes out the other with no special wiring in the middle.
Am I correct that to do this I have to use tensorflow.concat on the data from the last CNN layer and the data that bypasses the CNNs before feeding it into the first fully connected layer?
Here is an simple example in which the operation is to sum the activations from different subnets:
import keras
import numpy as np
import tensorflow as tf
from keras.layers import Input, Dense, Activation
tf.reset_default_graph()
# this represents your cnn model
def nn_model(input_x):
feature_maker = Dense(10, activation='relu')(input_x)
feature_maker = Dense(20, activation='relu')(feature_maker)
feature_maker = Dense(1, activation='linear')(feature_maker)
return feature_maker
# a list of input layers, of course the input shapes can be different
input_layers = [Input(shape=(3, )) for _ in range(2)]
coupled_feature = [nn_model(input_x) for input_x in input_layers]
# assume you take the sum of the outputs
coupled_feature = keras.layers.Add()(coupled_feature)
prediction = Dense(1, activation='relu')(coupled_feature)
model = keras.models.Model(inputs=input_layers, outputs=prediction)
model.compile(loss='mse', optimizer='adam')
# example training set
x_1 = np.linspace(1, 90, 270).reshape(90, 3)
x_2 = np.linspace(1, 90, 270).reshape(90, 3)
y = np.random.rand(90)
inputs_x = [x_1, x_2]
model.fit(inputs_x, y, batch_size=32, epochs=10)
You can actually plot the model to gain more intuition
from keras.utils.vis_utils import plot_model
plot_model(model, show_shapes=True)
The model of the above code looks like this
With a little remodeling and the functional API you can:
#create the CNN - it can also be a sequential
cnn_input = Input(image_shape)
cnn_output = Conv2D(...)(cnn_input)
cnn_output = Conv2D(...)(cnn_output)
cnn_output = MaxPooling2D()(cnn_output)
....
cnn_model = Model(cnn_input, cnn_output)
#create the FC model - can also be a sequential
fc_input = Input(fc_input_shape)
fc_output = Dense(...)(fc_input)
fc_output = Dense(...)(fc_output)
fc_model = Model(fc_input, fc_output)
There is a lot of space for creativity, this is just one of the ways.
#create the full model
full_input = Input(image_shape)
full_output = cnn_model(full_input)
full_output = fc_model(full_output)
full_model = Model(full_input, full_output)
You can use any of the three models in any way you want. They share the layers and the weights, so internally they are the same.
Saving and loading the full model might be quirky. I'd probably save the other two separately and when loading create the full model again.
Notice also that if you save two models that share the same layers, after loading they will probably not share these layers anymore. (Another reason for saving/loading only fc_model and cnn_model, while creating full_model again from code)
I have two Pre-Trained models.
Model_1 = Inception Model with Imagenet Dataset (1000 classes)
My_Model = Inception Model trained with a custom dataset (20 classes) via Transfer Learning and Fine-Tuning
I would like to combine the outputs of both models (Model_1 and My_Model) in a new layer.
The new layer should use some binary classifier to tell whether to use Model_1 or My_Model for prediction based on the input image.
For Example:
If I try to predict a "Dog" image, the binary classifier which combines both models should say that I need to use Model_1 to predict the Dog image (since My_Model dataset is not trained with Dog image) whereas Model_1 is trained with Dog images.
Can anyone tell me how to achieve this? Some example implementation or code snippet will be helpful.
Thanks
To do this you need to make a combined model and then train the combined model on another custom dataset here is an example of what the combined model can look like. To make the dataset, simply take each image and decide which model you'd like to use and then you can train the output of the combined model to give a positive value for one model and a negative value for the other model. hope it helps
import numpy as np
import pandas as pd
import keras
from keras.layers import Dense, Flatten, Concatenate
from tensorflow.python.client import device_lib
# check for my gpu
print(device_lib.list_local_devices())
# making some models like the ones you have
input_shape = (10000, 3)
m1_input = Input(shape = input_shape, name = "m1_input")
fc = Flatten()(m1_input)
m1_output = Dense(1000, activation='sigmoid',name = "m1_output")(fc)
Model_1 = Model(m1_input,m1_output)
m2_input = Input(shape = input_shape, name = "m2_input")
fc = Flatten()(m2_input)
m2_output = Dense(20, activation='sigmoid',name = "m2_output")(fc)
My_Model = Model(m2_input,m2_output)
# set the trained models to be untrainable
for layer in Model_1.layers:
layer.trainable = False
for layer in My_Model.layers:
layer.trainable = False
#build a combined model
combined_model_input = Input(shape = input_shape, name = "combined_model_input")
m1_predict = Model_1(combined_model_input)
m2_predict = My_Model(combined_model_input)
combined = Concatenate()([m1_predict, m2_predict])
fc = Dense(500, activation='sigmoid',name = "fc1")(combined)
fc = Dense(100, activation='sigmoid',name = "fc2")(fc)
output_layer = Dense(1, activation='tanh',name = "fc3")(fc)
model = Model(combined_model_input, output_layer)
#check the number of parameters that are trainable
print(model.summary())
#psudocode to show how to make a training set for the combined model:
combined_model_y= []
for im in images:
if class_of(im) in list_of_my_model_classes:
combined_model_y.append(1)
else:
combined_model_y.append(-1)
combined_model_y = np.array(combined_model_y)
# then train the combined model:
model.compile('adam', loss = 'binary_crossentropy')
model.fit(images, combined_model_y, ....)