I trained and saved a Bidirectional LSTM model in Keras successfully with:
model = Sequential()
model.add(Bidirectional(LSTM(N_HIDDEN_NEURONS,
                             return_sequences=True,
                             activation="tanh",
                             input_shape=(SEGMENT_TIME_SIZE, N_FEATURES))))
model.add(Bidirectional(LSTM(N_HIDDEN_NEURONS)))
model.add(Dropout(0.5))
model.add(Dense(N_CLASSES, activation='sigmoid'))
model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train,
          batch_size=BATCH_SIZE,
          epochs=N_EPOCHS,
          validation_data=(X_test, y_test))
model.save('model_keras/model.h5')
However, when I want to load it with:
model = load_model('model_keras/model.h5')
I get an error:
ValueError: You are trying to load a weight file containing 3 layers
into a model with 0 layers.
I also tried other approaches, such as saving and loading the model architecture and weights separately, but none of them worked for me. Previously, when I was using normal (unidirectional) LSTMs, loading the model worked fine.
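For reference, the separate architecture-and-weights route I tried looked roughly like this (the file paths here are just placeholders):
from keras.models import model_from_json

# save architecture and weights separately
with open('model_keras/model.json', 'w') as f:
    f.write(model.to_json())
model.save_weights('model_keras/weights.h5')

# load them back
with open('model_keras/model.json') as f:
    model = model_from_json(f.read())
model.load_weights('model_keras/weights.h5')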
As mentioned by @mpariente and @today, input_shape is an argument of Bidirectional, not LSTM; see the Keras documentation. My solution:
# Model
model = Sequential()
model.add(Bidirectional(LSTM(N_HIDDEN_NEURONS,
                             return_sequences=True,
                             activation="tanh"),
                        input_shape=(SEGMENT_TIME_SIZE, N_FEATURES)))
model.add(Bidirectional(LSTM(N_HIDDEN_NEURONS)))
model.add(Dropout(0.5))
model.add(Dense(N_CLASSES, activation='sigmoid'))
model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train,
          batch_size=BATCH_SIZE,
          epochs=N_EPOCHS,
          validation_data=(X_test, y_test))
model.save('model_keras/model.h5')
and then, to load, simply do:
model = load_model('model_keras/model.h5')
I am learning neural networks. I get 98% accuracy with classical ML methods on this data, so I think I have made a coding error: the neural network model is not learning.
Things I tried:
Changing X and y to float64 or float32
Normalizing data
Changing the activation to "linear" or "relu"
Removing Flatten()
Adding hidden layers
Using stochastic gradient descent as the optimizer instead of "adam"
Changing the y label with another label
There are 9 features in X_train and 8 different classes in y_train.
Code:
model = keras.models.Sequential()
model.add(keras.layers.Input(shape=(9,)))
model.add(keras.layers.Dense(8, activation='softmax'))
model.add(layers.Flatten())
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
I tried the following lines, also changing the target label. None of them helps the model train: some give "nan" loss, some fluctuate slightly, but all of them stay below 0.1% accuracy:
model = tf.keras.Sequential()
model.add(layers.Input(shape=(9,)))
model.add(layers.Dense(1, name='dense1'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=20, batch_size=24)
or this:
model = tf.keras.Sequential()
model.add(layers.Input(shape=(9,)))
model.add(layers.Dense(3, activation='relu', name='relu1'))
model.add(layers.Dense(16, activation='relu', name='relu2'))
model.add(layers.Dense(16, activation='relu', name='relu3'))
model.add(layers.Dense(1, name='dense1'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
history = model.fit(x=X_train, y=y_train, epochs=20)
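For an 8-class problem with integer labels, the output layer and the loss have to agree; a minimal sketch of a consistent setup (assuming y_train contains integer class ids 0-7, and the hidden layer size is an arbitrary choice):
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input(shape=(9,)),
    layers.Dense(32, activation='relu'),
    layers.Dense(8, activation='softmax'),  # one unit per class
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',  # expects integer labels
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=20, batch_size=24)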
I built a CNN model in Keras. I want to save this model and reuse it as a pre-trained model for transfer learning. I do not understand the proper way to save the model for reuse, nor how to load my pre-trained model in Keras so that I can add some layers after loading the previous model.
This is the model I built:
model_cnn = Sequential()  # initializing the Sequential CNN model
# The embedding layer takes in a maximum of 450 words as input and provides a
# 32-dimensional output for the words that belong to the top_words dictionary
model_cnn.add(Embedding(vocab_size, embedding_size, input_length=max_len))
model_cnn.add(Conv1D(32, 3, padding='same', activation='relu'))
model_cnn.add(Conv1D(64, 3, padding='same', activation='relu'))
model_cnn.add(MaxPooling1D())
model_cnn.add(Flatten())
model_cnn.add(Dense(250, activation='relu'))
model_cnn.add(Dense(2, activation='softmax'))
# optimizer = keras.optimizers.Adam(lr=0.001)
model_cnn.compile(
    loss='categorical_crossentropy',
    optimizer=sgd,
    metrics=['acc', Precision(), Recall()]
)
history_cnn = model_cnn.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    batch_size=64,
    epochs=epochs,
    verbose=1
)
The recommended way to save the model is the SavedModel format:
dir = "target_directory"
model_cnn.save(dir) # it will save a .pb file with assets and variables folders
Then you can load it:
model_cnn = tf.keras.models.load_model(save_dir)
Now you can add some layers and make another model. For example (the input shape must match what model_cnn expects; here, token sequences of length max_len rather than images):
inputs = tf.keras.Input(shape=(max_len,))
x = model_cnn(inputs)
x = tf.keras.layers.Dense(1, activation='sigmoid')(x)
new_model = tf.keras.models.Model(inputs=inputs, outputs=x)
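If the goal is transfer learning, you would typically also freeze the loaded base before compiling so that only the new head trains; a short sketch (the loss choice here is an assumption matching the sigmoid head above):
model_cnn.trainable = False  # keep the pre-trained weights fixed
new_model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])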
I am trying to implement a bidirectional LSTM for a sequence-to-sequence model. I have already one-hot encoded my sequences with 12 total features. The input is 11 steps, while the output is 23 steps. First, I coded this LSTM implementation, which works, with the first LSTM as the encoder and the second as the decoder.
model = Sequential()
model.add(LSTM(75, input_shape=(11, 12)))
model.add(RepeatVector(23))
model.add(LSTM(50, return_sequences=True))
model.add(TimeDistributed(Dense(12, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
X, y = generate_data(1, taskset, trainset)
model.fit(X, y, epochs=1, batch_size=32, verbose=1)
I then tried to turn this into a bidirectional LSTM as follows:
model = Sequential()
model.add(Bidirectional(LSTM(75, return_sequences=True), input_shape=(11,12), merge_mode='concat'))
model.add(Bidirectional(LSTM(50, return_sequences=True)))
model.add(TimeDistributed(Dense(12, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
X, y = generate_data(1, taskset, trainset)
model.fit(X, y, epochs=1, batch_size=32, verbose=1)
The goal is to use the first bidirectional LSTM as the encoder and the second bidirectional LSTM as the decoder. I removed the RepeatVector in the bidirectional implementation because it gave me a dimension error (needed dim=2, received dim=3). With the current bidirectional LSTM I am getting this error:
ValueError: Shapes (None, 23, 12) and (None, 11, 12) are incompatible
Any help with fixing the bidirectional LSTM implementation?
Simply setting return_sequences=False in your first bidirectional LSTM and adding RepeatVector(23) as before works fine: RepeatVector expects a 2D (batch, features) input, which return_sequences=False provides, and it then repeats the encoded vector 23 times for the decoder.
n_sample = 10
X = np.random.uniform(0,1, (n_sample, 11, 12))
y = np.random.randint(0,2, (n_sample, 23, 12))
model = Sequential()
model.add(Bidirectional(LSTM(75), input_shape=(11,12), merge_mode='concat'))
model.add(RepeatVector(23))
model.add(Bidirectional(LSTM(50, return_sequences=True)))
model.add(TimeDistributed(Dense(12, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=3, batch_size=32, verbose=1)
I have seen some examples of transfer learning where one can use pre-trained models from keras.applications (Xception, VGG16, VGG19, ResNet50, etc.), but what I want is to transfer the learning from the model I saved using model.save('model.h5').
This is my current model:
model = Sequential()
model.add(Embedding(max_words, embedding_dim, input_length=maxlen))
model.add(LSTM(32))
model.add(Dropout(0.6))
model.add(Dense(2, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
model.fit(sequences, labels, epochs=10, batch_size=32, validation_split=0.2)
Now, instead of writing
model_base = keras.applications.vgg16.VGG16(include_top=False, weights='imagenet')
I want to load the saved model, probably with load_model('model.h5'), and add it as a layer to my current model.
Try this:
model = Sequential()
model.add(Embedding(max_words, embedding_dim, input_length=maxlen))
model.add(LSTM(32))
model.add(Dropout(0.6))
model.add(Dense(2, activation='sigmoid'))
model.layers[0].set_weights([embedding_matrix])
model.layers[0].trainable = False
model.load_weights('model.h5')
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
model_base = model
Don't forget to remove your classifier.
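One way to do the "remove your classifier" step is to cut the loaded model just before its final Dense layer and attach a fresh head; a sketch, where N_NEW_CLASSES is a placeholder for your new label count:
from keras.models import Model, load_model
from keras.layers import Dense

base = load_model('model.h5')
# everything up to, but not including, the old Dense(2) classifier
features = Model(inputs=base.input, outputs=base.layers[-2].output)
features.trainable = False  # freeze the pre-trained layers

x = Dense(N_NEW_CLASSES, activation='softmax')(features.output)  # new head
new_model = Model(inputs=features.input, outputs=x)
new_model.compile(optimizer='rmsprop',
                  loss='categorical_crossentropy',
                  metrics=['acc'])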
Can we save any of the created LSTM models themselves? I believe that "pickling" is the standard method to serialize Python objects to a file. Ideally, I wanted to create a Python module that contained one or more functions that either allowed me to specify an LSTM model to load or used a hard-coded, pre-fit model to generate forecasts based on data passed in to initialize the model.
I tried to use it, but it gave me an error.
Code that I used:
# create and fit the LSTM network
batch_size = 1
model = Sequential()
model.add(LSTM(50, batch_input_shape=(batch_size, look_back, 1), stateful=True, return_sequences=True))
model.add(Dropout(0.3))
model.add(Activation('relu'))
model.add(LSTM(50, batch_input_shape=(batch_size, look_back, 1), stateful=True))
model.add(Dropout(0.3))
model.add(Activation('relu'))
model.add(Dense(1))
model.add(Activation('relu'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
for i in range(10):
    model.fit(trainX, trainY, epochs=1, batch_size=batch_size, verbose=2, shuffle=False)
    model.reset_states()

with open('sequential.pickle', 'wb') as f:
    pickle.dump(model, f)
pickle_in = open('sequential.pickle', 'rb')
model = pickle.load(pickle_in)
# make predictions
trainPredict = model.predict(trainX, batch_size=batch_size)
model.reset_states()
testPredict = model.predict(testX, batch_size=batch_size)
From the documentation:
It is not recommended to use pickle or cPickle to save a Keras model.
You can use model.save(filepath) to save a Keras model into a single HDF5 file which will contain:
the architecture of the model, allowing to re-create the model
the weights of the model
the training configuration (loss, optimizer)
the state of the optimizer, allowing to resume training exactly where you left off
You can then use keras.models.load_model(filepath) to reinstantiate your model.
To save your model, you'd need to call model.save:
model.save('model.h5')  # creates an HDF5 file 'model.h5'
Similarly, loading the model is done like this:
from keras.models import load_model
model = load_model('model.h5')
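Because the optimizer state is saved as well, you can pick up training exactly where you left off; a short sketch, reusing trainX, trainY, and the stateful batch_size from the snippet above:
model = load_model('model.h5')
# continues training with the restored optimizer state
model.fit(trainX, trainY, epochs=1, batch_size=batch_size, verbose=2, shuffle=False)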