Learn a new set of data from an existing model to enhance it (TensorFlow - Keras - callbacks) - Python

I train a model on a dataset and everything works fine. Sometimes changes occur and I get some new data. I'd like to "continue" the training from my existing model with the new set of data, without starting from scratch again.
Here is a simple example to show the problem and where I'm stuck.
import tensorflow as tf

mnist = tf.keras.datasets.mnist

# get the data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# split in 2 parts to get 2 sets of data
x_train1 = x_train[:5000]
x_train2 = x_train[5000:10000]
y_test1 = y_test[:5000]
y_test2 = y_test[5000:]

# set the model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)])

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Compile the model
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

# set the callback
checkpoint_path = "CHECKPOINTS/cp.ckpt"
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                 monitor='accuracy',
                                                 mode="max",
                                                 save_best_only=False,
                                                 save_weights_only=False,
                                                 save_freq="epoch",
                                                 verbose=0)
Then I fit the first set of data:
# fit the 1st dataset
model.fit(x_train1, y_test1, epochs=5, callbacks=[cp_callback], verbose=1)
Output:
Epoch 1/5
157/157 [==============================] - 0s 897us/step - loss: 2.3518 - accuracy: 0.1075
Epoch 2/5
157/157 [==============================] - 0s 752us/step - loss: 2.2833 - accuracy: 0.1438
Epoch 3/5
157/157 [==============================] - 0s 731us/step - loss: 2.2656 - accuracy: 0.1564
Epoch 4/5
157/157 [==============================] - 0s 755us/step - loss: 2.2388 - accuracy: 0.1719
Epoch 5/5
157/157 [==============================] - 0s 759us/step - loss: 2.2117 - accuracy: 0.1901
Then the 2nd one:
# fit the 2nd dataset
model.fit(x_train2, y_test2, epochs=5, callbacks=[cp_callback], verbose=1)
Output:
Epoch 1/5
157/157 [==============================] - 0s 943us/step - loss: 2.3240 - accuracy: 0.0964
Epoch 2/5
157/157 [==============================] - 0s 778us/step - loss: 2.2881 - accuracy: 0.1238
Epoch 3/5
157/157 [==============================] - 0s 805us/step - loss: 2.2688 - accuracy: 0.1514
Epoch 4/5
157/157 [==============================] - 0s 814us/step - loss: 2.2498 - accuracy: 0.1496
Epoch 5/5
157/157 [==============================] - 0s 1ms/step - loss: 2.2289 - accuracy: 0.1704
As you can see, the training begins from scratch again, without keeping the training done before.
How can I do that?
Thanks in advance.
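For reference, a minimal sketch of one common way to resume in a later session, assuming the checkpoint written by the callback above: reload the saved model before fitting the new data. (Within a single Python session, calling model.fit again already continues from the current weights, so reloading mainly matters after a restart.)
# Minimal sketch: restore the model saved by ModelCheckpoint, then keep training.
# With save_weights_only=False the callback stores a full model, so load_model
# brings back the architecture, the weights and the optimizer state.
model = tf.keras.models.load_model(checkpoint_path)
model.fit(x_train2, y_test2, epochs=5, callbacks=[cp_callback], verbose=1)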

Related

Keras Transformer - Test Loss Not Changing

I'm trying to create a small transformer model with Keras to model stock prices, based on this tutorial from the Keras docs. The problem is, my test loss is massive and barely changes between epochs, unsurprisingly resulting in severe underfitting, with my outputs all being the same arbitrary value.
My code is below:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def transformer_encoder_block(inputs, head_size, num_heads, filters, dropout=0):
    # Normalization and Attention
    x = layers.LayerNormalization(epsilon=1e-6)(inputs)
    x = layers.MultiHeadAttention(
        key_dim=head_size, num_heads=num_heads, dropout=dropout
    )(x, x)
    x = layers.Dropout(dropout)(x)
    res = x + inputs
    # Feed Forward Part
    x = layers.LayerNormalization(epsilon=1e-6)(res)
    x = layers.Conv1D(filters=filters, kernel_size=1, activation="relu")(x)
    x = layers.Dropout(dropout)(x)
    x = layers.Conv1D(filters=inputs.shape[-1], kernel_size=1)(x)
    return x + res

data = ...
input = np.array(
    keras.preprocessing.sequence.pad_sequences(data["input"], padding="pre", dtype="float32"))
output = np.array(
    keras.preprocessing.sequence.pad_sequences(data["output"], padding="pre", dtype="float32"))
# Input shape: (723, 36, 22)
# Output shape: (723, 36, 1)

# Train data
train_features = input[100:]
train_labels = output[100:]
train_labels = tf.keras.utils.to_categorical(train_labels, num_classes=3)

# Test data
test_features = input[:100]
test_labels = output[:100]
test_labels = tf.keras.utils.to_categorical(test_labels, num_classes=3)

inputs = keras.Input(shape=(None, 22), dtype="float32", name="inputs")
# Ignore padding in inputs
x = layers.Masking(mask_value=0)(inputs)
x = transformer_encoder_block(x, head_size=64, num_heads=16, filters=3, dropout=0.2)
# Multiclass = Softmax (decrease, no change, increase)
outputs = layers.TimeDistributed(layers.Dense(3, activation="softmax", name="outputs"))(x)

# Create model
model = keras.Model(inputs=inputs, outputs=outputs)
# Compile model
model.compile(loss="categorical_crossentropy",
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.005),
              metrics=['accuracy'])
# Train model
history = model.fit(train_features, train_labels, epochs=10, batch_size=32)
# Evaluate on the test data
test_loss = model.evaluate(test_features, test_labels, verbose=0)
print("Test loss:", test_loss)
out = model.predict(test_features)
After padding, input is of shape (723, 36, 22), and output is of shape (723, 36, 1) (before converting output to one-hot, after which there are 3 output classes).
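As a quick sanity check (an illustrative sketch, not part of the original post), the encoder block is shape-preserving, since both residual additions require the output to match the input:
# Illustrative shape check: the residual additions (x + inputs, x + res)
# force the block's output to keep the input's (timesteps, features) shape.
demo_in = keras.Input(shape=(36, 22))
demo_out = transformer_encoder_block(demo_in, head_size=64, num_heads=16, filters=3)
print(demo_out.shape)  # (None, 36, 22)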
Here's an example output for ten epochs (trust me, more than ten doesn't make it better):
Epoch 1/10
20/20 [==============================] - 2s 62ms/step - loss: 10.7436 - accuracy: 0.3335
Epoch 2/10
20/20 [==============================] - 1s 62ms/step - loss: 10.7083 - accuracy: 0.3354
Epoch 3/10
20/20 [==============================] - 1s 60ms/step - loss: 10.6555 - accuracy: 0.3392
Epoch 4/10
20/20 [==============================] - 1s 62ms/step - loss: 10.7846 - accuracy: 0.3306
Epoch 5/10
20/20 [==============================] - 1s 60ms/step - loss: 10.7600 - accuracy: 0.3322
Epoch 6/10
20/20 [==============================] - 1s 59ms/step - loss: 10.7074 - accuracy: 0.3358
Epoch 7/10
20/20 [==============================] - 1s 59ms/step - loss: 10.6569 - accuracy: 0.3385
Epoch 8/10
20/20 [==============================] - 1s 60ms/step - loss: 10.7767 - accuracy: 0.3314
Epoch 9/10
20/20 [==============================] - 1s 61ms/step - loss: 10.7346 - accuracy: 0.3341
Epoch 10/10
20/20 [==============================] - 1s 62ms/step - loss: 10.7093 - accuracy: 0.3354
Test loss: [10.073813438415527, 0.375]
4/4 [==============================] - 0s 22ms/step
Using the same data on a simple LSTM model with the same shape yielded reasonable predictions with a steadily decreasing loss.
Tweaking the learning rate appears to have no effect, nor does stacking more transformer_encoder_block()s.
If anyone has any suggestions for how I can solve this, please let me know.

Training fails if model is saved beforehand

I noticed that saving a TensorFlow model prior to training causes the training to perform poorly. Obviously the solution is to just save the model later, but I'm curious why this happens in the first place.
The following code runs fine, and produces the following output:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(16, activation='relu'),
    layers.Dense(16, activation='relu'),
    layers.Dense(10)
])

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

### model.save('my_model') ###
model.fit(x_train, y_train, epochs=3, verbose=1)
Epoch 1/3
1875/1875 [==============================] - 3s 1ms/step - loss: 0.4513 - accuracy: 0.8688
Epoch 2/3
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2326 - accuracy: 0.9333
Epoch 3/3
1875/1875 [==============================] - 2s 1ms/step - loss: 0.1974 - accuracy: 0.9432
However, when the second-to-last line is uncommented, so that the model is saved before training, the training fails badly:
INFO:tensorflow:Assets written to: my_model\assets
Epoch 1/3
1875/1875 [==============================] - 3s 1ms/step - loss: 0.4156 - accuracy: 0.0948
Epoch 2/3
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2149 - accuracy: 0.1000
Epoch 3/3
1875/1875 [==============================] - 2s 1ms/step - loss: 0.1840 - accuracy: 0.0998
It's a known bug of TensorFlow. Use metrics=['sparse_categorical_accuracy'] instead of metrics=['accuracy'] when compiling the model to avoid it.
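For reference, a minimal sketch of that workaround applied to the compile call above:
# Workaround sketch: name the metric explicitly, so Keras does not have to
# infer which accuracy variant the 'accuracy' alias should resolve to.
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['sparse_categorical_accuracy'])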

TensorFlow with same accuracy in Python

I was just following a TensorFlow example from the book Hands-On Machine Learning with Scikit-Learn and TensorFlow but got weird results.
The example:
import tensorflow as tf
from tensorflow import keras

tf.__version__
keras.__version__

fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()
X_valid, X_train = X_train_full[:5000] / 255.0, X_train_full[5000:] / 255.0
y_valid, y_train = y_train_full[:5000] / 255.0, y_train_full[5000:] / 255.0

class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(300, activation="relu"),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(10, activation="softmax")
])
model.compile(loss="sparse_categorical_crossentropy",
              optimizer='sgd',
              metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=50, validation_data=(X_valid, y_valid))
As the epochs evolve we should see an improvement in accuracy, as indicated in the book:
Train on 55000 samples, validate on 5000 samples
Epoch 1/30
55000/55000 [==========] - 3s 55us/sample - loss: 1.4948 - acc: 0.5757 - val_loss: 1.0042 - val_acc: 0.7166
Epoch 2/30
55000/55000 [==========] - 3s 55us/sample - loss: 0.8690 - acc: 0.7318 - val_loss: 0.7549 - val_acc: 0.7616
[...]
Epoch 50/50
55000/55000 [==========] - 4s 72us/sample - loss: 0.3607 - acc: 0.8752 - val_loss: 0.3706 - val_acc: 0.8728
But when I ran I got the following:
Epoch 1/30
1719/1719 [==============================] - 3s 2ms/step - loss: 0.0623 - accuracy: 0.1005 - val_loss: 0.0011 - val_accuracy: 0.0914
Epoch 2/30
1719/1719 [==============================] - 3s 2ms/step - loss: 8.7637e-04 - accuracy: 0.1011 - val_loss: 5.2079e-04 - val_accuracy: 0.0914
Epoch 3/30
1719/1719 [==============================] - 3s 2ms/step - loss: 4.9200e-04 - accuracy: 0.1019 - val_loss: 3.4211e-04 - val_accuracy: 0.0914
[...]
Epoch 49/50
1719/1719 [==============================] - 3s 2ms/step - loss: 3.1710e-05 - accuracy: 0.0992 - val_loss: 3.2966e-05 - val_accuracy: 0.0914
Epoch 50/50
1719/1719 [==============================] - 3s 2ms/step - loss: 2.7711e-05 - accuracy: 0.1022 - val_loss: 3.1833e-05 - val_accuracy: 0.0914
So, as you can see, the reproduction got a much lower accuracy that never improved: it stayed at 0.0914 instead of reaching 0.8728.
Is there something wrong in my TensorFlow installation, setup or even in the code?
You cannot divide the labels as in y_valid, y_train = y_train_full[:5000] / 255.0, y_train_full[5000:] / 255.0: dividing by 255 turns the integer class indices into tiny floats, which sparse_categorical_crossentropy can no longer match to classes. The complete, corrected code follows:
import tensorflow as tf
from tensorflow import keras

tf.__version__
keras.__version__

fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()
X_train_full = X_train_full / 255.0
X_test = X_test / 255.0

class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(10, activation="softmax")
])
model.compile(loss="sparse_categorical_crossentropy",
              optimizer='sgd',
              metrics=['accuracy'])
history = model.fit(X_train_full, y_train_full, epochs=5, validation_data=(X_test, y_test))
It will give accuracy like:
Epoch 1/5
1875/1875 [==============================] - 2s 1ms/step - loss: 0.9880 - accuracy: 0.6923 - val_loss: 0.5710 - val_accuracy: 0.8054
Epoch 2/5
1875/1875 [==============================] - 2s 944us/step - loss: 0.5281 - accuracy: 0.8227 - val_loss: 0.5112 - val_accuracy: 0.8228
Epoch 3/5
1875/1875 [==============================] - 2s 913us/step - loss: 0.4720 - accuracy: 0.8391 - val_loss: 0.4782 - val_accuracy: 0.8345
Epoch 4/5
1875/1875 [==============================] - 2s 915us/step - loss: 0.4492 - accuracy: 0.8462 - val_loss: 0.4568 - val_accuracy: 0.8410
Epoch 5/5
1875/1875 [==============================] - 2s 935us/step - loss: 0.4212 - accuracy: 0.8550 - val_loss: 0.4469 - val_accuracy: 0.8444
Also, the adam optimizer may give better results than sgd.
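To see concretely why dividing the labels breaks training, here is a small sketch (with made-up labels, not from the original answer):
import numpy as np

y = np.array([9, 0, 0, 3, 0])  # integer class indices, as load_data returns them
print(y / 255.0)               # [0.0353 0. 0. 0.0118 0.] -- no longer valid class indices
# sparse_categorical_crossentropy expects integer class indices; after the
# division every label is a tiny float, so the loss can be driven towards 0
# while accuracy stays stuck near the frequency of class 0 (about 10%).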

RNN validation accuracy stays constant. Is this normal?

I'm new to machine learning and I'm building an RNN classifier for a problem similar to Named Entity Recognition (NER), but with only two tags.
I followed a tutorial to build the classifier, and now when fitting the model I get a constant validation accuracy for all the epochs, and part of me thinks this may be a mistake. Is it normal to have a constant val_accuracy?
This is my model:
import numpy as np
from tensorflow import keras as k
from tensorflow.keras.layers import (Input, Embedding, Bidirectional, LSTM,
                                     TimeDistributed, Dense)
from tensorflow.keras.models import Model

input = Input(shape=(66,))
word_embedding_size = 66
model = Embedding(input_dim=n_words, output_dim=word_embedding_size, input_length=66)(input)
model = Bidirectional(LSTM(units=word_embedding_size,
                           return_sequences=True,
                           dropout=0.5,
                           recurrent_dropout=0.5,
                           kernel_initializer=k.initializers.he_normal()))(model)
model = LSTM(units=word_embedding_size * 2,
             return_sequences=True,
             dropout=0.5,
             recurrent_dropout=0.5,
             kernel_initializer=k.initializers.he_normal())(model)
model = TimeDistributed(Dense(n_tags, activation="sigmoid"))(model)
out = model
model = Model(input, out)

adam = k.optimizers.Adam(lr=0.0005, beta_1=0.9, beta_2=0.999)
# Pass the Adam instance rather than the string 'adam'; otherwise the custom
# learning rate defined above is silently ignored.
model.compile(optimizer=adam, loss='binary_crossentropy', metrics=['accuracy'])
model.summary()

history = model.fit(X, np.array(Y), batch_size=256, epochs=10, validation_split=0.3, verbose=1)
And this is how the epochs look:
Epoch 1/10
2/2 [==============================] - 2s 801ms/step - loss: 0.6990 - accuracy: 0.3123 - val_loss: 0.5732 - val_accuracy: 0.9675
Epoch 2/10
2/2 [==============================] - 1s 334ms/step - loss: 0.5552 - accuracy: 0.9713 - val_loss: 0.4202 - val_accuracy: 0.9675
Epoch 3/10
2/2 [==============================] - 1s 310ms/step - loss: 0.3997 - accuracy: 0.9723 - val_loss: 0.2377 - val_accuracy: 0.9675
Epoch 4/10
2/2 [==============================] - 1s 303ms/step - loss: 0.2260 - accuracy: 0.9723 - val_loss: 0.1168 - val_accuracy: 0.9675
Epoch 5/10
2/2 [==============================] - 1s 312ms/step - loss: 0.1126 - accuracy: 0.9723 - val_loss: 0.0851 - val_accuracy: 0.9675
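One quick diagnostic worth running (an illustrative sketch, not from the original post; it assumes Y holds one-hot tag vectors, as the sigmoid output layer suggests) is to check the tag distribution, because with two heavily imbalanced tags a model that always predicts the majority tag sits at a constant, high-looking accuracy:
import numpy as np

# Count how often each tag occurs across all token positions. If one tag
# accounts for ~97% of positions, the flat val_accuracy of 0.9675 is just
# the majority-class baseline.
tags, counts = np.unique(np.argmax(np.asarray(Y), axis=-1), return_counts=True)
print(dict(zip(tags, counts / counts.sum())))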

How to take as Input a list of arrays in Keras API

Well, I'm new to machine learning, and to Keras as well. I'm trying to create a model that can take as input a list of arrays of arrays (a list of 6400 arrays within 2 arrays).
This is the problematic part of my code:
XFIT = np.array([x_train, XX_train])
YFIT = np.array([y_train, yy_train])
Inputs = keras.layers.Input(shape=(6400, 2))
hidden1 = keras.layers.Dense(units=100, activation="sigmoid")(Inputs)
hidden2 = keras.layers.Dense(units=100, activation='relu')(hidden1)
predictions = keras.layers.Dense(units=3, activation='softmax')(hidden2)
model = keras.Model(inputs=Inputs, outputs=predictions)
There's no error; however, the Input layer (Inputs) forces me to pass a (6400, 2) shape, as each array (x_train and XX_train) has 6400 arrays inside. The result, once the epochs run, is this:
Train on 2 samples
Epoch 1/5
2/2 [==============================] - 1s 353ms/sample - loss: 1.1966 - accuracy: 0.2488
Epoch 2/5
2/2 [==============================] - 0s 9ms/sample - loss: 1.1303 - accuracy: 0.2544
Epoch 3/5
2/2 [==============================] - 0s 9ms/sample - loss: 1.0982 - accuracy: 0.3745
Epoch 4/5
2/2 [==============================] - 0s 9ms/sample - loss: 1.0854 - accuracy: 0.3745
Epoch 5/5
2/2 [==============================] - 0s 9ms/sample - loss: 1.0835 - accuracy: 0.3745
Process finished with exit code 0
I can't train on more than two samples per epoch because of the input shape. How can I change this input?
I have tried other shapes but they gave me errors.
x_train, XX_train look like this:
[[[0.505834 0.795461]
[0.843175 0.975741]
[0.22349 0.035036]
...
[0.884796 0.867509]
[0.396942 0.659936]
[0.873194 0.05454 ]]
[[0.95968 0.281957]
[0.137547 0.390005]
[0.635382 0.901555]
...
[0.887062 0.486206]
[0.49827 0.949123]
[0.034411 0.983711]]]
Thank you, and forgive me if I've made any mistakes; first time with Keras and first time on StackOverflow :D
You are almost there. The problem is with:
XFIT = np.array([x_train, XX_train])
YFIT = np.array([y_train, yy_train])
Let's see with an example:
import numpy as np
x_train = np.random.random((6400, 2))
y_train = np.random.randint(2, size=(6400,1))
xx_train = np.array([x_train, x_train])
yy_train = np.array([y_train, y_train])
print(xx_train.shape)
(2, 6400, 2)
print(yy_train.shape)
(2, 6400, 1)
In this array we have 2 samples of 6400 rows each, so when we call model.fit, it only has 2 samples to train on. Instead, what we can do:
xx_train = np.vstack([x_train, x_train])
yy_train = np.vstack([y_train, y_train])
print(xx_train.shape)
(12800, 2)
print(yy_train.shape)
(12800, 1)
Now we have correctly joined both samples and can train:
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model

Inputs = Input(shape=(2,))
hidden1 = Dense(units=100, activation="sigmoid")(Inputs)
hidden2 = Dense(units=100, activation='relu')(hidden1)
predictions = Dense(units=1, activation='sigmoid')(hidden2)
model = Model(inputs=[Inputs], outputs=predictions)

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(xx_train, yy_train, batch_size=10, epochs=5)
Train on 12800 samples
Epoch 1/5
12800/12800 [==============================] - 3s 216us/sample - loss: 0.6978 - acc: 0.5047
Epoch 2/5
12800/12800 [==============================] - 2s 186us/sample - loss: 0.6952 - acc: 0.5018
Epoch 3/5
12800/12800 [==============================] - 3s 196us/sample - loss: 0.6942 - acc: 0.4962
Epoch 4/5
12800/12800 [==============================] - 3s 217us/sample - loss: 0.6938 - acc: 0.4898
Epoch 5/5
12800/12800 [==============================] - 3s 217us/sample - loss: 0.6933 - acc: 0.5002
