I'm currently facing an issue with my TensorFlow pipeline.
I don't know whether it's specific to TensorFlow or to Python.
I'm trying to compute a confusion matrix after training my compiled VGG16 model.
So I took the model object obtained after calling the fit method and tried to predict on the same features to compute my confusion matrix (CM).
But the message "Processus arrêté" ("Process stopped" in English) appears and the script stops working.
Here is the output:
Using TensorFlow backend.
Load audio features and labels : 100% Time: 0:00:50 528.41 B/s
VGG16 model with last layer changed
Number of label: 17322
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
vgg16 (Functional) (None, 4, 13, 512) 14713536
_________________________________________________________________
flatten (Flatten) (None, 26624) 0
_________________________________________________________________
dense (Dense) (None, 256) 6816000
_________________________________________________________________
dropout (Dropout) (None, 256) 0
_________________________________________________________________
dense_1 (Dense) (None, 1) 257
=================================================================
Total params: 21,529,793
Trainable params: 13,895,681
Non-trainable params: 7,634,112
_________________________________________________________________
2772/2772 [==============================] - 121s 44ms/step - loss: 0.2315 - acc: 0.9407 - val_loss: 0.0829 - val_acc: 0.9948
Processus arrêté
Here is the model :
def launch2(self):
    print("VGG16 model with last layer changed")
    x = np.array(self.getFeatures())[..., np.newaxis]
    print("Number of label: " + str(len(self.getLabels())))
    vgg_conv = VGG16(weights=None, include_top=False, input_shape=(128, 431, 1))
    # Freeze the layers except the last 4 layers
    for layer in vgg_conv.layers[:-4]:
        layer.trainable = False
    # Create the model
    model = tensorflow.keras.Sequential()
    # Add the vgg convolutional base model
    model.add(vgg_conv)
    opt = Adam(lr=1e-4)
    model.add(Flatten())
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation="sigmoid"))
    model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['acc'])
    model.summary()
    model.fit(x=x, y=self.getLabels(), shuffle=True, batch_size=5, epochs=1, validation_split=0.2, verbose=1)
    model.save('vggModelLastLayer.h5')
    self.testModel(model, x)
Here is the function that lets me compute the CM:
def testModel(self, model, x):
    print("Informations about model still processing. Last step is long")
    y_labels = [int(i) for i in self.getLabels().tolist()]
    # The model has a single sigmoid output, so threshold the probabilities at 0.5
    # (np.argmax over a single output column would always return 0)
    predicted_classes = (model.predict(x) > 0.5).astype(int).ravel()
    # Call model info (true labels, predicted labels)
    #self.modelInfo(y_labels, predicted_classes)
    from sklearn.metrics import classification_report
    from sklearn.metrics import confusion_matrix
    cm = confusion_matrix(y_labels, predicted_classes)
    target_names = ["Bulls", "No bulls"]
    print(classification_report(y_labels, predicted_classes, target_names=target_names))
    print(cm)
How could I fix this? Is this a memory leak or something?
Thank you in advance
I found out why this happened.
It was simply a memory problem: my machine did not have enough RAM to process the total amount of data I had.
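If the crash happens during the prediction step, one way to keep memory bounded is to predict on small slices and accumulate the confusion matrix as you go, instead of holding every prediction at once. The following is only a minimal sketch under assumptions: `confusion_matrix_in_chunks`, `predict_fn`, and `fake_predict` are hypothetical names that do not appear in the question, and the stand-in predictor replaces the real model so the snippet runs on its own. It will not help if loading the features into a single array already exhausts the RAM.

```python
import numpy as np

def confusion_matrix_in_chunks(predict_fn, x, y_true, chunk_size=256, threshold=0.5):
    """Accumulate a 2x2 confusion matrix while predicting one chunk at a time."""
    cm = np.zeros((2, 2), dtype=np.int64)  # rows: true class, columns: predicted class
    for start in range(0, len(x), chunk_size):
        stop = start + chunk_size
        # Only this slice is handed to the model, so peak memory stays bounded
        probs = np.asarray(predict_fn(x[start:stop])).ravel()
        preds = (probs > threshold).astype(int)
        truth = np.asarray(y_true[start:stop]).astype(int).ravel()
        np.add.at(cm, (truth, preds), 1)  # count each (true, predicted) pair
    return cm

# Hypothetical stand-in for a trained model: the "probability" is the mean of each sample
def fake_predict(batch):
    return batch.mean(axis=1)

x_demo = np.array([[0.9, 0.8], [0.1, 0.2], [0.7, 0.9], [0.3, 0.1], [0.2, 0.9]])
y_demo = np.array([1, 0, 1, 1, 0])
cm_demo = confusion_matrix_in_chunks(fake_predict, x_demo, y_demo, chunk_size=2)
print(cm_demo)
```

With a Keras model, `predict_fn` could be something like `lambda batch: model.predict(batch, batch_size=5)`, assuming the model has a single sigmoid output as in the question.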
I have pre-trained a model (my own saved model) with two classes, which I want to use for transfer learning to train a model with six classes.
I have loaded the pre-trained model into the new training script:
base_model = tf.keras.models.load_model("base_model_path")
How can I remove the top/head layer (a Conv1D layer)?
I see that in Keras one can use base_model.pop(), and for tf.keras.applications one can simply use include_top=False,
but is there something similar when using tf.keras and load_model?
(I have tried something like this:
for layer in base_model.layers[:-1]:
    layer.trainable = False
and then adding it to a new model (?), but I am not sure how to continue.)
Thanks for any help!
You could try something like this:
The base model is made up of a simple Conv1D network with an output layer with two classes:
import tensorflow as tf
samples = 100
timesteps = 5
features = 2
classes = 2
dummy_x, dummy_y = tf.random.normal((100, 5, 2)), tf.random.uniform((100, 1), maxval=2, dtype=tf.int32)
base_model = tf.keras.Sequential()
base_model.add(tf.keras.layers.Conv1D(32, 3, activation='relu', input_shape=(5, 2)))
base_model.add(tf.keras.layers.GlobalMaxPool1D())
base_model.add(tf.keras.layers.Dense(32, activation='relu'))
base_model.add( tf.keras.layers.Dense(classes, activation='softmax'))
base_model.compile(optimizer='adam', loss = tf.keras.losses.SparseCategoricalCrossentropy())
print(base_model.summary())
base_model.fit(dummy_x, dummy_y, batch_size=16, epochs=1)
base_model.save("base_model")
base_model = tf.keras.models.load_model("base_model")
Model: "sequential_8"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1d_31 (Conv1D) (None, 3, 32) 224
global_max_pooling1d_13 (Gl (None, 32) 0
obalMaxPooling1D)
dense_17 (Dense) (None, 32) 1056
dense_18 (Dense) (None, 2) 66
=================================================================
Total params: 1,346
Trainable params: 1,346
Non-trainable params: 0
_________________________________________________________________
None
7/7 [==============================] - 0s 3ms/step - loss: 0.6973
INFO:tensorflow:Assets written to: base_model/assets
The new model is also made up of a simple Conv1D network, but with an output layer with six classes. It also contains all the layers of the base_model except the first Conv1D layer and the last output layer:
classes = 6
dummy_x, dummy_y = tf.random.normal((100, 5, 2)), tf.random.uniform((100, 1), maxval=6, dtype=tf.int32)
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(64, 3, activation='relu', input_shape=(5, 2)))
model.add(tf.keras.layers.Conv1D(32, 2, activation='relu'))
for layer in base_model.layers[1:-1]: # Skip first and last layer
    model.add(layer)
model.add(tf.keras.layers.Dense(classes, activation='softmax'))
model.compile(optimizer='adam', loss = tf.keras.losses.SparseCategoricalCrossentropy())
print(model.summary())
model.fit(dummy_x, dummy_y, batch_size=16, epochs=1)
Model: "sequential_9"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1d_32 (Conv1D) (None, 3, 64) 448
conv1d_33 (Conv1D) (None, 2, 32) 4128
global_max_pooling1d_13 (Gl (None, 32) 0
obalMaxPooling1D)
dense_17 (Dense) (None, 32) 1056
dense_19 (Dense) (None, 6) 198
=================================================================
Total params: 5,830
Trainable params: 5,830
Non-trainable params: 0
_________________________________________________________________
None
7/7 [==============================] - 0s 3ms/step - loss: 1.8069
<keras.callbacks.History at 0x7f90c87a3c50>
I have a regression task and time series data. For each observation I need to predict one outcome value. My data is a series of images. I have hand-crafted 32 features from my images. Images have 10 channels. My data has 4D shape: (observations, time steps, channels, features), e.g. (3348, 121, 10, 32). After normalisation one channel for one observation looks like this:
matplotlib.pyplot.matshow(normalized[170,:,0,:].transpose())
The figure shows 121 time steps (x-axis) and each time step has features on rows (32). The intensity of feature value is shown in colors. So there seems to be something happening in time.
Question 1: How to apply RNN in such a task?
Could I somehow use CNN to extract information from my 3rd and 4th axis of my data (as shown in figure above)?
A proposed solution (and troubles ahead):
I have flattened the data into 3D. I suspect this throws away structure that the learner could have used, but at least this (almost) works:
m,n = xtrain4d.shape[:2]
xtrain3d = xtrain4d.reshape(m,n,-1)
Shape is now: (3348, 121, 320). Target ytrain.shape is (3348,).
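To see what this reshape does to the data, here is a small sketch on a toy array with the same axis layout (the toy sizes and the names `x4d`, `x3d`, and `restored` are made up for illustration). It suggests that no values are lost: the channel and feature axes are merged into one axis of length 10 * 32 = 320, so what disappears is only the explicit grouping by channel.

```python
import numpy as np

# Small stand-in with the same layout: (observations, time steps, channels, features)
x4d = np.arange(2 * 3 * 10 * 32, dtype=np.float32).reshape(2, 3, 10, 32)

m, n = x4d.shape[:2]
x3d = x4d.reshape(m, n, -1)  # channels and features merge into one axis of length 320
print(x3d.shape)

# With NumPy's default C order, channel c / feature f lands at column c * 32 + f
obs, t, c, f = 1, 2, 7, 5
same_value = bool(x3d[obs, t, c * 32 + f] == x4d[obs, t, c, f])

# The reshape is reversible: the 4D layout can be recovered exactly
restored = x3d.reshape(m, n, 10, 32)
print(same_value, np.array_equal(restored, x4d))
```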
Here's my LSTM:
def LSTMrecurrentNN(shape1, shape2):
    model = Sequential()
    model.add(LSTM(128, return_sequences=True, input_shape=(shape1, shape2)))
    model.add(Dropout(0.5))
    model.add(Dense(64, activation = 'relu'))
    model.add(Dropout(0.5))
    model.add(Dense(16, activation = 'relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation='linear'))
    return model
model = LSTMrecurrentNN(xtrain3d.shape[1], xtrain3d.shape[2])
model.summary()
Model: "sequential_12"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_11 (LSTM) (None, 121, 128) 229888
_________________________________________________________________
dropout_18 (Dropout) (None, 121, 128) 0
_________________________________________________________________
dense_21 (Dense) (None, 121, 64) 8256
_________________________________________________________________
dropout_19 (Dropout) (None, 121, 64) 0
_________________________________________________________________
dense_22 (Dense) (None, 121, 16) 1040
_________________________________________________________________
dropout_20 (Dropout) (None, 121, 16) 0
_________________________________________________________________
dense_23 (Dense) (None, 121, 1) 17
=================================================================
Total params: 239,201
Trainable params: 239,201
Non-trainable params: 0
_________________________________________________________________
Running the model:
epochs = 20
batchsize=128
learningrate=0.001
epsilon=0.1
# monitor validation progress:
early = EarlyStopping(monitor = "val_loss", mode = "min", patience = 10)
callbacks_list = [early]
# compile:
model.compile(loss = 'mean_squared_error',
              optimizer = Adam(learning_rate=learningrate, epsilon = epsilon),
              metrics = ['mse'])
# and train the model
history = model.fit(xtrain3d, ytrain,
                    epochs=epochs, batch_size=batchsize, verbose=0,
                    validation_split = 0.20,
                    callbacks = callbacks_list)
# predict:
test_predictions = model.predict(Xtest)
Training and validation performance looks ok:
But on the test set the model predicts one value for all observations! The figure below shows that, for 10 observations in the test set, the model already predicts a value of 3267 in the early time steps, which is close to the mean of the target y.
Statistics tell the same:
scipy.stats.describe(test_predictions[:,-1,0])
DescribeResult(nobs=1544, minmax=(3267.813, 3267.813),
mean=3267.8127, variance=5.964328e-08, skewness=1.0, kurtosis=-2.0)
For target y:
scipy.stats.describe(ytest)
DescribeResult(nobs=1544, minmax=(0.0, 8000.0),
mean=3312.1081606217617,
variance=1381985.8476585718, skewness=0.2847730511366937, kurtosis=0.20894280037919222)
Question 2: Why does the model predict the same value for all observations?
Any hints on how to check the LSTM's behaviour (states)? I would like to know how far back it "remembers".
I'm currently working on a Keras neural network for fun. I'm just learning the basics, but I can't get past this dimension problem:
So my input data (X) should be a 12x6 matrix, with 12 timestamps and 6 different data values for every timestamp:
X = np.zeros([2867, 12, 6])
Y = np.zeros([2867, 3])
My Output (Y) should be a one-hot encoded 3x1 vector.
Now I want to feed this data through the following LSTM model.
model = Sequential()
model.add(LSTM(30, activation="softsign", return_sequences=True, input_shape=(12, 6)))
model.add(Dense(3))
model.summary()
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x=X, y=Y, batch_size=100, epochs=1000, verbose=2, validation_split=0.2)
The Summary looks like this:
Model: "sequential"
Layer (type) Output Shape Param #
=================================================================
lstm (LSTM) (None, 12, 30) 4440
_________________________________________________________________
dense (Dense) (None, 12, 3) 93
=================================================================
Total params: 4,533
Trainable params: 4,533
Non-trainable params: 0
_________________________________________________________________
When I run this program, I get this error:
ValueError: Shapes (None, 3) and (None, 12, 3) are incompatible.
I already tried to reshape my data to a 72x1 vector, but this doesn't work either.
Maybe someone can help me shape my input data correctly :).
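You probably need to define your model as follows, since you used the categorical_crossentropy loss function. With return_sequences=False the LSTM returns only its last output, so the model output has shape (None, 3) and matches Y; the softmax activation turns that output into class probabilities:
model.add(LSTM(30, activation="softsign",
               return_sequences=False, input_shape=(12, 6)))
model.add(Dense(3, activation='softmax'))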
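The shape arithmetic behind the error can be sketched in plain NumPy. This is not Keras code: the random arrays below are hypothetical stand-ins for the LSTM output and the Dense weights, used only to show why one setting yields (batch, 12, 3) and the other (batch, 3).

```python
import numpy as np

rng = np.random.default_rng(0)
batch, timesteps, units, n_classes = 4, 12, 30, 3

# Stand-in for what an LSTM with return_sequences=True emits: one vector per time step
sequence_output = rng.normal(size=(batch, timesteps, units))
# Stand-in for the Dense(3) weights; a dense layer acts on the last axis only
w = rng.normal(size=(units, n_classes))

per_step_logits = sequence_output @ w          # one prediction per time step
last_step_logits = sequence_output[:, -1] @ w  # only the final time step is kept

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

probs = softmax(last_step_logits)
print(per_step_logits.shape, last_step_logits.shape)
```

Only the second shape lines up with a one-hot target of shape (batch, 3), which is what categorical_crossentropy expects here.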
I trained a model to predict topic categories using word2vec and an LSTM in Keras, and got about 98% accuracy during training. I saved the model and then loaded it in another file to try it on the test set. I used model.evaluate and model.predict, and the results were very different.
I'm using Keras with TensorFlow as the backend. The model summary is:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_1 (LSTM) (None, 22) 19624
_________________________________________________________________
dropout_1 (Dropout) (None, 22) 0
_________________________________________________________________
dense_1 (Dense) (None, 40) 920
_________________________________________________________________
activation_1 (Activation) (None, 40) 0
=================================================================
Total params: 20,544
Trainable params: 20,544
Non-trainable params: 0
_________________________________________________________________
None
The code:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.load_weights(os.path.join('model', 'lstm_model_weights.hdf5'))
score, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print()
print('Score: %1.4f' % score)
print('Evaluation Accuracy: %1.2f%%' % (acc*100))
predicted = model.predict(x_test, batch_size=batch_size)
acc2 = np.count_nonzero(predicted.argmax(1) == y_test.argmax(1))/y_test.shape[0]
print('Prediction Accuracy: %1.2f%%' % (acc2*100))
The output of this code is
39680/40171 [============================>.] - ETA: 0s
Score: 0.1192
Evaluation Accuracy: 97.50%
Prediction Accuracy: 9.03%
Can anyone tell me what I missed?
I think model evaluation works on dev set (or average of the dev-set accuracy if you use cross-validation), but prediction works on test set.
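Another possible explanation, which the thread does not confirm, is the metric itself. With loss='binary_crossentropy', Keras may resolve metrics=['accuracy'] to an element-wise binary accuracy, which scores each of the 40 outputs separately. Against a one-hot target, a model that outputs nothing above 0.5 is then still right on 39 of 40 entries, and 39/40 is exactly the 97.50% reported above. The NumPy sketch below only illustrates that arithmetic with made-up random predictions; it does not reproduce the model from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_classes = 1000, 40

# One-hot targets, shaped like a 40-category y_test
labels = rng.integers(0, n_classes, size=n_samples)
y_true = np.eye(n_classes)[labels]

# An uninformative "model": every class gets a small score below 0.5
y_pred = rng.uniform(0.0, 0.4, size=(n_samples, n_classes))

# Element-wise accuracy: threshold each of the 40 outputs and compare with the target
binary_acc = np.mean((y_pred > 0.5) == (y_true > 0.5))

# Per-sample accuracy: is the arg-max class the correct one?
categorical_acc = np.mean(y_pred.argmax(axis=1) == y_true.argmax(axis=1))

print(f"binary accuracy:      {binary_acc:.4f}")
print(f"categorical accuracy: {categorical_acc:.4f}")
```

If this is what happened, compiling with loss='categorical_crossentropy' (or at least metrics=['categorical_accuracy']) should make model.evaluate report a number comparable to the arg-max computation.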
I am a beginner at Deep Learning and am attempting to practice the implementation of Neural Networks in Python by performing audio analysis on a dataset. I have been following the Urban Sound Challenge tutorial and have completed the code for training the model, but I keep running into errors when trying to run the model on the test set.
Here is my code for creation of the model and training:
import numpy as np
from sklearn.preprocessing import LabelEncoder
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
num_labels = y.shape[1]
filter_size = 2
model = Sequential()
model.add(Dense(256, input_shape = (40,)))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_labels))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
model.fit(X, y, batch_size=32, epochs=40, validation_data=(val_X, val_Y))
Running model.summary() before fitting the model gives me:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_3 (Dense) (None, 256) 10496
_________________________________________________________________
activation_3 (Activation) (None, 256) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 256) 0
_________________________________________________________________
dense_4 (Dense) (None, 10) 2570
_________________________________________________________________
activation_4 (Activation) (None, 10) 0
=================================================================
Total params: 13,066
Trainable params: 13,066
Non-trainable params: 0
_________________________________________________________________
After fitting the model, I attempt to run it on one file so that it can classify the sound.
file_name = ".../UrbanSoundClassifier/test/Test/5.wav"
test_X, sample_rate = librosa.load(file_name,res_type='kaiser_fast')
mfccs = np.mean(librosa.feature.mfcc(y=test_X, sr=sample_rate, n_mfcc=40).T,axis=0)
test_X = np.array(mfccs)
print(model.predict(test_X))
However, I get
ValueError: Error when checking : expected dense_3_input to have shape
(None, 40) but got array with shape (40, 1)
Would someone kindly like to point me in the right direction as to how I should be testing the model? I do not know what the input for model.predict() should be.
Full code can be found here.
So:
The easiest fix is simply to reshape test_X:
test_X = test_X.reshape((1, 40))
A more sophisticated approach is to reuse the pipeline you used to create the training and validation sets for the test set as well. Please notice that the processing you applied to the data files is completely different in the test case. I'd create a test dataframe:
test_dataframe = pd.DataFrame({'filename': ["here path to test file"]})
and then reuse the existing pipeline that creates the validation set.
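As a quick illustration of the reshape, the sketch below uses a made-up vector in place of the real MFCC features (the names `single_sample` and `also_single` are hypothetical). The point is that a model whose input shape is (None, 40) expects a 2D array with one row per sample, even when there is only one sample.

```python
import numpy as np

# Stand-in for the 40 averaged MFCC coefficients of one audio file
mfccs = np.linspace(-1.0, 1.0, 40)
print(mfccs.shape)

# Add a leading batch axis so the array reads as "1 sample with 40 features"
single_sample = mfccs.reshape(1, -1)
also_single = np.expand_dims(mfccs, axis=0)  # equivalent alternative
print(single_sample.shape)
```

Either form can then be passed to model.predict, which returns one row of class probabilities per input row.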