I have made a model that tries to predict the probability of each piano key being played at a time step, given all the time steps before it. I tried a GRU network with 88 outputs (one for every piano key).
input shape = (600,88,)
desired output/ label shape = (88, )
import numpy as np
import midi_processer
from keras import models
from keras import layers
x_train, x_test = np.load("samples.npy", mmap_mode='r'), np.load("test_samples.npy", mmap_mode='r')
y_train, y_test = np.load("labels.npy", mmap_mode='r'), np.load("test_labels.npy", mmap_mode='r')
def build_model():
    model = models.Sequential()
    model.add(layers.GRU(512,activation='tanh', recurrent_activation='hard_sigmoid'))
    model.add(layers.Dense(88, activation = 'sigmoid'))
    return model
x_partial, x_val = x_train[:13000], x_train[13000:]
y_partial, y_val = y_train[:13000], y_train[13000:]
model = build_model()
model.compile(optimizer = 'adam',
              loss = 'binary_crossentropy',
              metrics = ['accuracy'])
history = model.fit(x_partial, y_partial, batch_size = 50, epochs = , validation_data= (x_val,y_val))
Instead of learning normally, my model stays at an almost constant accuracy throughout all of the epochs, and the loss becomes increasingly negative:
Epoch 1/15
260/260 [==============================] - 998s 4s/step - loss: -0.1851 - accuracy: 0.0298 - val_loss: -8.8735 - val_accuracy: 0.0310
Epoch 2/15
260/260 [==============================] - 827s 3s/step - loss: -33.6520 - accuracy: 0.0382 - val_loss: -56.0122 - val_accuracy: 0.0310
Epoch 3/15
260/260 [==============================] - 844s 3s/step - loss: -78.6130 - accuracy: 0.0382 - val_loss: -98.2798 - val_accuracy: 0.0310
Epoch 4/15
260/260 [==============================] - 906s 3s/step - loss: -121.0963 - accuracy: 0.0382 - val_loss: -139.3440 - val_accuracy: 0.0310
Epoch 5/15
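A note on the negative loss above: binary cross-entropy stays non-negative only when every label lies in [0, 1]. The sketch below is a plain-Python illustration of the per-element formula, not the Keras implementation. It rests on an assumption that the question does not confirm, namely that labels.npy might hold values larger than 1 (for example MIDI velocities) rather than 0/1 flags. The function name `binary_crossentropy` and the sample values are hypothetical.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-element binary cross-entropy, with the prediction clipped away from 0 and 1."""
    p = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1.0 - y_true) * math.log(1.0 - p))

# A proper 0/1 label always gives a non-negative loss.
loss_binary = binary_crossentropy(1.0, 0.9)

# A label outside [0, 1] (hypothetical velocity of 64) makes the loss negative,
# and pushing the prediction toward 1 makes it more negative still.
loss_velocity_a = binary_crossentropy(64.0, 0.9)
loss_velocity_b = binary_crossentropy(64.0, 0.999)

print(loss_binary, loss_velocity_a, loss_velocity_b)
```

If the labels do turn out to exceed 1, converting them to 0/1 flags before training (for example with `y > 0`) would keep the loss in its usual range.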
I'm new to machine learning and I'm building an RNN classifier for a problem similar to Named Entity Recognition (NER), but with only two tags.
I followed a tutorial to build the classifier, and now when fitting the model I get a constant validation accuracy for all the epochs. Part of me thinks this may be a mistake. Is it normal to have a constant val_accuracy?
This is my model:
input = Input(shape=(66,))
word_embedding_size = 66
model = Embedding(input_dim=n_words, output_dim=word_embedding_size, input_length=66)(input)
model = Bidirectional(LSTM(units=word_embedding_size,
model = LSTM(units=word_embedding_size * 2,
model = TimeDistributed(Dense(n_tags, activation="sigmoid"))(model)
out = model
model = Model(input, out)
adam = k.optimizers.Adam(lr=0.0005, beta_1=0.9, beta_2=0.999)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(X, np.array(Y), batch_size=256, epochs=10, validation_split=0.3, verbose=1)
And this is how the epochs look:
Epoch 1/10
2/2 [==============================] - 2s 801ms/step - loss: 0.6990 - accuracy: 0.3123 - val_loss: 0.5732 - val_accuracy: 0.9675
Epoch 2/10
2/2 [==============================] - 1s 334ms/step - loss: 0.5552 - accuracy: 0.9713 - val_loss: 0.4202 - val_accuracy: 0.9675
Epoch 3/10
2/2 [==============================] - 1s 310ms/step - loss: 0.3997 - accuracy: 0.9723 - val_loss: 0.2377 - val_accuracy: 0.9675
Epoch 4/10
2/2 [==============================] - 1s 303ms/step - loss: 0.2260 - accuracy: 0.9723 - val_loss: 0.1168 - val_accuracy: 0.9675
Epoch 5/10
2/2 [==============================] - 1s 312ms/step - loss: 0.1126 - accuracy: 0.9723 - val_loss: 0.0851 - val_accuracy: 0.9675
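A constant val_accuracy like the one above can appear when one tag dominates and the model predicts that tag for every token. The sketch below is a plain-Python illustration with made-up tag sequences; the data and the helper name `majority_baseline_accuracy` are hypothetical and not taken from the question. It computes the accuracy of always predicting the most common tag, which is a useful number to compare against the reported 0.9675.

```python
from collections import Counter

def majority_baseline_accuracy(tag_sequences):
    """Accuracy obtained by predicting the most frequent tag for every token."""
    counts = Counter(tag for seq in tag_sequences for tag in seq)
    most_common_count = counts.most_common(1)[0][1]
    return most_common_count / sum(counts.values())

# Hypothetical validation labels in which tag 0 dominates.
val_tags = [
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
baseline = majority_baseline_accuracy(val_tags)
print(baseline)  # 19 of the 20 tokens carry tag 0
```

If the baseline on the real validation labels matches the constant val_accuracy, the model is probably not yet distinguishing the two tags, and a per-tag metric such as precision or recall would be more informative than accuracy.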
I am working on an image classification problem. My training and testing split uses the same random_state, and the model definition is the same. However, when I run the model multiple times, three out of four times the model does not learn and the loss does not go down; one out of four times the model learns and I get good classification results. I suspect the randomness comes from ImageDataGenerator(), but I cannot figure out how to make the model learn every time.
I have a relatively small labeled dataset, and I have no way to increase the data size.
I tried different optimizers and different batch sizes; it doesn't help. I found that when I reduce the number of trainable layers and make the later fully-connected layers smaller (reduced to 256 units), the model starts to learn every time. But why does a big network not learn well even on the training set? My understanding is that such a model would overfit, so why, in this case, is it not learning at all?
filenames = os.listdir(r"XXX")
ref_db= pd.read_csv(r"XXX")
ref_db['obj_id']= [str(i)+ '.tif' for i in ref_db.OBJECTID.values ]
ref_db2= ref_db[['label', 'obj_id' ]]
ref_db2['label'] = ref_db2['label'].apply(str)
train_df, validate_df = train_test_split(ref_db2, test_size=0.20, random_state=42)
train_df = train_df.reset_index(drop=True)
validate_df = validate_df.reset_index(drop=True)
total_train = train_df.shape[0]
total_validate = validate_df.shape[0]
train_datagen = ImageDataGenerator(
train_generator = train_datagen.flow_from_dataframe(
inputs= Input(shape=(IMAGE_WIDTH, IMAGE_HEIGHT, 3))
base_model = VGG19(weights='imagenet', include_top=False,)
for layer in base_model.layers[:-3]:
    layer.trainable = False
x = base_model(inputs)
x = Flatten()(x)
x = Dense(1024, activation="relu")(x)
#x = Dropout(0.5)(x)
x = Dense(512, activation="relu")(x)
predictions = Dense(1, activation="sigmoid")(x)
model_vgg= Model(inputs=inputs , outputs=predictions)
model_vgg.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['accuracy'])
history = model_vgg.fit_generator(
This is the unwanted behavior: the model is not learning, all observations are predicted as 1, and the loss is not dropping.
Found 756 validated image filenames belonging to 2 classes.
Found 190 validated image filenames belonging to 2 classes.
Epoch 1/50
- 4s - loss: 4.0464 - acc: 0.6203 - val_loss: 4.9820 - val_acc: 0.6875
Epoch 2/50
- 2s - loss: 4.3811 - acc: 0.7252 - val_loss: 4.8856 - val_acc: 0.6935
Epoch 3/50
- 2s - loss: 5.0209 - acc: 0.6851 - val_loss: 5.3556 - val_acc: 0.6641
Epoch 4/50
- 2s - loss: 4.3583 - acc: 0.7266 - val_loss: 4.1142 - val_acc: 0.7419
Epoch 5/50
- 2s - loss: 4.9317 - acc: 0.6907 - val_loss: 4.7329 - val_acc: 0.7031
Epoch 6/50
- 2s - loss: 4.6275 - acc: 0.7097 - val_loss: 5.3998 - val_acc: 0.6613
Epoch 7/50
This is the expected behavior: the model is learning, both 1 and 0 are predicted, and the loss is dropping.
Found 756 validated image filenames belonging to 2 classes.
Found 190 validated image filenames belonging to 2 classes.
Epoch 1/50
- 4s - loss: 2.1181 - acc: 0.6484 - val_loss: 0.8013 - val_acc: 0.6562
Epoch 2/50
- 2s - loss: 0.6609 - acc: 0.7096 - val_loss: 0.5670 - val_acc: 0.7581
Epoch 3/50
- 2s - loss: 0.6539 - acc: 0.6912 - val_loss: 0.5923 - val_acc: 0.6953
Epoch 4/50
- 2s - loss: 0.5695 - acc: 0.7083 - val_loss: 0.5426 - val_acc: 0.6774
Epoch 5/50
- 2s - loss: 0.5262 - acc: 0.7176 - val_loss: 0.5386 - val_acc: 0.6875
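The loss values of roughly 4 to 5 in the failing runs are consistent with a saturated sigmoid output. As a rough check, assume that Keras clips predictions to [eps, 1 - eps] with its default eps of 1e-7 before taking the log. A model that outputs 1 for every image then pays about -log(1e-7), or roughly 16.1, for each class-0 image and almost nothing for each class-1 image. The sketch below works through that arithmetic in plain Python. The helper `loss_when_always_predicting_one` is hypothetical, and the class-1 fractions are read from the acc column of the failing log.

```python
import math

EPS = 1e-7  # assumed clipping value (Keras backend default epsilon)

def loss_when_always_predicting_one(fraction_of_ones):
    """Mean binary cross-entropy if the model outputs 1 for every sample."""
    loss_on_ones = -math.log(1.0 - EPS)   # essentially 0
    loss_on_zeros = -math.log(EPS)        # about 16.1
    return fraction_of_ones * loss_on_ones + (1.0 - fraction_of_ones) * loss_on_zeros

# acc values taken from epochs 2 and 5 of the failing run
for acc in (0.7252, 0.6907):
    print(acc, round(loss_when_always_predicting_one(acc), 2))
```

These estimates land close to the logged losses of 4.38 and 4.93, which suggests that in the failing runs the output unit is stuck at 1 from the start rather than learning slowly.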
I have an image classification model built with a CNN. I did transfer learning using MobileNet: at the end of MobileNet I added 4 layers to learn weights for my images, while the MobileNet weights themselves are not updated. Training reached about 91% accuracy, so I evaluated the model on the same training set (train_generator), but this time I got a much lower accuracy of about 41%. Why are the results different, even though I used the same training set? Is there a difference between the accuracy reported by model.fit_generator and by model.evaluate_generator, or is something wrong with the data? How can I improve the accuracy? My code is below.
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.mobilenet import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
base_model = MobileNet(weights='imagenet', include_top=False)
x=Dense(1024, activation='relu')(x)
x=Dense(1024, activation='relu')(x)
x=Dense(512, activation='relu')(x)
preds=Dense(7, activation='softmax')(x)
model=Model(inputs=base_model.input, outputs=preds)
for layer in model.layers[:-4]:
for layer in model.layers[-4:]:
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
train_generator = train_datagen.flow_from_directory('/Users/LG/Desktop/finger',
target_size=(224, 224),
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['accuracy'])
Epoch 1/10
17/17 [==============================] - 53s 3s/step - loss: 1.9354 - acc: 0.3026
Epoch 2/10
17/17 [==============================] - 52s 3s/step - loss: 1.1933 - acc: 0.5276
Epoch 3/10
17/17 [==============================] - 52s 3s/step - loss: 0.8936 - acc: 0.6787
Epoch 4/10
17/17 [==============================] - 54s 3s/step - loss: 0.6040 - acc: 0.7843
Epoch 5/10
17/17 [==============================] - 53s 3s/step - loss: 0.5367 - acc: 0.8080
Epoch 6/10
17/17 [==============================] - 55s 3s/step - loss: 0.2676 - acc: 0.9099
Epoch 7/10
17/17 [==============================] - 52s 3s/step - loss: 0.4531 - acc: 0.8387
Epoch 8/10
17/17 [==============================] - 53s 3s/step - loss: 0.3580 - acc: 0.8747
Epoch 9/10
17/17 [==============================] - 55s 3s/step - loss: 0.1963 - acc: 0.9301
Epoch 10/10
17/17 [==============================] - 53s 3s/step - loss: 0.2237 - acc: 0.9133
model.evaluate_generator(train_generator, steps=5)
[2.169835996627808, 0.41875]
I'm trying to do a simple Keras Neural Network but the model doesn't fit:
Train on 562 samples, validate on 188 samples
Epoch 1/20
562/562 [==============================] - 1s 1ms/step - loss: 8.1130 - acc: 0.4911 - val_loss: 7.6320 - val_acc: 0.5213
Epoch 2/20
562/562 [==============================] - 0s 298us/step - loss: 8.1130 - acc: 0.4911 - val_loss: 7.6320 - val_acc: 0.5213
Epoch 3/20
562/562 [==============================] - 0s 295us/step - loss: 8.1130 - acc: 0.4911 - val_loss: 7.6320 - val_acc: 0.5213
Epoch 4/20
562/562 [==============================] - 0s 282us/step - loss: 8.1130 - acc: 0.4911 - val_loss: 7.6320 - val_acc: 0.5213
Epoch 5/20
562/562 [==============================] - 0s 289us/step - loss: 8.1130 - acc: 0.4911 - val_loss: 7.6320 - val_acc: 0.5213
Epoch 6/20
562/562 [==============================] - 0s 265us/step - loss: 8.1130 - acc: 0.4911 - val_loss: 7.6320 - val_acc: 0.5213
The database is structured in a CSV file like this:
doc  venda   img1    img2   v1                   v2                   gt
RG   venda1  img123  img12  [3399, 162675, ...]  [3399, 162675, ...]  1
My intent is to use the difference between the v1 and v2 vectors to tell whether img1 and img2 are from the same class.
The code:
from sklearn.model_selection import train_test_split
(X_train, X_test, Y_train, Y_test) = train_test_split(train, train_labels, test_size=0.25, random_state=42)
# create the model
model = Sequential()
model.add(Dense(10, activation="relu", input_dim=10, kernel_initializer="uniform"))
model.add(Dense(6, activation="relu", kernel_initializer="uniform"))
model.add(Dense(1, activation='sigmoid'))
validation_data=(np.array(X_test), np.array(Y_test)),
What am I doing wrong?
Divide the difference vector by some constant so that the feature values fall in the range 0 to 1 or -1 to 1. Right now the values are too big and the loss comes out high. A network learns faster if the data is normalized properly.
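A minimal sketch of that suggestion, assuming the features are the element-wise difference of v1 and v2 and that the constant is the largest absolute value seen in the training data. The arrays and the helper name `scale_to_unit_range` are hypothetical and only illustrate the idea.

```python
import numpy as np

def scale_to_unit_range(diff, scale=None):
    """Divide by a constant so values fall within [-1, 1]; returns (scaled, scale)."""
    if scale is None:
        scale = np.max(np.abs(diff))
        if scale == 0:
            scale = 1.0  # avoid dividing by zero on an all-zero input
    return diff / scale, scale

# Hypothetical feature vectors for two image pairs
v1 = np.array([[3399.0, 162675.0, 12.0], [40.0, 981.0, 77.0]])
v2 = np.array([[3100.0, 150000.0, 15.0], [45.0, 1200.0, 70.0]])

train_diff = v1 - v2
scaled_train, scale = scale_to_unit_range(train_diff)

# Reuse the constant computed on the training data for any later data.
new_diff = np.array([[100.0, -5000.0, 2.0]])
scaled_new, _ = scale_to_unit_range(new_diff, scale)
```

A single global constant keeps the relative sizes of the columns; if the columns have very different scales, per-column statistics may work better.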
I have had success normalizing features using this function. I forget exactly why I use the same mu and sigma from the train set on the test and validation sets, but I am pretty sure I learned it during the deep.ai course on Coursera.
def normalize_features(dataset):
    mu = np.mean(dataset, axis = 0) # per-column mean
    sigma = np.std(dataset, axis = 0) # per-column standard deviation
    norm_parameters = {'mu': mu,
                       'sigma': sigma}
    return (dataset-mu)/(sigma+1e-10), norm_parameters
# Normalize X data, reusing the mu and sigma from the train set for val and test
x_train, norm_parameters = normalize_features(x_train)
x_val = (x_val-norm_parameters['mu'])/(norm_parameters['sigma']+1e-10)
x_test = (x_test-norm_parameters['mu'])/(norm_parameters['sigma']+1e-10)
My model trains fine on a CPU machine, but I am running into an issue when trying to rerun it on our cluster (using a single GPU and the same dataset). When training on a GPU machine, validation loss and accuracy do not improve from epoch to epoch (see below). This was not the case on a CPU machine (I was able to achieve a validation accuracy of ~0.8 after 20 epochs).
Keras 2.1.3
TensorFlow backend
70/20/10 train/dev/test
~ 7000 images
model is based on ResNet50
import sys
import math
import os
import glob
from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential, Model
from keras.layers import Flatten, Dense
from keras import backend as k
from keras.callbacks import ModelCheckpoint, CSVLogger, EarlyStopping
############ Training parameters ##################
img_width, img_height = 224, 224
batch_size = 32
epochs = 100
############ Define the data ##################
train_data_dir = '/mnt/data/train'
validation_data_dir = '/mnt/data/validate'
train_data_dir_class1 = os.path.join(train_data_dir,'class1', '*.jpg')
train_data_dir_class2 = os.path.join(train_data_dir, 'class2', '*.jpg')
validation_data_dir_class1 = os.path.join(validation_data_dir, 'class1', '*.jpg')
validation_data_dir_class2 = os.path.join(validation_data_dir, 'class2', '*.jpg')
# number of training and validation samples
nb_train_samples = len(glob.glob(train_data_dir_class1)) + len(glob.glob(train_data_dir_class2))
nb_validation_samples = len(glob.glob(validation_data_dir_class1)) + len(glob.glob(validation_data_dir_class2))
############ Define the model ##################
model = applications.resnet50.ResNet50(weights = "imagenet",
include_top = False,
input_shape = (img_width, img_height, 3))
for layer in model.layers:
    layer.trainable = False
# Adding a FC layer
x = model.output
x = Flatten()(x)
predictions = Dense(1, activation = "sigmoid")(x)
# creating the final model
model_final = Model(inputs = model.input, outputs = predictions)
# compile the model
model_final.compile(loss = "binary_crossentropy",
optimizer = optimizers.Adam(lr = 0.001,
beta_1 = 0.9,
beta_2 = 0.999,
epsilon = 1e-10),
metrics = ["accuracy"])
# train and test generators
train_datagen = ImageDataGenerator(rescale = 1./255,
horizontal_flip = True,
fill_mode = "nearest",
zoom_range = 0.3,
width_shift_range = 0.3,
height_shift_range = 0.3,
rotation_range = 30)
test_datagen = ImageDataGenerator(rescale = 1./255)
train_generator = train_datagen.flow_from_directory(train_data_dir,
target_size = (img_height, img_width),
batch_size = batch_size,
class_mode = "binary",
seed = 2018)
validation_generator = test_datagen.flow_from_directory(validation_data_dir,
target_size = (img_height, img_width),
class_mode = "binary",
seed = 2018)
early = EarlyStopping(monitor = 'val_loss', min_delta = 10e-5, patience = 10, verbose = 1, mode = 'auto')
performance_log = CSVLogger('/mnt/results/vanilla_model_log.csv', separator = ',', append = False)
# Train the model
model_final.fit_generator(generator = train_generator,
steps_per_epoch = math.ceil(train_generator.samples / batch_size),
epochs = epochs,
validation_data = validation_generator,
validation_steps = math.ceil(validation_generator.samples / batch_size),
callbacks = [early, performance_log])
# Save the model
Training Log
Epoch 1/100
151/151 [==============================] - 237s 2s/step - loss: 0.7234 - acc: 0.5240 - val_loss: 0.9899 - val_acc: 0.5425
Epoch 2/100
151/151 [==============================] - 65s 428ms/step - loss: 0.6491 - acc: 0.6228 - val_loss: 1.0248 - val_acc: 0.5425
Epoch 3/100
151/151 [==============================] - 65s 429ms/step - loss: 0.6091 - acc: 0.6648 - val_loss: 1.0377 - val_acc: 0.5425
Epoch 4/100
151/151 [==============================] - 64s 426ms/step - loss: 0.5829 - acc: 0.6968 - val_loss: 1.0459 - val_acc: 0.5425
Epoch 5/100
151/151 [==============================] - 64s 427ms/step - loss: 0.5722 - acc: 0.7070 - val_loss: 1.0472 - val_acc: 0.5425
Epoch 6/100
151/151 [==============================] - 64s 427ms/step - loss: 0.5582 - acc: 0.7166 - val_loss: 1.0501 - val_acc: 0.5425
Epoch 7/100
151/151 [==============================] - 64s 424ms/step - loss: 0.5535 - acc: 0.7188 - val_loss: 1.0492 - val_acc: 0.5425
Epoch 8/100
151/151 [==============================] - 64s 426ms/step - loss: 0.5377 - acc: 0.7287 - val_loss: 1.0209 - val_acc: 0.5425
Epoch 9/100
151/151 [==============================] - 64s 425ms/step - loss: 0.5328 - acc: 0.7368 - val_loss: 1.0062 - val_acc: 0.5425
Epoch 10/100
151/151 [==============================] - 65s 432ms/step - loss: 0.5296 - acc: 0.7381 - val_loss: 1.0016 - val_acc: 0.5425
Epoch 11/100
151/151 [==============================] - 65s 430ms/step - loss: 0.5231 - acc: 0.7419 - val_loss: 1.0021 - val_acc: 0.5425
Since I was able to get good results on a CPU machine, I hypothesized that the validation loss/accuracy must be calculated incorrectly at the end of each epoch. To test this theory I used the train set as the validation set: if validation loss/accuracy is calculated correctly, we should see roughly the same values for train and validation loss and accuracy. As you can see below, the validation loss values are not the same as the training loss values, which makes me believe the validation loss is calculated incorrectly at the end of each epoch. Why does this happen? What are the possible solutions?
Modified Code
import sys
import math
import os
import glob
from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential, Model
from keras.layers import Flatten, Dense
from keras import backend as k
from keras.callbacks import ModelCheckpoint, CSVLogger, EarlyStopping
############ Training parameters ##################
img_width, img_height = 224, 224
batch_size = 32
epochs = 100
############ Define the data ##################
train_data_dir = '/mnt/data/train'
validation_data_dir = '/mnt/data/train' # redefined validation set to test accuracy of validation loss/accuracy calculations
train_data_dir_class1 = os.path.join(train_data_dir,'class1', '*.jpg')
train_data_dir_class2 = os.path.join(train_data_dir, 'class2', '*.jpg')
validation_data_dir_class1 = os.path.join(validation_data_dir, 'class1', '*.jpg')
validation_data_dir_class2 = os.path.join(validation_data_dir, 'class2', '*.jpg')
# number of training and validation samples
nb_train_samples = len(glob.glob(train_data_dir_class1)) + len(glob.glob(train_data_dir_class2))
nb_validation_samples = len(glob.glob(validation_data_dir_class1)) + len(glob.glob(validation_data_dir_class2))
############ Define the model ##################
model = applications.resnet50.ResNet50(weights = "imagenet",
include_top = False,
input_shape = (img_width, img_height, 3))
for layer in model.layers:
    layer.trainable = False
# Adding a FC layer
x = model.output
x = Flatten()(x)
predictions = Dense(1, activation = "sigmoid")(x)
# creating the final model
model_final = Model(inputs = model.input, outputs = predictions)
# compile the model
model_final.compile(loss = "binary_crossentropy",
optimizer = optimizers.Adam(lr = 0.001,
beta_1 = 0.9,
beta_2 = 0.999,
epsilon = 1e-10),
metrics = ["accuracy"])
# train and test generators
train_datagen = ImageDataGenerator(rescale = 1./255,
horizontal_flip = True,
fill_mode = "nearest",
zoom_range = 0.3,
width_shift_range = 0.3,
height_shift_range = 0.3,
rotation_range = 30)
test_datagen = ImageDataGenerator(rescale = 1./255)
train_generator = train_datagen.flow_from_directory(train_data_dir,
target_size = (img_height, img_width),
batch_size = batch_size,
class_mode = "binary",
seed = 2018)
validation_generator = test_datagen.flow_from_directory(validation_data_dir,
target_size = (img_height, img_width),
class_mode = "binary",
seed = 2018)
early = EarlyStopping(monitor = 'val_loss', min_delta = 10e-5, patience = 10, verbose = 1, mode = 'auto')
performance_log = CSVLogger('/mnt/results/vanilla_model_log.csv', separator = ',', append = False)
# Train the model
model_final.fit_generator(generator = train_generator,
steps_per_epoch = math.ceil(train_generator.samples / batch_size),
epochs = epochs,
validation_data = validation_generator,
validation_steps = math.ceil(validation_generator.samples / batch_size),
callbacks = [early, performance_log])
# Save the model
Training log for the modified code:
Epoch 1/100
151/151 [==============================] - 251s 2s/step - loss: 0.6804 - acc: 0.5910 - val_loss: 0.6923 - val_acc: 0.5469
Epoch 2/100
151/151 [==============================] - 87s 578ms/step - loss: 0.6258 - acc: 0.6523 - val_loss: 0.6938 - val_acc: 0.5469
Epoch 3/100
151/151 [==============================] - 88s 580ms/step - loss: 0.5946 - acc: 0.6874 - val_loss: 0.7001 - val_acc: 0.5469
Epoch 4/100
151/151 [==============================] - 88s 580ms/step - loss: 0.5718 - acc: 0.7086 - val_loss: 0.7036 - val_acc: 0.5469
Epoch 5/100
151/151 [==============================] - 87s 578ms/step - loss: 0.5634 - acc: 0.7157 - val_loss: 0.7067 - val_acc: 0.5469
Epoch 6/100
151/151 [==============================] - 87s 578ms/step - loss: 0.5467 - acc: 0.7243 - val_loss: 0.7099 - val_acc: 0.5469
Epoch 7/100
151/151 [==============================] - 87s 578ms/step - loss: 0.5392 - acc: 0.7317 - val_loss: 0.7096 - val_acc: 0.5469
Epoch 8/100
151/151 [==============================] - 87s 578ms/step - loss: 0.5287 - acc: 0.7387 - val_loss: 0.7083 - val_acc: 0.5469
Epoch 9/100
151/151 [==============================] - 87s 575ms/step - loss: 0.5306 - acc: 0.7385 - val_loss: 0.7088 - val_acc: 0.5469
Epoch 10/100
151/151 [==============================] - 87s 577ms/step - loss: 0.5303 - acc: 0.7318 - val_loss: 0.7111 - val_acc: 0.5469
Epoch 11/100
151/151 [==============================] - 87s 578ms/step - loss: 0.5157 - acc: 0.7474 - val_loss: 0.7143 - val_acc: 0.5469
A very quick idea that might help.
I think the image labels are assigned randomly by the two image data generators before training.
As a result, the two image data generators give different label distributions.
That's why training accuracy goes up while validation accuracy stays around 50%.
I haven't fully checked the documentation of the image data generator, but I hope this helps.
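One way to check this idea is to look at how subdirectory names turn into label indices. The sketch below does not call Keras. It mimics, in plain Python, the alphanumeric ordering of subdirectories that the flow_from_directory() documentation describes, using temporary directories with hypothetical class names. With the real generators, comparing `train_generator.class_indices` with `validation_generator.class_indices` would be the equivalent check.

```python
import os
import tempfile

def infer_class_indices(directory):
    """Map each subdirectory name to an index, in sorted (alphanumeric) order."""
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}

with tempfile.TemporaryDirectory() as root:
    train_dir = os.path.join(root, "train")
    val_dir = os.path.join(root, "validate")
    for parent in (train_dir, val_dir):
        # Created in reverse order on purpose: creation order should not matter.
        for class_name in ("class2", "class1"):
            os.makedirs(os.path.join(parent, class_name))

    train_indices = infer_class_indices(train_dir)
    val_indices = infer_class_indices(val_dir)

print(train_indices, val_indices)
```

If both directories contain the same subdirectory names, the two mappings match, so a label mismatch would have to come from somewhere else, such as differently named folders.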
The classes argument of flow_from_directory() describes how the training labels are set up:
classes: optional list of class subdirectories (e.g. ['dogs',
'cats']). Default: None. If not provided, the list of classes will be
automatically inferred from the subdirectory names/structure under
directory, where each subdirectory will be treated as a different
class (and the order of the classes, which will map to the label
indices, will be alphanumeric). The dictionary containing the mapping
from class names to class indices can be obtained via the attribute