I've got a Keras model that looks like this:
# Creating model
model = Sequential()
model.add(LSTM(32, input_dim=5, activation="relu", return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(32, activation="relu"))
model.add(Dropout(0.2))
model.add(Dense(16, activation="relu"))
model.add(Dense(8, activation="relu"))
model.add(Dense(1, activation="sigmoid"))
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
# Training
sequence_generator = cycle(time_sequence_generator(X, Y, 10))
model.fit(sequence_generator, steps_per_epoch=X.shape[0], epochs=10)
No matter how many epochs I use, it seems as though the model only ever predicts 0.
The labels are about 65% 0 and 35% 1. Is this skewed enough to cause a problem like this? What else could I have done to cause this?
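As a quick sanity check on the 65/35 split mentioned above, here is a minimal sketch of what a constant "always predict 0" model would score, plus one common way to derive per-class weights. The helper names (`majority_baseline`, `balanced_class_weights`) and the synthetic label list are assumptions for illustration, not part of the original code.

```python
from collections import Counter

def majority_baseline(labels):
    """Return the most common label and the accuracy of always predicting it."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}

# Synthetic labels with the same 65/35 split as in the question (assumed data).
labels = [0] * 65 + [1] * 35

print(majority_baseline(labels))       # constant predictor: label 0, accuracy 0.65
print(balanced_class_weights(labels))  # rarer class gets the larger weight
```

If the reported accuracy sits right at the majority share, the model is probably behaving like this constant predictor. A dictionary of this form is what the `class_weight` argument of `fit` expects, though whether it fixes the problem here is not something the question confirms.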
I am not sure where I am going wrong in this code. My goal is to train on my dataset for binary classification using LSTM and GRU layers.
[The output shows a module wrapper and the GRU does not seem to execute; please check the image][1]
#BUILD THE MODEL
top_words = 10000
embedding_vecor_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=X.shape[1]))
#model.add(Dropout(0.2))
model.add(GRU(100,dropout=0.2, recurrent_dropout=0.2, return_sequences=True))
model.add(LSTM(100,dropout=0.2, recurrent_dropout=0.2))
#model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='Adam',
metrics=['accuracy'])
print(model.summary())
model.summary()
[1]: https://i.stack.imgur.com/14pyl.jpg
I am training a keras autoencoder model with the following structure:
model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(MAX_CONTEXTS, 3)))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(3, activation='relu'))
model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])
My data has the shape (number_of_samples, 430, 3) and contains values in the range [-1.9236537371711413, 1.9242677998256246]. This data is already normalized. I then train this model:
history = model.fit(X, X, epochs=15, batch_size=2, verbose=1, shuffle=True, validation_split=0.2)
and get an accuracy of 95.03% (suspiciously high, but my problem now is something else). Now when I predict a sample of my data, the positive values are relatively good, close to what they are in the input, but the negative values are all rounded to 0. Is this a fault of the loss function that I chose? And if so which other loss function should I choose? Or do I have to scale my data differently?
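This is because you apply a relu activation at the output layer. ReLU returns max(0, x), so the last layer can never output a negative value. Use a linear output instead (for example Dense(3) with no activation) so that the negative part of your data can be reconstructed.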
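To make the effect concrete, here is a minimal pure-Python sketch (no Keras needed) of the best a relu output can do on targets that include negative values. The target values are made up to resemble the range in the question.

```python
def relu(x):
    """ReLU: negative inputs are clipped to zero."""
    return max(0.0, x)

def mse(predictions, targets):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# Illustrative targets (assumed values, roughly in the question's range).
targets = [-1.92, -0.5, 0.0, 0.7, 1.92]

# Best case for each output activation: the pre-activation equals the target.
relu_out = [relu(t) for t in targets]
linear_out = list(targets)

print(relu_out)                  # every negative target comes out as 0.0
print(mse(relu_out, targets))    # error that relu can never remove
print(mse(linear_out, targets))  # a linear output can reach zero error
```

Even with perfect weights, the relu output leaves a nonzero error on the negative targets, which matches the "negative values are all rounded to 0" symptom.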
I am training a Keras model on my data. I have to split the data into 3 parts, and I call the same Keras model for each split, fitting and predicting on them one after another.
I suspect that every time I call the model, the weights stay where they ended up after the previous training run, so the next fit starts minimising the error from that state. I want each training run to start from a fresh random weight initialisation. All 3 splits are subsets of the same dataset, and I don't want any data leakage from the model having already seen another split during training.
Does Keras reinitialise the weights every time the model is fit? If not, how can I make it do so?
Here is what my code looks like:
model = Sequential()
model.add(Dense(512, input_dim=77, kernel_initializer='RandomNormal', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(256, kernel_initializer='RandomNormal', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(512, kernel_initializer='RandomNormal', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(256, kernel_initializer='RandomNormal', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(512, kernel_initializer='RandomNormal', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(256, kernel_initializer='RandomNormal', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1))
# Compile model
model.compile(loss='mean_absolute_error', optimizer='adam')
# evaluate model
history = model.fit(scaler.transform(X_train_high), y_train_high,
batch_size=128,
epochs=5)
results = model.evaluate(scaler.transform(X_train_high), y_train_high, batch_size=128)
print('High test loss, test acc:', results)
# evaluate model
history = model.fit(scaler.transform(X_train_medium), y_train_medium,
batch_size=128,
epochs=5)
results = model.evaluate(scaler.transform(X_train_medium), y_train_medium, batch_size=128)
print(' Medium test loss, test acc:', results)
# evaluate model
history = model.fit(scaler.transform(X_train_low), y_train_low,
batch_size=128,
epochs=5)
results = model.evaluate(scaler.transform(X_train_low), y_train_low, batch_size=128)
print('Low test loss, test acc:', results)
The model keeps its weights between calls to fit. To start from a fresh initialisation, build a new model before each training run:
def define_model():
    # Build a fresh, untrained network each time this is called.
    model = Sequential()
    model.add(Dense(512, input_dim=77, kernel_initializer='RandomNormal', activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(256, kernel_initializer='RandomNormal', activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(512, kernel_initializer='RandomNormal', activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(256, kernel_initializer='RandomNormal', activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(512, kernel_initializer='RandomNormal', activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(256, kernel_initializer='RandomNormal', activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(1))
    return model
model=define_model()
# Compile model
model.compile(loss='mean_absolute_error', optimizer='adam')
# evaluate model
history = model.fit(scaler.transform(X_train_high), y_train_high,
batch_size=128,
epochs=5)
results = model.evaluate(scaler.transform(X_train_high), y_train_high, batch_size=128)
print('High test loss, test acc:', results)
model=define_model()
model.compile(loss='mean_absolute_error', optimizer='adam')
# evaluate model
history = model.fit(scaler.transform(X_train_medium), y_train_medium,
batch_size=128,
epochs=5)
results = model.evaluate(scaler.transform(X_train_medium), y_train_medium, batch_size=128)
print(' Medium test loss, test acc:', results)
You can check this by calling model.get_weights() before and after, and comparing the results.
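One way to do that comparison is sketched below. `weights_equal` is a hypothetical helper, not a Keras function; it takes two lists of NumPy arrays in the form that `model.get_weights()` returns. Stand-in arrays are used here so the snippet runs without Keras installed.

```python
import numpy as np

def weights_equal(weights_a, weights_b):
    """Return True if two weight lists hold exactly the same arrays."""
    if len(weights_a) != len(weights_b):
        return False
    return all(np.array_equal(a, b) for a, b in zip(weights_a, weights_b))

# Stand-ins for model.get_weights() output (assumed shapes, for illustration).
rng = np.random.default_rng(0)
before = [rng.normal(size=(77, 4)), np.zeros(4)]
unchanged = [w.copy() for w in before]               # same model, no re-creation
fresh = [rng.normal(size=(77, 4)), np.zeros(4)]      # newly built model

print(weights_equal(before, unchanged))  # True
print(weights_equal(before, fresh))      # False
```

With the real model, you would take `model.get_weights()` at the end of one training run and again at the start of the next. If the two lists are equal, the second run is continuing from the trained weights rather than starting fresh.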
I am training a simple neural network in Keras with the Theano backend. It consists of 4 dense layers connected to a Merge layer and then to a softmax classifier layer. Using Adam for training, the first few epochs take about 60s each (on the CPU), but after that the training time per epoch starts increasing, reaching more than 400s by epoch 70, which makes it unusable.
Is there anything wrong with my code, or is this supposed to happen?
This only happens when using Adam, not with sgd, adadelta, rmsprop or adagrad. I'd use any of the other methods but Adam produces far better results.
The code:
modela = Sequential()
modela.add(Dense(700, input_dim=40, init='uniform', activation='relu'))
modelb = Sequential()
modelb.add(Dense(700, input_dim=40, init='uniform', activation='relu'))
modelc = Sequential()
modelc.add(Dense(700, input_dim=40, init='uniform', activation='relu'))
modeld = Sequential()
modeld.add(Dense(700, input_dim=40, init='uniform', activation='relu'))
model = Sequential()
model.add(Merge([modela, modelb, modelc, modeld], mode='concat', concat_axis=1))
model.add(Dense(258, init='uniform', activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
hist = model.fit([Xa, Xb, Xc, Xd], Ycat, validation_split=.25, nb_epoch=80, batch_size=100, verbose=2)
I'm trying to fit a simple neural network to predict a binary target using keras-1.0.6. The output saturates after the very first epoch. I tried playing around with the learning rate (from 0.1 to 1e-6), the decay and momentum of the SGD optimizer, and the network itself (10-512 hidden neurons, 1-2 hidden layers, different activation functions), but nothing worked: the prediction accuracy stayed the same.
My training set has shape (13602, 115) and my validation set has shape (3400,115). The target variable y_train and y_test have values encoded as 1 and 0 (60% are 1's and 40% are 0's). At first, the data was not normalized though when I normalized it I got the same results.
Verifying the output, I see that the model is predicting only 1 class. Sometimes it predicts only 1's and other times only 0's (when I tweak the model).
I also tried to encode the target variable in the shape (n_sample, 2) but the output was the same.
I followed some questions here and some search results that suggest tuning the learning rate and not using the 'softmax' activation, but I couldn't improve the results.
Some of the models I tried are below:
The simplest model:
model.add(Dense(1, input_dim=X_train.shape[1], activation='sigmoid'))
Model 2:
model = Sequential()
model.add(Dense(512, input_dim=X_train.shape[1]))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(1))
model.add(Activation('sigmoid'))
Model 3:
model.add(Dense(64, input_dim=X_train.shape[1], init='uniform', activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
Model 4:
model.add(Dense(64, input_dim=X_train.shape[1], init='uniform', activation='sigmoid'))
model.add(Dense(1, input_dim=X_train.shape[1], activation='sigmoid'))
and to compile and fit the model:
sgd = SGD(lr=0.01, decay=0.1, momentum=0.0, nesterov=True)
model.compile(optimizer=sgd, loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train2, nb_epoch=5, batch_size=50, validation_split=0.2)
model.predict(X_test)
The output gives either [0,0,0,0,0,0,0,...] or [1,1,1,1,1,1,1,1,...]
Does anybody have a clue on what's going on here?
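One thing that may be worth checking in the compile step above is how quickly decay=0.1 shrinks the learning rate. The sketch below assumes the time-based schedule used by the legacy Keras SGD optimizer, lr / (1 + decay * iterations), applied once per batch. The function name `effective_lr` is hypothetical, and the batch count is derived from the shapes and fit arguments given in the question.

```python
def effective_lr(base_lr, decay, iteration):
    """Time-based decay as assumed for the legacy Keras SGD optimizer."""
    return base_lr / (1.0 + decay * iteration)

# 13602 training rows, validation_split=0.2, batch_size=50 (from the question).
train_rows = int(13602 * 0.8)
batches_per_epoch = -(-train_rows // 50)  # ceiling division

for epoch in range(6):
    step = epoch * batches_per_epoch
    print(epoch, effective_lr(0.01, 0.1, step))
```

Under that assumption the learning rate falls by more than an order of magnitude within the first epoch, which could be one reason the predictions stop changing. Setting decay to 0 or a much smaller value would be a cheap experiment; this is a guess at one contributing factor, not a confirmed diagnosis.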