I have just started using Keras and was trying to train a model with it. It works until the epochs have run, but crashes right afterwards.
np.random.seed(1778) # for reproducibility
need_normalise=True
need_validataion=True
nb_epoch=2#8
#Creating model
model = Sequential()
model.add(Dense(512, input_shape=(dims,)))
model.add(PReLU())
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
opt=Adadelta(lr=1,decay=0.995,epsilon=1e-5)
model.compile(loss='binary_crossentropy', optimizer=opt)
auc_scores=[]
best_score=-1
best_model=None
print('Training model...')
if need_validataion:
    for i in range(nb_epoch):
        #early_stopping=EarlyStopping(monitor='val_loss', patience=0, verbose=1)
        #model.fit(X_train, y_train, nb_epoch=nb_epoch,batch_size=256,validation_split=0.01,callbacks=[early_stopping])
        model.fit(X_train, y_train, nb_epoch=2,batch_size=256,validation_split=0.15)
        y_pre = model.predict_proba(X_valid)
        scores = roc_auc_score(y_valid,y_pre)
        auc_scores.append(scores)
        print (i,scores)
        if scores>best_score:
            best_score=scores
            best_model=model
    plt.plot(auc_scores)
    plt.show()
else:
    model.fit(X_train, y_train, nb_epoch=nb_epoch, batch_size=256)
y_pre = model.predict_proba(X_test)[:,1]
print roc_auc_score(y_test,y_pre)
Error received:
I have pasted it here. Please have a look at it.
http://pastebin.com/dSw9ckkk
It looks like you have two classes, a positive class and a negative class, so that the positive class labels are 1 minus the negative class labels. In that case, you can discard the negative class labels and make it a single-class problem:
model.add(Dense(1, activation='sigmoid')) # instead of Dense(nb_classes) and Activation('softmax')
Alternatively, you can still train the model on both classes and just use the positive class in the AUC calculation:
roc_auc_score(y_test[:, 1],y_pre[:, 1])
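As a minimal sketch of this second option, assuming y_test is one-hot encoded with the positive class in column 1 and y_pre holds the softmax outputs (the arrays below are made-up values for illustration, not data from the question):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical one-hot labels: column 0 = negative class, column 1 = positive class.
y_test = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
# Hypothetical softmax outputs; each row sums to 1.
y_pre = np.array([[0.9, 0.1], [0.6, 0.4], [0.65, 0.35], [0.2, 0.8]])

# Use only the positive-class column for both the labels and the scores,
# so roc_auc_score receives two 1-D arrays.
auc = roc_auc_score(y_test[:, 1], y_pre[:, 1])
print(auc)
```

Slicing both arrays the same way keeps the labels and the scores aligned on the same class.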
I am trying to use a neural network for a regression problem in Python, but the output of the network is a flat horizontal line at zero. I have one input and one output.
Here is my code:
def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(1, input_dim=1, kernel_initializer='normal', activation='relu'))
    model.add(Dense(4, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    # Compile model
    model.compile(loss='mean_squared_error',metrics=['mse'], optimizer='adam')
    model.summary()
    return model
# evaluate model
estimator = KerasRegressor(build_fn=baseline_model, epochs=50, batch_size=64,validation_split = 0.2, verbose=1)
kfold = KFold(n_splits=10)
results = cross_val_score(estimator, X_train, y_train, cv=kfold)
Here are the plots of NN prediction vs. target for both training and test data.
Training Data
Test Data
I have also tried different weight initializers (Xavier and He) with no luck!
I really appreciate your help
First of all, correct the syntax where you add the Dense layers to the model: replace the double equals (==) with a single equals (=) for kernel_initializer, as below:
model.add(Dense(1, input_dim=1, kernel_initializer ='normal', activation='relu'))
Then, to improve performance, do the following:
Increase the number of neurons in the hidden layers.
Increase the number of hidden layers.
If you still have the same problem, try changing the optimizer and the activation function. Tuning the hyperparameters may help the model converge to a solution.
EDIT 1
You also have to fit the estimator after cross-validation, as below:
estimator.fit(X_train, y_train)
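and then you can evaluate on the test data as follows (use a regression metric such as mean squared error here; accuracy_score applies only to classification and raises an error on continuous targets):
prediction = estimator.predict(X_test)
from sklearn.metrics import mean_squared_error
mean_squared_error(y_test, prediction)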
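The reason a separate fit is needed is that cross_val_score trains clones of the estimator and leaves the original untouched. Here is a minimal sketch of that behaviour, using scikit-learn's LinearRegression purely as a lightweight stand-in for KerasRegressor and synthetic data that is an assumption of this example:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold, cross_val_score

# Synthetic 1-D regression data (illustrative only): y = 2x + 1.
X = np.linspace(0.0, 1.0, 50).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0
X_train, y_train = X[:40], y[:40]
X_test, y_test = X[40:], y[40:]

estimator = LinearRegression()  # stand-in for KerasRegressor
results = cross_val_score(estimator, X_train, y_train, cv=KFold(n_splits=10))

# cross_val_score fits clones, so the original estimator is still unfitted here.
fitted_before = hasattr(estimator, "coef_")

# Fit on the full training set, then score with a regression metric.
estimator.fit(X_train, y_train)
prediction = estimator.predict(X_test)
test_mse = mean_squared_error(y_test, prediction)
print(fitted_before, test_mse)
```

The cross-validation scores estimate how well the model generalises; the final fit produces the model you actually predict with.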
I am trying to use GloVe embeddings to train an RNN model based on this article.
I have labeled data: text (tweets) in one column and labels in another (hate, offensive, or neither).
However, the model seems to predict only one class.
This is the model (it uses GRU layers):
model = Sequential()
hidden_layer = 3
gru_node = 32
# model embedding matrix here....
for i in range(0,hidden_layer):
    model.add(GRU(gru_node,return_sequences=True, recurrent_dropout=0.2))
    model.add(Dropout(dropout))
model.add(GRU(gru_node, recurrent_dropout=0.2))
model.add(Dropout(dropout))
model.add(Dense(64, activation='softmax'))
model.add(Dense(nclasses, activation='softmax'))
start=time.time()
model.compile(loss='sparse_categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
Fitting the model:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 1)
X_train_Glove,X_test_Glove, word_index, embeddings_index = loadData_Tokenizer(X_train, X_test)
model_RNN = Build_Model_RNN_Text(word_index,embeddings_index, 20)
model_RNN.fit(X_train_Glove,y_train,
              validation_data=(X_test_Glove, y_test),
              epochs=4,
              batch_size=128,
              verbose=2)
y_preds = model_RNN.predict_classes(X_test_Glove)
print(metrics.classification_report(y_test, y_preds))
Results:
classification report
Confusion matrix
Am I missing something here?
Update:
this is what the distribution looks like
and the model summary, more or less
What does the distribution of your data look like? My first suggestion is to stratify the train/test split (see the scikit-learn documentation for train_test_split).
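A minimal sketch of a stratified split, assuming scikit-learn's train_test_split as used in the question; the imbalanced labels and placeholder features below are made up for illustration:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 80 / 15 / 5 samples in classes 0 / 1 / 2.
y = np.array([0] * 80 + [1] * 15 + [2] * 5)
X = np.arange(len(y)).reshape(-1, 1)  # placeholder features

# stratify=y makes both splits keep the class proportions of y.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1, stratify=y
)

# Per-class counts in each split.
print(np.bincount(y_train), np.bincount(y_test))
```

Without stratify, a small class can end up under-represented (or absent) in one of the splits, which distorts both training and the classification report.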
The second question is how much data you have relative to the complexity of the model. Your model may be so complex that it simply overfits. You can call model.summary() to see the number of trainable parameters.
I am learning how to use Keras and I keep running into problems. I will try to be as specific as possible.
My task: I am trying to create a neural network to predict opening status for a domestic residence.
I have a dataset with 524729 examples. I use 70% as training set and 30% as test set. I am reaching 70+% of acc in my tests but for some reason every time that I try to predict an output I get the same values.
Right now, I have the following topology:
model = Sequential()
model.add(Dense(15, input_shape=(13, ), kernel_initializer='random_normal'))
model.add(Dense(15, activation='softplus'))
model.add(Dense(15, activation='softplus'))
model.add(Dense(10, activation='sigmoid'))
model.summary()
sgd = optimizers.SGD(lr=0.1, decay=1e-6, momentum=0.3, nesterov=True)
model.compile(optimizer=sgd, loss='mean_squared_error', metrics=['mae', 'acc'])
model.fit(X_training, Y_training, validation_data=(X_test, Y_test), epochs=1, batch_size=32)
and I use:
model.predict(np_inputRN, verbose=0)
to predict the output but for some reason I keep getting the same values.
0.0172018650919,0.498908281326,0.984391093254,0.485811322927,0.480756670237,0.984736263752,0.536143004894,0.475958675146,0.494080305099,0.488458126783
Can someone help me?
==========================================================================
@Aiven:
Data Set: 524729
Test Set[30%]: 157418
Training Set [70%]: 367310
X_training.shape: (367311, 13)
Y_training.shape: (367311, 10)
X_test.shape: (157419, 13)
Y_test.shape: (157419, 10)
np_inputRN.shape: (1, 13)
For a school project, I'm trying to predict data using the keras framework, but it's returning 'nan' loss and values when I try to get predicted data.
Source code :
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=5)
# create model
model = Sequential()
model.add(Dense(950, input_shape=(425,), activation='relu'))
model.add(Dense(425, activation='relu'))
model.add(Dense(200, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile model
sgd = optimizers.SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer='sgd')
# Fit the model
model.fit(X_train, y_train, epochs=20, batch_size=1, verbose=1)
#evaluate the model
y_pred = model.predict(X_test)
score = model.evaluate(X_test, y_test,verbose=1)
print(score)
# calculate predictions
predictions = model.predict(X_pred)
Data :
X_train and X_test are pandas DataFrames of 5000 rows (number of samples) × 425 columns (number of dimensions).
y_train and y_test look like :
array([ 1.17899644, 1.46080518, 0.9662137 , ..., 2.40157461,
0.53870386, 1.3192718 ])
Can you help me with that?
Thank you for your help!
Usually, this means that something diverges to infinity. As @desertnaut pointed out in the comments, reducing the learning rate might help.
But the root of the issue is your input data. What do these 425 data points mean? Are they from different sources, different features, different parameters? Finding outliers or normalizing the data could help.
Your code looks fine otherwise.
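One common way to normalize the inputs is to standardize each column with statistics computed on the training set. The sketch below assumes plain NumPy arrays and uses randomly generated features on deliberately different scales; none of these values come from the question:

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical features on very different scales (illustrative only).
loc = [0.0, 1000.0, -5.0]
scale = [1.0, 250.0, 0.01]
X_train = rng.normal(loc=loc, scale=scale, size=(500, 3))
X_test = rng.normal(loc=loc, scale=scale, size=(100, 3))

# Compute statistics on the training set only, then reuse them for the test set.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)
std[std == 0] = 1.0  # guard against constant columns

X_train_scaled = (X_train - mean) / std
X_test_scaled = (X_test - mean) / std

print(X_train_scaled.mean(axis=0).round(3), X_train_scaled.std(axis=0).round(3))
```

With a pandas DataFrame the same arithmetic works column-wise; the important part is that the test data is transformed with the training mean and standard deviation.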
Make sure your target output is in the range (0, 1), since the last layer uses a sigmoid.
A sigmoid outputs values between zero and one, so if the targets fall outside this range, either (a) change the output activation function or (b) normalize the targets into the required range.
Make sure the purpose of this model really is regression.
After considering the points above, experiment with the learning rate (decrease it) and the optimizer (try a different one).
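For option (b), a minimal sketch of min-max scaling the targets and mapping predictions back to the original units. The six values are the ones visible in the question's y_train printout; the helper names scale_targets and unscale_predictions are hypothetical:

```python
import numpy as np

# Target values shown in the question; several exceed 1, which a sigmoid cannot output.
y_train = np.array([1.17899644, 1.46080518, 0.9662137,
                    2.40157461, 0.53870386, 1.3192718])

y_min, y_max = y_train.min(), y_train.max()

def scale_targets(y):
    """Map targets linearly so the training range becomes [0, 1]."""
    return (y - y_min) / (y_max - y_min)

def unscale_predictions(p):
    """Map sigmoid outputs back to the original target units."""
    return p * (y_max - y_min) + y_min

y_scaled = scale_targets(y_train)
y_restored = unscale_predictions(y_scaled)
print(y_scaled.min(), y_scaled.max())
```

You would train on y_scaled and apply unscale_predictions to the output of model.predict. Option (a) avoids this step altogether: a final Dense(1) layer without an activation can produce any real value.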
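Try changing your optimizer to 'Adam' instead of SGD.
You initialized your SGD optimizer in the variable sgd, but you are not using it in compile: optimizer='sgd' is a string, so Keras creates a default SGD optimizer and your lr, momentum, and nesterov settings are ignored. Pass optimizer=sgd instead.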
I'm working on a regression problem using Keras+Tensorflow. And I've found something interesting.
1) here are the two models, which are actually the same except that the first model is using a globally defined 'optimizer'.
optimizer = Adam() #as a global variable
def OneHiddenLayer_Model():
    model = Sequential()
    model.add(Dense(300 * inputDim, input_dim=inputDim, kernel_initializer='normal', activation=activationFunc))
    model.add(Dense(1, kernel_initializer='normal'))
    model.compile(loss='mean_squared_error', optimizer=optimizer)
    return model
def OneHiddenLayer_Model2():
    model = Sequential()
    model.add(Dense(300 * inputDim, input_dim=inputDim, kernel_initializer='normal', activation=activationFunc))
    model.add(Dense(1, kernel_initializer='normal'))
    model.compile(loss='mean_squared_error', optimizer=Adam())
    return model
2) Then, I use two schemes to train the datasets (training set(scaleX, Y); testing set(scaleTestX, testY)).
2.1) Scheme1. Two successive fits with the first model
numpy.random.seed(seed)
model = OneHiddenLayer_Model()
model.fit(scaleX, Y, validation_data=(scaleTestX, testY), epochs=250, batch_size=numBatch, verbose=0)
numpy.random.seed(seed)
model = OneHiddenLayer_Model()
history = model.fit(scaleX, Y, validation_data=(scaleTestX, testY), epochs=500, batch_size=numBatch, verbose=0)
predictY = model.predict(scaleX)
predictTestY = model.predict(scaleTestX)
2.2) Scheme2. one fitting with the second model
numpy.random.seed(seed)
model = OneHiddenLayer_Model2()
history = model.fit(scaleX, Y, validation_data=(scaleTestX, testY), epochs=500, batch_size=numBatch, verbose=0)
predictY = model.predict(scaleX)
predictTestY = model.predict(scaleTestX)
3). Finally, the results are plotted for each scheme, as shown below, (model loss history --> predict on scaleX --> predict on scaleTestX),
3.1) Scheme1
3.2) Scheme2 (with 500 epochs)
3.3) add one more test with Scheme2 and set epochs = 1000
From the images above, I've found that Scheme1 is better than Scheme2, even if Scheme2 is set with more epochs.
Can anyone help to explain why Scheme1 is better? Thanks a lot!!!