model.predict(x), where x is the same NumPy array I used to train the model (x obviously excludes the validation values).
Running this, I just get the same value for all 1733 lines of the NumPy array. If you need code or an example of the np arrays used, ask me.
The model is:
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split

dataset = pd.read_csv('BNB.csv')
x = dataset.drop(columns=["Valuable"])
x = np.asarray(x).astype('float32')
y = dataset["Valuable"]
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(256, input_shape=x_train.shape, activation='sigmoid'))
model.add(tf.keras.layers.Dense(256, activation='sigmoid'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=1000)
The NumPy array (CSV file) I used to train and test looks like this:
Valuable,Open,High,Low,Close,EMA8,EMA14,EMA50,ht,sar,MorningStar,Engulfing
-1,355.48,355.82,355.21,355.76,355.21,355.51,357.96,356.63,351.08,0,0
0,355.77,356.2,355.52,355.79,355.34,355.54,357.87,356.51,351.08,0,0
0,355.82,356.61,355.5,356.23,355.54,355.63,357.81,356.44,351.08,0,0
0,356.14,356.17,354.63,354.92,355.4,355.54,357.69,356.46,351.08,0,0
0,354.88,355.54,354.81,354.96,355.3,355.46,357.59,356.55,351.08,0,0
0,354.91,354.91,353.71,354.11,355.04,355.28,357.45,356.59,351.08,0,0
0,354.12,354.93,353.89,354.72,354.97,355.21,357.34,356.44,351.08,0,0
0,354.72,355.2,354.01,354.7,354.91,355.14,357.24,356.21,351.08,0,0
0,354.69,355.46,354.43,355.23,354.98,355.15,357.16,355.9,351.08,0,100
0,355.27,355.47,354.54,355.39,355.07,355.18,357.09,355.57,351.08,0,0
0,355.37,356.0,355.22,355.81,355.24,355.27,357.04,355.31,351.08,0,0
0,355.79,356.23,355.11,355.54,355.3,355.3,356.98,355.15,351.08,0,0
0,355.56,355.67,354.78,355.21,355.28,355.29,356.91,355.08,351.08,0,0
0,355.2,355.63,354.88,355.2,355.26,355.28,356.84,355.06,351.08,0,0
0,355.2,355.99,355.2,355.76,355.37,355.34,356.8,355.08,351.08,0,0
0,355.74,355.97,355.17,355.37,355.37,355.35,356.75,355.14,351.08,0,0
0,355.37,355.38,354.51,354.69,355.22,355.26,356.67,355.19,351.08,0,0
0,354.78,355.4,354.64,355.02,355.18,355.23,356.6,355.23,351.08,0,0
I want to predict whether Valuable is -2, -1, 0, 1 or 2 (my CSV file is about 1700 lines long).
There are a few problems with your model.
First:
You should use sparse categorical cross-entropy loss instead of binary cross-entropy if you have more than two output classes.
Second:
Use a softmax activation for the last/output layer.
Third:
Use as many neurons in the last layer as there are classes.
I assume the distinct values in the Valuable column are [-2, -1, 0, 1, 2].
First, encode your target column like this:
y = dataset["Valuable"]  # after this existing line, add:
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
y = le.fit_transform(y)
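If it helps, with the sample data above this should map the sorted distinct labels to the integers 0 through 4, which is exactly what sparse categorical cross-entropy expects (a quick check; the exact mapping depends on your data):
print(le.classes_)  # expected: [-2 -1  0  1  2]; y now holds the integers 0..4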
Then change your model definition like this:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
model = tf.keras.models.Sequential()
# changes
model.add(tf.keras.layers.Dense(256, input_shape=(x_train.shape[1],), activation="relu"))  # input_shape is the per-sample feature shape, not the full array shape
model.add(tf.keras.layers.Dense(256, activation="relu"))
model.add(tf.keras.layers.Dense(5, activation="softmax"))
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=1000)
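With this setup, model.predict returns one probability per class, so to get labels back take the argmax and invert the encoding (a small sketch reusing the le fitted above):
probs = model.predict(x_test)                     # shape (n_samples, 5)
pred_classes = np.argmax(probs, axis=1)           # encoded classes 0..4
pred_labels = le.inverse_transform(pred_classes)  # original labels -2..2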
For a school project, I'm trying to predict data using the Keras framework, but it returns 'nan' loss and 'nan' values when I try to get predictions.
Source code:
from keras.models import Sequential
from keras.layers import Dense
from keras import optimizers
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=5)
# create model
model = Sequential()
model.add(Dense(950, input_shape=(425,), activation='relu'))
model.add(Dense(425, activation='relu'))
model.add(Dense(200, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile model
sgd = optimizers.SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer='sgd')
# Fit the model
model.fit(X_train, y_train, epochs=20, batch_size=1, verbose=1)
#evaluate the model
y_pred = model.predict(X_test)
score = model.evaluate(X_test, y_test,verbose=1)
print(score)
# calculate predictions
predictions = model.predict(X_pred)
Data :
X_train and X_test are pandas DataFrames of 5000 rows (number of samples) × 425 columns (number of features).
y_train and y_test look like:
array([1.17899644, 1.46080518, 0.9662137 , ..., 2.40157461,
       0.53870386, 1.3192718 ])
Can you help me with that?
Thank you for your help!
Usually, this means that something is diverging to infinity. As @desertnaut pointed out in the comments, reducing the learning rate might help.
But the root of the issue is your input data. What do these 425 data points mean? Are they from different sources, different features, different parameters? Finding outliers or normalizing the data could help.
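For example, standardizing the features is a common first step (a minimal sketch with scikit-learn; note the scaler is fitted on the training split only):
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # learn mean/std from the training data
X_test = scaler.transform(X_test)        # apply the same statistics to the test data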
Your code looks fine otherwise.
Make sure your target output is in the range (0, 1), since you have a sigmoid in the last layer.
A sigmoid outputs values between zero and one, so if the target output is not in this range, either (a) change the activation function or (b) normalize the outputs to the required range, as sketched below.
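For option (b), a minimal sketch with scikit-learn's MinMaxScaler (you would invert the scaling on the predictions afterwards with y_scaler.inverse_transform):
from sklearn.preprocessing import MinMaxScaler

y_scaler = MinMaxScaler()  # maps the targets into [0, 1]
y_train = y_scaler.fit_transform(y_train.reshape(-1, 1))
y_test = y_scaler.transform(y_test.reshape(-1, 1))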
Make sure that regression is really what you want from this model.
After considering the above three points, play around with the learning rate (decrease it) and the optimizer (replace it with another one), for example:
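sgd = optimizers.SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)  # lr reduced from 0.1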
Try changing your optimizer to 'Adam' instead of SGD.
You initialized your SGD optimizer in the variable sgd, but you're not using it in compile; passing the string 'sgd' falls back to the default settings.
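That is, pass the optimizer object instead of the string, so your custom settings actually take effect:
model.compile(loss='mean_squared_error', optimizer=sgd)  # uses the configured SGD, not the default one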
I have just started using Keras and was trying to train a model with the Keras deep learning kit. It works until the epochs have run, but crashes just after.
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers.advanced_activations import PReLU
from keras.layers.normalization import BatchNormalization
from keras.optimizers import Adadelta
from sklearn.metrics import roc_auc_score

# dims, nb_classes, X_train, y_train, X_valid, y_valid, X_test, y_test are defined elsewhere
np.random.seed(1778)  # for reproducibility
need_normalise = True
need_validation = True
nb_epoch = 2  # 8

# Creating model
model = Sequential()
model.add(Dense(512, input_shape=(dims,)))
model.add(PReLU())
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
opt = Adadelta(lr=1, decay=0.995, epsilon=1e-5)
model.compile(loss='binary_crossentropy', optimizer=opt)

auc_scores = []
best_score = -1
best_model = None
print('Training model...')
if need_validation:
    for i in range(nb_epoch):
        # early_stopping = EarlyStopping(monitor='val_loss', patience=0, verbose=1)
        # model.fit(X_train, y_train, nb_epoch=nb_epoch, batch_size=256, validation_split=0.01, callbacks=[early_stopping])
        model.fit(X_train, y_train, nb_epoch=2, batch_size=256, validation_split=0.15)
        y_pre = model.predict_proba(X_valid)
        scores = roc_auc_score(y_valid, y_pre)
        auc_scores.append(scores)
        print(i, scores)
        if scores > best_score:
            best_score = scores
            best_model = model
    plt.plot(auc_scores)
    plt.show()
else:
    model.fit(X_train, y_train, nb_epoch=nb_epoch, batch_size=256)
    y_pre = model.predict_proba(X_test)[:, 1]
    print(roc_auc_score(y_test, y_pre))
Error received:
I have pasted it here; please have a look:
http://pastebin.com/dSw9ckkk
It looks like you have two classes, a positive class and a negative class, such that the positive class labels are 1 minus the negative class labels. In that case, you can discard the negative class labels and make it a single-output problem:
model.add(Dense(1, activation='sigmoid'))  # instead of Dense(nb_classes) and Activation('softmax')
Alternatively, you can still train the model on both classes and just use the positive class in the AUC calculation:
roc_auc_score(y_test[:, 1], y_pre[:, 1])
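If you take the single-output route, here is a sketch of the matching changes (assuming your labels are one-hot encoded with the positive class in column 1):
# in place of Dense(nb_classes) and Activation('softmax'):
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy', optimizer=opt)
# train and score on the positive-class column only
model.fit(X_train, y_train[:, 1], nb_epoch=2, batch_size=256)
y_pre = model.predict(X_valid)[:, 0]  # predicted probability of the positive class
scores = roc_auc_score(y_valid[:, 1], y_pre)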