I'm new to Python (although not to programming - I'm usually programming in JavaScript) and I'm very interested in AI development.
Recently I've been trying to develop a deep learning algorithm by following this article.
My goal is to predict a set of 7 numbers, based on a CSV file that contains a large list, with each row having 7 numbers as well. The order of the list matters.
I ended up having the following code:
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split
from numpy import loadtxt, random
random.seed(seed)
dataset = loadtxt("data/Lotto.csv", delimiter=",", skiprows=1)
X = dataset[:, 0:7]
Y = dataset[:, 6]
(X_train, X_test, Y_train, Y_test) = train_test_split(X, Y, test_size=0.33, random_state=4)
model = Sequential()
model.add(Dense(8, input_dim=7, kernel_initializer="uniform", activation="relu"))
model.add(Dense(6, kernel_initializer="uniform", activation="relu"))
model.add(Dense(1, kernel_initializer="uniform", activation="sigmoid"))
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(X_train, Y_train, validation_data=(X_test, Y_test), epochs=100, batch_size=5, shuffle=False)
scores = model.evaluate(X_test, Y_test)
print("Accuracy: %.2f%%" %(scores[1] * 100))
After running it in Google Colaboratory, while I'm not getting any errors - I noticed that for each epoch, the loss result doesn't change, and as a result, I keep getting low accuracy (~6%).
What am I doing wrong?
Try changing optimizer to RMSprop with learning rate of around 0.0001.
RMSprop is usually better than most optimizers and gives better accuracy and lesser loss than others. You could alternatively try SGD, which is also a good optimizer.
Also increase number of parameters, as more trainable parameters leads to the model being trained with more precision and gives a much accurate prediction.
You could update the code to tensorflow 2.x and change the code to :
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from sklearn.model_selection import train_test_split
from numpy import loadtxt, random
#Rest of the code
.......
.......
.......
model = Sequential()
model.add(Dense(64, input_shape=(7,), activation='relu', kernel_initializer='uniform'))
model.add(Dense(64, kernel_initializer="uniform", activation="relu"))
model.add(Dense(1, kernel_initializer="uniform", activation="sigmoid"))
model.compile(loss="binary_crossentropy", optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.0001), metrics=["accuracy"])
model.fit(X_train, Y_train, validation_data=(X_test, Y_test), epochs=100, batch_size=5, shuffle=False)
scores = model.evaluate(X_test, Y_test)
print("Accuracy: %.2f%%" %(scores[1] * 100))
Correct me if I'm wrong, but by the looks of it your input is a list of 7 numbers, and you want to output the 7th number in that list. Now by using a sigmoid activation in your last layer, you're restricting your model output to the interval (0,1). Are you sure that your data is in this interval?
Also, your model is way to complicated for that task. You really need only one dense layer without an activation or bias to be able to do this.
Related
I am trying to define a sequential model for a program that analyses text reviews using sentiment analysis. I am having trouble however when defining one of the models I am trying to use.
Below is the section of my code (with the relevant imports) I am having problems with:
from sklearn.model_selection import train_test_split
from tensorflow.keras import Model
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras import Sequential
X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=42)
n_features = X_train.shape[1]
model = Sequential()
model.add(Dense(20, activation='relu', kernel_initializer='he_normal',
input_shape=(n_features,)))
model.add(Dense(10, activation='tanh', kernel_initializer='he_normal'))
model.add(Dense(8, activation='sigmoid', kernel_initializer='he_normal'))
model.add(Dense(6, activation='softmax'))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
model.fit(X_train, y_train, epochs=150, batch_size=32, verbose=0)
loss, acc = model.evaluate(X_test, y_test, verbose=0)
print('Test Accuracy: %.3f' % (acc * 100))
Below is the warning message that I am receiving:
WARNING:tensorflow:Falling back from v2 loop because of error: Failed to find data adapter that can handle input: <class 'scipy.sparse.csr.csr_matrix'>, <class 'NoneType'>
WARNING:tensorflow:Falling back from v2 loop because of error: Failed to find data adapter that can handle input: <class 'scipy.sparse.csr.csr_matrix'>, <class 'NoneType'>
Test Accuracy: 78.668
As you can see I am still receiving the accuracy for the model but also the warning. This is the first time creating something like this so I am a little confused and any help would be much appreciated. I am programming in Python and I am using a Jupyter Notebook.
I am trying to use neural network for my regression problem in python but the output of the neural network is a straight horizontal line which is zero. I have one input and obviously one output.
Here is my code:
def baseline_model():
# create model
model = Sequential()
model.add(Dense(1, input_dim=1, kernel_initializer='normal', activation='relu'))
model.add(Dense(4, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal'))
# Compile model
model.compile(loss='mean_squared_error',metrics=['mse'], optimizer='adam')
model.summary()
return model
# evaluate model
estimator = KerasRegressor(build_fn=baseline_model, epochs=50, batch_size=64,validation_split = 0.2, verbose=1)
kfold = KFold(n_splits=10)
results = cross_val_score(estimator, X_train, y_train, cv=kfold)
Here are the plots of NN prediction vs. target for both training and test data.
Training Data
Test Data
I have also tried different weight initializers (Xavier and He) with no luck!
I really appreciate your help
First of all correct your syntax while adding dense layers in model remove the double equal == with single equal = with kernal_initilizer like below
model.add(Dense(1, input_dim=1, kernel_initializer ='normal', activation='relu'))
Then to make the performance better do the followong
Increase the number of hidden neurons in the hidden layers
Increase the number of hidden layers.
If still you have same problem then try to change the optimizer and activation function. Tuning the hyperparameters may help you in converging to the solution
EDIT 1
You also have to fit the estimator after cross validation like below
estimator.fit(X_train, y_train)
and then you can test on the test data as follow
prediction = estimator.predict(X_test)
from sklearn.metrics import accuracy_score
accuracy_score(Y_test, prediction)
Today I've ran into some very strange behavior of Keras. When I try to do a classification run on the iris-dataset with a simple model, keras version 1.2.2 gives me +- 95% accuracy, whereas a keras version of 2.0+ predicts the same class for every training example (leading to an accuracy of +- 35%, as there are three types of iris). The only thing that makes my model predict +-95% accuracy is downgrading keras to a version below 2.0:
I think it is a problem with Keras, as I have tried the following things, all do not make a difference;
Switching activation function in the last layer (from Sigmoid to softmax).
Switching backend (Theano and Tensorflow both give roughly same performance).
Using a random seed.
Varying the number of neurons in the hidden layer (I only have 1 hidden layer in this simple model).
Switching loss-functions.
As the model is very simple and it runs on it's own (You just need the easy-to-obtain iris.csv dataset) I decided to include the entire code;
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils import np_utils
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import LabelEncoder
#Load data
data_frame = pd.read_csv("iris.csv", header=None)
data_set = data_frame.values
X = data_set[:, 0:4].astype(float)
Y = data_set[:, 4]
#Encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
# convert integers to dummy variables (i.e. one hot encoded)
dummy_y = np_utils.to_categorical(encoded_Y)
def baseline_model():
#Create & Compile model
model = Sequential()
model.add(Dense(8, input_dim=4, init='normal', activation='relu'))
model.add(Dense(3, init='normal', activation='sigmoid'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
return model
#Create Wrapper For Neural Network Model For Use in scikit-learn
estimator = KerasClassifier(build_fn=baseline_model, nb_epoch=200, batch_size=5, verbose=0)
#Create kfolds-cross validation
kfold = KFold(n_splits=10, shuffle=True)
#Evaluate our model (Estimator) on dataset (X and dummy_y) using a 10-fold cross-validation procedure (kfold).
results = cross_val_score(estimator, X, dummy_y, cv=kfold)
print("Accuracy: {:2f}% ({:2f}%)".format(results.mean()*100, results.std()*100))
if anyone wants to replicate the error here are the dependencies I used to observe the problem:
numpy=1.16.4
pandas=0.25.0
sk-learn=0.21.2
theano=1.0.4
tensorflow=1.14.0
In Keras 2.0, many parameters changed names, there is compatibility layer to keep things working, but somehow it did not apply when using KerasClassifier.
In this part of the code:
estimator = KerasClassifier(build_fn=baseline_model, nb_epoch=200, batch_size=5, verbose=0)
You are using the old name nb_epoch instead of the modern name of epochs. The default value is epochs=1, meaning that your model was only being trained for one epoch, producing very low quality predictions.
Also note that here:
model.add(Dense(3, init='normal', activation='sigmoid'))
You should be using a softmax activation instead of sigmoid, as you are using the categorical cross-entropy loss:
model.add(Dense(3, init='normal', activation='softmax'))
I've managed to isolate the issue, if you change nb_epoch to epochs, (All else being exactly equal) the model predicts very good again, in keras 2 as well. I don't know if this is intended behavior or a bug.
For a school project, I'm trying to predict data using the keras framework, but it's returning 'nan' loss and values when I try to get predicted data.
Source code :
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=5)
# create model
model = Sequential()
model.add(Dense(950, input_shape=(425,), activation='relu'))
model.add(Dense(425, activation='relu'))
model.add(Dense(200, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile model
sgd = optimizers.SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer='sgd')
# Fit the model
model.fit(X_train, y_train, epochs=20, batch_size=1, verbose=1)
#evaluate the model
y_pred = model.predict(X_test)
score = model.evaluate(X_test, y_test,verbose=1)
print(score)
# calculate predictions
predictions = model.predict(X_pred)
Data :
X_train and X_test are (panda)dataframes of 5000 rows(nber of samples) * 425 columns (number of dimensions).
y_train and y_test look like :
array([ 1.17899644, 1.46080518, 0.9662137 , ..., 2.40157461,
0.53870386, 1.3192718 ])
Can you help me with that ?
Thank you for you help!
Usually, this means that something converges to infinity. As #desertnaut pointed out in the comment, reducing the learning rate might help.
But the root of the issue is your input data. What do these 425 data points mean? Are they from different sources, different features, different parameters? Finding outliners or normalizing the data, could help.
Your code looks fine otherwise.
Make sure your target output is in range (0, 1) as you have sigmoid in the last layer.
sigmoid has an output between zero and one so if the target output is not in this range then (a) change the activation function or (b) normalize outputs in the required range.
Make sure the purpose of this model is the regression.
After considering the above three points, play around with learning rate (decrease) and the optimiser (replace with any other).
Try changing your optimizer to 'Adam' instead of SGD
You initialized your SGD optimizer in variable sgd but you're not using it in compile
I have just started using Keras and was trying to train a model using Keras deep learning kit. Works till the epochs are runned but crashes just after it.
np.random.seed(1778) # for reproducibility
need_normalise=True
need_validataion=True
nb_epoch=2#8
#Creating model
model = Sequential()
model.add(Dense(512, input_shape=(dims,)))
model.add(PReLU())
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
opt=Adadelta(lr=1,decay=0.995,epsilon=1e-5)
model.compile(loss='binary_crossentropy', optimizer=opt)
auc_scores=[]
best_score=-1
best_model=None
print('Training model...')
if need_validataion:
for i in range(nb_epoch):
#early_stopping=EarlyStopping(monitor='val_loss', patience=0, verbose=1)
#model.fit(X_train, y_train, nb_epoch=nb_epoch,batch_size=256,validation_split=0.01,callbacks=[early_stopping])
model.fit(X_train, y_train, nb_epoch=2,batch_size=256,validation_split=0.15)
y_pre = model.predict_proba(X_valid)
scores = roc_auc_score(y_valid,y_pre)
auc_scores.append(scores)
print (i,scores)
if scores>best_score:
best_score=scores
best_model=model
plt.plot(auc_scores)
plt.show()
else:
model.fit(X_train, y_train, nb_epoch=nb_epoch, batch_size=256)
y_pre = model.predict_proba(X_test)[:,1]
print roc_auc_score(y_test,y_pre)
Error Recieved:
I have pasted it over here. Please have a look at it.
http://pastebin.com/dSw9ckkk
It looks like you have two classes, a positive class and a negative class, so that the positive class labels are 1 minus the negative class labels. In that case, you can discard the negative class labels and make it a single-class problem:
model.add(Dense(1), activation='sigmoid') # instead of Dense(nb_classes) and Activation('softmax')
Alternatively, you can still train the model on both classes and just use the positive class in the AUC calculation:
roc_auc_score(y_test[:, 1],y_pre[:, 1])