I'm experimenting with Keras to develop a multi-layer perceptron for binary classification, and I'm surprised at the poor performance I obtain (57% accuracy on the training set). A logistic regression correctly classifies 100% of the samples.
I created a data set with 2 inputs: input A is a sine function, and input B is input A shifted by a lag. When input A >= input B, the output is 1; otherwise, the output is 0.
SIN SIN-1 Direction
0 0 1
0.06279052 0 1
0.125333234 0.06279052 1
0.187381315 0.125333234 1
0.248689887 0.187381315 1
0.309016994 0.248689887 1
0.368124553 0.309016994 1
0.425779292 0.368124553 1
0.481753674 0.425779292 1
0.535826795 0.481753674 1
0.587785252 0.535826795 1
0.63742399 0.587785252 1
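For reference, the sample above is consistent with a unit sine sampled in steps of 2*pi/100; a minimal sketch of generating such a file (the period, step size, and column names are inferred from the sample rather than stated in the question):
import numpy as np
import pandas as pd
t = np.arange(100)                        # one full period in 100 steps
sin_a = np.sin(2 * np.pi * t / 100)       # input A: sine wave
sin_b = np.roll(sin_a, 1)                 # input B: A lagged by one step
sin_b[0] = 0.0                            # no earlier value for the first row
direction = (sin_a >= sin_b).astype(int)  # 1 when A >= B, else 0
pd.DataFrame({'SIN': sin_a, 'SIN-1': sin_b, 'Direction': direction}).to_csv('SIN.csv', index=False)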
The issue I have is similar to what is described in "Keras low accuracy classification task". The answer there points at the data set, which I do not believe is the problem here.
See the code below. What am I missing to improve the accuracy of the model? Adding neurons to the layers or changing the activation function of the output layer to softmax hasn't yielded any better results.
import numpy
import pandas
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load the dataset
dataframe = pandas.read_csv('SIN.csv', usecols=[0,1,2], engine='python')
dataset = dataframe.values
X = dataset[:,0:2].astype(float)
Y = dataset[:,2].astype(int)
# split into train and test sets
train_size = int(len(X) * 0.80)
test_size = len(X) - train_size
Xtrain, Xtest = X[0:train_size,:], X[train_size:len(X),:]
Ytrain, Ytest = Y[0:train_size], Y[train_size:len(Y)]
print(len(Xtrain), len(Xtest))
# create and fit Multilayer Perceptron model
model = Sequential()
model.add(Dense(2, input_dim=2, kernel_initializer='uniform', activation='relu'))
model.add(Dense(2, kernel_initializer='uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(Xtrain, Ytrain, epochs=20, batch_size=2, verbose=2)
# Estimate model performance
trainScore = model.evaluate(Xtrain, Ytrain, verbose=2)
print('Train Score: %.2f' % trainScore[1])
testScore = model.evaluate(Xtest, Ytest, verbose=2)
print('Test Score: %.2f' % testScore[1])
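For what it's worth, one variation to try (an assumption on my part, not a verified fix): with only two ReLU units initialised near zero, the hidden layers can easily get stuck, so a wider network trained for more epochs may behave quite differently:
model = Sequential()
model.add(Dense(8, input_dim=2, kernel_initializer='glorot_uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='glorot_uniform', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(Xtrain, Ytrain, epochs=200, batch_size=2, verbose=0)  # 10x the original 20 epochs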
Here's the problem: I have a 2200x39 dataset (I know... very small), where 38 columns are the features (texture and statistics) and the last column is the output class, which can be 0 or 1. My dataset is balanced (1100 "1" and 1100 "0").
I'm trying to improve my performance, which is stuck at 0.69 for loss and 0.49 for accuracy. I tried adding a layer, adding neurons, and different parameters. Nothing; the accuracy and loss values change only slightly.
So, first of all, I import all the stuff I need
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, BatchNormalization, Conv1D
from tensorflow.keras.optimizers import SGD
import matplotlib.pyplot as plt
Then I prepare my data and split it into an 80% training set and a 20% validation set:
# fix a seed for reproducing same results if we wish to train and evaluate our network more than once
seed = 9
np.random.seed(seed)
# load dataset
dataset = np.loadtxt('tr_set.csv', delimiter=',', skiprows=1)
# Show the first 10 rows
print(dataset[:10])
# Delete the first column with the patient index
dataset = dataset[:,1:42]
# Split into input (features) and output variables
X = dataset[:,2:40]
Y = dataset[:,40]
# Counting elements in class 0 and in class 1
count_0 = 0
count_1 = 0
for i in Y:
    if i == 0:
        count_0 = count_0 + 1
    if i == 1:
        count_1 = count_1 + 1
print("Number of elements in 0 class:", count_0)
print("Number of elements in 1 class:", count_1)
# The dataset is balanced
# Split into training set(80%) and validation set (20%)
(X_train, X_val, Y_train, Y_val) = train_test_split(X, Y, test_size=0.2, random_state=seed)
And here is my model, after reshaping X_train and X_val because I am using Conv1D:
# Create the model
opt = SGD(learning_rate=0.00001)
model = Sequential()
model.add(Dense(1024, activation='relu', kernel_initializer='random_uniform', input_shape=(1,38)))
model.add(BatchNormalization()) # It is used to normalize the input layer by adjusting and scaling the activations.
model.add(Dense(512, activation='relu'))
model.add(Dense(256, activation='relu'))
model.add(Dense(128, activation='relu'))
model.summary()
model.add(Conv1D(64, 3, padding="same", activation="relu"))
# model.add(MaxPooling1D(2))
model.summary()
model.add(Dense(1, activation='sigmoid'))
model.summary()
# compile the model
model.compile(loss='binary_crossentropy', optimizer= opt, metrics=['accuracy'])
# fit the model
history = model.fit(X_train, Y_train, validation_data=(X_val, Y_val), epochs=15, batch_size=10)
# w_data = model.get_weights()
What is wrong here? I deleted the max-pooling because I had problems with the dimensions (an error along the lines of subtracting 2 from 1).
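For reference, that pooling error comes from the temporal axis having length 1: with input_shape=(1, 38), MaxPooling1D(2) has nothing left to pool. A minimal sketch of one way around it, assuming the intent is to treat the 38 features as a length-38 sequence (an assumption; for purely tabular data a plain Dense network may be the better fit):
from tensorflow.keras.layers import Flatten, MaxPooling1D
# Sketch: X_train/X_val reshaped to (samples, 38, 1) rather than (samples, 1, 38)
model = Sequential()
model.add(Conv1D(64, 3, padding="same", activation="relu", input_shape=(38, 1)))
model.add(MaxPooling1D(2))  # now pools a length-38 axis, so no negative dimension
model.add(Flatten())
model.add(Dense(128, activation="relu"))
model.add(Dense(1, activation="sigmoid"))
model.compile(loss="binary_crossentropy", optimizer=opt, metrics=["accuracy"])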
I have the code below, which works perfectly for a neural network. I know I need the confusion matrix library to find the false positive and false negative rates, but I'm not sure how to do it, as I'm no expert in programming. Can someone help please?
import pandas as pd
from sklearn import preprocessing
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense
# read the csv file and convert into arrays for the machine to process
df = pd.read_csv('dataset_ori.csv')
dataset = df.values
# split the dataset into input features and the feature to predict
X = dataset[:,0:7]
Y = dataset[:,7]
# scale the input features with MinMaxScaler so that they all lie between 0 and 1
min_max_scaler = preprocessing.MinMaxScaler()
# store the dataset into an array
X_scale = min_max_scaler.fit_transform(X)
# split the dataset into 30% testing and the rest to train
X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(X_scale, Y, test_size=0.3)
# split the val_and_test size equally to the validation set and the test set.
X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.5)
# specify the sequential model and describe the layers that will form architecture of the neural network
model = Sequential([
    Dense(7, activation='relu', input_shape=(7,)),
    Dense(32, activation='relu'),
    Dense(5, activation='relu'),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# training the data
hist = model.fit(X_train, Y_train, batch_size=32, epochs=100, validation_data=(X_val, Y_val))
# evaluate the accuracy of the classifier on the test set
scores = model.evaluate(X_test, Y_test)
print("Accuracy: %.2f%%" % (scores[1]*100))
This is the code provided in the answer below; response and model are both highlighted in red as unresolved references.
from keras import models
from keras.layers import Dense, Dropout
from keras.utils import to_categorical
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
from keras.models import Sequential
from keras.layers import Dense, Activation
from sklearn import metrics
from sklearn.preprocessing import StandardScaler
# read the csv file and convert into arrays for the machine to process
df = pd.read_csv('dataset_ori.csv')
dataset = df.values
# split the dataset into input features and the feature to predict
X = dataset[:,0:7]
Y = dataset[:,7]
# Splitting into Train and Test Set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(dataset, response,
                                                    test_size=0.2, random_state=0)
# Initialising the ANN
classifier = Sequential()
# Adding the input layer and the first hidden layer
classifier.add(Dense(units = 10, kernel_initializer = 'uniform', activation = 'relu', input_dim =7 ))
model.add(Dropout(0.5))
# Adding the second hidden layer
classifier.add(Dense(units = 10, kernel_initializer = 'uniform', activation = 'relu'))
model.add(Dropout(0.5))
# Adding the output layer
classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
# Compiling the ANN
classifier.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
# Fitting the ANN to the Training set
classifier.fit(X_train, y_train, batch_size = 10, epochs = 20)
# Train model
scaler = StandardScaler()
classifier.fit(scaler.fit_transform(X_train.values), y_train)
# Summary of neural network
classifier.summary()
# Predicting the Test set results & Giving a threshold probability
y_prediction = classifier.predict_classes(scaler.transform(X_test.values))
print ("\n\naccuracy" , np.sum(y_prediction == y_test) / float(len(y_test)))
y_prediction = (y_prediction > 0.5)
#Let's see how our model performed
from sklearn.metrics import classification_report
print(classification_report(y_test, y_prediction))
Your input to confusion_matrix must be an array of ints, not one-hot encodings.
# Predicting the Test set results
y_pred = model.predict(X_test)
y_pred = (y_pred > 0.5)
matrix = metrics.confusion_matrix(y_test.argmax(axis=1), y_pred.argmax(axis=1))
The output would have come out as probabilities, as shown below, so applying a probability threshold of 0.5 transforms them into binary labels.
output(y_pred):
[0.87812372 0.77490434 0.30319547 0.84999743]
The sklearn.metrics.accuracy_score(y_true, y_pred) method defines y_pred as:
y_pred : 1d array-like, or label indicator array / sparse matrix. Predicted labels, as returned by a classifier.
Which means y_pred has to be an array of 1s and 0s (predicted labels). They should not be probabilities.
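A small illustration, reusing the example probabilities quoted above (the true labels here are hypothetical):
import numpy as np
from sklearn.metrics import accuracy_score
y_prob = np.array([0.87812372, 0.77490434, 0.30319547, 0.84999743])
y_pred = (y_prob > 0.5).astype(int)     # -> [1, 1, 0, 1]
accuracy_score([1, 1, 0, 0], y_pred)    # works: both arguments are hard labels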
The root cause of your error is a theoretical and not a computational issue: you are trying to use a classification metric (accuracy) in a regression (i.e. numeric prediction) model (a neural logistic model), which is meaningless.
Just like the majority of performance metrics, accuracy compares apples to apples (i.e. true labels of 0/1 with predictions again of 0/1); so, when you ask the function to compare binary true labels (apples) with continuous predictions (oranges), you get an expected error, whose message tells you exactly what the problem is from a computational point of view:
Classification metrics can't handle a mix of binary and continuous target
Although the message doesn't tell you directly that you are trying to compute a metric that is invalid for your problem (and we shouldn't actually expect it to go that far), it is certainly a good thing that scikit-learn at least gives you a direct and explicit warning that you are attempting something wrong; this is not necessarily the case with other frameworks - see for example the behavior of Keras in a very similar situation, where you get no warning at all, and one just ends up complaining about low "accuracy" in a regression setting...
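To see the error in isolation (a minimal reproduction, not code from the question):
from sklearn.metrics import accuracy_score
accuracy_score([0, 1, 1], [0.2, 0.9, 0.4])
# ValueError: Classification metrics can't handle a mix of binary and continuous targets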
## EXTRA: Confusion Matrix Visualize
from sklearn.metrics import confusion_matrix, accuracy_score
import matplotlib.pyplot as plt
import seaborn as sn
cm = confusion_matrix(y_test, y_pred)  # rows = truth, cols = prediction
df_cm = pd.DataFrame(cm, index=(0, 1), columns=(0, 1))
plt.figure(figsize=(10, 7))
sn.set(font_scale=1.4)
sn.heatmap(df_cm, annot=True, fmt='g')
print("Test Data Accuracy: %0.4f" % accuracy_score(y_test, y_pred))
As you have already imported confusion_matrix from scikit-learn, you can use it like this:
import numpy as np
cutoff = 0.5
y_pred = model.predict(X_test)          # predicted probabilities
y_pred_classes = np.zeros_like(y_pred)  # initialise a matrix full of zeros
y_pred_classes[y_pred > cutoff] = 1
y_test_classes = np.zeros_like(y_pred)
y_test_classes[y_test > cutoff] = 1
print(confusion_matrix(y_test_classes, y_pred_classes))
In scikit-learn the confusion matrix is always ordered like this:
True negatives    False positives
False negatives   True positives
To get tn and so on directly, you can run this:
tn, fp, fn, tp = confusion_matrix(y_test_classes, y_pred_classes).ravel()
(tn, fp, fn, tp)
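From those counts, the false positive and false negative rates the question asked for follow directly (assuming 1 is the positive class):
fpr = fp / (fp + tn)  # false positive rate
fnr = fn / (fn + tp)  # false negative rate
print("FPR: %.4f, FNR: %.4f" % (fpr, fnr))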
I have data that looks like this:
There are 29 columns, out of which I have to predict winPlacePerc (at the extreme end of the dataframe), which is between 1 (high perc) and 0 (low perc).
Out of the 29 columns, 25 hold numerical data, 3 are IDs (object), and 1 is categorical.
I dropped all the ID columns (since they're all unique) and also one-hot encoded the categorical (matchType) data.
After doing all this I am left with 41 columns (after one-hot).
This is how I am creating the data:
X = df.drop(columns=['winPlacePerc'])
#creating a dataframe with only the target column
y = df[['winPlacePerc']]
Now my X has 40 columns, and this is what my label data looks like:
> y.head()
winPlacePerc
0 0.4444
1 0.6400
2 0.7755
3 0.1667
4 0.1875
I also happen to have a very large amount of data, around 400k rows, so for testing purposes I am training on a fraction of it, using scikit-learn:
X_train, X_test, y_train, y_test = train_test_split( X, y, test_size=0.997, random_state=32)
which gives almost 13k rows for training.
For model I'm using Keras sequential model
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers.normalization import BatchNormalization
from keras import optimizers
n_cols = X_train.shape[1]
model = Sequential()
model.add(Dense(40, activation='relu', input_shape=(n_cols,)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_error',
              optimizer='Adam',
              metrics=['accuracy'])
model.fit(X_train, y_train,
          epochs=50,
          validation_split=0.2,
          batch_size=20)
Since my y-label data is between 0 and 1, I'm using a sigmoid layer as my output layer.
This is the training & validation loss & accuracy plot.
I also tried converting the label into binary using a step function, together with the binary cross-entropy loss function;
after that the y-label data looks like:
> y.head()
winPlacePerc
0 0
1 1
2 1
3 0
4 0
and changing the loss function accordingly:
model.compile(loss='binary_crossentropy',
              optimizer='Adam',
              metrics=['accuracy'])
This method was even worse than the previous one.
As you can see, it stops learning after a certain epoch, and this also happens when I take all the data rather than a fraction of it.
After this did not work, I also used dropout and tried adding more layers, but nothing works here.
Now my question: what am I doing wrong? Is it the wrong layers or the data, and how can I improve upon this?
To clear things up: this is a regression problem, so using accuracy doesn't really make sense, because you will never be able to predict the exact value of, say, 0.23124.
First of all, you certainly want to normalise your values (not the one-hot encoded ones) before passing them to the network. Try using a StandardScaler as a start.
Second, I would recommend changing the activation function in the output layer: try linear, and mean_squared_error should be fine as a loss.
In order to validate your model's "accuracy", plot the predicted values together with the actual ones; this should give you a chance to validate the results visually. That being said, your loss already looks quite decent.
Check this post; it should give you a good grasp of which activation and loss functions to use, and when.
from sklearn.preprocessing import StandardScaler
n_cols = X_train.shape[1]
ss = StandardScaler()
X_train = ss.fit_transform(X_train)
model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(n_cols,)))
model.add(Dense(64, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(1))
model.compile(loss='mean_squared_error',
              optimizer='Adam',
              metrics=['mean_squared_error'])
model.fit(X_train, y_train,
          epochs=50,
          validation_split=0.2,
          batch_size=20)
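As suggested above, plotting predictions against the actual values makes the fit easy to eyeball. A sketch, assuming a held-out X_val/y_val pair transformed with the same ss scaler (neither variable exists in the snippet above, which uses validation_split instead):
import matplotlib.pyplot as plt
preds = model.predict(ss.transform(X_val)).ravel()
plt.scatter(y_val, preds, s=4, alpha=0.3)
plt.plot([0, 1], [0, 1], 'r--')  # perfect-prediction reference line
plt.xlabel('actual winPlacePerc')
plt.ylabel('predicted winPlacePerc')
plt.show()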
Normalize data
Add more depth to your network
Make the last layer linear
Accuracy is not a good metric for regression. Let's see an example:
predictions: [0.9999999, 2.0000001, 3.000001]
ground truth: [1, 2, 3]
Accuracy = number correct / total => 0/3 = 0
Accuracy is 0, but the predictions are pretty close to the ground truth. On the other hand, MSE will be very low, indicating that the deviation of the predictions from the ground truth is very small.
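The same comparison in code, using the numbers above:
import numpy as np
preds = np.array([0.9999999, 2.0000001, 3.000001])
truth = np.array([1, 2, 3])
print(np.mean(preds == truth))        # exact-match "accuracy": 0.0
print(np.mean((preds - truth) ** 2))  # MSE: about 3e-13, essentially zero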
I'm doing a Neural Network for the "Default of credit card clients" from http://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients.
But the accuracy of my model is pretty bad, worse than if I predicted all zeros. I have already done some research on it: I oversampled to correct the class imbalance, and I changed the optimizer, because adam was not increasing the accuracy.
What else could I do?
import pandas
import numpy
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense
import keras
from imblearn.over_sampling import SMOTE
seed = 8
numpy.random.seed(seed)
base = pandas.read_csv('base_nao_trabalhada.csv')
train, test = train_test_split(base, test_size = 0.2)
train=train.values
test=test.values
X_train = train[:,1:23]
Y_train = train[:,24]
X_test = test[:,1:23]
Y_test = test[:,24]
sm = SMOTE()
X_resampled, Y_resampled = sm.fit_resample(X_train, Y_train)
# Model Creation
model = Sequential()
model.add(Dense(40, input_dim=22, kernel_initializer='uniform', activation='relu'))
model.add(Dense(4, kernel_initializer='uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
#activation='relu'
opt = keras.optimizers.SGD(learning_rate=0.000001)
# Compile model
model.compile(loss='binary_crossentropy', optimizer=opt , metrics=['accuracy'])
#loss=binary_crossentropy
#optimizer='adam'
# creating .fit
model.fit(X_resampled, Y_resampled, epochs=10000, batch_size=30)
# evaluate the model
scores = model.evaluate(X_test, Y_test)
print()
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
I am comparing Keras Neural-Net with simple Logistic Regression from Scikit-learn on IRIS data. I expect that Keras-NN will perform better, as suggested by this post.
But why, when mimicking the code there, is the result of the Keras NN lower than logistic regression?
import seaborn as sns
import numpy as np
from sklearn.cross_validation import train_test_split
from sklearn.linear_model import LogisticRegressionCV
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.utils import np_utils
# Prepare data
iris = sns.load_dataset("iris")
X = iris.values[:, 0:4]
y = iris.values[:, 4]
# Make test and train set
train_X, test_X, train_y, test_y = train_test_split(X, y, train_size=0.5, random_state=0)
################################
# Evaluate Logistic Regression
################################
lr = LogisticRegressionCV()
lr.fit(train_X, train_y)
pred_y = lr.predict(test_X)
print("Test fraction correct (LR-Accuracy) = {:.2f}".format(lr.score(test_X, test_y)))
################################
# Evaluate Keras Neural Network
################################
# Make ONE-HOT
def one_hot_encode_object_array(arr):
'''One hot encode a numpy array of objects (e.g. strings)'''
uniques, ids = np.unique(arr, return_inverse=True)
return np_utils.to_categorical(ids, len(uniques))
train_y_ohe = one_hot_encode_object_array(train_y)
test_y_ohe = one_hot_encode_object_array(test_y)
model = Sequential()
model.add(Dense(16, input_shape=(4,)))
model.add(Activation('sigmoid'))
model.add(Dense(3))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
# Actual modelling
model.fit(train_X, train_y_ohe, verbose=0, batch_size=1)
score, accuracy = model.evaluate(test_X, test_y_ohe, batch_size=16, verbose=0)
print("Test fraction correct (NN-Score) = {:.2f}".format(score))
print("Test fraction correct (NN-Accuracy) = {:.2f}".format(accuracy))
I'm using this version of Keras
In [2]: keras.__version__
Out[2]: '1.0.1'
The result shows:
Test fraction correct (LR-Accuracy) = 0.83
Test fraction correct (NN-Score) = 0.75
Test fraction correct (NN-Accuracy) = 0.60
According to that post, the accuracy of Keras should be 0.99. What went wrong?
The default number of epochs was reduced from 100 in Keras version 0 to 10 in Keras version 1, just released this month (April 2016). Try:
model.fit(train_X, train_y_ohe, verbose=0, batch_size=1, nb_epoch=100)
Your neural network is quite simple. Try creating a deeper neural network by adding more neurons and layers. Also, it's important to scale your features. Try the glorot_uniform initializer. Last but not least, increase the number of epochs and check whether the loss decreases with each epoch.
So here you go:
from keras.layers.advanced_activations import PReLU
from keras.layers.core import Dropout
from keras.layers.normalization import BatchNormalization

model = Sequential()
model.add(Dense(input_dim=4, output_dim=512, init='glorot_uniform'))
model.add(PReLU())
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(input_dim=512, output_dim=512, init='glorot_uniform'))
model.add(PReLU())
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(input_dim=512, output_dim=512, init='glorot_uniform'))
model.add(PReLU())
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(input_dim=512, output_dim=512, init='glorot_uniform'))
model.add(PReLU())
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(input_dim=512, output_dim=512, init='glorot_uniform'))
model.add(PReLU())
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(input_dim=512, output_dim=3, init='glorot_uniform'))
model.add(Activation('softmax'))
This reaches around 0.97 accuracy by the 120th epoch.