How to create a neural network for regression? - python

I am trying to use Keras to build a neural network. My code is as follows:
import numpy as np
from keras.layers import Dense, Activation
from keras.models import Sequential
from sklearn.model_selection import train_test_split
data = np.genfromtxt(r"""file location""", delimiter=',')
model = Sequential()
model.add(Dense(32, activation = 'relu', input_dim = 6))
model.compile(optimizer='adam', loss='mean_squared_error', metrics = ['accuracy'])
Y = data[:,-1]
X = data[:, :-1]
From here I have tried using model.fit(X, Y), but the accuracy of the model appears to remain at 0. I am new to Keras, so this probably has an easy solution; apologies in advance.
My question is what is the best way to add regression to the model so that the accuracy increases? Thanks in advance.

First of all, you have to split your dataset into a training set and a test set using the train_test_split function from sklearn.model_selection.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.08, random_state = 0)
Also, you have to scale your values using StandardScaler class.
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
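To see why the scaler is fitted on the training set only and then reused on the test set, here is a minimal pure-Python sketch of what standardization does for a single feature. The helper names and the feature values are made up for illustration; StandardScaler itself works on whole arrays.

```python
from statistics import mean, pstdev

def fit_standardizer(column):
    """Learn the mean and the population standard deviation of one feature."""
    mu = mean(column)
    sigma = pstdev(column) or 1.0  # fall back to 1.0 for a constant feature
    return mu, sigma

def standardize(column, mu, sigma):
    """Apply already-learned statistics to any data set."""
    return [(value - mu) / sigma for value in column]

train_feature = [2.0, 4.0, 6.0, 8.0]  # hypothetical training values
test_feature = [5.0, 10.0]            # hypothetical test values

# The statistics come from the training data only ...
mu, sigma = fit_standardizer(train_feature)
train_scaled = standardize(train_feature, mu, sigma)
# ... and the very same statistics are reused for the test data.
test_scaled = standardize(test_feature, mu, sigma)

print(train_scaled)
print(test_scaled)
```

Fitting a second scaler on the test set would map the same raw value to a different scaled value, so the network would see inputs on a scale it was not trained on.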
Then, you should add more layers in order to get better results.
A commonly used rule of thumb gives an upper bound on the total number of hidden neurons:
Nh = Ns / (α * (Ni + No))
Nh = number of hidden neurons.
Ni = number of input neurons.
No = number of output neurons.
Ns = number of samples in training data set.
α = an arbitrary scaling factor usually 2-10.
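As a quick worked example of the rule of thumb above, the sketch below evaluates it for a few values of α. The 6 inputs and 1 output match the model in the question; the sample count of 1000 is a made-up number, and the function name is only illustrative.

```python
def hidden_neuron_bound(n_samples, n_inputs, n_outputs, alpha):
    """Rule-of-thumb upper bound: Nh = Ns / (alpha * (Ni + No))."""
    return n_samples / (alpha * (n_inputs + n_outputs))

n_inputs, n_outputs = 6, 1  # 6 features (input_dim = 6), one regression target
n_samples = 1000            # hypothetical training-set size

for alpha in (2, 5, 10):
    bound = hidden_neuron_bound(n_samples, n_inputs, n_outputs, alpha)
    print("alpha =", alpha, "->", round(bound, 1), "hidden neurons at most")
```

Treat the result as a rough starting point rather than a hard limit; the right size still has to be checked on validation data.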
So our model becomes:
# Initialising the ANN
model = Sequential()
# Adding the input layer and the first hidden layer
model.add(Dense(32, activation = 'relu', input_dim = 6))
# Adding the second hidden layer
model.add(Dense(units = 32, activation = 'relu'))
# Adding the third hidden layer
model.add(Dense(units = 32, activation = 'relu'))
# Adding the output layer
model.add(Dense(units = 1))
The metric that you use, metrics=['accuracy'], corresponds to a classification problem. If you want to do regression, remove metrics=['accuracy']. That is, just use
model.compile(optimizer = 'adam',loss = 'mean_squared_error')
Here is a list of keras metrics for regression and classification
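For regression you would look at error metrics instead of accuracy. Here is a small pure-Python sketch of three common ones (MAE, RMSE and R²), computed on made-up values; the function name is hypothetical. In Keras itself you can, for example, pass metrics=['mae'] to compile.

```python
import math

def regression_report(y_true, y_pred):
    """Return MAE, RMSE and R^2 for two equally long lists of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    # Total sum of squares; this is zero (and R^2 undefined) if y_true is constant
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot
    return {"mae": mae, "rmse": math.sqrt(mse), "r2": r2}

y_true = [3.0, 5.0, 7.0, 9.0]  # hypothetical real values
y_pred = [2.5, 5.5, 7.0, 8.0]  # hypothetical predictions

report = regression_report(y_true, y_pred)
print(report)
```

Unlike accuracy, these numbers change smoothly as the predictions get closer to the targets, which is what you want to monitor for a regression model.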
Also, you have to define the batch_size and epochs values for the fit method.
model.fit(X_train, y_train, batch_size = 10, epochs = 100)
After you have trained your network, you can predict the results for X_test using the model.predict method.
y_pred = model.predict(X_test)
Now you can compare y_pred, which we obtained from the neural network, with y_test, which is the real data. For this, you can create a plot using the matplotlib library.
import matplotlib.pyplot as plt
plt.plot(y_test, color = 'red', label = 'Real data')
plt.plot(y_pred, color = 'blue', label = 'Predicted data')
plt.legend()
plt.show()
It seems that our neural network learns very well.
Here is how the plot looks.
Here is the full code
import numpy as np
from keras.layers import Dense, Activation
from keras.models import Sequential
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
# Importing the dataset
dataset = np.genfromtxt("data.txt", delimiter='')
X = dataset[:, :-1]
y = dataset[:, -1]
# Splitting the dataset into the Training set and Test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.08, random_state = 0)
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
# Initialising the ANN
model = Sequential()
# Adding the input layer and the first hidden layer
model.add(Dense(32, activation = 'relu', input_dim = 6))
# Adding the second hidden layer
model.add(Dense(units = 32, activation = 'relu'))
# Adding the third hidden layer
model.add(Dense(units = 32, activation = 'relu'))
# Adding the output layer
model.add(Dense(units = 1))
# Compiling the ANN
model.compile(optimizer = 'adam', loss = 'mean_squared_error')
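# Fitting the ANN to the Training set
model.fit(X_train, y_train, batch_size = 10, epochs = 100)
# Predicting the Test set results
y_pred = model.predict(X_test)
# Comparing the real and the predicted values
plt.plot(y_test, color = 'red', label = 'Real data')
plt.plot(y_pred, color = 'blue', label = 'Predicted data')
plt.legend()
plt.show()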


loss becomes 'nan' in all of epochs in Keras ANN (regression)

I have a regression problem and I've built a Keras ANN.
All of my data (both the feature and the target matrix) are floats;
I have 3 layers (5 input nodes, two hidden layers with 3 nodes, and 1 output node);
optimizer='adam', batch_size='10', loss='mean_squared_error', and the activation functions are 'relu' except the last one, which is 'linear'.
During training, the loss is 'nan' in every epoch, and my predicted values are all 'nan' as well!
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
dataset = pd.read_csv('/content/drive/MyDrive/Miss Derakhshan data (1).csv')
X = dataset.iloc[:, 1:6].values
y = dataset.iloc[:, 8].values
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.20)
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
import keras
from keras.models import Sequential
from keras.layers import Dense
regression = Sequential()
regression.add(Dense(units = 4, input_dim = 5, kernel_initializer= 'uniform',
activation = 'relu'))
regression.add(Dense(units = 3, kernel_initializer = 'uniform' , activation = 'relu'))
regression.add(Dense(units = 1 , kernel_initializer = 'uniform', activation = 'linear'))
regression.compile(optimizer = 'adam', loss = 'mean_squared_error' , metrics = ['accuracy'])
regression.fit(X_train, y_train, batch_size = 5 , epochs = 100)
y_pred = regression.predict(X_test)

Loss is always nan when training a deep learning model from tabular data

I'm trying to train a model on a dataset of a few thousand entries with 51 numerical features and a label column.
When training the model to predict the 3 labels (candidate, false positive, confirmed), the loss is always nan and the accuracy stabilizes very quickly at a specific value.
The code:
import tensorflow as tf
import numpy as np
import pandas as pd
import sklearn.preprocessing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler, RobustScaler
from sklearn.preprocessing import OrdinalEncoder
from tensorflow import optimizers
from tensorflow.python.keras.layers import Dense, Dropout, Normalization
from tensorflow.python.keras.models import Sequential, Model
def load_dataset(data_folder_csv):
    # load the dataset as a pandas DataFrame
    data = pd.read_csv(data_folder_csv, header=0)
    # retrieve numpy array
    dataset = data.values
    # split into input (X) and output (y) variables
    X = dataset[:, :-1]
    y = dataset[:, -1]
    # format all fields as floats
    X = X.astype(np.float)
    # reshape the output variable to be one column (e.g. a 2D shape)
    y = y.reshape((len(y), 1))
    return X, y
# prepare input data using min/max scaler.
def prepare_inputs(X_train, X_test):
    oe = RobustScaler().fit_transform(X_train)
    X_train_enc = oe.transform(X_train)
    X_test_enc = oe.transform(X_test)
    return X_train_enc, X_test_enc
# prepare target
def prepare_targets(y_train, y_test):
    le = LabelEncoder()
    ohe = OneHotEncoder()
    y_train_enc = ohe.fit_transform(y_train).toarray()
    y_test_enc = ohe.fit_transform(y_test).toarray()
    return y_train_enc, y_test_enc
X, y = load_dataset("csv_ready.csv")
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
print('Train', X_train.shape, y_train.shape)
print('Test', X_test.shape, y_test.shape)
X_train_enc, X_test_enc = X_train, X_test
print('Finished preparing inputs.')
# prepare output data
y_train_enc, y_test_enc = prepare_targets(y_train, y_test)
norm_layer = Normalization()
model = Sequential()
model.add(Dense(128, input_dim=X_train.shape[1], activation="tanh", kernel_initializer='he_normal'))
model.add(Dense(64, input_dim=X_train.shape[1], activation='relu'))
model.add(Dense(32, input_dim=X_train.shape[1], activation='relu'))
model.add(Dense(3, activation='sigmoid'))
opt = optimizers.Adam(lr=0.01, decay=1e-6)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
model.summary()
model.fit(X_train, y_train_enc, epochs=20, batch_size=128, verbose=1, use_multiprocessing=True)
_, accuracy = model.evaluate(X_test, y_test_enc, verbose=0)
print('Accuracy: %.2f' % (accuracy * 100))
I tried increasing/decreasing the learning rate, changing the optimizer, lowering and increasing the number of neurons and layers, and playing with batch sizes but nothing seems to bring the model to get good results. I think I'm missing something here but can't put my finger on it.
Result example:
EDIT: More lines from the csv:
EDIT2: I also tried l2 regularization and it didn't do anything.
One of the possible reasons:
Check whether your dataset has NaN values. NaN values in the data can cause the loss to become NaN during training.
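A quick way to check for this is to count the non-finite values per column before training. Below is a minimal pure-Python sketch on made-up rows; the function name is hypothetical. With pandas, df.isna().sum() gives you the per-column NaN counts directly.

```python
import math

def count_non_finite(rows):
    """Return, per column index, how many values are NaN or infinite."""
    counts = {}
    for row in rows:
        for col, value in enumerate(row):
            if not math.isfinite(value):
                counts[col] = counts.get(col, 0) + 1
    return counts

rows = [  # hypothetical feature rows
    [1.0, 2.0, 3.0],
    [float("nan"), 2.5, 3.5],
    [4.0, float("inf"), float("nan")],
]

bad = count_non_finite(rows)
print(bad)  # columns that need cleaning or imputing before training
```

If any column shows up here, drop or impute those values first; a single NaN in a batch is enough to turn the loss into NaN.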
Some of the major bugs in your code:
You are using a sigmoid activation function instead of softmax for the output layer, which has 3 neurons.
You are fitting the encoders on both the train and the test set, which is wrong. You should call fit_transform on your train data and only transform on the test set.
You are also passing input_dim to all layers, which is wrong. Only the first layer should declare the input shape.
You forgot to call the prepare_inputs function for X_train and X_test.
Your model should be fit with X_train_enc, not X_train.
Use this instead:
import tensorflow as tf
import numpy as np
import pandas as pd
import sklearn.preprocessing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler, MinMaxScaler
from sklearn.preprocessing import OrdinalEncoder
from tensorflow import optimizers
from tensorflow.keras.layers import Dense, Dropout, Normalization
from tensorflow.keras.models import Sequential, Model
def load_dataset(data_folder_csv):
    # load the dataset as a pandas DataFrame
    data = pd.read_csv(data_folder_csv, header=0)
    # retrieve numpy array
    dataset = data.values
    # split into input (X) and output (y) variables
    X = dataset[:, :-1]
    y = dataset[:, -1]
    # format all fields as floats (np.float was removed in NumPy 1.24; the builtin float is equivalent)
    X = X.astype(float)
    # reshape the output variable to be one column (e.g. a 2D shape)
    y = y.reshape((len(y), 1))
    return X, y
# prepare input data using min/max scaler.
def prepare_inputs(X_train, X_test):
    oe = MinMaxScaler()
    X_train_enc = oe.fit_transform(X_train)
    X_test_enc = oe.transform(X_test)
    return X_train_enc, X_test_enc
# prepare target
def prepare_targets(y_train, y_test):
    le = LabelEncoder()
    ohe = OneHotEncoder()
    # LabelEncoder expects 1D input, OneHotEncoder expects 2D input
    y_train = le.fit_transform(y_train.ravel()).reshape(-1, 1)
    y_test = le.transform(y_test.ravel()).reshape(-1, 1)
    y_train_enc = ohe.fit_transform(y_train).toarray()
    y_test_enc = ohe.transform(y_test).toarray()
    return y_train_enc, y_test_enc
X, y = load_dataset("csv_ready.csv")
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
print('Train', X_train.shape, y_train.shape)
print('Test', X_test.shape, y_test.shape)
#prepare_input function missing here
X_train_enc, X_test_enc = prepare_inputs(X_train, X_test)
print('Finished preparing inputs.')
# prepare output data
y_train_enc, y_test_enc = prepare_targets(y_train, y_test)
model = Sequential()
model.add(Dense(128, input_dim=X_train.shape[1], activation="relu"))
model.add(Dense(128, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(3, activation='softmax'))
#opt = optimizers.Adam(lr=0.01, decay=1e-6)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
model.fit(X_train_enc, y_train_enc, epochs=20, batch_size=32, verbose=1)
_, accuracy = model.evaluate(X_test_enc, y_test_enc, verbose=0)
print('Accuracy: %.2f' % (accuracy * 100))
You want to change your model definition to this:
model = Sequential()
model.add(Dense(128, input_shape=X_train.shape[1:], activation="tanh", kernel_initializer='he_normal'))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(3, activation='softmax'))
You only need to define the input shape for the first layer, Keras will automatically determine the proper shape for the subsequent layers. You leave out the batch size when defining the input_shape, which is the first dimension, hence input_shape=X_train.shape[1:].
A sigmoid activation will actually work (because the output will vary between 0 and 1), but what you really want is a softmax activation (which makes sure all the outputs sum to 1, which is what probability dictates -- the probability that something happened is 100%, not the 120% that sigmoid could end up giving you).
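To make the difference concrete, here is a small pure-Python sketch that applies both activations to the same three made-up output values (logits). It only illustrates the math and is not how Keras implements the layers.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    # Subtract the maximum first for numerical stability
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical raw outputs of the last layer

sig = [sigmoid(z) for z in logits]
soft = softmax(logits)

print("sigmoid:", sig, "sum =", sum(sig))    # each value in (0, 1), sum not constrained
print("softmax:", soft, "sum =", sum(soft))  # values form a probability distribution
```

With sigmoid each output is squashed independently, so the three "probabilities" here add up to well over 1; softmax normalizes them so they sum to 1, which is what categorical_crossentropy assumes.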
Also, you're not using your LabelEncoder anywhere. I think what you mean to do is this:
def prepare_targets(y_train, y_test):
    le = LabelEncoder()
    ohe = OneHotEncoder()
    # teach the label encoder our labels
    le.fit(y_train.ravel())
    # turn our strings into integers (reshaped to one column for the one-hot encoder)
    y_train_transformed = le.transform(y_train.ravel()).reshape(-1, 1)
    y_test_transformed = le.transform(y_test.ravel()).reshape(-1, 1)
    # turn our integers into one-hot-encoded arrays
    y_train_enc = ohe.fit_transform(y_train_transformed).toarray()
    y_test_enc = ohe.transform(y_test_transformed).toarray()
    return y_train_enc, y_test_enc
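If it helps to see what the two encoders do together, here is a rough pure-Python sketch of the same idea: learn the label-to-integer mapping from the training labels only, then reuse it for the test labels. The helper names and the label lists are made up; the three class names are the ones from the question.

```python
def fit_label_index(train_labels):
    """Map each distinct training label to an integer, in sorted order."""
    return {label: i for i, label in enumerate(sorted(set(train_labels)))}

def one_hot(labels, index):
    """Turn labels into one-hot rows using an already-fitted mapping."""
    rows = []
    for label in labels:
        row = [0] * len(index)
        row[index[label]] = 1  # raises KeyError for a label unseen in training
        rows.append(row)
    return rows

train_labels = ["candidate", "confirmed", "false positive", "candidate"]
test_labels = ["confirmed", "candidate"]

index = fit_label_index(train_labels)      # fitted on the training labels only
y_train_enc = one_hot(train_labels, index)
y_test_enc = one_hot(test_labels, index)   # same mapping reused for the test set

print(index)
print(y_test_enc)
```

Fitting a second mapping on the test labels could assign different column positions to the same class, which silently corrupts the evaluation.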

Keras CNN always predicts same class

EDIT: it seems like I did not even run the model for enough epochs, so I will try that out and return with my results
I am trying to create a CNN that classifies 3D brain images. However, the CNN always predicts the same class when I run it, and I am not sure what else I can do to prevent this. I have searched for this problem and tried many plausible solutions, but they did not work.
So far, I have tried:
Decreasing the learning rate
Normalize the data to [0, 1]
Change optimizers
Only use sigmoid and binary_crossentropy
Add/remove dropout layers
Changed into a simpler CNN model
Balance the dataset
Added augmented data using a custom 3D imagedatagenerator()
For context, I am classifying between two groups. The amount of images I am using is a total of 200 3D brain images (about 100 for each category). To increase my training size, I used a custom data augmentation I found from github
Looking at the learning curve, the accuracy and loss rates are completely random. Some runs they would be decreasing, some increasing, and some fluctuating within a range
Any help would be appreciated!
import os
import csv
import tensorflow as tf # 2.0
import nibabel as nib
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, LabelEncoder
from keras.models import Model
from keras.layers import Conv3D, MaxPooling3D, Dense, Dropout, Activation, Flatten
from keras.layers import Input, concatenate
from keras import optimizers
from keras.utils import to_categorical
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt
from augmentedvolumetricimagegenerator.generator import customImageDataGenerator
from keras.callbacks import EarlyStopping
# Administrative items
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# Where the file is located
path = r'C:\Users\jesse\OneDrive\Desktop\Research\PD\decline'
folder = os.listdir(path)
target_size = (96, 96, 96)
# creating x - converting images to array
def read_image(path, folder):
    mri = []
    for i in range(len(folder)):
        files = os.listdir(path + '\\' + folder[i])
        for j in range(len(files)):
            image = np.array(nib.load(path + '\\' + folder[i] + '\\' + files[j]).get_fdata())
            image = np.resize(image, target_size)
            image = np.expand_dims(image, axis=3)
            image /= 255.
            # collect the processed volume, otherwise the function returns an empty list
            mri.append(image)
    return mri
# creating y - one hot encoder
def create_y():
    excel_file = r'C:\Users\jesse\OneDrive\Desktop\Research\PD\decline_label.xlsx'
    excel_read = pd.read_excel(excel_file)
    excel_array = np.array(excel_read['Label'])
    label = LabelEncoder().fit_transform(excel_array)
    label = label.reshape(len(label), 1)
    onehot = OneHotEncoder(sparse=False).fit_transform(label)
    return onehot
# Splitting image train/test
x = np.asarray(read_image(path, folder))
y = np.asarray(create_y())
x_split, x_test, y_split, y_test = train_test_split(x, y, test_size=.2, stratify=y)
x_train, x_val, y_train, y_val = train_test_split(x_split, y_split, test_size=.25, stratify=y_split)
print(x_train.shape, x_val.shape, x_test.shape, y_train.shape, y_val.shape, y_test.shape)
batch_size = 10
num_classes = len(folder)
inputs = Input((96, 96, 96, 1))
conv1 = Conv3D(32, [3, 3, 3], padding='same', activation='relu')(inputs)
conv1 = Conv3D(32, [3, 3, 3], padding='same', activation='relu')(conv1)
pool1 = MaxPooling3D(pool_size=(2, 2, 2), padding='same')(conv1)
drop1 = Dropout(0.5)(pool1)
conv2 = Conv3D(64, [3, 3, 3], padding='same', activation='relu')(drop1)
conv2 = Conv3D(64, [3, 3, 3], padding='same', activation='relu')(conv2)
pool2 = MaxPooling3D(pool_size=(2, 2, 2), padding='same')(conv2)
drop2 = Dropout(0.5)(pool2)
conv3 = Conv3D(128, [3, 3, 3], padding='same', activation='relu')(drop2)
conv3 = Conv3D(128, [3, 3, 3], padding='same', activation='relu')(conv3)
pool3 = MaxPooling3D(pool_size=(2, 2, 2), padding='same')(conv3)
drop3 = Dropout(0.5)(pool3)
flat1 = Flatten()(drop3)
dense1 = Dense(128, activation='relu')(flat1)
drop5 = Dropout(0.5)(dense1)
dense2 = Dense(num_classes, activation='sigmoid')(drop5)
model = Model(inputs=[inputs], outputs=[dense2])
opt = optimizers.Adagrad(lr=1e-5)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
train_datagen = customImageDataGenerator()  # augmentation arguments are not shown in the post
val_datagen = customImageDataGenerator()
training_set = train_datagen.flow(x_train, y_train, batch_size=batch_size)
validation_set = val_datagen.flow(x_val, y_val, batch_size=batch_size)
callbacks = EarlyStopping(monitor='val_loss', patience=3)
history = model.fit_generator(training_set,
steps_per_epoch = 10,
epochs = 20,
validation_steps = 5,
callbacks = [callbacks],
validation_data = validation_set)
score = model.evaluate(x_test, y_test, batch_size=batch_size)
y_pred = model.predict(x_test, batch_size=batch_size)
y_test = np.argmax(y_test, axis=1)
y_pred = np.argmax(y_pred, axis=1)
confusion = confusion_matrix(y_test, y_pred)
map = sns.heatmap(confusion, annot=True)
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
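# accuracy curves
plt.plot(acc)
plt.plot(val_acc)
plt.legend(['train', 'test'], loc='best')
plt.show()
# loss curves
plt.plot(loss)
plt.plot(val_loss)
plt.legend(['train', 'test'], loc='best')
plt.show()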
You can find the outputs here:
It is kind of hard to help without the dataset itself. There are a few things I would test, though:
I find the ReLU activation inappropriate for the Dense layer, which could lead to the mono-class prediction. Try replacing the relu in your Dense(128) layer with something else (sigmoid, tanh).
Dropout is not really appropriate for images in general; you might want to look at DropBlock.
The initial learning rate is pretty low; I would start with something between 1e-3 and 1e-4.
A stupid thing that has happened to me way too often: have you visualized the image/label combinations to make sure each image has the right label?
Again, I'm not sure it will fix everything, but I hope it helps!
This could be any number of things, but it is possible that the misbehaviour is being caused by the data itself.
Just from looking at the code, it seems like you haven't normalized the testing data before calling model.predict or model.evaluate in the same way as you have done for the training and validation data.
I had a similar problem once and it turned out this was the cause. As a quick test you can just rescale the test data and see if that helps.

Using a trained Keras model to make predictions on new csv data

I'm working on a project where I have to predict whether a house price is above or below the median price, and to do that I'm using a dataset from Kaggle, in which 1 means "Above Median" and 0 means "Below Median". I wrote this code to train a neural network and save it as a .h5 file:
import pandas as pd
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense
import h5py
df = pd.read_csv('housepricedata.csv')
dataset = df.values
X = dataset[:,0:10]
Y = dataset[:,10]
min_max_scaler = preprocessing.MinMaxScaler()
X_scale = min_max_scaler.fit_transform(X)
X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(X_scale, Y, test_size=0.3)
X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.5)
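model = Sequential([
    Dense(32, activation='relu', input_shape=(10,)),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid'),
])
# The compile call is missing from the post; the settings below are an assumption
# (binary cross-entropy is the usual loss for a single sigmoid output)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
hist = model.fit(X_train, Y_train,
          batch_size=32, epochs=100,
          validation_data=(X_val, Y_val))
model.save("house_price.h5")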
After running it, it successfully saves the .h5 file to my directory. What I want to do now is use my trained model to make predictions on a new .csv file and determine whether each of its rows is above or below the median price. This is an image of the csv file in VSCode that I want it to make predictions on:
As you can see, this file doesn't contain a 1 (above median) or 0 (below median), because that's what I want it to predict. This is the code I wrote to do that:
import pandas as pd
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense
from keras.models import load_model
import h5py
df = pd.read_csv('data.csv')
dataset = df.values
X = dataset[:,0:10]
Y = dataset[:,10]
min_max_scaler = preprocessing.MinMaxScaler()
X_scale = min_max_scaler.fit_transform(X)
X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(X_scale, Y, test_size=0.3)
X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.5)
model = load_model("house_price.h5")
y_pred = model.predict(X_test)
Its output is [[0.00101464]]. I have no clue what that is, or why it only returns one value even though the csv file has 4 rows. Does anyone know how I can fix that and predict either a 1 or a 0 for each row in the csv file?
Thank You!
As far as I understand what you want, let's try this. The following code works for me:
import tensorflow
model = tensorflow.keras.models.load_model("house_price.h5")
If that still does not work for you, visit the following site:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
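# Importing the dataset
dataset = pd.read_csv('C:\\Users\\acer\\Downloads\\housepricedata.csv')
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
# The data preparation is not shown in the post; the lines below are an assumption
# based on the column layout in the question (10 features, label in column 10)
X = dataset.iloc[:, 0:10].values
y = dataset.iloc[:, 10].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)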
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
classifier = Sequential()
# Adding the input layer and the first hidden layer
classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu', input_dim = 10))
# classifier.add(Dropout(p = 0.1))
# Adding the second hidden layer
classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu'))
# classifier.add(Dropout(p = 0.1))
# Adding the output layer
classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
# Compiling the ANN
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
# Fitting the ANN to the Training set
classifier.fit(X_train, y_train, batch_size = 10, epochs = 100)
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)
import tensorflow
model = tensorflow.keras.models.load_model("house_price.h5")
y_pred = model.predict(X_test)
y_pred = (y_pred > 0.5)
Both y_pred arrays give the same output for me.
One thing to note: the raw y_pred does not contain 0s and 1s, because the sigmoid function outputs the prediction as a probability.
So y_pred > 0.5 gives booleans, where True represents one and False represents zero.
You can use astype(int), or the replace or map functions of pandas, to convert True into 1 and False into 0.
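As a small illustration of that last step, the sketch below turns sigmoid outputs into 0/1 labels, one per row. The first probability is the value from the question; the others are made up, and the nested-list shape only mimics the (n, 1) array that model.predict returns for a single sigmoid output.

```python
def to_labels(probabilities, threshold=0.5):
    """Map each probability to 1 (above median) or 0 (below median)."""
    return [1 if p > threshold else 0 for p in probabilities]

# One inner list per row of the csv file
raw_predictions = [[0.00101464], [0.73], [0.5], [0.98]]

flat = [row[0] for row in raw_predictions]  # flatten (n, 1) to n values
labels = to_labels(flat)

print(labels)
```

Note that a probability of exactly 0.5 falls into class 0 here because the comparison is strict; whether you want > or >= is a convention you choose.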

How do I find the false positive and false negative rates for a neural network?

I have the below code which works perfectly for a neural network. I know I need the confusion matrix library to find the false positive and false negative rates but I'm not sure how to do it as I'm no expert in programming. Can someone help please?
import pandas as pd
from sklearn import preprocessing
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense
# read the csv file and convert into arrays for the machine to process
df = pd.read_csv('dataset_ori.csv')
dataset = df.values
# split the dataset into input features and the feature to predict
X = dataset[:,0:7]
Y = dataset[:,7]
# scale the input features with MinMaxScaler so that they all lie between 0 and 1
min_max_scaler = preprocessing.MinMaxScaler()
# store the dataset into an array
X_scale = min_max_scaler.fit_transform(X)
# split the dataset into 30% testing and the rest to train
X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(X_scale, Y, test_size=0.3)
# split the val_and_test size equally to the validation set and the test set.
X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.5)
# specify the sequential model and describe the layers that will form architecture of the neural network
model = Sequential([Dense(7, activation='relu', input_shape=(7,)), Dense(32, activation='relu'), Dense(5, activation='relu'), Dense(1, activation='sigmoid'),])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# training the model
hist = model.fit(X_train, Y_train, batch_size=32, epochs=100, validation_data=(X_val, Y_val))
# find the accuracy of the classifier on the test set
scores = model.evaluate(X_test, Y_test)
print("Accuracy: %.2f%%" % (scores[1]*100))
This is the code provided in the answer below. response and model are both highlighted in red as unresolved references.
from keras import models
from keras.layers import Dense, Dropout
from keras.utils import to_categorical
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
from keras.models import Sequential
from keras.layers import Dense, Activation
from sklearn import metrics
from sklearn.preprocessing import StandardScaler
# read the csv file and convert into arrays for the machine to process
df = pd.read_csv('dataset_ori.csv')
dataset = df.values
# split the dataset into input features and the feature to predict
X = dataset[:,0:7]
Y = dataset[:,7]
# Splitting into Train and Test Set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(dataset,
test_size = 0.2,
random_state = 0)
# Initialising the ANN
classifier = Sequential()
# Adding the input layer and the first hidden layer
classifier.add(Dense(units = 10, kernel_initializer = 'uniform', activation = 'relu', input_dim =7 ))
# Adding the second hidden layer
classifier.add(Dense(units = 10, kernel_initializer = 'uniform', activation = 'relu'))
# Adding the output layer
classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
# Compiling the ANN
classifier.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
# Fitting the ANN to the Training set
classifier.fit(X_train, y_train, batch_size = 10, epochs = 20)
# Train model
scaler = StandardScaler()
classifier.fit(scaler.fit_transform(X_train.values), y_train)
# Summary of neural network
# Predicting the Test set results & Giving a threshold probability
y_prediction = classifier.predict_classes(scaler.transform(X_test.values))
print ("\n\naccuracy" , np.sum(y_prediction == y_test) / float(len(y_test)))
y_prediction = (y_prediction > 0.5)
#Let's see how our model performed
from sklearn.metrics import classification_report
print(classification_report(y_test, y_prediction))
Your input to confusion_matrix must be an array of int not one hot encodings.
# Predicting the Test set results
y_pred = model.predict(X_test)
y_pred = (y_pred > 0.5)
matrix = metrics.confusion_matrix(y_test.argmax(axis=1), y_pred.argmax(axis=1))
The raw predictions come out as probabilities like the ones below, so applying a probability threshold of 0.5 transforms them into binary labels.
[0.87812372 0.77490434 0.30319547 0.84999743]
The sklearn.metrics.accuracy_score(y_true, y_pred) method defines y_pred as:
y_pred : 1d array-like, or label indicator array / sparse matrix. Predicted labels, as returned by a classifier.
Which means y_pred has to be an array of 1's or 0's (predicted labels). They should not be probabilities.
The root cause of your error is a theoretical and not a computational issue: you are trying to use a classification metric (accuracy) in a regression (i.e. numeric prediction) model (Neural Logistic Model), which is meaningless.
Just like the majority of performance metrics, accuracy compares apples to apples (i.e true labels of 0/1 with predictions again of 0/1); so, when you ask the function to compare binary true labels (apples) with continuous predictions (oranges), you get an expected error, where the message tells you exactly what the problem is from a computational point of view:
Classification metrics can't handle a mix of binary and continuous target
Although the message doesn't tell you directly that you are trying to compute a metric that is invalid for your problem (and we shouldn't actually expect it to go that far), it is certainly a good thing that scikit-learn at least gives you a direct and explicit warning that you are attempting something wrong; this is not necessarily the case with other frameworks - see for example the behavior of Keras in a very similar situation, where you get no warning at all, and one just ends up complaining about low "accuracy" in a regression setting...
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sn
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
# read the csv file and convert into arrays for the machine to process
df = pd.read_csv('dataset_ori.csv')
dataset = df.values
# split the dataset into input features and the feature to predict
X = dataset[:,0:7]
Y = dataset[:,7]
# Splitting into Train and Test Set
X_train, X_test, y_train, y_test = train_test_split(X, Y,
test_size = 0.2,
random_state = 0)
# Scaling the features: fit on the training set only, then reuse for the test set
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Initialising the ANN
classifier = Sequential()
# Adding the input layer and the first hidden layer
classifier.add(Dense(units = 10, kernel_initializer = 'uniform', activation = 'relu', input_dim = 7))
# Adding the second hidden layer
classifier.add(Dense(units = 10, kernel_initializer = 'uniform', activation = 'relu'))
# Adding the output layer
classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
# Compiling the ANN (binary_crossentropy matches a single sigmoid output)
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
# Fitting the ANN to the Training set
classifier.fit(X_train, y_train, batch_size = 10, epochs = 20)
# Predicting the Test set results & Giving a threshold probability
y_pred = (classifier.predict(X_test) > 0.5).astype(int).ravel()
print("\n\naccuracy", np.sum(y_pred == y_test) / float(len(y_test)))
## EXTRA: Confusion Matrix Visualize
cm = confusion_matrix(y_test, y_pred) # rows = truth, cols = prediction
df_cm = pd.DataFrame(cm, index = (0, 1), columns = (0, 1))
plt.figure(figsize = (10,7))
sn.heatmap(df_cm, annot=True, fmt='g')
plt.show()
print("Test Data Accuracy: %0.4f" % accuracy_score(y_test, y_pred))
#Let's see how our model performed
print(classification_report(y_test, y_pred))
As you have already imported confusion_matrix from scikit-learn, you can use it like this:
import numpy as np
cutoff = 0.5
y_pred = model.predict(X_test).ravel() # flatten the (n, 1) output to shape (n,)
y_pred_classes = np.zeros_like(y_pred) # initialise an array full of zeros
y_pred_classes[y_pred > cutoff] = 1
y_test_classes = np.zeros_like(y_pred)
y_test_classes[Y_test > cutoff] = 1
print(confusion_matrix(y_test_classes, y_pred_classes))
For binary labels 0 and 1, scikit-learn orders the confusion matrix like this (rows are the true classes, columns are the predicted classes):
True negatives False positives
False negatives True positives
To get tn and so on, you can run this:
tn, fp, fn, tp = confusion_matrix(y_test_classes, y_pred_classes).ravel()
(tn, fp, fn, tp)
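From those four counts, the two rates asked about in the question follow directly: the false positive rate is FP / (FP + TN) and the false negative rate is FN / (FN + TP). Here is a minimal pure-Python sketch on made-up labels; the helper names are hypothetical, and the counting loop stands in for confusion_matrix(...).ravel().

```python
def confusion_counts(y_true, y_pred):
    """Count tn, fp, fn, tp for binary labels (assumes every label is 0 or 1)."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    return tn, fp, fn, tp

def error_rates(tn, fp, fn, tp):
    """Return (false positive rate, false negative rate)."""
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # share of real negatives flagged positive
    fnr = fn / (fn + tp) if (fn + tp) else 0.0  # share of real positives that were missed
    return fpr, fnr

y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]  # hypothetical true labels
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0]  # hypothetical thresholded predictions

tn, fp, fn, tp = confusion_counts(y_true, y_pred)
fpr, fnr = error_rates(tn, fp, fn, tp)
print("FPR:", fpr, "FNR:", fnr)
```

The same error_rates call works unchanged on the tn, fp, fn, tp values unpacked from scikit-learn's confusion matrix.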
