I have been using a dataset from Kaggle focused on heart disease prediction, with 8 input features and a binary output. I have tested the data with logistic regression, random forests, SVM, and a simple neural network. Surprisingly, the best model is the random forest. However, I'm having trouble with the neural network. The target in the data is either a 1 or a 0, so binary. The neural network runs, but the results are odd, at least to me.
import tensorflow as tf
from tensorflow import keras
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
NN = keras.Sequential([
keras.layers.Flatten(input_shape=(8,)),
keras.layers.Dense(16, activation=tf.nn.relu),
keras.layers.Dense(16, activation=tf.nn.relu),
keras.layers.Dense(1, activation=tf.nn.sigmoid),
])
NN.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
NN.fit(X_train, y_train, epochs=50, batch_size=1)
test_loss, test_acc = NN.evaluate(X_test, y_test)
y_pred = NN.predict(X_test)
I expect my results to be values between 0 and 1, so anything over 0.5 maps to 1 and anything less maps to 0. But my predictions on the test set give me values from 0 to 9. Any idea what I did wrong? Could I be overfitting the data?
I tested the network on a known datapoint:
X_new = [[70,1,0,145,174,0,1,125]]
X_new = sc.fit_transform(X_new)
NN.predict(X_new)
and I get the following result, which is wrong: 0.99 (it should be 0).
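One thing worth double-checking in that snippet: sc.fit_transform(X_new) refits the scaler on the single new row, which standardizes the row against itself and turns every feature into 0, so the network never sees the real datapoint. Reusing the statistics fitted on the training set looks like the intent; a minimal sketch of that correction, reusing sc and NN from above:
X_new = [[70, 1, 0, 145, 174, 0, 1, 125]]
X_new = sc.transform(X_new)  # reuse the mean/std learned from X_train; do not refit
print(NN.predict(X_new))     # a single sigmoid value between 0 and 1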
I've never used Keras or TensorFlow before, and was going through this example in the Visual Studio Code documentation, but it seems to have a bug. The documentation shows that their trained model has 61% accuracy against the test data, which matches what I get when I run it. However, no matter how you modify the neural network parameters, you always get exactly the same accuracy. You can even skip the compile and fit commands and still get 61% accuracy.
It turns out that the predictions they got were all zeroes (which happened to be right 61% of the time against the test data), and no matter how I modify the network it only outputs zeroes, so there seems to be some mistake in their code. But since I don't know Keras or TF, I haven't been able to figure out how to make it work.
Here's what I think all the relevant code is, but you can check the link above for everything:
import numpy as np
from sklearn import metrics
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(data[['sex','pclass','age','relatives','fare']], data.survived, test_size=0.2, random_state=0)
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(x_train)
X_test = sc.transform(x_test)
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(5, kernel_initializer = 'uniform', activation = 'relu', input_dim = 5))
model.add(Dense(5, kernel_initializer = 'uniform', activation = 'relu'))
model.add(Dense(1, kernel_initializer = 'uniform', activation = 'sigmoid'))
model.compile(optimizer="adam", loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=32, epochs=50)
y_pred = np.argmax(model.predict(X_test), axis=-1)
print(metrics.accuracy_score(y_test, y_pred))
(As mentioned by @Frightera:) np.argmax() is generally used to get the index of the maximum value when there are more than two class probabilities. This is a binary classification model, and the sigmoid activation in the last layer always returns a value between 0 and 1: small values (< 0.5) should be classified as 0, and large values (> 0.5) as 1. Because the model's output has shape (n, 1), np.argmax(..., axis=-1) always returns index 0, which is why every prediction comes out as zero. Hence, you need to replace the final few lines of your code as below:
preds = model.predict(X_test)
y_pred = np.where(preds > 0.5, 1, 0)
#y_pred = np.argmax(model.predict(X_test), axis=-1)
print(metrics.accuracy_score(y_test, y_pred))
Output:
1.0
I am building a neural network model for a classification problem that determines whether a customer will churn or not; the output is binary, 0 or 1. I also used a random forest model and an XGBoost model, and they all worked. I combined the random forest with XGBoost and that worked fine.
However, when I combined the random forest, XGBoost, and the neural network (Keras classifier) using the voting classifier, I got the error ValueError: could not broadcast input array from shape (2712,1) into shape (2712,).
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import cross_val_score
import numpy
# Function to create model, required for KerasClassifier
def create_model():
# create model
model = Sequential()
model.add(Dense(12, input_dim=17, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1,activation='sigmoid'))
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
return model
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# create model
Kc_model = KerasClassifier(build_fn=create_model)
#model.set_params(epochs=100, batch_size=10, verbose=0)
Kc_model._estimator_type = "classifier"
Kc_model.fit(X_train, y_train, epochs=100,batch_size=10)
print("The accuracy score for Keras Model is")
print("Test set: {}%".format(round(Kc_model.score(X_test, y_test)*100)))
The code for the voting classifier is below:
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.ensemble import VotingClassifier
import scikeras
from tensorflow import keras
voting = VotingClassifier(
estimators = [('rf',rf),('xgboost_model',xgboost_model),('Kc_model',Kc_model) ],
voting='hard')
#reshaping=y_test.reshape(2712,1)
voting_model =voting.fit(X_train, y_train)
voting_pred = voting_model.predict(X_test)
#Model Score
print("The accuracy score for Voting Classifier is")
print("Training:{}%".format(round(voting_model.score(X_train, y_train)*100)))
print("Test set: {}%".format(round(voting_model.score(X_test, y_test)*100)))
It seems like one of your classifiers outputs predictions with shape (2712, 1) (2D) and another with shape (2712,) (1D). VotingClassifier can't combine 2D and 1D predictions, hence the error.
I suggest looking carefully at the predictions of your classifiers and making them output predictions with the same number of dimensions.
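A quick way to diagnose this is to compare the prediction shapes directly; one possible workaround is a thin wrapper that flattens the Keras output before voting. This is a sketch, assuming the fitted rf, xgboost_model, Kc_model and create_model from above; the class name FlatKerasClassifier is hypothetical:
import numpy as np
from keras.wrappers.scikit_learn import KerasClassifier
# compare the dimensionality each estimator predicts with
for name, est in [('rf', rf), ('xgboost_model', xgboost_model), ('Kc_model', Kc_model)]:
    print(name, np.asarray(est.predict(X_test)).shape)
# wrapper whose predict() returns a 1D array, matching the sklearn estimators
class FlatKerasClassifier(KerasClassifier):
    def predict(self, x, **kwargs):
        return super().predict(x, **kwargs).ravel()
Kc_model = FlatKerasClassifier(build_fn=create_model)
Kc_model._estimator_type = "classifier"
# refit the voting classifier with the flattened wrapper before predicting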
I want to use eight features to predict a target feature, but while using Keras I get an accuracy of zero all the time. I am new to machine learning and quite confused.
I have tried different activations. I thought this could be a regression problem, so I used 'linear' as the last activation function, but the accuracy is still zero.
from sklearn import preprocessing
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split
import pandas as pd
# Step 2 - Load our data
zeolite_13X_error = pd.read_csv("zeolite_13X_error.csv", delimiter=",")
dataset = zeolite_13X_error.values
X = dataset[:, 0:8]
Y = dataset[:, 10] # Purity
min_max_scaler = preprocessing.MinMaxScaler()
X_scale = min_max_scaler.fit_transform(X)
X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(X_scale, Y, test_size=0.3)
X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.5)
# Building and training first NN
model = Sequential([
Dense(32, activation='relu', input_shape=(8,)),
Dense(32, activation='relu'),
Dense(1, activation='linear'),
])
model.compile(optimizer='sgd',
loss='binary_crossentropy',
metrics=['accuracy'])
hist = model.fit(X_train, Y_train,
batch_size=32, epochs=10,
validation_data=(X_val, Y_val))
If you decide to treat this as a regression problem, then:
- your loss should be mean_squared_error, or some other loss appropriate for regression, but not binary_crossentropy, which is appropriate for binary classification only, and
- accuracy is meaningless; it is meaningful only in classification settings. In regression settings we normally use the loss itself for performance evaluation (see my answer in "What function defines accuracy in Keras when the loss is mean squared error (MSE)?" for more).
If you decide to tackle this as a classification problem, you should change the activation of your last layer to sigmoid.
In any case, the combination you show here, loss='binary_crossentropy' with activation='linear' for the single-node last layer, is meaningless.
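For concreteness, a minimal sketch of the two consistent configurations, reusing the architecture from the question:
from keras.models import Sequential
from keras.layers import Dense
# option 1: regression - linear output, a regression loss, no accuracy metric
reg_model = Sequential([
    Dense(32, activation='relu', input_shape=(8,)),
    Dense(32, activation='relu'),
    Dense(1, activation='linear'),
])
reg_model.compile(optimizer='sgd', loss='mean_squared_error')
# option 2: binary classification - sigmoid output, binary cross-entropy, accuracy is meaningful
clf_model = Sequential([
    Dense(32, activation='relu', input_shape=(8,)),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid'),
])
clf_model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])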
Check the output of your model to see the values. The model is predicting probabilities rather than a binary 0/1 decision, which I believe is your case since you are using accuracy as a metric. If the model is predicting probabilities, convert them into 0 or 1 by rounding them against a threshold of your choice (e.g. if prediction > 0.5 then 1, else 0).
Also increase the number of epochs, and use a sigmoid activation in the output layer.
I have the code below, which works perfectly for a neural network. I know I need the confusion matrix library to find the false positive and false negative rates, but I'm not sure how to do it as I'm no expert in programming. Can someone help please?
import pandas as pd
from sklearn import preprocessing
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense
# read the csv file and convert into arrays for the machine to process
df = pd.read_csv('dataset_ori.csv')
dataset = df.values
# split the dataset into input features and the feature to predict
X = dataset[:,0:7]
Y = dataset[:,7]
# scale the dataset with min_max_scaler so that all the input features lie between 0 and 1
min_max_scaler = preprocessing.MinMaxScaler()
# store the dataset into an array
X_scale = min_max_scaler.fit_transform(X)
# split the dataset into 30% testing and the rest to train
X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(X_scale, Y, test_size=0.3)
# split the val_and_test size equally to the validation set and the test set.
X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.5)
# specify the sequential model and describe the layers that will form architecture of the neural network
model = Sequential([
    Dense(7, activation='relu', input_shape=(7,)),
    Dense(32, activation='relu'),
    Dense(5, activation='relu'),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# training the data
hist = model.fit(X_train, Y_train, batch_size=32, epochs=100, validation_data=(X_val, Y_val))
# evaluate the accuracy of the classifier on the test set
scores = model.evaluate(X_test, Y_test)
print("Accuracy: %.2f%%" % (scores[1]*100))
This is the code provided in the answer below; response and model are both highlighted in red as unresolved references.
from keras import models
from keras.layers import Dense, Dropout
from keras.utils import to_categorical
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
from keras.models import Sequential
from keras.layers import Dense, Activation
from sklearn import metrics
from sklearn.preprocessing import StandardScaler
# read the csv file and convert into arrays for the machine to process
df = pd.read_csv('dataset_ori.csv')
dataset = df.values
# split the dataset into input features and the feature to predict
X = dataset[:,0:7]
Y = dataset[:,7]
# Splitting into Train and Test Set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(dataset,
response,
test_size = 0.2,
random_state = 0)
# Initialising the ANN
classifier = Sequential()
# Adding the input layer and the first hidden layer
classifier.add(Dense(units = 10, kernel_initializer = 'uniform', activation = 'relu', input_dim =7 ))
model.add(Dropout(0.5))
# Adding the second hidden layer
classifier.add(Dense(units = 10, kernel_initializer = 'uniform', activation = 'relu'))
model.add(Dropout(0.5))
# Adding the output layer
classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
# Compiling the ANN
classifier.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
# Fitting the ANN to the Training set
classifier.fit(X_train, y_train, batch_size = 10, epochs = 20)
# Train model
scaler = StandardScaler()
classifier.fit(scaler.fit_transform(X_train.values), y_train)
# Summary of neural network
classifier.summary()
# Predicting the Test set results & Giving a threshold probability
y_prediction = classifier.predict_classes(scaler.transform(X_test.values))
print ("\n\naccuracy" , np.sum(y_prediction == y_test) / float(len(y_test)))
y_prediction = (y_prediction > 0.5)
#Let's see how our model performed
from sklearn.metrics import classification_report
print(classification_report(y_test, y_prediction))
Your inputs to confusion_matrix must be arrays of ints (class labels), not probabilities or one-hot encodings.
# Predicting the Test set results
y_pred = model.predict(X_test)
y_pred = (y_pred > 0.5).astype(int)  # threshold the sigmoid probabilities into 0/1
matrix = metrics.confusion_matrix(y_test, y_pred)
Since y_test here is a 1D binary array rather than one-hot encoded, no argmax is needed; thresholding the probabilities is enough.
The output would have come in the form below, so applying a probability threshold of 0.5 transforms it into binary labels.
output(y_pred):
[0.87812372 0.77490434 0.30319547 0.84999743]
The sklearn.metrics.accuracy_score(y_true, y_pred) method defines y_pred as:
y_pred : 1d array-like, or label indicator array / sparse matrix. Predicted labels, as returned by a classifier.
This means y_pred has to be an array of 1's or 0's (predicted labels). They should not be probabilities.
The root cause of your error is theoretical rather than computational: you are trying to use a classification metric (accuracy) in a regression (i.e. numeric prediction) model, which is meaningless.
Just like the majority of performance metrics, accuracy compares apples to apples (i.e. true labels of 0/1 with predictions again of 0/1); so, when you ask the function to compare binary true labels (apples) with continuous predictions (oranges), you get an expected error, whose message tells you exactly what the problem is from a computational point of view:
Classification metrics can't handle a mix of binary and continuous target
Although the message doesn't directly tell you that you are trying to compute a metric that is invalid for your problem (and we shouldn't actually expect it to go that far), it is certainly a good thing that scikit-learn at least gives you a direct and explicit warning that you are attempting something wrong; this is not necessarily the case with other frameworks - see for example the behavior of Keras in a very similar situation, where you get no warning at all and just end up complaining about low "accuracy" in a regression setting.
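A minimal sketch of this failure mode and the fix, with synthetic values for illustration:
import numpy as np
from sklearn.metrics import accuracy_score
y_true = np.array([0, 1, 1, 0])           # binary labels
y_prob = np.array([0.2, 0.8, 0.6, 0.4])   # continuous model outputs
# accuracy_score(y_true, y_prob) raises:
# ValueError: Classification metrics can't handle a mix of binary and continuous targets
y_pred = (y_prob > 0.5).astype(int)       # threshold into hard 0/1 labels first
print(accuracy_score(y_true, y_pred))     # now valid: prints 1.0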
(This answer's accompanying imports, model definition, training, and prediction code are identical to the block quoted in the question above.)
## EXTRA: Confusion Matrix Visualize
import matplotlib.pyplot as plt
import seaborn as sn
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_prediction) # rows = truth, cols = prediction
df_cm = pd.DataFrame(cm, index = (0, 1), columns = (0, 1))
plt.figure(figsize = (10,7))
sn.set(font_scale=1.4)
sn.heatmap(df_cm, annot=True, fmt='g')
print("Test Data Accuracy: %0.4f" % accuracy_score(y_test, y_prediction))
#Let's see how our model performed
from sklearn.metrics import classification_report
print(classification_report(y_test, y_prediction))
As you have already imported confusion_matrix from scikit-learn, you can use it like this:
cutoff = 0.5
y_pred = model.predict(x_test)          # predicted probabilities
y_pred_classes = np.zeros_like(y_pred)  # initialise an array full of zeros
y_pred_classes[y_pred > cutoff] = 1
y_test_classes = np.zeros_like(y_pred)
y_test_classes[y_test > cutoff] = 1
print(confusion_matrix(y_test_classes, y_pred_classes))
the confusion matrix from scikit-learn is always ordered like this:
True negatives    False positives
False negatives   True positives
To get tn and the other counts individually, you can run this:
tn, fp, fn, tp = confusion_matrix(y_test_classes, y_pred_classes).ravel()
(tn, fp, fn, tp)
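From those counts, the false positive and false negative rates the question asks about follow directly (a sketch, using the variables unpacked above):
fpr = fp / (fp + tn)   # false positive rate: share of actual negatives flagged as positive
fnr = fn / (fn + tp)   # false negative rate: share of actual positives that were missed
print("FPR: %.4f, FNR: %.4f" % (fpr, fnr))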
I have coded ANN classifiers using Keras, and now I am teaching myself to code RNNs in Keras for text and time-series prediction. After searching the web for a while I found this tutorial by Jason Brownlee, which is decent for a novice learner of RNNs. The original article uses the IMDb dataset for text classification with an LSTM, but because of its large size I changed it to a small SMS spam detection dataset.
# LSTM with dropout for sequence classification in the IMDB dataset
import numpy
from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
import pandas as pd
from sklearn.model_selection import train_test_split
# fix random seed for reproducibility
numpy.random.seed(7)
url = 'https://raw.githubusercontent.com/justmarkham/pydata-dc-2016-tutorial/master/sms.tsv'
sms = pd.read_table(url, header=None, names=['label', 'message'])
# convert label to a numerical variable
sms['label_num'] = sms.label.map({'ham':0, 'spam':1})
X = sms.message
y = sms.label_num
print(X.shape)
print(y.shape)
# load the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
top_words = 5000
# truncate and pad input sequences
max_review_length = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
# create the model
embedding_vecor_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length, dropout=0.2))
model.add(LSTM(100, dropout_W=0.2, dropout_U=0.2))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, nb_epoch=3, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
I have successfully processed the dataset into training and testing sets, but now how should I model my RNN for this dataset?
You need to represent the raw text data as numeric vectors before training a neural network model. For this, you can use the CountVectorizer or TfidfVectorizer provided by scikit-learn. After converting from raw text to a numeric vector representation, you can train an RNN/LSTM/CNN for the text classification problem.
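Note that CountVectorizer/TfidfVectorizer produce bag-of-words vectors suited to feed-forward models; the Embedding/LSTM stack in the question instead expects padded sequences of integer word indices. A minimal sketch of that route with the Keras Tokenizer, assuming the X_train and X_test Series from the question:
from keras.preprocessing.text import Tokenizer
from keras.preprocessing import sequence
top_words = 5000
max_review_length = 500
# build a vocabulary from the training messages, then map each message
# to a sequence of integer word indices
tok = Tokenizer(num_words=top_words)
tok.fit_on_texts(X_train)
X_train_pad = sequence.pad_sequences(tok.texts_to_sequences(X_train), maxlen=max_review_length)
X_test_pad = sequence.pad_sequences(tok.texts_to_sequences(X_test), maxlen=max_review_length)
These padded arrays can then be passed to model.fit in place of the raw text.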
If you are still stuck on this, check out this example by Jason Brownlee. It looks like you are most of the way there: you need an LSTM layer and a Dense layer to get a model that should work.