Keras Sequential model loss won't decrease & remains constant through all epochs - python

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
df = pd.read_csv('insurance.csv')
X = df.drop(['sex', 'children', 'smoker', 'region'], axis = 1)
X = X.values
y = df['charges']
y = y.values.reshape(1331,1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 75)
from keras import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(5, activation = 'sigmoid'))
model.add(Dense(4, activation = 'sigmoid'))
model.add(Dense(1, activation = 'sigmoid'))
from keras import optimizers
sgd = optimizers.SGD(lr=0.1)
model.compile(optimizer=sgd, loss='mse')
model.fit(X_train, y_train, batch_size=32, epochs=100, shuffle=False)
This is my code. The data I am feeding in is all numerical, and I have tried different hyperparameters, but nothing seems to work. I just don't know what is going wrong here.
Any help would be much appreciated.

If you are indeed in a regression setting (as implied by your choice of loss, MSE) and not in a classification one, the basic mistake in your code is the activation of your last layer, which should be linear:
model.add(Dense(1, activation = 'linear'))
Of course, several other things may be going wrong with your approach, including the architecture of the model itself (there is no guarantee that whatever architecture you throw at your data will produce decent results, and your model looks too simple) and the activation functions of the hidden layers (nowadays we usually start with relu), but it is impossible to say more without knowing your data.
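As an illustration, here is a minimal sketch of those two changes (relu hidden layers, linear output), keeping your data preparation; the layer sizes and learning rate are just carried over from your code and may still need tuning:
from keras import Sequential, optimizers
from keras.layers import Dense

model = Sequential()
model.add(Dense(5, activation='relu'))    # relu instead of sigmoid in the hidden layers
model.add(Dense(4, activation='relu'))
model.add(Dense(1, activation='linear'))  # linear output for regression

sgd = optimizers.SGD(lr=0.1)              # consider a smaller learning rate if the loss diverges
model.compile(optimizer=sgd, loss='mse')
model.fit(X_train, y_train, batch_size=32, epochs=100)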

Related

How to fix the Keras example in the VS Code documentation?

I've never used Keras or TensorFlow before, and was going through this example in the Visual Studio Code documentation, but it seems to have a bug. The documentation shows that their trained model has a 61% accuracy against the test data, which matches what I get when I run it. However, no matter how you modify the neural network parameters, you always get the exact same accuracy. You can even skip the compile and fit commands and still get 61% accuracy.
It turns out that the prediction results they got were all zeroes (which happened to be right 61% of the time against the test data), and no matter how I modify the network it only outputs all zeroes, so it seems like there's some mistake in their code. But since I don't know Keras or TF, I haven't been able to figure out how to make it work.
Here's what I think all the relevant code is, but you can check the link above for everything:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(data[['sex','pclass','age','relatives','fare']], data.survived, test_size=0.2, random_state=0)
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(x_train)
X_test = sc.transform(x_test)
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(5, kernel_initializer = 'uniform', activation = 'relu', input_dim = 5))
model.add(Dense(5, kernel_initializer = 'uniform', activation = 'relu'))
model.add(Dense(1, kernel_initializer = 'uniform', activation = 'sigmoid'))
model.compile(optimizer="adam", loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=32, epochs=50)
import numpy as np
from sklearn import metrics
y_pred = np.argmax(model.predict(X_test), axis=-1)
print(metrics.accuracy_score(y_test, y_pred))
(as mentioned by @Frightera)
np.argmax() is generally used to get the index of the maximum value when there are more than 2 class probabilities. This is a binary classification model, and you have used the sigmoid activation function in the last layer, which always returns an output value between 0 and 1. Since model.predict(X_test) returns a single column here, np.argmax along the last axis always returns 0, which is why your predictions are all zeroes.
Which means: for small values (< 0.5) the output should be classified as 0, and for large values (> 0.5) as 1.
Hence, you need to replace the final few lines of your code as below:
preds = model.predict(X_test)
y_pred = np.where(preds > 0.5, 1, 0)
#y_pred = np.argmax(model.predict(X_test), axis=-1)
print(metrics.accuracy_score(y_test, y_pred))
Output:
1.0

Keras accuracy returning 0

So basically, I am working on this bullet optimization program. I wish to study how different ballistics parameters such as weight, length, and mass affect the ballistic coefficient. However, my training accuracy is 0, although there is loss and val_loss. I've read similar Stack Overflow posts about this, but none have helped me so far; perhaps I just didn't apply them right. I am referencing https://stackoverflow.com/a/63513872/12349188
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.utils import shuffle
df = pd.read_csv(r'Bullet Optimization\ShootForum Bullet DB_2.csv')  # raw string so the backslash is not treated as an escape
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
dataset = df.values
X = dataset[:,0:12]
X = np.asarray(X).astype(np.float32)
y = dataset[:,13]
y = np.asarray(y).astype(np.float32)
X_train, X_val_and_test, y_train, y_val_and_test = train_test_split(X, y, test_size=0.3, shuffle=True)
X_val, X_test, y_val, y_test = train_test_split(X_val_and_test, y_val_and_test, test_size=0.5)
from keras.models import Sequential
from keras.layers import Dense, BatchNormalization
model = Sequential(
    [
        # 2430 is the shape of X_train
        # BatchNormalization(axis=-1, momentum=0.1),
        Dense(32, activation='relu'),
        Dense(32, activation='relu'),
        Dense(1, activation='softmax'),
    ]
)
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(X_train, y_train, batch_size=64, epochs=100, validation_data=(X_val, y_val))
Did I do something wrong in my code? I know some python but I just kind of built upon the tutorials for my own purposes.
There are several problems in your code.
First this line:
Dense(1,activation='softmax')
This layer will output 1 every time: softmax outputs sum to one, so with a single neuron it always returns exactly 1. So even if you were doing classification, your accuracy would be around 50% if you had 2 (balanced) classes; using softmax with one neuron does not make sense.
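A quick way to see this (a standalone sketch with a made-up logit value):
import numpy as np

logit = np.array([2.7])                       # any single raw output value
print(np.exp(logit) / np.sum(np.exp(logit)))  # [1.] -- softmax over a single unit is always 1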
You need to change your loss and metric, as this is a regression problem:
loss='mse', metrics=['mse']
Also, your output neuron should be linear, meaning it does not need any activation function. It should look like:
Dense(1)
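Putting these together, a minimal corrected sketch (keeping your two hidden layers and training call; the optimizer and layer sizes are just carried over from your code) could be:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential(
    [
        Dense(32, activation='relu'),
        Dense(32, activation='relu'),
        Dense(1),  # linear output for regression
    ]
)
model.compile(optimizer='sgd', loss='mse', metrics=['mse'])
history = model.fit(X_train, y_train, batch_size=64, epochs=100, validation_data=(X_val, y_val))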

I am having problems with tensorflow library in anaconda program

There are 1000 data points; of these, 670 are train and 330 are test.
model.fit(x_train, y_train, epochs = 250)
When I run this it should show 670 training samples, but in my case it shows 21.
CODE
import pandas as pd
dataFrame = pd.read_excel("bisiklet_fiyatlari.xlsx")
from sklearn.model_selection import train_test_split
y=dataFrame["Fiyat"].values
x=dataFrame[["BisikletOzellik1","BisikletOzellik2"]].values
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=15)
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
scaler.fit(x_train)
x_train=scaler.transform(x_train)
x_test=scaler.transform(x_test)
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
model = Sequential()
model.add(Dense(4, activation = "relu"))
model.add(Dense(4, activation = "relu"))
model.add(Dense(4, activation = "relu"))
model.add(Dense(1))
model.compile(optimizer="rmsprop",loss="mse")
model.fit(x_train, y_train, epochs = 250)
The reason you are getting 21/21 in the step count of the fit method is that, since you didn't specify a batch_size argument, it defaults to 32 (see the documentation of the fit method), and 670 / 32 ≈ 20.94, which rounds up to 21 steps per epoch.
Regarding why your loss is so high, I cannot tell from the information provided; you would need to show the data, the code used to preprocess it, etc.
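For example (a small sketch reusing your x_train and y_train), passing batch_size explicitly makes the step count predictable:
# ceil(670 / 32) = 21 steps per epoch with the default batch size of 32;
# batch_size=1 would instead give 670 steps per epoch (one sample per step)
model.fit(x_train, y_train, epochs=250, batch_size=32)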

Simple Keras ML model for predicting multiplication isn't working

I have created a simple machine learning model to predict the multiplication of two given numbers. I followed a YouTube tutorial to learn the basics and tried to work on this simple idea.
My model has three dense layers: input, hidden, and output. The input and hidden layers were both using the 'relu' activation function, which gave me a NaN loss on model fit, so I changed one of them to sigmoid, which then started giving me a loss of something like 0.00000e+...
I don't know what is wrong. Can anyone please point out what I am doing or assuming wrong?
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv('data.csv')
print(df)
x = np.array(df['X'])
y = np.array(df['Y'])
s = np.array(df['S'])
def build_model():
    model = keras.Sequential()
    inputLayer = layers.Dense(64, activation='sigmoid', input_shape=[2])
    hiddenLayer = layers.Dense(64, activation='relu')
    outputLayer = layers.Dense(1)
    model.add(inputLayer)
    model.add(hiddenLayer)
    model.add(outputLayer)
    model.compile(optimizer='sgd', loss='mean_squared_error', metrics=['accuracy'])
    return model
model = build_model()
print(model.summary())
EPOCHS = 1000
# I didn't know how to provide mulitple input to my model for
# training so I checked stackoverflow here
# https://stackoverflow.com/questions/55233377/keras-sequential-model-with-multiple-inputs?noredirect=1&lq=1
merged_array = np.stack([x, y], axis=1)
history = model.fit(merged_array, s, epochs=EPOCHS, validation_split = 0.2, verbose=2)
print(history)
print(model.predict([[2,3],]))
Disclaimer: I am a beginner and I have just started using keras and python for the first time in my life.
It does work for smaller numbers with ReLU activation.
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
x = np.random.randint(0, 10, 1000)
y = np.random.randint(0, 10, 1000)
s = x*y
def build_model():
    model = keras.Sequential()
    model.add(layers.Dense(64, activation='relu', input_shape=[2]))
    model.add(layers.Dense(64, activation='relu'))
    model.add(layers.Dense(1))
    model.compile(optimizer=keras.optimizers.Adam(lr=0.01),
                  loss='mean_squared_error')
    return model

model = build_model()
merged_array = np.stack([x, y], axis=1)
history = model.fit(merged_array, s, epochs=250,
                    validation_split=0.2)

test_input = [2, 3]
print('\n{} x {} ='.format(*test_input),
      np.round(model.predict([test_input])[0][0]).astype(int))
2 x 3 = 6
SGD also works, but it requires standardization/normalization of the inputs, which kind of defeats the purpose of your task, so I switched to Adam above. Still, the SGD version below works too.
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
x = np.random.randint(0, 10, 1000)
y = np.random.randint(0, 10, 1000)
s = x*y
x = x/10
y = y/10
def build_model():
    model = keras.Sequential()
    model.add(layers.Dense(64, activation='relu', input_shape=[2]))
    model.add(layers.Dense(64, activation='relu'))
    model.add(layers.Dense(1))
    model.compile(optimizer=keras.optimizers.SGD(0.001), loss='mean_squared_error')
    return model

model = build_model()
merged_array = np.stack([x, y], axis=1)
history = model.fit(merged_array, s, epochs=250,
                    validation_split=0.2, batch_size=16)

test_input = [2/10, 3/10]
print('\n{} x {} ='.format(*map(lambda l: int(l*10), test_input)),
      np.round(model.predict([test_input])[0][0]).astype(int))
I noticed a couple of issues with your model:
Your input layer is not an input. You do not need a designated input layer in this case; the argument input_shape=[2] is sufficient to add a proper input layer before this layer.
You do not set any batch_size in the fit function. Batches are usually a small subset of your training and validation set (commonly powers of two such as 4, 8, 16, 32, ...). During training, the weights are not adjusted via backpropagation (aka "learning") after every single sample but after each batch, which makes training faster. Since your inputs are just two floating-point numbers (I assume), you can choose a really high batch size like 1024 or higher. The batch size belongs to the so-called hyperparameters, which affect your overall training success.
history = model.fit(merged_array, s, batch_size=1024, epochs=EPOCHS, validation_split=0.2, verbose=2)
During training you track the "accuracy" metric. As you are working on a regression problem, this does not help you estimate your model's performance (accuracy is used for classification problems), so you can leave it out, as in the sketch below.
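Putting the batch size and metric points together, a minimal sketch (reusing model, merged_array, s, and EPOCHS from your code) could look like:
model.compile(optimizer='sgd', loss='mean_squared_error')  # re-compile without the accuracy metric
history = model.fit(merged_array, s, batch_size=1024, epochs=EPOCHS,
                    validation_split=0.2, verbose=2)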
I cannot give you more specific advice without knowledge of the data you are using: how many data points you have and what kind of numbers you want to multiply (bounded between 0 and 10, floats or integers, ...).
Hope this helps so far (;

Accuracy is zero all the time

I want to use eight features to predict a target feature, and while using Keras I get an accuracy of zero all the time. I am new to machine learning, and I am quite confused.
I have tried different activations. I thought this could be a regression problem, so I used 'linear' as the last activation function, but it turns out that the accuracy is still zero.
from sklearn import preprocessing
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split
import pandas as pd
# Step 2 - Load our data
zeolite_13X_error = pd.read_csv("zeolite_13X_error.csv", delimiter=",")
dataset = zeolite_13X_error.values
X = dataset[:, 0:8]
Y = dataset[:, 10] # Purity
min_max_scaler = preprocessing.MinMaxScaler()
X_scale = min_max_scaler.fit_transform(X)
X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(X_scale, Y, test_size=0.3)
X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.5)
# Building and training first NN
model = Sequential([
    Dense(32, activation='relu', input_shape=(8,)),
    Dense(32, activation='relu'),
    Dense(1, activation='linear'),
])
model.compile(optimizer='sgd',
              loss='binary_crossentropy',
              metrics=['accuracy'])
hist = model.fit(X_train, Y_train,
                 batch_size=32, epochs=10,
                 validation_data=(X_val, Y_val))
If you decide to treat this as a regression problem, then
Your loss should be mean_squared_error, or some other loss appropriate for regression, but not binary_crossentropy, which is appropriate for binary classification only, and
Accuracy is meaningless - it is meaningful only in classification settings; in regression settings we normally use the loss itself for performance evaluation - see my own answer to What function defines accuracy in Keras when the loss is mean squared error (MSE)? for more.
If you decide to tackle this as a classification problem, you should change the activation of your last layer to sigmoid.
In any case, the combination you show here - loss='binary_crossentropy' and activation='linear' for the single-node last layer - is meaningless.
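For the regression route, a minimal sketch of the changes (keeping your architecture and training call; only the loss and metrics differ) might be:
model = Sequential([
    Dense(32, activation='relu', input_shape=(8,)),
    Dense(32, activation='relu'),
    Dense(1, activation='linear'),
])
model.compile(optimizer='sgd', loss='mean_squared_error')  # no accuracy metric in regression
hist = model.fit(X_train, Y_train, batch_size=32, epochs=10, validation_data=(X_val, Y_val))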
Check the output of your model to inspect the values. The model is predicting probabilities rather than binary 0/1 decisions, which I believe is what you want, since you are using accuracy as a metric. If the model is predicting probabilities, convert them into 0 or 1 by rounding them based on a threshold of your choice (i.e. if prediction > 0.5 then 1, else 0).
Also increase the number of epochs, and use a sigmoid activation in the output layer.
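For example, a small sketch of that thresholding step (assuming a sigmoid output layer and a 0.5 threshold) could be:
import numpy as np

probs = model.predict(X_val)           # probabilities in [0, 1] from a sigmoid output
Y_pred = np.where(probs > 0.5, 1, 0)   # hard 0/1 labels via the 0.5 threshold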
