I am working on a bullet optimization program and want to study how different ballistics parameters, such as weight, length, and mass, affect the ballistic coefficient. However, my training accuracy stays at 0, even though loss and val_loss are reported. I've read similar Stack Overflow posts about this, but none have helped so far; perhaps I just didn't apply them correctly. I am referencing https://stackoverflow.com/a/63513872/12349188
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
from sklearn import preprocessing

df = pd.read_csv('Bullet Optimization\ShootForum Bullet DB_2.csv')
dataset = df.values

# Features: first 12 columns; target: column 13
X = dataset[:, 0:12]
X = np.asarray(X).astype(np.float32)
y = dataset[:, 13]
y = np.asarray(y).astype(np.float32)

# 70% train, 15% validation, 15% test
X_train, X_val_and_test, y_train, y_val_and_test = train_test_split(X, y, test_size=0.3, shuffle=True)
X_val, X_test, y_val, y_test = train_test_split(X_val_and_test, y_val_and_test, test_size=0.5)
from keras.models import Sequential
from keras.layers import Dense, BatchNormalization

model = Sequential([
    # 2430 is the shape of X_train
    # BatchNormalization(axis=-1, momentum=0.1),
    Dense(32, activation='relu'),
    Dense(32, activation='relu'),
    Dense(1, activation='softmax'),
])

model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(X_train, y_train, batch_size=64, epochs=100, validation_data=(X_val, y_val))
Did I do something wrong in my code? I know some Python, but I mostly built on tutorials for my own purposes.
There are several problems in your code.
First, this line:
Dense(1, activation='softmax')
will output 1 every time. Softmax outputs sum to one, so using it with a single neuron makes no sense; even in a two-class classification setting your accuracy would be stuck at 50%.
Since this is a regression problem, you also need to change your loss and metric:
loss='mse', metrics=['mse']
Finally, your output neuron should be linear, which means it does not need any activation function:
Dense(1)
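Putting those changes together, a minimal sketch of the corrected setup could look like this (it reuses the layer sizes and the X_train / y_train / X_val / y_val variables from your snippet, which is an assumption on my part):
# Minimal sketch of the regression version of the model described above.
from keras.models import Sequential
from keras.layers import Dense

reg_model = Sequential([
    Dense(32, activation='relu'),
    Dense(32, activation='relu'),
    Dense(1),  # linear output: no activation for regression
])

reg_model.compile(optimizer='sgd', loss='mse', metrics=['mse'])
history = reg_model.fit(X_train, y_train, batch_size=64, epochs=100,
                        validation_data=(X_val, y_val))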
I am really new to deep learning. I have a task that asks: evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from scikit-learn.
Here is my code:
import pandas as pd
from tensorflow.python.keras import Sequential
from tensorflow.python.keras.layers import Dense
from sklearn.model_selection import train_test_split
concrete_data = pd.read_csv('https://cocl.us/concrete_data')
n_cols = concrete_data.shape[1]
model = Sequential()
model.add(Dense(units=10, activation='relu', input_shape=(n_cols-1,)))
model.compile(loss='mean_squared_error',
optimizer='adam')
y = concrete_data.Cement
x = concrete_data.drop('Cement', axis=1)
xTrain, xTest, yTrain, yTest = train_test_split(x, y, test_size = 0.3)
model.fit(xTrain, yTrain, epochs=50)
To evaluate the mean squared error, I then wrote this:
from sklearn.metrics import mean_squared_error
predicted_y = model.predict(xTest)
mean_squared_error(yTest, predicted_y)
and I got this error:
y_true and y_pred have different number of output (1!=10)
My predicted_y shape is (309, 10).
I googled it but really couldn't find an answer to this problem. I don't know what is wrong with my code.
Your y_test data has shape (N, 1), but because you put 10 neurons in the output layer, your model makes 10 predictions per sample, which is what causes the error.
You need to change the number of neurons in the output layer to 1, or add a new output layer that has only 1 neuron.
The code below should work for you.
import pandas as pd
from tensorflow.python.keras import Sequential
from tensorflow.python.keras.layers import Dense
from sklearn.model_selection import train_test_split
concrete_data = pd.read_csv('https://cocl.us/concrete_data')
n_cols = concrete_data.shape[1]
model = Sequential()
model.add(Dense(units=10, activation='relu', input_shape=(n_cols-1,)))
model.add(Dense(units=1))  # single output neuron for regression
model.compile(loss='mean_squared_error', optimizer='adam')
y = concrete_data.Cement
x = concrete_data.drop('Cement', axis=1)
xTrain, xTest, yTrain, yTest = train_test_split(x, y, test_size = 0.3)
model.fit(xTrain, yTrain, epochs=50)
What you actually want to check is the mean_squared_error between the test targets and predicted_y.
First get the model's predictions on the test features:
predicted_y = model.predict(xTest)
Then you can calculate the error:
mean_squared_error(yTest, predicted_y)
Try this, it worked for me:
y_pred = model.predict(xTest).sum(axis=1)
There are 1000 rows of data; 670 of them are the training set and 330 the test set.
model.fit(x_train, y_train, epochs = 250)
When I run this, it should show 670 training samples, but in my case it shows 21.
CODE
import pandas as pd
dataFrame = pd.read_excel("bisiklet_fiyatlari.xlsx")
from sklearn.model_selection import train_test_split
y=dataFrame["Fiyat"].values
x=dataFrame[["BisikletOzellik1","BisikletOzellik2"]].values
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=15)
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
scaler.fit(x_train)
x_train=scaler.transform(x_train)
x_test=scaler.transform(x_test)
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
model = Sequential()
model.add(Dense(4, activation = "relu"))
model.add(Dense(4, activation = "relu"))
model.add(Dense(4, activation = "relu"))
model.add(Dense(1))
model.compile(optimizer="rmsprop",loss="mse")
model.fit(x_train, y_train, epochs = 250)
The reason you are getting 21/21 in the step count of the fit method is that, since you didn't specify a batch_size argument, it defaults to 32 (see the documentation of the fit method), and 670 / 32 = 20.94, which rounds up to 21. The 21 is therefore the number of batches per epoch, not the number of training samples.
As for why your loss is so high, I cannot tell from the data provided; you would need to show the data, the code used to preprocess it, and so on.
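For example, passing batch_size explicitly makes the step count predictable (a small sketch, assuming the 670 training rows from your split):
# Sketch: steps per epoch = ceil(number of training rows / batch_size).
model.fit(x_train, y_train, epochs=250, batch_size=10)  # ceil(670 / 10) = 67 steps
model.fit(x_train, y_train, epochs=250, batch_size=1)   # 670 steps, one sample per step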
I'm making a project where I have to predict whether a house price is above or below its median price, using this dataset from Kaggle (https://drive.google.com/file/d/1GfvKA0qznNVknghV4botnNxyH-KvODOC/view). 1 means "above median" and 0 means "below median". I wrote this code to train a neural network and save it as a .h5 file:
import pandas as pd
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense
import h5py
df = pd.read_csv('housepricedata.csv')
dataset = df.values
X = dataset[:,0:10]
Y = dataset[:,10]
min_max_scaler = preprocessing.MinMaxScaler()
X_scale = min_max_scaler.fit_transform(X)
X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(X_scale, Y, test_size=0.3)
X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.5)
model = Sequential([
Dense(32, activation='relu', input_shape=(10,)),
Dense(32, activation='relu'),
Dense(1, activation='sigmoid'),
])
model.compile(optimizer='sgd',
loss='binary_crossentropy',
metrics=['accuracy'])
hist = model.fit(X_train, Y_train,
batch_size=32, epochs=100,
validation_data=(X_val, Y_val))
model.save("house_price.h5")
After running it, it successfully saves the .h5 file to my directory. What I want to do now is use my trained model to make predictions on a new .csv file and determine whether each row is above or below the median price. This is an image of the csv file in VSCode that I want it to make predictions on:
(image of the csv file) As you can see, this file doesn't contain a 1 (above median) or 0 (below median), because that's what I want it to predict. This is the code I wrote to do that:
import pandas as pd
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense
from keras.models import load_model
import h5py
df = pd.read_csv('data.csv')
dataset = df.values
X = dataset[:,0:10]
Y = dataset[:,10]
min_max_scaler = preprocessing.MinMaxScaler()
X_scale = min_max_scaler.fit_transform(X)
X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(X_scale, Y, test_size=0.3)
X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.5)
model = load_model("house_price.h5")
y_pred = model.predict(X_test)
print(y_pred)
Its output is [[0.00101464]]. I have no clue what that is or why it only returns one value even though the csv file has 4 rows. Does anyone know how I can fix that so it predicts either a 1 or a 0 for each row in the csv file?
Thank you!
As far as I understand what you want, let's try this. The following code works for me:
import tensorflow
model = tensorflow.keras.models.load_model("house_price.h5")
y_pred = model.predict(X_test)
If you are still not able to get it working, visit the following answers:
1: answer1
2: answer2
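If the goal is one prediction per row of the new csv, a minimal sketch could look like the following (it assumes data.csv contains only the 10 feature columns; ideally you would reuse the scaler fitted during training rather than fitting a new one on 4 rows):
# Sketch: predict a 0/1 label for every row of the new csv with the saved model.
import pandas as pd
from sklearn import preprocessing
from tensorflow.keras.models import load_model

new_df = pd.read_csv('data.csv')
X_new = new_df.values[:, 0:10].astype('float32')

# Stand-in scaling; reusing the training scaler would be more correct.
X_new = preprocessing.MinMaxScaler().fit_transform(X_new)

model = load_model('house_price.h5')
probs = model.predict(X_new)              # sigmoid outputs between 0 and 1
labels = (probs > 0.5).astype(int)        # 1 = above median, 0 = below median
print(labels)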
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing the dataset
dataset = pd.read_csv('C:\\Users\\acer\\Downloads\\housepricedata.csv')
dataset.head()
X=dataset.iloc[:,0:10]
y=dataset.iloc[:,10]
X.head()
from sklearn.preprocessing import StandardScaler
obj=StandardScaler()
X=obj.fit_transform(X)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2020, test_size=0.25)
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
classifier = Sequential()
# Adding the input layer and the first hidden layer
classifier.add(Dense(units=6, kernel_initializer='uniform', activation='relu', input_dim=10))
# classifier.add(Dropout(p=0.1))
# Adding the second hidden layer
classifier.add(Dense(units=6, kernel_initializer='uniform', activation='relu'))
# classifier.add(Dropout(p=0.1))
# Adding the output layer
classifier.add(Dense(units=1, kernel_initializer='uniform', activation='sigmoid'))
# Compiling the ANN
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
classifier.fit(X_train, y_train, batch_size=10, epochs=100)
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)
print(y_pred)
classifier.save("house_price.h5")
import tensorflow
model = tensorflow.keras.models.load_model("house_price.h5")
y_pred=model.predict(X_test)
y_pred = (y_pred > 0.5)
print(y_pred)
Both y_pred calls produce the same output for me.
One thing to note: y_pred does not contain 0s and 1s, because the sigmoid activation outputs a probability. So if y_pred > 0.5, the value is treated as 1 (True represents 1, False represents 0). You can use pandas' replace or map function to convert True into 1.
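For instance, a small sketch of that conversion (using the model and X_test from the code above):
# Sketch: turn the boolean predictions into 0/1 labels.
import pandas as pd

probs = model.predict(X_test)                       # sigmoid probabilities
labels = (probs > 0.5).astype(int)                  # NumPy: True -> 1, False -> 0

# Equivalent with pandas' map, as mentioned above:
labels_series = pd.Series(probs.ravel() > 0.5).map({True: 1, False: 0})
print(labels[:5])
print(labels_series.head())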
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
df = pd.read_csv('insurance.csv')
X = df.drop(['sex', 'children', 'smoker', 'region'], axis = 1)
X = X.values
y = df['charges']
y = y.values.reshape(1331,1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 75)
from keras import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(5, activation = 'sigmoid'))
model.add(Dense(4, activation = 'sigmoid'))
model.add(Dense(1, activation = 'sigmoid'))
from keras import optimizers
sgd = optimizers.SGD(lr=0.1)
model.compile(sgd, 'mse')
model.fit(X_train, y_train, 32, 100, shuffle=False)
This is my code. The data I am feeding in is all numerical, and I have tried different hyperparameters; nothing seems to work.
Any help would be much appreciated. I just don't know what is going wrong here.
If you are indeed in a regression setting (as implied by your choice of loss, MSE) and not a classification one, the basic mistake in your code is the activation of your last layer, which should be linear:
model.add(Dense(1, activation = 'linear'))
Of course, several other things could be going wrong with your approach, including the architecture of the model itself (there is no guarantee that whatever architecture you throw at your data will produce decent results, and your model looks too simple) and the activation functions of the other layers (nowadays we usually start with relu), but it is impossible to say more without knowing your data.
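For reference, a sketch with those two suggestions applied, relu in the hidden layers and a linear output, reusing X_train / y_train and the layer sizes from the question (the sizes are not a recommendation):
# Sketch: relu hidden layers and a linear output neuron for regression.
from keras import Sequential, optimizers
from keras.layers import Dense

model = Sequential()
model.add(Dense(5, activation='relu'))
model.add(Dense(4, activation='relu'))
model.add(Dense(1, activation='linear'))   # linear output for a regression target

model.compile(optimizers.SGD(learning_rate=0.1), 'mse')
model.fit(X_train, y_train, batch_size=32, epochs=100, shuffle=False)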