Getting a prediction from Keras with a 1-D array - python
I have the following code:
from numpy import loadtxt
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from time import sleep
dataset = loadtxt('dataset.csv', delimiter=',')
X = dataset[:,0:8]
y = dataset[:,8]
model = Sequential()
model.add(Dense(192, input_dim=8, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=600, batch_size=10)
_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))
When I run it, it trains no problem, and most of the time I even get 100% accuracy, but I'm having trouble getting predictions from the model. As you can see from the following sample of the training data, the first 8 entries are inputs, and a 1 or 0 is the output.
6,148,72,35,0,33.6,0.627,50,1
1,85,66,29,0,26.6,0.351,31,0
8,183,64,0,0,23.3,0.672,32,1
1,89,66,23,94,28.1,0.167,21,0
0,137,40,35,168,43.1,2.288,33,1
5,116,74,0,0,25.6,0.201,30,0
3,78,50,32,88,31.0,0.248,26,1
10,115,0,0,0,35.3,0.134,29,0
2,197,70,45,543,30.5,0.158,53,1
8,125,96,0,0,0.0,0.232,54,1
4,110,92,0,0,37.6,0.191,30,0
10,168,74,0,0,38.0,0.537,34,1
10,139,80,0,0,27.1,1.441,57,0
1,189,60,23,846,30.1,0.398,59,1
5,166,72,19,175,25.8,0.587,51,1
7,100,0,0,0,30.0,0.484,32,1
0,118,84,47,230,45.8,0.551,31,1
7,107,74,0,0,29.6,0.254,31,1
What I want is to enter "6,148,72,35,0,33.6,0.627,50" into the code and have the model give me an output based on that. What should I do?
Alright, that was fast, but it occurred to me that I just needed to wrap another list around the one that was already there, making it a 2-D array and allowing Keras to make a prediction.
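For anyone landing here, a minimal sketch of that fix, assuming the trained model from the code above. The extra pair of brackets turns the single sample into a batch of one, i.e. an array of shape (1, 8), which is the shape predict expects:

import numpy as np

# one row of 8 features, wrapped in an outer list -> shape (1, 8)
sample = np.array([[6, 148, 72, 35, 0, 33.6, 0.627, 50]])

prediction = model.predict(sample)      # sigmoid output between 0 and 1
print(prediction)
print((prediction > 0.5).astype(int))   # threshold to get the 0/1 class label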
Related
Train a neural network (ANN) with 8 input and 8 output features and predict a result for one unseen input feature
I tried to train a neural network with a CSV data file that contains both input (3560 x 8) and output (3560 x 8) values.

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow import keras

# Load the data
dataframe = pd.read_csv("ANN.csv", header=None)
dataset = dataframe.values

# Assign the columns of the dataframe to the input and output arrays for the ANN
X_input_dataset = dataset[:, 0:8]
Y_output_dataset = dataset[:, 8:16]

# Sequential model
model = Sequential()

# Add the different layers
model.add(keras.layers.Flatten(input_shape=(8,)))
model.add(Dense(50, activation='relu'))
model.add(Dense(40, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(1, activation='linear'))

# Configure the model and start training
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mean_absolute_percentage_error'])
history = model.fit(X_input_dataset, Y_output_dataset, epochs=2000, batch_size=10, verbose=1, validation_split=0.3)

# Predict values
x_new = X_input_dataset[:, 0]
y_new = model.predict(x_new)
print(y_new)

But during the prediction for one column of new unseen input (3560 x 1) with the training data itself, I get an error due to the input shape. The neural network is expecting 8 features (3560 x 8) as input to predict the new y (3560 x 1). Please help me with this.
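Since the network was trained on 8 input features, predict also needs all 8 columns for each sample, not a single column. A hedged sketch of what that looks like, reusing the variable names from the question:

# slice rows (keeping all 8 feature columns), not a single column
x_new = X_input_dataset[0:1, :]   # shape (1, 8): one sample with 8 features
y_new = model.predict(x_new)
print(y_new)

# note: to predict all 8 target columns at once, the final layer would
# also need 8 units, e.g. model.add(Dense(8, activation='linear'))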
Prediction Interval for Neural Net in Python
I'm currently using Keras to create a neural net in Python. I have a basic model and the code looks like this:

from keras.layers import Dense
from keras.models import Sequential

model = Sequential()
model.add(Dense(23, input_dim=23, kernel_initializer='normal', activation='relu'))
model.add(Dense(500, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal', activation='relu'))
model.compile(loss='mean_squared_error', optimizer='adam')

It works well and gives me good predictions for my use case. However, I would like to use a Variational Gaussian Process layer to give me an estimate of the prediction interval as well. I'm new to this type of layer and am struggling a bit to implement it. The TensorFlow documentation on it can be found here: https://www.tensorflow.org/probability/api_docs/python/tfp/layers/VariationalGaussianProcess However, I'm not seeing that same layer in the Keras library. For further reference, I'm trying to do something similar to what was done in this article: https://blog.tensorflow.org/2019/03/regression-with-probabilistic-layers-in.html There seems to be a bit more complexity when you have 23 inputs vs one, which I'm not understanding. I'm also open to other methods of achieving the target objective. Any examples of how to do this or insights on other approaches would be greatly appreciated!
tensorflow_probability is a separate library, but it works well with Keras and TensorFlow. You can add its custom layers to your code and turn the model into a probabilistic one. If your goal is just to get a prediction interval, it is simpler to use the DistributionLambda layer. Your code would then look like this:

from keras.layers import Dense
from keras.models import Sequential
from sklearn.datasets import make_regression
import tensorflow_probability as tfp
import tensorflow as tf

tfd = tfp.distributions

# Sample data
X, y = make_regression(n_samples=100, n_features=23, noise=4.0, bias=15)

# Loss function: negative log likelihood
negloglik = lambda y, p_y: -p_y.log_prob(y)

# Model
model = Sequential()
model.add(Dense(23, input_dim=23, kernel_initializer='normal', activation='relu'))
model.add(Dense(500, kernel_initializer='normal', activation='relu'))
model.add(Dense(2))
model.add(tfp.layers.DistributionLambda(
    lambda t: tfd.Normal(loc=t[..., :1],
                         scale=1e-3 + tf.math.softplus(0.05 * t[..., 1:]))))

model.compile(loss=negloglik, optimizer='adam')
model.fit(X, y, epochs=250, verbose=0)

After training your model, you can get the prediction distribution with the following lines:

yhat = model(X)          # make predictions
means = yhat.mean()      # prediction means
stds = yhat.stddev()     # prediction standard deviations
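If you then want an actual interval rather than just the moments, a rough sketch: since the output above is a Normal distribution, roughly 95% of its mass lies within about two standard deviations of the mean.

import numpy as np

yhat = model(X)
means = np.squeeze(yhat.mean().numpy())
stds = np.squeeze(yhat.stddev().numpy())

# approximate 95% prediction interval for each sample
lower = means - 1.96 * stds
upper = means + 1.96 * stds
print(lower[:5])
print(upper[:5])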
Unable to train neural network properly
I am trying to train a neural network (NN) implemented through Keras to implement the following function:

y(n) = y(n-1)*0.9 + x(n)*0.1

So the idea is to have a signal as train_x data and pass it through the above function to get train_y data, giving us a (train_x, train_y) training set.

import numpy as np
from keras.models import Sequential
from keras.layers.core import Activation, Dense
from keras.callbacks import EarlyStopping
import matplotlib.pyplot as plt

train_x = np.concatenate((np.ones(100)*120, np.ones(150)*150, np.ones(150)*90, np.ones(100)*110), axis=None)
train_y = np.ones(train_x.size)*train_x[0]

alpha = 0.9
for i in range(train_x.size):
    train_y[i] = train_y[i-1]*alpha + train_x[i]*(1 - alpha)

[plot: train_x data vs train_y data]

The function in question, y(n), is a low-pass function: it keeps the x(n) value from changing abruptly, as shown in the plot. Then I build a NN, fit it with (train_x, train_y), and plot the loss:

model = Sequential()
model.add(Dense(128, kernel_initializer='normal', input_dim=1, activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal', activation='linear'))
model.compile(loss='mean_absolute_error', optimizer='adam', metrics=['accuracy'])

history = model.fit(train_x, train_y, epochs=200, verbose=0)

print(history.history['loss'][-1])
plt.plot(history.history['loss'])
plt.show()

[plot: loss over 200 epochs]

The final loss value is approximately 2.9, which I thought was pretty good. But then the accuracy plot looked like this:

[plot: accuracy over 200 epochs]

So when I check the prediction of the neural network over the data it was trained on:

plt.plot(model.predict(train_x))
plt.plot(train_x)
plt.show()

[plot: train_x vs predicted values]

The values are just offset by a little, and that's all. I tried changing the activation functions and the number of neurons and layers, but the result is still the same. What am I doing wrong?

---- Edit ----

Made the NN accept 2-dimensional input, and it works as intended:

import numpy as np
from keras.models import Sequential
from keras.layers.core import Activation, Dense
from keras.callbacks import EarlyStopping
import matplotlib.pyplot as plt

train_x = np.concatenate((np.ones(100)*120, np.ones(150)*150, np.ones(150)*90, np.ones(100)*110), axis=None)
train_y = np.ones(train_x.size)*train_x[0]

alpha = 0.9
for i in range(train_x.size):
    train_y[i] = train_y[i-1]*alpha + train_x[i]*(1 - alpha)

train = np.empty((500, 2))
for i in range(500):
    train[i][0] = train_x[i]
    train[i][1] = train_y[i]

model = Sequential()
model.add(Dense(128, kernel_initializer='normal', input_dim=2, activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal', activation='linear'))
model.compile(loss='mean_absolute_error', optimizer='adam', metrics=['accuracy'])

history = model.fit(train, train_y, epochs=100, verbose=0)

print(history.history['loss'][-1])
plt.plot(history.history['loss'])
plt.show()
If I execute your code, I get the following plot for the X-Y values:

[plot: train_x vs train_y scatter]

If I didn't miss something important here and you really feed that to your neural net, you probably can't expect better results. The reason is that a neural net is just a function that can only calculate one output vector for one input. In your case the output vector consists of only one element (your y value), but as you can see in the diagram above, for x=90 there is not just one single output. So what you feed to your neural net cannot really be calculated as a function, and the network most likely tries to fit the straight line between the points at roughly (90, 145) and (150, 150), i.e. the "upper line" in the diagram.
The neural network you're building is a simple multi-layer perceptron with one input node and one output node. This means that it is essentially a function that accepts one real number and returns one real number; context is not passed in and can therefore not be considered. The expression

model.predict(train_x)

does not evaluate a vector-to-vector function for the vector train_x, but evaluates a number-to-number function for every number in train_x and then returns the list of results. This is why you get flat segments in the train_x vs. prediction plot: the same input numbers produce the same output numbers every time.

Given this constraint, the approximation is actually quite good. For example, for x values of 150 the network has seen many y values of 150 and a few lower ones, but never anything above 150. So, given an x of 150, it predicts a y value slightly below 150.

The function you wanted, on the other hand, refers to the previous function value and will need information about it in its input. If what you're trying to build is a function that accepts a sequence of real numbers and returns a sequence of real numbers, you could do that with a many-to-many recurrent network (and you're going to need a lot more training data than one example sequence). But since you can calculate the function directly, why bother with neural networks at all? There's no need to whip out the chainsaw where a butter knife will do.
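For completeness, the "butter knife" version: the target function is a first-order IIR low-pass filter and can be computed directly, for example with scipy. A sketch, assuming the same train_x as in the question:

import numpy as np
from scipy.signal import lfilter

train_x = np.concatenate((np.ones(100)*120, np.ones(150)*150,
                          np.ones(150)*90, np.ones(100)*110))

# y(n) = 0.9*y(n-1) + 0.1*x(n)  corresponds to  b = [0.1], a = [1, -0.9]
# (lfilter assumes a zero initial state; pass zi=... to seed y(-1) if needed)
train_y = lfilter([0.1], [1.0, -0.9], train_x)
print(train_y[:5])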
Keras and TensorFlow: I'm getting an InvalidArgumentError
I am just starting out with Keras and TensorFlow, and I have begun by following this tutorial: https://machinelearningmastery.com/tutorial-first-neural-network-python-keras/ Unfortunately, when I run the finished code (I'm using Anaconda, not sure if this is relevant) I get the error below.

Here is the code:

# Create your first MLP in Keras
from keras.models import Sequential
from keras.layers import Dense
import numpy

# fix random seed for reproducibility
numpy.random.seed(7)

# load pima indians dataset
dataset = numpy.loadtxt("D:\Applications\Python Apps\pima-indians-diabetes.csv", delimiter=",")

# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]

# create model
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Fit the model
model.fit(X, Y, epochs=150, batch_size=10)

# evaluate the model
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

Here is the error:

InvalidArgumentError: Input to reshape is a tensor with 10 values, but the requested shape has 0
[[Node: training/Adam/gradients/loss/dense_3_loss/Mean_1_grad/Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _class=["loc:#training/Adam/gradients/loss/dense_3_loss/Mean_1_grad/truediv"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](training/Adam/gradients/loss/dense_3_loss/mul_grad/Sum, training/Adam/gradients/loss/dense_3_loss/Mean_1_grad/DynamicStitch/_75)]]

Here is an image of the entire thing, which is a bit easier to read: https://i.imgur.com/ZTd3ZeT.jpg If someone is able to assist with this I would really appreciate it. Thanks, Glen
It was a bug in TensorFlow 1.9. Moving to 1.10 fixed it.
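A quick way to check which TensorFlow build you are on before upgrading (a sketch; per the answer above, 1.10 is the version known to fix this):

import tensorflow as tf
print(tf.__version__)   # consider upgrading if this prints 1.9.x or lower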
Model Suggestion for Keras Regression
I am trying to solve a regression problem with Keras, but the MSE is huge, I mean like 29346217.6819. I am really new to this, so do you have any suggestions to make the model give a reasonable MSE? I am not sure whether my data is OK or problematic, but it is actual sales data.

Data (about 3000 lines; I use 2000 for training and 1000 for testing). Full data is here:

ProductNo,Day,Month,CartonSales
1,6,02,2374
1,3,02,2374
1,6,04,2374
1,6,04,2374
1,3,06,2374
1,6,09,2374
1,1,09,2374
1,6,09,2374
1,6,10,2374

Code:

from keras import optimizers
from keras.callbacks import Callback
from numpy import array
from keras.models import Sequential
from keras.layers import Dense, Dropout
from matplotlib import pyplot
import pandas as pds

# prepare sequence
class TestCallback(Callback):
    def __init__(self, test_data):
        self.test_data = test_data

    def on_epoch_end(self, epoch, logs={}):
        x, y = self.test_data
        loss, acc = self.model.evaluate(x, y, verbose=0)
        print('\nTesting loss: {}, acc: {}\n'.format(loss, acc))

dataframe = pds.read_csv('pmidata.csv', usecols=[0, 1, 2, 3])
dataframe = dataframe.sample(frac=1)

dataframeX_train = dataframe.iloc[0:2000][['ProductNo', 'Day', 'Month']]
dataframeY_train = dataframe.iloc[0:2000][['CartonSales']]
dataframeX_test = dataframe.iloc[2001:3001][['ProductNo', 'Day', 'Month']]
dataframeY_test = dataframe.iloc[2001:3001][['CartonSales']]

# create model
model = Sequential()
model.add(Dense(3, input_dim=3, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam', metrics=['mse'])
#sgd = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
#model.compile(loss='mse', optimizer=sgd, metrics=['mse'])

# train model
#history = model.fit(dataframe, dataframe, epochs=500, batch_size=len(X), verbose=2)
history = model.fit(dataframeX_train, dataframeY_train, epochs=100, batch_size=4, verbose=2,
                    callbacks=[TestCallback((dataframeX_test, dataframeY_test))])

# plot metrics
pyplot.plot(history.history['mean_squared_error'])
pyplot.show()
As far as I can tell from your code above, your y values are CartonSales. Sales can have large values and a large range, and that's probably why you get such a high error. You could use mean_squared_logarithmic_error instead of mean squared error, but I would suggest the following: continue using mean squared error, log-transform your y values, and later exp-transform your predictions.

import numpy as np

dataframeY_train = np.log(dataframeY_train)
dataframeY_test = np.log(dataframeY_test)
....
predictions = model.predict(dataframeX_test)[:, 0]
predictions = np.exp(predictions)