LSTM model has constant accuracy and doesn't vary - Python
As you can see, I'm stuck with my LSTM model. I'm trying to predict the amount of tons to produce per month. When I train the model, the accuracy is almost constant, with only minimal variation like:
0.34406
0.34407
0.34408
I tried different combinations of activations, initializers, and parameters, but the accuracy doesn't increase.
I don't know whether the problem is my data, my model, or whether this value is simply the maximum accuracy the model can reach.
Here is the code (if you notice some unused libraries, it's because I made some changes since the first version):
import numpy as np
import pandas as pd
from pandas.tseries.offsets import DateOffset
from sklearn.preprocessing import MinMaxScaler, StandardScaler, RobustScaler
from sklearn import preprocessing
import keras
%tensorflow_version 2.x
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Dropout
from keras.optimizers import Adam
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline
from plotly.offline import iplot
import matplotlib.pyplot as plt
import chart_studio.plotly as py
import plotly.offline as pyoff
import plotly.graph_objs as go
df_ventas = pd.read_csv('/content/drive/My Drive/proyectoPanimex/DEOPE.csv', parse_dates=['Data Emissão'], index_col=0, squeeze=True)
#df_ventas = df_ventas.resample('M').sum().reset_index()
df_ventas = df_ventas.drop(columns= ['weekday', 'month'], axis=1)
df_ventas = df_ventas.reset_index()
df_ventas = df_ventas.rename(columns= {'Data Emissão':'Fecha','Un':'Cantidad'})
df_ventas['dia'] = [x.day for x in df_ventas.Fecha]
df_ventas['mes']=[x.month for x in df_ventas.Fecha]
df_ventas['anio']=[x.year for x in df_ventas.Fecha]
df_ventas = df_ventas[:-48]
df_ventas = df_ventas.drop(columns='Fecha')
df_diff = df_ventas.copy()
df_diff['cantidad_anterior'] = df_diff['Cantidad'].shift(1)
df_diff = df_diff.dropna()
df_diff['diferencia'] = (df_diff['Cantidad'] - df_diff['cantidad_anterior'])
df_supervised = df_diff.drop(['cantidad_anterior'],axis=1)
#adding lags
for inc in range(1, 31):
    nombre_columna = 'retraso_' + str(inc)
    df_supervised[nombre_columna] = df_supervised['diferencia'].shift(inc)
df_supervised = df_supervised.dropna()
df_supervisedNumpy = df_supervised.to_numpy()
train = df_supervisedNumpy
scaler = MinMaxScaler(feature_range=(0, 1))
X_train = scaler.fit(train)
train = train.reshape(train.shape[0], train.shape[1])
train_scaled = scaler.transform(train)
X_train, y_train = train_scaled[:, 1:], train_scaled[:, 0:1]
X_train = X_train.reshape(X_train.shape[0], 1, X_train.shape[1])
#LSTM MODEL
model = Sequential()
act = 'tanh'
actF = 'relu'
model.add(LSTM(200, activation = act, input_dim=34, return_sequences=True ))
model.add(Dropout(0.15))
#model.add(Flatten())
model.add(LSTM(200, activation= act))
model.add(Dropout(0.2))
#model.add(Flatten())
model.add(Dense(200, activation= act))
model.add(Dropout(0.3))
model.add(Dense(1, activation= actF))
optimizer = keras.optimizers.Adam(lr=0.00001)
model.compile(optimizer=optimizer, loss=keras.losses.binary_crossentropy, metrics=['accuracy'])
history = model.fit(X_train, y_train, batch_size = 100,
                    epochs = 50, verbose = 1)
hist = pd.DataFrame(history.history)
hist['Epoch'] = history.epoch
hist
Training history:
loss acc Epoch
0 0.847146 0.344070 0
1 0.769400 0.344070 1
2 0.703548 0.344070 2
3 0.698137 0.344070 3
4 0.653952 0.344070 4
As you can see, the only value that changes is the loss, but what is going on with the accuracy? I'm just starting out with machine learning and I don't have enough knowledge to spot my errors. Thanks!
A Dense(1, activation='softmax') will always freeze and not learn anything
A Dense(1, activation='relu') will very probably freeze and not learn anything
A Dense(1, activation='sigmoid') is ideal for classification (binary) problems and somewhat good for regression with values between 0 and 1.
A Dense(1, activation='tanh') is somewhat good for regression with values between -1 and 1
A Dense(1, activation='softplus') is somewhat good for regression with values between 0 and +infinite
A Dense(1, activation='linear') is good for regression in general with no limits (but it's highly recommended that the data be normalized first)
For regression you can't use accuracy; use metrics like 'mae' and 'mse' instead. Note that these don't provide a "relative" difference, they provide an "absolute" mean difference, one linear and the other squared.
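For reference, here is a minimal sketch of the two sensible output configurations discussed above (the hidden layer size is an illustrative choice, and the 34-feature input is assumed only to mirror the question's input_dim=34):

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Regression head: unbounded linear output, error-based metrics instead of accuracy.
reg_model = keras.Sequential([
    layers.Dense(32, activation='relu', input_shape=(34,)),
    layers.Dense(1, activation='linear')
])
reg_model.compile(optimizer='adam', loss='mse', metrics=['mae', 'mse'])

# Binary-classification head: sigmoid output, where 'accuracy' is meaningful.
clf_model = keras.Sequential([
    layers.Dense(32, activation='relu', input_shape=(34,)),
    layers.Dense(1, activation='sigmoid')
])
clf_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])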
Your output activation should be linear for continuous prediction or softmax for classification. Also, multiply your learning rate by 100. Your loss should be mean_absolute_error. You could also easily divide your LSTM neurons by a factor of 10. The tanh should be replaced by relu or similar.
For your accuracy problem, it makes no sense to use accuracy, since you're not trying to classify. For metrics, you can use mae. You're trying to know how far the prediction is from the actual target, on a continuous scale. Accuracy is for categories, not continuous data.
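Putting these suggestions together, a revised version of the model from the question might look roughly like the sketch below. This is not a tuned solution: the unit counts, learning rate, loss, and metric simply follow the advice above, and X_train / y_train are the arrays built in the question's preprocessing (shape (samples, 1, 34)).

from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout

model = Sequential()
model.add(LSTM(20, activation='relu', input_shape=(1, 34), return_sequences=True))  # 200 -> 20 units
model.add(Dropout(0.15))
model.add(LSTM(20, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(20, activation='relu'))
model.add(Dense(1, activation='linear'))  # linear output for continuous prediction

optimizer = keras.optimizers.Adam(learning_rate=0.001)  # 0.00001 * 100
model.compile(optimizer=optimizer,
              loss='mean_absolute_error',  # regression loss instead of binary_crossentropy
              metrics=['mae'])             # 'accuracy' is meaningless for regression

history = model.fit(X_train, y_train, batch_size=100, epochs=50, verbose=1)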
Related
Getting Only Zeros from Model.Predict() When Using Individual Data Points But Getting both 1's and 0's when Using Entire Test Dataset
We trained a binary classification neural network with ~90% accuracy / ~20% loss. When we use the model by calling model.predict() with the testing data (which is 20% of our entire dataset), we get a relatively even distribution of 1's and 0's. But when we input individual points from the testing data as a numpy array, we only get 0's, regardless of shuffling or not. Can anyone help us explain why we are getting this behavior? When we use X_test (from the split above) instead of dummyTest, binary_prediction equals 1 for the corresponding data point (the same one as in dummyTest), but when we use dummyTest's data individually, binary_prediction only equals 0. Full Code Shown Below:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix
from sklearn import model_selection
from keras.utils.np_utils import to_categorical
from keras.models import Sequential
from keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from keras.layers import Dropout
import os

full_dataset = pd.read_csv('noQMarkDataset.csv')
data = full_dataset[~full_dataset.isin(['?'])]
data = data.dropna(axis=0)
data = data.apply(pd.to_numeric)
X = np.array(data.drop(['target'], 1))
y = np.array(data['target'])
mean = X.mean(axis=0)
X -= mean
std = X.std(axis=0)
X /= std
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, random_state = 0, test_size = 0.2, shuffle=False) #stratify = y

# convert data to categorical labels
Y_train = to_categorical(y_train, num_classes=None)
Y_test = to_categorical(y_test, num_classes=None)
Y_train_binary = y_train.copy()
Y_test_binary = y_test.copy()
Y_train_binary[Y_train_binary > 0] = 1
Y_test_binary[Y_test_binary > 0] = 1

def create_binary_model():
    model = Sequential()
    model.add(Dense(16, input_dim=7, activation='relu')) # prev input_dim = 13
    model.add(Dropout(0.25))
    model.add(Dense(16, activation = 'relu'))
    model.add(Dropout(0.25))
    model.add(Dense(16))
    model.add(Dropout(0.25))
    model.add(Dense(12, activation = 'relu'))
    model.add(Dense(12, activation = 'relu'))
    model.add(Dropout(0.25))
    model.add(Dense(8, activation='relu'))
    model.add(Dropout(0.25))
    model.add(Dense(1, activation='sigmoid'))
    adam = Adam(lr=0.001)
    model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy']) #optimizer = 'rmsprop' previously
    return model

binary_model = create_binary_model()
history = binary_model.fit(X_train, Y_train_binary, validation_data=(X_test, Y_test_binary), epochs=130, batch_size=15) #epochs = 130, batch_size = 20 previously (best)

plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'])
plt.show()

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model Loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'])
plt.show()

from sklearn.metrics import classification_report, accuracy_score
dummyTest = np.array([[67.0,1.0,4.0,0.0,2.0,129.0,2.0]])
binary_pred = np.round(binary_model.predict(dummyTest)).astype(int)
#binary_pred = np.round(binary_model.predict(X_test)).astype(int)
print(f"Binary Pred: {binary_pred}")
binary_model.save(os.path.join(".", "test_model.h5"))
#after loading
from tensorflow.keras.models import load_model
model2 = load_model(os.path.join(".", "test_model.h5"))
binary_pred2 = np.round(binary_model.predict(dummyTest)).astype(int)
#binary_pred2 = np.round(binary_model.predict(X_test)).astype(int)
print(f"Binary Pred2
after load: {binary_pred2}") Full Dataset Shown Below age,sex,cp,fbs,restecg,thalach,ca,target 63.0,1.0,1.0,1.0,2.0,150.0,0.0,0 67.0,1.0,4.0,0.0,2.0,108.0,3.0,2 67.0,1.0,4.0,0.0,2.0,129.0,2.0,1 37.0,1.0,3.0,0.0,0.0,187.0,0.0,0 41.0,0.0,2.0,0.0,2.0,172.0,0.0,0 56.0,1.0,2.0,0.0,0.0,178.0,0.0,0 62.0,0.0,4.0,0.0,2.0,160.0,2.0,3 57.0,0.0,4.0,0.0,0.0,163.0,0.0,0 63.0,1.0,4.0,0.0,2.0,147.0,1.0,2 53.0,1.0,4.0,1.0,2.0,155.0,0.0,1 57.0,1.0,4.0,0.0,0.0,148.0,0.0,0 56.0,0.0,2.0,0.0,2.0,153.0,0.0,0 56.0,1.0,3.0,1.0,2.0,142.0,1.0,2 44.0,1.0,2.0,0.0,0.0,173.0,0.0,0 52.0,1.0,3.0,1.0,0.0,162.0,0.0,0 57.0,1.0,3.0,0.0,0.0,174.0,0.0,0 48.0,1.0,2.0,0.0,0.0,168.0,0.0,1 54.0,1.0,4.0,0.0,0.0,160.0,0.0,0 48.0,0.0,3.0,0.0,0.0,139.0,0.0,0 49.0,1.0,2.0,0.0,0.0,171.0,0.0,0 64.0,1.0,1.0,0.0,2.0,144.0,0.0,0 58.0,0.0,1.0,1.0,2.0,162.0,0.0,0 58.0,1.0,2.0,0.0,2.0,160.0,0.0,1 58.0,1.0,3.0,0.0,2.0,173.0,2.0,3 60.0,1.0,4.0,0.0,2.0,132.0,2.0,4 50.0,0.0,3.0,0.0,0.0,158.0,0.0,0 58.0,0.0,3.0,0.0,0.0,172.0,0.0,0 66.0,0.0,1.0,0.0,0.0,114.0,0.0,0 43.0,1.0,4.0,0.0,0.0,171.0,0.0,0 40.0,1.0,4.0,0.0,2.0,114.0,0.0,3 69.0,0.0,1.0,0.0,0.0,151.0,2.0,0 60.0,1.0,4.0,1.0,0.0,160.0,2.0,2 64.0,1.0,3.0,0.0,0.0,158.0,0.0,1 59.0,1.0,4.0,0.0,0.0,161.0,0.0,0 44.0,1.0,3.0,0.0,0.0,179.0,0.0,0 42.0,1.0,4.0,0.0,0.0,178.0,0.0,0 43.0,1.0,4.0,0.0,2.0,120.0,0.0,3 57.0,1.0,4.0,0.0,2.0,112.0,1.0,1 55.0,1.0,4.0,0.0,0.0,132.0,1.0,3 61.0,1.0,3.0,1.0,0.0,137.0,0.0,0 65.0,0.0,4.0,0.0,2.0,114.0,3.0,4 40.0,1.0,1.0,0.0,0.0,178.0,0.0,0 71.0,0.0,2.0,0.0,0.0,162.0,2.0,0 59.0,1.0,3.0,1.0,0.0,157.0,0.0,0 61.0,0.0,4.0,0.0,2.0,169.0,0.0,1 58.0,1.0,3.0,0.0,2.0,165.0,1.0,4 51.0,1.0,3.0,0.0,0.0,123.0,0.0,0 50.0,1.0,4.0,0.0,2.0,128.0,0.0,4 65.0,0.0,3.0,1.0,2.0,157.0,1.0,0 53.0,1.0,3.0,1.0,2.0,152.0,0.0,0 41.0,0.0,2.0,0.0,0.0,168.0,1.0,0 65.0,1.0,4.0,0.0,0.0,140.0,0.0,0 44.0,1.0,4.0,0.0,2.0,153.0,1.0,2 44.0,1.0,2.0,0.0,2.0,188.0,0.0,0 60.0,1.0,4.0,0.0,0.0,144.0,1.0,1 54.0,1.0,4.0,0.0,2.0,109.0,1.0,1 50.0,1.0,3.0,0.0,0.0,163.0,1.0,1 41.0,1.0,4.0,0.0,2.0,158.0,0.0,1 54.0,1.0,3.0,0.0,2.0,152.0,1.0,0 51.0,1.0,1.0,0.0,2.0,125.0,1.0,0 51.0,0.0,4.0,0.0,0.0,142.0,0.0,2 46.0,0.0,3.0,0.0,2.0,160.0,0.0,0 58.0,1.0,4.0,0.0,2.0,131.0,3.0,1 54.0,0.0,3.0,1.0,0.0,170.0,0.0,0 54.0,1.0,4.0,0.0,0.0,113.0,1.0,2 60.0,1.0,4.0,0.0,2.0,142.0,2.0,2 60.0,1.0,3.0,0.0,2.0,155.0,0.0,1 54.0,1.0,3.0,0.0,2.0,165.0,0.0,0 59.0,1.0,4.0,0.0,2.0,140.0,0.0,2 46.0,1.0,3.0,0.0,0.0,147.0,0.0,1 65.0,0.0,3.0,0.0,0.0,148.0,0.0,0 67.0,1.0,4.0,1.0,0.0,163.0,2.0,3 62.0,1.0,4.0,0.0,0.0,99.0,2.0,1 65.0,1.0,4.0,0.0,2.0,158.0,2.0,1 44.0,1.0,4.0,0.0,2.0,177.0,1.0,1 65.0,0.0,3.0,0.0,2.0,151.0,0.0,0 60.0,1.0,4.0,0.0,2.0,141.0,1.0,1 51.0,0.0,3.0,0.0,2.0,142.0,1.0,0 48.0,1.0,2.0,0.0,2.0,180.0,0.0,0 58.0,1.0,4.0,0.0,2.0,111.0,0.0,3 45.0,1.0,4.0,0.0,2.0,148.0,0.0,0 53.0,0.0,4.0,0.0,2.0,143.0,0.0,0 39.0,1.0,3.0,0.0,2.0,182.0,0.0,0 68.0,1.0,3.0,1.0,2.0,150.0,0.0,3 52.0,1.0,2.0,0.0,0.0,172.0,0.0,0 44.0,1.0,3.0,0.0,2.0,180.0,0.0,0 47.0,1.0,3.0,0.0,2.0,156.0,0.0,0 53.0,0.0,3.0,0.0,2.0,115.0,0.0,0 53.0,0.0,4.0,0.0,2.0,160.0,0.0,0 51.0,0.0,3.0,0.0,2.0,149.0,0.0,0 66.0,1.0,4.0,0.0,2.0,151.0,0.0,0 62.0,0.0,4.0,0.0,2.0,145.0,3.0,3 62.0,1.0,3.0,0.0,0.0,146.0,3.0,0 44.0,0.0,3.0,0.0,0.0,175.0,0.0,0 63.0,0.0,3.0,0.0,2.0,172.0,0.0,0 52.0,1.0,4.0,0.0,0.0,161.0,1.0,1 59.0,1.0,4.0,0.0,2.0,142.0,1.0,2 60.0,0.0,4.0,0.0,2.0,157.0,2.0,3 52.0,1.0,2.0,0.0,0.0,158.0,1.0,0 48.0,1.0,4.0,0.0,2.0,186.0,0.0,0 45.0,1.0,4.0,0.0,2.0,185.0,0.0,0 34.0,1.0,1.0,0.0,2.0,174.0,0.0,0 57.0,0.0,4.0,0.0,2.0,159.0,1.0,0 71.0,0.0,3.0,1.0,2.0,130.0,1.0,0 
49.0,1.0,3.0,0.0,0.0,139.0,3.0,3 54.0,1.0,2.0,0.0,0.0,156.0,0.0,0 59.0,1.0,4.0,0.0,0.0,162.0,1.0,2 57.0,1.0,3.0,0.0,2.0,150.0,1.0,1 61.0,1.0,4.0,0.0,0.0,140.0,1.0,2 39.0,1.0,4.0,0.0,0.0,140.0,0.0,3 61.0,0.0,4.0,0.0,2.0,146.0,0.0,1 56.0,1.0,4.0,1.0,2.0,144.0,1.0,1 52.0,1.0,1.0,0.0,2.0,190.0,0.0,0 43.0,0.0,4.0,1.0,2.0,136.0,0.0,2 62.0,0.0,3.0,0.0,0.0,97.0,1.0,2 41.0,1.0,2.0,0.0,0.0,132.0,0.0,0 58.0,1.0,3.0,1.0,2.0,165.0,0.0,0 35.0,0.0,4.0,0.0,0.0,182.0,0.0,0 63.0,1.0,4.0,1.0,2.0,132.0,3.0,3 65.0,1.0,4.0,0.0,2.0,127.0,1.0,2 48.0,1.0,4.0,1.0,2.0,150.0,2.0,3 63.0,0.0,4.0,0.0,2.0,154.0,3.0,4 51.0,1.0,3.0,0.0,0.0,143.0,0.0,0 55.0,1.0,4.0,0.0,0.0,111.0,0.0,3 65.0,1.0,1.0,1.0,2.0,174.0,1.0,1 45.0,0.0,2.0,0.0,2.0,175.0,0.0,0 56.0,0.0,4.0,1.0,2.0,133.0,2.0,3 54.0,1.0,4.0,0.0,0.0,126.0,1.0,3 44.0,1.0,2.0,0.0,0.0,170.0,0.0,0 62.0,0.0,4.0,0.0,0.0,163.0,0.0,0 54.0,1.0,3.0,0.0,2.0,147.0,0.0,0 51.0,1.0,3.0,0.0,0.0,154.0,1.0,0 29.0,1.0,2.0,0.0,2.0,202.0,0.0,0 51.0,1.0,4.0,0.0,2.0,186.0,0.0,0 43.0,0.0,3.0,0.0,0.0,165.0,0.0,0 55.0,0.0,2.0,0.0,2.0,161.0,0.0,0 70.0,1.0,4.0,0.0,0.0,125.0,0.0,4 62.0,1.0,2.0,0.0,2.0,103.0,1.0,3 35.0,1.0,4.0,0.0,0.0,130.0,0.0,1 51.0,1.0,3.0,1.0,2.0,166.0,0.0,0 59.0,1.0,2.0,0.0,0.0,164.0,0.0,0 59.0,1.0,1.0,0.0,2.0,159.0,0.0,1 52.0,1.0,2.0,1.0,0.0,184.0,0.0,0 64.0,1.0,3.0,0.0,0.0,131.0,0.0,1 58.0,1.0,3.0,0.0,2.0,154.0,0.0,0 47.0,1.0,3.0,0.0,0.0,152.0,0.0,1 57.0,1.0,4.0,1.0,2.0,124.0,3.0,4 41.0,1.0,3.0,0.0,0.0,179.0,0.0,0 45.0,1.0,2.0,0.0,2.0,170.0,0.0,0 60.0,0.0,3.0,0.0,0.0,160.0,1.0,0 52.0,1.0,1.0,1.0,0.0,178.0,0.0,0 42.0,0.0,4.0,0.0,2.0,122.0,0.0,0 67.0,0.0,3.0,0.0,2.0,160.0,0.0,0 55.0,1.0,4.0,0.0,2.0,145.0,1.0,4 64.0,1.0,4.0,0.0,2.0,96.0,1.0,3 70.0,1.0,4.0,0.0,2.0,109.0,3.0,1 51.0,1.0,4.0,0.0,0.0,173.0,0.0,1 58.0,1.0,4.0,0.0,2.0,171.0,2.0,1 60.0,1.0,4.0,0.0,2.0,170.0,2.0,2 68.0,1.0,3.0,0.0,0.0,151.0,1.0,0 46.0,1.0,2.0,1.0,0.0,156.0,0.0,0 77.0,1.0,4.0,0.0,2.0,162.0,3.0,4 54.0,0.0,3.0,0.0,0.0,158.0,0.0,0 58.0,0.0,4.0,0.0,2.0,122.0,0.0,0 48.0,1.0,3.0,1.0,0.0,175.0,2.0,0 57.0,1.0,4.0,0.0,0.0,168.0,0.0,0 54.0,0.0,2.0,1.0,2.0,159.0,1.0,0 35.0,1.0,4.0,0.0,2.0,156.0,0.0,1 45.0,0.0,2.0,0.0,0.0,138.0,0.0,0 70.0,1.0,3.0,0.0,0.0,112.0,1.0,3 53.0,1.0,4.0,0.0,2.0,111.0,0.0,0 59.0,0.0,4.0,0.0,0.0,143.0,0.0,1 62.0,0.0,4.0,0.0,2.0,157.0,0.0,0 64.0,1.0,4.0,0.0,2.0,132.0,2.0,4 57.0,1.0,4.0,0.0,0.0,88.0,1.0,1 52.0,1.0,4.0,1.0,0.0,147.0,3.0,0 56.0,1.0,4.0,0.0,2.0,105.0,1.0,1 43.0,1.0,3.0,0.0,0.0,162.0,1.0,0 53.0,1.0,3.0,1.0,2.0,173.0,3.0,0 48.0,1.0,4.0,0.0,2.0,166.0,0.0,3 56.0,0.0,4.0,0.0,2.0,150.0,2.0,2 42.0,1.0,1.0,0.0,2.0,178.0,2.0,0 59.0,1.0,1.0,0.0,2.0,145.0,0.0,0 60.0,0.0,4.0,0.0,2.0,161.0,0.0,1 63.0,0.0,2.0,0.0,0.0,179.0,2.0,0 42.0,1.0,3.0,1.0,0.0,194.0,0.0,0 66.0,1.0,2.0,0.0,0.0,120.0,3.0,2 54.0,1.0,2.0,0.0,2.0,195.0,1.0,1 69.0,1.0,3.0,0.0,2.0,146.0,3.0,2 50.0,1.0,3.0,0.0,0.0,163.0,0.0,0 51.0,1.0,4.0,0.0,0.0,122.0,3.0,3 62.0,0.0,4.0,1.0,0.0,106.0,3.0,2 68.0,0.0,3.0,0.0,2.0,115.0,0.0,0 67.0,1.0,4.0,0.0,2.0,125.0,2.0,3 69.0,1.0,1.0,1.0,2.0,131.0,1.0,0 45.0,0.0,4.0,0.0,2.0,152.0,0.0,0 50.0,0.0,2.0,0.0,0.0,162.0,0.0,0 59.0,1.0,1.0,0.0,2.0,125.0,0.0,1 50.0,0.0,4.0,0.0,2.0,159.0,0.0,0 64.0,0.0,4.0,0.0,0.0,154.0,0.0,0 57.0,1.0,3.0,1.0,0.0,173.0,1.0,0 64.0,0.0,3.0,0.0,0.0,133.0,0.0,0 43.0,1.0,4.0,0.0,0.0,161.0,0.0,0 45.0,1.0,4.0,0.0,2.0,147.0,3.0,3 58.0,1.0,4.0,0.0,2.0,130.0,2.0,3 50.0,1.0,4.0,0.0,2.0,126.0,0.0,3 55.0,1.0,2.0,0.0,0.0,155.0,0.0,0 62.0,0.0,4.0,0.0,0.0,154.0,0.0,1 37.0,0.0,3.0,0.0,0.0,170.0,0.0,0 38.0,1.0,1.0,0.0,0.0,182.0,0.0,4 41.0,1.0,3.0,0.0,2.0,168.0,0.0,0 
66.0,0.0,4.0,1.0,0.0,165.0,2.0,3 52.0,1.0,4.0,0.0,0.0,160.0,1.0,1 56.0,1.0,1.0,0.0,2.0,162.0,0.0,0 46.0,0.0,2.0,0.0,0.0,172.0,0.0,0 46.0,0.0,4.0,0.0,2.0,152.0,0.0,0 64.0,0.0,4.0,0.0,0.0,122.0,2.0,0 59.0,1.0,4.0,0.0,2.0,182.0,0.0,0 41.0,0.0,3.0,0.0,2.0,172.0,0.0,0 54.0,0.0,3.0,0.0,2.0,167.0,0.0,0 39.0,0.0,3.0,0.0,0.0,179.0,0.0,0 53.0,1.0,4.0,0.0,0.0,95.0,2.0,3 63.0,0.0,4.0,0.0,0.0,169.0,2.0,1 34.0,0.0,2.0,0.0,0.0,192.0,0.0,0 47.0,1.0,4.0,0.0,0.0,143.0,0.0,0 67.0,0.0,3.0,0.0,0.0,172.0,1.0,0 54.0,1.0,4.0,0.0,2.0,108.0,1.0,3 66.0,1.0,4.0,0.0,2.0,132.0,1.0,2 52.0,0.0,3.0,0.0,2.0,169.0,0.0,0 55.0,0.0,4.0,0.0,1.0,117.0,0.0,2 49.0,1.0,3.0,0.0,2.0,126.0,3.0,1 74.0,0.0,2.0,0.0,2.0,121.0,1.0,0 54.0,0.0,3.0,0.0,0.0,163.0,1.0,0 54.0,1.0,4.0,0.0,2.0,116.0,2.0,3 56.0,1.0,4.0,1.0,2.0,103.0,0.0,2 46.0,1.0,4.0,0.0,2.0,144.0,0.0,1 49.0,0.0,2.0,0.0,0.0,162.0,0.0,0 42.0,1.0,2.0,0.0,0.0,162.0,0.0,0 41.0,1.0,2.0,0.0,0.0,153.0,0.0,0 41.0,0.0,2.0,0.0,0.0,163.0,0.0,0 49.0,0.0,4.0,0.0,0.0,163.0,0.0,0 61.0,1.0,1.0,0.0,0.0,145.0,2.0,2 60.0,0.0,3.0,1.0,0.0,96.0,0.0,0 67.0,1.0,4.0,0.0,0.0,71.0,0.0,2 58.0,1.0,4.0,0.0,0.0,156.0,1.0,2 47.0,1.0,4.0,0.0,2.0,118.0,1.0,1 52.0,1.0,4.0,0.0,0.0,168.0,2.0,3 62.0,1.0,2.0,1.0,2.0,140.0,0.0,0 57.0,1.0,4.0,0.0,0.0,126.0,0.0,0 58.0,1.0,4.0,0.0,0.0,105.0,1.0,1 64.0,1.0,4.0,0.0,0.0,105.0,1.0,0 51.0,0.0,3.0,0.0,2.0,157.0,0.0,0 43.0,1.0,4.0,0.0,0.0,181.0,0.0,0 42.0,0.0,3.0,0.0,0.0,173.0,0.0,0 67.0,0.0,4.0,0.0,0.0,142.0,2.0,0 76.0,0.0,3.0,0.0,1.0,116.0,0.0,0 70.0,1.0,2.0,0.0,2.0,143.0,0.0,0 57.0,1.0,2.0,0.0,0.0,141.0,0.0,1 44.0,0.0,3.0,0.0,0.0,149.0,1.0,0 58.0,0.0,2.0,1.0,2.0,152.0,2.0,3 60.0,0.0,1.0,0.0,0.0,171.0,0.0,0 44.0,1.0,3.0,0.0,0.0,169.0,0.0,0 61.0,1.0,4.0,0.0,2.0,125.0,1.0,4 42.0,1.0,4.0,0.0,0.0,125.0,0.0,2 52.0,1.0,4.0,1.0,0.0,156.0,0.0,2 59.0,1.0,3.0,1.0,0.0,134.0,1.0,2 40.0,1.0,4.0,0.0,0.0,181.0,0.0,1 42.0,1.0,3.0,0.0,0.0,150.0,0.0,0 61.0,1.0,4.0,0.0,2.0,138.0,1.0,1 66.0,1.0,4.0,0.0,2.0,138.0,0.0,0 46.0,1.0,4.0,0.0,0.0,120.0,2.0,2 71.0,0.0,4.0,0.0,0.0,125.0,0.0,0 59.0,1.0,1.0,0.0,0.0,162.0,2.0,1 64.0,1.0,1.0,0.0,2.0,155.0,0.0,0 66.0,0.0,3.0,0.0,2.0,152.0,1.0,0 39.0,0.0,3.0,0.0,0.0,152.0,0.0,0 57.0,1.0,2.0,0.0,2.0,164.0,1.0,1 58.0,0.0,4.0,0.0,0.0,131.0,0.0,0 57.0,1.0,4.0,0.0,0.0,143.0,1.0,2 47.0,1.0,3.0,0.0,0.0,179.0,0.0,0 55.0,0.0,4.0,0.0,1.0,130.0,1.0,3 35.0,1.0,2.0,0.0,0.0,174.0,0.0,0 61.0,1.0,4.0,0.0,0.0,161.0,1.0,2 58.0,1.0,4.0,0.0,1.0,140.0,3.0,4 58.0,0.0,4.0,1.0,2.0,146.0,2.0,2 56.0,1.0,2.0,0.0,2.0,163.0,0.0,0 56.0,1.0,2.0,0.0,0.0,169.0,0.0,0 67.0,1.0,3.0,0.0,2.0,150.0,0.0,1 55.0,0.0,2.0,0.0,0.0,166.0,0.0,0 44.0,1.0,4.0,0.0,0.0,144.0,0.0,2 63.0,1.0,4.0,0.0,2.0,144.0,2.0,2 63.0,0.0,4.0,0.0,0.0,136.0,0.0,1 41.0,1.0,2.0,0.0,0.0,182.0,0.0,0 59.0,1.0,4.0,1.0,2.0,90.0,2.0,3 57.0,0.0,4.0,0.0,0.0,123.0,0.0,1 45.0,1.0,1.0,0.0,0.0,132.0,0.0,1 68.0,1.0,4.0,1.0,0.0,141.0,2.0,2 57.0,1.0,4.0,0.0,0.0,115.0,1.0,3 57.0,0.0,2.0,0.0,2.0,174.0,1.0,1 66.0,0.0,3.0,0.0,2.0,152.0,1.0,0 39.0,0.0,3.0,0.0,0.0,152.0,0.0,0 57.0,1.0,2.0,0.0,2.0,164.0,1.0,1 58.0,0.0,4.0,0.0,0.0,131.0,0.0,0 57.0,1.0,4.0,0.0,0.0,143.0,1.0,2 47.0,1.0,3.0,0.0,0.0,179.0,0.0,0 55.0,0.0,4.0,0.0,1.0,130.0,1.0,3 35.0,1.0,2.0,0.0,0.0,174.0,0.0,0 61.0,1.0,4.0,0.0,0.0,161.0,1.0,2 58.0,1.0,4.0,0.0,1.0,140.0,3.0,4 46.0,1.0,4.0,0.0,0.0,120.0,2.0,2 71.0,0.0,4.0,0.0,0.0,125.0,0.0,0 59.0,1.0,1.0,0.0,0.0,162.0,2.0,1 64.0,1.0,1.0,0.0,2.0,155.0,0.0,0 66.0,0.0,3.0,0.0,2.0,152.0,1.0,0 39.0,0.0,3.0,0.0,0.0,152.0,0.0,0 57.0,1.0,2.0,0.0,2.0,164.0,1.0,1 58.0,0.0,4.0,0.0,0.0,131.0,0.0,0 57.0,1.0,4.0,0.0,0.0,143.0,1.0,2 
47.0,1.0,3.0,0.0,0.0,179.0,0.0,0 55.0,0.0,4.0,0.0,1.0,130.0,1.0,3 35.0,1.0,2.0,0.0,0.0,174.0,0.0,0 61.0,1.0,4.0,0.0,0.0,161.0,1.0,2 58.0,1.0,4.0,0.0,1.0,140.0,3.0,4 58.0,0.0,4.0,1.0,2.0,146.0,2.0,2 39.0,0.0,3.0,0.0,0.0,179.0,0.0,0 53.0,1.0,4.0,0.0,0.0,95.0,2.0,3 63.0,0.0,4.0,0.0,0.0,169.0,2.0,1 34.0,0.0,2.0,0.0,0.0,192.0,0.0,0 47.0,1.0,4.0,0.0,0.0,143.0,0.0,0 67.0,0.0,3.0,0.0,0.0,172.0,1.0,0 54.0,1.0,4.0,0.0,2.0,108.0,1.0,3 66.0,1.0,4.0,0.0,2.0,132.0,1.0,2 52.0,0.0,3.0,0.0,2.0,169.0,0.0,0 55.0,0.0,4.0,0.0,1.0,117.0,0.0,2 49.0,1.0,3.0,0.0,2.0,126.0,3.0,1 74.0,0.0,2.0,0.0,2.0,121.0,1.0,0 54.0,0.0,3.0,0.0,0.0,163.0,1.0,0 54.0,1.0,4.0,0.0,2.0,116.0,2.0,3 56.0,1.0,4.0,1.0,2.0,103.0,0.0,2 46.0,1.0,4.0,0.0,2.0,144.0,0.0,1 49.0,0.0,2.0,0.0,0.0,162.0,0.0,0 42.0,1.0,2.0,0.0,0.0,162.0,0.0,0 45.0,1.0,4.0,0.0,2.0,185.0,0.0,0 34.0,1.0,1.0,0.0,2.0,174.0,0.0,0 57.0,0.0,4.0,0.0,2.0,159.0,1.0,0 71.0,0.0,3.0,1.0,2.0,130.0,1.0,0 49.0,1.0,3.0,0.0,0.0,139.0,3.0,3 54.0,1.0,2.0,0.0,0.0,156.0,0.0,0 59.0,1.0,4.0,0.0,0.0,162.0,1.0,2 57.0,1.0,3.0,0.0,2.0,150.0,1.0,1 61.0,1.0,4.0,0.0,0.0,140.0,1.0,2 39.0,1.0,4.0,0.0,0.0,140.0,0.0,3 61.0,0.0,4.0,0.0,2.0,146.0,0.0,1 56.0,1.0,4.0,1.0,2.0,144.0,1.0,1 52.0,1.0,1.0,0.0,2.0,190.0,0.0,0 43.0,0.0,4.0,1.0,2.0,136.0,0.0,2 67.0,1.0,4.0,0.0,2.0,108.0,3.0,2 67.0,1.0,4.0,0.0,2.0,129.0,2.0,1 37.0,1.0,3.0,0.0,0.0,187.0,0.0,0 41.0,0.0,2.0,0.0,2.0,172.0,0.0,0 56.0,1.0,2.0,0.0,0.0,178.0,0.0,0 62.0,0.0,4.0,0.0,2.0,160.0,2.0,3 57.0,0.0,4.0,0.0,0.0,163.0,0.0,0 63.0,1.0,4.0,0.0,2.0,147.0,1.0,2 53.0,1.0,4.0,1.0,2.0,155.0,0.0,1 57.0,1.0,4.0,0.0,0.0,148.0,0.0,0 56.0,0.0,2.0,0.0,2.0,153.0,0.0,0 56.0,1.0,3.0,1.0,2.0,142.0,1.0,2 44.0,1.0,2.0,0.0,0.0,173.0,0.0,0 52.0,1.0,3.0,1.0,0.0,162.0,0.0,0 57.0,1.0,3.0,0.0,0.0,174.0,0.0,0 48.0,1.0,2.0,0.0,0.0,168.0,0.0,1 54.0,1.0,4.0,0.0,0.0,160.0,0.0,0 48.0,0.0,3.0,0.0,0.0,139.0,0.0,0 50.0,1.0,2.0,0.0,0.0,160.0,0.0,0 54.0,0.0,2.0,0.0,0.0,150.0,1.0,0 54.0,0.0,2.0,0.0,1.0,155.0,1.0,0 54.0,0.0,2.0,0.0,0.0,130.0,1.0,0 54.0,0.0,2.0,0.0,0.0,130.0,1.0,0 54.0,1.0,1.0,0.0,0.0,137.0,1.0,0 54.0,1.0,2.0,0.0,0.0,142.0,1.0,0 54.0,1.0,2.0,0.0,0.0,154.0,1.0,0 54.0,1.0,2.0,0.0,0.0,110.0,1.0,0 54.0,1.0,2.0,0.0,1.0,130.0,1.0,0 59.0,1.0,4.0,0.0,0.0,140.0,0.0,0 47.0,1.0,4.0,0.0,0.0,98.0,0.0,1 56.0,1.0,4.0,0.0,0.0,120.0,0.0,1 59.0,1.0,4.0,0.0,1.0,131.0,0.0,0
You are training your model with normalized data but predicting on data that is not normalized. X_test is normalized, so the predictions on it are as expected; dummyTest is not normalized. If you normalize the dummyTest variable before feeding it to your neural network, like so:
dummyTest[0] -= mean
dummyTest[0] /= std
you will receive the expected output (1).
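For clarity, here is the fix from this answer applied to the snippet from the question. The names mean, std, and binary_model are the ones defined in the question's code (mean and std were computed on X before the train/test split):

import numpy as np

dummyTest = np.array([[67.0, 1.0, 4.0, 0.0, 2.0, 129.0, 2.0]])

# apply the same standardization that was applied to X before training
dummyTest[0] -= mean
dummyTest[0] /= std

binary_pred = np.round(binary_model.predict(dummyTest)).astype(int)
print(f"Binary Pred: {binary_pred}")  # per the answer above, this now gives 1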
Convolutional LSTM Model Dimension Incompatibility when making predictions & prediction dimension issues
I structured a Convolutional LSTM model to predict the forthcoming Bitcoin price data, using the analyzed past data of the Bitcoin close price and other features. Let me jump straight to the code: import os os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' import numpy as np import pandas as pd import matplotlib.pyplot as plt %matplotlib inline import tensorflow as tf import tensorflow.keras as keras import keras_tuner as kt from keras_tuner import HyperParameters as hp from keras.models import Sequential from keras.layers import InputLayer, ConvLSTM1D, LSTM, Flatten, RepeatVector, Dense, TimeDistributed from keras.callbacks import EarlyStopping from tensorflow.keras.metrics import RootMeanSquaredError from tensorflow.keras.optimizers import Adam import keras.backend as K from keras.losses import Huber from sklearn.preprocessing import MinMaxScaler scaler = MinMaxScaler() DIR = '../input/btc-features-targets' SEG_DIR = '../input/segmented' segmentized_features = os.listdir(SEG_DIR) btc_train_features = [] for seg in segmentized_features: train_features = pd.read_csv(f'{SEG_DIR}/{seg}') train_features.set_index('date', inplace=True) btc_train_features.append(scaler.fit_transform(train_features.values)) btc_train_targets = pd.read_csv(f'{DIR}/btc_train_targets.csv') btc_train_targets.set_index('date', inplace=True) btc_test_features = pd.read_csv(f'{DIR}/btc_test_features.csv') btc_tef1 = btc_test_features.iloc[:111] btc_tef2 = btc_test_features.iloc[25:] btc_tef1.set_index('date', inplace=True) btc_tef2.set_index('date', inplace=True) btc_test_targets = pd.read_csv(f'{DIR}/btc_test_targets.csv') btc_test_targets.set_index('date', inplace=True) btc_trt_log = np.log(btc_train_targets) btc_tefs1 = scaler.fit_transform(btc_tef1.values) btc_tefs2 = scaler.fit_transform(btc_tef2.values) btc_tet_log = np.log(btc_test_targets) scaled_train_features = [] for features in btc_train_features: shape = features.shape scaled_train_features.append(np.expand_dims(features, [0,3])) shape_2 = btc_tefs1.shape btc_tefs1 = np.expand_dims(btc_tefs1, [0,3]) shape_3 = btc_tefs2.shape btc_tefs2 = np.expand_dims(btc_tefs2, [0,3]) btc_trt_log = btc_trt_log.values[0] btc_tet_log = btc_tet_log.values[0] def build(hp): model = keras.Sequential() # Input Layer model.add(InputLayer(input_shape=(111,32,1))) # ConvLSTM1D convLSTM_hp_filters = hp.Int(name='convLSTM_filters', min_value=32, max_value=512, step=32) convLSTM_hp_kernel_size = hp.Choice(name='convLSTM_kernel_size', values=[3,5,7]) convLSTM_activation = hp.Choice(name='convLSTM_activation', values=['selu', 'relu']) model.add(ConvLSTM1D(filters=convLSTM_hp_filters, kernel_size=convLSTM_hp_kernel_size, padding='same', activation=convLSTM_activation, use_bias=True, bias_initializer='zeros')) # Flatten model.add(Flatten()) # RepeatVector model.add(RepeatVector(5)) # LSTM LSTM_hp_units = hp.Int(name='LSTM_units', min_value=32, max_value=512, step=32) LSTM_activation = hp.Choice(name='LSTM_activation', values=['selu', 'relu']) model.add(LSTM(units=LSTM_hp_units, activation=LSTM_activation, return_sequences=True)) # TimeDistributed Dense dense_units = hp.Int(name='dense_units', min_value=32, max_value=512, step=32) dense_activation = hp.Choice(name='dense_activation', values=['selu', 'relu']) model.add(TimeDistributed(Dense(units=dense_units, activation=dense_activation))) # TimeDistributed Dense_Output model.add(Dense(1)) # Set Learning Rate hp_learning_rate = hp.Choice(name='learning_rate', values=[1e-2, 1e-3, 1e-4]) # Compile Model 
model.compile(optimizer=Adam(learning_rate=hp_learning_rate), loss=Huber(), metrics=[RootMeanSquaredError()]) return model tuner = kt.Hyperband(build, objective=kt.Objective('root_mean_squared_error', direction='min'), max_epochs=10, factor=3) early_stop = EarlyStopping(monitor='root_mean_squared_error', patience=5) opt_hps = [] for train_features in scaled_train_features: tuner.search(train_features, btc_trt_log, epochs=50, callbacks=[early_stop]) opt_hps.append(tuner.get_best_hyperparameters(num_trials=1)[0]) models, epochs = ([] for _ in range(2)) for hps in opt_hps: model = tuner.hypermodel.build(hps) models.append(model) history = model.fit(train_features, btc_trt_log, epochs=70, verbose=0) rmse = history.history['root_mean_squared_error'] best_epoch = rmse.index(min(rmse)) + 1 epochs.append(best_epoch) hypermodel = tuner.hypermodel.build(opt_hps[0]) for train_features, epoch in zip(scaled_train_features, epochs): hypermodel.fit(train_features, btc_trt_log, epochs=epoch) tp1 = hypermodel.predict(btc_tefs1).flatten() tp2 = hypermodel.predict(btc_tefs2).flatten() test_predictions = np.concatenate((tp1, tp2[86:]), axis=None) The hyperparameters of the model are configured using keras_tuner; as there were ResourceExhaustError issues output by the notebook when training is done with the full features dataset, sequentially segmented datasets are used instead (and apparently, referring to the study done utilizing the similar model architecture, training is able to be efficiently done through this training approach). The input dimension of each segmented dataset is (111,32,1). There aren't any issues reported until before the last code block. The models work fine. Yet, when the .predict() function is executed, the notebook prints out an error, which states that the dimension of the input features for making predictions is incompatible with the dimension of the input features used while training. I did not understand the reason behind its occurrence, since as far as I know, the input dimensions of a train dataset for a DNN model cannot be identical as the input dimensions of a test dataset. Even though all the price data from 2018 to early 2021 are used as training datasets, predictions are only needed for the mid 2021 timeframe. The dataset used for prediction has a dimension of (136,32,1). I tried matching the dimension of this dataset to (111,32,1), through index slicing. Now this showed issues in the output dimension. While predictions should be made for 136 data points, the result only returned 10. Are there any issues relevant to the model configuration? Cannot interpret the current situation.
Python keras sequential model predicts the same value (y_train average) for all inputs
I'm trying to build a sequential neural network with keras. I generate a dataset with inserting randoms in a known function and train my model with this dataset, long enough to get a steady loss. Then I ask the model to predict the x_train values, but instead of predicting something close to y_train, it returns the same value regardless of the input x. This value also happens to be the average of y_train values. I don't understand what I'm doing wrong and why this is happening. I'm using the following function for training the model: def train_model(x_train,y_train,batch_size,input_size,layer_sizes,activations,optimizer,epochs,loss='MeanSquaredError'): assert len(layer_sizes) == len(activations) n_layers=len(layer_sizes) model = Sequential() model.add(LayerNormalization(input_dim=input_size)) model.add(Dense(layer_sizes[0],kernel_regularizer='l2',kernel_initializer='ones',activation=activations[0],input_dim=input_size,name='layer1')) for i in range(1,n_layers): model.add(Dense(layer_sizes[i],kernel_initializer='ones',activation=activations[i],name=f'layer{i+1}')) model.compile( optimizer = optimizer, loss = loss, #MeanSquaredLogarithmicError ) print(model.summary()) history = model.fit(x_train,y_train,batch_size=batch_size,epochs=epochs) loss_history = history.history['loss'] plt.scatter(x=np.arange(1,epochs+1),y=loss_history) plt.show() return model I then created an arbitrary function (just for test purposes) as: def func(x1,x2,x3,x4): y=(x1**3+(x2*x3+2))/(x4+x2*x1) return y and made a random dataset with this function: def random_points_in_range(n,ranges): points = np.empty((n,len(ranges))) for i,element in enumerate(ranges): start=min(element[1],element[0]) interval=abs(element[1]-element[0]) rand_check = np.random.rand(n) randoms = ( rand_check*interval ) + start points[:,i] = randoms.T return points def generate_random_dataset(n=200,ranges=[(0,10),(0,10),(0,10),(0,10)]): x_dataset = random_points_in_range(n,ranges) y_dataset = np.empty(n) for i in range(n): x1,x2,x3,x4 = x_dataset[i] y_dataset[i] = func(x1,x2,x3,x4) return x_dataset,y_dataset I then train a model with these functions: x_train,y_train = generate_random_dataset() layer_sizes = [6,8,10,10,1] activations = [LeakyReLU(),'relu','swish','relu','linear'] opt = Adam(learning_rate=0.001) epochs = 3000 model=train_model(x_train,y_train,5,4,layer_sizes,activations,opt,epochs,loss='MeanSquaredError') if you want to run the code these are things you need to import: import numpy as np from matplotlib import pyplot as plt from sklearn.model_selection import train_test_split import random from tensorflow.keras.models import Sequential from tensorflow.keras.layers import Dense from tensorflow.keras.layers import LayerNormalization from tensorflow.keras.optimizers import Adam from tensorflow.keras import regularizers
How do you write Keras model summary to a dataframe?
First, I'll say this is not the way to run a Keras model correctly. There should be a train and test set. The assignment was strictly to develop intuition, so no test set. I am running a model through several permutations of neurons, activation functions, batches, and layers. Here is the code I am using.
from sklearn.datasets import make_classification

X1, y1 = make_classification(n_samples=90000, n_features=17, n_informative=6, n_redundant=0, n_repeated=0,
                             n_classes=8, n_clusters_per_class=3, weights=None, flip_y=.3, class_sep=.4,
                             hypercube=False, shift=3, scale=2, shuffle=True, random_state=840780)
class_num = 8

# ----------------------------------------------------------------
import itertools

final_param_list = []
# param_list_gen order is units, activation function, batch size, layers
param_list_gen = [[10, 20, 50], ["sigmoid", "relu", "LeakyReLU"], [8, 16, 32], [1, 2]]
for element in itertools.product(*param_list_gen):
    final_param_list.append(element)

# --------------------------------------------------------------------------------------
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten, LeakyReLU
from keras.callbacks import History
import tensorflow as tf
import numpy as np
import pandas as pd

# --------------------------------------------------------------------------------------
# -------- Model 1 - permutations of neurons, activation functions, batch size and layers -------- #
for param in final_param_list:
    q2model1 = Sequential()

    # hidden layer 1
    q2model1.add(Dense(param[0]))
    if param[1] != 'LeakyReLU':
        q2model1.add(Activation(param[1]))
    else:
        q2model1.add(LeakyReLU(alpha=0.1))

    if param[3] == 2:
        # hidden layer 2
        q2model1.add(Dense(param[0]))
        if param[1] != 'LeakyReLU':
            q2model1.add(Activation(param[1]))
        else:
            q2model1.add(LeakyReLU(alpha=0.1))

    # output layer
    q2model1.add(Dense(class_num, activation='softmax'))

    q2model1.compile(loss='sparse_categorical_crossentropy', optimizer='RMSProp', metrics=['accuracy'])

    # Step 3: Fit the model
    history = q2model1.fit(X1, y1, epochs=20)
Seems to work fine. Now, I've been tasked to output the accuracy of each epoch and include the neurons, activation function, batch size, and layers.
This gives me all of the accuracies for each epoch:
print(history.history['acc'])
This gives me the params:
print(param)
This gives me a summary, although I'm not sure if this is the best approach:
print(q2model1.summary())
Is there a way to print out each epoch to a pandas dataframe so it looks like this?
Phase(list index + 1) | # Neurons | Activation function | Batch size | Layers | Acc epoch1 | Acc epoch2 | ......... | Acc epoch20
That's about it. If you see anything in the model itself that is blatantly wrong, or if I am missing some key code, please let me know.
You can try out:
import pandas as pd

# assuming you stored your model.fit results in a 'history' variable:
history = model.fit(x_train, y_train, epochs=20)

# convert the history.history dictionary to a pandas dataframe:
hist_df = pd.DataFrame(history.history)

# check out the result with print, e.g.:
print(hist_df)

# or the describe() method:
hist_df.describe()
Keras also has a CSVLogger: https://keras.io/callbacks/#csvlogger which may be of interest.
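To get the exact row layout asked for (one row per parameter combination, with the per-epoch accuracies as columns), one possible sketch is to collect a dict per run inside the loop and build the dataframe at the end. The names final_param_list, q2model1, X1, and y1 follow the question's code; this is just one way to do it, not the only one:

import pandas as pd

results = []  # one dict per parameter combination
for i, param in enumerate(final_param_list):
    # ... build, compile, and fit q2model1 exactly as in the question ...
    history = q2model1.fit(X1, y1, epochs=20)

    # the history key is 'acc' or 'accuracy' depending on the Keras version
    accs = history.history.get('accuracy', history.history.get('acc'))

    row = {'Phase': i + 1,
           'Neurons': param[0],
           'Activation function': param[1],
           'Batch size': param[2],
           'Layers': param[3]}
    row.update({f'Acc epoch{e + 1}': a for e, a in enumerate(accs)})
    results.append(row)

results_df = pd.DataFrame(results)
print(results_df)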
"Same" network on MATLAB and Keras has very different results
I have been trying to replicate the same simple network structure in MATLAB and Keras. The problem is the accuracy I get is very different. MATLAB code gets accuracy near 0.84 and loss near 17 and Keras code gets accuracy near 0.63 and loss near 130, with Keras using double epochs to train and the same data. I think the difference is too big to be a matter of implementation, so I think I'm missing something. The original code is from a MATLAB example in which I have made a little change to avoid normalization in the first layer. Here is the MATLAB code: % Load the digit training set as 4-D array data using % |digitTrain4DArrayData|. [trainImages,~,trainAngles] = digitTrain4DArrayData; disp("Train Images:") disp(trainImages(:,:,:,1)) % Display 20 random sample training digits using |imshow|. numTrainImages = size(trainImages,4); figure idx = randperm(numTrainImages,20); for i = 1:numel(idx) subplot(4,5,i) imshow(trainImages(:,:,:,idx(i))) drawnow end %% % Combine all the layers together in a |Layer| array. layers = [ ... imageInputLayer([28 28 1], 'Normalization', 'none') convolution2dLayer(12,25) reluLayer fullyConnectedLayer(1) regressionLayer]; %% Train Network' options = trainingOptions('sgdm','InitialLearnRate',0.001, ... 'MaxEpochs',15) net = trainNetwork(trainImages,trainAngles,layers,options) net.Layers %% Test Network [testImages,~,testAngles] = digitTest4DArrayData; predictedTestAngles = predict(net,testImages); % *Evaluate Performance* predictionError = testAngles - predictedTestAngles; thr = 10; numCorrect = sum(abs(predictionError) < thr); numTestImages = size(testImages,4); accuracy = numCorrect/numTestImages %% % Use the root-mean-square error (RMSE) to measure the differences between % the predicted and actual angles of rotation. squares = predictionError.^2; rmse = sqrt(mean(squares)) %% % *Display Box Plot of Residuals for Each Digit Class* residuals = testAngles - predictedTestAngles; residualMatrix = reshape(residuals,500,10); figure boxplot(residualMatrix, ... 
'Labels',{'0','1','2','3','4','5','6','7','8','9'}) xlabel('Digit Class') ylabel('Degrees Error') title('Residuals') idx = randperm(numTestImages,49); for i = 1:numel(idx) image = testImages(:,:,:,idx(i)); predictedAngle = predictedTestAngles(idx(i)); imagesRotated(:,:,:,i) = imrotate(image,predictedAngle,'bicubic','crop'); end figure subplot(1,2,1) montage(testImages(:,:,:,idx)) title('Original') subplot(1,2,2) montage(imagesRotated) title('Corrected') Here is the Keras code: import numpy as np import scipy.io import matplotlib.pyplot as plt from keras.datasets import mnist from keras.models import Sequential from keras.layers import Dropout, Flatten, Dense, Activation, Conv2D, BatchNormalization, AveragePooling2D from keras.optimizers import SGD from keras.utils import np_utils from keras import regularizers np.random.seed(1671) # for reproducibility # network and training NB_EPOCH = 30 BATCH_SIZE = 128 VERBOSE = 1 NB_CLASSES = 10 # number of outputs = number of digits OPTIMIZER = SGD() # SGD optimizer, explained later in this chapter N_HIDDEN = 128 VALIDATION_SPLIT=0.2 # how much TRAIN is reserved for VALIDATION data = scipy.io.loadmat('RegressionImageData.mat') XTrain = np.rollaxis(data['XTrain'],3,0) XTest = np.rollaxis(data['XTest'],3,0) YTest = np.squeeze(data['YTest']) YTrain = np.squeeze(data['YTrain']) print("Train Images:") print(XTrain.shape) print(type(XTrain)) print(XTrain) XTrain_test = np.reshape(XTrain, (5000,28,28)) with open("./test.txt", "a+") as file: np.set_printoptions(threshold=np.nan) file.write(np.array2string(XTrain_test[0], max_line_width=np.inf)) model = Sequential() model.add(Conv2D(25,(12,12), input_shape=(28,28,1), strides=(1,1), activation = "relu")) model.add(Flatten()) model.add(Dense(1)) model.summary() sgd = SGD(lr=0.001, decay=0.1, momentum=0.9, nesterov=False) model.compile(loss='mean_squared_error', optimizer=sgd) history = model.fit(XTrain, YTrain, batch_size=BATCH_SIZE, epochs=NB_EPOCH, verbose=VERBOSE, validation_split=VALIDATION_SPLIT, shuffle=False) predictions= model.predict(XTrain) [np.transpose(predictions[1:50]), np.transpose(YTrain[1:50]), np.abs(np.transpose(predictions[1:50])- np.transpose(YTrain[1:50]))] plt.plot(history.history['loss']) plt.plot(history.history['val_loss']) plt.title('rmse') plt.ylabel('loss') plt.xlabel('epoch') plt.legend(['train', 'test'], loc='upper left') plt.show() Ypred_test= model.predict(XTest) Ypred_test=np.reshape(Ypred_test, (5000,)) predictionError = Ypred_test - YTest thr = 10; numCorrect = np.sum((np.abs(predictionError) < thr)*1) numValidationImages = len(YTest) accuracy = numCorrect/numValidationImages print(accuracy) squares = np.power(predictionError,2) rmse = np.sqrt(np.mean(squares)) print(rmse) Anyone know where could be the gap?