How do you write Keras model summary to a dataframe? - python
First, I'll say up front that this is not the correct way to run a Keras model: there should be separate train and test sets. The assignment was strictly to develop intuition, so there is no test set.
I am running a model through several permutations of neurons, activation functions, batch sizes and layers. Here is the code I am using.
from sklearn.datasets import make_classification
X1, y1 = make_classification(n_samples=90000, n_features=17, n_informative=6, n_redundant=0, n_repeated=0, n_classes=8, n_clusters_per_class=3, weights=None, flip_y=.3, class_sep=.4, hypercube=False, shift=3, scale=2, shuffle=True, random_state=840780)
class_num = 8
# ----------------------------------------------------------------
import itertools
final_param_list = []
# param_list_gen order is units, activation function, batch size, layers
param_list_gen = [[10, 20, 50], ["sigmoid", "relu", "LeakyReLU"], [8, 16, 32], [1, 2]]
for element in itertools.product(*param_list_gen):
    final_param_list.append(element)
# --------------------------------------------------------------------------------------
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten, LeakyReLU
from keras.callbacks import History
import tensorflow as tf
import numpy as np
import pandas as pd
# --------------------------------------------------------------------------------------
# -------- Model 1 - permutations of neurons, activation functions, batch sizes and layers -------- #
for param in final_param_list:
    q2model1 = Sequential()

    # hidden layer 1
    q2model1.add(Dense(param[0]))
    if param[1] != 'LeakyReLU':
        q2model1.add(Activation(param[1]))
    else:
        q2model1.add(LeakyReLU(alpha=0.1))

    if param[3] == 2:
        # hidden layer 2
        q2model1.add(Dense(param[0]))
        if param[1] != 'LeakyReLU':
            q2model1.add(Activation(param[1]))
        else:
            q2model1.add(LeakyReLU(alpha=0.1))

    # output layer
    q2model1.add(Dense(class_num, activation='softmax'))

    q2model1.compile(loss='sparse_categorical_crossentropy', optimizer='RMSProp', metrics=['accuracy'])

    # Step 3: Fit the model, using the batch size from the current permutation (param[2])
    history = q2model1.fit(X1, y1, epochs=20, batch_size=param[2])
This seems to work fine. Now I've been tasked with outputting the accuracy of each epoch along with the number of neurons, the activation function, the batch size, and the number of layers.
This gives me all of the accuracies for each epoch:
print(history.history['acc'])
This gives me the params
print(param)
This gives me a summary, although I'm not sure whether this is the best approach:
print(q2model1.summary())
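Note that summary() prints its table itself and returns None, so wrapping it in print() also prints "None". If the goal is to capture the summary text, the print_fn argument of summary() can collect it; a minimal sketch, with illustrative variable names:
summary_lines = []
q2model1.summary(print_fn=summary_lines.append)  # collect each printed line instead of writing to stdout
summary_text = "\n".join(summary_lines)
print(summary_text)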
Is there a way to write each epoch's results out to a pandas dataframe so it looks like this?
Phase (list index + 1) | # Neurons | Activation function | Batch size | Layers | Acc epoch1 | Acc epoch2 | ... | Acc epoch20
That's about it. If you see anything in the model itself that is blatantly wrong, or if I am missing some key code, please let me know.
You can try out:
import pandas as pd
# assuming you stored your model.fit results in a 'history' variable:
history = model.fit(x_train, y_train, epochs=20)
# convert the history.history dictionary to a pandas dataframe:
hist_df = pd.DataFrame(history.history)
# checkout result with print e.g.:
print(hist_df)
# or the describe() method:
hist_df.describe()
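Building on that, here is a minimal sketch of how the table from the question could be assembled, one row per permutation. It assumes a hypothetical build_model(param) helper that builds and compiles the model exactly as in the question's loop, and reuses the question's final_param_list, X1 and y1:
import pandas as pd

rows = []
for i, param in enumerate(final_param_list):
    model = build_model(param)  # hypothetical helper: layers/activation built as in the question
    history = model.fit(X1, y1, epochs=20, batch_size=param[2])
    # older Keras stores accuracy under 'acc', newer versions under 'accuracy'
    acc_key = 'acc' if 'acc' in history.history else 'accuracy'
    row = {'Phase': i + 1,
           '# Neurons': param[0],
           'Activation function': param[1],
           'Batch size': param[2],
           'Layers': param[3]}
    for epoch, acc in enumerate(history.history[acc_key]):
        row['Acc epoch' + str(epoch + 1)] = acc
    rows.append(row)

results_df = pd.DataFrame(rows)
print(results_df)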
Keras also has a CSVLogger callback (https://keras.io/callbacks/#csvlogger), which may be of interest.
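For completeness, a minimal sketch of the CSVLogger mentioned above (the file name is arbitrary); it writes the same per-epoch metrics to disk, which can then be read back with pandas:
from keras.callbacks import CSVLogger

csv_logger = CSVLogger('training_log.csv', append=False, separator=',')
history = model.fit(x_train, y_train, epochs=20, callbacks=[csv_logger])
log_df = pd.read_csv('training_log.csv')
print(log_df)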
Related
What causes tensorflow keras Conv1D to only run the 1st epoch?
Currently I am using tensorflow to create a neural network with a 1D convolutional layer and a Dense layer to predict a single output value. The input array for the neural network is an array of 1500 samples; each sample is an array of 27x13 values.
I started training in the same manner as I did without the 1D conv layer, but the training stopped during the first epoch without warning. I found that multiprocessing might be the cause and that I should turn multiprocessing off, as discussed here: https://github.com/stellargraph/stellargraph/issues/1006 - basically adding this to my keras model: use_multiprocessing=False
That did not change anything, after which I found that I should probably use a Dataset to bypass the multiprocessing issues, according to https://github.com/stellargraph/stellargraph/issues/1206 (Replace tf.keras.Sequence objects with tf.data.Dataset #1206). After struggling with the difference between tf.data.Dataset.from_tensors and tf.data.Dataset.from_tensor_slices, I found the following code to start executing the model.fit block again. As you might have guessed, it still stops running after the first epoch:
main loop started
Epoch 1/5
Press any key to continue . . .
Can someone pinpoint the source of the halting of the program? This is my code:
import random
import numpy as np
from keras import backend as K
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from keras.models import load_model
from keras.callbacks import CSVLogger

EPOCHS = 5
BATCH_SIZE = 16

def tfdata_generator(x, y, is_training, batch_size=BATCH_SIZE):
    '''Construct a data generator using `tf.Dataset`.'''
    dataset = tf.data.Dataset.from_tensor_slices((x, y))
    if is_training:
        dataset = dataset.shuffle(1500)  # depends on sample size
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.repeat()
    dataset = dataset.prefetch(1)
    return dataset

def main():
    print("main loop started")
    X_train = np.random.randn(1500, 27, 13)
    Y_train = np.random.randn(1500, 1)
    training_set = tfdata_generator(X_train, Y_train, is_training=True)
    data = np.random.randn(1500, 27, 13), Y_train
    training_set = tf.data.Dataset.from_tensors((X_train, Y_train))
    logstring = "C:\Documents\Conv1D"
    csv_logger = CSVLogger((logstring + ".csv"), append=True, separator=';')
    early_stopper = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=20, min_delta=0.00001)
    model = keras.Sequential()
    model.add(layers.Conv1D(filters=10, kernel_size=9, strides=3, padding="valid"))
    model.add(layers.Flatten())
    model.add(layers.Dense(70, activation='relu', name="layer2"))
    model.add(layers.Dense(1))
    optimizer = keras.optimizers.Adam(learning_rate=0.0001)
    model.compile(optimizer=optimizer, loss="mean_squared_error")
    # WARNING:tensorflow:multiprocessing can interact badly with TensorFlow, causing nondeterministic deadlocks. For high performance data pipelines tf.data is recommended.
    model.fit(training_set,
              epochs=EPOCHS,
              batch_size=BATCH_SIZE,
              verbose=2,
              # validation_split=0.2,
              use_multiprocessing=False)
    model.summary()
    modelstring = "C:\Documents\Conv1D_finishedmodel"
    model.save(modelstring, overwrite=True)
    model = load_model(modelstring)

main()
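For reference, a minimal sketch of the tf.data pipeline this question is aiming at, using the same shapes; one point worth noting is that once the dataset uses repeat(), fit() needs steps_per_epoch or an epoch has no defined end. This is only an illustrative sketch, not a confirmed fix for the halt described above:
import numpy as np
import tensorflow as tf

# dummy data with the shapes from the question
X_train = np.random.randn(1500, 27, 13).astype("float32")
Y_train = np.random.randn(1500, 1).astype("float32")
BATCH_SIZE = 16

# one element per sample, shuffled, batched and repeated
dataset = (tf.data.Dataset.from_tensor_slices((X_train, Y_train))
           .shuffle(1500)
           .batch(BATCH_SIZE)
           .repeat()
           .prefetch(1))

model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(filters=10, kernel_size=9, strides=3, padding="valid", input_shape=(27, 13)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(70, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mean_squared_error")

# with repeat(), tell fit() how many batches make up one epoch
model.fit(dataset, epochs=5, steps_per_epoch=len(X_train) // BATCH_SIZE, verbose=2)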
Convolutional LSTM Model Dimension Incompatibility when making predictions & prediction dimension issues
I structured a Convolutional LSTM model to predict the forthcoming Bitcoin price data, using the analyzed past data of the Bitcoin close price and other features. Let me jump straight to the code:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import tensorflow as tf
import tensorflow.keras as keras
import keras_tuner as kt
from keras_tuner import HyperParameters as hp
from keras.models import Sequential
from keras.layers import InputLayer, ConvLSTM1D, LSTM, Flatten, RepeatVector, Dense, TimeDistributed
from keras.callbacks import EarlyStopping
from tensorflow.keras.metrics import RootMeanSquaredError
from tensorflow.keras.optimizers import Adam
import keras.backend as K
from keras.losses import Huber
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()

DIR = '../input/btc-features-targets'
SEG_DIR = '../input/segmented'

segmentized_features = os.listdir(SEG_DIR)
btc_train_features = []
for seg in segmentized_features:
    train_features = pd.read_csv(f'{SEG_DIR}/{seg}')
    train_features.set_index('date', inplace=True)
    btc_train_features.append(scaler.fit_transform(train_features.values))

btc_train_targets = pd.read_csv(f'{DIR}/btc_train_targets.csv')
btc_train_targets.set_index('date', inplace=True)

btc_test_features = pd.read_csv(f'{DIR}/btc_test_features.csv')
btc_tef1 = btc_test_features.iloc[:111]
btc_tef2 = btc_test_features.iloc[25:]
btc_tef1.set_index('date', inplace=True)
btc_tef2.set_index('date', inplace=True)

btc_test_targets = pd.read_csv(f'{DIR}/btc_test_targets.csv')
btc_test_targets.set_index('date', inplace=True)

btc_trt_log = np.log(btc_train_targets)
btc_tefs1 = scaler.fit_transform(btc_tef1.values)
btc_tefs2 = scaler.fit_transform(btc_tef2.values)
btc_tet_log = np.log(btc_test_targets)

scaled_train_features = []
for features in btc_train_features:
    shape = features.shape
    scaled_train_features.append(np.expand_dims(features, [0,3]))

shape_2 = btc_tefs1.shape
btc_tefs1 = np.expand_dims(btc_tefs1, [0,3])

shape_3 = btc_tefs2.shape
btc_tefs2 = np.expand_dims(btc_tefs2, [0,3])

btc_trt_log = btc_trt_log.values[0]
btc_tet_log = btc_tet_log.values[0]

def build(hp):
    model = keras.Sequential()

    # Input Layer
    model.add(InputLayer(input_shape=(111,32,1)))

    # ConvLSTM1D
    convLSTM_hp_filters = hp.Int(name='convLSTM_filters', min_value=32, max_value=512, step=32)
    convLSTM_hp_kernel_size = hp.Choice(name='convLSTM_kernel_size', values=[3,5,7])
    convLSTM_activation = hp.Choice(name='convLSTM_activation', values=['selu', 'relu'])
    model.add(ConvLSTM1D(filters=convLSTM_hp_filters,
                         kernel_size=convLSTM_hp_kernel_size,
                         padding='same',
                         activation=convLSTM_activation,
                         use_bias=True,
                         bias_initializer='zeros'))

    # Flatten
    model.add(Flatten())

    # RepeatVector
    model.add(RepeatVector(5))

    # LSTM
    LSTM_hp_units = hp.Int(name='LSTM_units', min_value=32, max_value=512, step=32)
    LSTM_activation = hp.Choice(name='LSTM_activation', values=['selu', 'relu'])
    model.add(LSTM(units=LSTM_hp_units, activation=LSTM_activation, return_sequences=True))

    # TimeDistributed Dense
    dense_units = hp.Int(name='dense_units', min_value=32, max_value=512, step=32)
    dense_activation = hp.Choice(name='dense_activation', values=['selu', 'relu'])
    model.add(TimeDistributed(Dense(units=dense_units, activation=dense_activation)))

    # TimeDistributed Dense_Output
    model.add(Dense(1))

    # Set Learning Rate
    hp_learning_rate = hp.Choice(name='learning_rate', values=[1e-2, 1e-3, 1e-4])

    # Compile Model
    model.compile(optimizer=Adam(learning_rate=hp_learning_rate),
                  loss=Huber(),
                  metrics=[RootMeanSquaredError()])

    return model

tuner = kt.Hyperband(build,
                     objective=kt.Objective('root_mean_squared_error', direction='min'),
                     max_epochs=10,
                     factor=3)

early_stop = EarlyStopping(monitor='root_mean_squared_error', patience=5)

opt_hps = []
for train_features in scaled_train_features:
    tuner.search(train_features, btc_trt_log, epochs=50, callbacks=[early_stop])
    opt_hps.append(tuner.get_best_hyperparameters(num_trials=1)[0])

models, epochs = ([] for _ in range(2))
for hps in opt_hps:
    model = tuner.hypermodel.build(hps)
    models.append(model)
    history = model.fit(train_features, btc_trt_log, epochs=70, verbose=0)
    rmse = history.history['root_mean_squared_error']
    best_epoch = rmse.index(min(rmse)) + 1
    epochs.append(best_epoch)

hypermodel = tuner.hypermodel.build(opt_hps[0])
for train_features, epoch in zip(scaled_train_features, epochs):
    hypermodel.fit(train_features, btc_trt_log, epochs=epoch)

tp1 = hypermodel.predict(btc_tefs1).flatten()
tp2 = hypermodel.predict(btc_tefs2).flatten()
test_predictions = np.concatenate((tp1, tp2[86:]), axis=None)
The hyperparameters of the model are configured using keras_tuner; as there were ResourceExhaustedError issues reported by the notebook when training is done with the full features dataset, sequentially segmented datasets are used instead (and apparently, referring to a study that used a similar model architecture, training can be done efficiently with this approach). The input dimension of each segmented dataset is (111,32,1).
There aren't any issues reported until the last code block. The models work fine. Yet, when the .predict() function is executed, the notebook prints out an error which states that the dimension of the input features for making predictions is incompatible with the dimension of the input features used while training. I do not understand why this occurs, since as far as I know the input dimensions of a training dataset do not have to be identical to those of a test dataset. Even though all the price data from 2018 to early 2021 are used as training datasets, predictions are only needed for the mid-2021 timeframe.
The dataset used for prediction has a dimension of (136,32,1). I tried matching the dimension of this dataset to (111,32,1) through index slicing. Now this showed issues in the output dimension: while predictions should be made for 136 data points, the result only returned 10. Are there any issues relevant to the model configuration? I cannot interpret the current situation.
How to solve the TypeError about Deep Neural Network using my CSV file?
I have a CSV file to train my model. Here is my dataset: Time,Emoji_NUM?,Website_NUM?,Y)1,Y)2,Y)3,Y)4,Y)5,Y)6,Y)7,Y)8,Y)9,Y)10,Y)11,Y)12,Y)13,Y)14,Y)15,Y)16,Y)17,Y)18,Y)19,Y)20,Y)21,Y)22,Y)23,Y)24,Y)25,Y)26,Y)27,Y)28,Y)29,Y)30,Y)31,Y)32,Y)33,Y)34,Y)35,Y)36,Y)37,Y)38,Y)39,Y)40,Y)41,Y)42,Y)43,Y)44,Y)45,Y)46,Y)47,Y)48,Y)49,B)1,B)2,B)3,B)4,B)5,B)6,B)7,B)8,B)9,B)10,B)11,B)12,B)13,B)14,B)15,B)16,B)17,B)18,B)19,B)20,B)21,B)22,B)23,B)24,B)25,B)26,B)27,Target 0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 3,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 3,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 23,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0 23,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0 9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 8,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 8,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 9,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 
16,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 12,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 10,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 17,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 1,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 13,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 13,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 14,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 12,7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 13,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0 1,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 12,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 20,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 16,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 16,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 13,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 13,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 13,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 13,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 16,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1 The Target column is the output, and the rest of the columns are the features. 
The following is my code:
import pandas as pd
import numpy as np

input_file = 'data.csv'
df = pd.read_csv(input_file, encoding='utf-8')

X_Data = df[['Time','Emoji_NUM?','Website_NUM?','Y)1','Y)2','Y)3','Y)4','Y)5','Y)6','Y)7','Y)8','Y)9','Y)10','Y)11','Y)12','Y)13','Y)14','Y)15','Y)16','Y)17','Y)18','Y)19','Y)20','Y)21','Y)22','Y)23','Y)24','Y)25','Y)26','Y)27','Y)28','Y)29','Y)30','Y)31','Y)32','Y)33','Y)34','Y)35','Y)36','Y)37','Y)38','Y)39','Y)40','Y)41','Y)42','Y)43','Y)44','Y)45','Y)46','Y)47','Y)48','Y)49','B)1','B)2','B)3','B)4','B)5','B)6','B)7','B)8','B)9','B)10','B)11','B)12','B)13','B)14','B)15','B)16','B)17','B)18','B)19','B)20','B)21','B)22','B)23','B)24','B)25','B)26','B)27']].values
y_Data = df['Target'].values

X_Data.shape
y_Data.shape

#assert len(set(train_X).intersection(valid_X).intersection(test_X)) == 0
print(f"Train has {len(train_y)} data")
print(f"Valid has {len(valid_y)} data")
print(f"Test has {len(test_y)} data")

from sklearn import preprocessing
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import SGD, Adam

model = Sequential()
model.add(Dense(64, input_dim=3, activation='relu'))
model.add(Dense(1))
model.compile(loss='mse', optimizer=SGD(lr=0.1), metrics=['mse','mape'])

import math
print("Starting training ")
batch_size = math.floor(len(train_y)/5000)
dnn = model.fit(train_X, train_y, epochs=20, batch_size=batch_size)
When running dnn = model.fit(train_X, train_y, epochs=20, batch_size=batch_size), I got the following error:
ValueError: Input 0 of layer sequential_1 is incompatible with the layer: expected axis -1 of input shape to have value 3 but received input with shape (None, 79)
How do I resolve this?
Ignoring that train_X, train_y, etc. aren't defined in your code: right now you're instructing the first layer to take only three values, but you want to use 79. This is actually stated in the error message. If you change input_dim=3 to input_dim=79, it will work.
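For illustration, a minimal sketch of the corrected model definition; it simply mirrors the question's model with input_dim adjusted to the 79 feature columns, everything else unchanged:
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

model = Sequential()
# input_dim must match the number of feature columns (79 here), not 3
model.add(Dense(64, input_dim=79, activation='relu'))
model.add(Dense(1))
model.compile(loss='mse', optimizer=SGD(lr=0.1), metrics=['mse', 'mape'])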
LSTM model has constant accuracy and doesn't variate
I'm stuck, as you can see, with my LSTM model. I'm trying to predict the amount of tons to produce per month. When I run the model to train, the accuracy is almost constant; it has a minimal variation like:
0.34406
0.34407
0.34408
I tried different combinations of activations, initializers and parameters, and the accuracy doesn't increase. I don't know if the problem here is my data, my model, or whether this value is the maximum accuracy the model can reach. Here is the code (if you notice some unused libraries, it's because I made some changes since the first version):
import numpy as np
import pandas as pd
from pandas.tseries.offsets import DateOffset
from sklearn.preprocessing import MinMaxScaler, StandardScaler, RobustScaler
from sklearn import preprocessing
import keras
%tensorflow_version 2.x
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Dropout
from keras.optimizers import Adam
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline
from plotly.offline import iplot
import matplotlib.pyplot as plt
import chart_studio.plotly as py
import plotly.offline as pyoff
import plotly.graph_objs as go

df_ventas = pd.read_csv('/content/drive/My Drive/proyectoPanimex/DEOPE.csv', parse_dates=['Data Emissão'], index_col=0, squeeze=True)
#df_ventas = df_ventas.resample('M').sum().reset_index()
df_ventas = df_ventas.drop(columns=['weekday', 'month'], axis=1)
df_ventas = df_ventas.reset_index()
df_ventas = df_ventas.rename(columns={'Data Emissão': 'Fecha', 'Un': 'Cantidad'})
df_ventas['dia'] = [x.day for x in df_ventas.Fecha]
df_ventas['mes'] = [x.month for x in df_ventas.Fecha]
df_ventas['anio'] = [x.year for x in df_ventas.Fecha]
df_ventas = df_ventas[:-48]
df_ventas = df_ventas.drop(columns='Fecha')

df_diff = df_ventas.copy()
df_diff['cantidad_anterior'] = df_diff['Cantidad'].shift(1)
df_diff = df_diff.dropna()
df_diff['diferencia'] = (df_diff['Cantidad'] - df_diff['cantidad_anterior'])
df_supervised = df_diff.drop(['cantidad_anterior'], axis=1)

#adding lags
for inc in range(1, 31):
    nombre_columna = 'retraso_' + str(inc)
    df_supervised[nombre_columna] = df_supervised['diferencia'].shift(inc)
df_supervised = df_supervised.dropna()

df_supervisedNumpy = df_supervised.to_numpy()
train = df_supervisedNumpy

scaler = MinMaxScaler(feature_range=(0, 1))
X_train = scaler.fit(train)
train = train.reshape(train.shape[0], train.shape[1])
train_scaled = scaler.transform(train)
X_train, y_train = train_scaled[:, 1:], train_scaled[:, 0:1]
X_train = X_train.reshape(X_train.shape[0], 1, X_train.shape[1])

#LSTM MODEL
model = Sequential()
act = 'tanh'
actF = 'relu'
model.add(LSTM(200, activation=act, input_dim=34, return_sequences=True))
model.add(Dropout(0.15))
#model.add(Flatten())
model.add(LSTM(200, activation=act))
model.add(Dropout(0.2))
#model.add(Flatten())
model.add(Dense(200, activation=act))
model.add(Dropout(0.3))
model.add(Dense(1, activation=actF))

optimizer = keras.optimizers.Adam(lr=0.00001)
model.compile(optimizer=optimizer, loss=keras.losses.binary_crossentropy, metrics=['accuracy'])

history = model.fit(X_train, y_train, batch_size=100, epochs=50, verbose=1)
hist = pd.DataFrame(history.history)
hist['Epoch'] = history.epoch
hist
History output:
       loss       acc  Epoch
0  0.847146  0.344070      0
1  0.769400  0.344070      1
2  0.703548  0.344070      2
3  0.698137  0.344070      3
4  0.653952  0.344070      4
As you can see, the only value that changes is the loss, but what is going on with the accuracy? I'm just starting with machine learning, and I don't have enough knowledge yet to see my errors. Thanks!
A Dense(1, activation='softmax') will always freeze and not learn anything.
A Dense(1, activation='relu') will very probably freeze and not learn anything.
A Dense(1, activation='sigmoid') is ideal for (binary) classification problems and somewhat good for regression with values between 0 and 1.
A Dense(1, activation='tanh') is somewhat good for regression with values between -1 and 1.
A Dense(1, activation='softplus') is somewhat good for regression with values between 0 and +infinity.
A Dense(1, activation='linear') is good for regression in general with no limits (but it's highly recommended that the data be normalized beforehand).
For regression you can't use accuracy, but note that the metrics 'mae' and 'mse' don't provide a "relative" difference; they provide the "absolute" mean difference, one linear, the other squared.
Your output activation should be linear for continuous prediction or softmax for classification. Also multiply your learning rate by 100. Your loss should be mean_absolute_error. You could also easily divide your LSTM neurons by a factor of 10. The tanh should be replaced by relu or the like.
As for your accuracy problem, it makes no sense to use accuracy, since you're not trying to classify. For metrics you can use mae: you're trying to know how far the prediction is from the actual target, on a continuous scale. Accuracy is for categories, not continuous data.
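To make that concrete, here is a minimal sketch of the output layer and compile step the two answers point towards; the layer sizes and learning rate are illustrative, not tuned for this dataset:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
from tensorflow.keras.optimizers import Adam

model = Sequential()
model.add(LSTM(20, activation='relu', input_shape=(1, 34)))  # far fewer units, relu instead of tanh
model.add(Dropout(0.2))
model.add(Dense(1, activation='linear'))                     # linear output for a continuous target

# regression setup: higher learning rate, mean absolute error loss, and mae instead of accuracy
model.compile(optimizer=Adam(learning_rate=0.001),
              loss='mean_absolute_error',
              metrics=['mae'])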
Get Cell, Input Gate, Output Gate and Forget Gate activation values for LSTM network using Keras
I want to get the activation values for a given input of a trained LSTM network, specifically the values for the cell, the input gate, the output gate and the forget gate. According to this Keras issue and this Stackoverflow question I'm able to get some activation values with the following code (basically I'm trying to classify 1-dimensional timeseries using one label per timeseries, but that doesn't really matter for this general question):
import random
from pprint import pprint

import keras.backend as K
import numpy as np
from keras.layers import Dense
from keras.layers.recurrent import LSTM
from keras.models import Sequential
from keras.utils import to_categorical

def getOutputLayer(layerNumber, model, X):
    return K.function([model.layers[0].input], [model.layers[layerNumber].output])([X])

model = Sequential()
model.add(LSTM(10, batch_input_shape=(1, 1, 1), stateful=True))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')

# generate some test data
for i in range(10):
    # generate a random timeseries of 100 numbers
    X = np.random.rand(10)
    X = X.reshape(10, 1, 1)

    # generate a random label for the whole timeseries between 0 and 1
    y = to_categorical([random.randint(0, 1)] * 10, num_classes=2)

    # train the lstm for this one timeseries
    model.fit(X, y, epochs=1, batch_size=1, verbose=0)
    model.reset_states()

# to keep the output simple use only 5 steps for the input of the timeseries
X_test = np.random.rand(5)
X_test = X_test.reshape(5, 1, 1)

# get the activations for the output lstm layer
pprint(getOutputLayer(0, model, X_test))
Using that I get the following activation values for the LSTM layer:
[array([[-0.04106992, -0.00327154, -0.01524276,  0.0055838 ,  0.00969929,
         -0.01438944,  0.00211149, -0.04286387, -0.01102304,  0.0113989 ],
        [-0.05771339, -0.00425535, -0.02032563,  0.00751972,  0.01377549,
         -0.02027745,  0.00268653, -0.06011265, -0.01602218,  0.01571197],
        [-0.03069103, -0.00267129, -0.01183739,  0.00434298,  0.00710012,
         -0.01082268,  0.00175544, -0.0318702 , -0.00820942,  0.00871707],
        [-0.02062054, -0.00209525, -0.00834482,  0.00310852,  0.0045242 ,
         -0.00741894,  0.00141046, -0.02104726, -0.0056723 ,  0.00611038],
        [-0.05246543, -0.0039417 , -0.01877101,  0.00691551,  0.01250046,
         -0.01839472,  0.00250443, -0.05472757, -0.01437504,  0.01434854]], dtype=float32)]
So I get 10 values for each input value, because I specified in the Keras model to use an LSTM with 10 neurons. But which one is the cell, which is the input gate, which one the output gate, which one the forget gate?
Well, these are the output values; to get and look at the value of each gate, look into this issue. I paste the essential part here:
for i in range(epochs):
    print('Epoch', i, '/', epochs)
    model.fit(cos,
              expected_output,
              batch_size=batch_size,
              verbose=1,
              nb_epoch=1,
              shuffle=False)

    for layer in model.layers:
        if 'LSTM' in str(layer):
            print('states[0] = {}'.format(K.get_value(layer.states[0])))
            print('states[1] = {}'.format(K.get_value(layer.states[1])))

            print('Input')
            print('b_i = {}'.format(K.get_value(layer.b_i)))
            print('W_i = {}'.format(K.get_value(layer.W_i)))
            print('U_i = {}'.format(K.get_value(layer.U_i)))

            print('Forget')
            print('b_f = {}'.format(K.get_value(layer.b_f)))
            print('W_f = {}'.format(K.get_value(layer.W_f)))
            print('U_f = {}'.format(K.get_value(layer.U_f)))

            print('Cell')
            print('b_c = {}'.format(K.get_value(layer.b_c)))
            print('W_c = {}'.format(K.get_value(layer.W_c)))
            print('U_c = {}'.format(K.get_value(layer.U_c)))

            print('Output')
            print('b_o = {}'.format(K.get_value(layer.b_o)))
            print('W_o = {}'.format(K.get_value(layer.W_o)))
            print('U_o = {}'.format(K.get_value(layer.U_o)))

    # output of the first batch value of the batch after the first fit().
    first_batch_element = np.expand_dims(cos[0], axis=1)  # (1, 1) to (1, 1, 1)
    print('output = {}'.format(get_LSTM_output([first_batch_element])[0].flatten()))

    model.reset_states()

print('Predicting')
predicted_output = model.predict(cos, batch_size=batch_size)

print('Ploting Results')
plt.subplot(2, 1, 1)
plt.plot(expected_output)
plt.title('Expected')
plt.subplot(2, 1, 2)
plt.plot(predicted_output)
plt.title('Predicted')
plt.show()
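A caveat worth adding: the snippet above relies on the old Keras 1.x attribute names (b_i, W_i, U_i, ...). In Keras 2.x these per-gate parameters are no longer separate attributes; they are slices of the layer's combined kernel, recurrent_kernel and bias arrays, ordered input, forget, cell, output. A minimal sketch of how the per-gate weights could be recovered there, assuming the LSTM is the model's first layer as in the question:
import numpy as np

lstm_layer = model.layers[0]        # the LSTM layer from the question's model
units = lstm_layer.units

# kernel: (input_dim, 4*units), recurrent_kernel: (units, 4*units), bias: (4*units,)
W, U, b = lstm_layer.get_weights()

W_i, W_f, W_c, W_o = np.split(W, 4, axis=1)   # input-to-gate weights
U_i, U_f, U_c, U_o = np.split(U, 4, axis=1)   # recurrent weights
b_i, b_f, b_c, b_o = np.split(b, 4)           # biases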