Keras model doesn't learn at all - python

My model weights (I output them to weights_before.txt and weights_after.txt) are precisely the same before and after the training, i.e. the training changes nothing and no fitting is happening.
My data look like this (I basically want the model to predict the sign of feature: result is 0 if feature is negative, 1 if positive):
,feature,zerosColumn,result
0,-5,0,0
1,5,0,1
2,-3,0,0
3,5,0,1
4,3,0,1
5,3,0,1
6,-3,0,0
...
Brief summary of my approach:
Load the data.
Split it column-wise into x (feature) and y (result), then split both row-wise into training, validation, and test sets.
Transform these sets into TimeseriesGenerators (not necessary in this scenario but I want to get this setup working and I don't see any reason why it shouldn't).
Create and compile a simple Sequential model with a few Dense layers and a softmax activation on the output layer, using binary_crossentropy as the loss function.
Train the model... nothing happens!
Complete code follows:
import keras
import pandas as pd
import numpy as np
np.random.seed(570)
TIMESERIES_LENGTH = 1
TIMESERIES_SAMPLING_RATE = 1
TIMESERIES_BATCH_SIZE = 1024
TEST_SET_RATIO = 0.2 # the portion of total data to be used as test set
VALIDATION_SET_RATIO = 0.2 # the portion of total data to be used as validation set
RESULT_COLUMN_NAME = 'result'
FEATURE_COLUMN_NAME = 'feature'
def create_network(csv_path, save_model):
    before_file = open("weights_before.txt", "w")
    after_file = open("weights_after.txt", "w")

    data = pd.read_csv(csv_path)
    # shift the targets one step so each result lines up with the
    # feature window that TimeseriesGenerator will pair it with
    data[RESULT_COLUMN_NAME] = data[RESULT_COLUMN_NAME].shift(1)
    data = data.dropna()

    x = data.iloc[:, 1:2]  # 'feature' column
    y = data.iloc[:, 3]    # 'result' column

    test_set_length = int(round(len(x) * TEST_SET_RATIO))
    validation_set_length = int(round(len(x) * VALIDATION_SET_RATIO))
    x_train_and_val = x[:-test_set_length]
    y_train_and_val = y[:-test_set_length]
    x_train = x_train_and_val[:-validation_set_length].values
    y_train = y_train_and_val[:-validation_set_length].values
    x_val = x_train_and_val[-validation_set_length:].values
    y_val = y_train_and_val[-validation_set_length:].values

    train_gen = keras.preprocessing.sequence.TimeseriesGenerator(
        x_train,
        y_train,
        length=TIMESERIES_LENGTH,
        sampling_rate=TIMESERIES_SAMPLING_RATE,
        batch_size=TIMESERIES_BATCH_SIZE
    )
    val_gen = keras.preprocessing.sequence.TimeseriesGenerator(
        x_val,
        y_val,
        length=TIMESERIES_LENGTH,
        sampling_rate=TIMESERIES_SAMPLING_RATE,
        batch_size=TIMESERIES_BATCH_SIZE
    )

    model = keras.models.Sequential()
    model.add(keras.layers.Dense(10, activation='relu', input_shape=(TIMESERIES_LENGTH, 1)))
    model.add(keras.layers.Dropout(0.2))
    model.add(keras.layers.Dense(10, activation='relu'))
    model.add(keras.layers.Dropout(0.2))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(1, activation='softmax'))

    # dump the weights before training
    for item in model.get_weights():
        before_file.write("%s\n" % item)

    model.compile(
        loss=keras.losses.binary_crossentropy,
        optimizer="adam",
        metrics=[keras.metrics.binary_accuracy]
    )

    history = model.fit_generator(
        train_gen,
        epochs=10,
        verbose=1,
        validation_data=val_gen
    )

    # dump the weights after training
    for item in model.get_weights():
        after_file.write("%s\n" % item)

    before_file.close()
    after_file.close()

create_network("data/sign_data.csv", False)
Do you have any ideas?

The problem is that you are using softmax as the activation function of the last layer. Essentially, softmax normalizes its input so that the elements sum to one. Therefore, if you use it on a layer with only one unit (i.e. Dense(1, ...)), it will always output 1. To fix this, change the activation function of the last layer to sigmoid, which outputs a value in the range (0, 1).
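Applied to the model above, the fix is a one-line change to the output layer (a minimal sketch):

# sigmoid lets the single output actually move between 0 and 1
model.add(keras.layers.Dense(1, activation='sigmoid'))

With a sigmoid output, binary_crossentropy produces non-trivial gradients, so the weights start updating.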

Related

Bidirectional neural network input does not process the data as predicted, what is wrong here?

I want to build an LSTM/BD (bidirectional) neural network with TensorFlow in Python. The method takes 5 arguments and looks like this:
import keras
import pandas as pd
# assumed imports: BD appears to be an alias for Bidirectional
from keras.layers import LSTM, Bidirectional as BD

def predict_stock_data_bidirectional(train_data: pd.DataFrame, train_labels, test_data, test_labels, pivot):
    train_data_X_length = len(train_data.columns)
    train_data_Y_length = len(train_data.index)
    print(train_data_X_length)
    print("Trying to predict stock data...")
    standard_dropout_factor = 0.25
    early_stop = keras.callbacks.EarlyStopping(monitor='loss', patience=10, restore_best_weights=True)
    model = keras.Sequential()
    model.add(BD(LSTM(units=train_data_X_length, return_sequences=True), merge_mode='concat', input_shape=(train_data_X_length, 5)))
    model.add(BD(LSTM(units=8, activation='relu', return_sequences=True)))
    model.add(BD(LSTM(units=8, activation='relu', return_sequences=True)))
    model.add(BD(LSTM(units=8, activation='relu', return_sequences=True)))
    model.add(keras.layers.Dense(units=pivot))
    model.compile(loss='mse', optimizer='rmsprop')
    history = model.fit(train_data, train_labels, epochs=1000, callbacks=[early_stop], verbose=1)
The pivot variable is the size of a single datapoint in train_data (in my case it is 5).
As you can see, my dataset is a 276x23x5 dataframe. There is no inconsistency in this part of the data, so the mistake can't be here. When execution reaches the snippet history = model.fit(train_data, train_labels, epochs=1000, callbacks=[early_stop], verbose=1) I get a strange error.
So, even though I pass pandas DataFrames as input (the 4 data sets, train_data etc.), VS Code tells me it has a problem with a numpy array.
To give you some more information, here is how the input of my method gets created:
def create_metadata(bounded_data_dict: dict):
    data_raw = pd.DataFrame.from_dict(bounded_data_dict, dtype='float')
    train_data_X_length = len(data_raw.columns)
    train_data_Y_length = len(data_raw.index)
    dif = 15
    train_data = data_raw.iloc[0:train_data_Y_length - dif, :]
    train_labels = train_data.iloc[:, 0]
    test_data = data_raw.iloc[train_data_Y_length - dif : train_data_Y_length, :]
    train_data = train_data.iloc[:, 1:data_raw.columns.size]
    test_labels = test_data.iloc[:, 0]
    test_data = test_data.iloc[:, 1:data_raw.columns.size]
    return [train_data, train_labels, test_data, test_labels]
Another helpful thing to mention is that the "inner" content of the dataset, the 5-element datapoints, is built by initializing an empty list (my_list = []), appending the 5 elements to it, and then filling it into the datapoint.

Keras ValueError: No gradients provided for any variable

I've read related threads but not been able to solve my problem.
I'm currently trying to get my model to run in order to classify 5000 different events, which all currently fall under the same category (so my "labels" dataset consists of 5000 1s).
I'm using one hot encoding for my labels data set:
labels = np.loadtxt("/content/drive/My Drive/5000labels1.csv")
from keras.utils import to_categorical
labels=to_categorical(labels) # convert labels to one-hot encoding
I then define my model like so:
inputs = keras.Input(shape=(29,29,1))
x=inputs
x = keras.layers.Conv2D(16, kernel_size=(3,3), name='Conv_1')(x)
x = keras.layers.LeakyReLU(0.1)(x)
x = keras.layers.MaxPool2D((2,2), name='MaxPool_1')(x)
x = keras.layers.Conv2D(16, kernel_size=(3,3), name='Conv_2')(x)
x = keras.layers.LeakyReLU(0.1)(x)
x = keras.layers.MaxPool2D((2,2), name='MaxPool_2')(x)
x = keras.layers.Conv2D(32, kernel_size=(3,3), name='Conv_3')(x)
x = keras.layers.LeakyReLU(0.1)(x)
x = keras.layers.MaxPool2D((2,2), name='MaxPool_3')(x)
x = keras.layers.Flatten(name='Flatten')(x)
x = keras.layers.Dense(64, name='Dense_1')(x)
x = keras.layers.ReLU(name='ReLU_dense_1')(x)
x = keras.layers.Dense(64, name='Dense_2')(x)
x = keras.layers.ReLU(name='ReLU_dense_2')(x)
outputs = keras.layers.Dense(4, activation='softmax', name='Output')(x)
model = keras.Model(inputs=inputs, outputs=outputs, name='VGGlike_CNN')
model.summary()
keras.utils.plot_model(model, show_shapes=True)
OPTIMIZER = tf.keras.optimizers.Adam(learning_rate=LR_ST)
model.compile(optimizer=OPTIMIZER,
              loss='categorical_crossentropy',
              metrics=['accuracy'],
              run_eagerly=False)

def lr_decay(epoch):
    if epoch < 10:
        return LR_ST
    else:
        return LR_ST * tf.math.exp(0.2 * (10 - epoch))

lr_scheduler = keras.callbacks.LearningRateScheduler(lr_decay)

model_checkpoint = keras.callbacks.ModelCheckpoint(
    filepath='mycnn_best',
    monitor='val_accuracy',
    save_weights_only=True,
    save_best_only=True,
    save_freq='epoch')

callbacks = [lr_scheduler, model_checkpoint]

print('X_train.shape = ', X_train.shape)

history = model.fit(X_train, epochs=50,
                    validation_data=X_test, shuffle=True, verbose=1,
                    callbacks=callbacks)
I get the error: "No gradients provided for any variable: ['Conv_1_2/kernel:0', 'Conv_1_2/bias:0', 'Conv_2_2/kernel:0', 'Conv_2_2/bias:0', 'Conv_3_2/kernel:0', 'Conv_3_2/bias:0', 'Dense_1_2/kernel:0', 'Dense_1_2/bias:0', 'Dense_2_2/kernel:0', 'Dense_2_2/bias:0', 'Output_2/kernel:0', 'Output_2/bias:0']. "
From what I've read, it seems most likely to be a problem with the loss function, but I don't understand what the problem could be. Eventually I want the network to classify events into one of 4 categories, so I used categorical cross-entropy in order to get a probability for each category.
Can anyone help me? If needed I can provide a link to the google colab file of my original code.
Thanks in advance!
You're missing the target: model.fit needs the labels as well as the inputs.
model.fit(X_train, y_train, ..., validation_data = (X_test, y_test))
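Applied to the code above, a minimal sketch (assuming X_train/X_test are the inputs and the one-hot labels are split accordingly into y_train/y_test):

history = model.fit(X_train, y_train, epochs=50,
                    validation_data=(X_test, y_test),
                    shuffle=True, verbose=1,
                    callbacks=callbacks)

Without y, Keras has no targets to compute the loss against, so it cannot derive gradients for any variable, which is exactly what the error says.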

What this error means: `y` argument is not supported when using python generator as input

I'm trying to develop a network that uses a Python generator as the data provider. Everything looks OK until the model starts to fit; then I receive this error:
ValueError: `y` argument is not supported when using dataset as input.
I checked every line, and I think the problem is in the format of x_test and y_test fed to the network. After hours of googling and changing the format several times, the error is still there.
Can you help me to fix it? You can find the whole code below:
import os
import numpy as np
import pandas as pd
import re  # to match regular expressions for extracting labels
import tensorflow as tf

print(tf.__version__)

def xfiles(filename):
    if re.match(r'^\w{12}_x\.csv$', filename) is None:
        return False
    else:
        return True

def data_generator():
    folder = "i:/Stockpred/csvdbase/datasets/DS0002"
    file_list = os.listdir(folder)
    x_files = list(filter(xfiles, file_list))
    x_files.sort()
    np.random.seed(1729)
    np.random.shuffle(x_files)
    for file in x_files:
        filespec = folder + '/' + file
        xs = pd.read_csv(filespec, header=None)
        yfile = file.replace('_x', '_y')
        yfilespec = folder + '/' + yfile
        ys = pd.read_csv(open(yfilespec, 'r'), header=None, usecols=[1])
        xs = np.asarray(xs, dtype=np.float32)
        ys = np.asarray(ys, dtype=np.float32)
        for i in range(xs.shape[0]):
            yield xs[i][1:169], ys[i][0]

dataset = tf.data.Dataset.from_generator(
    data_generator,
    (tf.float32, tf.float32),
    (tf.TensorShape([168, ]), tf.TensorShape([])))
dataset = dataset.shuffle(buffer_size=16000, seed=1729)
# dataset = dataset.batch(4000, drop_remainder=True)
dataset = dataset.cache('R:/Temp/model')

def is_test(i, d):
    return i % 4 == 0

def is_train(i, d):
    return not is_test(i, d)

recover = lambda i, d: d

test_dataset = dataset.enumerate().filter(is_test).map(recover)
train_dataset = dataset.enumerate().filter(is_train).map(recover)

x_test = test_dataset.map(lambda x, y: x)
y_test = test_dataset.map(lambda x, y: y)
x_train = train_dataset.map(lambda x, y: x)
y_train = train_dataset.map(lambda x, y: y)

print(x_train.element_spec)
print(y_train.element_spec)
print(x_test.element_spec)
print(y_test.element_spec)

# define an object (initializing RNN)
model = tf.keras.models.Sequential()
# first LSTM layer
model.add(tf.keras.layers.LSTM(units=168, activation='relu', return_sequences=True, input_shape=(168, 1)))
# dropout layer
model.add(tf.keras.layers.Dropout(0.2))
# second LSTM layer
model.add(tf.keras.layers.LSTM(units=168, activation='relu', return_sequences=True))
# dropout layer
model.add(tf.keras.layers.Dropout(0.2))
# third LSTM layer
model.add(tf.keras.layers.LSTM(units=80, activation='relu', return_sequences=True))
# dropout layer
model.add(tf.keras.layers.Dropout(0.2))
# fourth LSTM layer
model.add(tf.keras.layers.LSTM(units=120, activation='relu'))
# dropout layer
model.add(tf.keras.layers.Dropout(0.2))
# output layer
model.add(tf.keras.layers.Dense(units=1))
model.summary()

# compile the model
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(x_train.as_numpy_iterator(), y_train.as_numpy_iterator(), batch_size=32, epochs=100)
predicted_stock_price = model.predict(x_test)
As the docs say:
y - Target data. Like the input data x, it could be either Numpy array(s) or TensorFlow tensor(s). It should be consistent with x (you cannot have Numpy inputs and tensor targets, or inversely). If x is a dataset, generator, or keras.utils.Sequence instance, y should not be specified (since targets will be obtained from x).
So, I suppose you should have one generator serving tuples of sample and label.
If you are providing a Dataset as input, then
type(train_dataset) should be tensorflow.python.data.ops.dataset_ops.BatchDataset.
If so, simply feed this Dataset (which includes your X and y bundle) into the model:
model.fit(train_dataset, epochs=100)
(Yes, this is a slightly different convention from sklearn's, where X and y are passed separately.)
Meanwhile, if you want TensorFlow to use a separate dataset for validation, pass it via the validation_data kwarg:
model.fit(train_dataset, validation_data=val_dataset, epochs=100)
where val_dataset is a separate dataset you set aside for validation during model training (not the test set).
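Note that when x is a tf.data.Dataset, batching should be done on the dataset itself; Keras rejects the batch_size argument for dataset inputs. A minimal sketch using the datasets built above (the batch size of 32 is an assumption):

train_dataset = train_dataset.batch(32)
test_dataset = test_dataset.batch(32)
model.fit(train_dataset, validation_data=test_dataset, epochs=100)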
Alternatively, use model.fit_generator with a generator that yields (x, y) tuples of input data and labels. So altogether:
model.fit_generator(train_dataset.as_numpy_iterator(), epochs=100)

Accuracy not growing across epochs on keras

I'm new to machine learning and deep learning, and I'm trying to classify texts from 5 categories using neural networks. For that, I made a dictionary to translate the words to indexes, finally getting an array of lists of indexes. Moreover, I converted the labels to integers, and did the padding and so on. The problem is that when I fit the model, the accuracy stays quite low (~0.20) and does not change across the epochs. I have tried changing a lot of parameters, like the size of the vocabulary, the number of neurons, the dropout probability, the optimizer parameters, etc. The key parts of the code are below.
# Arrays with indexes (that works fine)
X_train = tokens_to_indexes(tokenized_tr_mrp, vocab, return_vocab=False)
X_test, vocab_dict = tokens_to_indexes(tokenized_te_mrp, vocab)

# Labels to integers
labels_dict = {}
labels_dict['Alzheimer'] = 0
labels_dict['Bladder Cancer'] = 1
labels_dict['Breast Cancer'] = 2
labels_dict['Cervical Cancer'] = 3
labels_dict['Negative'] = 4
y_train = np.array([labels_dict[i] for i in y_tr])
y_test = np.array([labels_dict[i] for i in y_te])

# One-hot encoding of labels
from keras.utils import to_categorical
encoded_train = to_categorical(y_train)
encoded_test = to_categorical(y_test)

# Padding
max_review_length = 235
X_train_pad = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test_pad = sequence.pad_sequences(X_test, maxlen=max_review_length)

# Model
# Vocab size
top_words = len(list(vocab_dict.keys()))
# Neurone type
rnn = LSTM
# Dropout
set_dropout = True
p = 0.2
# Embedding size
embedding_vector_length = 64
# Regularization strength
L = 0.0005
# Number of neurones
N = 50

# Model
model = Sequential()
# Embedding layer
model.add(Embedding(top_words,
                    embedding_vector_length,
                    embeddings_regularizer=regularizers.l1(l=L),
                    input_length=max_review_length
                    #, embeddings_constraint=UnitNorm(axis=1)
                    ))
# Dropout layer
if set_dropout:
    model.add(Dropout(p))
# Recurrent layer
model.add(rnn(N))
# Output layer
model.add(Dense(5, activation='softmax'))

# Compilation
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=0.001),
              metrics=['Accuracy'])

# Split training set for validation
X_tr, X_va, y_tr_, y_va = train_test_split(X_train_pad, encoded_train,
                                           test_size=0.3, random_state=2)

# Parameters
batch_size = 50
# N epochs
n_epocas = 20

best_val_acc = 0
best_val_loss = 1e20
best_i = 0
best_weights = []
acum_tr_acc = []
acum_tr_loss = []
acum_val_acc = []
acum_val_loss = []

# Training
for e in range(n_epocas):
    h = model.fit(X_tr, y_tr_,
                  batch_size=batch_size,
                  validation_data=(X_va, y_va),
                  epochs=1, verbose=1)
    acum_tr_acc = acum_tr_acc + h.history['accuracy']
    acum_tr_loss = acum_tr_loss + h.history['loss']
    val_acc = h.history['val_accuracy'][0]
    val_loss = h.history['val_loss'][0]
    acum_val_acc = acum_val_acc + [val_acc]
    acum_val_loss = acum_val_loss + [val_loss]
    # if val_acc > best_val_acc:
    if val_loss < best_val_loss:
        best_i = len(acum_val_acc) - 1
        best_val_acc = val_acc
        best_val_loss = val_loss
        best_weights = model.get_weights().copy()
    if len(acum_tr_acc) > 1 and (len(acum_tr_acc) + 1) % 1 == 0:
        if e > 1:
            clear_output()
The code you posted is really bad practice.
You can either train for n_epocas using your current method and add callbacks to get the best weights (e.g. ModelCheckpoint), or use tf.GradientTape; but calling model.fit() for one epoch at a time can lead to weird results, since the optimizer doesn't know which epoch it is at.
I suggest keeping your current code but training for n_epocas all in one go, and reporting the results here (accuracy + loss). A sketch of the callback-based version follows.
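A minimal sketch of that approach (the checkpoint filename is an assumption):

from keras.callbacks import ModelCheckpoint

# keep only the weights with the best validation loss
checkpoint = ModelCheckpoint('best_weights.h5',
                             monitor='val_loss',
                             save_best_only=True,
                             save_weights_only=True)
history = model.fit(X_tr, y_tr_,
                    batch_size=batch_size,
                    validation_data=(X_va, y_va),
                    epochs=n_epocas,
                    callbacks=[checkpoint],
                    verbose=1)

This replaces the manual bookkeeping of best_val_loss and best_weights, and the optimizer sees all n_epocas epochs in a single fit() call.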
Someone gave me the solution. I just had to change this line:
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=0.001),
              metrics=['Accuracy'])
to this:
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=0.001),
              metrics=['acc'])
(Most likely, the capital-A spelling resolves to the exact-match keras.metrics.Accuracy class rather than the categorical accuracy you get from the lowercase string.) I also changed the lines in the final loop relating to accuracy. The one-hot encoding was necessary as well.

Grid Search for Keras with multiple inputs

I am trying to do a grid search over my hyperparameters for tuning a deep learning architecture. I have multiple inputs to the model, and I am trying to use sklearn's grid search API. The problem is that the grid search API only takes a single array as input, and the code fails when it checks the dimensions of the data (my input dimension is 5 × number of data points, whereas according to the sklearn API it should be number of data points × feature dimension). My code looks something like this:
from keras.layers import Concatenate, Reshape, Input, Embedding, Dense, Dropout
from keras.models import Model
from keras.wrappers.scikit_learn import KerasClassifier

def model(hyparameters):
    a = Input(shape=(1,))
    b = Input(shape=(1,))
    c = Input(shape=(1,))
    d = Input(shape=(1,))
    e = Input(shape=(1,))
    # some operations that produce a single output --> out
    model = Model([a, b, c, d, e], out)
    model.compile(optimizer='rmsprop',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

k_model = KerasClassifier(build_fn=model, epochs=150, batch_size=512, verbose=2)
# define the grid search parameters
param_grid = ...  # dict of hyperparameter options
grid = GridSearchCV(estimator=k_model, param_grid=param_grid, n_jobs=-1)
grid_result = grid.fit([a_input, b_input, c_input, d_input, e_input], encoded_outputs)
This is a workaround to use GridSearchCV with a Keras model that takes multiple inputs. The trick consists of merging all the inputs into a single array. I create a dummy model that receives a SINGLE input and then split it into the desired parts using Lambda layers. The procedure can easily be modified according to your own data structure.
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Lambda, Concatenate, Dense
from tensorflow.keras.models import Model
from sklearn.model_selection import GridSearchCV

def createMod(optimizer='Adam'):
    combi_input = Input((3,))  # (None, 3)
    a_input = Lambda(lambda x: tf.expand_dims(x[:, 0], -1))(combi_input)  # (None, 1)
    b_input = Lambda(lambda x: tf.expand_dims(x[:, 1], -1))(combi_input)  # (None, 1)
    c_input = Lambda(lambda x: tf.expand_dims(x[:, 2], -1))(combi_input)  # (None, 1)
    ## do something
    c = Concatenate()([a_input, b_input, c_input])
    x = Dense(32)(c)
    out = Dense(1, activation='sigmoid')(x)
    model = Model(combi_input, out)
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

## recreate multiple inputs
n_sample = 1000
a_input, b_input, c_input = [np.random.uniform(0, 1, n_sample) for _ in range(3)]
y = np.random.randint(0, 2, n_sample)

## merge inputs
combi_input = np.stack([a_input, b_input, c_input], axis=-1)

model = tf.keras.wrappers.scikit_learn.KerasClassifier(build_fn=createMod, verbose=0)

batch_size = [10, 20]
epochs = [10, 5]
optimizer = ['adam', 'SGD']
param_grid = dict(batch_size=batch_size, epochs=epochs)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(combi_input, y)
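After the search finishes, the usual sklearn attributes are available (a hypothetical usage sketch):

print(grid_result.best_score_, grid_result.best_params_)
best_model = grid_result.best_estimator_

Each Lambda layer simply slices one column out of the combined (None, 3) input, so the network behaves as if it had three separate Input layers while GridSearchCV only ever sees a single 2-D array.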
