Having trouble with Keras One Hot Encoding Memory Error - python

I am trying to build a word-level LSTM model using Keras, and for that I need to create a one-hot encoding of the words to feed into the model. I have around 32,360 words across roughly 130,000 lines. Each time I attempt to run my model I run into a memory error.
I believe the issue is the size of the dataset. I have been researching this for a couple of days now, and it seems the solution is either to create a generator that does the one-hot encoding and loads the data in batches, or to reduce the number of lines I feed into the model. I cannot quite figure out the generator piece.
The error I get is:
MemoryError: Unable to allocate 143. GiB for an array with shape
(1184643, 32360) and data type int32
Is the generator the correct way to go? Is there any way to solve this otherwise? My code is below:
vocab_size = len(tokenizer.word_index)+1
seq = []
for item in corpus:
    seq_list = tokenizer.texts_to_sequences([item])[0]
    for i in range(1, len(seq_list)):
        n_gram = seq_list[:i+1]
        seq.append(n_gram)
max_seq_size = max([len(s) for s in seq])
seq = np.array(pad_sequences(seq, maxlen=max_seq_size, padding='pre'))
input_sequences, labels = seq[:,:-1], seq[:,-1]
one_hot_labels = to_categorical(labels, num_classes=vocab_size, dtype='int32')
n_units = 256
embedding_size = 100
text_in = Input(shape = (None,))
x = Embedding(vocab_size, embedding_size)(text_in)
x = LSTM(n_units)(x)
x = Dropout(0.2)(x)
text_out = Dense(vocab_size, activation = 'softmax')(x)
model = Model(text_in, text_out)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
filepath="new_model_weights/weights-{epoch:02d}.hdf5"
checkpoint = ModelCheckpoint(filepath,
                             monitor='accuracy',
                             verbose=1,
                             save_best_only=False,
                             save_weights_only=False,
                             mode='auto',
                             save_freq='epoch')
callbacks_list = [checkpoint]
epochs = 50
batch_size = 10
history = model.fit(input_sequences, one_hot_labels, epochs=epochs, batch_size=batch_size, callbacks=callbacks_list, verbose=1)
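For reference, here is a minimal sketch of the generator idea mentioned in the question, assuming the input_sequences, labels and vocab_size variables defined above: keep the labels as plain integers and one-hot encode only the current batch inside a keras.utils.Sequence, so the full (1184643, 32360) array is never allocated. This is a sketch under those assumptions, not a drop-in answer.

import numpy as np
from tensorflow.keras.utils import Sequence, to_categorical

class OneHotBatchGenerator(Sequence):
    def __init__(self, sequences, labels, vocab_size, batch_size):
        self.sequences = sequences   # padded integer sequences, shape (N, max_seq_size - 1)
        self.labels = labels         # integer word ids, shape (N,)
        self.vocab_size = vocab_size
        self.batch_size = batch_size

    def __len__(self):
        # number of batches per epoch
        return int(np.ceil(len(self.sequences) / self.batch_size))

    def __getitem__(self, idx):
        batch = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        x = self.sequences[batch]
        # one-hot encode only this batch of labels
        y = to_categorical(self.labels[batch], num_classes=self.vocab_size)
        return x, y

With recent Keras versions the generator can be passed straight to fit, e.g. model.fit(OneHotBatchGenerator(input_sequences, labels, vocab_size, batch_size), epochs=epochs, callbacks=callbacks_list, verbose=1).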

Related

Bidirectional neural network input does not process the data as predicted, what is wrong here?

I want to build an LSTM/bidirectional (BD) neural network with TensorFlow in Python. The function takes 5 arguments and looks like this:
def predict_stock_data_bidirectional(train_data : pd.DataFrame, train_labels, test_data, test_labels, pivot):
    train_data_X_length = len(train_data.columns)
    train_data_Y_length = len(train_data.index)
    print(train_data_X_length)
    print("Trying to predict stock data...")
    standard_dropout_factor = 0.25
    early_stop = keras.callbacks.EarlyStopping(monitor='loss', patience=10, restore_best_weights=True)
    model = keras.Sequential()
    model.add(BD(LSTM(units=train_data_X_length, return_sequences=True), merge_mode='concat', input_shape=(train_data_X_length, 5)))
    model.add(BD(LSTM(units=8, activation='relu', return_sequences=True)))
    model.add(BD(LSTM(units=8, activation='relu', return_sequences=True)))
    model.add(BD(LSTM(units=8, activation='relu', return_sequences=True)))
    model.add(keras.layers.Dense(units=pivot))
    model.compile(loss='mse', optimizer='rmsprop')
    history = model.fit(train_data, train_labels, epochs=1000, callbacks=[early_stop], verbose=1)
The pivot variable is the size of a single datapoint in train_data (in my case it is 5).
As you can see, my dataset is a 276x23x5 DataFrame. There is no inconsistency in this part of the data, so the mistake can't be here. When my method reaches the following snippet: history = model.fit(train_data, train_labels, epochs=1000, callbacks=[early_stop], verbose=1) I get a strange error.
So even though I pass pandas DataFrames as input (the four data sets, train_data etc.), VS Code tells me it has a problem with a NumPy array.
To give you some more information, here is how the input of my method gets created:
def create_metadata(bounded_data_dict : dict):
    data_raw = pd.DataFrame.from_dict(bounded_data_dict, dtype='float')
    train_data_X_length = len(data_raw.columns)
    train_data_Y_length = len(data_raw.index)
    dif = 15
    train_data = data_raw.iloc[0:train_data_Y_length - dif, :]
    train_labels = train_data.iloc[:, 0]
    test_data = data_raw.iloc[train_data_Y_length - dif : train_data_Y_length, :]
    train_data = train_data.iloc[:, 1:data_raw.columns.size]
    test_labels = test_data.iloc[:, 0]
    test_data = test_data.iloc[:, 1:data_raw.columns.size]
    return [train_data, train_labels, test_data, test_labels]
Another helpful detail: the "inner" content of the dataset, the 5 datapoints, is initialized as an empty list my_list = [], to which the 5 elements are appended and then filled into the datapoint.
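As an aside, a hedged sketch (not from the question): Keras RNN layers expect a 3-D float array of shape (samples, timesteps, features), so a DataFrame whose cells hold Python lists usually has to be stacked into a proper NumPy array before being passed to fit. The frame_to_3d helper below is hypothetical and assumes every cell of train_data holds a length-5 list.

import numpy as np

def frame_to_3d(frame):
    # stack the (rows x columns) grid of length-5 lists into shape (rows, columns, 5)
    return np.stack([np.stack(row) for row in frame.to_numpy()]).astype('float32')

# x = frame_to_3d(train_data)   # expected shape: (276, 23, 5)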

Training 1660 NNs in a loop. However, on each iteration the training time of the model will slightly increase making it unfeasible

I am currently using the following code to set one column equal to zero and then retrain the model for all 10 NNs in NN1_List. However, as the loop progresses the training time of each neural network slowly increases (very slowly, but it is still a big deal if I train 1660 NNs). I checked a variety of websites and implemented all the possible solutions that I could find, such as tf.keras.backend.clear_session(), tf.compat.v1.reset_default_graph(), del model, and gc.collect().
r2_list = list()
for i in tf.range(0, len(training_x.columns), 1):
    column = training_x.columns[i]
    df = training_x.copy()
    df[column].values[:] = 0
    prediction_list = list()
    for j in tf.range(0, len(NN1_List), 1):
        np.random.seed(int(seed_list[j]))
        random.seed(int(seed_list[j]))
        tf.random.set_seed(int(seed_list[j]))
        model = keras.Sequential()
        model.add(keras.layers.Dense(
            units=64,
            kernel_regularizer=keras.regularizers.L1(l1=0.00001),
            input_shape=(training_x.shape[1],),
            activation='relu')
        )
        model.add(keras.layers.Dense(units=1))
        ## Compile Model.
        opt = keras.optimizers.Adam(learning_rate=0.01)
        model.compile(optimizer=opt,
                      loss='mean_squared_error')
        ## Fit Model.
        callback = keras.callbacks.EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=5, restore_best_weights=True)
        model.fit(x=df,
                  y=training_y,
                  validation_data=(validation_x, validation_y),
                  batch_size=10000,
                  epochs=100,
                  callbacks=[callback])
        prediction_testing = model.predict(testing_x)
        del model
        tf.keras.backend.clear_session()
        tf.compat.v1.reset_default_graph()
        gc.collect()
        prediction_list.append(prediction_testing)
    prediction_array = np.mean(prediction_list, axis=0).ravel()
    r2 = kelly_gu_r_squared(testing_y, prediction_array)
    r2_list.append(r2)
I was wondering if you guys could point me in the right direction to fix this problem.
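One commonly suggested rearrangement, shown as a hedged sketch rather than a verified fix: clear the Keras backend state before each model is constructed instead of after fitting, build the model inside a small helper so nothing from the previous iteration is captured, and loop with plain Python range. The build_model helper is hypothetical; NN1_List, training_x and seed_list are assumed to be the variables from the question.

import gc
import tensorflow as tf
from tensorflow import keras

def build_model(n_features):
    # same architecture as in the question, wrapped in a function
    model = keras.Sequential([
        keras.layers.Dense(64,
                           kernel_regularizer=keras.regularizers.L1(l1=0.00001),
                           input_shape=(n_features,),
                           activation='relu'),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.01),
                  loss='mean_squared_error')
    return model

for j in range(len(NN1_List)):            # plain Python range instead of tf.range
    tf.keras.backend.clear_session()      # drop state left over from the previous model
    gc.collect()
    # set the seeds from seed_list[j] here, as in the question
    model = build_model(training_x.shape[1])
    # ... fit / predict as in the question ...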

Keras ValueError: No gradients provided for any variable

I've read related threads but not been able to solve my problem.
I'm currently trying to get my model to run in order to classify 5000 different events, which all currently fall under the same category (so my "labels" dataset consists of 5000 1s).
I'm using one hot encoding for my labels data set:
labels = np.loadtxt("/content/drive/My Drive/5000labels1.csv")
from keras.utils import to_categorical
labels=to_categorical(labels) # convert labels to one-hot encoding
I then define my model like so:
inputs = keras.Input(shape=(29,29,1))
x=inputs
x = keras.layers.Conv2D(16, kernel_size=(3,3), name='Conv_1')(x)
x = keras.layers.LeakyReLU(0.1)(x)
x = keras.layers.MaxPool2D((2,2), name='MaxPool_1')(x)
x = keras.layers.Conv2D(16, kernel_size=(3,3), name='Conv_2')(x)
x = keras.layers.LeakyReLU(0.1)(x)
x = keras.layers.MaxPool2D((2,2), name='MaxPool_2')(x)
x = keras.layers.Conv2D(32, kernel_size=(3,3), name='Conv_3')(x)
x = keras.layers.LeakyReLU(0.1)(x)
x = keras.layers.MaxPool2D((2,2), name='MaxPool_3')(x)
x = keras.layers.Flatten(name='Flatten')(x)
x = keras.layers.Dense(64, name='Dense_1')(x)
x = keras.layers.ReLU(name='ReLU_dense_1')(x)
x = keras.layers.Dense(64, name='Dense_2')(x)
x = keras.layers.ReLU(name='ReLU_dense_2')(x)
outputs = keras.layers.Dense(4, activation='softmax', name='Output')(x)
model = keras.Model(inputs=inputs, outputs=outputs, name='VGGlike_CNN')
model.summary()
keras.utils.plot_model(model, show_shapes=True)
OPTIMIZER = tf.keras.optimizers.Adam(learning_rate=LR_ST)
model.compile(optimizer=OPTIMIZER,
              loss='categorical_crossentropy',
              metrics=['accuracy'],
              run_eagerly=False)
def lr_decay(epoch):
    if epoch < 10:
        return LR_ST
    else:
        return LR_ST * tf.math.exp(0.2 * (10 - epoch))
lr_scheduler = keras.callbacks.LearningRateScheduler(lr_decay)
model_checkpoint = keras.callbacks.ModelCheckpoint(
    filepath='mycnn_best',
    monitor='val_accuracy',
    save_weights_only=True,
    save_best_only=True,
    save_freq='epoch')
callbacks = [ lr_scheduler, model_checkpoint ]
print('X_train.shape = ',X_train.shape)
history = model.fit(X_train, epochs=50,
                    validation_data=X_test, shuffle=True, verbose=1,
                    callbacks=callbacks)
I get the error: "No gradients provided for any variable: ['Conv_1_2/kernel:0', 'Conv_1_2/bias:0', 'Conv_2_2/kernel:0', 'Conv_2_2/bias:0', 'Conv_3_2/kernel:0', 'Conv_3_2/bias:0', 'Dense_1_2/kernel:0', 'Dense_1_2/bias:0', 'Dense_2_2/kernel:0', 'Dense_2_2/bias:0', 'Output_2/kernel:0', 'Output_2/bias:0']. "
From what I've read, it seems most likely to be a problem with the loss function, but I don't understand what the problem could be. Eventually I want the network to classify events into one of 4 categories, so I used categorical cross-entropy in order to get a probability associated with each possible number of events.
Can anyone help me? If needed I can provide a link to the google colab file of my original code.
Thanks in advance!
You're missing your target: model.fit needs the labels as well as the inputs, and validation_data should be an (inputs, labels) tuple:
model.fit(X_train, y_train, ..., validation_data = (X_test, y_test))
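A sketch of the corrected call with the remaining arguments from the question filled back in; X_train/y_train and X_test/y_test are assumed names for the input/label splits (the post only shows the labels before splitting):

history = model.fit(X_train, y_train,
                    epochs=50,
                    validation_data=(X_test, y_test),
                    shuffle=True, verbose=1,
                    callbacks=callbacks)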

Accuracy not growing across epochs on keras

I'm new to machine learning and deep learning, and I'm trying to classify texts from 5 categories using neural networks. For that, I made a dictionary to translate words to indexes, finally getting an array of lists of indexes. I also converted the labels to integers and did the padding. The problem is that when I fit the model, the accuracy stays quite low (~0.20) and does not change across the epochs. I have tried changing a lot of parameters, like the size of the vocabulary, the number of neurones, the dropout probability, the optimizer parameters, etc. The key parts of the code are below.
# Arrays with indexes (that works fine)
X_train = tokens_to_indexes(tokenized_tr_mrp, vocab, return_vocab=False)
X_test, vocab_dict = tokens_to_indexes(tokenized_te_mrp, vocab)
# Labels to integers
labels_dict = {}
labels_dict['Alzheimer'] = 0
labels_dict['Bladder Cancer'] = 1
labels_dict['Breast Cancer'] = 2
labels_dict['Cervical Cancer'] = 3
labels_dict['Negative'] = 4
y_train = np.array([labels_dict[i] for i in y_tr])
y_test = np.array([labels_dict[i] for i in y_te])
# One-hot encoding of labels
from keras.utils import to_categorical
encoded_train = to_categorical(y_train)
encoded_test = to_categorical(y_test)
# Padding
max_review_length = 235
X_train_pad = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test_pad = sequence.pad_sequences(X_test, maxlen=max_review_length)
# Model
# Vocab size
top_words = len(list(vocab_dict.keys()))
# Neurone type
rnn = LSTM
# dropout
set_dropout = True
p = 0.2
# embedding size
embedding_vector_length = 64
# regularization strength
L = 0.0005
# Number of neurones
N = 50
# Model
model = Sequential()
# Embedding layer
model.add(Embedding(top_words,
                    embedding_vector_length,
                    embeddings_regularizer=regularizers.l1(l=L),
                    input_length=max_review_length
                    #,embeddings_constraint=UnitNorm(axis=1)
                    ))
# Dropout layer
if set_dropout:
    model.add(Dropout(p))
# Recurrent layer
model.add(rnn(N))
# Output layer
model.add(Dense(5, activation='softmax'))
# Compilation
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=0.001),
              metrics=['Accuracy'])
# Split training set for validation
X_tr, X_va, y_tr_, y_va = train_test_split(X_train_pad, encoded_train,
                                           test_size=0.3, random_state=2)
# Parameters
batch_size = 50
# N epochs
n_epocas = 20
best_val_acc = 0
best_val_loss = 1e20
best_i = 0
best_weights = []
acum_tr_acc = []
acum_tr_loss = []
acum_val_acc = []
acum_val_loss = []
# Training
for e in range(n_epocas):
    h = model.fit(X_tr, y_tr_,
                  batch_size=batch_size,
                  validation_data=(X_va, y_va),
                  epochs=1, verbose=1)
    acum_tr_acc = acum_tr_acc + h.history['accuracy']
    acum_tr_loss = acum_tr_loss + h.history['loss']
    val_acc = h.history['val_accuracy'][0]
    val_loss = h.history['val_loss'][0]
    acum_val_acc = acum_val_acc + [val_acc]
    acum_val_loss = acum_val_loss + [val_loss]
    # if val_acc > best_val_acc:
    if val_loss < best_val_loss:
        best_i = len(acum_val_acc) - 1
        best_val_acc = val_acc
        best_val_loss = val_loss
        best_weights = model.get_weights().copy()
    if len(acum_tr_acc) > 1 and (len(acum_tr_acc) + 1) % 1 == 0:
        if e > 1:
            clear_output()
The code you posted is really bad practice.
You can either train for n_epocas in a single call and add callbacks to get the best weights (e.g. ModelCheckpoint), or use tf.GradientTape; calling model.fit() for one epoch at a time can lead to weird results, since your optimizer doesn't know which epoch it is at.
I suggest keeping your current code but training for all n_epocas in one go, and reporting the results here (accuracy + loss).
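For illustration, a hedged sketch of the single-call version suggested above: train for all n_epocas at once and let ModelCheckpoint keep the weights of the best epoch (the filepath is a placeholder; the other variables are from the question).

from keras.callbacks import ModelCheckpoint

checkpoint = ModelCheckpoint('best_weights.h5',
                             monitor='val_loss',
                             mode='min',
                             save_best_only=True,
                             save_weights_only=True,
                             verbose=1)

history = model.fit(X_tr, y_tr_,
                    batch_size=batch_size,
                    validation_data=(X_va, y_va),
                    epochs=n_epocas,
                    callbacks=[checkpoint],
                    verbose=1)
# history.history then holds the per-epoch accuracy and loss curves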
Someone gave me the solution. I just had to change this line:
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=0.001),
              metrics=['Accuracy'])
For this:
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=0.001),
              metrics=['acc'])
I also changed the lines in the final loop relating to accuracy. The one-hot encoding was necessary as well.

Keras model doesn't learn at all

My model weights (I output them to weights_before.txt and weights_after.txt) are precisely the same before and after the training, i.e. the training doesn't change anything, there's no fitting happening.
My data look like this (I basically want the model to predict the sign of feature, result is 0 if feature is negative, 1 if positive):
,feature,zerosColumn,result
0,-5,0,0
1,5,0,1
2,-3,0,0
3,5,0,1
4,3,0,1
5,3,0,1
6,-3,0,0
...
Brief summary of my approach:
Load the data.
Split it column-wise to x (feature) and y (result), split these two row-wise to test and validation sets.
Transform these sets into TimeseriesGenerators (not necessary in this scenario but I want to get this setup working and I don't see any reason why it shouldn't).
Create and compile simple Sequential model with few Dense layers and softmax activation on its output layer, use binary_crossentropy as loss function.
Train the model... nothing happens!
Complete code follows:
import keras
import pandas as pd
import numpy as np
np.random.seed(570)
TIMESERIES_LENGTH = 1
TIMESERIES_SAMPLING_RATE = 1
TIMESERIES_BATCH_SIZE = 1024
TEST_SET_RATIO = 0.2 # the portion of total data to be used as test set
VALIDATION_SET_RATIO = 0.2 # the portion of total data to be used as validation set
RESULT_COLUMN_NAME = 'feature'
FEATURE_COLUMN_NAME = 'result'
def create_network(csv_path, save_model):
before_file = open("weights_before.txt", "w")
after_file = open("weights_after.txt", "w")
data = pd.read_csv(csv_path)
data[RESULT_COLUMN_NAME] = data[RESULT_COLUMN_NAME].shift(1)
data = data.dropna()
x = data.ix[:, 1:2]
y = data.ix[:, 3]
test_set_length = int(round(len(x) * TEST_SET_RATIO))
validation_set_length = int(round(len(x) * VALIDATION_SET_RATIO))
x_train_and_val = x[:-test_set_length]
y_train_and_val = y[:-test_set_length]
x_train = x_train_and_val[:-validation_set_length].values
y_train = y_train_and_val[:-validation_set_length].values
x_val = x_train_and_val[-validation_set_length:].values
y_val = y_train_and_val[-validation_set_length:].values
train_gen = keras.preprocessing.sequence.TimeseriesGenerator(
x_train,
y_train,
length=TIMESERIES_LENGTH,
sampling_rate=TIMESERIES_SAMPLING_RATE,
batch_size=TIMESERIES_BATCH_SIZE
)
val_gen = keras.preprocessing.sequence.TimeseriesGenerator(
x_val,
y_val,
length=TIMESERIES_LENGTH,
sampling_rate=TIMESERIES_SAMPLING_RATE,
batch_size=TIMESERIES_BATCH_SIZE
)
model = keras.models.Sequential()
model.add(keras.layers.Dense(10, activation='relu', input_shape=(TIMESERIES_LENGTH, 1)))
model.add(keras.layers.Dropout(0.2))
model.add(keras.layers.Dense(10, activation='relu'))
model.add(keras.layers.Dropout(0.2))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(1, activation='softmax'))
for item in model.get_weights():
before_file.write("%s\n" % item)
model.compile(
loss=keras.losses.binary_crossentropy,
optimizer="adam",
metrics=[keras.metrics.binary_accuracy]
)
history = model.fit_generator(
train_gen,
epochs=10,
verbose=1,
validation_data=val_gen
)
for item in model.get_weights():
after_file.write("%s\n" % item)
before_file.close()
after_file.close()
create_network("data/sign_data.csv", False)
Do you have any ideas?
The problem is that you are using softmax as the activation function of the last layer. Essentially, softmax normalizes its input so that the elements sum to one. Therefore, if you use it on a layer with only one unit (i.e. Dense(1, ...)), it will always output 1. To fix this, change the activation function of the last layer to sigmoid, which outputs a value in the range (0, 1).
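Following that answer, a minimal sketch of the corrected output layer and the same compile call as in the question:

model.add(keras.layers.Dense(1, activation='sigmoid'))   # sigmoid instead of softmax
model.compile(
    loss=keras.losses.binary_crossentropy,
    optimizer="adam",
    metrics=[keras.metrics.binary_accuracy]
)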
