DQN Training Cycles

I am currently training a DQN agent to play the card game Schnapsen, and I was wondering how the training cycle works. At the moment the agent plays against a bot that plays random cards; I save the states and calculate Q-values. I know the agent has a fixed amount of memory allocated to it (mine is set to 5000), and every time its memory exceeds the batch size (mine is set to 32), it replays a random mini-batch from the memory to retrain the model. My question is: once I cross that threshold, so that len(agent.memory) > batch_size, how many times is this event going to happen? Does this mean that from then on I have to retrain my network after every move (since the agent's memory is larger than the batch size), or do I have to reset the agent's memory? Ultimately I want to train the agent for 10,000 games. If I retrain after every move past a certain point, won't that overfit my network or damage any good strategies it might already have learned?
I have tried the following approach:
I train the network after 100 games. This means that after 100 games I call the replay() function, which takes a random batch sample from the agent's memory and retrains the network.
My only concern is what happens when the agent's memory is full. Will it start replacing previous states with new ones? Is that beneficial to the training, or do I want to avoid it?
For reference, I am using the Python Keras library, and this is my NN model and replay function:
def _build_model(self):
    # Neural Net for Deep-Q learning Model
    model = Sequential()
    model.add(Dense(128, input_dim=self.state_size, activation='relu'))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(self.action_size, activation='softmax'))
    model.compile(loss='mse',
                  optimizer=Adam(lr=self.learning_rate))
    return model
def get_valid(self,state,valid_moves):
    valid_int_moves = [self.move_to_int(move) for move in valid_moves]
    act_values = self.target_model.predict(state)[0]
    print("Type")
    print(type(self.target_model.predict(state)))
    #print(len(act_values[0]))
    legal_q_values = [act_values[i] for i in valid_int_moves]
    #print(legal_q_values)
    if len(legal_q_values) == 0:
        return 35
    action = [np.argmax(legal_q_values)]
    #print(action[0])
    return action[0]
def replay(self, batch_size):
    minibatch = random.sample(self.memory, batch_size)
    for state, action, reward, next_state, done,valid_moves,next_valid_moves in minibatch:
        target = reward
        pred = self.get_valid(next_state,next_valid_moves)
        if not done or pred != 35:
            target = (reward + self.gamma *
                      pred)
        target_f = self.model.predict(state)
        target_f[0][action] = target
        self.model.fit(state, target_f, epochs=1, verbose=0)
    if self.epsilon > self.epsilon_min:
        self.epsilon *= self.epsilon_decay
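To make the memory question concrete, here is a minimal sketch. It assumes the agent's memory is a collections.deque created with maxlen, which is common in Keras DQN tutorials but is not shown in the question. The helper names remember, can_replay and q_target are hypothetical. With a bounded deque, the oldest transitions are dropped automatically once the capacity is reached, and the condition len(memory) > batch_size stays true on every later step. The q_target helper shows the usual DQN target, the reward plus the discounted best legal Q-value of the next state (a Q-value, not an action index).

```python
import random
from collections import deque

import numpy as np

MEMORY_SIZE = 5000  # capacity mentioned in the question
BATCH_SIZE = 32     # batch size mentioned in the question

# Assumption: a bounded deque; appending to a full deque evicts the oldest item.
memory = deque(maxlen=MEMORY_SIZE)


def remember(transition):
    """Store one transition; no manual reset of the memory is needed."""
    memory.append(transition)


def can_replay():
    """True on every step once the memory holds more than one batch."""
    return len(memory) > BATCH_SIZE


def sample_batch():
    """Uniform random mini-batch, as random.sample does in replay()."""
    return random.sample(list(memory), BATCH_SIZE)


def q_target(reward, next_q_values, next_valid_moves, done, gamma=0.95):
    """Standard DQN target: r + gamma * max over legal next actions of Q(s', a')."""
    if done or len(next_valid_moves) == 0:
        return reward
    return reward + gamma * float(np.max(next_q_values[next_valid_moves]))
```

How often replay() is called once can_replay() is true (every move, every game, every 100 games) is a design choice; the sketch only shows that the condition itself never becomes false again and that old experience is overwritten gradually rather than all at once.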

Related

Image sequence detection with Keras, Convolutional and Stateful Neural Network

I am trying to write a pretty complicated neural network (at least for me) in keras that needs to combine both a common CNN structure and an LSTM/GRU layer.
Basically, I have a dataset of climatological maps of the Mediterranean sea, each map details the wind, pressure and other parameters. I am studying Medicanes (Mediterranean hurricanes) and my goal is to create a neural network that can classify each map with a label zero if there is no trace of such hurricanes or one if the map contains one.
In order to achieve that I need a network with two parts:
feature extractor (normal CNN).
temporal layer (LSTM/GRU).
The main cause of this is that each map is correlated with the previous one because the formation and life cycle of a Medicane can take several days to complete.
Important note: the dataset is too big to be uploaded all at once so I have to work one batch at a time.
I am working with Keras and I found it pretty challenging to adapt its standard framework to my needs so I have come up with some peculiar flow to feed my data into the network.
In particular, I found it hard to pass both my batch size and my time-step parameter to the GRU layer using a more standard alternative.
This is what I tried:
I am positively sure I have overcomplicated the task, but, as I said I am not very proficient with Keras and TensorFlow.
The main problem was that I could not find a way to import the data both in a batch (for RAM reasons) and in a sequence of 10-15 pictures (to be used as the time steps in the GRU layer).
I solved this problem by importing batches of 120 maps in order (no shuffle). I then wrote a way to turn these batches into the sequences of images I needed, re-batched the sequences and fed them to the model manually.
Data Import
batch_size=120
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "./Figures_1/Train",
    validation_split=None,
    subset=None,
    labels="inferred",
    label_mode="binary",
    color_mode="rgb",
    interpolation='bilinear',
    batch_size=batch_size,
    image_size=(600, 600),
    shuffle=False,
    seed=123
)
Get a sequence of Images
Here, I break down the 120 map batches into sequences of 60 observations, and I return each sequence one at a time.
sequence_lengh=60
def sequence_x(train_dataset):
    x_numpy = np.asarray(list(map(lambda x: x[0], tfds.as_numpy(train_dataset))),dtype=object)
    for element in range(0,x_numpy.shape[0]):
        for i in range(0, x_numpy.shape[0],sequence_lengh):
            x_seq = x_numpy[element][i:i+sequence_lengh]
            yield x_seq
def sequence_y(train_dataset):
    y_numpy = np.asarray(list(map(lambda x: x[1], tfds.as_numpy(train_dataset))),dtype=object)
    for element in range(0,y_numpy.shape[0]):
        for i in range(0, y_numpy.shape[0],sequence_lengh):
            y_seq = y_numpy[element][i:i+sequence_lengh]
            yield y_seq
CNN Model
I build the CNN model based on a pre-trained DenseNet
from keras.layers import TimeDistributed, GRU
def build_convnet(shape=(600, 600, 3)):
    inputs = keras.Input(shape = shape)
    x = inputs
    # preprocessing
    x = keras.applications.densenet.preprocess_input(x)
    #Convbase
    x = convBase(x)
    x = layers.Flatten()(x)
    # Fine tuning
    x = keras.layers.Dense(1024, activation='relu')(x)
    x = layers.Dropout(0.2)(x)
    x = keras.layers.Dense(512, activation='relu')(x)
    x = keras.layers.GlobalMaxPool2D()
    return x
GRU Model
I build the time part of the network with a GRU layer
def action_model(shape=(15, 600, 600, 3), nbout=15):
    # Create our convnet with (112, 112, 3) input shape
    convnet = build_convnet(shape[1:]) #[1:]
    # then create our final model
    model = keras.Sequential()
    # add the convnet with (5, 112, 112, 3) shape
    model.add(TimeDistributed(convnet, input_shape=shape))
    # here, you can also use GRU or LSTM
    model.add(GRU(64))
    # and finally, we make a decision network
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(15, activation='softmax'))
    return model
Transfer Learning
I retrain a part of the GRU
convBase = DenseNet121(include_top=False, weights=None, input_shape=(600,600,3), pooling="avg")
for layer in convBase.layers:
    if 'conv5' in layer.name:
        layer.trainable = True
for layer in convBase.layers:
    if 'conv4' in layer.name:
        layer.trainable = True
Model Compile
Model compilation ( image size= 600x600x3)
INSHAPE=(15, 600, 600, 3) # (5, 112, 112, 3)
model = action_model(INSHAPE, 1)
optimizer = keras.optimizers.Adam(0.001)
model.compile(
    optimizer,
    'categorical_crossentropy',
    metrics='accuracy'
)
Model Fit
Here I manually batch my data. I turn an array (60, 600, 600, 3) into a (4,15,600,600) array. Meaning 4 batches each one containing a 15-map long sequence.
epochs = 10
for value in range(0, epochs):
    train_x, train_y = sequence_x(train_ds), sequence_y(train_ds)
    val_x, val_y = sequence_x(validation_ds), sequence_y(validation_ds)
    for i in range(0,278): #
        x = next(train_x, "none")
        y = next(train_y, "none")
        if (x!="none" or y!="none"):
            if (np.any(x) and np.any(y)):
                x_stack = np.stack((x[:15], x[15:30], x[30:45], x[45:]))
                y_stack = np.stack((y[:15], y[15:30], y[30:45], y[45:]))
                y_stack=y_stack.reshape(4,15)
                model.fit(x=x_stack, y=y_stack,
                          validation_data=None,
                          batch_size=None,
                          shuffle=False
                          )
            else:
                continue
        else:
            continue
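The manual re-batching described above (60 consecutive maps turned into 4 sequences of 15) can also be written as a single reshape. The following is only a sketch with NumPy: the helper name to_sequences is hypothetical, and tiny 8x8 arrays stand in for the 600x600 maps so that it runs quickly.

```python
import numpy as np


def to_sequences(batch, steps):
    """Reshape (N, ...) into (N // steps, steps, ...), dropping any remainder.

    Consecutive items stay together, so the time order inside each
    sequence is preserved (equivalent to stacking slices of length `steps`).
    """
    n_seq = batch.shape[0] // steps
    return batch[: n_seq * steps].reshape((n_seq, steps) + batch.shape[1:])


# Small stand-ins for 60 ordered maps and their binary labels.
x = np.zeros((60, 8, 8, 3))
y = np.arange(60).reshape(60, 1)

x_seq = to_sequences(x, 15)  # shape (4, 15, 8, 8, 3)
y_seq = to_sequences(y, 15)  # shape (4, 15, 1)
```

Because the reshape keeps consecutive maps together, it only makes sense when the batch was loaded without shuffling, as in the data import step.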
The idea is to get a model that, when presented with a sequence of images, can label each one of them with a 0 or a 1 depending on whether it contains a Medicane.
The model compiles without any errors, but the results it provides are horrible.
What am I doing incorrectly? Is there a more effective way to write all of this?

Keras neural network predicting the same output

I need to develop a neural network with Keras to predict a disease using genetic data. It is known that predicting this disease is possible even with logistic regression (although the predictions in that case are of very poor quality). It's worth mentioning that my data is imbalanced, so I introduced class weights later.
I decided to start with the simplest way to predict it: a network analogous to logistic regression, with one hidden layer containing one neuron. This achieved a bad result, but at least some result: an F1 score of 0.12-0.14. Then I tried 2 hidden layers and 1 output layer, with a different number of neurons in the first hidden layer, from 1 to 8.
It turns out that in some cases it learns something, and in others it predicts the same output for every sample. I plotted the accuracy and the loss function over the epochs, and this is what I get:
Network loss function by epoch. It's clear that the loss function has roughly the same value, for the training data.
Network accuracy by epoch. It's clear that the accuracy is not improving, but fluctuates from 0 to 1
I searched for similar questions and the suggestions were the following:
Add more neurons - I have to make it work with 1, 2 or more neurons in the first layer, so I can't add neurons to that one. I increased the number of neurons in the second hidden layer up to 20, but then it stopped predicting anything, whatever the number of neurons in the first layer.
Add more layers - I tried adding one more layer, but I still have the same problem.
Introduce dropout and increase it - what dropout are we talking about if it can learn with just one layer and one neuron in it?
Reduce the learning rate - I decreased it from the default 10^(-3) to 10^(-4).
Reduce the batch size - I varied it from 500 samples in a mini-batch down to 1 (stochastic gradient descent).
More epochs - isn't 20 to 50 epochs on a 500,000-sample dataset enough?
Here's the model:
def run_nn_class_weights(data, labels, model):
    n_iter = 20
    predicted = None
    true = None
    print('Splitting the data')
    x_train, x_valid, y_train, y_valid = train_test_split(data, labels, test_size = 0.05)
    #model = create_model()
    early_stopping_monitor=EarlyStopping(patience=240)
    class_weights = class_weight.compute_class_weight('balanced',
                                                      np.unique(labels),
                                                      labels)
    class_weights = dict(enumerate(class_weights))
    hist = model.fit(x_train, y_train, validation_data=[x_valid, y_valid], class_weight=class_weights,
                     epochs=n_iter, batch_size=500, shuffle=True, callbacks=[early_stopping_monitor],verbose=1)
    proba = model.predict(data)
    predicted = proba.flatten()
    true = labels
    return(model, proba, hist)
def old_model_n_pred(n_neurons_1st = 1):
    model = Sequential()
    model.add(Dense(n_neurons_1st, activation='relu', input_shape=(7516,), kernel_initializer='glorot_normal'))
    model.add(Dense(8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    #model.add(Flatten())
    model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    return model
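For readers unfamiliar with the 'balanced' class weights used above, here is a small NumPy sketch of the formula that this mode is understood to apply, n_samples / (n_classes * count_of_class). The function name balanced_class_weights is hypothetical and only illustrates the arithmetic; it is not the scikit-learn implementation.

```python
import numpy as np


def balanced_class_weights(labels):
    """Weight per class: n_samples / (n_classes * count_of_class).

    Rare classes get weights above 1, frequent classes below 1.
    """
    classes, counts = np.unique(labels, return_counts=True)
    weights = len(labels) / (len(classes) * counts)
    return {int(c): float(w) for c, w in zip(classes, weights)}


# Three healthy samples and one diseased sample.
weights = balanced_class_weights(np.array([0, 0, 0, 1]))
```

In this toy case the minority class ends up weighted three times as heavily as the majority class, which is the effect the class_weight argument of model.fit relies on.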

This is a small network, and it should be able to converge to something other than an attractor (getting stuck on a single output value).
I suggest taking a look at the weights of all the neurons with ReLU activation.
ReLUs are great because they are cheap to compute, but half of a ReLU's input range has a derivative of zero, which does not help gradient descent. This might be what is happening in your case.
I guess that in your case the culprit would be the first neuron.
To overcome this problem, I would try normalizing the inputs (so that all samples are centered around 0.5 and scaled by the standard deviation). If you do this with a ReLU, it will ignore anything in the range [-inf, sd].
If that does not fix the problem, switch to a different activation function in the first layer. A sigmoid will work very well, and it is not too expensive for just one neuron.
Also, take a close look at your input distribution. What your network actually does is a sigmoid-like classification, followed by 4 to 8 neurons that "zoom in" on and correct the important parts of the function that the first transformation did not account for.
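The check suggested above can be sketched with plain NumPy. The helper names dead_fraction and standardize are hypothetical; the kernel and bias would come from something like model.layers[0].get_weights() in Keras. Note that standardize below uses the common zero-mean variant, which is an assumption and differs from the 0.5 centering mentioned in the answer.

```python
import numpy as np


def dead_fraction(x, weights, bias):
    """Fraction of samples for which each ReLU unit outputs exactly zero.

    A value close to 1.0 means the unit is effectively dead on this data:
    its gradient is zero for almost every sample.
    """
    pre_activation = x @ weights + bias
    return np.mean(pre_activation <= 0, axis=0)


def standardize(x, eps=1e-8):
    """Zero-mean, unit-variance scaling per feature."""
    return (x - x.mean(axis=0)) / (x.std(axis=0) + eps)


# Toy data: one ReLU unit with negative weights on strictly positive inputs.
x = np.array([[1.0, 2.0], [3.0, 4.0]])
w = np.array([[-1.0], [-1.0]])
b = np.array([0.0])
print(dead_fraction(x, w, b))               # the unit never fires on raw inputs
print(dead_fraction(standardize(x), w, b))  # after scaling it fires on some samples
```

If the single first-layer neuron reports a dead fraction near 1.0, the network can only output a constant, which matches the symptom described in the question.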

Training with keras using fragments of data

I train a sequential model (20 dense layers) in Keras (Python) using default settings and just 1 epoch.
All layers use ReLU activation, except the last one, which uses a sigmoid.
METHOD A:
Feed model with 1,000,000 records of labeled training data.
METHOD B:
Train model with 50,000 records
Save the model
Do some stuff
Load saved model
Train with another 50,000 records
Repeat until all 1,000,000 records are used
Why is there a discrepancy between the above 2 methods?
I always get better accuracy using all data at once, than using it in groups.
What is the reason for that?
model = Sequential()
model.add(Dense(30, input_dim = 27, activation = 'relu'))
...
model.add(Dense(1, input_dim = 10, activation = 'sigmoid'))
model.compile(loss = 'binary_crossentropy', optimizer = 'sgd', metrics = ['accuracy'])
model.load_weights(PreviousWeightsFile)
model.fit(X, Y, verbose = 0)
model.save_weights(WeightsFile)
(exit python and do some stuff)

From the documentation, here are the crucial model parameters for your question:
initial_epoch: Integer. Epoch at which to start training (useful for
resuming a previous training run).
and
epochs: Integer. Number of epochs to train the model. An epoch is an
iteration over the entire x and y data provided. Note that in
conjunction with initial_epoch, epochs is to be understood as "final
epoch". The model is not trained for a number of iterations given by
epochs, but merely until the epoch of index epochs is reached.
You are not using these parameters, so you are overwriting your weights instead of resuming training as you could with the epochs parameter. That's the reason why your model always performs worse with method B.
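The "final epoch" wording in the documentation excerpt is easy to misread, so here is a small sketch of the arithmetic. The helper resume_fit_kwargs is hypothetical; it only builds the initial_epoch and epochs keyword arguments that model.fit accepts, under the assumption that each chunk of data is trained for a fixed number of epochs.

```python
def resume_fit_kwargs(chunk_index, epochs_per_chunk=1):
    """Keyword arguments for model.fit so that chunk k continues the epoch count.

    `epochs` is the index of the final epoch, not the number of epochs to run,
    so the number of epochs actually trained is `epochs - initial_epoch`.
    """
    initial_epoch = chunk_index * epochs_per_chunk
    return {"initial_epoch": initial_epoch,
            "epochs": initial_epoch + epochs_per_chunk}


# Third chunk (index 2), one epoch per chunk: start at epoch 2, stop at epoch 3.
kwargs = resume_fit_kwargs(2)
# Hypothetical usage: model.fit(X_chunk, Y_chunk, verbose=0, **kwargs)
```

If epochs were left at 1 while initial_epoch was set to 2, no training would happen at all, because the final epoch index would already have been passed.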

When all the data is present at once, the interactions between features, and the backpropagation that results from them, are more accurate; this lets the features and the architecture of the model build on what was learned in additional epochs.
When you save and reload, you essentially restart this process.

How to choose batch_size, steps_per_epoch and epoch with Keras generator

I'm training 2 different CNN (custom and transfer learning) for an image classification problem.
I use the same generator for both models.
The dataset contains 5000 samples for 5 classes, but is imbalanced.
Here's the custom model I'm using.
def __init__(self, transfer_learning = False, lambda_reg = 0.001, drop_out_rate = 0.1):
    if(transfer_learning == False):
        self.model = Sequential()
        self.model.add(Conv2D(32, (3,3), input_shape = (224,224,3), activation = "relu"))
        self.model.add(MaxPooling2D(pool_size = (2,2)))
        self.model.add(Conv2D(64, (1,1), activation = "relu"))
        self.model.add(MaxPooling2D(pool_size = (2,2)))
        self.model.add(Conv2D(128, (3,3), activation = "relu"))
        self.model.add(MaxPooling2D(pool_size = (2,2)))
        self.model.add(Conv2D(128, (1,1), activation = "relu"))
        self.model.add(MaxPooling2D(pool_size = (2,2)))
        self.model.add(Flatten())
        self.model.add(Dense(512))
        self.model.add(Dropout(drop_out_rate))
        self.model.add(Dense(256))
        self.model.add(Dropout(drop_out_rate))
        self.model.add(Dense(5, activation = "softmax"))
I can't understand the relation between steps_per_epoch and batch_size.
batch_size is the number of samples the generator sends at a time.
But is steps_per_epoch the number of batches needed to complete one training epoch?
If so, then it should be: steps_per_epoch = total_samples/batch_size ?
Whatever value I try, I always get the same problem (on both models): the val_acc seems to reach a local optimum.

You are mixing two issues here. One is how to determine batch_size vs steps_per_epoch; the other is why val_acc seems to reach a local optimum and won't continue improving.
(1) The first issue: batch_size vs steps_per_epoch
The strategy should be to first make batch_size as large as memory permits, especially when you are using a GPU (4~11GB). Normally batch_size=32 or 64 should be fine, but in some cases you'd have to reduce it to 8, 4, or even 1. The training code will throw an exception if there is not enough memory to allocate, so you know when to stop increasing the batch_size.
Once batch_size is set, steps_per_epoch can be calculated as math.ceil(total_samples / batch_size), but sometimes you may want to set it a few times larger when data augmentation is used.
(2) The second issue: val_acc reaches a local optimum and won't continue improving
It is the crux of the matter for deep learning, isn't it? It makes DL both exciting and difficult at the same time. The batch_size, steps_per_epoch and number of epochs won't help much here. It is the model and the hyperparameters (such as learning rate, loss function, optimization function, etc.) that control how the model performs.
A few easy tips are to try different learning rates and different optimization functions. If you find the model is overfitting (val_acc going down with more epochs), increasing the sample size always helps if it is possible. Data augmentation helps to some degree.
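For point (1), the calculation can be sketched as follows. The function name and the augmentation_factor parameter are hypothetical; the factor is just one way to express "a few times larger" when augmentation is used.

```python
import math


def steps_per_epoch(total_samples, batch_size, augmentation_factor=1):
    """Number of batches per epoch, rounded up so the last partial batch counts.

    augmentation_factor > 1 makes each epoch pass over the generator several
    times, which is one way to expose more augmented variants per epoch.
    """
    return math.ceil(total_samples / batch_size) * augmentation_factor


# With the 5000 samples from the question:
print(steps_per_epoch(5000, 32))     # 157, since 5000 / 32 = 156.25
print(steps_per_epoch(5000, 64))     # 79
print(steps_per_epoch(5000, 32, 3))  # 471
```

Rounding up rather than down matters here: with integer division the last 8 of the 5000 samples would never be seen when the batch size is 32.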

First of all, steps_per_epoch = total_samples/batch_size is correct in general terms.
Here is some example code from TensorFlow that follows the same idea:
for epoch in range(training_epochs):
    avg_cost = 0
    total_batch = int(mnist.train.num_examples / batch_size)
    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        feed_dict = {X: batch_xs, Y: batch_ys}
        c, _ = sess.run([cost, optimizer], feed_dict=feed_dict)
        avg_cost += c / total_batch
    print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))
print('Learning Finished!')
By the way, although it is not directly related to your question: there are various optimizers, such as stochastic gradient descent and Adam, because learning would otherwise take too long on a heavy data set.
They do not learn from all of the data at every step. There are many articles about that; here I just leave one of them.
As for your val_acc, it seems that your model has a lot of convolution layers.
You reduced the filters and the max pooling of the convolution layers, but I think it is still too much. How is it going? Is it better than before?

TensorFlow 2.0: Eager execution of training either returns bad results or doesn't learn at all

I am experimenting with TensorFlow 2.0 (alpha). I want to implement a simple feed forward Network with two output nodes for binary classification (it's a 2.0 version of this model).
This is a simplified version of the script. After I defined a simple Sequential() model, I set:
# import layers + dropout & activation
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.activations import elu, softmax
# Neural Network Architecture
n_input = X_train.shape[1]
n_hidden1 = 15
n_hidden2 = 10
n_output = y_train.shape[1]
model = tf.keras.models.Sequential([
    Dense(n_input, input_shape = (n_input,), activation = elu), # Input layer
    Dropout(0.2),
    Dense(n_hidden1, activation = elu), # hidden layer 1
    Dropout(0.2),
    Dense(n_hidden2, activation = elu), # hidden layer 2
    Dropout(0.2),
    Dense(n_output, activation = softmax) # Output layer
])
# define loss and accuracy
bce_loss = tf.keras.losses.BinaryCrossentropy()
accuracy = tf.keras.metrics.BinaryAccuracy()
# define optimizer
optimizer = tf.optimizers.Adam(learning_rate = 0.001)
# save training progress in lists
loss_history = []
accuracy_history = []
# loop over 1000 epochs
for epoch in range(1000):
    with tf.GradientTape() as tape:
        # take binary cross-entropy (bce_loss)
        current_loss = bce_loss(model(X_train), y_train)
        # Update weights based on the gradient of the loss function
        gradients = tape.gradient(current_loss, model.trainable_variables)
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    # save in history vectors
    current_loss = current_loss.numpy()
    loss_history.append(current_loss)
    accuracy.update_state(model(X_train), y_train)
    current_accuracy = accuracy.result().numpy()
    accuracy_history.append(current_accuracy)
    # print loss and accuracy scores each 100 epochs
    if (epoch+1) % 100 == 0:
        print(str(epoch+1) + '.\tTrain Loss: ' + str(current_loss) + ',\tAccuracy: ' + str(current_accuracy))
    accuracy.reset_states()
print('\nTraining complete.')
Training runs without errors, but strange things happen:
Sometimes the network doesn't learn anything. All loss and accuracy scores are constant throughout all the epochs.
Other times the network is learning, but very badly. Accuracy never went beyond 0.4 (while in TensorFlow 1.x I got an effortless 0.95+). Such low performance suggests to me that something went wrong in the training.
Other times the accuracy improves very slowly, while the loss remains constant all the time.
What can cause these problems? Please help me understand my mistakes.
UPDATE:
After some corrections, I can make the network learn. However, its performance is extremely poor. After 1000 epochs it reaches about 40% accuracy, which clearly means something is still wrong. Any help is appreciated.

tf.GradientTape records every operation that happens inside its scope.
You don't want the tape to record the gradient calculation itself; inside the tape you only want to compute the forward pass and the loss.
with tf.GradientTape() as tape:
    # take binary cross-entropy (bce_loss)
    # note: Keras losses are called as loss(y_true, y_pred)
    current_loss = bce_loss(classification, model(df))
# End of tape scope
# Update weights based on the gradient of the loss function
gradients = tape.gradient(current_loss, model.trainable_variables)
# The tape is now consumed
optimizer.apply_gradients(zip(gradients, model.trainable_variables))
More importantly, I don't see the loop on the training set, therefore I suppose the complete code looks like:
for epoch in range(n_epochs):
    for df, classification in dataset:
        # your code that computes loss and trains
Moreover, the usage of the metrics is wrong.
You want to accumulate, thus update the internal state of the accuracy operation, at every training step and measure the overall accuracy at the end of every epoch.
Thus you have to:
# Measure the accuracy inside the training loop
# note: Keras metrics are updated as update_state(y_true, y_pred)
accuracy.update_state(classification, model(df))
And call accuracy.result() only at the end of the epoch, when all the accuracy values have been accumulated in the metric.
Remember to call the .reset_states() method at the end of every epoch to clear the metric's internal state, resetting it to zero.
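The update/result/reset cycle described above can be illustrated without TensorFlow. The class below is a hypothetical stand-in, not the Keras metric; it only mimics the idea that update_state accumulates counts per batch, result reads the accumulated value, and reset_states clears it at the end of an epoch. The argument order (y_true first, then y_pred) follows the Keras convention.

```python
import numpy as np


class RunningAccuracy:
    """Hypothetical stand-in that mimics a stateful binary accuracy metric."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.reset_states()

    def reset_states(self):
        # Clear the accumulated counts (call once per epoch).
        self.correct = 0
        self.total = 0

    def update_state(self, y_true, y_pred):
        # Accumulate counts for one batch; does not return a value.
        y_true = np.asarray(y_true)
        predicted = (np.asarray(y_pred) > self.threshold).astype(y_true.dtype)
        self.correct += int(np.sum(predicted == y_true))
        self.total += y_true.size

    def result(self):
        # Accuracy over everything seen since the last reset.
        return self.correct / self.total if self.total else 0.0


metric = RunningAccuracy()
batches = [([1, 0], [0.9, 0.2]), ([1, 1], [0.4, 0.8])]
for y_true, y_pred in batches:
    metric.update_state(y_true, y_pred)  # inside the training loop
epoch_accuracy = metric.result()         # once, at the end of the epoch
metric.reset_states()                    # before the next epoch starts
```

Reading result() after every batch would still work, but it would report a running average over the epoch so far rather than the accuracy of that single batch.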
