Keras-RL2 and Tensorflow 1-2 Incompatibility - python

I am getting:
tensorflow.python.framework.errors_impl.OperatorNotAllowedInGraphError: Using a symbolic `tf.Tensor` as a Python `bool` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.
while trying to fit a DDPG agent on a custom environment.
Here is the CustomEnv()
class CustomEnv(Env):
    def __init__(self):
        print("Test_3 : Init")
        """NOTE: Bool array element definition for Box action space needs to be determined !!!!"""
        self.action_space = Tuple((Box(low=4, high=20, shape=(1, 1)),
                                   Box(low=0, high=1, shape=(1, 1)),
                                   MultiBinary(1),
                                   MultiBinary(1),
                                   Box(low=4, high=20, shape=(1, 1)),
                                   Box(low=0, high=1, shape=(1, 1)),
                                   MultiBinary(1),
                                   MultiBinary(1),
                                   Box(low=0, high=100, shape=(1, 1)),
                                   Box(low=0, high=100, shape=(1, 1))))
        """Accuracy array"""
        self.observation_space = Box(low=np.asarray([0]), high=np.asarray([100]))
        """Initial Space"""
        self.state = return_Acc(directory=source_dir, input_array=self.action_space.sample())
        self.episode_length = 20
        print(f"Action Space sample = {self.action_space.sample()}")
        print("Test_3 : End Init")

    def step(self, action):
        print(f"Model Action Space Output = {action}")
        print("Test_2 : Step")
        accuracy_of_model = random.randint(0, 100)  # return_Acc(directory=source_dir, input_array=action)
        self.state = accuracy_of_model  # round(100*abs(accuracy_of_model))
        self.episode_length -= 1
        # Calculating the reward
        print(f"self.state = {self.state}, accuracy_of_model = {accuracy_of_model}")
        if (self.state > 60):
            reward = self.state
        else:
            reward = -(60 - self.state) * 10
        if self.episode_length <= 0:
            done = True
        else:
            done = False
        # Setting the placeholder for info
        info = {}
        # Returning the step information
        print("Test_2 : End Step")
        return self.state, reward, done, info

    def reset(self):
        print("Test_1 : Reset")
        self.state = 50
        print(f"Self state = {self.state}")
        self.episode_length = 20
        print("Test_1 : End Reset")
        return self.state
The return_Acc function runs a Random Decision Forest model and returns its accuracy to the DDPG agent so the agent can determine the next step's parameters. Finally, my DDPG model is given below:
states = env.observation_space.shape
actions = np.asarray(env.action_space.sample()).size
print(f"states = {states}, actions = {actions}")

def model_creation(states, actions):
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Dense(32, activation='relu', input_shape=states))
    model.add(tf.keras.layers.Dense(24, activation='relu'))
    model.add(tf.keras.layers.Dense(actions, activation='linear'))
    model.build()
    return model

model = model_creation(states, actions)
model.summary()

def build_agent(model, actions, critic):
    policy = BoltzmannQPolicy()
    memory = SequentialMemory(limit=50000, window_length=1)
    nafa = DDPGAgent(nb_actions=actions, actor=model, memory=memory, critic=critic, critic_action_input=action_input)
    # dqn = DQNAgent(model=model, memory=memory, policy=policy,
    #                nb_actions=actions, nb_steps_warmup=10, target_model_update=1e-2)
    return nafa

action_input = Input(shape=(actions,), name='action_input')
observation_input = Input(shape=(1,) + env.observation_space.shape, name='observation_input')
flattened_observation = Flatten()(observation_input)
x = Concatenate()([action_input, flattened_observation])
x = Dense(32)(x)
x = Activation('relu')(x)
x = Dense(32)(x)
x = Activation('relu')(x)
x = Dense(32)(x)
x = Activation('relu')(x)
x = Dense(1)(x)
x = Activation('linear')(x)
critic = Model(inputs=[action_input, observation_input], outputs=x)
print(critic.summary())

dqn = build_agent(model, actions, critic)
dqn.compile(tf.keras.optimizers.Adam(learning_rate=1e-3), metrics=['mae'])
dqn.fit(env, nb_steps=200, visualize=False, verbose=1)
results = dqn.test(env, nb_episodes=500, visualize=False)
print(f"episode_reward = {np.mean(results.history['episode_reward'])}")
I tried most of the solutions that I found here, like
tf.compat.v1.enable_eager_execution()
and combinations of it with other functions (such as enable_v2_behavior()), but I couldn't make it work. If I don't run the RDF model inside DDPG, no problem occurs. If it's possible, how can I connect the RDF model's accuracy output to self.state as an input?
keras-rl2 1.0.5
tensorflow-macos 2.10.0
And I'm using an M1-based Mac, if that matters.

To anyone interested in the solution: I came up with a slower but at least working workaround, and it's actually simpler than expected. Insert a command that runs the model script from the terminal and writes its output to a text file, then read that text file from the RL agent script, and in turn write the action space values to a text file that the model can read to create the next observation.
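A minimal sketch of that workaround, assuming a hypothetical rdf_model.py script that reads the action values from actions.txt and writes its accuracy to accuracy.txt (the script and file names are illustrative, not from the original post):

import subprocess
import numpy as np

def return_Acc(input_array, actions_path="actions.txt", accuracy_path="accuracy.txt"):
    # Flatten the Tuple action (Box and MultiBinary entries) into plain floats.
    values = [float(v) for part in input_array for v in np.ravel(part)]
    # Write the action values so the external RDF script can pick them up.
    with open(actions_path, "w") as f:
        f.write(",".join(str(v) for v in values))
    # Run the RDF model in its own process; it is expected to write its accuracy to a file.
    subprocess.run(["python", "rdf_model.py"], check=True)
    # Read the accuracy back and return it as the new state/observation.
    with open(accuracy_path) as f:
        return float(f.read().strip())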

Related

Constant loss and accuracy in pytorch

I am training a model whose output and ground truth should be binary. It's an Inception-based two-stream model: the Inception architecture is used as the encoder, and for the decoder a custom model is designed consisting of conv layers, batch normalization, upsampling, and tanh as the non-linearity. I have also tried ReLU, but still no result.
The model initializes at different values but does not update. My model's forward function is:
def forward(self, inp):
    # Preprocessing
    out = self.conv3d_1a_7x7(inp)
    skip1 = out
    out = self.maxPool3d_2a_3x3(out)
    out = self.dropout(out)
    out = self.conv3d_2b_1x1(out)
    out = self.conv3d_2c_3x3(out)
    out = self.maxPool3d_3a_3x3(out)
    out = self.dropout(out)
    out = self.mixed_3b(out)
    skip2 = out
    out = self.mixed_3c(out)
    out = self.maxPool3d_4a_3x3(out)
    out = self.dropout(out)
    out = self.mixed_4b(out)
    out = self.mixed_4c(out)
    out = self.dropout(out)
    out = self.mixed_4d(out)
    skip3 = out
    out = self.dropout(out)
    out = self.mixed_4e(out)
    out = self.mixed_4f(out)
    out = self.maxPool3d_5a_2x2(out)
    out = self.dropout(out)
    out = self.mixed_5b(out)
    out = self.mixed_5c(out)
    out = self.dropout(out)
    out = self.tconv6(out, skip1, skip2, skip3)
    out = self.sigmoid(out)
    print("Before permutation", out.shape)
    out = out.permute(0, 1, 3, 4, 2)
    out_logits = out
    return out, out_logits
My train function is:
misc, out_logits[stream] = models[stream](data[stream])
out_softmax = torch.nn.functional.softmax(out_logits[stream], 1).requires_grad_()
val, preds = torch.max(out_logits[stream].data, 1)
preds = preds.to(device, dtype=torch.float)
gt = torch.round(gt)
gt_avg = torch.mean(gt)
gt[gt > gt_avg] = 1
gt[gt <= gt_avg] = 0
out_logits[stream] = out_logits[stream].squeeze(1)
losses[stream] = criterion(preds.cpu(), gt.cpu()).requires_grad_()
if phase == 'train':
    optimizers[stream].zero_grad()
    losses[stream].backward(retain_graph=True)
    optimizers[stream].step()
running_losses[stream] += losses[stream].item() * data[stream].shape[0]
running_corrects[stream] += torch.sum(val.cpu() == gt_c.data.cpu()).item()
correct_t = torch.sum(preds == gt_c).item()
total_t = gt_c.shape[0] * gt_c.shape[1] * gt_c.shape[2] * gt_c.shape[3]
acc_epc = 100 * correct_t / total_t
for scheduler in schedulers.values():
    scheduler.step()
My loss and accuracy are always constant, as shown here.
I have tried different optimizers such as SGD, Adam, and RMSprop, and I have tried tuning the hyperparameters, but the model is not converging. What am I missing?
You send the wrong variable into the loss function if you are doing cross-entropy. Change preds to out_logits[stream], and there is no need to call .cpu() or requires_grad_():
losses[stream] = criterion(out_logits[stream], gt)
Also, you performed argmax to obtain preds; argmax is not differentiable, regardless of the loss function you use.
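A minimal, self-contained sketch of that corrected pattern, assuming criterion is torch.nn.CrossEntropyLoss and gt holds class indices (the dummy tensors and shapes are illustrative only):

import torch

# Dummy shapes just for illustration: batch of 4, 2 classes, 8x8 spatial map.
logits = torch.randn(4, 2, 8, 8, requires_grad=True)   # raw model output, gradients attached
gt = torch.randint(0, 2, (4, 8, 8))                     # binary ground-truth class indices (long)

criterion = torch.nn.CrossEntropyLoss()
loss = criterion(logits, gt)          # pass the logits, not argmax'ed predictions
loss.backward()                       # gradients now flow back into the model

# argmax is only used for reporting accuracy, never for the loss
preds = torch.argmax(logits, dim=1)
accuracy = (preds == gt).float().mean()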

Unexpected shape of output from raw_rnn and how to inspect weights in raw_rnn

I have the simple code below for testing an RNN cell by feeding its previous output back as the current input.
I want to do this after training.
When I call
tf.compat.v1.nn.raw_rnn(cell, rnn_loop)
after training, I want it to use the weights that were learned during training with another
tf.compat.v1.nn.raw_rnn(cell, rnn_loop)
Will the weights be the same, or will the weights for raw_rnn during testing be initialized from zero? I will not run sess.run(tf.initialize_all_variables()). I want to know if I can safely call
tf.compat.v1.nn.raw_rnn(cell, rnn_loop) twice and still be using the same weights.
I also want to know how to inspect the trained weight values, so that I can confirm this.
The shape of rnn_outputs_tensor is (None, 64, 128), but I am expecting (10, 64, 128) because there are 10 steps (HORIZON), right?
print(rnn_outputs_tensor.shape)
import tensorflow as tf

tf.compat.v1.disable_eager_execution()

state_size = 128
BATCH_SIZE = 64
HORIZON = 10

cell = tf.compat.v1.nn.rnn_cell.BasicRNNCell(state_size)

class RnnLoop:
    def __init__(self, initial_state, cell):
        self.initial_state = initial_state
        self.cell = cell

    def __call__(self, time, cell_output, cell_state, loop_state):
        emit_output = cell_output  # == None for time == 0
        if cell_output is None:  # time == 0
            initial_input = tf.fill([BATCH_SIZE, state_size], 0.0)
            next_input = initial_input
            next_cell_state = self.initial_state
        else:
            next_input = cell_output
            next_cell_state = cell_state
        elements_finished = (time >= HORIZON)
        next_loop_state = None
        return elements_finished, next_input, next_cell_state, emit_output, next_loop_state

initial_state_tensor = tf.zeros((BATCH_SIZE, state_size), dtype=tf.float32)
rnn_loop = RnnLoop(initial_state=initial_state_tensor, cell=cell)
rnn_outputs_tensor_array, _, _ = tf.compat.v1.nn.raw_rnn(cell, rnn_loop)
rnn_outputs_tensor = rnn_outputs_tensor_array.stack()
print(rnn_outputs_tensor.shape)

var = [v for v in tf.compat.v1.trainable_variables()]
print(var)
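Not part of the original post, but one way to inspect the trained weight values is to evaluate tf.compat.v1.trainable_variables() in a session and compare the printed values before and after the second raw_rnn call; a small sketch under the assumption that the graph above has been built (the slicing is only to keep the printout short):

with tf.compat.v1.Session() as sess:
    sess.run(tf.compat.v1.global_variables_initializer())  # or restore from a checkpoint after training
    for v in tf.compat.v1.trainable_variables():
        value = sess.run(v)                                 # numpy array holding the current weights
        print(v.name, value.shape, value.ravel()[:5])       # name, shape, and a few entries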

Delete model from GPU/CPU in Pytorch

I have a big issue with memory. I am developing a large application with a GUI for testing and optimizing neural networks. The main program shows the GUI, but training is done in a thread. In my app I need to train many models with different parameters, one after another, so I need to create a model for each attempt. When I have trained one, I want to delete it and train a new one, but I cannot delete the old model. I am trying to do something like this:
del model
torch.cuda.empty_cache()
but GPU memory doesn't change.
Then I tried this:
model.cpu()
del model
When I move the model to the CPU, GPU memory is freed but CPU memory increases.
With each training attempt, memory keeps increasing. Only when I close my app and run it again is all the memory freed.
Is there a way to delete a model permanently from the GPU or CPU?
Edit:
Code:
Thread where the process of training takes place:
class uczeniegridsearcch(QObject):
    endofoneloop = pyqtSignal()
    endofonesample = pyqtSignal()
    finished = pyqtSignal()

    def __init__(self, train_loader, test_loader, epoch, optimizer, lenoftd, lossfun, numberofsamples, optimparams, listoflabels, model_name, num_of_class, pret):
        super(uczeniegridsearcch, self).__init__()
        self.train_loaderup = train_loader
        self.test_loaderup = test_loader
        self.epochup = epoch
        self.optimizername = optimizer
        self.lenofdt = lenoftd
        self.lossfun = lossfun
        self.numberofsamples = numberofsamples
        self.acc = 0
        self.train_loss = 0
        self.sendloss = 0
        self.optimparams = optimparams
        self.listoflabels = listoflabels
        self.sel_Net = model_name
        self.num_of_class = num_of_class
        self.sel_Pret = pret
        self.modelforsend = []

    def setuptrainmodel(self):
        if self.sel_Net == "AlexNet":
            model = models.alexnet(pretrained=self.sel_Pret)
            model.classifier[6] = torch.nn.Linear(4096, self.num_of_class)
        elif self.sel_Net == "ResNet50":
            model = models.resnet50(pretrained=self.sel_Pret)
            model.fc = torch.nn.Linear(model.fc.in_features, self.num_of_class)
        elif self.sel_Net == "VGG13":
            model = models.vgg13(pretrained=self.sel_Pret)
            model.classifier[6] = torch.nn.Linear(model.classifier[6].in_features, self.num_of_class)
        elif self.sel_Net == "DenseNet201":
            model = models.densenet201(pretrained=self.sel_Pret)
            model.classifier = torch.nn.Linear(model.classifier.in_features, self.num_of_class)
        elif self.sel_Net == "MNASnet":
            model = models.mnasnet1_0(pretrained=self.sel_Pret)
            model.classifier[1] = torch.nn.Linear(model.classifier[1].in_features, self.num_of_class)
        elif self.sel_Net == "ShuffleNet v2":
            model = models.shufflenet_v2_x1_0(pretrained=self.sel_Pret)
            model.fc = torch.nn.Linear(model.fc.in_features, self.num_of_class)
        elif self.sel_Net == "SqueezeNet":
            model = models.squeezenet1_0(pretrained=self.sel_Pret)
            model.classifier[1] = torch.nn.Conv2d(512, self.num_of_class, kernel_size=(1, 1), stride=(1, 1))
            model.num_classes = self.num_of_class
        elif self.sel_Net == "GoogleNet":
            model = models.googlenet(pretrained=self.sel_Pret)
            model.fc = torch.nn.Linear(model.fc.in_features, self.num_of_class)
        return model

    def train(self):
        for x in range(self.numberofsamples):
            torch.cuda.empty_cache()
            modelup = self.setuptrainmodel()
            device = torch.device('cuda')
            optimizerup = TableWidget.setupotimfun(self, modelup, self.optimizername, self.optimparams[(x, 0)],
                                                   self.optimparams[(x, 1)], self.optimparams[(x, 2)],
                                                   self.optimparams[(x, 3)],
                                                   self.optimparams[(x, 4)], self.optimparams[(x, 5)])
            modelup = modelup.to(device)
            best_accuracy = 0.0
            train_error_count = 0
            for epoch in range(self.epochup):
                for images, labels in iter(self.train_loaderup):
                    images = images.to(device)
                    labels = labels.to(device)
                    optimizerup.zero_grad()
                    outputs = modelup(images)
                    loss = TableWidget.setuplossfun(self, lossfun=self.lossfun, outputs=outputs, labels=labels)
                    self.train_loss += loss
                    loss.backward()
                    optimizerup.step()
                    train_error_count += float(torch.sum(torch.abs(labels - outputs.argmax(1))))
                self.train_loss /= len(self.train_loaderup)
                test_error_count = 0.0
                for images, labels in iter(self.test_loaderup):
                    images = images.to(device)
                    labels = labels.to(device)
                    outputs = modelup(images)
                    test_error_count += float(torch.sum(torch.abs(labels - outputs.argmax(1))))
                test_accuracy = 1.0 - float(test_error_count) / float(self.lenofdt)
                print('%s, %d,%d: %f %f' % ("Próba nr:", x + 1, epoch, test_accuracy, self.train_loss), "Parametry: ", self.optimparams[x, :])
                self.acc = test_accuracy
                self.sendloss = self.train_loss.item()
                self.endofoneloop.emit()
            self.endofonesample.emit()
            modelup.cpu()
            del modelup, optimizerup, device, test_accuracy, test_error_count, train_error_count, loss, labels, images, outputs
            torch.cuda.empty_cache()
        self.finished.emit()
How I call the thread in the main block:
self.qtest = uczeniegridsearcch(self.train_loader, self.test_loader, int(self.InputEpoch.text()),
                                self.sel_Optim, len(self.test_dataset), self.sel_Loss,
                                int(self.numberofsamples.text()), self.params, self.listoflabels,
                                self.sel_Net, len(self.sel_ImgClasses), self.sel_Pret)
self.qtest.endofoneloop.connect(self.inkofprogress)
self.qtest.endofonesample.connect(self.inksamples)
self.qtest.finished.connect(self.prints)
testtret = threading.Thread(target=self.qtest.train)
testtret.start()
Assuming that the model creation code is run iteratively inside a loop, I suggest the following:
Put the code for model creation, training, evaluation, and model deletion inside a separate function and call that function from the loop body.
Call gc.collect() after the function call.
The rationale for the first point is that model creation, deletion, and cache clearing then happen in a separate stack frame, which forces the GPU memory to be released when the function returns. A sketch of this pattern is shown below.
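A minimal, runnable sketch of that pattern (the toy model, the run_one_attempt helper, and its arguments are illustrative, not part of the original application):

import gc
import torch
import torch.nn as nn

def run_one_attempt(hidden_size, device):
    # Everything created here lives only in this stack frame.
    model = nn.Sequential(nn.Linear(10, hidden_size), nn.ReLU(), nn.Linear(hidden_size, 2)).to(device)
    optimizer = torch.optim.Adam(model.parameters())
    # ... real training/evaluation loops go here; a single dummy step for illustration:
    loss = model(torch.randn(4, 10, device=device)).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    accuracy = 0.0  # placeholder metric
    # Drop references before returning so the tensors can actually be freed.
    model.cpu()
    del model, optimizer, loss
    return accuracy

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
for hidden_size in (64, 128, 256):        # one iteration per training attempt
    run_one_attempt(hidden_size, device)
    gc.collect()                          # reclaim Python-side objects left over from the attempt
    torch.cuda.empty_cache()              # return cached CUDA blocks to the allocator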

I get horrible results with my DDPG model TF2

Hello, my DDPG model implemented in TF2 gets horrible results on every OpenAI Gym env with continuous actions, and I need help finding the problem. I run this on my GPU. On the Pendulum env I get rewards around -1200/-1000 on every episode. This code is from a course I took on Udemy; it was written in TF 1.x and I rewrote it in TF2, but the TF 1.x implementation had better results. Here is the code:
import tensorflow as tf
import numpy as np
import os
import gym
import random
import matplotlib.pyplot as plt
from tensorflow.keras.layers import Input, Dense, concatenate
from tensorflow.keras.models import Model


class ReplayBuffer():
    def __init__(self, obs_dim, act_dim, size):
        self.obs1_buf = np.zeros([size, obs_dim, ], dtype=np.float32)
        self.obs2_buf = np.zeros([size, obs_dim, ], dtype=np.float32)
        self.act_buf = np.zeros([size, act_dim], dtype=np.float32)
        self.reward_buf = np.zeros(size, dtype=np.float32)
        self.done_buf = np.zeros(size, dtype=np.float32)
        self.current = 0
        self.count = 0
        self.size = size

    def add_experience(self, state, action, reward, next_state, done):
        self.obs1_buf[self.current] = state
        self.act_buf[self.current] = action
        self.reward_buf[self.current] = reward
        self.obs2_buf[self.current] = next_state
        self.done_buf[self.current] = done
        self.current = (self.current + 1) % self.size
        self.count = min(self.count + 1, self.size)

    def sample_batch(self, batch_size=32):
        idx = np.random.randint(0, self.count, size=batch_size)
        return dict(s=self.obs1_buf[idx],
                    s2=self.obs2_buf[idx],
                    a=self.act_buf[idx],
                    r=self.reward_buf[idx],
                    d=self.done_buf[idx])


class DDPG():
    def __init__(self, env, num_states, num_actions, action_max):
        self.env = env
        self.num_states = num_states
        self.num_actions = num_actions
        self.action_max = action_max
        self.gamma = 0.99
        self.decay = 0.995
        self.mu_optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
        self.q_optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)

        def mu_model(hidden_layers):
            inp = Input(shape=(self.num_states, ))
            x = inp
            for layers in hidden_layers[:-1]:
                x = Dense(layers, activation='relu')(x)
            x = Dense(hidden_layers[-1], activation='tanh')(x)
            mu_model = Model(inp, x)
            return mu_model

        self.mu_model = mu_model([300, self.num_actions])

        def q_model(inp_state, inp_act, hidden_layers):
            inp_state = Input(shape=(inp_state, ))
            inp_mu = Input(shape=(inp_act, ))
            inp = concatenate([inp_state, inp_mu])
            x = inp
            for layers in hidden_layers[:-1]:
                x = Dense(layers, activation='relu')(x)
            x = Dense(hidden_layers[-1], activation='linear')(x)
            q_model = Model([inp_state, inp_mu], x)
            return q_model

        self.q_model = q_model(self.num_states, self.num_actions, hidden_layers=[300, 1])
        self.q_target_model = q_model(self.num_states, self.num_actions, hidden_layers=[300, 1])

        # self.mu_do_minimize = tf.function(self.mu_minimize, input_signature=[
        #     tf.TensorSpec(shape=(None, self.num_states), dtype=tf.float32, name='state')])
        self.q_do_minimize = tf.function(self.q_minimize, input_signature=[
            tf.TensorSpec(shape=(None, self.num_states), dtype=tf.float32, name='state'),
            tf.TensorSpec(shape=(None, self.num_actions), dtype=tf.float32, name='action'),
            tf.TensorSpec(shape=(None, self.num_states), dtype=tf.float32, name='next_state'),
            tf.TensorSpec(shape=(None, ), dtype=tf.float32, name='reward'),
            tf.TensorSpec(shape=(None, ), dtype=tf.float32, name='done_flags')])

    @tf.function
    def train_mu(self, state):
        with tf.GradientTape() as tape:
            actions = self.mu_model(state, training=True)
            critic_value = self.q_model([state, actions], training=True)
            # Used `-value` as we want to maximize the value given
            # by the critic for our actions
            actor_loss = -tf.math.reduce_mean(critic_value)
        actor_grad = tape.gradient(actor_loss, self.mu_model.trainable_variables)
        self.mu_optimizer.apply_gradients(
            zip(actor_grad, self.mu_model.trainable_variables)
        )

    def q_minimize(self, state, action, next_state, reward, done):
        def calc_loss():
            q_targ = reward + self.gamma * (1 - done) * self.q_target_model([next_state, action])
            q = self.q_model([state, action])
            cost = tf.reduce_mean((q - q_targ)**2)
            return cost
        self.q_optimizer.minimize(calc_loss, self.q_model.trainable_variables)

    def train(self, state, action, reward, done, next_state):
        state = np.atleast_2d(state)
        next_state = np.atleast_2d(next_state)
        action = np.atleast_2d(action)
        reward = np.atleast_1d(reward)
        done = np.atleast_1d(done)
        self.update_target_net()
        self.train_mu(state)
        self.q_do_minimize(state, action, next_state, reward, done)

    def update_target_net(self):
        mu_weights = np.array(self.mu_model.get_weights())
        q_weights = np.array(self.q_model.get_weights())
        # print(mu_weights.shape)
        # print(q_weights.shape)
        mu_target_weights = np.array(self.mu_target_model.get_weights())
        q_target_weights = np.array(self.q_target_model.get_weights())
        self.q_target_model.set_weights(self.decay * q_weights + (1 - self.decay) * q_target_weights)

    def get_action(self, states, noise=None):
        if noise is None: noise = self.ACT_NOISE_SCALE
        if len(states.shape) == 1: states = states.reshape(1, -1)
        action = self.mu_model.predict_on_batch(states)[0]
        if noise != 0:
            action += noise * np.random.randn(self.num_actions)
        action = np.clip(action, -self.action_max, self.action_max)
        return action


def play_one(env, agent, replay_buffer, gamma=0.99, noise=0.1, max_episode_len=1000, start_steps=10000, num_train_ep=100, batch_size=100, test_ep_agent=25):
    returns = []
    num_steps = 0
    for ep in range(num_train_ep):
        s, ep_return, ep_len, d = env.reset(), 0, 0, False
        while not (d or ep_len == max_episode_len):
            env.render()
            if num_steps > start_steps:
                a = agent.get_action(s, noise)
            else:
                a = env.action_space.sample()
            num_steps += 1
            if num_steps == start_steps:
                print("USING AGENT ACTIONS NOW")
            s2, r, d, _ = env.step(a)
            ep_return += r
            ep_len += 1
            # print(s.shape)
            d = False if ep_len == max_episode_len else d
            replay_buffer.add_experience(s, a, r, s2, d)
            s = s2
        for _ in range(ep_len):
            batch = replay_buffer.sample_batch()
            state, next_state, action, reward, done = batch['s'], batch['s2'], batch['a'], batch['r'], batch['d']
            loss = agent.train(state, action, reward, done, next_state)
        returns.append(ep_return)
        print('Iter:', ep, 'Rewards:', ep_return)
    return returns


if __name__ == '__main__':
    env = gym.make('Pendulum-v0')
    obs_dim1 = env.observation_space.shape[0]
    act_dim1 = env.action_space.shape[0]
    action_max1 = env.action_space.high[0]
    actor = DDPG(env, obs_dim1, act_dim1, action_max1)
    replay_buffer = ReplayBuffer(obs_dim1, act_dim1, size=100000)
    returns = play_one(env, actor, replay_buffer)
Thank you in advance!
The first thing that comes to mind is the learning rate: 0.01 is too high, even for Pendulum. Try a lower learning rate (e.g. 1e-3 for the actor and 5e-3 for the critic).
Also, a couple of things look off in your code:
There is no target network for the actor. Why is that? IIRC DDPG has target networks for both the actor and the critic.
Usually it is better to initialize the main and target networks with the same parameters. You can do that with target_model.set_weights(model.get_weights()); see the sketch after this list.
In the function play_one the training steps are done after playing a whole episode. This is probably OK, but there is no need for it: because Pendulum is not real time you don't need your code to be fast, so you can train while playing.
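A minimal, self-contained sketch of the target-network handling described in the first two points (the tiny actor network and the class are illustrative, not the question's code; the same Polyak pattern applies to the critic):

import tensorflow as tf

class ActorWithTarget:
    """Sketch: main/target actor pair with identical init and Polyak target updates."""
    def __init__(self, num_states=3, num_actions=1, decay=0.995):
        self.decay = decay
        self.mu_model = self._make_actor(num_states, num_actions)
        self.mu_target_model = self._make_actor(num_states, num_actions)
        # Start the target network identical to the main network.
        self.mu_target_model.set_weights(self.mu_model.get_weights())

    @staticmethod
    def _make_actor(num_states, num_actions):
        return tf.keras.Sequential([
            tf.keras.layers.Dense(300, activation='relu', input_shape=(num_states,)),
            tf.keras.layers.Dense(num_actions, activation='tanh')])

    def update_target_net(self):
        # Polyak averaging: the target network trails the main network slowly.
        new_weights = [self.decay * t + (1 - self.decay) * m
                       for t, m in zip(self.mu_target_model.get_weights(),
                                       self.mu_model.get_weights())]
        self.mu_target_model.set_weights(new_weights)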
If you want to take a look, I implemented DDPG in TensorFlow 2 a while back; it solves Pendulum in about 80 episodes (GitHub).

Tensorflow: strided_slice slicing error with while loop

I've created a rather complex seq2seq-type model (based on "A Neural Transducer"), and in the latest version of TensorFlow, the following code returns this error:
Cannot use 'transducer_training/while/rnn/strided_slice' as input to 'gradients/transducer_training/while/rnn/while/Select_1_grad/Select/f_acc' because 'transducer_training/while/rnn/strided_slice' is in a while loop
The code worked before; it has only stopped working since the latest version:
numpy (1.14.0)
protobuf (3.5.1)
tensorflow (1.5.0)
tensorflow-gpu (1.3.0)
tensorflow-tensorboard (1.5.1)
Ubuntu 16.04.3 LTS (Xenial Xerus)
Code (to get the error, just copy, paste and run it):
import tensorflow as tf
from tensorflow.contrib.rnn import LSTMCell, LSTMStateTuple
from tensorflow.python.layers import core as layers_core

# NOTE: Time major


# ---------------- Constants Manager ----------------------------
class ConstantsManager(object):
    def __init__(self, input_dimensions, input_embedding_size, inputs_embedded, encoder_hidden_units,
                 transducer_hidden_units, vocab_ids, input_block_size, beam_width):
        assert transducer_hidden_units == encoder_hidden_units, 'Encoder and transducer have to have the same amount' \
                                                                'of hidden units'
        self.input_dimensions = input_dimensions
        self.vocab_ids = vocab_ids
        self.E_SYMBOL = len(self.vocab_ids)
        self.vocab_ids.append('E_SYMBOL')
        self.GO_SYMBOL = len(self.vocab_ids)
        self.vocab_ids.append('GO_SYMBOL')
        self.vocab_size = len(self.vocab_ids)
        self.input_embedding_size = input_embedding_size
        self.inputs_embedded = inputs_embedded
        self.encoder_hidden_units = encoder_hidden_units
        self.transducer_hidden_units = transducer_hidden_units
        self.input_block_size = input_block_size
        self.beam_width = beam_width
        self.batch_size = 1  # Cannot be increased, see paper
        self.log_prob_init_value = 0


# ----------------- Model ---------------------------------------
class Model(object):
    def __init__(self, cons_manager):
        self.var_list = []
        self.cons_manager = cons_manager
        self.max_blocks, self.inputs_full_raw, self.transducer_list_outputs, self.start_block, self.encoder_hidden_init,\
            self.trans_hidden_init, self.logits, self.encoder_hidden_state_new, \
            self.transducer_hidden_state_new, self.train_saver = self.build_full_transducer()
        self.targets, self.train_op, self.loss = self.build_training_step()

    def build_full_transducer(self):
        with tf.variable_scope('transducer_training'):
            embeddings = tf.Variable(tf.random_uniform([self.cons_manager.vocab_size,
                                                        self.cons_manager.input_embedding_size], -1.0, 1.0),
                                     dtype=tf.float32,
                                     name='embedding')
            # Inputs
            max_blocks = tf.placeholder(dtype=tf.int32, name='max_blocks')  # total amount of blocks to go through
            if self.cons_manager.inputs_embedded is True:
                input_type = tf.float32
            else:
                input_type = tf.int32
            inputs_full_raw = tf.placeholder(shape=(None, self.cons_manager.batch_size,
                                                    self.cons_manager.input_dimensions), dtype=input_type,
                                             name='inputs_full_raw')  # shape [max_time, 1, input_dims]
            transducer_list_outputs = tf.placeholder(shape=(None,), dtype=tf.int32,
                                                     name='transducer_list_outputs')  # amount to output per block
            start_block = tf.placeholder(dtype=tf.int32, name='transducer_start_block')  # where to start the input
            encoder_hidden_init = tf.placeholder(shape=(2, 1, self.cons_manager.encoder_hidden_units), dtype=tf.float32,
                                                 name='encoder_hidden_init')
            trans_hidden_init = tf.placeholder(shape=(2, 1, self.cons_manager.transducer_hidden_units), dtype=tf.float32,
                                               name='trans_hidden_init')

            # Temporary constants, maybe changed during inference
            end_symbol = tf.get_variable(name='end_symbol',
                                         initializer=tf.constant_initializer(self.cons_manager.vocab_size),
                                         shape=(), dtype=tf.int32)

            # Turn inputs into tensor which is easily readable
            inputs_full = tf.reshape(inputs_full_raw, shape=[-1, self.cons_manager.input_block_size,
                                                             self.cons_manager.batch_size,
                                                             self.cons_manager.input_dimensions])

            # Outputs
            outputs_ta = tf.TensorArray(dtype=tf.float32, size=max_blocks)
            init_state = (start_block, outputs_ta, encoder_hidden_init, trans_hidden_init)

            # Initiate cells, NOTE: if there is a future error, put these back inside the body function
            encoder_cell = tf.contrib.rnn.LSTMCell(num_units=self.cons_manager.encoder_hidden_units)
            transducer_cell = tf.contrib.rnn.LSTMCell(self.cons_manager.transducer_hidden_units)

            def cond(current_block, outputs_int, encoder_hidden, trans_hidden):
                return current_block < start_block + max_blocks

            def body(current_block, outputs_int, encoder_hidden, trans_hidden):
                # --------------------- ENCODER ----------------------------------------------------------------------
                encoder_inputs = inputs_full[current_block]
                encoder_inputs_length = [tf.shape(encoder_inputs)[0]]
                encoder_hidden_state = encoder_hidden
                if self.cons_manager.inputs_embedded is True:
                    encoder_inputs_embedded = encoder_inputs
                else:
                    encoder_inputs = tf.reshape(encoder_inputs, shape=[-1, self.cons_manager.batch_size])
                    encoder_inputs_embedded = tf.nn.embedding_lookup(embeddings, encoder_inputs)

                # Build model
                # Build previous state
                encoder_hidden_c, encoder_hidden_h = tf.split(encoder_hidden_state, num_or_size_splits=2, axis=0)
                encoder_hidden_c = tf.reshape(encoder_hidden_c, shape=[-1, self.cons_manager.encoder_hidden_units])
                encoder_hidden_h = tf.reshape(encoder_hidden_h, shape=[-1, self.cons_manager.encoder_hidden_units])
                encoder_hidden_state_t = LSTMStateTuple(encoder_hidden_c, encoder_hidden_h)

                # encoder_outputs: [max_time, batch_size, num_units]
                encoder_outputs, encoder_hidden_state_new = tf.nn.dynamic_rnn(
                    encoder_cell, encoder_inputs_embedded,
                    sequence_length=encoder_inputs_length, time_major=True,
                    dtype=tf.float32, initial_state=encoder_hidden_state_t)

                # Modify output of encoder_hidden_state_new so that it can be fed back in again without problems.
                encoder_hidden_state_new = tf.concat([encoder_hidden_state_new.c, encoder_hidden_state_new.h], axis=0)
                encoder_hidden_state_new = tf.reshape(encoder_hidden_state_new,
                                                      shape=[2, -1, self.cons_manager.encoder_hidden_units])

                # --------------------- TRANSDUCER --------------------------------------------------------------------
                encoder_raw_outputs = encoder_outputs
                # Save/load the state as one tensor, use encoder state as init if this is the first block
                trans_hidden_state = tf.cond(current_block > 0, lambda: trans_hidden, lambda: encoder_hidden_state_new)
                transducer_amount_outputs = transducer_list_outputs[current_block - start_block]

                # Model building
                helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
                    embedding=embeddings,
                    start_tokens=tf.tile([self.cons_manager.GO_SYMBOL],
                                         [self.cons_manager.batch_size]),  # TODO: check if this looks good
                    end_token=end_symbol)  # vocab size, so that it doesn't prematurely end the decoding

                attention_states = tf.transpose(encoder_raw_outputs,
                                                [1, 0, 2])  # attention_states: [batch_size, max_time, num_units]

                attention_mechanism = tf.contrib.seq2seq.LuongAttention(
                    self.cons_manager.encoder_hidden_units, attention_states)

                decoder_cell = tf.contrib.seq2seq.AttentionWrapper(
                    transducer_cell,
                    attention_mechanism,
                    attention_layer_size=self.cons_manager.transducer_hidden_units)

                projection_layer = layers_core.Dense(self.cons_manager.vocab_size, use_bias=False)

                # Build previous state
                trans_hidden_c, trans_hidden_h = tf.split(trans_hidden_state, num_or_size_splits=2, axis=0)
                trans_hidden_c = tf.reshape(trans_hidden_c, shape=[-1, self.cons_manager.transducer_hidden_units])
                trans_hidden_h = tf.reshape(trans_hidden_h, shape=[-1, self.cons_manager.transducer_hidden_units])
                trans_hidden_state_t = LSTMStateTuple(trans_hidden_c, trans_hidden_h)

                decoder = tf.contrib.seq2seq.BasicDecoder(
                    decoder_cell, helper,
                    decoder_cell.zero_state(1, tf.float32).clone(cell_state=trans_hidden_state_t),
                    output_layer=projection_layer)

                outputs, transducer_hidden_state_new, _ = tf.contrib.seq2seq.dynamic_decode(decoder,
                                                                                            output_time_major=True,
                                                                                            maximum_iterations=transducer_amount_outputs)
                logits = outputs.rnn_output  # logits of shape [max_time, batch_size, vocab_size]
                decoder_prediction = outputs.sample_id  # For debugging

                # Modify output of transducer_hidden_state_new so that it can be fed back in again without problems.
                transducer_hidden_state_new = tf.concat(
                    [transducer_hidden_state_new[0].c, transducer_hidden_state_new[0].h],
                    axis=0)
                transducer_hidden_state_new = tf.reshape(transducer_hidden_state_new,
                                                         shape=[2, -1, self.cons_manager.transducer_hidden_units])
                # Note the outputs
                outputs_int = outputs_int.write(current_block - start_block, logits)

                return current_block + 1, outputs_int, encoder_hidden_state_new, transducer_hidden_state_new

            _, outputs_final, encoder_hidden_state_new, transducer_hidden_state_new = \
                tf.while_loop(cond, body, init_state, parallel_iterations=1)

            # Process outputs
            outputs = outputs_final.concat()
            logits = tf.reshape(
                outputs,
                shape=(-1, 1, self.cons_manager.vocab_size))  # And now it's [max_output_time, batch_size, vocab]

            # For loading the model later on
            logits = tf.identity(logits, name='logits')
            encoder_hidden_state_new = tf.identity(encoder_hidden_state_new, name='encoder_hidden_state_new')
            transducer_hidden_state_new = tf.identity(transducer_hidden_state_new, name='transducer_hidden_state_new')

            train_saver = tf.train.Saver()  # For now save everything

        return max_blocks, inputs_full_raw, transducer_list_outputs, start_block, encoder_hidden_init,\
            trans_hidden_init, logits, encoder_hidden_state_new, transducer_hidden_state_new, train_saver

    def build_training_step(self):
        targets = tf.placeholder(shape=(None,), dtype=tf.int32, name='targets')
        targets_one_hot = tf.one_hot(targets, depth=self.cons_manager.vocab_size, dtype=tf.float32)

        targets_one_hot = tf.Print(targets_one_hot, [targets], message='Targets: ', summarize=10)
        targets_one_hot = tf.Print(targets_one_hot, [tf.argmax(self.logits, axis=2)], message='Argmax: ', summarize=10)

        stepwise_cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=targets_one_hot,
                                                                         logits=self.logits)
        loss = tf.reduce_mean(stepwise_cross_entropy)
        train_op = tf.train.AdamOptimizer().minimize(loss)
        return targets, train_op, loss


constants_manager = ConstantsManager(input_dimensions=1, input_embedding_size=11, inputs_embedded=False,
                                     encoder_hidden_units=100, transducer_hidden_units=100, vocab_ids=[0, 1, 2],
                                     input_block_size=1, beam_width=5)
model = Model(cons_manager=constants_manager)
I encountered a similar problem recently when I put a dynamic_rnn inside a scan (i.e. a while loop). It seems that the bug was introduced only in TensorFlow 1.5. You can try downgrading your TensorFlow version to 1.4 or upgrading to 1.6; both should work.
In this particular case, the error seems to be raised incorrectly (see the GitHub issue in the comments). In general, however, such errors mean the following:
The usage pattern that the error message is complaining about was always illegal; earlier versions of TensorFlow just did not have good checks for it.
The core of the problem is that in TensorFlow's execution model, you cannot use a tensor that you create inside a while loop outside of it. For a simple illustration of this, take a look at this test case.
You can just disable the check by immediately returning from here, but your computation graph will be malformed, which can lead to undefined behavior.
The correct fix is to add all the tensors that you want to access outside of the while loop (outside of the cond and body functions) to the loop_vars and use them as returned from tf.while_loop.
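A minimal sketch of that loop_vars fix (the variable names are illustrative; written with tf.compat.v1 so it also runs under TF 2 with v1 behavior):

import tensorflow as tf

tf.compat.v1.disable_eager_execution()

def cond(i, total):
    return i < 5

def body(i, total):
    step_value = tf.cast(i, tf.float32) * 2.0   # tensor created inside the loop
    # Instead of trying to use step_value outside the loop (illegal),
    # fold it into a loop variable that tf.while_loop returns.
    return i + 1, total + step_value

i_final, total_final = tf.compat.v1.while_loop(cond, body, [tf.constant(0), tf.constant(0.0)])

with tf.compat.v1.Session() as sess:
    print(sess.run(total_final))  # 0 + 2 + 4 + 6 + 8 = 20.0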
