Train convolutional neural network with theano/lasagne - python

I'm trying to implement a CNN using theano/lasagne.
I've made a neural network but can't figure out how to train it with the current state.
This is how I'm trying to get the output of the network with the current_states as input.
train = theano.function([input_var], lasagne.layers.get_output(l.out))
output = train(current_states)
However I get this error:
theano.compile.function_module.UnusedInputError: theano.function was asked to create a function computing outputs given certain inputs, but the provided input variable at index 0 is not part of the computational graph needed to compute the outputs: inputs.
To make this error into a warning, you can pass the parameter on_unused_input='warn' to theano.function. To disable it completely, use on_unused_input='ignore'.
Why is current_states not used?
I want to get the output of the model on the current_states. How do I do this?
(the CNN build code: http://pastebin.com/Gd35RncU)

The following code snippet works for me:
import lasagne, theano
import theano.tensor as T
import numpy as np
input_var = theano.tensor.tensor4('inputs')
l_out = build_cnn(input_var)
train = theano.function([input_var], lasagne.layers.get_output(l_out))
x = np.random.randn(10, 4, 80, 80).astype(theano.config.floatX)
train(x)
You didn't post your entire code, but check whether your script actually passes the input_var variable to your build_cnn function. If it doesn't, input_var is not part of the computational graph that produces the output, which is why Theano raises the error.
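To make the "not part of the computational graph" idea concrete, here is a toy sketch in plain Python. It is not Theano's implementation; the Node class, depends_on and build_net are hypothetical names made up for this illustration. It only shows that an input counts as "used" when the output can be traced back to it.

```python
# Toy illustration only: this is NOT how Theano is implemented.
class Node:
    """A symbolic value that remembers which nodes it was computed from."""
    def __init__(self, name, parents=()):
        self.name = name
        self.parents = tuple(parents)


def depends_on(output, candidate):
    """Return True if `candidate` is reachable by walking back from `output`."""
    stack, seen = [output], set()
    while stack:
        node = stack.pop()
        if node is candidate:
            return True
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.parents)
    return False


def build_net(input_node=None):
    """Hypothetical stand-in for build_cnn: makes its own input if none is given."""
    if input_node is None:
        input_node = Node("internal_input")
    hidden = Node("conv", [input_node])
    return Node("out", [hidden])


input_var = Node("inputs")
connected = build_net(input_var)   # input_var feeds the network
disconnected = build_net()         # the network was built on a different input

print(depends_on(connected, input_var))     # True
print(depends_on(disconnected, input_var))  # False
```

The second case corresponds to the situation in the question: the function is asked to take input_var, but the output was built from some other input variable.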

Related

How to convert a keras tensor to a numpy array

I am trying to create a Q-learning chess engine where the output of the last layer of the neural network (its number of units equals the number of legal moves) is run through an argmax() function. This returns an integer that I use as an index into the array where the legal moves are stored. Here is part of my code:
#imports
env = gym.make('ChessAlphaZero-v0') #builds environment
obs = env.reset()
type(obs)
done = False #game is not won
num_actions = len(env.legal_moves) #array where legal moves are stored
obs = chess.Board()
model = models.Sequential()
def dqn(board):
    #dense layers
    action = layers.Dense(num_actions)(layer5)
    i = np.argmax(action)
    move = env.legal_moves[i]
    return keras.Model(inputs=inputs, outputs=move)
But when I run the code I get the following error:
TypeError: Cannot convert a symbolic Keras input/output to a numpy array. This error may indicate that you're trying to pass a symbolic value to a NumPy call, which is not supported. Or, you may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model.
Any code examples would be appreciated, thanks.
The correct way to build a model and forward an input in keras is this:
1. Building the model
model = models.Sequential()
model.add(layers.Input(observation_shape))
model.add(layers.Dense(units=128, activation='relu'))
model.add(layers.Dense(units=num_actions, activation='softmax'))
or
inputs = layers.Input(observation_shape)
x = layers.Dense(units=128, activation='relu')(inputs)
outputs = layers.Dense(units=num_actions, activation='softmax')(x)
model = keras.Model(inputs, outputs)
Both approaches build the same model.
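2. Forward an observation & Get the best possible action
action_values = model.predict(observation)  # concrete array of shape (batch_size, num_actions)
best_action_index = int(tf.argmax(action_values[0]))  # argmax over the actions of the first sample
best_action = env.legal_moves[best_action_index]  # map the index back to a move, as in the question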
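The important point is that argmax runs on the concrete values returned by predict, not on a symbolic tensor inside the model definition. A minimal NumPy sketch of that last step, where the move strings and the action values are hypothetical stand-ins for env.legal_moves and model.predict(...):

```python
import numpy as np

# Hypothetical stand-ins: in the real program these would come from
# model.predict(...) and env.legal_moves.
legal_moves = ["e2e4", "d2d4", "g1f3"]
action_values = np.array([[0.1, 0.7, 0.2]])  # shape (batch_size, num_actions)

# argmax over the actions of the first (and only) sample in the batch
best_action_index = int(np.argmax(action_values[0]))
best_move = legal_moves[best_action_index]

print(best_action_index, best_move)  # 1 d2d4
```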
Implementing DQN yourself in Keras can be quite frustrating. You might want to use a DRL framework such as tf_agents, which has implementations of lots of agents:
https://www.tensorflow.org/agents
This repository contains a clean and easy to understand implementation of DQN for openai gym environments. Also, it contains examples of using tf_agents library as well for more complex agents:
https://github.com/kochlisGit/Tensorflow-DQN

fastai error predicting with exported/reloaded model: "Input type and weight type should be the same"

Whenever I export a fastai model and reload it, I get this error (or a very similar one) when I try and use the reloaded model to generate predictions on a new test set:
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same
Minimal reproducible code example below; you just need to update your FILES_DIR variable to where the MNIST data gets deposited on your system:
from fastai import *
from fastai.vision import *
# download data for reproducible example
untar_data(URLs.MNIST_SAMPLE)
FILES_DIR = '/home/mepstein/.fastai/data/mnist_sample' # this is where command above deposits the MNIST data for me
# Create FastAI databunch for model training
tfms = get_transforms()
tr_val_databunch = ImageDataBunch.from_folder(path=FILES_DIR,  # location of downloaded data shown in log of prev command
                                              train='train',
                                              valid_pct=0.2,
                                              ds_tfms=tfms).normalize()
# Create Model
conv_learner = cnn_learner(tr_val_databunch,
                           models.resnet34,
                           metrics=[error_rate]).to_fp16()
# Train Model
conv_learner.fit_one_cycle(4)
# Export Model
conv_learner.export() # saves model as 'export.pkl' in path associated with the learner
# Reload Model and use it for inference on new hold-out set
reloaded_model = load_learner(path=FILES_DIR,
                              test=ImageList.from_folder(path=f'{FILES_DIR}/valid'))
preds = reloaded_model.get_preds(ds_type=DatasetType.Test)
Output:
"RuntimeError: Input type (torch.cuda.FloatTensor) and weight type
(torch.cuda.HalfTensor) should be the same"
Stepping through the code statement by statement, everything works fine until the last line, preds = ..., which is where the torch error above pops up.
Relevant software versions:
Python 3.7.3
fastai 1.0.57
torch 1.2.0
torchvision 0.4.0
So the answer to this ended up being relatively simple:
1) As noted in my comment, training in mixed precision mode (calling to_fp16() on conv_learner) caused the error with the exported/reloaded model.
2) To train in mixed precision mode (which is faster than regular training) and still export/reload the model without errors, simply set the model back to default precision before exporting.
...In code, simply change this part of the example above:
# Export Model
conv_learner.export()
to:
# Export Model (after converting back to default precision for safe export/reload)
conv_learner = conv_learner.to_fp32()
conv_learner.export()
...and now the full (reproducible) code example above runs without errors, including the prediction after model reload.
Your model is in half precision if you call .to_fp16(), which has the same effect as calling model.half() in PyTorch.
In fact, if you trace the code, .to_fp16() ends up calling model.half().
There is a catch, though: if you also convert the batch norm layers to half precision, you may run into convergence problems.
This is why you would typically do this in PyTorch:
model.half() # convert to half precision
for layer in model.modules():
    if isinstance(layer, torch.nn.modules.batchnorm._BatchNorm):
        layer.float()
This converts every layer except batch norm to half precision.
Note that the code from the PyTorch forum is also OK, but it only covers nn.BatchNorm2d.
Then make sure your input is in half precision using to() like this:
import torch
t = torch.tensor(10.)
print(t)
print(t.dtype)
t=t.to(dtype=torch.float16)
print(t)
print(t.dtype)
# tensor(10.)
# torch.float32
# tensor(10., dtype=torch.float16)
# torch.float16
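As a rough illustration of why layers that accumulate statistics (such as batch norm) are often kept in full precision, here is a small NumPy sketch of float16 rounding. It is not fastai or PyTorch code, and it does not claim to reproduce what goes wrong in any particular model; it only shows how coarse half precision becomes for larger values.

```python
import numpy as np

# float16 has an 11-bit significand, so above 2048 it can no longer
# represent every integer: adding 1 is lost to rounding.
total16 = np.float16(2048) + np.float16(1)
total32 = np.float32(2048) + np.float32(1)

print(total16)  # 2048.0
print(total32)  # 2049.0
```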

Structuring a Keras project to achieve reproducible results in GPU

I am writing a tensorflow.Keras wrapper to perform ML experiments.
I need my framework to be able to run an experiment as specified in a YAML configuration file, in parallel on a GPU.
I also need a guarantee that if I run the experiment again, I get results that are reasonably close, if not exactly the same.
To try to ensure this, my training script contains these lines at the beginning, following the guidelines in the official documentation:
# Set up random seeds
random.seed(seed)
np.random.seed(seed)
tf.set_random_seed(seed)
This has proven to not be enough.
I ran the same configuration 4 times, and plotted the results:
As you can see, results vary a lot between runs.
How can I set up a training session in Keras to ensure I get reasonably similar results when training on a GPU? Is this even possible?
The full training script can be found here.
Some of my colleagues are using just pure TF, and their results seem far more consistent. What is more, they do not seem to be seeding any randomness except to ensure that the train and validation split is always the same.
Keras + Tensorflow.
Step 1, disable GPU.
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = ""
Step 2, seed the libraries that your code uses, e.g. tensorflow, numpy and random.
import tensorflow as tf
import numpy as np
import random as rn
sd = 1 # Here sd means seed.
np.random.seed(sd)
rn.seed(sd)
os.environ['PYTHONHASHSEED']=str(sd)
from keras import backend as K
config = tf.ConfigProto(intra_op_parallelism_threads=1,inter_op_parallelism_threads=1)
tf.set_random_seed(sd)
sess = tf.Session(graph=tf.get_default_graph(), config=config)
K.set_session(sess)
Make sure both pieces of code are included at the very start of your script; the results should then be reproducible.
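A quick way to check that seeding behaves as expected, at least for the Python and NumPy generators, is to seed, draw, and compare. This is only a sanity-check sketch: the draw helper is a made-up name, and the check says nothing about TensorFlow ops or GPU kernels. Note also that, as far as I know, PYTHONHASHSEED only affects hash randomization when it is set before the interpreter starts, so assigning it inside the script mainly matters for child processes.

```python
import random
import numpy as np


def draw(seed):
    """Seed both generators, then draw a few numbers from each."""
    random.seed(seed)
    np.random.seed(seed)
    return [random.random() for _ in range(3)], np.random.rand(3).tolist()


first = draw(1)
second = draw(1)
other = draw(2)

print(first == second)  # True: same seed, same sequence
print(first == other)   # False: different seed
```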
Try adding seed parameters to the weight/bias initializers. This just adds more specifics to Alexander Ejbekov's comment.
TensorFlow has two kinds of random seed: graph-level and op-level. If you're using more than one graph, you need to specify the seed in every one. You can override the graph-level seed with an op-level one by passing the seed parameter to the function. You can even make two functions from different graphs output the same value if the same seed is set.
Consider this example:
g1 = tf.Graph()
with g1.as_default():
    tf.set_random_seed(1)
    a = tf.get_variable('a', shape=(1,), initializer=tf.keras.initializers.glorot_normal())
    b = tf.get_variable('b', shape=(1,), initializer=tf.keras.initializers.glorot_normal(seed=2))
with tf.Session(graph=g1) as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(a))
    print(sess.run(b))
g2 = tf.Graph()
with g2.as_default():
    a1 = tf.get_variable('a1', shape=(1,), initializer=tf.keras.initializers.glorot_normal(seed=1))
with tf.Session(graph=g2) as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(a1))
In this example, output of a is the same as a1, but b is different.

TensorFlow classifier prediction script always outputs zero

I have created a script to evaluate a TensorFlow convolutional neural network. It loads some images and does some simple preprocessing:
def main(argv):
    classifier = import_model()
    for path in argv[1:]:
        image_reversed = imread(path).astype(np.float32)
        image_unlayered = np.transpose(image_reversed, (1, 0, 2))
        image = np.reshape(image_unlayered, [1, -1, 480, 3])
        angle = infer_steering_angle(classifier, image)
        print("Steering angle %f for image %s." % (angle, path))
It imports the model using a network structure function in another file that has been verified to at least mostly work and is used to train a network:
def import_model():
    # Load estimator
    classifier = learn.Estimator(
        model_fn=cnn_model_fn,
        model_dir="/tmp/network2"
    )
    return classifier
and finally, it uses the Estimator.predict function to pass the single image to the network, overriding the default batch_size of 10 and setting it to 1. It returns a tensor with a single element, which should correspond to the steering angle (this is an end-to-end autonomous driving regression problem).
def infer_steering_angle(classifier, image):
    output = classifier.predict(
        x=image,
        batch_size=1
    )
    for angle in output:
        return angle
The problem is, it always outputs 0.0 for the steering angle. I've looked over all of this several times, and the only thing I can think of is that I'm misunderstanding the Estimator.predict function. It's rather poorly documented, in that it lacks concrete examples of how it should be used. Does anybody notice anything wrong with how I'm formatting the input or parsing the output?
UPDATE:
I tried putting this code right in the training file, so the importing can't be the problem. I'm starting to become suspicious it's a problem with the model itself. The code is at https://hastebin.com/rakulonebu.py.
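When debugging something like this, it can help to first rule out the preprocessing by checking the shapes it produces against what the network was trained on. The sketch below assumes a hypothetical 480x640 RGB image (the real image size is not given in the question) and repeats the transpose and reshape from main(). It does not by itself explain the constant 0.0 output.

```python
import numpy as np

# Hypothetical input: a 480x640 RGB image as imread would return it,
# i.e. (height, width, channels). The real image size is an assumption.
image_reversed = np.zeros((480, 640, 3), dtype=np.float32)

image_unlayered = np.transpose(image_reversed, (1, 0, 2))  # swap height and width
image = np.reshape(image_unlayered, [1, -1, 480, 3])       # add a batch axis

print(image_unlayered.shape)  # (640, 480, 3)
print(image.shape)            # (1, 640, 480, 3)
```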

Keras with CNTK backend: Writing custom layers

I'm trying to write a custom layer in Keras to replicate one particular architecture proposed in a paper. The layer has no trainable weights. I believe this might be relevant, since it wouldn't be necessary to extend the class Layer.
I'm using the CNTK backend, but I'm trying to keep the code as backend-agnostic as possible, so I'm relying on the interfaces defined in keras.backend, instead of directly using CNTK.
Right now I'm just trying to get a small example to work. The example is as follows:
import numpy as np
from scipy.misc import imread
from keras import backend as K
im = imread('test.bmp')
#I'm extending a grayscale image to behave as a color image
ex_im = np.empty([im.shape[0],im.shape[1],3])
ex_im[:,:,0] = im
ex_im[:,:,1] = im
ex_im[:,:,2] = im
conv_filter = K.ones([3,3,ex_im.shape[2],ex_im.shape[2]])
x = K.conv2d(ex_im,conv_filter,padding='same')
This code, however, results in the following error:
RuntimeError: Convolution currently requires the main operand to have dynamic axes
CNTK requires the input to the convolution to have dynamic axes, otherwise it would interpret the first dimension of the input as the batch size. So I tried to make the axes dynamic with placeholders (the only way I could find of doing so):
import numpy as np
from scipy.misc import imread
from keras import backend as K
im = imread('test.bmp')
ex_im = np.empty([1,im.shape[0],im.shape[1],3])
ex_im[0,:,:,0] = im
ex_im[0,:,:,1] = im
ex_im[0,:,:,2] = im
place = K.placeholder(shape=((None,) + ex_im.shape[1:]))
conv_filter = K.ones([3,3,ex_im.shape[3],ex_im.shape[3]])
x = K.conv2d(place,conv_filter,padding='same')
The image is now an array of images, with what is basically a batch size of 1.
This works correctly. However, I can't figure out how to feed an input to the placeholder in order to test my code. eval() doesn't take any arguments, and there doesn't seem to be a way to pass the input as an argument to the evaluation.
Is there a way to do this without placeholders? Or a way to feed the inputs to the placeholder? Am I doing something fundamentally wrong and should be following another path?
I should add that I really want to avoid being locked in to a backend, so any solutions should be backend-agnostic.
To use custom layers, don't define the tensors yourself; let Keras do it for you. Just create the layer, and whatever is passed to the layer will already be a proper tensor:
import numpy as np
from keras import backend as K
from keras.layers import Input, Lambda
from keras.models import Model

images = np.ones((1,50,50,3))
def myFunc(x):
    conv_filter = K.ones([3,3,3,3])
    return K.conv2d(x,conv_filter,padding='same')
inp = Input((50,50,3))
out = Lambda(myFunc, output_shape=(50,50,3))(inp)
model = Model(inp,out)
print(model.predict(images))
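To check the numbers such a model prints without depending on any backend, one option is a small NumPy reference for this specific case: a 3x3 all-ones filter with 'same' padding. The function name conv2d_same_ones is made up for this sketch, and it only covers the all-ones filter, not general convolution.

```python
import numpy as np


def conv2d_same_ones(images, k=3):
    """'same'-padded k x k convolution with an all-ones filter.

    With an all-ones filter every output channel is identical: the sum of
    all input channels over the k x k neighbourhood (zero-padded at edges).
    """
    pad = k // 2
    summed = images.sum(axis=-1)                       # (batch, H, W)
    padded = np.pad(summed, ((0, 0), (pad, pad), (pad, pad)))
    h, w = summed.shape[1:]
    out = np.zeros_like(summed)
    for dy in range(k):
        for dx in range(k):
            out += padded[:, dy:dy + h, dx:dx + w]
    # Repeat across output channels (same count as input channels here).
    return np.repeat(out[..., None], images.shape[-1], axis=-1)


images = np.ones((1, 50, 50, 3))
expected = conv2d_same_ones(images)

print(expected.shape)            # (1, 50, 50, 3)
print(expected[0, 25, 25, 0])    # 27.0 (interior: 3*3*3 ones)
print(expected[0, 0, 0, 0])      # 12.0 (corner: 2*2*3 ones)
```

For the all-ones input used above, interior pixels should therefore come out as 27 and corner pixels as 12, which gives a simple target to compare against.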
