I want to create a partially connected network in TensorFlow. What is the best approach to achieve that?
This is an illustration of what I am trying to achieve:
Perhaps using the Keras Functional API might work?
Since I do not know what type of layers and how many neurons you want to use, I made them up, so you will have to configure them yourself. One node in your picture corresponds to one layer in my example, but you can use multiple layers per node as well.
A demonstration with the Functional API:
import tensorflow as tf

# The feature size (16) and layer widths (10) are made up; adjust them to your data.
# Dense needs a fixed last dimension, so the input shape cannot be (None,).
node_a_input = tf.keras.Input(shape=(16,))
node_a = tf.keras.layers.Dense(10)(node_a_input)
node_b = tf.keras.layers.Dense(10)(node_a)

node_c_input = tf.keras.Input(shape=(16,))
node_c = tf.keras.layers.Dense(10)(node_c_input)
node_d = tf.keras.layers.Dense(10)(node_c)

# We have to merge the outputs of node_a and node_d. This can be done by concatenating, summing, etc.
node_e_merge = tf.keras.layers.Concatenate()([node_a, node_d])
node_e = tf.keras.layers.Dense(1)(node_e_merge)

model = tf.keras.Model(inputs=[node_a_input, node_c_input], outputs=[node_b, node_e])
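The choice of merge decides how wide the input to node_e is. As a rough illustration in plain Python (no TensorFlow needed), the two options named in the comment behave as follows. The helper names concatenate and add are hypothetical stand-ins for the Keras Concatenate and Add layers, and the two lists stand in for the 10-unit outputs of node_a and node_d:

```python
def concatenate(a, b):
    """Stand-in for a Concatenate layer: features are stacked, so widths add up."""
    return a + b


def add(a, b):
    """Stand-in for an Add layer: element-wise sum, so both widths must match."""
    if len(a) != len(b):
        raise ValueError("element-wise sum needs inputs of equal width")
    return [x + y for x, y in zip(a, b)]


# Hypothetical 10-unit outputs of node_a and node_d for a single sample.
node_a_out = [0.1] * 10
node_d_out = [0.2] * 10

print(len(concatenate(node_a_out, node_d_out)))  # 20
print(len(add(node_a_out, node_d_out)))          # 10
```

With concatenation node_e sees 20 features and can weight the two branches independently; with a sum it sees 10 and both branches must have the same width.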
I am trying to create a Q-learning chess engine where the output of the last layer of the neural network (its number of units equals the number of legal moves) is run through an argmax() function, which returns an integer that I use as an index into the array where the legal moves are stored. Here is part of my code:
#imports
env = gym.make('ChessAlphaZero-v0') #builds environment
obs = env.reset()
type(obs)
done = False #game is not won
num_actions = len(env.legal_moves) #array where legal moves are stored
obs = chess.Board()
model = models.Sequential()
def dqn(board):
    #dense layers
    action = layers.Dense(num_actions)(layer5)
    i = np.argmax(action)
    move = env.legal_moves[i]
    return keras.Model(inputs=inputs, outputs=move)
But when I run the code I get the following error:
TypeError: Cannot convert a symbolic Keras input/output to a numpy array. This error may indicate that you're trying to pass a symbolic value to a NumPy call, which is not supported. Or, you may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model.
Any code examples would be appreciated, thanks.
The correct way to build a model and forward an input in keras is this:
1. Building the model
model = models.Sequential()
model.add(layers.Input(observation_shape))
model.add(layers.Dense(units=128, activation='relu'))
model.add(layers.Dense(units=num_actions, activation='softmax'))
or
inputs = layers.Input(observation_shape)
x = layers.Dense(units=128, activation='relu')(inputs)
outputs = layers.Dense(units=num_actions, activation='softmax')(x)
model = keras.Model(inputs, outputs)
Both ways build the same model.
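2. Forward an observation & get the best possible action
# predict() runs on real data (a NumPy array with a leading batch dimension), not on symbolic tensors
action_values = model.predict(observation)  # shape: (batch_size, num_actions)
best_action_index = np.argmax(action_values[0])  # best action for the first (only) observation
best_action = env.legal_moves[best_action_index]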
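The step from predicted values to a move can be sketched in plain Python. This is only an illustration under stated assumptions: best_legal_move is a hypothetical helper, the nested list stands in for the (1, num_actions) array returned by model.predict, and the outputs are assumed to line up index-for-index with the list of legal moves (only the first len(legal_moves) outputs are considered).

```python
def best_legal_move(action_values, legal_moves):
    """Return the legal move with the highest predicted value.

    action_values: nested list shaped like a predict() result, (1, num_actions)
    legal_moves:   list of moves, assumed to match the outputs index-for-index
    """
    if not legal_moves:
        raise ValueError("no legal moves to choose from")
    # Ignore any outputs beyond the number of currently legal moves.
    scores = action_values[0][:len(legal_moves)]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return legal_moves[best_index]


# Made-up values for a position with three legal moves.
predicted = [[0.1, 0.7, 0.2]]
moves = ["e2e4", "d2d4", "g1f3"]
print(best_legal_move(predicted, moves))  # d2d4
```

The key point is that the argmax runs on concrete numbers after prediction, outside the model, which is what avoids the symbolic-tensor error from the question.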
Implementing DQN by yourself in Keras can be quite frustrating. You might want to use a DRL framework such as tf_agents, which ships implementations of many agents:
https://www.tensorflow.org/agents
This repository contains a clean, easy-to-understand implementation of DQN for OpenAI Gym environments. It also contains examples of using the tf_agents library for more complex agents:
https://github.com/kochlisGit/Tensorflow-DQN
I'm trying to write a custom layer in Keras to replicate one particular architecture proposed in a paper. The layer has no trainable weights. I believe this might be relevant, since it may mean I don't need to extend the Layer class.
I'm using the CNTK backend, but I'm trying to keep the code as backend-agnostic as possible, so I'm relying on the interfaces defined in keras.backend, instead of directly using CNTK.
Right now I'm just trying to get a small example to work. The example is as follows:
import numpy as np
from scipy.misc import imread
from keras import backend as K
im = imread('test.bmp')
#I'm extending a grayscale image to behave as a color image
ex_im = np.empty([im.shape[0],im.shape[1],3])
ex_im[:,:,0] = im
ex_im[:,:,1] = im
ex_im[:,:,2] = im
conv_filter = K.ones([3,3,ex_im.shape[2],ex_im.shape[2]])
x = K.conv2d(ex_im,conv_filter,padding='same')
This code, however, results in the following error:
RuntimeError: Convolution currently requires the main operand to have dynamic axes
CNTK requires the input to the convolution to have dynamic axes, otherwise it would interpret the first dimension of the input as the batch size. So I tried to make the axes dynamic with placeholders (the only way I could find of doing so):
import numpy as np
from scipy.misc import imread
from keras import backend as K
im = imread('test.bmp')
ex_im = np.empty([1,im.shape[0],im.shape[1],3])
ex_im[0,:,:,0] = im
ex_im[0,:,:,1] = im
ex_im[0,:,:,2] = im
place = K.placeholder(shape=((None,) + ex_im.shape[1:]))
conv_filter = K.ones([3,3,ex_im.shape[3],ex_im.shape[3]])
x = K.conv2d(place,conv_filter,padding='same')
The image is now an array of images, with what is basically a batch size of 1.
This works correctly. However, I can't figure out how to feed an input to the placeholder in order to test my code. eval() doesn't take any arguments, and there doesn't seem to be a way to pass the input as an argument to the evaluation.
Is there a way to do this without placeholders? Or a way to feed the inputs to the placeholder? Am I doing something fundamentally wrong and should be following another path?
I should add that I really want to avoid being locked in to a backend, so any solutions should be backend-agnostic.
When using custom layers, you don't define the tensors yourself; let Keras do it for you. Just create the layer, and whatever is passed to it will already be a proper tensor:
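import numpy as np
from keras import backend as K
from keras.layers import Input, Lambda
from keras.models import Model

images = np.ones((1,50,50,3))

def myFunc(x):
    # 3x3 all-ones filter mapping 3 input channels to 3 output channels
    conv_filter = K.ones([3,3,3,3])
    return K.conv2d(x,conv_filter,padding='same')

inp = Input((50,50,3))
out = Lambda(myFunc, output_shape=(50,50,3))(inp)
model = Model(inp,out)
print(model.predict(images))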
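To sanity-check what this prints: with an all-ones 3x3x3x3 filter, every output channel is the sum of the 3x3 neighbourhood over all three input channels. The plain-Python sketch below reproduces that arithmetic for one output channel. It assumes that 'same' padding pads with zeros and that all three input channels are identical (as in the grayscale case from the question); ones_conv_same is a hypothetical helper, not a Keras function.

```python
def ones_conv_same(image, channels=3, kernel=3):
    """Sum each pixel's kernel x kernel neighbourhood (zero padded),
    multiplied by the number of identical input channels."""
    height, width = len(image), len(image[0])
    radius = kernel // 2
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            total = 0.0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    y, x = i + di, j + dj
                    # Positions outside the image contribute zero (padding).
                    if 0 <= y < height and 0 <= x < width:
                        total += image[y][x]
            out[i][j] = channels * total
    return out


ones_image = [[1.0] * 5 for _ in range(5)]
result = ones_conv_same(ones_image)
print(result[2][2], result[0][2], result[0][0])  # 27.0 18.0 12.0
```

Under those assumptions, an all-ones input should give 27 in the interior, 18 along the edges and 12 in the corners.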
I'm using Keras with tensorflow as backend.
I have one compiled/trained model.
My prediction loop is slow so I would like to find a way to parallelize the predict_proba calls to speed things up.
I would like to take a list of batches (of data) and then per available gpu, run model.predict_proba() over a subset of those batches.
Essentially:
data = [ batch_0, batch_1, ... , batch_N ]
on gpu_0 => return predict_proba(batch_0)
on gpu_1 => return predict_proba(batch_1)
...
on gpu_N => return predict_proba(batch_N)
I know that it's possible in pure Tensorflow to assign ops to a given gpu (https://www.tensorflow.org/tutorials/using_gpu). However, I don't know how this translates to my situation since I've built/compiled/trained my model using Keras' api.
I had thought that maybe I just needed to use python's multiprocessing module and start a process per gpu that would run predict_proba(batch_n). I know this is theoretically possible given another SO post of mine: Keras + Tensorflow and Multiprocessing in Python. However, this still leaves me with the dilemma of not knowing how to actually "choose" a gpu to operate the process on.
My question boils down to: how does one parallelize prediction for one model in Keras across multiple gpus when using Tensorflow as Keras' backend?
Additionally I am curious if similar parallelization for prediction is possible with only one gpu.
A high level description or code example would be greatly appreciated!
Thanks!
I created a simple example to show how to run a Keras model across multiple GPUs. Basically, multiple processes are created and each process owns one GPU. To specify the GPU id for a process, setting the environment variable CUDA_VISIBLE_DEVICES is a very straightforward way (os.environ["CUDA_VISIBLE_DEVICES"]). I hope this git repo can help you.
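A minimal sketch of the one-process-per-GPU idea, using only the standard library. It is not the code from the repository: assign_batches, worker_env and run_worker are hypothetical helpers, and the worker is launched with subprocess rather than the multiprocessing module. The variable has to be set before the worker process initializes CUDA, which is why it is passed in through the child's environment.

```python
import os
import subprocess
import sys


def assign_batches(num_batches, gpu_ids):
    """Round-robin assignment: batch i goes to gpu_ids[i % len(gpu_ids)]."""
    assignment = {gpu_id: [] for gpu_id in gpu_ids}
    for batch_index in range(num_batches):
        assignment[gpu_ids[batch_index % len(gpu_ids)]].append(batch_index)
    return assignment


def worker_env(gpu_id):
    """Copy of the current environment in which only one GPU is visible."""
    return dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))


def run_worker(gpu_id, worker_code):
    """Run worker_code in a fresh Python process pinned to a single GPU."""
    completed = subprocess.run(
        [sys.executable, "-c", worker_code],
        env=worker_env(gpu_id),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


print(assign_batches(5, [0, 1]))  # {0: [0, 2, 4], 1: [1, 3]}
```

In a real worker, worker_code would load the saved model and call predict on its share of the batches; inside that process the selected GPU appears as device 0.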
https://github.com/yuanyuanli85/Keras-Multiple-Process-Prediction
You can use this function to parallelize a Keras model (credits to kuza55).
https://github.com/kuza55/keras-extras/blob/master/utils/multi_gpu.py
from keras.layers import merge
from keras.layers.core import Lambda
from keras.models import Model
import tensorflow as tf
def make_parallel(model, gpu_count):
    def get_slice(data, idx, parts):
        shape = tf.shape(data)
        size = tf.concat([ shape[:1] // parts, shape[1:] ],axis=0)
        stride = tf.concat([ shape[:1] // parts, shape[1:]*0 ],axis=0)
        start = stride * idx
        return tf.slice(data, start, size)

    outputs_all = []
    for i in range(len(model.outputs)):
        outputs_all.append([])

    #Place a copy of the model on each GPU, each getting a slice of the batch
    for i in range(gpu_count):
        with tf.device('/gpu:%d' % i):
            with tf.name_scope('tower_%d' % i) as scope:
                inputs = []
                #Slice each input into a piece for processing on this GPU
                for x in model.inputs:
                    input_shape = tuple(x.get_shape().as_list())[1:]
                    slice_n = Lambda(get_slice, output_shape=input_shape, arguments={'idx':i,'parts':gpu_count})(x)
                    inputs.append(slice_n)

                outputs = model(inputs)

                if not isinstance(outputs, list):
                    outputs = [outputs]

                #Save all the outputs for merging back together later
                for l in range(len(outputs)):
                    outputs_all[l].append(outputs[l])

    # merge outputs on CPU
    with tf.device('/cpu:0'):
        merged = []
        for outputs in outputs_all:
            merged.append(merge(outputs, mode='concat', concat_axis=0))

        return Model(input=model.inputs, output=merged)
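The slicing arithmetic in get_slice is worth checking by hand, because the slice size uses integer division. A plain-Python sketch of the same computation along the batch axis (slice_bounds is a hypothetical helper that mirrors the start and size values, not part of the function above):

```python
def slice_bounds(batch_size, idx, parts):
    """Mirror get_slice along the batch axis: (start, size) of slice idx."""
    size = batch_size // parts   # same integer division as shape[:1] // parts
    start = size * idx           # same as stride * idx on the batch axis
    return start, size


# A batch of 10 samples split across 4 GPUs.
bounds = [slice_bounds(10, i, 4) for i in range(4)]
print(bounds)  # [(0, 2), (2, 2), (4, 2), (6, 2)]
```

In this example only samples 0 to 7 are covered, so the last two samples of the batch are never processed. Choosing a batch size that is a multiple of gpu_count avoids this.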
I'm trying to implement a CNN using theano/lasagne.
I've made a neural network but can't figure out how to train it with the current state.
This is how I'm trying to get the output of the network with the current_states as input.
train = theano.function([input_var], lasagne.layers.get_output(l.out))
output = train(current_states)
However I get this error:
theano.compile.function_module.UnusedInputError: theano.function was asked to create a function computing outputs given certain inputs, but the provided input variable at index 0 is not part of the computational graph needed to compute the outputs: inputs.
To make this error into a warning, you can pass the parameter on_unused_input='warn' to theano.function. To disable it completely, use on_unused_input='ignore'.
Why is current_states not used?
I want to get the output of the model on the current_states. How do I do this?
(the CNN build code: http://pastebin.com/Gd35RncU)
The following code snippet works for me:
import lasagne, theano
import theano.tensor as T
import numpy as np
input_var = theano.tensor.tensor4('inputs')
l_out = build_cnn(input_var)
train = theano.function([input_var], lasagne.layers.get_output(l_out))
x = np.random.randn(10, 4, 80, 80).astype(theano.config.floatX)
train(x)
You didn't post your entire code, but you can check to see if in your script you are passing in the input_var variable to your build_cnn function. If you do not, then input_var will not be part of your computational graph, which is why Theano is raising the error.
In python when I want to get the data from a layer using caffe I have the following code
input_image = caffe.io.load_image(imgName)
input_oversampled = caffe.io.resize_image(input_image, self.net.crop_dims)
prediction = self.net.predict([input_image])
caffe_input = np.asarray(self.net.preprocess('data', prediction))
self.net.forward(data=caffe_input)
data = self.net.blobs['fc7'].data[4]  # I want to get this value in Lua
However, when I'm using Torch I'm a bit stuck, since I don't know how to perform the same action.
Currently I have the following code
require 'caffe'
require 'image'
net = caffe.Net('/opt/caffe/models/bvlc_reference_caffenet/deploy.prototxt', '/opt/caffe/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel')
img = image.lena()
dest = torch.Tensor(3, 227,227)
img = image.scale(dest, img)
img = img:resize(10,3,227,227)
output = net:forward(img:float())
conv_nodes = net:findModules('fc7') -- not working
Any help would be appreciated
First of all, please note that torch-caffe-binding (i.e. the tool you use with require 'caffe') is a direct wrapper around the Caffe library, built with the LuaJIT FFI.
This means that it allows you to conveniently do a forward or backward with a Torch tensor, but behind the scenes these operations are made on a caffe::Net and not on a Torch nn network.
So if you want to manipulate a plain Torch network what you should use is the loadcaffe library which fully converts the network into a nn.Sequential:
require 'loadcaffe'
local net = loadcaffe.load('net.prototxt', 'net.caffemodel')
Then you can use findModules. However, please note that you can no longer use the original labels (like conv1 or fc7), as they are discarded during conversion.
Here fc7 (= INNER_PRODUCT) corresponds to the second-to-last linear transformation (index N-1 of N). So you can get it as follows:
local nodes = net:findModules('nn.Linear')
local fc7 = nodes[#nodes-1]
Then you can read the data (weights and biases) via fc7.weight and fc7.bias - these are regular torch.Tensor-s.
UPDATE
As of commit 2516fac, loadcaffe also saves the layer names. So to retrieve the 'fc7' layer you can now do something like:
local fc7
for _,m in pairs(net:listModules()) do
  if m.name == 'fc7' then
    fc7 = m
    break
  end
end