Mixing feed forward layers and recurrent layers in Tensorflow? - python

Has anyone been able to mix feedforward layers and recurrent layers in Tensorflow?
For example:
input->conv->GRU->linear->output
I can imagine one can define his own cell with feedforward layers and no state which can then be stacked using the MultiRNNCell function, something like:
cell = tf.nn.rnn_cell.MultiRNNCell([conv_cell,GRU_cell,linear_cell])
This would make life a whole lot easier...

can't you just do the following:
rnnouts, _ = rnn(grucell, inputs)
linearout = [tf.matmul(rnnout, weights) + bias for rnnout in rnnouts]
etc.

This tutorial gives an example of how to use convolutional layers together with recurrent ones. For example, having last convolution layers like this:
...
l_conv4_a = conv_pre(l_pool3, 16, (5, 5), scope="l_conv4_a")
l_pool4 = pool(l_conv3_a, scope="l_pool4")
l_flatten = flatten(l_pool4, scope="flatten")
and having defined RNN cell:
_, shape_state = tf.nn.dynamic_rnn(cell=shape_cell,
inputs=tf.expand_dims(batch_norm(x_shape_pl), 2), dtype=tf.float32, scope="shape_rnn")
You can concatenate both outputs and use it as the input to the next layer:
features = tf.concat(concat_dim=1, values=[x_margin_pl, shape_state, x_texture_pl, l_flatten], name="features")
Or you can just use the output of CNN layer as the input to the RNN cell:
_, shape_state = tf.nn.dynamic_rnn(cell=shape_cell,
inputs=l_flatten, dtype=tf.float32, scope="shape_rnn")

This is what I have so far; improvements welcome:
class LayerCell(rnn_cell_impl.RNNCell):
def __init__(self, tf_layer, **kwargs):
''' :param tf_layer: a tensorflow layer, e.g. tf.layers.Conv2D or
tf.keras.layers.Conv2D. NOT tf.layers.conv2d !
Can pass all other layer params as well, just need to give the
parameter name: paramname=param'''
self.layer_fn = tf_layer(**kwargs)
def __call__(self, inputs, state, scope=None):
''' Every `RNNCell` must implement `call` with
the signature `(output, next_state) = call(input, state)`. The optional
third input argument, `scope`, is allowed for backwards compatibility
purposes; but should be left off for new subclasses.'''
return (self.layer_fn(inputs), state)
def __str__(self):
return "Cell wrapper of " + str(self.layer_fn)
def __getattr__(self, attr):
'''credits to https://stackoverflow.com/questions/1382871/dynamically-attaching-a-method-to-an-existing-python-object-generated-with-swig/1383646#1383646'''
return getattr(self.layer_fn, attr)
#property
def state_size(self):
"""size(s) of state(s) used by this cell.
It can be represented by an Integer, a TensorShape or a tuple of Integers
or TensorShapes.
"""
return (0,)
#property
def output_size(self):
"""Integer or TensorShape: size of outputs produced by this cell."""
# use with caution; could be uninitialized
return self.layer_fn.output_shape
(Naturally, don't use with recurrent layers because state-keeping will be destroyed.)
Seems to work with: tf.layers.Conv2D, tf.keras.layers.Conv2D, tf.keras.layers.Activation, tf.layers.BatchNormalization
Does NOT work with: tf.keras.layers.BatchNormalization.
At least it failed for me when using it in a tf.while loop; complaining about combining variables from different frames, similar to here. Maybe keras uses tf.Variable() instead of tf.get_variable() ...?
Usage:
cell0 = tf.contrib.rnn.ConvLSTMCell(conv_ndims=2, input_shape=[40, 40, 3], output_channels=16, kernel_shape=[5, 5])
cell1 = LayerCell(tf.keras.layers.Conv2D, filters=8, kernel_size=[5, 5], strides=(1, 1), padding='same')
cell2 = LayerCell(tf.layers.BatchNormalization, axis=-1)
inputs = np.random.rand(10, 40, 40, 3).astype(np.float32)
multicell = tf.contrib.rnn.MultiRNNCell([cell0, cell1, cell2])
state = multicell.zero_state(batch_size=10, dtype=tf.float32)
output = multicell(inputs, state)

Related

How to override gradient for the nonlinearity functions in lasagne?

I have a model, for which i need to compute the gradients of output w.r.t the model's input. But I want to apply some custom gradients for some of the nonlinearity functions applied on some of the model's layers. So i tried the idea explained here, which computes the nonlinear rectifier (RELU) in the forward pass but modifies the gradients of Relu in the backward pass. I added the following two classes:
The helper class that allows us to replace a nonlinearity with an Op
that has the same output, but a custom gradient
class ModifiedBackprop(object):
def __init__(self, nonlinearity):
self.nonlinearity = nonlinearity
self.ops = {} # memoizes an OpFromGraph instance per tensor type
def __call__(self, x):
# OpFromGraph is oblique to Theano optimizations, so we need to move
# things to GPU ourselves if needed.
if theano.sandbox.cuda.cuda_enabled:
maybe_to_gpu = theano.sandbox.cuda.as_cuda_ndarray_variable
else:
maybe_to_gpu = lambda x: x
# We move the input to GPU if needed.
x = maybe_to_gpu(x)
# We note the tensor type of the input variable to the nonlinearity
# (mainly dimensionality and dtype); we need to create a fitting Op.
tensor_type = x.type
# If we did not create a suitable Op yet, this is the time to do so.
if tensor_type not in self.ops:
# For the graph, we create an input variable of the correct type:
inp = tensor_type()
# We pass it through the nonlinearity (and move to GPU if needed).
outp = maybe_to_gpu(self.nonlinearity(inp))
# Then we fix the forward expression...
op = theano.OpFromGraph([inp], [outp])
# ...and replace the gradient with our own (defined in a subclass).
op.grad = self.grad
# Finally, we memoize the new Op
self.ops[tensor_type] = op
# And apply the memoized Op to the input we got.
return self.ops[tensor_type](x)
The subclass that does guided backpropagation through a nonlinearity:
class GuidedBackprop(ModifiedBackprop):
def grad(self, inputs, out_grads):
(inp,) = inputs
(grd,) = out_grads
dtype = inp.dtype
print('It works')
return (grd * (inp > 0).astype(dtype) * (grd > 0).astype(dtype),)
Then i used them in my code as follows:
import lasagne as nn
model_in = T.tensor3()
# model_in = net['input'].input_var
nn.layers.set_all_param_values(net['l_out'], model['param_values'])
relu = nn.nonlinearities.rectify
relu_layers = [layer for layer in
nn.layers.get_all_layers(net['l_out']) if getattr(layer,
'nonlinearity', None) is relu]
modded_relu = GuidedBackprop(relu)
for layer in relu_layers:
layer.nonlinearity = modded_relu
prop = nn.layers.get_output(
net['l_out'], model_in, deterministic=True)
for sample in range(ini, batch_len):
model_out = prop[sample, 'z'] # get prop for label 'z'
gradients = theano.gradient.jacobian(model_out, wrt=model_in)
# gradients = theano.grad(model_out, wrt=model_in)
get_gradients = theano.function(inputs=[model_in],
outputs=gradients)
grads = get_gradients(X_batch) # gradient dimension: X_batch == model_in(64, 20, 32)
grads = np.array(grads)
grads = grads[sample]
Now when i run the code, it works without any error, and the shape of the output is also correct. But that's because it executes the default theano.grad function and not the one supposed to override it. In other words, the grad() function in the class GuidedBackprop never been invoked.
I can't understand what is the issue?
is there's a solution?
If this is an unresolved issue, is there's an implementation for a Theano Op that can achieve such a functionality or some other way to override gradient for specific nonlinearity functions applied on some of the model's layers?
Are you try to set it back the value of model output into model layer input, all gradients calculation
group_1_ShoryuKen_Left = tf.constant([ 0,0,0,0,0,1,0,0,0,0,0,0, 0,0,0,0,0,1,0,1,0,0,0,0, 0,0,0,0,0,0,0,1,0,0,0,0, 0,0,0,0,0,0,0,0,0,1,0,0 ], shape=(1, 1, 48), dtype=tf.float32)
## layer_2 = tf.keras.layers.Dense(256, kernel_initializer=tf.constant_initializer(1.))
layer_2 = tf.keras.layers.LSTM(32, kernel_initializer=tf.constant_initializer(1.))
b_out = layer_2(group_1_ShoryuKen_Left)
layer_2.set_weights(layer_1.get_weights())

Obtain the output of intermediate layer (Functional API) and use it in SubClassed API

In the keras doc, it says that if we want to pick the intermediate layer's output of the model (sequential and functional), all we need to do as follows:
model = ... # create the original model
layer_name = 'my_layer'
intermediate_layer_model = keras.Model(inputs=model.input,
outputs=model.get_layer(layer_name).output)
intermediate_output = intermediate_layer_model(data)
So, here we get two models, the intermediate_layer_model is the sub-model of its parent model. And they're independent as well. Likewise, if we get the intermediate layer's output feature maps of the parent model (or base model), and do some operation with it and get some output feature maps from this operation, then we can also impute this output feature maps back to the parent model.
input = tf.keras.Input(shape=(size,size,3))
model = tf.keras.applications.DenseNet121(input_tensor = input)
layer_name = "conv1_block1" # for example
output_feat_maps = SomeOperationLayer()(model.get_layer(layer_name).output)
# assume, they're able to add up
base = Add()([model.output, output_feat_maps])
# bind all
imputed_model = tf.keras.Model(inputs=[model.input], outputs=base)
So, in this way we have one modified model. It's quite easy with functional API. All the keras imagenet models are written with functional API (mostly). In model subclassing API, we can use these models. My concern here is, what to do if we need the intermediate output feature maps of these functional API models' inside call function.
class Subclass(tf.keras.Model):
def __init__(self, dim):
super(Subclass, self).__init__()
self.dim = dim
self.base = DenseNet121(input_shape=self.dim)
# building new model with the desired output layer of base model
self.mid_layer_model = tf.keras.Model(self.base.inputs,
self.base.get_layer(layer_name).output)
def call(self, inputs):
# forward with base model
x = self.base(inputs)
# forward with mid_layer_model
mid_feat = self.mid_layer_model(inputs)
# do some op with it
mid_x = SomeOperationLayer()(mid_feat)
# assume, they're able to add up
out = tf.keras.layers.add([x, mid_x])
return out
The issue is, here we've technically two models in a joint fashion. But unlike building a model like this, here we simply want the intermediate output feature maps (from some inputs) of the base model forward manner and use it somewhere else and get some output. Like this
mid_x = SomeOperationLayer()(self.base.get_layer(layer_name).output)
But it gives ValueError: Graph disconnected. So, currently, we have to build a new model from the base model based on our desired intermediate layer. In the init method we define or create new self.mid_layer_model model that gives our desired output feature maps like this: mid_feat = self.mid_layer_model(inputs). Next, we take the mid_faet and do some operation and get some output and lastly add them with tf.keras.layers.add([x, mid_x]). So by creating a new model with desired intermediate out works but by the same time, we repeat the same operation twice i.e the base model and its subset model. Maybe I'm missing something obvious, please add up something. Is it how it is! or there some strategies we can adopt. I've asked in the forum here, no response yet.
Update
Here is a working example. Let's say we have a custom layer like this
import tensorflow as tf
from tensorflow.keras.applications import DenseNet121
from tensorflow.keras.layers import Add
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
class ConvBlock(tf.keras.layers.Layer):
def __init__(self, kernel_num=32, kernel_size=(3,3), strides=(1,1), padding='same'):
super(ConvBlock, self).__init__()
# conv layer
self.conv = tf.keras.layers.Conv2D(kernel_num,
kernel_size=kernel_size,
strides=strides, padding=padding)
# batch norm layer
self.bn = tf.keras.layers.BatchNormalization()
def call(self, input_tensor, training=False):
x = self.conv(input_tensor)
x = self.bn(x, training=training)
return tf.nn.relu(x)
And we want to impute this layer into an ImageNet model and construct a model like this
input = tf.keras.Input(shape=(32, 32, 3))
base = DenseNet121(weights=None, input_tensor = input)
# get output feature maps of at certain layer, ie. conv2_block1_0_relu
cb = ConvBlock()(base.get_layer("conv2_block1_0_relu").output)
flat = Flatten()(cb)
dense = Dense(1000)(flat)
# adding up
adding = Add()([base.output, dense])
model = tf.keras.Model(inputs=[base.input], outputs=adding)
from tensorflow.keras.utils import plot_model
plot_model(model,
show_shapes=True, show_dtype=True,
show_layer_names=True,expand_nested=False)
Here the computation from input to layer conv2_block1_0_relu is computed one time. Next, if we want to translate this functional API to subclassing API, we had to build a model from the base model's input to layer conv2_block1_0_relu first. Like
class ModelWithMidLayer(tf.keras.Model):
def __init__(self, dim=(32, 32, 3)):
super().__init__()
self.dim = dim
self.base = DenseNet121(input_shape=self.dim, weights=None)
# building sub-model from self.base which gives
# desired output feature maps: ie. conv2_block1_0_relu
self.mid_layer = tf.keras.Model(self.base.inputs,
self.base.get_layer("conv2_block1_0_relu").output)
self.flat = Flatten()
self.dense = Dense(1000)
self.add = Add()
self.cb = ConvBlock()
def call(self, x):
# forward with base model
bx = self.base(x)
# forward with mid layer
mx = self.mid_layer(x)
# make same shape or do whatever
mx = self.dense(self.flat(mx))
# combine
out = self.add([bx, mx])
return out
def build_graph(self):
x = tf.keras.layers.Input(shape=(self.dim))
return tf.keras.Model(inputs=[x], outputs=self.call(x))
mwml = ModelWithMidLayer()
plot_model(mwml.build_graph(),
show_shapes=True, show_dtype=True,
show_layer_names=True,expand_nested=False)
Here model_1 is actually a sub-model from DenseNet, which probably leads the whole model (ModelWithMidLayer) to compute the same operation twice. If this observation is correct, then this gives us concern.
I thought it might be much complex but it's actually rather very simple. We just need to build a model with desired output layers at the __init__ method and use it normally in the call method.
import tensorflow as tf
from tensorflow.keras.applications import DenseNet121
from tensorflow.keras.layers import Add
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
class ConvBlock(tf.keras.layers.Layer):
def __init__(self, kernel_num=32, kernel_size=(3,3), strides=(1,1), padding='same'):
super(ConvBlock, self).__init__()
# conv layer
self.conv = tf.keras.layers.Conv2D(kernel_num,
kernel_size=kernel_size,
strides=strides, padding=padding)
# batch norm layer
self.bn = tf.keras.layers.BatchNormalization()
def call(self, input_tensor, training=False):
x = self.conv(input_tensor)
x = self.bn(x, training=training)
return tf.nn.relu(x)
class ModelWithMidLayer(tf.keras.Model):
def __init__(self, dim=(32, 32, 3)):
super().__init__()
self.dim = dim
self.base = DenseNet121(input_shape=self.dim, weights=None)
# building sub-model from self.base which gives
# desired output feature maps: ie. conv2_block1_0_relu
self.mid_layer = tf.keras.Model(
inputs=[self.base.inputs],
outputs=[
self.base.get_layer("conv2_block1_0_relu").output,
self.base.output])
self.flat = Flatten()
self.dense = Dense(1000)
self.add = Add()
self.cb = ConvBlock()
def call(self, x):
# forward with base model
bx = self.mid_layer(x)[1] # output self.base.output
# forward with mid layer
mx = self.mid_layer(x)[0] # output base.get_layer("conv2_block1_0_relu").output
# make same shape or do whatever
mx = self.dense(self.flat(mx))
# combine
out = self.add([bx, mx])
return out
def build_graph(self):
x = tf.keras.layers.Input(shape=(self.dim))
return tf.keras.Model(inputs=[x], outputs=self.call(x))
mwml = ModelWithMidLayer()
tf.keras.utils.plot_model(mwml.build_graph(),
show_shapes=True, show_dtype=True,
show_layer_names=True,expand_nested=False)

Pass non-symbolic tensor to Keras Lambda layer

I am trying to pass a RNNCell object to a Keras lambda layer so that I can use the Tensorflow layer within a Keras model, as follows.
conv_cell = ConvGRUCell(shape = [14, 14],
filters = 32,
kernel = [3,3],
padding = 'SAME')
def convGRU(inputs, cell, length):
output, final = tf.nn.bidirectional_dynamic_rnn(
cell, cell, x, length, dtype=tf.float32)
output = tf.concat(output, -1)
final = tf.concat(final, -1)
return [output, final]
lm = Lambda(lambda x: convGRU(x[0], x[1], x[2])([input, conv_cell, length])
However, I get an error that conv_cell is not a symbolic tensor (it is a custom layer based on Tensorflow's GRUCell).
Is there any way to pass the cell to the lambda layer? I got it to work with functools.partial but it fails to save/load the model because it cannot access the function inside the model.
def convGRU(cell, length): # if length is produced by the model, use it with the inputs
def inner_func(inputs):
code...
return inner_func
lm = Lambda(convGRU(cell, length))(input)
For save/load you need to use custom_objects = {'convGRU': convGRU, 'cell':cell, 'length': length}, etc. Whatever Keras doesn't know automatically needs to be in custom_objects for loading a saved model.

Keras Conv2D custom kernel initialization

I need to initialize custom Conv2D kernels with weights
W = a1b1 + a2b2 + ... + anbn
where W = custom weight of Conv2D layer to initialise with
a = random weight Tensors as keras.backend.variable(np.random.uniform()), shape=(64, 1, 10)
b = fixed basis filters defined as keras.backend.constant(...), shape=(10, 11, 11)
W = K.sum(a[:, :, :, None, None] * b[None, None, :, :, :], axis=2) #shape=(64, 1, 11, 11)
I want my model to update the 'W' values with only changing the 'a's while keeping the 'b's constant.
I pass the custom 'W's as
Conv2D(64, kernel_size=(11, 11), activation='relu', kernel_initializer=kernel_init_L1)(img)
where kernel_init_L1 returns keras.backend.variable(K.reshape(w_L1, (11, 11, 1, 64)))
Problem:
I am not sure if this is the correct way to do this. Is it possible to specify in Keras which ones are trainable and which are not. I know that layers can be set trainable = True but i am not sure about weights.
I think the implementation is incorrect because I get similar results from my model with or without the custom initializations.
It would be immensely helpful if someone can point out any mistakes in my approach or provide a way to verify it.
Warning about your shapes: If your kernel size is (11,11), and assuming you have 64 input channels and 1 output channel, your final kernel shape must be (11,11,64,1).
You should probably be going for a[None,None] and b[:,:,:,None,None].
class CustomConv2D(Conv2D):
def __init__(self, filters, kernel_size, kernelB = None, **kwargs):
super(CustomConv2D, self).__init__(filters, kernel_size,**kwargs)
self.kernelB = kernelB
def build(self, input_shape):
#use the input_shape to calculate the shapes of A and B
#if needed, pay attention to the "data_format" used.
#this is an actual weight, because it uses `self.add_weight`
self.kernelA = self.add_weight(
shape=shape_of_kernel_A + (1,1), #or (1,1) + shape_of_A
initializer='glorot_uniform', #or select another
name='kernelA',
regularizer=self.kernel_regularizer,
constraint=self.kernel_constraint)
#this is an ordinary var that will participate in the calculation
#not a weight, not updated
if self.kernelB is None:
self.kernelB = K.constant(....)
#use the shape already containing the new axes
#in the original conv layer, this property would be the actual kernel,
#now it's just a var that will be used in the original's "call" method
self.kernel = K.sum(self.kernelA * self.kernelB, axis=2)
#important: the resulting shape should be:
#(kernelSizeX, kernelSizeY, input_channels, output_channels)
#the following are remains of the original code for "build" in Conv2D
#use_bias is True by default
if self.use_bias:
self.bias = self.add_weight(shape=(self.filters,),
initializer=self.bias_initializer,
name='bias',
regularizer=self.bias_regularizer,
constraint=self.bias_constraint)
else:
self.bias = None
# Set input spec.
self.input_spec = InputSpec(ndim=self.rank + 2,
axes={channel_axis: input_dim})
self.built = True
Hints for custom layers
When you create a custom layer from zero (derived from Layer), you should have these methods:
__init__(self, ... parameters ...) - this is the creator, it's called when you create a new instance of your layer. Here, you store the values the user passed as parameters. (In a Conv2D, the init would have the "filters", "kernel_size", etc.)
build(self, input_shape) - this is where you should create the weights (all learnable vars are created here, based on the input shape)
compute_output_shape(self,input_shape) - here you return the output shape based on the input shape
call(self,inputs) - Here you perform the actual layer calculations
Since we're not creating this layer from zero, but deriving it from Conv2D, everything is ready, all we did was to "change" the build method and replace what would be considered the kernel of the Conv2D layer.
More on custom layers: https://keras.io/layers/writing-your-own-keras-layers/
The call method for conv layers is here in class _Conv(Layer):.

How to restore the function defined in the graph?

I defined a funciton in tensorflow as follows:
def generator(keep_prob, z, out_channel_dim, alphag1, is_train=True):
"""
Create the generator network
:param z: Input z
:param out_channel_dim: The number of channels in the output image
:param is_train: Boolean if generator is being used for training
:return: The tensor output of the generator
"""
# TODO: Implement Function
# when it is training reuse=False
# when it is not training reuse=True
alpha=alphag1
with tf.variable_scope('generator',reuse=not is_train):
layer = tf.layers.dense(z, 3*3*512,activation=None,\
kernel_initializer=tf.contrib.layers.xavier_initializer(uniform=False))
layer = tf.reshape(layer, [-1, 3,3,512])
layer = tf.layers.batch_normalization(layer, training=is_train)
layer = tf.maximum(layer*alpha, layer)
#layer = layer+tf.random_normal(shape=tf.shape(layer), mean=0.0, stddev=0.0001, dtype=tf.float32)
#layer = tf.nn.dropout(layer,keep_prob)
layer = tf.layers.conv2d_transpose(layer, 256, 4, strides=2, padding='same',\
kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(uniform=False))
layer = tf.layers.batch_normalization(layer, training=is_train)
layer = tf.maximum(layer*alpha, layer)
#layer = layer+tf.random_normal(shape=tf.shape(layer), mean=0.0, stddev=0.00001, dtype=tf.float32)
#layer = tf.nn.dropout(layer,keep_prob)
layer = tf.layers.conv2d_transpose(layer, 128, 4, strides=2, padding='same',\
kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(uniform=False))
layer = tf.layers.batch_normalization(layer, training=is_train)
layer = tf.maximum(layer*alpha, layer)
#layer = layer+tf.random_normal(shape=tf.shape(layer), mean=0.0, stddev=0.000001, dtype=tf.float32)
#layer = tf.nn.dropout(layer,keep_prob)
layer = tf.layers.conv2d_transpose(layer, 64, 4, strides=2, padding='same',\
kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(uniform=False))
layer = tf.layers.batch_normalization(layer, training=is_train)
layer = tf.maximum(layer*alpha, layer)
#layer = layer+tf.random_normal(shape=tf.shape(layer), mean=0.0, stddev=0.0000001, dtype=tf.float32)
#layer = tf.nn.dropout(layer,keep_prob)
layer = tf.layers.conv2d_transpose(layer, out_channel_dim, 4, strides=2, padding='same',\
kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(uniform=False))
#layer = layer+tf.random_normal(shape=tf.shape(layer), mean=0.0, stddev=0.00000001, dtype=tf.float32)
layer = tf.tanh(layer)
return layer
This is complicated such that to track each variable in each layer is difficult.
I later used tf.train.Saver() and saver.save to save everything after training.
Now I would like to restore this function so that I can use it to do further manipulations while keeping the trained weigts of each layer unchanged.
I found online that most function like tf.get_default_graph().get_tensor_by_name or some other functions were limited to restore only the values of the variables but not this function.
For example the input z of this function generator(keep_prob, z, out_channel_dim, alphag1, is_train=True) is a tensor from another function.
I want to restore this function so that I can use two new tensors z1 and z2 with she same shape as z.
layer1 = generator(keep_prob, z1, out_channel_dim, alphag1, is_train=False)
layer2 = generator(keep_prob, z2, out_channel_dim, alphag1, is_train=False)
layer = layer1 - layer2
and I can put this new tensor layer into another function.
Here layer1 and layer2 use the function with the saved weights.
The thing that is difficlut is that when I use the function generator I have to specifiy it with the trianed weights which was stored using Saver(). I find it difficult to specify this function with its weights. For, 1. too many layers to track off and 2. I don't know how to specify weights for tf.layers.conv2().
So are there anyone who know how to solve this issue?
This is a general question:
I save the whole model into file and need to restore part of the model into part of new model.
Here name_map is a dict:the key is new name in graph and value is name in ckpt file.
def get_restore_saver(self, name_map, restore_optimise_var=True):
var_grp = {_.op.name:_ for _ in tf.global_variables()}
varm = {}
for main_name in var_grp:
if main_name in name_map:
varm[name_map[main_name]] = var_grp[main_name]
elif restore_optimise_var: # I use adam to optimise
var_arr = main_name.split('/')
tail = var_arr[-1]
_ = '/'.join(var_arr[: -1])
if tail in ['Adam', 'Adam_1', 'ExponentialMovingAverage'] and _ in name_map:
varm[name_map[_] + "/" + tail] = var_grp[main_name]
return tf.train.Saver(varm)
Why do you need to restore the function and what does that even mean? If you need to use a model, you have to restore the corresponding graph. What your function does is defining nodes of the graph. You may use your function to build or rebuild that graph again and then load weights stored somewhere using Saver() or you may restore graph from the protobuf file.
In order to rebuild the graph, try to invoke invoke your function somewhere output_layer=generator(keep_prob, z, out_channel_dim, alphag1, is_train=True) and than use Saver class as usual to restore weights. Your function does not compute, it defines a part or whole of the graph. All computations are performed by the graph.
In the last case you will find useful the following thread. Usually, you will need to know names of the input and output layers. That can be obtained by the code:
[n.name for n in tf.get_default_graph().as_graph_def().node]
After a long time of searching, it seems that maybe the following is a solution.
Define all the variables in advance,i.e.layer1 = generator(keep_prob, z1,
out_channel_dim, alphag1, is_train=False)
layer2 = generator(keep_prob, z2, out_channel_dim, alphag1, is_train=False)
layer = layer1 - layer2.
Now you can use tf.get_collection to find the operators.
It seems that tensorflow will not give you the pre defined functions. It keeps the graph and values only but not in the form of function. One needs to set everything needed in the furture in the graph or one should keep track of every weights, even too many.

Categories