How to restore the function defined in the graph? - python

I defined a function in TensorFlow as follows:
def generator(keep_prob, z, out_channel_dim, alphag1, is_train=True):
    """
    Create the generator network
    :param z: Input z
    :param out_channel_dim: The number of channels in the output image
    :param is_train: Boolean if generator is being used for training
    :return: The tensor output of the generator
    """
    # TODO: Implement Function
    # when it is training reuse=False
    # when it is not training reuse=True
    alpha = alphag1
    with tf.variable_scope('generator', reuse=not is_train):
        layer = tf.layers.dense(z, 3*3*512, activation=None,
                                kernel_initializer=tf.contrib.layers.xavier_initializer(uniform=False))
        layer = tf.reshape(layer, [-1, 3, 3, 512])
        layer = tf.layers.batch_normalization(layer, training=is_train)
        layer = tf.maximum(layer*alpha, layer)
        #layer = layer+tf.random_normal(shape=tf.shape(layer), mean=0.0, stddev=0.0001, dtype=tf.float32)
        #layer = tf.nn.dropout(layer,keep_prob)
        layer = tf.layers.conv2d_transpose(layer, 256, 4, strides=2, padding='same',
                                           kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(uniform=False))
        layer = tf.layers.batch_normalization(layer, training=is_train)
        layer = tf.maximum(layer*alpha, layer)
        #layer = layer+tf.random_normal(shape=tf.shape(layer), mean=0.0, stddev=0.00001, dtype=tf.float32)
        #layer = tf.nn.dropout(layer,keep_prob)
        layer = tf.layers.conv2d_transpose(layer, 128, 4, strides=2, padding='same',
                                           kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(uniform=False))
        layer = tf.layers.batch_normalization(layer, training=is_train)
        layer = tf.maximum(layer*alpha, layer)
        #layer = layer+tf.random_normal(shape=tf.shape(layer), mean=0.0, stddev=0.000001, dtype=tf.float32)
        #layer = tf.nn.dropout(layer,keep_prob)
        layer = tf.layers.conv2d_transpose(layer, 64, 4, strides=2, padding='same',
                                           kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(uniform=False))
        layer = tf.layers.batch_normalization(layer, training=is_train)
        layer = tf.maximum(layer*alpha, layer)
        #layer = layer+tf.random_normal(shape=tf.shape(layer), mean=0.0, stddev=0.0000001, dtype=tf.float32)
        #layer = tf.nn.dropout(layer,keep_prob)
        layer = tf.layers.conv2d_transpose(layer, out_channel_dim, 4, strides=2, padding='same',
                                           kernel_initializer=tf.contrib.layers.xavier_initializer_conv2d(uniform=False))
        #layer = layer+tf.random_normal(shape=tf.shape(layer), mean=0.0, stddev=0.00000001, dtype=tf.float32)
        layer = tf.tanh(layer)
    return layer
The function is complicated enough that tracking each variable in each layer is difficult.
I later used tf.train.Saver() and saver.save to save everything after training.
Now I would like to restore this function so that I can use it for further manipulations while keeping the trained weights of each layer unchanged.
Most of what I found online, such as tf.get_default_graph().get_tensor_by_name, only restores the values of the variables, not the function itself.
For example, the input z of generator(keep_prob, z, out_channel_dim, alphag1, is_train=True) is a tensor produced by another function.
I want to restore this function so that I can call it on two new tensors z1 and z2 with the same shape as z:
layer1 = generator(keep_prob, z1, out_channel_dim, alphag1, is_train=False)
layer2 = generator(keep_prob, z2, out_channel_dim, alphag1, is_train=False)
layer = layer1 - layer2
and then pass this new tensor layer into another function.
Here layer1 and layer2 should use the function with the saved weights.
The difficulty is that when I call generator, I have to tie it to the trained weights that were stored using Saver(). I find that hard to do because (1) there are too many layers to keep track of and (2) I don't know how to specify weights for tf.layers.conv2d().
Does anyone know how to solve this issue?

This is a general problem:
I saved the whole model to a file and needed to restore part of it into part of a new model.
Here name_map is a dict: the key is the new name in the graph and the value is the name in the ckpt file.
def get_restore_saver(self, name_map, restore_optimise_var=True):
    var_grp = {_.op.name: _ for _ in tf.global_variables()}
    varm = {}
    for main_name in var_grp:
        if main_name in name_map:
            varm[name_map[main_name]] = var_grp[main_name]
        elif restore_optimise_var:  # I use adam to optimise
            var_arr = main_name.split('/')
            tail = var_arr[-1]
            _ = '/'.join(var_arr[: -1])
            if tail in ['Adam', 'Adam_1', 'ExponentialMovingAverage'] and _ in name_map:
                varm[name_map[_] + "/" + tail] = var_grp[main_name]
    return tf.train.Saver(varm)
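The name-matching logic of get_restore_saver can be tried out without TensorFlow. The sketch below is a plain-Python restatement that works on variable names only; build_restore_map and the example names are hypothetical and not part of the original answer.

```python
def build_restore_map(graph_var_names, name_map, restore_optimizer_slots=True,
                      slot_suffixes=("Adam", "Adam_1", "ExponentialMovingAverage")):
    """Return {checkpoint_name: graph_name} for every variable to restore.

    name_map maps a name in the new graph to its name in the checkpoint,
    as in the answer above.
    """
    restore_map = {}
    for graph_name in graph_var_names:
        if graph_name in name_map:
            restore_map[name_map[graph_name]] = graph_name
        elif restore_optimizer_slots:
            # Optimizer slots are named "<variable>/<suffix>".
            parent, _, tail = graph_name.rpartition("/")
            if tail in slot_suffixes and parent in name_map:
                restore_map[name_map[parent] + "/" + tail] = graph_name
    return restore_map


# Hypothetical variable names, for illustration only.
graph_vars = ["new_gen/dense/kernel", "new_gen/dense/kernel/Adam", "other/bias"]
name_map = {"new_gen/dense/kernel": "generator/dense/kernel"}
restore_map = build_restore_map(graph_vars, name_map)
```

Variables that are not listed in name_map, such as other/bias here, are left out and therefore keep their freshly initialized values.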

Why do you need to restore the function, and what would that even mean? To use a model, you have to restore the corresponding graph. Your function only defines nodes of the graph. You can either call the function again to rebuild that graph and then load the stored weights with Saver(), or restore the graph from the protobuf file.
To rebuild the graph, invoke your function somewhere, e.g. output_layer=generator(keep_prob, z, out_channel_dim, alphag1, is_train=True), and then use the Saver class as usual to restore the weights. Your function does not compute anything; it defines part or all of the graph. All computations are performed by the graph.
In the latter case you will find the following thread useful. Usually, you will need to know the names of the input and output layers. They can be obtained with:
[n.name for n in tf.get_default_graph().as_graph_def().node]
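The idea that the function only defines nodes while the weights live in a store keyed by name can be shown with a framework-free analogy. This is not TensorFlow code: VariableStore and toy_generator are hypothetical stand-ins that mimic how tf.variable_scope with reuse looks variables up by name.

```python
class VariableStore:
    """Toy stand-in for a graph's variables, keyed by name."""

    def __init__(self):
        self.values = {}

    def get(self, name, initial_value, reuse):
        if reuse:
            # Like reuse=True: the variable must already exist.
            return self.values[name]
        if name in self.values:
            raise ValueError("variable %s already exists" % name)
        self.values[name] = initial_value
        return initial_value


def toy_generator(store, z, reuse):
    """Builds w * z + b using variables looked up by name."""
    w = store.get("generator/w", 1.0, reuse)
    b = store.get("generator/b", 0.0, reuse)
    return w * z + b


store = VariableStore()
toy_generator(store, 1.0, reuse=False)  # first call creates the variables
# Stand-in for restoring a checkpoint: overwrite the values by name.
store.values.update({"generator/w": 3.0, "generator/b": 0.5})
layer1 = toy_generator(store, 2.0, reuse=True)
layer2 = toy_generator(store, 1.0, reuse=True)
difference = layer1 - layer2
```

Both later calls pick up the same restored values because they ask for the same names, which is the reason calling generator again with reuse enabled shares the trained weights.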

After a long time of searching, it seems that the following may be a solution.
Define everything you will need in advance, i.e.
layer1 = generator(keep_prob, z1, out_channel_dim, alphag1, is_train=False)
layer2 = generator(keep_prob, z2, out_channel_dim, alphag1, is_train=False)
layer = layer1 - layer2
Now you can use tf.get_collection to find the operators.
It seems that TensorFlow does not store the functions you defined. It keeps only the graph and the values, not the function itself. You either need to put everything you will need in the future into the graph, or keep track of every weight, however many there are.

Related

How to override gradient for the nonlinearity functions in lasagne?

I have a model for which I need to compute the gradients of the output w.r.t. the model's input. But I want to apply custom gradients for some of the nonlinearity functions applied on some of the model's layers. So I tried the idea explained here, which computes the nonlinear rectifier (ReLU) in the forward pass but modifies the gradients of ReLU in the backward pass. I added the following two classes:
The helper class that allows us to replace a nonlinearity with an Op that has the same output, but a custom gradient:
class ModifiedBackprop(object):
    def __init__(self, nonlinearity):
        self.nonlinearity = nonlinearity
        self.ops = {}  # memoizes an OpFromGraph instance per tensor type
    def __call__(self, x):
        # OpFromGraph is opaque to Theano optimizations, so we need to move
        # things to GPU ourselves if needed.
        if theano.sandbox.cuda.cuda_enabled:
            maybe_to_gpu = theano.sandbox.cuda.as_cuda_ndarray_variable
        else:
            maybe_to_gpu = lambda x: x
        # We move the input to GPU if needed.
        x = maybe_to_gpu(x)
        # We note the tensor type of the input variable to the nonlinearity
        # (mainly dimensionality and dtype); we need to create a fitting Op.
        tensor_type = x.type
        # If we did not create a suitable Op yet, this is the time to do so.
        if tensor_type not in self.ops:
            # For the graph, we create an input variable of the correct type:
            inp = tensor_type()
            # We pass it through the nonlinearity (and move to GPU if needed).
            outp = maybe_to_gpu(self.nonlinearity(inp))
            # Then we fix the forward expression...
            op = theano.OpFromGraph([inp], [outp])
            # ...and replace the gradient with our own (defined in a subclass).
            op.grad = self.grad
            # Finally, we memoize the new Op
            self.ops[tensor_type] = op
        # And apply the memoized Op to the input we got.
        return self.ops[tensor_type](x)
The subclass that does guided backpropagation through a nonlinearity:
class GuidedBackprop(ModifiedBackprop):
    def grad(self, inputs, out_grads):
        (inp,) = inputs
        (grd,) = out_grads
        dtype = inp.dtype
        print('It works')
        return (grd * (inp > 0).astype(dtype) * (grd > 0).astype(dtype),)
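For reference, the rule that grad implements can be written out on plain numbers. The sketch below is a framework-free illustration, not Theano code, and the function names are made up for this example. It may help when checking whether the custom gradient is actually being used, since the two rules differ wherever the incoming gradient is negative.

```python
def plain_relu_backward(inputs, out_grads):
    """Ordinary ReLU gradient: pass the gradient where the input was positive."""
    return [g if x > 0 else 0.0 for x, g in zip(inputs, out_grads)]


def guided_relu_backward(inputs, out_grads):
    """Guided backprop: additionally require the incoming gradient to be positive."""
    return [g if (x > 0 and g > 0) else 0.0 for x, g in zip(inputs, out_grads)]


forward_inputs = [-1.0, 2.0, 3.0, 0.5]
incoming_grads = [0.7, -0.4, 1.5, 2.0]
plain = plain_relu_backward(forward_inputs, incoming_grads)
guided = guided_relu_backward(forward_inputs, incoming_grads)
```

The second entry is where the two results differ: the plain rule keeps the negative gradient, the guided rule zeroes it.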
Then I used them in my code as follows:
import numpy as np
import theano
import theano.tensor as T
import lasagne as nn
model_in = T.tensor3()
# model_in = net['input'].input_var
nn.layers.set_all_param_values(net['l_out'], model['param_values'])
relu = nn.nonlinearities.rectify
relu_layers = [layer for layer in nn.layers.get_all_layers(net['l_out'])
               if getattr(layer, 'nonlinearity', None) is relu]
modded_relu = GuidedBackprop(relu)
for layer in relu_layers:
    layer.nonlinearity = modded_relu
prop = nn.layers.get_output(
    net['l_out'], model_in, deterministic=True)
for sample in range(ini, batch_len):
    model_out = prop[sample, 'z']  # get prop for label 'z'
    gradients = theano.gradient.jacobian(model_out, wrt=model_in)
    # gradients = theano.grad(model_out, wrt=model_in)
    get_gradients = theano.function(inputs=[model_in],
                                    outputs=gradients)
    grads = get_gradients(X_batch)  # gradient dimension: X_batch == model_in(64, 20, 32)
    grads = np.array(grads)
    grads = grads[sample]
Now when I run the code, it works without any error, and the shape of the output is also correct. But that is because it executes the default theano.grad function and not the one that is supposed to override it. In other words, the grad() function in the class GuidedBackprop is never invoked.
I can't understand what the issue is.
Is there a solution?
If this is an unresolved issue, is there an implementation of a Theano Op that can achieve this functionality, or some other way to override the gradient for specific nonlinearity functions applied to some of the model's layers?
Are you trying to feed the value of the model output back into the model layer's input for all the gradient calculations?
group_1_ShoryuKen_Left = tf.constant([ 0,0,0,0,0,1,0,0,0,0,0,0, 0,0,0,0,0,1,0,1,0,0,0,0, 0,0,0,0,0,0,0,1,0,0,0,0, 0,0,0,0,0,0,0,0,0,1,0,0 ], shape=(1, 1, 48), dtype=tf.float32)
## layer_2 = tf.keras.layers.Dense(256, kernel_initializer=tf.constant_initializer(1.))
layer_2 = tf.keras.layers.LSTM(32, kernel_initializer=tf.constant_initializer(1.))
b_out = layer_2(group_1_ShoryuKen_Left)
layer_2.set_weights(layer_1.get_weights())

Save and restore for a CNN based Denoising Network Tensorflow

My question is about restoring a trained denoising model.
I have my network defined in the following way:
Conv1->relu1->Conv2->relu2->Conv3->relu3->Deconv1
The tf.variable_scope(name) of each layer matches the names above.
My loss, optimizer and accuracy are defined with tf.name_scope.
When I try to restore the loss function, it also asks for labels (which I don't have).
feed_dict={x:input, y:labels}
sess.run('loss',feed_dict)
Can anyone please help me understand how to test this? Which operation should I restore?
Do I have to call all the layers, pass the input and check the loss (MSE)?
I checked many examples, but they all seem to be classification problems, where defining a softmax on the logits at the end works.
Edit:
Below is my code, which shows how tf.name_scope and tf.variable_scope are defined. I feel I may have to bring in the whole set of layers to test a new image. Is that right?
def new_conv_layer(input, num_input_channels, filter_size, num_filters, name):
    with tf.variable_scope(name):
        # Shape of the filter-weights for the convolution
        shape = [filter_size, filter_size, num_input_channels, num_filters]
        # Create new weights (filters) with the given shape
        weights = tf.Variable(tf.truncated_normal(shape, stddev=0.5))
        # Create new biases, one for each filter
        biases = tf.Variable(tf.constant(0.05, shape=[num_filters]))
        # Note: the convolution uses this second variable, not `weights`,
        # so the returned `weights` are never trained.
        filters = tf.Variable(tf.truncated_normal(shape, stddev=0.5))
        # TensorFlow operation for convolution
        layer = tf.nn.conv2d(input=input, filter=filters, strides=[1,1,1,1], padding='SAME')
        # Add the biases to the results of the convolution.
        layer += biases
        return layer, weights
def new_relu_layer(input, name):
    with tf.variable_scope(name):
        # TensorFlow operation for ReLU
        layer = tf.nn.relu(input)
        return layer
def new_pool_layer(input, name):
    with tf.variable_scope(name):
        # TensorFlow operation for max pooling
        layer = tf.nn.max_pool(value=input, ksize=[1, 1, 1, 1], strides=[1, 1, 1, 1], padding='SAME')
        return layer
def new_layer(inputs, filters, kernel_size, strides, padding, name):
    with tf.variable_scope(name):
        layer = tf.layers.conv2d_transpose(inputs=inputs, filters=filters, kernel_size=kernel_size, strides=strides, padding=padding, data_format='channels_last')
        return layer
layer_conv1, weights_conv1 = new_conv_layer(input=yTraininginput, num_input_channels=1, filter_size=5, num_filters=32, name="conv1")
layer_relu1 = new_relu_layer(layer_conv1, name="relu1")
layer_conv2, weights_conv2 = new_conv_layer(input=layer_relu1, num_input_channels=32, filter_size=5, num_filters=64, name="conv2")
layer_relu2 = new_relu_layer(layer_conv2, name="relu2")
layer_conv3, weights_conv3 = new_conv_layer(input=layer_relu2, num_input_channels=64, filter_size=5, num_filters=128, name="conv3")
layer_relu3 = new_relu_layer(layer_conv3, name="relu3")
layer_deconv1 = new_layer(inputs=layer_relu3, filters=1, kernel_size=[5,5], strides=[1,1], padding='same', name='deconv1')
layer_relu4 = new_relu_layer(layer_deconv1, name="relu4")
layer_conv4, weights_conv4 = new_conv_layer(input=layer_relu4, num_input_channels=1, filter_size=5, num_filters=128, name="conv4")
layer_relu5 = new_relu_layer(layer_conv4, name="relu5")
layer_deconv2 = new_layer(inputs=layer_relu5, filters=1, kernel_size=[5,5], strides=[1,1], padding='same', name='deconv2')
layer_relu6 = new_relu_layer(layer_deconv2, name="relu6")
# Use mean squared error as the cost function (the variable name is historical)
with tf.name_scope("loss"):
    cross_entropy = tf.losses.mean_squared_error(labels=xTraininglabel, predictions=layer_relu6)
# Use Adam Optimizer
with tf.name_scope("optimizer"):
    optimizer = tf.train.AdamOptimizer(learning_rate=1e-6).minimize(loss=cross_entropy)
# Accuracy
with tf.name_scope("accuracy"):
    accuracy = tf.image.psnr(a=layer_relu6, b=xTraininglabel, max_val=1.0)
Try viewing the graph of your code in TensorBoard and get the operation name of the last layer (in your case deconv4), something like the image below.
Then try loading the tensor with the code below:
operation = graph.get_tensor_by_name("<operationname:0>")
This should work, as your layers are interconnected.
Let me know if this worked!
Operation Image
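The loss and accuracy nodes in the question need a reference image, whereas fetching only the output tensor does not. If a clean reference happens to be available at test time, the same quantities can be computed outside the graph from the fetched output. The sketch below does this in plain Python on flattened pixel lists, using the standard PSNR definition 10 * log10(max_val^2 / MSE); the pixel values are made up for illustration.

```python
import math


def mse(pred, target):
    """Mean squared error between two equally long pixel sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB for pixels in [0, max_val]."""
    err = mse(pred, target)
    if err == 0:
        return float("inf")
    return 10.0 * math.log10(max_val ** 2 / err)


# Hypothetical flattened images with values in [0, 1].
denoised = [0.0, 0.5, 1.0, 0.5]
reference = [0.0, 0.5, 0.9, 0.6]
error = mse(denoised, reference)
quality_db = psnr(denoised, reference)
```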

Keras Conv2D custom kernel initialization

I need to initialize custom Conv2D kernels with weights
W = a1b1 + a2b2 + ... + anbn
where W = custom weight of Conv2D layer to initialise with
a = random weight Tensors as keras.backend.variable(np.random.uniform()), shape=(64, 1, 10)
b = fixed basis filters defined as keras.backend.constant(...), shape=(10, 11, 11)
W = K.sum(a[:, :, :, None, None] * b[None, None, :, :, :], axis=2) #shape=(64, 1, 11, 11)
I want my model to update the 'W' values with only changing the 'a's while keeping the 'b's constant.
I pass the custom 'W's as
Conv2D(64, kernel_size=(11, 11), activation='relu', kernel_initializer=kernel_init_L1)(img)
where kernel_init_L1 returns keras.backend.variable(K.reshape(w_L1, (11, 11, 1, 64)))
Problem:
I am not sure if this is the correct way to do this. Is it possible to specify in Keras which weights are trainable and which are not? I know that layers can be set to trainable = True, but I am not sure about individual weights.
I think the implementation is incorrect because I get similar results from my model with or without the custom initializations.
It would be immensely helpful if someone could point out any mistakes in my approach or provide a way to verify it.
Warning about your shapes: If your kernel size is (11,11), and assuming you have 64 input channels and 1 output channel, your final kernel shape must be (11,11,64,1).
You should probably be going for a[None,None] and b[:,:,:,None,None].
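The shape bookkeeping can be checked in NumPy before touching Keras. The sketch below assumes, as in the question, that a is indexed (output channel, input channel, basis) and b is indexed (basis, height, width); the random values are placeholders. It also shows that moving the axes needs a transpose: a plain reshape to (11, 11, 1, 64), as in the question, produces the right shape but scrambles the entries.

```python
import numpy as np

rng = np.random.RandomState(0)
n_out, n_in, n_basis, k = 64, 1, 10, 11
a = rng.uniform(size=(n_out, n_in, n_basis))  # coefficients (trainable in the model)
b = rng.normal(size=(n_basis, k, k))          # fixed basis filters

# Broadcast form from the question: result has shape (64, 1, 11, 11).
w = np.sum(a[:, :, :, None, None] * b[None, None, :, :, :], axis=2)

# Keras kernels are laid out as (kernel_h, kernel_w, in_channels, out_channels).
w_transposed = np.transpose(w, (2, 3, 1, 0))

# The same contraction written directly in the target layout.
w_einsum = np.einsum("oik,kxy->xyio", a, b)

# Reshape gives the same shape but a different arrangement of values.
w_reshaped = w.reshape(k, k, n_in, n_out)
```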
class CustomConv2D(Conv2D):
    def __init__(self, filters, kernel_size, kernelB = None, **kwargs):
        super(CustomConv2D, self).__init__(filters, kernel_size,**kwargs)
        self.kernelB = kernelB
    def build(self, input_shape):
        #use the input_shape to calculate the shapes of A and B
        #if needed, pay attention to the "data_format" used.
        #this is an actual weight, because it uses `self.add_weight`
        self.kernelA = self.add_weight(
            shape=shape_of_kernel_A + (1,1), #or (1,1) + shape_of_A
            initializer='glorot_uniform', #or select another
            name='kernelA',
            regularizer=self.kernel_regularizer,
            constraint=self.kernel_constraint)
        #this is an ordinary var that will participate in the calculation
        #not a weight, not updated
        if self.kernelB is None:
            self.kernelB = K.constant(....)
            #use the shape already containing the new axes
        #in the original conv layer, this property would be the actual kernel,
        #now it's just a var that will be used in the original's "call" method
        self.kernel = K.sum(self.kernelA * self.kernelB, axis=2)
        #important: the resulting shape should be:
        #(kernelSizeX, kernelSizeY, input_channels, output_channels)
        #the following are remains of the original code for "build" in Conv2D
        #use_bias is True by default
        if self.use_bias:
            self.bias = self.add_weight(shape=(self.filters,),
                                        initializer=self.bias_initializer,
                                        name='bias',
                                        regularizer=self.bias_regularizer,
                                        constraint=self.bias_constraint)
        else:
            self.bias = None
        # Set input spec.
        self.input_spec = InputSpec(ndim=self.rank + 2,
                                    axes={channel_axis: input_dim})
        self.built = True
Hints for custom layers
When you create a custom layer from zero (derived from Layer), you should have these methods:
__init__(self, ... parameters ...) - this is the creator, it's called when you create a new instance of your layer. Here, you store the values the user passed as parameters. (In a Conv2D, the init would have the "filters", "kernel_size", etc.)
build(self, input_shape) - this is where you should create the weights (all learnable vars are created here, based on the input shape)
compute_output_shape(self,input_shape) - here you return the output shape based on the input shape
call(self,inputs) - Here you perform the actual layer calculations
Since we're not creating this layer from zero, but deriving it from Conv2D, everything is ready, all we did was to "change" the build method and replace what would be considered the kernel of the Conv2D layer.
More on custom layers: https://keras.io/layers/writing-your-own-keras-layers/
The call method for conv layers is here in class _Conv(Layer):.

Duplicate/Replicate tensorflow layers with same properties to form a graph

The task I am trying to implement with a neural network differs a bit from the most common usage. I am trying to simulate a physical process by propagating something from the input to the output layer, optimizing the network's weights, which represent physical properties.
Therefore I need, for example, a 150-layer network where each layer has the same properties in the form
mx+b
where x is the variable I would like to optimize and m is an external factor which is the same for each layer (b is not in use right now).
I would like to automate the process of creating the graph rather than copy/paste each layer. So is there a function to copy the structure of the first layer to all following layers?
In TensorFlow it should look something like this:
graph = tf.Graph()
with graph.as_default():
    # Input data.
    tf_input = tf.placeholder(tf.float32, shape=(n_data, n))
    tf_spatial_grid = tf.constant(m_index_mat)
    tf_ph_unit = tf.constant(m_unit_mat)
    tf_output = tf.placeholder(tf.float32, shape=(n_data, n))
    # new hidden layer 1
    hidden_weights = tf.Variable( tf.truncated_normal([n*n, 1]) )
    hidden_layer = tf.matmul( tf.matmul( tf_input, hidden_weights), tf_ph_unit)
    # new hidden layer 2
    hidden_weights_2 = tf.Variable( tf.truncated_normal([n*n, 1]) )
    hidden_layer_2 = tf.matmul( tf.matmul( hidden_layer, hidden_weights_2), tf_ph_unit)
    ......
    # new hidden layer n
    hidden_weights_n = tf.Variable( tf.truncated_normal([n*n, 1]) )
    hidden_layer_n = tf.matmul( tf.matmul( hidden_layer_m, hidden_weights_n), tf_ph_unit)
    ...
So is there any option that automates this process somehow? Maybe I'm missing something
I really appreciate any help!
The easiest way to accomplish that is to create a function that builds your layer and simply invoke the function multiple times, possibly in a loop.
For instance:
def layer(input):
    hidden_weights = tf.Variable( tf.truncated_normal([n*n, 1]) )
    hidden_layer = tf.matmul( tf.matmul( input, hidden_weights), tf_ph_unit)
    return hidden_layer
and then:
input = tf_input
for i in range(10):
    input = layer(input)
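The same build-in-a-loop pattern can be tried without TensorFlow. The sketch below is a scalar toy version, assuming each layer multiplies its input by a per-layer weight and a shared external factor m; make_layer and build_network are hypothetical names. Keeping the per-layer weights in a list makes it easy to reach each of them later, which a 150-layer network would need.

```python
def make_layer(weight, m):
    """Return a layer that computes m * weight * value."""
    def layer(value):
        return m * weight * value
    return layer


def build_network(weights, m):
    """Chain one layer per weight; m is shared by all layers."""
    layers = [make_layer(w, m) for w in weights]

    def network(value):
        for layer in layers:
            value = layer(value)
        return value

    return network


weights = [2.0, 0.5, 3.0]   # one entry per layer; 150 entries work the same way
network = build_network(weights, m=2.0)
output = network(1.0)
```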

How to add regularizations in TensorFlow?

I found that in much of the available neural network code implemented with TensorFlow, regularization terms are added manually as an additional term in the loss value.
My questions are:
Is there a more elegant or recommended way of regularization than doing it manually?
I also found that get_variable has an argument regularizer. How should it be used? According to my observation, if we pass a regularizer to it (such as tf.contrib.layers.l2_regularizer), a tensor representing the regularization term will be computed and added to a graph collection named tf.GraphKeys.REGULARIZATION_LOSSES. Will that collection be used automatically by TensorFlow (e.g. by optimizers when training)? Or am I expected to use that collection myself?
As you say in the second point, using the regularizer argument is the recommended way. You can use it in get_variable, or set it once in your variable_scope and have all your variables regularized.
The losses are collected in the graph, and you need to add them to your cost function manually, like this:
reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
reg_constant = 0.01 # Choose an appropriate one.
loss = my_normal_loss + reg_constant * sum(reg_losses)
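What ends up in the total loss can be followed with a small framework-free sketch. It assumes the L2 penalty has the form scale * sum(w**2) / 2, which is how tf.contrib.layers.l2_regularizer is defined via tf.nn.l2_loss; the list regularization_losses is a stand-in for the graph collection, and create_weights is a hypothetical helper.

```python
def l2_penalty(weights, scale):
    """scale * sum(w**2) / 2 for a flat list of weights."""
    return scale * sum(w * w for w in weights) / 2.0


# Stand-in for the tf.GraphKeys.REGULARIZATION_LOSSES collection.
regularization_losses = []


def create_weights(values, scale):
    """Register the penalty when the weights are created, then return them."""
    regularization_losses.append(l2_penalty(values, scale))
    return values


w1 = create_weights([1.0, -2.0], scale=1.0)
w2 = create_weights([3.0], scale=1.0)

data_loss = 0.25
reg_constant = 0.01  # Choose an appropriate one.
total_loss = data_loss + reg_constant * sum(regularization_losses)
```

Nothing is added to the loss unless the collected penalties are summed in explicitly, which mirrors the point made in the answer above.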
A few aspects of the existing answer were not immediately clear to me, so here is a step-by-step guide:
Define a regularizer. This is where the regularization constant can be set, e.g.:
regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)
Create variables via:
weights = tf.get_variable(
name="weights",
regularizer=regularizer,
...
)
Equivalently, variables can be created via the regular weights = tf.Variable(...) constructor, followed by tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, weights).
Define some loss term and add the regularization term:
reg_variables = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
reg_term = tf.contrib.layers.apply_regularization(regularizer, reg_variables)
loss += reg_term
Note: It looks like tf.contrib.layers.apply_regularization is implemented as an AddN, so more or less equivalent to sum(reg_variables).
I'll provide a simple correct answer since I didn't find one. You need two simple steps, the rest is done by tensorflow magic:
Add regularizers when creating variables or layers:
tf.layers.dense(x, kernel_regularizer=tf.contrib.layers.l2_regularizer(0.001))
# or
tf.get_variable('a', regularizer=tf.contrib.layers.l2_regularizer(0.001))
Add the regularization term when defining loss:
loss = ordinary_loss + tf.losses.get_regularization_loss()
Another option to do this with the contrib.learn library is as follows, based on the Deep MNIST tutorial on the Tensorflow website. First, assuming you've imported the relevant libraries (such as import tensorflow.contrib.layers as layers), you can define a network in a separate method:
def easier_network(x, reg):
    """ A network based on tf.contrib.learn, with input `x`. """
    with tf.variable_scope('EasyNet'):
        out = layers.flatten(x)
        out = layers.fully_connected(out,
                num_outputs=200,
                weights_initializer = layers.xavier_initializer(uniform=True),
                weights_regularizer = layers.l2_regularizer(scale=reg),
                activation_fn = tf.nn.tanh)
        out = layers.fully_connected(out,
                num_outputs=200,
                weights_initializer = layers.xavier_initializer(uniform=True),
                weights_regularizer = layers.l2_regularizer(scale=reg),
                activation_fn = tf.nn.tanh)
        out = layers.fully_connected(out,
                num_outputs=10, # Because there are ten digits!
                weights_initializer = layers.xavier_initializer(uniform=True),
                weights_regularizer = layers.l2_regularizer(scale=reg),
                activation_fn = None)
        return out
Then, in a main method, you can use the following code snippet:
def main(_):
    mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)
    x = tf.placeholder(tf.float32, [None, 784])
    y_ = tf.placeholder(tf.float32, [None, 10])
    # Make a network with regularization
    y_conv = easier_network(x, FLAGS.regu)
    weights = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'EasyNet')
    print("")
    for w in weights:
        shp = w.get_shape().as_list()
        print("- {} shape:{} size:{}".format(w.name, shp, np.prod(shp)))
    print("")
    reg_ws = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES, 'EasyNet')
    for w in reg_ws:
        shp = w.get_shape().as_list()
        print("- {} shape:{} size:{}".format(w.name, shp, np.prod(shp)))
    print("")
    # Make the loss function `loss_fn` with regularization.
    cross_entropy = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
    loss_fn = cross_entropy + tf.reduce_sum(reg_ws)
    train_step = tf.train.AdamOptimizer(1e-4).minimize(loss_fn)
To get this to work you need to follow the MNIST tutorial I linked to earlier and import the relevant libraries, but it's a nice exercise to learn TensorFlow and it's easy to see how the regularization affects the output. If you apply a regularization as an argument, you can see the following:
- EasyNet/fully_connected/weights:0 shape:[784, 200] size:156800
- EasyNet/fully_connected/biases:0 shape:[200] size:200
- EasyNet/fully_connected_1/weights:0 shape:[200, 200] size:40000
- EasyNet/fully_connected_1/biases:0 shape:[200] size:200
- EasyNet/fully_connected_2/weights:0 shape:[200, 10] size:2000
- EasyNet/fully_connected_2/biases:0 shape:[10] size:10
- EasyNet/fully_connected/kernel/Regularizer/l2_regularizer:0 shape:[] size:1.0
- EasyNet/fully_connected_1/kernel/Regularizer/l2_regularizer:0 shape:[] size:1.0
- EasyNet/fully_connected_2/kernel/Regularizer/l2_regularizer:0 shape:[] size:1.0
Notice that the regularization portion gives you three items, based on the items available.
With regularizations of 0, 0.0001, 0.01, and 1.0, I get test accuracy values of 0.9468, 0.9476, 0.9183, and 0.1135, respectively, showing the dangers of high regularization terms.
If anyone's still looking, I'd just like to add on that in tf.keras you may add weight regularization by passing them as arguments in your layers. An example of adding L2 regularization taken wholesale from the Tensorflow Keras Tutorials site:
model = keras.models.Sequential([
    keras.layers.Dense(16, kernel_regularizer=keras.regularizers.l2(0.001),
                       activation=tf.nn.relu, input_shape=(NUM_WORDS,)),
    keras.layers.Dense(16, kernel_regularizer=keras.regularizers.l2(0.001),
                       activation=tf.nn.relu),
    keras.layers.Dense(1, activation=tf.nn.sigmoid)
])
There's no need to manually add in the regularization losses with this method as far as I know.
Reference: https://www.tensorflow.org/tutorials/keras/overfit_and_underfit#add_weight_regularization
I tested tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES) and tf.losses.get_regularization_loss() with one l2_regularizer in the graph, and found that they return the same value. Judging by the magnitude of that value, I guess the regularization constant has already been applied through the scale parameter of tf.contrib.layers.l2_regularizer.
If you have a CNN you may do the following:
In your model function:
conv = tf.layers.conv2d(inputs=input_layer,
                        filters=32,
                        kernel_size=[3, 3],
                        kernel_initializer=tf.contrib.layers.xavier_initializer(),
                        kernel_regularizer=tf.contrib.layers.l2_regularizer(1e-5),
                        padding="same",
                        activation=None)
...
In your loss function:
onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=num_classes)
loss = tf.losses.softmax_cross_entropy(onehot_labels=onehot_labels, logits=logits)
regularization_losses = tf.losses.get_regularization_losses()
loss = tf.add_n([loss] + regularization_losses)
cross_entropy = tf.losses.softmax_cross_entropy(
    logits=logits, onehot_labels=labels)
l2_loss = weight_decay * tf.add_n(
    [tf.nn.l2_loss(tf.cast(v, tf.float32)) for v in tf.trainable_variables()])
loss = cross_entropy + l2_loss
Some of the answers confused me even more. Here are two methods that should make it clear.
#1. adding all regs by hand
var1 = tf.get_variable(name='v1',shape=[1],dtype=tf.float32)
var2 = tf.Variable(name='v2',initial_value=1.0,dtype=tf.float32)
regularizer = tf.contrib.layers.l1_regularizer(0.1)
reg_term = tf.contrib.layers.apply_regularization(regularizer,[var1,var2])
#here reg_term is a scalar
#2. added and read automatically, but requires get_variable
with tf.variable_scope('x',
        regularizer=tf.contrib.layers.l2_regularizer(0.1)):
    var1 = tf.get_variable(name='v1',shape=[1],dtype=tf.float32)
    var2 = tf.get_variable(name='v2',shape=[1],dtype=tf.float32)
reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
#here reg_losses is a list, should be summed
Then it can be added to the total loss.
tf.GraphKeys.REGULARIZATION_LOSSES will not be added automatically, but there is a simple way to add them:
reg_loss = tf.losses.get_regularization_loss()
total_loss = loss + reg_loss
tf.losses.get_regularization_loss() uses tf.add_n to sum the entries of tf.GraphKeys.REGULARIZATION_LOSSES element-wise. tf.GraphKeys.REGULARIZATION_LOSSES will typically be a list of scalars, calculated using regularizer functions. It gets entries from calls to tf.get_variable that have the regularizer parameter specified. You can also add to that collection manually. That would be useful when using tf.Variable and also when specifying activity regularizers or other custom regularizers. For instance:
#This will add an activity regularizer on y to the regloss collection
regularizer = tf.contrib.layers.l2_regularizer(0.1)
y = tf.nn.sigmoid(x)
act_reg = regularizer(y)
tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, act_reg)
(In this example it would presumably be more effective to regularize x, as y really flattens out for large x.)
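The remark about y flattening out can be checked numerically. The sketch below is plain Python rather than TensorFlow and again assumes the L2 penalty has the form scale * sum(v**2) / 2; the sample inputs are arbitrary. It compares the penalty on y = sigmoid(x) with the penalty on x itself.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def l2_penalty(values, scale):
    """scale * sum(v**2) / 2 for a flat list of values."""
    return scale * sum(v * v for v in values) / 2.0


scale = 0.1
sample_inputs = (2.0, 10.0, 100.0)
penalty_on_x = {x: l2_penalty([x], scale) for x in sample_inputs}
penalty_on_y = {x: l2_penalty([sigmoid(x)], scale) for x in sample_inputs}
```

The penalty on y can never exceed scale / 2 and barely changes between x = 10 and x = 100, so it gives almost no signal once the sigmoid saturates, while the penalty on x keeps growing.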
