I have a script in tensorflow which contains custom tensorflow ops. I want to port the code to keras, and I am not sure how to call the custom ops within keras code.
I want to use tensorflow within keras, so the tutorial I have found so far describes the opposite of what I want: https://blog.keras.io/keras-as-a-simplified-interface-to-tensorflow-tutorial.html.
I also read about Lambda layers that can wrap an arbitrary custom function, yet I did not see an example for tf ops.
If you could provide a code snippet with a simple example of how to do that, I would be very grateful. For example, assuming the tf op is:
outC = my_custom_op(inA, inB)
---EDIT:
A similar problem has been described here - essentially calling a custom op in keras - however I cannot grasp how to apply that solution to another example I want, for instance this one. This custom tf op is first compiled (for gpu) and then so far used within tensorflow as here, see line 40. It is clear to me how to use a custom (lambda) function wrapped in a Lambda layer; what I would like to understand is how to use the compiled custom ops if I use keras.
You can wrap arbitrary tensorflow functions in a keras Lambda layer and add them to your model. Minimal working example from this answer:
import tensorflow as tf
from keras.layers import Dense, Lambda, Input
from keras.models import Model
W = tf.random_normal(shape=(128,20))
b = tf.random_normal(shape=(20,))
inp = Input(shape=(10,))
x = Dense(128)(inp)
# Custom linear transformation
y = Lambda(lambda x: tf.matmul(x, W) + b, name='custom_layer')(x)
model = Model(inp, y)
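The same pattern should extend to a compiled custom op: load the shared library with tf.load_op_library and call the resulting op inside the Lambda. A rough sketch, where the library path, the op name my_custom_op and the input shapes are placeholders rather than something taken from your code:
import tensorflow as tf
from keras.layers import Input, Lambda
from keras.models import Model

# Load the compiled op library (path and op name are placeholders)
custom_module = tf.load_op_library('/path/to/my_custom_op.so')

inA = Input(shape=(32,))
inB = Input(shape=(32,))

# Wrap the compiled op exactly like any other tensorflow function
outC = Lambda(lambda t: custom_module.my_custom_op(t[0], t[1]),
              name='custom_op_layer')([inA, inB])

model = Model(inputs=[inA, inB], outputs=outC)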
Related
I've got a (TF2) saved-model full of training ops clutter and I'm trying to optimize it for inference using grappler, but I want to save it back to a TF2 saved-model subsequently (to keep the general workflow away from TF1).
I currently have:
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2_as_graph
from tensorflow.lite.python.util import run_graph_optimizations, get_grappler_config

# Load the saved-model and get the inference concrete function
sm = tf.saved_model.load('path/to/savedmodel/dir')
func = sm.signatures['serving_default']

# Replace variables with constants in order to get rid of the training clutter
frozen_func, graph_def = convert_variables_to_constants_v2_as_graph(func)

# Use grappler to optimize the concrete function graph after replacing vars with constants
input_tensors = [tsr for tsr in frozen_func.inputs if tsr.dtype != tf.resource]
output_tensors = frozen_func.outputs
graph_def = run_graph_optimizations(graph_def, input_tensors, output_tensors,
                                    config=get_grappler_config(["constfold", "function"]),
                                    graph=frozen_func.graph)

# Here the intention is to somehow reconvert the optimized graph-def into a concrete function
# and subsequently re-save that as a TF2 (not TF1!) saved-model. Is there a way to do that?
frozen_func_graph = tf.Graph()
with frozen_func_graph.as_default():
    tf.import_graph_def(graph_def, name='')
# ... what now?
The issue is that, since direct tf.Graph usage has been deprecated in TF2, I want to convert the optimized graph back into a TF2 saved-model. I was thinking of doing that by somehow manually constructing a ConcreteFunction wrapping this optimized graph, but as far as I've researched, there seems to be no way to do that. This would basically mean I'd still have to use TF1 compat APIs, which ideally I'd like to avoid.
The ugly (ugly) option I'd really like to avoid (haven't tried it yet, but it would probably work) would be to:
use v1 APIs to construct a TF1 saved-model using tf.compat.v1.saved_model.builder.SavedModelBuilder and save the TF1 saved-model
load back the TF1 saved-model using v2 API (so tf.saved_model.load instead of tf.compat.v1.saved_model.load, the former converts a TF1 saved-model automatically to a TF2 saved-model)
(re)save the converted TF2 saved-model
Is there a way to do this nicely? Preferably also without being forced to dump the optimized saved-model to disk if I don't want to; it seems that constructing saved-models in memory is not possible? (That's not such a big issue, though.)
Finally got it; it's not ideal since I use (more or less) internal TF2 API calls, but at least TF1 compat APIs are not used at all. Here's the full code.
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2_as_graph
from tensorflow.lite.python.util import run_graph_optimizations, get_grappler_config
from tensorflow.python.tools.optimize_for_inference_lib import optimize_for_inference
from tensorflow.python.eager import context, wrap_function

# Load the saved-model and get the inference concrete function
sm = tf.saved_model.load('path/to/savedmodel/dir')
func = sm.signatures['serving_default']  # note: key might differ according to what your model's inference function is

# Replace variables with constants in order to get rid of the training clutter
frozen_func, graph_def = convert_variables_to_constants_v2_as_graph(func)

# Use grappler to optimize the concrete function graph after replacing vars with constants
input_tensors = [tsr for tsr in frozen_func.inputs if tsr.dtype != tf.resource]
output_tensors = frozen_func.outputs
graph_def = run_graph_optimizations(graph_def, input_tensors, output_tensors,
                                    config=get_grappler_config(["constfold", "function"]),
                                    graph=frozen_func.graph)

# Optimize for inference
input_tsr_names = [tsr.name for tsr in input_tensors]
output_tsr_names = [tsr.name for tsr in output_tensors]
input_node_names = list(set([tsr_name.rsplit(':', 1)[0] for tsr_name in input_tsr_names]))
output_node_names = list(set([tsr_name.rsplit(':', 1)[0] for tsr_name in output_tsr_names]))
graph_def = optimize_for_inference(input_graph_def=graph_def,
                                   input_node_names=input_node_names,
                                   placeholder_type_enum=tf.dtypes.float32.as_datatype_enum,
                                   output_node_names=output_node_names,
                                   toco_compatible=True)

# This next part is inspired by the _construct_concrete_function function here:
# https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/framework/convert_to_constants.py#L1062
# Remove old functions to use updated functions from graph def
# (not sure if this is actually needed here, didn't look into it)
for f in graph_def.library.function:
    if context.context().has_function(f.signature.name):
        context.context().remove_function(f.signature.name)

# GraphDef to concrete function
opt_frozen_func = wrap_function.function_from_graph_def(graph_def,
                                                        input_tsr_names,
                                                        output_tsr_names)

# Wrap the concrete function into a module to export it as a saved-model
class OptimizedFrozenModel(tf.Module):
    def __init__(self, name=None):
        super().__init__(name)

module = OptimizedFrozenModel()
module.__call__ = opt_frozen_func

# Export the frozen & optimized saved-model
tf.saved_model.save(module, 'path/to/optimized_savedmodel/dir', signatures=opt_frozen_func)
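For completeness, a quick sketch of how the re-exported model could be consumed afterwards. Since a single concrete function is passed to signatures, it is stored under the default serving key; the actual input names, shapes and dtypes depend on your original model, so the call below is only a commented placeholder:
import tensorflow as tf

# Reload the optimized saved-model and grab the serving signature
reloaded = tf.saved_model.load('path/to/optimized_savedmodel/dir')
infer = reloaded.signatures['serving_default']

# Inspect the expected inputs/outputs before calling the signature
print(infer.structured_input_signature)
print(infer.structured_outputs)

# Example call (the keyword name 'input_1' is a placeholder for your real input name):
# outputs = infer(input_1=tf.zeros([1, 224, 224, 3], dtype=tf.float32))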
I'm trying to train a model in mixed precision. However, I want a few of the layers to be in full precision for stability reasons. How do I force an individual layer to be float32 when using torch.autocast? In particular, I'd like for this to be ONNX-compilable.
Is it something like:
with torch.cuda.amp.autocast(enabled=False, dtype=torch.float32):
    out = my_unstable_layer(inputs.float())
Edit:
Looks like this is indeed the official method. See the torch docs.
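For reference, a minimal sketch of that pattern inside a module's forward pass; the layer names and sizes here are made up:
import torch
import torch.nn as nn

class MixedPrecisionBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.stable = nn.Linear(16, 16)    # fine under autocast
        self.unstable = nn.Linear(16, 16)  # keep this one in full precision

    def forward(self, x):
        x = self.stable(x)  # runs in the ambient autocast dtype
        # Locally disable autocast and force float32 inputs for stability
        with torch.cuda.amp.autocast(enabled=False):
            x = self.unstable(x.float())
        return x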
I think the motivation of torch.autocast is to automate the reduction of precision (not the increase).
If you have functions that need a particular dtype, you should consider using custom_fwd:
import torch

@torch.cuda.amp.custom_fwd(cast_inputs=torch.complex128)
def get_custom(x):
    print('  Decorated function received', x.dtype)

def regular_func(x):
    print('  Regular function received', x.dtype)
    get_custom(x)

x = torch.tensor(0.0, dtype=torch.half, device='cuda')
with torch.cuda.amp.autocast(False):
    print('autocast disabled')
    regular_func(x)
with torch.cuda.amp.autocast(True):
    print('autocast enabled')
    regular_func(x)
autocast disabled
Regular function received torch.float16
Decorated function received torch.float16
autocast enabled
Regular function received torch.float16
Decorated function received torch.complex128
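The documented home of custom_fwd is the forward of a custom torch.autograd.Function, usually paired with custom_bwd; here is a minimal sketch that forces float32 inputs under autocast (the op itself is arbitrary and only for illustration):
import torch

class SquareInFloat32(torch.autograd.Function):
    @staticmethod
    @torch.cuda.amp.custom_fwd(cast_inputs=torch.float32)
    def forward(ctx, x):
        # When an autocast region is active, x arrives already cast to float32
        ctx.save_for_backward(x)
        return x * x

    @staticmethod
    @torch.cuda.amp.custom_bwd
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return 2 * x * grad_output

x = torch.tensor(3.0, dtype=torch.half, device='cuda')
with torch.cuda.amp.autocast():
    y = SquareInFloat32.apply(x)
print(y.dtype)  # torch.float32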
Edit: Using torchscript
I am not sure how much you can rely on this, due to a comment in the documentation; however, the comment is apparently outdated.
Here is an example where I trace the model with autocast enabled, freeze it, and then use it; the value is indeed cast to the specified type:
class Cast(torch.nn.Module):
    @torch.cuda.amp.custom_fwd(cast_inputs=torch.float64)
    def forward(self, x):
        return x

x = torch.tensor(0.0, dtype=torch.half, device='cuda')
with torch.cuda.amp.autocast(True):
    model = torch.jit.trace(Cast().eval(), x)
    model = torch.jit.freeze(model)
print(model(x).dtype)
torch.float64
But I suggest you validate this approach before using it in a serious application.
In one of my Lambda layers, I used from keras.layers import concatenate to concatenate two tensors; it worked without any problem during training, and I successfully saved the model files.
However, when I'm loading the model, it throws me this error:
NameError: name 'concatenate' is not defined
Does anyone know what might be wrong? I've imported concatenate before I load the model.
The lambda layer looks like this:
from keras import backend as K
from keras.layers import concatenate

def concat_l1_l2(vests):
    l1, l2 = vests
    l1 = K.l2_normalize(l1, axis=-1)
    l2 = K.l2_normalize(l2, axis=-1)
    return concatenate([l1, l2])
I had the same issue when loading a model from a json file; try the following line (it worked for me):
from keras.layers import concatenate
model_from_json(model_file, custom_objects={'concatenate': concatenate})
Maybe the following will solve your problem.
Try to pass your custom function to keras' load function, i.e.
load_model(model_path, custom_objects={"concat_l1_l2": concat_l1_l2})
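Putting the two suggestions together for the lambda in the question, the load call could look like this; it assumes concat_l1_l2 is defined exactly as above, the model was saved with model.save, and 'my_model.h5' is a placeholder file name:
from keras.layers import concatenate
from keras.models import load_model

# Register the lambda function itself plus every name its body refers to
model = load_model('my_model.h5',
                   custom_objects={'concat_l1_l2': concat_l1_l2,
                                   'concatenate': concatenate})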
I have a very short question, which probably has a very simple answer, but I just can't figure it out, although I've tried for hours now.
I'm using Tensorflow Estimator and I want to access the global step within my model_fn. I've tried tf.train.get_global_step, which returns me a Tensor. I need the global_step as an integer though (or as a string)!
So I've tried eval() (= tf.get_default_session().run(t)), but it doesn't work.
Cheers!
You can use tf.cast to cast the Tensor to int or string.
For example,
tf.cast(tf.train.get_global_step(), dtype=tf.int32)
See the reference here.
One way would be to parse it from the latest checkpoint file in the model_dir.
So assuming you can pass the model_dir into the model_fn (either through the params argument of tf.estimator.Estimator(..., params={'model_dir': 'path/to/model_dir'}) or through tf.flags.FLAGS), you can then use this utility function:
import os
import tensorflow as tf

def get_global_step_from_model_dir(model_dir):
    latest_checkpoint_file = tf.train.latest_checkpoint(model_dir)
    if latest_checkpoint_file is None:
        return 0
    else:
        return int(os.path.basename(latest_checkpoint_file).split('-')[-1])
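A hypothetical sketch of wiring this up through params (the directory path is a placeholder, and the model_fn body is only a fragment showing where the call would go):
import tensorflow as tf

def model_fn(features, labels, mode, params):
    # Plain Python int parsed from the latest checkpoint filename (0 if no checkpoint yet)
    current_step = get_global_step_from_model_dir(params['model_dir'])
    print('model_fn built at global step', current_step)
    # ... use current_step wherever an integer is needed, then build the usual
    # loss / train_op / predictions and return an EstimatorSpec ...

estimator = tf.estimator.Estimator(model_fn=model_fn,
                                   model_dir='path/to/model_dir',
                                   params={'model_dir': 'path/to/model_dir'})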
Caffe has a layer type "Python".
For instance, this layer type can be used as a loss layer.
On other occasions it is used as an input layer.
What is this layer type?
How can this layer be used?
Prune's and Bharat's answers give the overall purpose of a "Python" layer: a general-purpose layer which is implemented in python rather than c++.
I intend this answer to serve as a tutorial for using "Python" layers.
A Tutorial for "Python" layer
What is a "Python" layer?
Please see the excellent answers of Prune and Bharat.
Pre-requisite
In order to use a "Python" layer you need to compile caffe with the flag
WITH_PYTHON_LAYER := 1
set in 'Makefile.config'.
How to implement a "Python" layer?
A "Python" layer should be implemented as a python class derived from caffe.Layer base class. This class must have the following four methods:
import caffe

class my_py_layer(caffe.Layer):
    def setup(self, bottom, top):
        pass

    def reshape(self, bottom, top):
        pass

    def forward(self, bottom, top):
        pass

    def backward(self, top, propagate_down, bottom):
        pass
What are these methods?
def setup(self, bottom, top): This method is called once when caffe builds the net. It should check that the number of inputs (len(bottom)) and the number of outputs (len(top)) are as expected.
You should also allocate the internal parameters of the net here (i.e., self.add_blobs()); see this thread for more information.
This method has access to self.param_str - a string passed from the prototxt to the layer. See this thread for more information.
def reshape(self, bottom, top): This method is called whenever caffe reshapes the net. This function should allocate the outputs (each of the top blobs). The outputs' shape is usually related to the bottoms' shape.
def forward(self, bottom, top): Implements the forward pass from bottom to top.
def backward(self, top, propagate_down, bottom): This method implements backpropagation: it propagates the gradients from top to bottom. propagate_down is a Boolean vector of len(bottom) indicating to which of the bottoms the gradient should be propagated.
You can find some more information about the bottom and top inputs in this post.
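Putting the four methods together, a hypothetical minimal layer that simply scales its single input by two could look like this (only a sketch, names are made up):
import caffe

class ScaleByTwoLayer(caffe.Layer):
    def setup(self, bottom, top):
        # Check the expected number of inputs and outputs
        if len(bottom) != 1 or len(top) != 1:
            raise Exception('ScaleByTwoLayer expects a single bottom and a single top')

    def reshape(self, bottom, top):
        # The output has the same shape as the input
        top[0].reshape(*bottom[0].data.shape)

    def forward(self, bottom, top):
        top[0].data[...] = 2.0 * bottom[0].data

    def backward(self, top, propagate_down, bottom):
        # d(2x)/dx = 2, so just scale the incoming gradient
        if propagate_down[0]:
            bottom[0].diff[...] = 2.0 * top[0].diff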
Examples
You can see some examples of simplified python layers here, here and here.
Example of "moving average" output layer can be found here.
Trainable parameters
"Python" layer can have trainable parameters (like "Conv", "InnerProduct", etc.).
You can find more information on adding trainable parameters in this thread and this one. There's also a very simplified example in caffe git.
How to add a "Python" layer in a prototxt?
See Bharat's answer for details.
You need to add the following to your prototxt:
layer {
  name: 'rpn-data'
  type: 'Python'
  bottom: 'rpn_cls_score'
  bottom: 'gt_boxes'
  bottom: 'im_info'
  bottom: 'data'
  top: 'rpn_labels'
  top: 'rpn_bbox_targets'
  top: 'rpn_bbox_inside_weights'
  top: 'rpn_bbox_outside_weights'
  python_param {
    module: 'rpn.anchor_target_layer'  # python module name where your implementation is
    layer: 'AnchorTargetLayer'  # the name of the class implementation
    param_str: "'feat_stride': 16"  # optional parameters to the layer
  }
}
How to add a "Python" layer using pythonic NetSpec interface?
It's very simple:
import caffe
from caffe import layers as L

ns = caffe.NetSpec()
# define layers here...
ns.rpn_labels, ns.rpn_bbox_targets, \
    ns.rpn_bbox_inside_weights, ns.rpn_bbox_outside_weights = \
    L.Python(ns.rpn_cls_score, ns.gt_boxes, ns.im_info, ns.data,
             name='rpn-data',
             ntop=4,  # tell caffe to expect four output blobs
             python_param={'module': 'rpn.anchor_target_layer',
                           'layer': 'AnchorTargetLayer',
                           'param_str': '"\'feat_stride\': 16"'})
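If you then want to write the resulting net definition to a prototxt file (the file name here is arbitrary):
with open('my_net.prototxt', 'w') as f:
    f.write(str(ns.to_proto()))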
How to use a net with a "Python" layer?
Invoking python code from caffe is nothing you need to worry about. Caffe uses boost API to call python code from compiled c++.
What do you need to do?
Make sure the python module implementing your layer is in $PYTHONPATH so that when caffe imports it, it can be found.
For instance, if your module my_python_layer.py is in /path/to/my_python_layer.py then
PYTHONPATH=/path/to:$PYTHONPATH $CAFFE_ROOT/build/tools/caffe train -solver my_solver.prototxt
should work just fine.
How to test my layer?
You should always test your layer before putting it to use.
Testing the forward function is entirely up to you, as each layer has a different functionality.
Testing the backward method is easy: since this method only implements the gradient of forward, it can be numerically tested automatically!
Check out test_gradient_for_python_layer testing utility:
import numpy as np
from test_gradient_for_python_layer import test_gradient_for_python_layer

# set the inputs
input_names_and_values = [('in_cont', np.random.randn(3, 4)),
                          ('in_binary', np.random.binomial(1, 0.4, (3, 1)))]
output_names = ['out1', 'out2']
py_module = 'folder.my_layer_module_name'
py_layer = 'my_layer_class_name'
param_str = 'some params'
propagate_down = [True, False]

# call the test
test_gradient_for_python_layer(input_names_and_values, output_names,
                               py_module, py_layer, param_str,
                               propagate_down)

# you are done!
Special Notice
It is worthwhile noting that python code runs on the CPU only. Thus, if you plan to have a "Python" layer in the middle of your net, you will see a significant degradation in performance if you use a GPU. This happens because caffe needs to copy blobs from the GPU to the CPU before calling the python layer, and then copy them back to the GPU to proceed with the forward/backward pass.
This degradation is far less significant if the python layer is either an input layer or the topmost loss layer.
Update: On Sep 19th, 2017 PR #5904 was merged into master. This PR exposes GPU pointers of blobs via the python interface.
You may access blob._gpu_data_ptr and blob._gpu_diff_ptr directly from python at your own risk.
Very simply, it's a layer in which you provide the implementation code, rather than using one of the pre-defined types -- which are all backed by efficient functions.
If you want to define a custom loss function, go ahead: write it yourself, and create the layer with type Python. If you have non-standard input needs, perhaps some data-specific pre-processing, no problem: write it yourself, and create the layer with type Python.
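As an illustration of the loss case, a hand-rolled Euclidean loss in the spirit of caffe's pycaffe examples might look roughly like this; treat it as a sketch rather than a drop-in layer:
import caffe
import numpy as np

class EuclideanLossLayer(caffe.Layer):
    def setup(self, bottom, top):
        # Expect two inputs: prediction and target
        if len(bottom) != 2:
            raise Exception('Need two bottoms to compute a loss')

    def reshape(self, bottom, top):
        if bottom[0].count != bottom[1].count:
            raise Exception('Inputs must have the same dimension')
        self.diff = np.zeros_like(bottom[0].data, dtype=np.float32)
        top[0].reshape(1)  # the loss output is a scalar

    def forward(self, bottom, top):
        self.diff[...] = bottom[0].data - bottom[1].data
        top[0].data[...] = np.sum(self.diff ** 2) / bottom[0].num / 2.

    def backward(self, top, propagate_down, bottom):
        for i in range(2):
            if not propagate_down[i]:
                continue
            sign = 1 if i == 0 else -1
            bottom[i].diff[...] = sign * self.diff / bottom[i].num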
Python layers differ from C++ layers, which need to be compiled, need their parameters added to the proto file, and finally need to be registered in layer_factory. If you write a python layer, you don't need to worry about any of these things. Layer parameters can be defined as a string, which is accessible as a string in python. For example: if you have a parameter in a layer, you can access it using self.param_str, if param_str was defined in your prototxt file. Like other layers, you need to define a class with the following functions:
Setup - Initialize your layer using parameters obtained from layer variables
Forward - Compute the layer's output from its input
Backward - Given the prediction and gradients from the next layer, compute the gradients for the previous layer
Reshape - Reshape your blob if needed
Prototxt example:
layer {
  name: 'rpn-data'
  type: 'Python'
  bottom: 'rpn_cls_score'
  bottom: 'gt_boxes'
  bottom: 'im_info'
  bottom: 'data'
  top: 'rpn_labels'
  top: 'rpn_bbox_targets'
  top: 'rpn_bbox_inside_weights'
  top: 'rpn_bbox_outside_weights'
  python_param {
    module: 'rpn.anchor_target_layer'
    layer: 'AnchorTargetLayer'
    param_str: "'feat_stride': 16"
  }
}
Here, the name of the layer is rpn-data; bottom and top are the input and output details of the layer, respectively. python_param defines the parameters of the Python layer. 'module' specifies the module name of your layer: if the file called 'anchor_target_layer.py' is located inside a folder called 'rpn', the parameter would be 'rpn.anchor_target_layer'. The 'layer' parameter is the name of your class; in this case it is 'AnchorTargetLayer'. 'param_str' is a parameter string for the layer, which here contains the value 16 for the key 'feat_stride'.
Unlike C++/CUDA layers, Python layers do not work in a multi-GPU setting in caffe as of now, so that is a disadvantage of using them.