I have a very short question that probably has a very simple answer, but I can't figure it out even after trying for hours.
I'm using a TensorFlow Estimator and I want to access the global step within my model_fn. I've tried tf.train.get_global_step, which returns a Tensor. I need the global_step as an integer though (or as a string)!
So I tried calling eval() on it (equivalent to tf.get_default_session().run(t)), but that doesn't work.
Cheers!
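You can use tf.cast to cast the Tensor to an integer dtype (for a string, there is tf.as_string). Note that the result is still a Tensor, not a Python int.
For example,
tf.cast(tf.train.get_global_step(), dtype=tf.int32)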
See the reference here.
One way would be to parse it from the latest checkpoint file in the model_dir.
So assuming you can pass the model_dir into the model_fn (either through the params argument, as in tf.estimator.Estimator(..., params={'model_dir': 'path/to/model_dir'}), or through tf.flags.FLAGS), you can then use this utility function:
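import os

import tensorflow as tf

def get_global_step_from_model_dir(model_dir):
    # latest_checkpoint returns a prefix such as '.../model.ckpt-1500', or None.
    latest_checkpoint_file = tf.train.latest_checkpoint(model_dir)
    if latest_checkpoint_file is None:
        return 0
    # The global step is the number after the last '-'.
    return int(os.path.basename(latest_checkpoint_file).split('-')[-1])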
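A minimal sketch of just the filename-parsing step, with no TensorFlow dependency. It assumes checkpoint prefixes end in -<step> (for example model.ckpt-1500), which is how Estimator checkpoints are typically named; the function name global_step_from_checkpoint_path is made up for this illustration.

```python
import os
import re


def global_step_from_checkpoint_path(checkpoint_path):
    """Return the trailing step number of a checkpoint prefix, or 0 if absent."""
    if checkpoint_path is None:
        # Mirrors the "no checkpoint yet" case of tf.train.latest_checkpoint.
        return 0
    match = re.search(r"-(\d+)$", os.path.basename(checkpoint_path))
    return int(match.group(1)) if match else 0


print(global_step_from_checkpoint_path("path/to/model_dir/model.ckpt-1500"))
print(global_step_from_checkpoint_path(None))
```

Keep in mind that this gives the step of the last saved checkpoint, which can lag behind the live global step while training is running.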
I'm trying to train a model in mixed precision. However, I want a few of the layers to be in full precision for stability reasons. How do I force an individual layer to be float32 when using torch.autocast? In particular, I'd like the result to be compilable to ONNX.
Is it something like:
with torch.cuda.amp.autocast(enabled=False, dtype=torch.float32):
    out = my_unstable_layer(inputs.float())
Edit:
Looks like this is indeed the official method. See the torch docs.
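For illustration, here is a small sketch of the nested-autocast pattern from the question. It runs on CPU with bfloat16 via torch.autocast(device_type="cpu", ...) so it can be tried without a GPU, which is my assumption (the question uses torch.cuda.amp.autocast); layer_lowp and layer_fp32 are hypothetical stand-ins for real layers. Whether the result exports cleanly to ONNX is not checked here.

```python
import torch

# Hypothetical layers: one may run in reduced precision, one must stay float32.
layer_lowp = torch.nn.Linear(4, 4)
layer_fp32 = torch.nn.Linear(4, 4)
inputs = torch.randn(2, 4)

with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    # Inside the autocast region, linear layers run in the reduced dtype.
    hidden = layer_lowp(inputs)
    # Locally disable autocast and cast the input back up to float32.
    with torch.autocast(device_type="cpu", enabled=False):
        out = layer_fp32(hidden.float())

print(hidden.dtype, out.dtype)
```

The explicit .float() matters: disabling autocast only stops automatic casting, it does not convert tensors that are already in reduced precision.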
I think the motivation of torch.autocast is to automate the reduction of precision (not the increase).
If you have functions that need a particular dtype, you should consider using custom_fwd:
import torch

@torch.cuda.amp.custom_fwd(cast_inputs=torch.complex128)
def get_custom(x):
    print(' Decorated function received', x.dtype)

def regular_func(x):
    print(' Regular function received', x.dtype)
    get_custom(x)

x = torch.tensor(0.0, dtype=torch.half, device='cuda')
with torch.cuda.amp.autocast(False):
    print('autocast disabled')
    regular_func(x)
with torch.cuda.amp.autocast(True):
    print('autocast enabled')
    regular_func(x)
autocast disabled
Regular function received torch.float16
Decorated function received torch.float16
autocast enabled
Regular function received torch.float16
Decorated function received torch.complex128
Edit: Using torchscript
I am not sure how much you can rely on this, due to a comment in the documentation. However the comment is apparently outdated.
Here is an example where I trace the model with autocast enabled, freeze it, and then use it; the value is indeed cast to the specified type:
class Cast(torch.nn.Module):
    @torch.cuda.amp.custom_fwd(cast_inputs=torch.float64)
    def forward(self, x):
        return x

with torch.cuda.amp.autocast(True):
    model = torch.jit.trace(Cast().eval(), x)
model = torch.jit.freeze(model)

x = torch.tensor(0.0, dtype=torch.half, device='cuda')
print(model(x).dtype)
torch.float64
But I suggest you validate this approach before relying on it in a serious application.
I am trying to learn catboost, and I see two confusing terms with CatBoostClassifier:
custom_loss and custom_metric.
I have browsed here which says: https://catboost.ai/docs/concepts/python-reference_parameters-list.html#python-reference_parameters-list
custom_metric:
Metric values to output during training. These functions are not optimized and are displayed for informational purposes only. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).
but then what is custom_loss?
I see custom_loss defined in the R documentation: https://catboost.ai/docs/features/loss-functions-desc.html - but not in the python one.
Yet in the Python tutorial, they define custom_loss like so:
model = CatBoostClassifier(
    custom_loss=['Accuracy'],
    random_seed=42,
    logging_level='Silent'
)
Am I missing something here? In fact, custom_loss does not seem to be defined as a parameter anywhere in the Python docs: https://catboost.ai/docs/concepts/python-reference_parameters-list.html#python-reference_parameters-list
I infer the following from this link in the documentation.
I am almost certain that they refer to the same parameter, but custom_loss is the R name while custom_metric is for Python. Apparently they can be used interchangeably as long as they don't cause name collisions.
I'm looking for a way to change floatx in keras directly in python.
floatx is the default float type (float16, float32, ...).
The config is stored in a json file at:
$HOME/.keras/keras.json
But I'm looking for a way to change the config inside my Python program without changing the config file itself.
There is a similar question in which somebody asks the same thing about changing the backend, which is also stored in keras.json.
The accepted answer involves setting the environment variable KERAS_BACKEND and reloading the keras module, but I didn't find a similar environment variable for floatx.
It turns out keras.backend has functions for setting and retrieving the floatx value (scroll down in the link):
keras.backend.floatx()
>>> 'float32'
keras.backend.set_floatx('float16')
keras.backend.floatx()
>>> 'float16'
Also, unlike when changing the backend, you must not reload the keras module after calling set_floatx, because Keras will then simply reread the config file and revert to the previous value:
keras.backend.floatx()
>>> 'float32'
keras.backend.set_floatx('float16')
keras.backend.floatx()
>>> 'float16'
importlib.reload(keras.backend)
keras.backend.floatx()
>>> 'float32'
Well, the floatx value should certainly be set in keras.json, as described in the documentation.
The least buggy way to do it is indeed to use the file and reload the module.
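A rough sketch of how the file-based route could be scripted, using only the standard library. It assumes keras.json is a flat JSON object with a "floatx" key; the helper name set_floatx_in_config is made up for this illustration, and the demo writes to a throwaway temporary file rather than the real $HOME/.keras/keras.json.

```python
import json
import os
import tempfile


def set_floatx_in_config(config_path, floatx):
    """Rewrite the 'floatx' entry of a keras.json-style file, keeping other keys."""
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as f:
            config = json.load(f)
    config["floatx"] = floatx
    with open(config_path, "w") as f:
        json.dump(config, f, indent=4)
    return config


# Demo on a temporary file standing in for the real config.
demo_path = os.path.join(tempfile.mkdtemp(), "keras.json")
with open(demo_path, "w") as f:
    json.dump({"floatx": "float32", "backend": "tensorflow"}, f)

updated = set_floatx_in_config(demo_path, "float16")
print(updated)
```

The change only takes effect the next time Keras reads the file, that is, on import or reload.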
Using K.set_floatx, at least for me, left parts of the models unchanged (even when set_floatx was the very first thing I did after loading the Keras model in a new Python kernel).
Even so, I hit yet another problem when setting the precision to float16: all my loss functions very quickly became nan. Unfortunately, I had to go back to float32 (the default) to be able to train at all.
Is there a nice way to distinguish programmatically between tensors, variables, and ops in TensorFlow? This can come up, for example, when reloading a model and tf.local_variables() can have both tensors and variables in it. If you try to initialize a tensor, you get an error.
Below is some code for my current hack to get around this, but is there a better way? Part of the issue is that the type of variables, tensors, etc. is, e.g., tensorflow.python.ops.variables.Variable but it seems that tensorflow.python isn't accessible anymore (I think it was in some earlier releases?). The example only shows variables vs tensors, but I've also needed to distinguish ops from tensors before and had to use similar hacks.
import tensorflow as tf
vars_list = [tf.Variable(0), tf.constant(0)]
# init = tf.variables_initializer(vars_list) # -> AttributeError: 'Tensor' object has no attribute 'initializer'
var_type = type(tf.Variable(0))
init = tf.variables_initializer([v for v in vars_list if type(v) == var_type])
Normally, in Python, one would use
isinstance(x, tf.Variable)
or
isinstance(x, (tf.Variable, tf.Tensor))
etc.
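Applied to the list from the question, that could look like the following sketch. It assumes a TensorFlow installation where tf.Variable and tf.constant are available at the top level; the names candidates and variables_only are made up for this illustration.

```python
import tensorflow as tf

# Same mix as in the question: one variable, one constant tensor.
candidates = [tf.Variable(0), tf.constant(0)]

# Keep only real variables, using the public tf.Variable class.
variables_only = [v for v in candidates if isinstance(v, tf.Variable)]

print(len(variables_only))
```

This avoids reaching into tensorflow.python or comparing against type(tf.Variable(0)).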
I'm running the cnn_mnist example from GitHub, which uses the layers module.
I can run the program, but a warning appears telling me that one of the functions is deprecated.
I couldn't find which new function should be used instead.
WARNING:tensorflow:From <ipython-input-14-ee49e8b76469>:25: calling BaseEstimator.fit (from tensorflow.contrib.learn.python.learn.estimators.estimator) with batch_size is deprecated and will be removed after 2016-12-01.
Instructions for updating:
Estimator is decoupled from Scikit Learn interface by moving into
separate class SKCompat. Arguments x, y and batch_size are only
available in the SKCompat class, Estimator will only accept input_fn.
Example conversion:
est = Estimator(...) -> est = SKCompat(Estimator(...))
It should be sufficient to modify that script like this:
# Create the Estimator
mnist_classifier = learn.SKCompat(learn.Estimator(
    model_fn=cnn_model_fn, model_dir="/tmp/mnist_convnet_model"))
However, please open a GitHub issue to get that sample updated so it doesn't mislead anyone else.