MXNet predict not working when model is cached - python

I have an MXNet MultilayerPerceptron inside a class MyModel.
I first load the trained weights from a file.
I am performing prediction with the MLP like this:
class MyModel:
...
def predict(self, X):
data_iterator = mx.io.NDArrayIter(data=X,
batch_size=self.model.data_shapes[0].shape[0], shuffle=False)
predictions_npa = self.model.predict(data_iterator ).asnumpy()
where X is a numpy array (1,777)
Now the first time i'm performing a MyModel.predict this works perfectly.
I then store the MyModel instance in a functools.LRUCache and trying to perform a second time the prediction with the exact same input.
And every time when doing that, my python process just stops doing anything, no logs, no actions, neither does it exit. I just know that when I try to inspect the result of self.model.predict(data_iterator ) in my PyCharm debugger I get a loading error.
So I'm a bit confused with what's happening there, if anyone had an idea it could be a great help!
Thanks

This maybe be because you have to recreate data_iterator. data_iterator is exausted once it has finished and .next() call will raise the error.

Related

Tensorflow's cycle gan model: error in fit method

I tried to run the code from here: https://keras.io/examples/generative/cyclegan/
but when running model.fit(..) I get the following error:
ValueError: Model <__main__.CycleGan object at 0x7fec3de767c0> cannot be saved either because the
input shape is not available or because the forward pass of the model is not defined.To define a
forward pass, please override `Model.call()`. To specify an input shape, either call
`build(input_shape)` directly, or call the model on actual data using `Model()`, `Model.fit()`, or
`Model.predict()`. If you have a custom training step, please make sure to invoke the forward pass
in train step through `Model.__call__`, i.e. `model(inputs)`, as opposed to `model.call()`.
Even if I run it through the linked colab file from the author. Did anyone else already faced this issue and knows how to fix it ?
I also tried to predefine the inputsize by model.build((batch_size,256,256,3)), but still get the same error.
If i comment the callbacks, it works. I think the problem is in the model_checkpoint_callback. Without it, the code works, but then I don't save the model.
Thanks a lot in advance for every answer!
The solution is to add this line:
# Create cycle gan model
cycle_gan_model = CycleGan(
generator_G=gen_G, generator_F=gen_F, discriminator_X=disc_X, discriminator_Y=disc_Y
)
# add following line:
cycle_gan_model.compute_output_shape(input_shape=(None, 256, 256, 3))

Unable to restore a layer of class TextVectorization - Text Classification

System information
Google Colab
When I run the example provided by official tensorflow basic text classification, everything runs fine until the model save, but when I load the model it gives me this error.
RuntimeError: Unable to restore a layer of class TextVectorization. Layers of class TextVectorization require that the class be provided to the model loading code, either by registering the class using #keras.utils.register_keras_serializable on the class def and including that file in your program, or by passing the class in a keras.utils.CustomObjectScope that wraps this load call.
Expected Behavior: Model should be loaded successfully and process the raw input
https://colab.research.google.com/gist/amahendrakar/8b65a688dc87ce9ca07ffb0ce50b84c7/44199.ipynb#scrollTo=fEjmSrKIqiiM
Example Link: https://tensorflow.google.cn/tutorials/keras/text_classification
I also ran into this error message (RuntimeError: Unable to restore a layer of class TextVectorization. [...]) when I implemented (and customized) the code from the "Basic Text Classification" tutorial.
Instead of running the code in a notebook, I have two scripts, one for building, training and saving the model and the other one for loading it and making predictions. (Thus, the error does not seem to be limited to Google Colab).
This is what I had to do (see https://github.com/tensorflow/tensorflow/issues/45231):
First, I added this line in the first script before the function definition and built, trained and saved the model again:
#tf.keras.utils.register_keras_serializable()
def custom_standardization(input_data):
[...]
# Save model as SavedModel
export_model.save(model_path, save_format='tf')
Secondly, I also had to add the same line and the whole function definition in the second script to make sure that it works if I restart(!) ipython (where I currently run the scripts) and only run the second script:
#tf.keras.utils.register_keras_serializable()
def custom_standardization(input_data):
lowercase = tf.strings.lower(input_data)
stripped_html = tf.strings.regex_replace(lowercase, '<br />', ' ')
return tf.strings.regex_replace(stripped_html,
'[%s]' % re.escape(string.punctuation),
'')
[...]
# Load model
reloaded_model = tf.keras.models.load_model(model_path)
# Make predictions
predictions = reloaded_model.predict(examples)
Note: If I run the second script without restarting ipython after running the first script, I get this error:
ValueError: Custom>custom_standardization has already been registered [...]
Alternatively, you can just use the default standardization method in the vectorizer layer when building the model:
vectorize_layer = TextVectorization(
standardize="lower_and_strip_punctuation",
max_tokens=max_features,
output_mode='int',
output_sequence_length=sequence_length)
I got something working as Hassan describes it, I think. Not sure it's the right way, but it seems to work for me...
I define, train, and archive the model in one notebook
I un-archive it, load it, and use it for predictions from another notebook.
See here: https://github.com/OlivierLD/oliv-ai/tree/master/JupyterNotebooks/tf-tutorials/sentiment-analysis

TensorFlow RuntimeError: "Attempting to capture an EagerTensor without building a function"

I am trying to build a neural network in Python for solving PDEs, and, as such, I have had to write custom training steps. My training function looks like this:
...
tf.enable_eager_execution()
class PDENet:
...
def train_step():
input = self.input
with tf.GradientTape() as tape, tf.Session() as sess:
tape.watch(input)
output = self.model(input)
self.loss = self.pde_loss(output) # (network does not use training data)
grad = tape.gradient(self.loss, self.model.trainable_weights)
self.optimizer.apply_gradients([(grad, self.model)])
...
Due to my hardware, I have no choice but to use tensorflow==1.12.0 and keras==2.2.4.
When I run this code, I get "RuntimeError: Attempting to capture an EagerTensor without building a function". I have seen other posts about this, but all of the answers say to update tensorflow/keras, which I can't, use "tf.enable_eager_execution()", which I've already done, and "tf.disable_v2_behavior()", which is nonexistent on older versions of tensorflow. Is there anything else I can do to solve this problem? The error makes me think tensorflow wants me to add #tf.function, but that feature also doesn't seem to exist in tensorflow 1.

Is there way to save trained model after overfitting occurs in CatBoost?

I am using CatBoostRegressor in Python version of the Catboost library.
According to documentation, it's possible to use overfitting detector, which I am doing, like this:
model = CatBoostRegressor(iterations=iters, learning_rate=0.03, depth=depth, verbose=True, od_pval=1, od_type='IncToDec', od_wait=20)
model.fit(train_pool, eval_set=validation_pool)
# this code didn't executed
model.save_model(model_name)
However, after the overfitting occurs, I've got my Python script interrupted, prematurely stopped, pick any phrase you want, and save model part didn't get executed, which leads to a lot of waisted time and no results in the end. I didn't get any stacktrace.
Is there any possibility to handle it in CatBoost and save hours of fitting work?
Use this code. It will save the model no matter what happens in the try block.
try:
model.fit(X, y)
finally:
model.save_model()
Well i don't know how catboost work but i would like to share a different way to save/store your trained data maybe it could help
import pickle
model = CatBoostRegressor(iterations=iters, learning_rate=0.03, depth=depth, verbose=True, od_pval=1, od_type='IncToDec', od_wait=20)
model.fit(train_pool, eval_set=validation_pool)
#----To store model----------
filename = 'final_model' # name to store model
pickle.dump(model, open(filename, 'wb')) # pickling
#-----To load model------------
loaded_model = pickle.load(open(filename, 'rb'))
You can do it with pickle, just train your module and dump it using pickle.
pickle.dump(regr, open("models/svrrbf.sav",'wb'))
Further you can use that module to test your inputs.
Hope it helps

Keras + Tensorflow : Debug NaNs

Here is a great question on how to find the first occurence of Nan in a tensorflow graph:
Debugging nans in the backward pass
The answer is quite helpful, here is the code from it:
train_op = ...
check_op = tf.add_check_numerics_ops()
sess = tf.Session()
sess.run([train_op, check_op]) # Runs training and checks for NaNs
Apparently, running the training and the numerical check at the same time will result in an error report as soon as Nan is encountered for the first time.
How do I integrate this into Keras ?
In the documentation, I can't find anything that looks like this.
I checked the code, too.
The update step is executed here:
https://github.com/fchollet/keras/blob/master/keras/engine/training.py
There is a function called _make_train_function where an operation to compute the loss and apply updates is created. This is later called to train the network.
I could change the code like this (always assuming that we're running on a tf backend):
check_op = tf.add_check_numerics_ops()
self.train_function = K.function(inputs,
[self.total_loss] + self.metrics_tensors + [check_op],
updates=updates, name='train_function', **self._function_kwargs)
I'm currently trying to set this up properly and not sure whether the code above actually works.
Maybe there is an easier way ?
I've been running into the exact same problem, and found an alternative to the check_add_numerics_ops() function. Instead of going that route, I use the TensorFlow Debugger to walk through my model, following the example in https://www.tensorflow.org/guide/debugger to figure out exactly where my code produces nans. This snippet should work for replacing the TensorFlow Session that Keras is using with a debugging session, allowing you to use tfdbg.
from tensorflow.python import debug as tf_debug
sess = K.get_session()
sess = tf_debug.LocalCLIDebugWrapperSession(sess)
K.set_session(sess)

Categories