Is there a way to dynamically fetch on the graph?

I'd like to write a single decoder for both training (where gradients should flow back down to the encoder) and beam-search mode (where single steps are driven from Python, so, sadly, the decoder isn't linked to the encoder directly).
Ideally, something like this would work:
def decoder(beamSearchFlag_boolPlaceholder, initialState_fromEncoder, initialState_placeholder, input):
    initialState = tf.cond(beamSearchFlag_boolPlaceholder,
                           lambda: initialState_placeholder,
                           lambda: initialState_fromEncoder)
    ... = cell(input, initialState)
But with cond() TF still needs to resolve the dependencies of both branches: the _fromEncoder tensor is evaluated even when the flag selects the placeholder branch and its value has no effect, and that's a big chunk of unnecessary graph. Is there a way around this?
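For illustration, here is a minimal, runnable sketch (TF 1.x graph mode, with invented stand-in names) of that behavior: tensors captured from outside the branch lambdas become inputs of the cond, so they are evaluated no matter which branch the predicate selects.
import tensorflow as tf

# Stand-ins for the encoder output and the beam-search feeds (invented names).
inputs = tf.random_normal([32, 128])
encoder_state = tf.layers.dense(inputs, 64, name="encoder")            # training path
beam_flag = tf.placeholder_with_default(True, [])
state_placeholder = tf.placeholder_with_default(tf.zeros([32, 64]), [32, 64])

initial_state = tf.cond(beam_flag,
                        lambda: state_placeholder,   # beam-search mode
                        lambda: encoder_state)       # training mode

# encoder_state is captured from outside the lambdas, so it feeds the cond's
# Switch ops and is computed whenever initial_state is fetched, even though
# beam_flag defaults to True here.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(initial_state)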

Related

Problem in tqdm function in a Doc2Vec model

I am using this article https://actsusanli.medium.com/ to implement the Doc2Vec model and I have a problem in the training step.
model_dbow.train(utils.shuffle([x for x in tqdm(train_tagged.values)]), total_examples=len(train_tagged.values), epochs = 40)
As you can see, I am using the tqdm function. When I run the code, tqdm reaches 100% after a few minutes, but the algorithm keeps running in the same shell for a long time afterwards.
Do you have any idea whether this is a problem with the tqdm function or something else?
By using the "list comprehension" ([..])...
[x for x in tqdm(train_tagged.values)]
...you are having tqdm iterate once over your train_tagged.values sequence, materializing it into an actual in-memory Python list. This shows the tqdm progress rather quickly – and then tqdm's involvement is completely finished.
Then, you're passing that plain result list (without any tqdm features) into Doc2Vec.train(), where Doc2Vec does its epochs=40 training passes. tqdm is no longer involved, so there'll be no incremental progress-bar output.
You might be tempted to try (or have already tried) something that skips the extra list creation, passing the tqdm-wrapped sequence directly in like:
corpus = utils.shuffle(train_tagged.values)
model_dbow.train(tqdm(corpus), total_examples=len(corpus), epochs = 40)
But this has a different problem: the tqdm-wrapper is only designed to allow (& report the progress of) one iteration over the wrapped sequence. So this will show that one iteration's incremental progress.
But when .train() tries its next necessary 39 re-iterations, to complete its epochs=40 training-runs, the single-pass tqdm object will be exhausted, preventing full & proper training.
Note that there is an option for progress-logging within Gensim, by setting the Python logging level (globally, or just for the class Doc2Vec) to INFO. Doc2Vec will then emit a log-line showing progress, within each epoch and between epochs, about every 1 second. But: you can also make such logging less-frequent by supplying a different seconds value to the optional report_delay argument of .train(), for example report_delay=60 (for a log line every minute instead of every second).
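For example, a rough sketch (untested, reusing the model_dbow and corpus names from the snippets above) of enabling INFO-level logging and slowing the report rate to once a minute:
import logging

logging.basicConfig(format="%(asctime)s : %(levelname)s : %(message)s",
                    level=logging.INFO)

model_dbow.train(corpus,
                 total_examples=len(corpus),
                 epochs=40,
                 report_delay=60)  # one progress log line per minute instead of per second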
If you really want a progress-bar, it should be possible to use tqdm - but you will have to work around its assumption that the iterable you're wrapping with tqdm() will only be iterated over once.
I believe there'd be two possible approaches, each with different tradeoffs:
(1) Instead of letting .train() repeat the corpus N times, do it yourself - adjusting the other .train() parameters accordingly. Roughly, that'd mean changing a line like...
model.train(corpus, total_examples=len(corpus), epochs=40)
...into something that turns your desired 40 epochs into something that looks like just one iteration to both tqdm & Gensim's .train(), like...
import itertools

repeated_corpus = itertools.chain(*[corpus]*40)
repeated_len = 40 * len(corpus)
model.train(tqdm(repeated_corpus, total=repeated_len), total_examples=repeated_len, epochs=1)
(Note that you now have to give tqdm a hint as to the sequence's length, because the one-time chained-iterator from itertools.chain() doesn't report its own length.)
Then you'll get one progress-bar across the whole training corpus – which the model now sees as one pass over a larger corpus, but which ultimately involves the same 40 passes.
You'll want to reinterpret any remaining log lines with this change in mind, and you'll lose a chance to install your own per-epoch callbacks via the model's end-of-epoch callback mechanism. (But, that's a seldom-used feature, anyway.)
(2) Instead of wrapping the corpus with a single tqdm() (which can only show a progress-bar for one-iteration), wrap the corpus as a new fully-re-iterable object that itself will start a new tqdm() each time. For example, something like:
class TqdmEveryIteration(object):
    def __init__(self, inner_iterable):
        self.inner_iterable = inner_iterable

    def __iter__(self):
        return iter(tqdm(self.inner_iterable))
Then, using this new extra tqdm-adding wrapper, you should be able to do:
corpus = utils.shuffle(train_tagged.values)
model_dbow.train(TqdmEveryIteration(corpus), total_examples=len(corpus), epochs = 40)
In this case, you should get one progress bar per epoch, because a new tqdm() wrapper will be started each training pass.
(If you try either of these approaches & they work well, please let me know! They should be roughly correct, but I haven't tested them yet.)
Separately: if the article from the author at actsusanli.medium.com that you're modeling your work on is...
https://towardsdatascience.com/multi-class-text-classification-with-doc2vec-logistic-regression-9da9947b43f4
...note that it's using an overly-complex & fragile anti-pattern, calling .train() multiple times in a loop with manual alpha management. That has problems as described in this other answer. But that approach would also have the side-effect of re-wrapping the corpus each time in a new tqdm (like the TqdmEveryIteration class above), so despite its other issues, would achieve one actual progress-bar each call to .train().
(I sent the author a private note via Medium about a month ago about this problem.)

How do you use Tensorflow Keras Custom Objects with tf.saved_model.Asset?

I have a custom Keras Layer that reads from a pickle file to initialize some weights, and I'd like to be able to use tf.keras.utils.register_keras_serializable() on it. The issue is that my __init__ function takes the path to the pickle file, which might not be available when the layer is deserialized again. Keras Assets should theoretically make the layer more portable, but I can't figure out how to get it to work with the layer's get_config().
Barebones version of my code:
@tf.keras.utils.register_keras_serializable()
class AssetLayer(tf.keras.layers.Layer):
    def __init__(self, asset_path, **kwargs):
        super().__init__(**kwargs)
        self.asset_path = asset_path
        self.asset = tf.saved_model.Asset(asset_path)
        data = tf.io.read_file(self.asset)
        # do something with data

    def get_config(self):
        return {
            **super().get_config(),
            "asset_path": self.asset_path,
        }

    def call(self, arg):
        # arbitrary call function
        return arg
If a model using this layer is loaded using tf.keras.models.load_model(), Keras will call get_config() to reinitialize the layer using the saved asset_path which might not be pointing to the right place at deserialization time. Ideally it would point to the path of the saved asset, but I don't know how to make it do that.
For instance, I've tried this code
!echo abcd > file.txt
model = tf.keras.Sequential([AssetLayer("file.txt")])
model(tf.ones(3))
model.save("test")
# reloading
!rm file.txt
reloaded_model = tf.keras.models.load_model("test")
which gives me an error saying file.txt is not found.
I've also tried removing the get_config() function entirely. This makes it so the layer can be successfully reloaded while retaining access to the asset variable, but other attributes in the layer such as self.asset_path aren't accessible. This isn't ideal for debugging purposes, so I'm wondering if there's a better way.
I'm currently using TensorFlow 2.5.0.
Edited code:
The code up to this point is fine. The issue is reproduced because of
!rm file.txt
so I moved it to the end:
!echo abcd > file.txt
model = tf.keras.Sequential([AssetLayer("file.txt")])
model(tf.ones(3))
model.save("./content/sample_data/test.h5")
# reloading
reloaded_model = tf.keras.models.load_model("/content/content/sample_data/test.h5")
reloaded_model.summary()
!rm file.txt
Reference: https://www.tensorflow.org/guide/keras/save_and_serialize
It seems "tf.saved_model.Asset" do not support "tf.keras.models.load_model"
Try use tf.saved_model.save / tf.saved_model.load instead
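For reference, a minimal sketch of that suggested route (untested; the asset_test directory name is arbitrary). tf.saved_model.save copies the Asset's file into the SavedModel's assets/ directory, so the original file.txt is no longer needed at load time:
tf.saved_model.save(model, "asset_test")      # file.txt is copied into asset_test/assets/

# !rm file.txt   # safe now: the asset travels with the SavedModel

reloaded = tf.saved_model.load("asset_test")  # the Asset path is re-resolved on load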

How to access results from BestExporter while using train_and_evaluate?

When I use tf.estimator.train_and_evaluate with a BestExporter in my EvalSpec the return value at the end might not include an export_result since the final evaluation call won't necessarily lead to an export. This happens for instance if your last checkpoint doesn't lead to a lower loss on your evaluation set.
How do you access the last export_result that led to an export from the BestExporter? Ideally I would like to have a list of each (metrics, export_results) at the end of train_and_evaluate instead of just the last one.
For anyone desperate for a workaround, you can access the export directory using Python built-ins like this.
import os

estimator = tf.estimator.Estimator(...)
best_exporter = tf.estimator.BestExporter(...)
# Add best_exporter to your eval_spec
# Make train_spec
metrics, export_results = tf.estimator.train_and_evaluate(...)

best_export_dir = os.path.join(estimator.model_dir, 'export', best_exporter.name)
# Export subdirectories are named by timestamp, so after sorting, the last one
# is the most recently exported (i.e. the current best) model.
savedmodels = sorted(os.listdir(best_export_dir))
best_model = savedmodels[-1]
Obviously a better method would be preferred. The particular issue I'm describing here is that export_results might just be [None] since the last checkpoint didn't result in an export even when there has been an earlier export.
For anyone who cares, these are the relevant bits of code from TensorFlow r1.13, tracing the life of export_results from call to value (function, source line):
tf.estimator.train_and_evaluate (line 471)
_TrainingExecutor.run (line 611)
_TrainingExecutor.run_local (line 703)
_NewCheckpointListenerForEvaluate.after_save (line 517)
_NewCheckpointListenerForEvaluate._evaluate (line 536)
_Evaluator.evaluate_and_export (line 924)
_Evaluator._export_eval_result (line 948)
I might have found a solution, if you are willing to (slightly) change the source code; specifically, the _SavedModelExporter class implementation in tensorflow_estimator\python\estimator\exporter.py.
First, I am using the package tensorflow_estimator, instead of getting the estimator from tf.estimator. If the solution doesn't work in your case, consider using tensorflow_estimator - you should not lose anything by doing so.
Basically, _SavedModelExporter has a method called export, which, in my case (tensorflow 1.13.2, tensorflow_estimator 1.13.0), starts in line 116 and has the following implementation:
def export(self, estimator, export_path, checkpoint_path, eval_result,
           is_the_final_export):
    del is_the_final_export

    export_result = estimator.export_savedmodel(
        export_path,
        self._serving_input_receiver_fn,
        assets_extra=self._assets_extra,
        as_text=self._as_text,
        checkpoint_path=checkpoint_path,
        strip_default_attrs=self._strip_default_attrs)

    ################
    ### I ADDED THIS
    ################
    results_file = os.path.join(export_result, b"model_eval.txt")
    with open(results_file, mode="w") as f:
        for result in eval_result:
            f.write(result + ": " + str(eval_result[result]) + "\n")
    ################
    ### END OF I ADDED THIS
    ################

    return export_result
In the code above, as marked, I added code which loops through the dictionary of evaluation results (eval_result variable, already available for us but not used here!) and saves it as lines to a file. This file will be saved inside the same folder which contains the exported model, that is, something like export\best_exporter\1565348723\.
Some points:
1) You asked for a returned value, and I am not giving you that. Instead, I am saving it to a file, since I think this is the solution with the fewest changes to the source code. Do let me know if you cannot work with that.
2) You can build on this solution. For example, you could probably save all entries to the same file instead of saving one file per exported model.
3) All three implemented exporters (LatestExporter, FinalExporter and BestExporter) make calls to _SavedModelExporter, which we just changed. So you can either live with this behavior for all the different exporters, or add a flag, defaulting to False, that controls whether the file is written, and expose it through the call to BestExporter.
Hope I could help with something.

Tensorflow Dataset .map() API

Couple of questions about this
For occasions when I'd like to do something like the following in Tensorflow (assume I'm creating training examples by loading WAV files):
import tensorflow as tf
from tensorflow.python.ops import io_ops
from tensorflow.contrib.framework.python.ops import audio_ops as contrib_audio

def _some_audio_preprocessing_func(filename):
    # ... some logic here which mostly uses Tensorflow ops ...
    with tf.Session(graph=tf.Graph()) as sess:
        wav_filename_placeholder = tf.placeholder(tf.string, [])
        wav_loader = io_ops.read_file(wav_filename_placeholder)
        wav_decoder = contrib_audio.decode_wav(wav_loader, desired_channels=1)
        data = sess.run(
            [wav_decoder],
            feed_dict={wav_filename_placeholder: filename})
        return data

dataset = tf.data.Dataset.list_files('*.wav')
dataset = dataset.map(_some_audio_preprocessing_func)
1) If I have a parse_image() function that uses tensor ops - should this be part of the main Graph? Following the example set in Google's own audio TF tutorial, it looks like they create a separate graph! Doesn't this ruin the point of using Tensorflow to make things faster?
2) Do I use tf.py_func() any time a single line isn't from the tensorflow library? Again, I wonder what the performance implications are and when I should use this...
Thanks!
When you use Dataset.map(map_func), TensorFlow defines a subgraph for all the ops created in the function map_func, and arranges to execute it efficiently in the same session as the rest of your graph. There is almost never any need to create a tf.Graph or tf.Session inside map_func: if your parsing function is made up of TensorFlow ops, these ops can be embedded directly in the graph that defines the input pipeline.
The modified version of the code using tf.data would look like this:
import tensorflow as tf
from tensorflow.contrib.framework.python.ops import audio_ops as contrib_audio

def _some_audio_preprocessing_func(filename):
    wav_loader = tf.read_file(filename)
    return contrib_audio.decode_wav(wav_loader, desired_channels=1)

dataset = tf.data.Dataset.list_files('*.wav')
dataset = dataset.map(_some_audio_preprocessing_func)
If your map_func contains non-TensorFlow operations that you want to apply to each element, you should wrap them in a tf.py_func() (or Dataset.from_generator(), if the data generation process is defined in Python logic). The main performance implication is that any code running in a tf.py_func() is subject to the Global Interpreter Lock, so I would generally recommend trying to find a native TensorFlow implementation for anything that is performance critical.
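For completeness, here is a rough sketch (TF 1.x, with illustrative names and a dummy parser) of wrapping plain-Python logic with tf.py_func inside Dataset.map:
import numpy as np
import tensorflow as tf

def _py_parse(filename):
    # Arbitrary non-TensorFlow code; runs in the Python interpreter under the GIL.
    return np.zeros(16000, dtype=np.float32)

def _map_func(filename):
    audio = tf.py_func(_py_parse, [filename], tf.float32)
    audio.set_shape([16000])  # py_func does not propagate static shape information
    return audio

dataset = tf.data.Dataset.list_files('*.wav').map(_map_func)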

TensorFlow gradient does not respond when using while_loop

I'm using TensorFlow to build a deep learning model. (The entire model is very complicated.) In the model, I need to use while_loop to dynamically control the computation flow based on the number of input sentences. Previously, I used a Python for loop instead of while_loop. After I switched to while_loop, the gradient no longer works.
By the gradient not working I mean that the forward pass works fine (it produces some output), but if I enable gradient computation for training, the program produces no response when I run it; it just hangs. In top, it shows as S (suspend).
Anyone have any idea what is going on?
Below is how I use while_loop, in a very standard way:
def body(argmax_ep_gate, h, mem_state_previous, dummy):
    '''doing some computation'''
    return tf.to_int32(argmax_ep_gate), h, mem_state_current, mem_state_previous

def condition(argmax_ep_gate, h, mem_state_previous, dummy):
    '''return some condition in bool'''

argmax_g, h, _, state = tf.while_loop(
    condition, body, [initial_argmax_g, initial_h, self.state, self.state])
Refer to "TensorFlow stuck into endless loop using tf.while_loop()". Note that if the body contains trainable variables, you need to use a variable scope, as in the sketch below.
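A rough sketch (TF 1.x, untested, with illustrative names) of that variable-scope pattern, so that the variable created inside the loop body is shared across iterations and gradients can be taken afterwards:
import tensorflow as tf

def condition(i, acc):
    return i < 5

def body(i, acc):
    # Reuse the same variable on every iteration instead of creating new ones.
    with tf.variable_scope("loop_cell", reuse=tf.AUTO_REUSE):
        w = tf.get_variable("w", shape=[], initializer=tf.ones_initializer())
    return i + 1, acc + w

_, total = tf.while_loop(condition, body, [tf.constant(0), tf.constant(0.0)])
grads = tf.gradients(total, tf.trainable_variables())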
