TensorFlow LSTM model: learning a parameter inside a parameter - python

I'm trying to train my LSTM model in TensorFlow, and my module has to calculate a parameter inside another parameter. I want to train both parameters together.
More details are in the picture below.
As far as I understand, the TensorFlow LSTM module's input must be a complete sequence, and parameters are fed in via something like "tf.placeholder".
How can I do this in TensorFlow? Or can you recommend another framework better suited than TensorFlow to this task?
Sorry for my poor English.

First of all, your usage of the word "parameter" is quite confusing. Normally, parameters refers to the trainable parameters, i.e. every variable that is trained by the optimizer. There are also so-called hyper-parameters, which have to be set by hand, e.g. the model topology.
TensorFlow works with tensors, which are representations of the data used to build the workflow; they are filled with data at run time via placeholders, which act as entry points for the data, as in the sketch below.
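A minimal TensorFlow 1.x-style sketch of that idea (the shape and values here are made up):

```python
import tensorflow as tf  # assuming TensorFlow 1.x

# The placeholder is the entry point for data; it holds no values itself.
x = tf.placeholder(tf.float32, shape=[None, 10], name="x")
y = tf.reduce_sum(x)  # some computation defined on the tensor

with tf.Session() as sess:
    # The actual data is supplied at run time through feed_dict.
    print(sess.run(y, feed_dict={x: [[1.0] * 10]}))
```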
Also, if you have trouble building your model in TensorFlow, there is Keras. Keras can run with TensorFlow as its backend, but model building is much easier. Keras is also available in the TensorFlow API as tf.keras. In Keras, one or more LSTMs are simplified as a layer which can be added to your model.
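For example, a minimal sketch of stacked LSTM layers in tf.keras (the sizes and shapes are purely illustrative):

```python
from tensorflow import keras

model = keras.Sequential([
    # Input: sequences of 20 steps with 10 features each (illustrative).
    keras.layers.LSTM(64, return_sequences=True, input_shape=(20, 10)),
    keras.layers.LSTM(32),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```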
If you like a more specific answer to your question, please provide code to describe your problem.

Related

What is the difference between freeze_graph and write_graph?

Google says:
But as far as I know, if I want to use TensorFlow inference, I need a protobuf file (.pb), which I can get using the freeze_graph method. So what is the difference between those two?
As a heads-up, freeze_graph is generally deprecated in TensorFlow 2.x; you should use the SavedModel format for the same functionality there. I'll be answering this question from the perspective of TensorFlow 1.x.
Before understanding the difference between the two, you need to know how a TensorFlow model is shaped.
Each TensorFlow model is composed of a graph data structure, which contains Operation objects (the units of computation) and Tensor objects (the units of data that flow between them).
However, a graph alone is not enough to do anything like inference. When you train a model, it learns and optimizes a unique set of parameters for the different parts of your graph.
A PB file, then, includes both of these parts - the graph that represents the structure of the model, and the parameters that the model has learned through training.
So back to the original question - what's the difference between write_graph and freeze_graph?
write_graph writes out the graph of the model (the structure) into the PB file.
This does not require any training, so it doesn't include any parameters the model may have learned.
freeze_graph takes the trained parameters of the model from a training checkpoint and saves that as well to the PB file.
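A hedged TF 1.x sketch of the two calls (the model, file names, and node names are made up for illustration):

```python
import tensorflow as tf  # assuming TensorFlow 1.x
from tensorflow.python.tools import freeze_graph

# A trivial stand-in "model": one variable feeding a named output node.
x = tf.Variable(3.0, name="x")
y = tf.identity(x * 2.0, name="output")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # write_graph: saves only the structure, no learned parameters.
    tf.train.write_graph(sess.graph_def, "./export", "graph.pbtxt", as_text=True)
    # The parameters themselves live in a checkpoint.
    tf.train.Saver().save(sess, "./export/model.ckpt")

# freeze_graph: folds the checkpointed parameters into the graph,
# producing one self-contained PB file ready for inference.
freeze_graph.freeze_graph(
    input_graph="./export/graph.pbtxt",
    input_saver="",
    input_binary=False,
    input_checkpoint="./export/model.ckpt",
    output_node_names="output",
    restore_op_name="save/restore_all",
    filename_tensor_name="save/Const:0",
    output_graph="./export/frozen.pb",
    clear_devices=True,
    initializer_nodes="")
```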

Creating a custom LSTMCell in Tensorflow with GPU support

I would like to integrate an attentional component into the LSTM model I'm creating. Unfortunately, with TensorFlow 2.3.1, which I'm using, it appears that if you subclass LSTMCell, you have to run the model on the CPU. From the TensorFlow documentation:
CuDNN is only available at the layer level, and not at the cell level.
Which means I'm relegated to the CPU if I try something like this:
output = keras.layers.RNN(AttentionLSTMCell(400), return_sequences=True, stateful=False)(input_layer)
where AttentionLSTMCell is a custom class I made that takes in some additional constants (generally an output of the previous timestep and some new input) that condition the output of the LSTM. In fact, the documentation seems to suggest that only certain, specific kinds of conditioning are allowed. I am about to dig into creating a full custom layer (perhaps copying the existing one and seeing if I can add my new inputs in call), but is there a better way? It makes prototyping quite difficult. Large recurrent networks are slow to train, especially in my case, where I integrate image data as input.
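For reference, a subclassed cell along these lines might look like the following minimal sketch; the conditioning term is purely illustrative, not the actual attention mechanism from the question:

```python
import tensorflow as tf

class AttentionLSTMCell(tf.keras.layers.LSTMCell):
    """Sketch: an LSTMCell whose output is conditioned on extra state."""

    def build(self, input_shape):
        super().build(input_shape)
        # Extra weights for the conditioning term (hypothetical shape).
        self.context_kernel = self.add_weight(
            name="context_kernel", shape=(self.units, self.units))

    def call(self, inputs, states, training=None):
        output, new_states = super().call(inputs, states, training=training)
        # Condition the output on the previous hidden state; a stand-in
        # for the real attention computation.
        output = output + tf.matmul(states[0], self.context_kernel)
        return output, new_states

# Wrapped in the generic RNN layer, this runs, but without the fused
# CuDNN kernel that the built-in LSTM layer can use:
# outputs = tf.keras.layers.RNN(AttentionLSTMCell(400),
#                               return_sequences=True)(input_layer)
```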

How to use keras model inside other model in TPU

I am trying to convert a Keras model to a TPU model in Google Colab, but this model has another model inside it.
Take a look at the code:
https://colab.research.google.com/drive/1EmIrheKnrNYNNHPp0J7EBjw2WjsPXFVJ
This is a modified version of one of the examples in the google tpu documentation:
https://colab.research.google.com/github/tensorflow/tpu/blob/master/tools/colab/fashion_mnist.ipynb
If the sub_model is converted and used directly, it works, but if the sub-model is nested inside another model, it does not. I need this nested-model structure because I am trying to train a GAN, which has two networks inside (gan = generator + discriminator), so if this test works it will probably work with the GAN too.
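For context, the nested pattern in question looks roughly like this (a made-up minimal wiring, not the notebook's actual code):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Two stand-alone sub-models (sizes are illustrative).
generator = keras.Sequential(
    [layers.Dense(784, activation="tanh", input_shape=(100,))],
    name="generator")
discriminator = keras.Sequential(
    [layers.Dense(1, activation="sigmoid", input_shape=(784,))],
    name="discriminator")

# The combined model nests both sub-models as layers; this nesting is
# what breaks after TPU conversion.
gan_input = keras.Input(shape=(100,))
gan = keras.Model(gan_input, discriminator(generator(gan_input)), name="gan")
gan.summary()
```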
I have tried several things:
- Converting the model to TPU without converting the sub-model; in that case, when training starts, an error is raised about the inputs of the sub-model.
- Converting both the model and the sub-model to TPU; in that case, an error is raised while converting the "parent" model, and the exception only says "layers" at the end.
- Converting only the sub-model to TPU; in that case no error is raised, but training is not accelerated by the TPU and is extremely slow, as if no conversion had been made at all.
Using a fixed batch size or not makes no difference; either way, the model does not work.
Any ideas? Thanks a lot.
Divide the problem into parts: first use only the sub-model on the TPU. Then put something simple in place of the sub-model and run the parent model on the TPU. If this does not work, create something very simple with a similar structure out of models you are sure work, and then step by step add pieces until it converges to the complex model you want to run on the TPU.
I struggle with such things myself. What I did at the very beginning, using MNIST, was to train the model, extract the coefficients, rewrite relu, dense, dropout and the NN matrices myself, and run the model using numpy, then cupy, then pyopencl; then I replaced those functions with my own raw CUDA C and OpenCL functions, so that by going deeper and simpler I could find what was wrong whenever something did not work. In the end I wrote my own genetic selective training algorithm and learned a lot.
Most importantly, it gave me the opportunity to try some crazy ideas for training, modelling, manipulating, and making sense of NN coefficients.
The problem, in my opinion, is that TF, Keras, etc. are too high level. With the optimizers and solvers there is too much unknown. Even the neural networks themselves are not fully under your control. GANs are problematic: training does not converge every time and takes days most of the time, and even when it does train, you have no idea how it converged. Most of the tricks and techniques that protect you from vanishing gradients are not mathematically backed, yet they nevertheless work amazingly well. (?!?)
**Go simpler and deeper, and add complexity step by step. Follow a practice in which you comprehend as much as you can.** It will cost some time and energy, but you will benefit from it tremendously, in my opinion.

Tensorflow op in Keras model

I'm trying to use a TensorFlow op inside a Keras model. I previously tried to wrap it with a Lambda layer, but I believe this disables backpropagation through that layer.
More specifically, I'm trying to use the layers from here in a Keras model, without porting them to Keras layers (I hope to deploy to TensorFlow later on). I can compile these layers into shared-library form and load them into Python. This gives me TensorFlow ops, and I don't know how to combine them into a Keras model.
A simple example of a Keras MNIST model, where for example one Conv2D layer is replaced by a tf.nn.conv2d op, would be exactly what I'm looking for.
I've seen this tutorial but it appears to do the opposite of what I am looking for. It seems to insert Keras layers into a tensorflow graph. I'm looking to do the exact opposite.
Best regards,
Hans
Roughly two weeks have passed and it seems I am able to answer my own question now.
It seems TensorFlow can look up gradients if you register them using this decorator. As of writing, this functionality is not (yet) available in C++, which is what I was looking for. A workaround is to define a normal op in C++ and wrap it in a Python method using the mentioned decorator. Once these functions, with their corresponding gradients, are registered with TensorFlow, backpropagation happens 'automagically'.
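Assuming the decorator in question is tf.RegisterGradient, a minimal sketch of that workaround (the op name and library path are made up):

```python
import tensorflow as tf

# Hypothetical: load the custom op compiled into a shared library.
my_ops = tf.load_op_library("./my_op.so")

# Register a Python gradient for the C++ op; the string must match the
# name declared with REGISTER_OP in the C++ source.
@tf.RegisterGradient("MyOp")
def _my_op_grad(op, grad):
    # Return one gradient per op input; here the upstream gradient is
    # passed straight through, as if the op were an identity.
    return [grad]

# my_ops.my_op(...) can then be used in a model (e.g. inside a Lambda
# layer), and backpropagation will pick up the registered gradient.
```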

Trying to create CNN using Theano and Tensorflow

I am trying to create a CNN using Theano and TensorFlow. The defined layers are MLP and convolvemaxpool, in separate classes, and now I am trying to build the CNN from these classes depending on the input provided by the user. The CNN should call the Theano or TensorFlow classes depending on the user's input. Say both Theano and TensorFlow have MLP and convolvemaxpool classes implementing MLP and convolution respectively. My problem is how to call the right classes depending on the input. I don't want to use if/else, since adding more libraries would mean more if/else statements for each class, which I don't think is the right solution. Any help will be appreciated. Thanks.
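One common way to avoid growing if/else chains like this is a small registry or factory keyed by backend name; here is a minimal sketch with hypothetical module and class names:

```python
import importlib

def get_layer_class(backend, class_name):
    """Look up a layer class by backend name.

    Hypothetical layout: modules theano_layers.py and tensorflow_layers.py
    each define MLP and convolvemaxpool classes with the same interface.
    Adding a new library then means adding one module, not new if/else
    branches everywhere a layer is constructed.
    """
    module = importlib.import_module(f"{backend}_layers")
    return getattr(module, class_name)

# Usage (hypothetical):
# MLP = get_layer_class("tensorflow", "MLP")
# layer = MLP(n_in=784, n_out=10)
```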
