How to reuse an operation in TensorFlow?

Keras layers can be reused: if I have l = keras.layers.Dense(5), I can apply it multiple times to different tensors, like t1 = l(t1); t2 = l(t2).
Is there anything similar in TensorFlow without using Keras?
Why do I need this? I am in non-eager mode and want to create a static .pb graph file. Suppose I have a huge, long function f(t) that transforms a tensor t. Inside a graph it creates a huge sub-graph of operations with tensors flowing between them. I want to reuse that sub-graph rather than call f for every input t, because each call builds a new copy of the sub-graph that differs only in its input. I want to build the sub-graph once and direct different tensors into it as inputs. Reuse also avoids repeatedly calling the huge function to build the same structure for every input tensor, which is slow.
Another important reason for reusing the same operation is that the same weights and other heavy parameters can then be shared across many calls on many inputs. Sharing weights across all inputs is sometimes required for the neural network to train correctly.
The real reason for reuse is not only to save the space occupied by the graph, but also that the number of calls to f(t) may vary with the input. Suppose we have a keras.layers.Input(...) placeholder as input. Its 0-th (batch) dimension is always None (unknown) at graph construction time; the real value is only known when real data is fed through sess.run(...). When data is fed, I want to make as many calls to f(t) as the size of the batch dimension; in other words, I want to call f(t) for every sub-tensor in the batch. E.g. for a batch of images I want to call f(t) for every single image in the batch. Hence there will be a different number of calls to f(t) for different batch sizes. How do I achieve this? Could it be achieved with tf.while_loop, and if so, how do I use a while loop in my case?
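The thread records no answer to this question. One possible approach, sketched here as an assumption rather than a confirmed solution, is tf.map_fn, which is built on tf.while_loop: the function is traced into the graph once and then applied to each slice along the 0-th dimension at run time. The function f, the variable w and the shapes below are hypothetical stand-ins for the real transformation. The sketch uses the tf.compat.v1 API so that it runs in graph (non-eager) mode.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # graph (non-eager) mode, as in the question

graph = tf.Graph()
with graph.as_default():
    # Created once, outside f, so every per-sample call shares the same weights.
    w = tf.get_variable("w", shape=[2], initializer=tf.ones_initializer())

    def f(t):
        # Hypothetical stand-in for the large per-sample transformation.
        return tf.reduce_sum(w * t * t)

    # The batch dimension is None (unknown) at graph-construction time.
    x = tf.placeholder(tf.float32, shape=[None, 2])
    # f is added to the graph once; the loop runs once per row when data is fed.
    y = tf.map_fn(f, x, dtype=tf.float32)
    init = tf.global_variables_initializer()

with tf.Session(graph=graph) as sess:
    sess.run(init)
    # The same graph handles a batch of 3 and a batch of 1.
    out3 = sess.run(y, feed_dict={x: np.arange(6, dtype=np.float32).reshape(3, 2)})
    out1 = sess.run(y, feed_dict={x: np.array([[2.0, 3.0]], dtype=np.float32)})
```

Because w is created outside f, the sub-graph and its weights exist only once, however many rows are fed. If f can be written to operate on the whole batch at once, that is usually faster than a per-sample loop.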

Related

Tensorflow - reuse tensors on different graphs possible?

Is it possible to reuse tensors in multiple tf-graphs, even after they are reset?
Problem:
I have a large dataset that I want to evaluate with many different tf-graphs.
For each evaluation, tensorflow is reset with tf.compat.v1.reset_default_graph() and initialized completely from scratch.
It seems wasteful and slow to repeat the data-to-tensor conversion every time, so I thought I could define the data tensor once and use it for all future evaluations.
Unfortunately, reusing tensors does not seem to be possible, as 'Tensor must be from the same graph as Tensor'.
ValueError: Tensor("Const:0", shape=(1670,), dtype=float32, device=/device:GPU:0) must be from the same graph as Tensor("Const_1:0", shape=(1670,), dtype=float32).
Is it possible to reuse these tensors somehow?
Check out this answer to another question: https://stackoverflow.com/a/42616834/13514201
TensorFlow stores all operations in a computation graph. The graph defines which operations feed which others and links everything together, so that TensorFlow can follow the steps you set up to produce your final output. If you try to pass a Tensor or operation from one graph into a Tensor or operation in another graph, it will fail. Everything must be in the same graph.
Try removing with tf.Graph().as_default():
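A minimal sketch of one way to avoid repeating the data preparation, assuming the data fits in memory as a NumPy array: keep the array outside any graph and feed it into each freshly built graph through a placeholder. The function evaluate, the scale parameter and the toy data are hypothetical; each call builds its own tf.Graph, which plays the role of resetting the default graph.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

# Prepared once; a NumPy array belongs to no graph, so it can be reused freely.
data = np.linspace(0.0, 1.0, 5).astype(np.float32)

def evaluate(scale):
    # Hypothetical evaluation: every call builds a completely fresh graph.
    graph = tf.Graph()
    with graph.as_default():
        x = tf.placeholder(tf.float32, shape=[None])
        y = tf.reduce_sum(x * scale)
    with tf.Session(graph=graph) as sess:
        # The shared array enters the new graph only at run time.
        return sess.run(y, feed_dict={x: data})

results = [evaluate(s) for s in (1.0, 2.0)]
```

Only the tensors are tied to a graph; the underlying array is not, so nothing has to be recomputed between evaluations.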

Formulaically updating parameters in Keras layers

I am trying to write some custom layers in Keras. The ultimate goal is that certain parameters (updated according to a fixed formula after each batch of data is optimized over in the training process) be passed to the loss function. I do not believe it is possible to use dynamic loss functions in Keras, but that I should be able to pass these parameters to the loss function using multiple inputs and a custom layer.
I want to know whether it is possible to create a layer in Keras having parameters that are not trainable (and not optimized over at all in the training process), but instead updated according to a fixed formula at the end of each batch optimization in the training process.
The simplest example I can give: instead of optimizing a generic cost function (like cross-entropy), I want to optimize something proportional to the cross-entropy (c*cross_entropy). After one batch of data is processed in the training procedure, I want to set, for example, c = 1.2*c, and have this used as the value of c for the next batch of data.
(This should be more or less useless in this case as a positive constant times the loss function shouldn't affect the minima but it's fairly close to what I actually need to do).

Why use placeholders for the input data of TensorFlow functions?

When I read TensorFlow code, I see people specify placeholders for the input arguments of functions and then feed the input data in a session.run call. A trivial example:
def sigmoid(z):
    x = tf.placeholder(tf.float32, name='x')
    sigmoid = tf.sigmoid(x)
    with tf.Session() as session:
        result = session.run(sigmoid, feed_dict={x: z})
    return result
I wonder why they don't pass z directly to tf.sigmoid(z) and get rid of the placeholder x.
If this is a best practice, what is the reason behind it?
In your example method sigmoid, you basically build a small computation graph and run it with session.run in the same method. Yes, using a placeholder adds no benefit in your case.
However, people usually just build the computation graph first and execute it with data later. At the time the graph is built, the data is not needed. That is why we use a placeholder to hold the place of the data. In other words, it allows us to define our computing operations without needing any data.
This should also explain why we want to use tf.placeholder instead of tf.Variable for holding training data. In short:
tf.Variable is for trainable parameters of the model.
tf.placeholder is for training data which does not change as model trains.
No initial values are needed for placeholders.
The first dimension of data through feeding could be None thus supporting any batch_size.
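A minimal sketch of the "build once, feed later" pattern described above, assuming the tf.compat.v1 API in graph mode. The graph is defined once with no data present, then run with inputs of two different lengths. The variable names small and large are illustrative only.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    # No data is needed here; None leaves the first dimension open.
    x = tf.placeholder(tf.float32, shape=[None], name="x")
    sigmoid = tf.sigmoid(x)

with tf.Session(graph=graph) as session:
    # The same graph is reused for inputs of different sizes.
    small = session.run(sigmoid, feed_dict={x: np.zeros(2, dtype=np.float32)})
    large = session.run(sigmoid, feed_dict={x: np.zeros(7, dtype=np.float32)})
```

Compared with the function in the question, the graph and session are created once rather than on every call, which is where the placeholder starts to pay off.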

Make a Custom loss function in Keras in detail

I am trying to write a custom loss function in Keras.
I want to make this loss function
The dimension of the output is 80. The batch size is 5000.
So I wrote the loss function below, but it doesn't work.
def normalize_activation(y_true, y_pred):
    nb_divide = K.reshape(K.sqrt(K.sum(K.square(y_pred), axis=1)), (5000, 1))
    nb_divide = numpy.tile(nb_divide, 80)
    predicted = numpy.divide(y_pred, nb_divide)
    return K.sum(K.square(y_true - predicted))
ValueError: setting an array element with a sequence.
This error occurs. I think the shape of both y_true and y_pred is (5000, 80).
What should I fix?
Loss functions should avoid any operations that are not from the Keras backend. The values are tensors, and you must keep them as tensors.
And you don't need to reshape things unless you actually want them to behave in a specific way.
If you have shapes (5000,80) and (5000,1), you can make operations with them without needing K.repeat_elements() (the equivalent to numpy.tile).
So, supposing that 5000 is the batch size (number of samples) and 80 is the only actual dimension belonging to a sample:
def normalize_loss(yTrue, yPred):
    nb_divide = K.sqrt(K.sum(K.square(yPred), axis=1, keepdims=True))
    # keepdims=True keeps the shape like (5000,1)
    # this is not summing the entire batch, but only a single sample, is that correct?
    predicted = yPred / nb_divide
    return K.sum(K.square(yTrue - predicted))
Some observations:
(I'm not a loss function expert here) You're dividing only the predicted part, but not the true part. Wouldn't that create big differences between both values and result in a misleading loss function? (Again, I'm not the expert here)
Usually people use K.mean() at the end of the loss function, but I see you used K.sum(). This is not a problem and doesn't prevent training from working. But you may like to visualize this same loss function for data with different sizes, and be able to compare them size-independently.
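To check the broadcasting argument without Keras, here is a NumPy mirror of the answer's function. It is a sketch for verifying shapes and arithmetic only, not a usable Keras loss; the name normalize_loss_np and the small random shapes are illustrative assumptions.

```python
import numpy as np

def normalize_loss_np(y_true, y_pred):
    # Per-sample L2 norm; keepdims=True gives shape (batch, 1).
    norms = np.sqrt(np.sum(np.square(y_pred), axis=1, keepdims=True))
    # (batch, dim) / (batch, 1) broadcasts across columns, so no tiling is needed.
    predicted = y_pred / norms
    return np.sum(np.square(y_true - predicted))

rng = np.random.default_rng(0)
y_pred = rng.normal(size=(4, 3))
# Targets that are already unit-normalized copies of the predictions.
y_true = y_pred / np.linalg.norm(y_pred, axis=1, keepdims=True)
loss = normalize_loss_np(y_true, y_pred)  # close to zero by construction
```

The same broadcasting rule applies to backend tensors, which is why the answer drops the reshape and the tiling step.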

Tensorflow: how do I extract/export variable values at every iteration of training?

I have been playing around with some neural networks on Tensorflow and I wanted to make a visualization of the neural network's learning process.
To do so, I intend to extract the following variables into text/JSON/csv: pre-activation result before 1st layer, activation, bias and weight values for testing and training, each layer and for all time steps. I am looking for a generalizable solution so that I don't have to modify my source code (or at least not more than one or two lines) when applying visualization to future networks. Ideally I could run some function from another python program to read any python/TF code and extract the variables described above. So far I have considered the following solutions:
1) use tf.summary and the filewriter to save as a serialized protocol buffer, then find a way to go from protocol buffer --> JSON format. This unfortunately would not fit the bill as it requires me to modify too much inner code.
2) Perhaps using https://www.tensorflow.org/api_docs/python/tf/train/export_meta_graph, although I am not sure how to implement this, given that my TF foundations are not quite there yet.
3) I have also found this solution:
W_val, b_val= sess.run([W, b])
np.savetxt("W1.csv", W_val, delimiter=",")
np.savetxt("b1.csv", b_val, delimiter=",")
But the problem is that it only saves the final values of the weights and biases, whereas I am looking to save their values at all timesteps of training.
If anyone has any suggestions on how to tackle this problem or any guidance I would appreciate it.
Many thanks
for step in range(num_train_steps):
    # fetch the current weight and bias values together with the training op
    _, weight_values, bias_values = sess.run([your_train_op, weight, bias])
    # save weight_values and bias_values
Doing it with tf.summary is probably a good idea. You could then visualize it all in TensorBoard, much like some of the tutorials and the Inception retraining code do.
Alternatively you could perform fetches within your sess.run() call to grab whatever tensors you like at every step (i.e. every run call).
I have pasted a response to a similar question regarding extracting the cross entropy from another question below:
When you do your session run call (e.g. res = sess.run(...) ) then you can put in a fetch for your cross entropy variable.
For example, let's say you have a complicated sess.run() call that gets some predictions, but you also want your cross entropy. Then you may have code that looks like this:
feeds = {x_data: x, y_data: y}
fetches = [y_result, cross_entropy]
res = sess.run(fetches=fetches, feed_dict=feeds)
predictions = res[0]  # your first fetch parameter
xent = res[1]  # your second fetch parameter
Fetches within the run call allow you to "fetch" tensors from your graph.
You should be able to do the above but instead of cross entropy, just a list of whatever you want. I use it to fetch both my summaries and also intermediate accuracy values.
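Putting the per-step fetch together with the JSON export the question asks for, a minimal end-to-end sketch might look like this. It assumes the tf.compat.v1 API in graph mode; the toy linear model, the learning rate and the history list are hypothetical and stand in for a real network.

```python
import json
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.float32, shape=[None, 1])
    y = tf.placeholder(tf.float32, shape=[None, 1])
    weight = tf.get_variable("weight", shape=[1, 1], initializer=tf.zeros_initializer())
    bias = tf.get_variable("bias", shape=[1], initializer=tf.zeros_initializer())
    loss = tf.reduce_mean(tf.square(tf.matmul(x, weight) + bias - y))
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
    init = tf.global_variables_initializer()

# Toy data for y = 2x + 1.
xs = np.array([[0.0], [1.0], [2.0]], dtype=np.float32)
ys = 2.0 * xs + 1.0

history = []
with tf.Session(graph=graph) as sess:
    sess.run(init)
    for step in range(5):
        sess.run(train_op, feed_dict={x: xs, y: ys})
        # Fetch in a separate call so the snapshot is clearly post-update.
        w_val, b_val = sess.run([weight, bias])
        history.append({"step": step,
                        "weight": w_val.tolist(),
                        "bias": b_val.tolist()})

# One JSON record per training step; write it to a file as needed.
as_json = json.dumps(history)
```

Fetching the variables in the same sess.run call as the training op also works, but the sketch reads them in a separate call so that each record unambiguously reflects the state after that step's update.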
