I'm trying to update a two-dimensional tensor in a nested while_loop(). However, when I pass the variable to the second loop, I cannot update it using tf.assign(); it throws this error:
ValueError: Sliced assignment is only supported for variables
Somehow it works fine if I create the variable outside the while_loop and use it only in the first loop.
How can I modify my 2D tf variable in the second while loop?
(I'm using python 2.7 and TensorFlow 1.2)
My code:
import tensorflow as tf
import numpy as np

tf.reset_default_graph()

BATCH_SIZE = 10
LENGTH_MAX_OUTPUT = 31

it_batch_nr = tf.constant(0)
it_row_nr = tf.Variable(0, dtype=tf.int32)
it_col_nr = tf.constant(0)
cost = tf.constant(0)

it_batch_end = lambda it_batch_nr, cost: tf.less(it_batch_nr, BATCH_SIZE)
it_row_end = lambda it_row_nr, cost_matrix: tf.less(it_row_nr, LENGTH_MAX_OUTPUT+1)

def iterate_batch(it_batch_nr, cost):
    cost_matrix = tf.Variable(np.ones((LENGTH_MAX_OUTPUT+1, LENGTH_MAX_OUTPUT+1)), dtype=tf.float32)
    it_rows, cost_matrix = tf.while_loop(it_row_end, iterate_row, [it_row_nr, cost_matrix])
    cost = cost_matrix[0,0] # IS 1.0, SHOULD BE 100.0
    return tf.add(it_batch_nr,1), cost

def iterate_row(it_row_nr, cost_matrix):
    # THIS THROWS AN ERROR:
    cost_matrix[0,0].assign(100.0)
    return tf.add(it_row_nr,1), cost_matrix

it_batch = tf.while_loop(it_batch_end, iterate_batch, [it_batch_nr, cost])

sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
out = sess.run(it_batch)
print(out)
tf.Variable objects cannot be used as loop variables in a while loop, as loop variables are implemented differently.
So either create your variable outside the loop and update it yourself with tf.assign in each iteration or manually keep track of the updates as you do with loop variables (by returning their updated values from the loop lambdas, and in your case using the value from the inner loop as the new value for the outer loop).
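The second option (threading the matrix through the loops as an ordinary tensor) could look roughly like the sketch below. It is not the asker's code: it assumes a TensorFlow build that ships `tensorflow.compat.v1`, uses small hypothetical sizes `N` and `BATCH_SIZE`, and introduces a hypothetical helper `set_element` that builds a new tensor with one entry replaced instead of assigning into a variable.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()
tf.reset_default_graph()

N = 4           # hypothetical matrix size (stands in for LENGTH_MAX_OUTPUT + 1)
BATCH_SIZE = 3  # hypothetical number of outer iterations


def set_element(matrix, row, col, value):
    """Return a copy of `matrix` with matrix[row, col] replaced by `value`."""
    shape = matrix.shape.as_list()
    # mask is 1.0 at (row, col) and 0.0 everywhere else
    mask = tf.scatter_nd([[row, col]], [1.0], shape)
    return matrix * (1.0 - mask) + value * mask


def iterate_row(row_nr, cost_matrix):
    # The updated matrix is returned as a loop variable, not assigned in place.
    cost_matrix = set_element(cost_matrix, 0, 0, 100.0)
    return row_nr + 1, cost_matrix


def iterate_batch(batch_nr, cost):
    cost_matrix = tf.ones((N, N), dtype=tf.float32)
    _, cost_matrix = tf.while_loop(
        lambda row_nr, m: row_nr < N, iterate_row, [tf.constant(0), cost_matrix])
    # The inner loop's result feeds the outer loop variable.
    return batch_nr + 1, cost_matrix[0, 0]


_, final_cost = tf.while_loop(
    lambda batch_nr, c: batch_nr < BATCH_SIZE, iterate_batch,
    [tf.constant(0), tf.constant(0.0)])

with tf.Session() as sess:
    result = sess.run(final_cost)
```

Because no tf.Variable is involved, no initializer and no control dependencies are needed; the cost is that every update builds a new tensor.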
Got this to work, with @AlexandrePassos' help, by placing the Variable outside the while_loop. However, I also had to force the execution of the commands using tf.control_dependencies() (as the operations are not directly used on the loop variable). The loop now looks like this:
cost_matrix = tf.Variable(np.ones((LENGTH_MAX_OUTPUT+1, LENGTH_MAX_OUTPUT+1)), dtype=tf.float32)

def iterate_batch(it_batch_nr, cost):
    it_rows = tf.while_loop(it_row_end, iterate_row, [it_row_nr])
    with tf.control_dependencies([it_rows]):
        cost = cost_matrix[0,0]
    return tf.add(it_batch_nr,1), cost

def iterate_row(it_row_nr):
    a = tf.assign(cost_matrix[0,0], 100.0)
    with tf.control_dependencies([a]):
        return tf.add(it_row_nr,1)
EDIT: Problem solved, I just had to "transform" the input W using W.data ...
Hi guys,
In my code, I am trying to train a model so that it moves a given sample to a given target distribution. The next step is to introduce intermediate distributions and to use a loop so that the particles (the samples) are moved from one distribution to the next iteratively. Unfortunately, at the second iteration I get the following error message when running my code:
"Trying to backward through the graph a second time (or directly access saved variables after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved variables after calling backward"
I don't think that retain_graph = True fits my problem, since I would rather clear the model after every iteration than retain it. However, I gave it a shot, and the result is the following error:
one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [1, 2]] is at version 2251; expected version 2250 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
Here are the relevant parts of my code:
for k in range(1, K_intermediate+1):
    flow = BasicFlow(dim=d, n_flows=n_flows, flow_layer=flow_layer)
    ldj = train_flow(
        flow, x, W, f_intermediate(x,k-1),
        lambda x: f_intermediate(x,k), epochs=2500
    )
    x, xtransp = flow(x)
    x = xtransp.data
And the train_flow function:
def train_flow(flow, sample, weights, f0, f1, epochs=1000):
    optim = torch.optim.Adam(flow.parameters(), lr=1e-2)
    for i in range(epochs):
        x0, xtransp = flow(sample)
        ldj = accumulate_kl_div(flow).reshape(sample.size(0))
        loss = det_loss(
            x_0 = x0,
            x_transp = xtransp,
            weights = weights,
            ldj = ldj,
            f0 = f0,
            f1 = f1
        )
        loss.backward(retain_graph = True)
        optim.step()
        optim.zero_grad()
        reset_kl_div(flow)
        if i % 250 == 0:
            if i > 0 and previous_loss - loss.item() < 1e-06:
                break
            print(loss.item())
            previous_loss = loss.item()
        if torch.isnan(loss) == True:
            break
    return ldj
Note that the problem only arises since I started capturing the ldj value (the log of the determinant of the Jacobian, for those who wonder). Since this value is crucial for further computations, I cannot just delete it.
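The EDIT at the top of this post reports that passing W.data fixed the error. A likely reading is that W still carried autograd history from an earlier computation, so every backward() call tried to walk that old, already freed graph again. The toy sketch below reproduces that pattern; `train_two_steps`, `base` and the tensor values are hypothetical, and it uses `.detach()`, which cuts the history in the same way as `.data` (an assumption about what the fix in the post amounts to).

```python
import torch


def train_two_steps(weights):
    """Run two optimizer steps on a toy loss that multiplies by `weights`."""
    param = torch.nn.Parameter(torch.tensor([1.0, 2.0]))  # hypothetical model parameter
    optim = torch.optim.SGD([param], lr=0.1)
    for _ in range(2):
        loss = (param * weights).sum()
        loss.backward()  # also backpropagates into whatever graph produced `weights`
        optim.step()
        optim.zero_grad()
    return param.detach()


base = torch.tensor([0.5, 0.5], requires_grad=True)
W = torch.exp(base)  # W carries autograd history (its graph saves intermediate values)

try:
    train_two_steps(W)
    failed_with_history = False
except RuntimeError:
    # Second backward() hits the already freed graph behind W.
    failed_with_history = True

# Detaching W drops its history, so each iteration only backpropagates
# through the graph built in that iteration.
result = train_two_steps(W.detach())
```

With the history removed, retain_graph=True should no longer be needed in the training loop.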
I read the following statement while reading about AutoGraph and tracing in TensorFlow.
TensorFlow will only capture for loops that iterate over a tensor or a
dataset. So make sure you use for i in tf.range(x) rather than for i
in range(x), or else the loop will not be captured in the graph.
Instead, it will run during tracing.
(This may be what you want if the for loop is meant to build the graph, for example to create each layer in a neural network.)
I am confused as to what exactly happens. If the loop runs during tracing, why is it not registered in the graph, and how would the for loop then build the graph?
An example which shows the difference between a tf.range loop and a range loop:
for i in tf.range(3):
    x = tf.add(x, i)
results in a graph which contains a tf.while_loop that matches the for loop; this is the translation that AutoGraph makes:
def cond(i, x):
    return tf.less(i, 3)
def body(i, x):
    x = tf.add(x, i)
    return i + 1, x
tf.while_loop(cond, body, ...)
In turn:
for i in range(3):
    x = tf.add(x, i)
results in a graph which contains three tf.add calls, with i substituted by constants and without any loop ops:
x = tf.add(x, 0)
x = tf.add(x, 1)
x = tf.add(x, 2)
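One way to check this yourself is to trace both variants with tf.function and list the op types in the resulting graphs. The sketch below assumes TensorFlow 2.x; the function names and the helper `op_types` are made up for illustration.

```python
import tensorflow as tf


@tf.function
def add_with_tf_range(x):
    for i in tf.range(3):  # converted by AutoGraph into a while op
        x = tf.add(x, i)
    return x


@tf.function
def add_with_py_range(x):
    for i in range(3):     # plain Python loop, unrolled while tracing
        x = tf.add(x, i)
    return x


def op_types(fn, example_input):
    """Op types in the graph traced for `fn` (hypothetical helper)."""
    graph = fn.get_concrete_function(example_input).graph
    return [op.type for op in graph.get_operations()]


x0 = tf.constant(0)
tf_range_ops = op_types(add_with_tf_range, x0)
py_range_ops = op_types(add_with_py_range, x0)

has_while_tf_range = any("While" in t for t in tf_range_ops)
has_while_py_range = any("While" in t for t in py_range_ops)
n_adds_py_range = sum(t in ("Add", "AddV2") for t in py_range_ops)
```

Both functions compute the same value; only the graph differs. The Python loop executes once, at trace time, and each pass through its body adds another tf.add node, which is what "the for loop builds the graph" means.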
Is there some way to create variables inside a map_fn loop, as shown in the code below? How can I solve this error while keeping a variable in the loop? The info log does not really help me either, so am I getting some TensorFlow concept fundamentally wrong here? [tensorflow 1.14.0, python 3.6.8]
import tensorflow as tf

### function called in map_fn
def opt_variable(theta):
    init_theta = lambda: theta
    var_theta = tf.get_variable(dtype=tf.float32, initializer=tf.Variable(init_theta))
    ### ... other steps which need variable type to optimize
    return tf.constant(3.) # some return

def iterate_over_cols(theta):
    iter_cols = tf.range(5)
    map_theta = tf.map_fn(lambda x: (opt_variable(theta[x])),
                          iter_cols, dtype=tf.float32 )
    return map_theta

### example run
t_test = tf.convert_to_tensor([1.4, 3.1, 4.6, 6.3], dtype=tf.float32)
iterate_over_cols(t_test)
leads to this error:
ValueError: Cannot use 'map_18/while/strided_slice' as input to
'map_18/while/Variable/Assign' because 'map_18/while/strided_slice' is
in a while loop. See info log for more details.
It seems that you cannot use nested while loops in this version, which means you cannot use the output of one map_fn as the input to another.
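The error message itself points at the variable's initializer: its initial value is a slice produced inside the loop that map_fn builds. One possible workaround, sketched below under the assumption that a single variable holding all of theta is acceptable, is to create the variable once outside map_fn and only read slices of it inside. The variable name "theta" and the doubling in `per_element` are placeholders for the real optimization steps, and the sketch assumes a TensorFlow build that ships `tensorflow.compat.v1`.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()
tf.reset_default_graph()

t_test = tf.constant([1.4, 3.1, 4.6, 6.3], dtype=tf.float32)

# The variable is created once, outside any loop, from the full tensor.
var_theta = tf.get_variable("theta", initializer=t_test)


def per_element(idx):
    # Reading a slice of an existing variable inside map_fn is allowed;
    # the multiplication is a placeholder for the real per-element work.
    return var_theta[idx] * 2.0


mapped = tf.map_fn(per_element, tf.range(4), dtype=tf.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    out = sess.run(mapped)
```

Note that the index range here matches the four elements of the tensor.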
I have to run something like the following code
import tensorflow as tf

sess = tf.Session()
x = tf.Variable(42.)
for i in range(10000):
    sess.run(x.assign(42.))
    sess.run(x)
    print(i)
several times. The actual code is much more complicated and uses more variables.
The problem is that each call to x.assign() instantiates a new assign op, so the TensorFlow graph keeps growing and eventually slows down the computation.
I could use feed_dict= to set the value, but I would like to keep my state in the graph, so that I can easily query it in other places.
Is there some way of avoiding cluttering the current graph in this case?
I think I've found a good solution for this:
I define a placeholder y and create an op that assigns the value of y to x.
I can then use that op repeatedly, using feed_dict={y: value} to assign a new value to x.
This doesn't add another op to the graph.
It turns out that the loop runs much more quickly than before as well.
import tensorflow as tf

sess = tf.Session()
x = tf.Variable(42.)
y = tf.placeholder(dtype=tf.float32)
assign = x.assign(y)
sess.run(tf.initialize_all_variables())
for i in range(10000):
    sess.run(assign, feed_dict={y: i})
    print(i, sess.run(x))
Each time you call sess.run(x.assign(42.)), two things happen: (i) a new assign operation is added to the computational graph sess.graph, and (ii) the newly added operation executes. No wonder the graph gets pretty large if the loop repeats many times. If you define the assignment operation before execution (asgnmnt_operation in the example below), just a single operation is added to the graph, so the performance is great:
import tensorflow as tf

x = tf.Variable(42.)
c = tf.constant(42.)
asgnmnt_operation = x.assign(c)
sess = tf.Session()
for i in range(10000):
    sess.run(asgnmnt_operation)
    sess.run(x)
    print(i)
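To see the growth directly, you can count the operations in the default graph before and after each pattern. This is only a sketch, assuming a TensorFlow build that ships `tensorflow.compat.v1`; the names `grown_by` and `reuse_growth` are made up for the illustration.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()
tf.reset_default_graph()

x = tf.Variable(42.)
graph = tf.get_default_graph()

# Pattern 1: calling x.assign(...) in a loop adds new ops on every call.
before = len(graph.get_operations())
for _ in range(5):
    x.assign(42.)
grown_by = len(graph.get_operations()) - before

# Pattern 2: build one assign op, then feed new values through a placeholder.
y = tf.placeholder(tf.float32, shape=[])
assign_op = x.assign(y)

with tf.Session() as sess:
    before_reuse = len(graph.get_operations())
    for i in range(5):
        sess.run(assign_op, feed_dict={y: float(i)})
    reuse_growth = len(graph.get_operations()) - before_reuse
    final_value = sess.run(x)
```

Calling graph.finalize() before the loop is another way to catch this: any attempt to add an op afterwards raises an error.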
Suppose we have a variable:
x = tf.Variable(...)
This variable can be updated during the training process using the assign() method.
What is the best way to get the current value of a variable?
I know we could use this:
session.run(x)
But I'm afraid this would trigger a whole chain of operations.
In Theano, you could just do
y = theano.shared(...)
y_vals = y.get_value()
I'm looking for the equivalent thing in TensorFlow.
The only way to get the value of the variable is by running it in a session. In the FAQ it is written that:
A Tensor object is a symbolic handle to the result of an operation,
but does not actually hold the values of the operation's output.
So the TF equivalent would be:
import tensorflow as tf

x = tf.Variable([1.0, 2.0])
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    v = sess.run(x)
    print(v) # will show you your variable.
The part with init = tf.global_variables_initializer() is important: it must be run in order to initialize the variables.
Also, take a look at InteractiveSession if you work in IPython.
In general, session.run(x) will evaluate only the nodes that are necessary to compute x and nothing else, so it should be relatively cheap if you want to inspect the value of the variable.
Take a look at this great answer https://stackoverflow.com/a/33610914/5543198 for more context.
tf.Print can simplify your life!
tf.Print will print the value of the tensor(s) you tell it to print at the moment where the tf.Print line is called in your code when your code is evaluated.
So for example:
import tensorflow as tf

x = tf.Variable([1.0, 2.0])
x = tf.Print(x, [x])
x = 2 * x
sess = tf.Session()
sess.run(tf.global_variables_initializer())
sess.run(x)
prints:
[1.0 2.0 ]
because it prints the value of x at the point in the graph where the tf.Print op sits. If instead you do
v = x.eval(session=sess)
print(v)
you will get:
[2.0 4.0 ]
because it will give you the final value of x.
In TensorFlow 2.0.0, sessions are no longer needed because eager execution is the default (tf.Variable itself still exists).
If you want to extract values from a tensor (e.g. "net"), you can use this:
net[tf.newaxis,:,:].numpy()
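As a minimal sketch, assuming TensorFlow 2.x with eager execution enabled (the default): tf.Variable is available there, and both variables and tensors expose .numpy(), so no session is involved. The tensor `net` below is a hypothetical stand-in for a network output.

```python
import tensorflow as tf

x = tf.Variable([1.0, 2.0])
x.assign([3.0, 4.0])   # runs immediately in eager mode
values = x.numpy()     # current value as a NumPy array

net = tf.ones((2, 3))                    # hypothetical stand-in tensor
expanded = net[tf.newaxis, :, :].numpy() # adds a leading axis, then converts
```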