I have TensorFlow code that runs well and accurately, but occupies a lot of memory. Specifically, in my code I have a for loop that looks something like this:
import numpy as np
import tensorflow as tf

K = 10
myarray1 = tf.placeholder(tf.float32, shape=[None, 5, 5])
myarray2 = tf.Variable(np.zeros([K, 5, 5]), dtype=tf.float32)
vals = []
for k in range(K):
    tmp = tf.reduce_sum(myarray1 * myarray2[k], axis=(1, 2))
    vals.append(tmp)
result = tf.reduce_min(tf.stack(vals, axis=-1), axis=-1)
Unfortunately, that takes a lot of memory as K gets to be big in my application. So, I want to have a better way of doing it. For example, in numpy/python, you would just keep track of the minimum value as you iterate through the loops, and update it on each iteration. It seems like I could use tf.assign, as:
K = 10
myarray1 = tf.placeholder(tf.float32, shape=[None, 5, 5])
myarray2 = tf.Variable(np.zeros([K, 5, 5]), dtype=tf.float32)
min_value = tf.Variable(myarray1, validate_shape=False, trainable=False)
for k in range(K):
    tmp = myarray1 * myarray2[k]
    idx = tf.where(tmp < min_value)
    tf.scatter_nd_update(min_value, idx, tf.gather_nd(tmp, idx), use_locking=True)
result = min_value
While this code builds the graph (when validate_shape=False), it fails to run because it complains that min_value has not been initialized. The issue is, when I run the initializer as:
sess.run(tf.global_variables_initializer())
or
sess.run(tf.variables_initializer(tf.trainable_variables()))
it complains that I am not feeding in a placeholder. This actually makes sense because the definition of min_value depends on myarray1 in the graph.
What I would actually want to do is define a dummy variable that doesn't depend on myarray1's values, but does match its shape. I would like these values to be initialized as some number (in this case something large is fine), as I will manually ensure these are overwritten in the network.
Note: as far as I know, you currently cannot define a variable with an unknown shape unless you pass in another tensor of the desired shape and set validate_shape=False. Maybe there is another way?
Any help / suggestions appreciated.
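For reference, one way to express the "keep a running minimum" idea without stacking K tensors or introducing any extra variables is tf.foldl, which threads an accumulator through the rows of myarray2 inside a single graph loop. This is only a sketch of that direction (not taken from the answer below), assuming the same shapes as in the snippets above:

import numpy as np
import tensorflow as tf

K = 10
myarray1 = tf.placeholder(tf.float32, shape=[None, 5, 5])
myarray2 = tf.Variable(np.zeros([K, 5, 5]), dtype=tf.float32)

def fold_min(running_min, w_k):
    # running_min has shape [None]; w_k is one [5, 5] slice of myarray2
    return tf.minimum(running_min, tf.reduce_sum(myarray1 * w_k, axis=(1, 2)))

# start from a large value, as described above, and fold each slice in
init = tf.fill(tf.shape(myarray1)[:1], np.float32(1e30))
result = tf.foldl(fold_min, myarray2, initializer=init)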
Try this. If you don't know how to feed a placeholder, read the tutorial.
K = 10
myarray1 = tf.placeholder(tf.float32, shape=[None, 5, 5])
################### ADD THIS ####################
sess = tf.Session()
FOO = sess.run(myarray1, feed_dict={myarray1: YOURDATA})  # get myarray1's value
# replace all myarray1 below with FOO
#################################################
myarray2 = tf.Variable(np.zeros([K, 5, 5]), dtype=tf.float32)
min_value = tf.Variable(FOO, validate_shape=False, trainable=False)
for k in range(K):
    tmp = FOO * myarray2[k]
    idx = tf.where(tmp < min_value)
    tf.scatter_nd_update(min_value, idx, tf.gather_nd(tmp, idx), use_locking=True)
result = min_value
-------above new 15.April.2018------
Since I don't know your input data, I will try to lay out the steps.
Step_1: make a place for input data
x = tf.placeholder(tf.float32, shape=[None,2])
Step_2: Get batches of data
batch_x=[[1,2],[3,4]] #example
# since x has shape [None, 2], this example batch has batch size 2
Step_3: make a session
sess=tf.Session()
If you have variables, add the following code to initialize them before the calculation:
init = tf.global_variables_initializer()
sess.run(init)
Step_4:
yourplaceholderdictionary = {x: batch_x}
sess.run(x, feed_dict=yourplaceholderdictionary)
Always feed your placeholders so they get the values needed for the calculation.
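Putting the steps together, here is a minimal end-to-end sketch (the y = x * 2 computation is just an assumed example op on top of the placeholder from Step_1):

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 2])
y = x * 2  # an example computation that uses the placeholder

batch_x = [[1, 2], [3, 4]]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # only needed if the graph has variables
    print(sess.run(y, feed_dict={x: batch_x}))
    # [[2. 4.]
    #  [6. 8.]]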
There is a very helpful PDF called "TensorFlow and Deep Learning without a PhD"; you can also find it on YouTube under the same title.
I am looking for help thinking through this.
I have a function (that is not a generator) that will give me any number of samples.
Let's say that all the data I want to train on (1000 samples) can't fit into memory.
So I want to call this function 10 times to get smaller chunks of samples that do fit into memory.
This is a dummy example for simplicity.
def get_samples(num_samples: int, random_seed=0):
    np.random.seed(random_seed)
    x = np.random.randint(0, 100, num_samples)
    y = np.random.randint(0, 2, num_samples)
    return np.array(list(zip(x, y)))
Again, let's say get_samples(1000, 0) won't fit into memory.
So in theory I am looking for something like this:
batch_size = 100
total_num_samples = 1000
batches = []
for i in range(total_num_samples // batch_size):
    batches.append(get_samples(batch_size, i))
But this still loads everything into memory.
Again, this function is a dummy representation; the real one is already defined and is not a generator.
In TF land, I was hoping that:
tf.data.Dataset.batch[0] would equal the output of get_samples(100, 0)
tf.data.Dataset.batch[1] would equal the output of get_samples(100, 1)
tf.data.Dataset.batch[2] would equal the output of get_samples(100, 2)
...
tf.data.Dataset.batch[9] would equal the output of get_samples(100, 9)
I understand that I can use tf.data.Dataset with a generator (and I think you can set up a generator per batch). But the function I have gives more than a single sample, and the setup is too expensive to repeat for every single sample.
I was hoping to use tf.data.Dataset.prefetch() to run the batch-producing function for every batch. And of course, it would be called with the same parameters on every epoch.
Sorry if the explanation is convoluted. Trying my best to describe the problem.
Anyone have any ideas?
This is what I came up with:
def simple_static_synthesizer(batch_size, seed=1, verbose=True):
    if verbose:
        print(f"Creating Synthetic Data with seed {seed}")
    rng = np.random.default_rng(seed)
    all_x = []
    all_y = []
    for i in range(batch_size):
        x = np.concatenate((rng.integers(0, 100, 1, dtype=int),
                            rng.integers(0, 100, 1, dtype=int),
                            rng.integers(0, 100, 1, dtype=int)))
        y = np.array(rng.integers(0, 2, 1, dtype=int))
        all_x.append(x)
        all_y.append(y)
    return all_x, all_y

def my_generator(total_size, batch_size, seed=0, verbose=True):
    counter = 0
    for i in range(total_size):
        # Regenerate the synthetic data once per batch
        if counter % batch_size == 0:
            x, y = simple_static_synthesizer(batch_size, seed, verbose)
            seed += 1
        yield x[i % batch_size], y[i % batch_size]
        counter += 1
my_gen = my_generator(10, 2, seed=1)
# See values
for x, y in my_gen:
    print(x, y)

# Call again, this gives the same answer as above
my_gen = my_generator(10, 2, seed=1)
for x, y in my_gen:
    print(x, y)
# Dataset with small batches to see if it is doing it correctly
total_samples = 10
batch_size = 2
seed = 5
dataset = tf.data.Dataset.from_generator(
    my_generator,
    args=[total_samples, batch_size, seed],
    output_signature=(
        tf.TensorSpec(shape=(3,), dtype=tf.uint8),
        tf.TensorSpec(shape=(1,), dtype=tf.uint8),
    )
)
for i, (x, y) in enumerate(dataset):
    print(x.numpy(), y.numpy())
    if i == 4:
        break  # shows the first 3 synthesizer calls
Wish we could have notebook answers!
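Another sketch of the same idea (mine, not part of the question or the answer above): since the expensive function already returns a whole batch, the generator can yield complete batches directly and let tf.data just prefetch them, so nothing is regenerated per sample. This assumes TF 2.x and the dummy get_samples from the question:

import numpy as np
import tensorflow as tf

def get_samples(num_samples, random_seed=0):
    # stand-in for the expensive, non-generator sampling function from the question
    np.random.seed(random_seed)
    x = np.random.randint(0, 100, num_samples)
    y = np.random.randint(0, 2, num_samples)
    return np.array(list(zip(x, y)), dtype=np.int64)

def batch_generator(num_batches, batch_size):
    # one expensive call per batch, yielding the whole batch at once
    for i in range(num_batches):
        yield get_samples(batch_size, i)

batch_size = 100
num_batches = 10

dataset = tf.data.Dataset.from_generator(
    lambda: batch_generator(num_batches, batch_size),
    output_signature=tf.TensorSpec(shape=(batch_size, 2), dtype=tf.int64),
).prefetch(1)  # prepare the next batch while the current one is consumed

for i, batch in enumerate(dataset):
    # batch i here should match get_samples(batch_size, i)
    print(i, batch.shape)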
I want to loop over a tensor which contains a list of ints, and apply a function to each of the elements.
In the function, every element gets its value from a Python dict.
I have tried the easy way with tf.map_fn, which works for a simple add function, such as the following code:
import tensorflow as tf

def trans_1(x):
    return x + 10

a = tf.constant([1, 2, 3])
b = tf.map_fn(trans_1, a)
with tf.Session() as sess:
    res = sess.run(b)
    print(str(res))
# output: [11 12 13]
But the following code throws a KeyError: <tf.Tensor 'map_8/while/TensorArrayReadV3:0' shape=() dtype=int32> exception:
import tensorflow as tf

kv_dict = {1: 11, 2: 12, 3: 13}

def trans_2(x):
    return kv_dict[x]

a = tf.constant([1, 2, 3])
b = tf.map_fn(trans_2, a)
with tf.Session() as sess:
    res = sess.run(b)
    print(str(res))
My TensorFlow version is 1.13.1. Thanks in advance.
There is a simple way to achieve what you are trying to do.
The problem is that the function passed to map_fn must take tensors as its parameters and return a tensor. However, your function trans_2 takes a plain Python int as a parameter and returns another Python int. That's why your code doesn't work.
However, TensorFlow provides a simple way to wrap ordinary Python functions, which is tf.py_func; you can use it in your case as follows:
import tensorflow as tf

kv_dict = {1: 11, 2: 12, 3: 13}

def trans_2(x):
    return kv_dict[x]

def wrapper(x):
    return tf.cast(tf.py_func(trans_2, [x], tf.int64), tf.int32)

a = tf.constant([1, 2, 3])
b = tf.map_fn(wrapper, a)
with tf.Session() as sess:
    res = sess.run(b)
    print(str(res))
You can see I have added a wrapper function, which expects a tensor parameter and returns a tensor; that's why it can be used in map_fn. The cast is used because Python by default uses 64-bit integers, whereas the input tensor here uses 32-bit integers.
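If you prefer to avoid the cast, one small variation (a sketch, not from the original answer) is to make the wrapped function itself return 32-bit values and declare tf.int32 as the py_func output type:

import numpy as np
import tensorflow as tf

kv_dict = {1: 11, 2: 12, 3: 13}

def trans_2(x):
    # return a NumPy int32 so the declared py_func output type matches
    return np.int32(kv_dict[int(x)])

a = tf.constant([1, 2, 3])
b = tf.map_fn(lambda t: tf.py_func(trans_2, [t], tf.int32), a)

with tf.Session() as sess:
    print(sess.run(b))  # [11 12 13]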
You cannot use a function like that, because the parameter x is a TensorFlow tensor, not a Python value. So, in order for that to work, you would have to turn your dictionary into a tensor as well, but it's not so simple because keys in the dictionary may not be sequential.
You can instead solve this problem without mapping, by doing something similar to what is proposed here for NumPy. In TensorFlow, you could implement it like this:
import tensorflow as tf

def replace_by_dict(x, d):
    # Get keys and values from the dictionary
    keys, values = zip(*d.items())
    keys = tf.constant(keys, x.dtype)
    values = tf.constant(values, x.dtype)
    # Make a sequence for the range of values in the input
    v_min = tf.reduce_min(x)
    v_max = tf.reduce_max(x)
    r = tf.range(v_min, v_max + 1)
    r_shape = tf.shape(r)
    # Mask replacements that are out of the input range
    mask = (keys >= v_min) & (keys <= v_max)
    keys = tf.boolean_mask(keys, mask)
    values = tf.boolean_mask(values, mask)
    # Replace values in the sequence with the corresponding replacements
    scatter_idx = tf.expand_dims(keys, 1) - v_min
    replace_mask = tf.scatter_nd(
        scatter_idx, tf.ones_like(values, dtype=tf.bool), r_shape)
    replace_values = tf.scatter_nd(scatter_idx, values, r_shape)
    replacer = tf.where(replace_mask, replace_values, r)
    # Gather the replacement value, or the same value if it was not modified
    return tf.gather(replacer, x - v_min)

# Test
kv_dict = {1: 11, 2: 12, 3: 13}
with tf.Graph().as_default(), tf.Session() as sess:
    a = tf.constant([1, 2, 3])
    print(sess.run(replace_by_dict(a, kv_dict)))
    # [11, 12, 13]
This allows the input tensor to contain values without replacements (they are left as they are), and it does not require every replacement key to appear in the input. It should be efficient unless the minimum and maximum values in your input are very far apart.
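For completeness, another option in TF 1.x (my own sketch, not part of either answer) is a lookup table, which avoids materializing the value range entirely and also works when keys are far apart; unknown keys map to a default value:

import tensorflow as tf

kv_dict = {1: 11, 2: 12, 3: 13}
keys, values = zip(*kv_dict.items())

table = tf.contrib.lookup.HashTable(
    tf.contrib.lookup.KeyValueTensorInitializer(
        tf.constant(keys, tf.int32), tf.constant(values, tf.int32)),
    default_value=-1)  # returned for keys that are not in the dict

a = tf.constant([1, 2, 3])
b = table.lookup(a)

with tf.Session() as sess:
    sess.run(tf.tables_initializer())
    print(sess.run(b))  # [11 12 13]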
I am trying to produce a very simple example combining TensorArray and while_loop:
# 1000 sequences, each of length 100
matrix = tf.placeholder(tf.int32, shape=(100, 1000), name="input_matrix")
matrix_rows = tf.shape(matrix)[0]
ta = tf.TensorArray(tf.float32, size=matrix_rows)
ta = ta.unstack(matrix)
init_state = (0, ta)
condition = lambda i, _: i < n
body = lambda i, ta: (i + 1, ta.write(i, ta.read(i) * 2))

# run the graph
with tf.Session() as sess:
    (n, ta_final) = sess.run(tf.while_loop(condition, body, init_state),
                             feed_dict={matrix: tf.ones(tf.float32, shape=(100, 1000))})
    print(ta_final.stack())
But I am getting the following error:
ValueError: Tensor("while/LoopCond:0", shape=(), dtype=bool) must be from the same graph as Tensor("Merge:0", shape=(), dtype=float32).
Does anyone have an idea what the problem is?
There are several things in your code to point out. First, you don't need to unstack the matrix into the TensorArray to use it inside the loop; you can safely reference the matrix tensor inside the body and index it using matrix[i] notation. Another issue is the mismatch in data type between your matrix (tf.int32) and the TensorArray (tf.float32); based on your code, you're multiplying the matrix ints by 2 and writing the result into the array, so it should be int32 as well. Finally, when you wish to read the final result of the loop, the correct operation is TensorArray.stack(), which is what you need to run in your session.run call.
Here's a working example:
import numpy as np
import tensorflow as tf

# 1000 sequences, each of length 100
matrix = tf.placeholder(tf.int32, shape=(100, 1000), name="input_matrix")
matrix_rows = tf.shape(matrix)[0]
ta = tf.TensorArray(dtype=tf.int32, size=matrix_rows)
init_state = (0, ta)
condition = lambda i, _: i < matrix_rows
body = lambda i, ta: (i + 1, ta.write(i, matrix[i] * 2))
n, ta_final = tf.while_loop(condition, body, init_state)
# get the final result
ta_final_result = ta_final.stack()

# run the graph
with tf.Session() as sess:
    # print the output of ta_final_result
    print(sess.run(ta_final_result, feed_dict={matrix: np.ones(shape=(100, 1000), dtype=np.int32)}))
Does TensorFlow support determining the shape of a Tensor at runtime?
The problem is to build a constant tensor at runtime based on the input vector length_q. The number of columns of the target tensor is the sum of length_q. The code snippet is shown below; the length of length_q is fixed to 64.
T = tf.reduce_sum(length_q, 0)[0]
N = np.shape(length_q)[0]
wm = np.zeros((N, T), dtype=np.float32)
# Something irrelevant.
count = 0
for i in range(N):
    ones = np.ones(length_q[i])
    wm[i][count:count + length_q[i]] = ones
    count += length_q[i]
return tf.constant(wm)
Update
I want to create a dynamic Tensor according to the input length_q. length_q is an input vector (64x1). The shape of the new tensor I want to create depends on the sum of length_q, because the data in length_q changes in each batch. The current code snippet is as follows:
def some_matrix(length_q):
    T = tf.reduce_sum(length_q, 0)[0]
    N = np.shape(length_q)[0]
    wm = np.zeros((N, T), dtype=np.float32)
    count = 0
    return wm

def network_inference(length_q):
    wm = tf.constant(some_matrix(length_q))
    ...
The problem probably occurs because length_q is a placeholder, so its sum is not available as a concrete value when the graph is built. Is there a way to solve this problem?
It sounds like the tf.fill() op is what you need. This op allows you to specify a shape as a tf.Tensor (i.e. a runtime value) along with a value:
def some_matrix(length_q):
    T = tf.reduce_sum(length_q, 0)[0]
    N = tf.shape(length_q)[0]
    wm = tf.fill([N, T], 0.0)
    return wm
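To check that the shape really is determined at run time, here is a small usage sketch (assuming length_q is an int32 placeholder shaped [None, 1], as described in the question):

length_q = tf.placeholder(tf.int32, shape=[None, 1])
wm = some_matrix(length_q)

with tf.Session() as sess:
    out = sess.run(wm, feed_dict={length_q: [[2], [3], [1]]})
    print(out.shape)  # (3, 6): N = 3 rows, T = sum(length_q) = 6 columns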
It is not clear what you are calculating. If you need a tensor with N rows and T columns, you can generate ones like this:
T = tf.constant(20.0, tf.float32)  # your reduced sum as a TF value; 20.0 is an example
T = tf.cast(T, tf.int32)           # the number of columns must be an integer
N = 10                             # if np.shape gives a plain integer, e.g. 10
# N = length_q.get_shape()[0]      # if length_q is a tensor, replace with your tensor's name
wm = tf.ones([N, T], dtype=tf.float32)  # N rows and T columns
sess = tf.Session()
sess.run(tf.initialize_all_variables())
sess.run(wm)
I am trying to use TensorFlow to calculate the minimum Euclidean distance between each column of a matrix and all other columns (excluding the column itself):
with graph.as_default():
    ...
    def get_diversity(matrix):
        num_rows = matrix.get_shape()[0].value
        num_cols = matrix.get_shape()[1].value
        identity = tf.ones([1, num_cols], dtype=tf.float32)
        diversity = 0
        for i in range(num_cols):
            col = tf.reshape(matrix[:, i], [num_rows, 1])
            col_extended_to_matrix = tf.matmul(col, identity)
            difference_matrix = (col_extended_to_matrix - matrix) ** 2
            sum_vector = tf.reduce_sum(difference_matrix, 0)
            mask = tf.greater(sum_vector, 0)
            non_zero_vector = tf.select(mask, sum_vector,
                                        tf.ones([num_cols], dtype=tf.float32) * 9e99)
            min_diversity = tf.reduce_min(non_zero_vector)
            diversity += min_diversity
        return diversity / num_cols
    ...
    diversity = get_diversity(matrix1)
    ...
When I call get_diversity() once every 1000 iterations (out of about 300k), it works just fine. But when I try to call it at every iteration, the interpreter returns:
W tensorflow/core/common_runtime/bfc_allocator.cc:271] Ran out of memory trying to allocate 2.99MiB. See logs for memory state.
I was thinking that was because TF creates a new set of variables each time get_diversity() is called. I tried this:
def get_diversity(matrix, scope):
    scope.reuse_variables()
    ...

with tf.variable_scope("diversity") as scope:
    diversity = get_diversity(matrix1, scope)
But it did not fix the problem.
How can I fix this allocation issue and use get_diversity() with a large number of iterations?
Assuming you call get_diversity() multiple times in your training loop, Aaron's comment is a good one: instead you can do something like the following:
diversity_input = tf.placeholder(tf.float32, [None, None], name="diversity_input")
diversity = get_diversity(diversity_input)

# ...

with tf.Session() as sess:
    for _ in range(NUM_ITERATIONS):
        # ...
        diversity_val = sess.run(diversity, feed_dict={diversity_input: ...})
This will avoid creating new operations each time round the loop, which should prevent the memory leak. This answer has more details.
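As an aside, the per-column Python loop inside get_diversity() can also be avoided entirely. Here is a rough vectorized sketch of the same quantity (mine, not part of the answer); it computes all pairwise squared column distances at once and masks only the diagonal, rather than all zero distances as the original loop does:

import tensorflow as tf

def get_diversity_vectorized(matrix):
    # matrix: [num_rows, num_cols]; squared Euclidean distances between all pairs of columns
    sq_norms = tf.reduce_sum(tf.square(matrix), axis=0)   # [num_cols]
    gram = tf.matmul(matrix, matrix, transpose_a=True)    # [num_cols, num_cols]
    sq_dists = (tf.expand_dims(sq_norms, 1)
                - 2.0 * gram
                + tf.expand_dims(sq_norms, 0))
    # push the diagonal (distance of a column to itself) out of the minimum
    num_cols = tf.shape(matrix)[1]
    sq_dists += tf.eye(num_cols) * 1e30
    return tf.reduce_mean(tf.reduce_min(sq_dists, axis=0))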