Does TensorFlow support determining the shape of a Tensor at runtime?
The problem is to build a constant tensor at runtime based on the input vector length_q. The number of columns of the target tensor is the sum of length_q. The code snippet is shown below; the length of length_q is fixed at 64.
T = tf.reduce_sum(length_q, 0)[0]
N = np.shape(length_q)[0]
wm = np.zeros((N, T), dtype=np.float32)
# Something irrelevant.
count = 0
for i in xrange(N):
    ones = np.ones(length_q[i])
    wm[i][count:count+length_q[i]] = ones
    count += length_q[i]
return tf.constant(wm)
Update
I want to create a dynamic Tensor according to the input length_q. length_q is an input vector of shape (64, 1). The shape of the new tensor depends on the sum of length_q, because the data in length_q changes with each batch. The current code snippet is as follows:
def some_matrix(length_q):
    T = tf.reduce_sum(length_q, 0)[0]
    N = np.shape(length_q)[0]
    wm = np.zeros((N, T), dtype=np.float32)
    count = 0
    return wm

def network_inference(length_q):
    wm = tf.constant(some_matrix(length_q))
    ...
The problem probably occurs because length_q is a placeholder, so its sum has no concrete value while the graph is being built. Is there a way to solve this?
It sounds like the tf.fill() op is what you need. This op allows you to specify a shape as a tf.Tensor (i.e. a runtime value) along with a value:
def some_matrix(length_q):
    T = tf.reduce_sum(length_q, 0)[0]
    N = tf.shape(length_q)[0]
    wm = tf.fill([N, T], 0.0)  # N rows, T columns, matching np.zeros((N, T)) in the question
    return wm
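tf.fill only produces the zero matrix. The loop in the question also writes a run of ones into each row. A minimal sketch of one way to build that pattern with run-time shapes and no Python loop follows. It assumes TF 2.x eager execution and that each row's ones start where the previous row's end, as in the question's loop. block_mask is a hypothetical helper name, not part of TensorFlow.

```python
import tensorflow as tf

def block_mask(length_q):
    """Row i holds ones in columns [start_i, start_i + length_q[i]), zeros elsewhere."""
    lengths = tf.reshape(tf.cast(length_q, tf.int32), [-1])  # accepts shape (N,) or (N, 1)
    ends = tf.cumsum(lengths)        # exclusive end column of each row's run of ones
    starts = ends - lengths          # first column of each row's run of ones
    total = tf.reduce_sum(lengths)   # T, only known at run time
    # sequence_mask(k, maxlen) gives ones in columns [0, k); subtracting keeps [start, end).
    upto_end = tf.sequence_mask(ends, maxlen=total, dtype=tf.float32)
    upto_start = tf.sequence_mask(starts, maxlen=total, dtype=tf.float32)
    return upto_end - upto_start

wm = block_mask(tf.constant([[2], [1], [3]]))
print(wm.numpy())
```

Because every shape here comes from tensors (tf.reduce_sum, tf.cumsum), the same function should also work when length_q is only known at run time.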
It is not clear what you are calculating. If you need a tensor of ones with N rows and T columns, you can generate it like this:
T = tf.constant(20.0, tf.float32)  # stands in for the reduced sum; 20.0 is an example float value
T = tf.cast(T, tf.int32)  # the number of columns must be an integer
N = 10  # a plain integer, assuming np.shape gives 10
# N = length_q.get_shape()[0]  # if it is a tensor; replace 'length_q' with your tensor name
wm = tf.ones([N, T], dtype=tf.float32)  # N rows and T columns
sess = tf.Session()
sess.run(tf.initialize_all_variables())
sess.run(wm)
Related
I'm trying to create an output tensor with dimensionality 32 x 576 x 2 from an operation between matrices M and X, with the following shapes:
M.shape: (576, 2, 2048)
X.shape: (32, 2048)
The operation I'm defining is an element-wise cosine similarity, given by the following equation:
$$\cos(x, M_{j,k}) = \frac{x \cdot M_{j,k}}{\lVert x \rVert_2 \, \lVert M_{j,k} \rVert_2}$$
which represents the cosine similarity between the feature vector x and the vector M_{j,k}.
This is how I've implemented it in code (incorrectly), where BATCH_SIZE=32, C=576, V=2:
@tf.function
def call(self, X):
    M = self.kernel
    norm_M = tf.norm(M, ord=2, axis=2)
    norm_X = tf.norm(X, ord=2, axis=1)
    l_r = (some scalar value, separate to this question)
    # Compute cosine similarity between X and M
    # as a matrix with dimensionality:
    # BATCH_SIZE x C x V
    feature_batch_size = tf.shape(X)[0]
    c = tf.shape(M)[0]
    v = tf.shape(M)[1]
    output_matrix = tf.zeros([feature_batch_size, c, v])
    output_matrix = tf.Variable(output_matrix, trainable=False)
    for row in tf.range(feature_batch_size):
        for column in tf.range(c):
            for channel in tf.range(v):
                a = tf.tensordot(M[column][channel], X[row], 1)
                b = norm_M[column][channel] * norm_X[row]
                output_matrix[row][column][channel] = a / b
    return [output_matrix, l_r]
This fails on the line output_matrix[row][column][channel] = a / b because it's unhappy with an assignment to an individual row:column:channel of a tf.Variable.
Is there a better way to do this operation over these two matrices to create the desired output matrix, without these three nested for loops and while staying compatible with tf.function graph mode?
If not, what can I do to assign values to individual elements of a tf.Variable, as I'm unsuccessfully attempting to do here?
Extra information:
norm_M.shape: (576, 2)
norm_X.shape: (32,)
You can replace these loops completely with vectorized operations:
num = tf.einsum('ij,klj->ikl',X,M)
denom = tf.einsum('i,jk->ijk',norm_X, norm_M)
output_matrix = num/denom
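To check the subscripts, here is a small sketch that compares the two einsum calls with the explicit triple loop from the question. It uses NumPy, whose np.einsum accepts the same subscript notation as tf.einsum, and small stand-in shapes (4, 8) and (5, 2, 8) in place of (32, 2048) and (576, 2, 2048). The random data is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))      # stand-in for X with shape (32, 2048)
M = rng.normal(size=(5, 2, 8))   # stand-in for M with shape (576, 2, 2048)

norm_X = np.linalg.norm(X, axis=1)   # shape (4,)
norm_M = np.linalg.norm(M, axis=2)   # shape (5, 2)

# num[i, k, l] = sum_j X[i, j] * M[k, l, j]   (dot product over the feature axis)
num = np.einsum('ij,klj->ikl', X, M)
# denom[i, j, k] = norm_X[i] * norm_M[j, k]   (outer product of the norms)
denom = np.einsum('i,jk->ijk', norm_X, norm_M)
vectorized = num / denom

# Reference: the explicit loops from the question.
looped = np.zeros((4, 5, 2))
for row in range(4):
    for column in range(5):
        for channel in range(2):
            a = M[column, channel] @ X[row]
            b = norm_M[column, channel] * norm_X[row]
            looped[row, column, channel] = a / b

print(vectorized.shape, np.allclose(vectorized, looped))
```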
I am trying to write a custom loss function for a person-reidentification task which is trained in a multi-task learning setting along with object detection. The filtered label values are of the shape (batch_size, num_boxes). I would like to create a mask such that only the values which repeat in dim 1 are considered for further calculations. How do I do this in TF/Keras-backend?
Short Example:
Input labels = [[0,0,0,0,12,12,3,3,4], [0,0,10,10,10,12,3,3,4]]
Required output: [[0,0,0,0,1,1,1,1,0],[0,0,1,1,1,0,1,1,0]]
(Basically I want to filter out only duplicates and discard unique identities for the loss function).
I guess a combination of tf.unique and tf.scatter could be used but I do not know how.
This code works:
x = tf.constant([[0,0,0,0,12,12,3,3,4], [0,0,10,10,10,12,3,3,4]])

def mark_duplicates_1D(x):
    # count[k] is how often the k-th unique value occurs; idx maps each element to its unique value
    y, idx, count = tf.unique_with_counts(x)
    comp = tf.math.greater(count, 1)
    comp = tf.cast(comp, tf.int32)
    res = tf.gather(comp, idx)  # 1 where the element's value occurs more than once
    mult = tf.math.not_equal(x, 0)  # label 0 is never marked
    mult = tf.cast(mult, tf.int32)
    res *= mult
    return res

res = tf.map_fn(fn=mark_duplicates_1D, elems=x)
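As a possible alternative to tf.map_fn, the same mask can be computed for the whole batch at once by comparing every label with every other label in its row. The sketch below shows the idea in NumPy; mark_duplicates is a hypothetical name. The same broadcasting should carry over to TensorFlow with tf.expand_dims, tf.equal and tf.reduce_sum, which is not shown here. Note that the intermediate array has shape (batch_size, num_boxes, num_boxes), so it trades memory for the loop.

```python
import numpy as np

def mark_duplicates(labels):
    labels = np.asarray(labels)
    # same[b, i, j] is True when labels[b, i] == labels[b, j]
    same = labels[:, :, None] == labels[:, None, :]
    counts = same.sum(axis=-1)  # occurrences of each label within its own row
    # keep labels that repeat, and drop label 0 as in the example
    return ((counts > 1) & (labels != 0)).astype(np.int32)

labels = [[0, 0, 0, 0, 12, 12, 3, 3, 4],
          [0, 0, 10, 10, 10, 12, 3, 3, 4]]
mask = mark_duplicates(labels)
print(mask)
```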
I need an N-D convolutional layer that also supports complex numbers, so I decided to code it myself.
I tested this code with NumPy alone and it worked, with several channels, in 2D and 1D, and with complex values. However, I have problems when I run it in TF.
This is my code so far:
def call(self, inputs):
    with tf.name_scope("ComplexConvolution_" + str(self.layer_number)) as scope:
        inputs = self._verify_inputs(inputs)  # Check inputs are of expected shape and format
        inputs = self.apply_padding(inputs)  # Add zeros if needed
        output_np = np.zeros(  # I use np because tf does not support the assignment
            (inputs.shape[0],) +  # Per each image
            self.output_size,  # Image out size
            dtype=self.input_dtype  # To support complex numbers
        )
        img_index = 0
        for image in inputs:
            for filter_index in range(self.filters):
                for i in range(int(np.prod(self.output_size[:-1]))):  # for each element in the output
                    index = np.unravel_index(i, self.output_size[:-1])
                    start_index = tuple([a * b for a, b in zip(index, self.stride_shape)])
                    end_index = tuple([a+b for a, b in zip(start_index, self.kernel_shape)])
                    # set_trace()
                    sector_slice = tuple(
                        [slice(start_index[ind], end_index[ind]) for ind in range(len(start_index))]
                    )
                    sector = image[sector_slice]
                    new_value = tf.reduce_sum(sector * self.kernels[filter_index]) + self.bias[filter_index]
                    # I use Tied Bias https://datascience.stackexchange.com/a/37748/75968
                    output_np[img_index][index][filter_index] = new_value  # The complicated line
            img_index += 1
        output = apply_activation(self.activation, output_np)
    return output
input_size is a tuple of shape (dim1, dim2, ..., dim3, channels). A 2D RGB conv, for example, will have input_size (32, 32, 3), and inputs will have shape (None, 32, 32, 3).
The output size is calculated from an equation I found in this paper: A guide to convolution arithmetic for deep learning
out_list = []
for i in range(len(self.input_size) - 1):  # -1 because the number of input channels is irrelevant
    out_list.append(int(np.floor((self.input_size[i] + 2 * self.padding_shape[i] - self.kernel_shape[i]) / self.stride_shape[i]) + 1))
out_list.append(self.filters)
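The loop above applies, per spatial dimension, output = floor((input + 2 * padding - kernel) / stride) + 1, and then appends the number of filters. A standalone sketch of the same arithmetic, with two worked cases, may make it easier to check by hand. conv_output_size is a hypothetical free function that takes the layer attributes as arguments; the example sizes are made up.

```python
import math

def conv_output_size(input_size, kernel_shape, stride_shape, padding_shape, filters):
    """input_size is (dim1, ..., dimN, channels); returns (out1, ..., outN, filters)."""
    spatial = []
    for i in range(len(input_size) - 1):  # the channel count does not affect the output size
        span = input_size[i] + 2 * padding_shape[i] - kernel_shape[i]
        spatial.append(math.floor(span / stride_shape[i]) + 1)
    return tuple(spatial) + (filters,)

# 32x32 RGB image, 3x3 kernel, stride 1, no padding: (32 - 3) / 1 + 1 = 30
print(conv_output_size((32, 32, 3), (3, 3), (1, 1), (0, 0), 8))
# same image, stride 2, padding 1: floor((32 + 2 - 3) / 2) + 1 = 15 + 1 = 16
print(conv_output_size((32, 32, 3), (3, 3), (2, 2), (1, 1), 8))
```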
Basically, I use np.zeros because if I use tf.zeros I cannot assign the new_value and I get:
TypeError: 'Tensor' object does not support item assignment
However, in this current state I am getting:
NotImplementedError: Cannot convert a symbolic Tensor (placeholder_1:0) to a numpy array.
On that same assignment. I don't see an easy fix, I think I should change the strategy of the code completely.
In the end, I did it in a very inefficient way based on this comment, also commented here, but at least it works:
new_value = tf.reduce_sum(sector * self.kernels[filter_index]) + self.bias[filter_index]
indices = (img_index,) + index + (filter_index,)
mask = tf.Variable(tf.fill(output_np.shape, 1))
mask = mask[indices].assign(0)
mask = tf.cast(mask, dtype=self.input_dtype)
output_np = output_np * mask + (1 - mask) * new_value  # keep old values where mask is 1, write new_value where it is 0
I say inefficient because I create a whole new array for each assignment. My code is taking ages to compute for the moment so I will keep looking for improvements and post here if I get something better.
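One possible improvement, sketched here under assumptions rather than tested on the layer above, is to collect the target indices and values first and write them in a single call to tf.tensor_scatter_nd_update, which returns a new tensor with the given elements replaced. The shapes and values below are made up for illustration, the sketch assumes TF 2.x eager execution, and it uses float32. Whether the op handles the complex dtypes this layer needs has not been checked here.

```python
import tensorflow as tf

# (images, dim1, dim2, filters); example sizes only
output = tf.zeros((2, 3, 3, 4), dtype=tf.float32)

# Each row is one full index: (img_index, *index, filter_index).
indices = tf.constant([[0, 1, 2, 3],
                       [1, 0, 0, 0]])
# One value per index row, in the same order.
updates = tf.constant([5.0, 7.0])

# All writes happen in one op; a new tensor is returned and nothing is mutated in place.
output = tf.tensor_scatter_nd_update(output, indices, updates)
print(output[0, 1, 2, 3].numpy(), output[1, 0, 0, 0].numpy())
```

This avoids building a full-size mask per element, since the indices and values can be accumulated in Python lists inside the loops and written once at the end.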
I have tried a custom Conv2d function which has to work similar to nn.Conv2d but the multiplication and addition used inside nn.Conv2d are replaced with mymult(num1,num2) and myadd(num1,num2).
Based on insights from two very helpful forum threads [1, 2], what I can do is unfold the input and then do a matrix multiplication. The @ part in the code below can be done using loops with mymult() and myadd(), as I believe this @ is doing a matmul.
import torch
import torch.nn as nn
import torch.nn.functional as F

def convcheck():
    torch.manual_seed(123)
    batch_size = 2
    channels = 2
    h, w = 2, 2
    image = torch.randn(batch_size, channels, h, w)  # input image
    out_channels = 3
    kh, kw = 1, 1  # kernel size
    dh, dw = 1, 1  # stride
    size = int((h-kh+2*0)/dh+1)  # include padding in place of zero
    conv = nn.Conv2d(in_channels=channels, out_channels=out_channels, kernel_size=kw, padding=0, stride=dh, bias=False)
    out = conv(image)
    #print('out', out)
    #print('out.size()', out.size())
    #print('')
    filt = conv.weight.data
    imageunfold = F.unfold(image, kernel_size=kh, padding=0, stride=dh)
    print("Unfolded image","\n",imageunfold,"\n",imageunfold.shape)
    kernels_flat = filt.view(out_channels,-1)
    print("Kernel Flat=","\n",kernels_flat,"\n",kernels_flat.shape)
    res = kernels_flat @ imageunfold  # I have to replace this operation with mymult() and myadd()
    print(res,"\n",res.shape)
    #print(res.size(2),"\n",res.shape)
    res = res.view(-1, out_channels, size, size)
    #print("Same answer as builtin function",res)
res = kernels_flat @ imageunfold can be replaced with the loops below, although there may be a more efficient implementation, which is what I am looking for help with.
for m_batch in range(len(imageunfold)):
    # iterate through rows of X
    # iterate through columns of Y
    for j in range(imageunfold.size(2)):
        # iterate through rows of Y
        for k in range(imageunfold.size(1)):
            #print(result[m_batch][i][j]," +=", kernels_flat[i][k], "*", imageunfold[m_batch][k][j])
            result[m_batch][i][j] += kernels_flat[i][k] * imageunfold[m_batch][k][j]
Can someone please help me vectorize these three loops for faster execution?
The problem was with the dimensions: with kernels_flat[dim0_1, dim1_1] and imageunfold[batch, dim0_2, dim1_2], the result should have shape [batch, dim0_1, dim1_2].
res = kernels_flat @ imageunfold can be replaced with the loops below, although there may be a more efficient implementation.
for m_batch in range(len(imageunfold)):
    # iterate through rows of X
    # iterate through columns of Y
    for j in range(imageunfold.size(2)):
        # iterate through rows of Y
        for k in range(imageunfold.size(1)):
            #print(result[m_batch][i][j]," +=", kernels_flat[i][k], "*", imageunfold[m_batch][k][j])
            result[m_batch][i][j] += kernels_flat[i][k] * imageunfold[m_batch][k][j]
Your code for the matrix multiplication is missing a loop for iterating over the filters.
In the code below I fixed your implementation.
I am currently also looking for optimizations on the code. In my use case, the individual results of the multiplications (without performing addition) need to be accessible after computation. I will post here in case I find a faster solution than this.
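# accumulator with shape [batch, out_channels, positions], initialised to zero
res_c = torch.zeros(imageunfold.shape[0], kernels_flat.shape[0], imageunfold.shape[2])
for batch_image in range(imageunfold.shape[0]):
    for i in range(kernels_flat.shape[0]):
        for j in range(imageunfold.shape[2]):
            for k in range(kernels_flat.shape[1]):
                res_c[batch_image][i][j] += kernels_flat[i][k] * imageunfold[batch_image][k][j]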
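A sketch of how the four loops might be vectorized while keeping every individual product accessible: broadcast the two operands into a four-dimensional array of products, then reduce over the shared axis. The example uses NumPy with made-up sizes. PyTorch tensors follow the same broadcasting rules with None indexing and .sum(dim=2), but that variant is not shown or tested here. With custom mymult() and myadd(), this only helps if those functions accept whole arrays, which is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
kernels_flat = rng.normal(size=(3, 2))    # (out_channels, k)
imageunfold = rng.normal(size=(2, 2, 4))  # (batch, k, positions)

# products[b, i, k, j] = kernels_flat[i, k] * imageunfold[b, k, j]
products = kernels_flat[None, :, :, None] * imageunfold[:, None, :, :]
# Summing over the shared axis k gives the matmul result, shape (batch, out_channels, positions).
res_vec = products.sum(axis=2)

# Reference: the four nested loops.
res_loop = np.zeros((2, 3, 4))
for b in range(2):
    for i in range(3):
        for j in range(4):
            for k in range(2):
                res_loop[b, i, j] += kernels_flat[i, k] * imageunfold[b, k, j]

print(products.shape, np.allclose(res_vec, res_loop))
```

The products array holds out_channels times more elements than imageunfold, so for large layers it may need to be computed in chunks.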
I am trying to produce a very simple example that combines TensorArray and while_loop:
# 1000 sequence in the length of 100
matrix = tf.placeholder(tf.int32, shape=(100, 1000), name="input_matrix")
matrix_rows = tf.shape(matrix)[0]
ta = tf.TensorArray(tf.float32, size=matrix_rows)
ta = ta.unstack(matrix)
init_state = (0, ta)
condition = lambda i, _: i < n
body = lambda i, ta: (i + 1, ta.write(i,ta.read(i)*2))
# run the graph
with tf.Session() as sess:
    (n, ta_final) = sess.run(tf.while_loop(condition, body, init_state),feed_dict={matrix: tf.ones(tf.float32, shape=(100,1000))})
    print (ta_final.stack())
But I am getting the following error:
ValueError: Tensor("while/LoopCond:0", shape=(), dtype=bool) must be from the same graph as Tensor("Merge:0", shape=(), dtype=float32).
Does anyone have an idea what the problem is?
There are several things to point out in your code. First, you don't need to unstack the matrix into the TensorArray to use it inside the loop; you can safely reference the matrix Tensor inside the body and index it with matrix[i]. Second, the data types differ between your matrix (tf.int32) and the TensorArray (tf.float32). Your code multiplies the matrix ints by 2 and writes the result into the array, so the array should be int32 as well. Finally, to read the final result of the loop, the correct operation is TensorArray.stack(), and its output is what you need to run in your session.run call.
Here's a working example:
import numpy as np
import tensorflow as tf
# 1000 sequence in the length of 100
matrix = tf.placeholder(tf.int32, shape=(100, 1000), name="input_matrix")
matrix_rows = tf.shape(matrix)[0]
ta = tf.TensorArray(dtype=tf.int32, size=matrix_rows)
init_state = (0, ta)
condition = lambda i, _: i < matrix_rows
body = lambda i, ta: (i + 1, ta.write(i, matrix[i] * 2))
n, ta_final = tf.while_loop(condition, body, init_state)
# get the final result
ta_final_result = ta_final.stack()
# run the graph
with tf.Session() as sess:
    # print the output of ta_final_result
    print(sess.run(ta_final_result, feed_dict={matrix: np.ones(shape=(100,1000), dtype=np.int32)}))