In Tensorflow, how would I go about selecting between a python list of Tensors in the middle of my graph as an input to the rest of the graph?
Basically, I have a python list of Tensors that are candidates to be used as inputs in the rest of the graph. I want to select one of them without adding extra dependencies that require all of the Tensors in the list to be computed (I think that would happen if I used tf.cond). How can I select one of them? I can't do it at the python level because I choose the tensor based on a value computed from a placeholder. For example:
x = tf.placeholder(tf.int32, shape=(num_steps, None))
y = tf.placeholder(tf.int32, shape=(None,))
lengths = tf.placeholder(tf.int32, shape=(None,))
# Pretend there is a bunch of lines of code here
output_index = max_sequence_length = tf.reduce_max(lengths)
final_output = potential_outputs[output_index] # won't work, output_index is Tensor
# Pretend the rest of the model uses final_output
More info if you want it:
I am unrolling an RNN and I want to only unroll to the maximum length of the sequence. When this is less than the number of unrolling steps, there is a lot of wasted computation. dynamic_rnn and static_rnn do not meet my needs, so I am trying to come up with my own custom method of unrolling the graph.
To index in TensorFlow, use tf.slice.
It should be noted that, based on the code you provided, I don't think you are indexing the outputs correctly with the tf.reduce_max function, since it returns the actual maximum value along a given axis, which may not be a valid index (but I'm not sure how your network works). You may be looking for tf.argmax, which returns the index of the maximum value. The issue with this, however, is that TensorFlow does not have a gradient defined for tf.argmax, so that function cannot be a learned part of your network.
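For the selection itself, here is a minimal sketch of one way to do it with tf.slice (the candidate shapes and values below are made up for illustration): stack the candidate tensors along a new leading axis and slice out the row chosen by the scalar index Tensor. Note that stacking still makes the graph compute every candidate, so this solves the indexing problem but not the wasted computation.
import tensorflow as tf

potential_outputs = [tf.constant([float(i)] * 4) for i in range(5)]   # 5 candidates, each of shape (4,)
stacked = tf.stack(potential_outputs)                                  # shape (5, 4)
output_index = tf.constant(2)                                          # in practice, computed from a placeholder
begin = tf.stack([output_index, 0])
final_output = tf.squeeze(tf.slice(stacked, begin, [1, -1]), axis=0)   # shape (4,)
# tf.gather(stacked, output_index) is an equivalent one-liner.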
Related
I have a custom loss function where I want to change values from a one-hot based encoding to values in a certain range to calculate an IOU.
Part of this code looks at where I have a one in a tensor that is zero otherwise. For this I am using tf.where, which returns the locations. I have a tensor of shape [batch_size,S1,S2,12] where I only care about the last dimension, which is why I take [...,2] of the tf.where result.
It often happens that my prediction is all zeros, because I have background events without any values in them, and my network will also predict an all-zero vector every now and then. In that case tf.where returns an empty tensor.
That's why I want to use K.switch to check whether the tensor is empty, because if it is I would like to have zeros returned.
The problem is that K.switch expects the then/else options to have the same shape, but I need my output to have shape [batch_size,S1,S2,1]. I have tried different things but I can't get this to work.
I need to get zeros of shape [batch_size,S1,S2,1], or I need where_box1 to have shape [batch_size,S1,S2,1] with floats.
The way it's implemented now, K.switch returns an empty vector of zeros when where_box1_temp is empty, which is not what I want.
When I use tf.zeros([batch_size,S1,S2,1]) instead, it complains that the branches are of different shape when where_box1_temp is empty:
where_box1_temp = tf.where(y_pred[...,C+1:C+13])[...,2]
where_box1 = K.switch(tf.equal(tf.size(where_box1_temp), 0),
                      tf.zeros_like(where_box1_temp), where_box1_temp)
So I found a workaround, maybe this is helpful for someone else:
where_box1_temp = tf.where(y_pred[...,C+1:C+13],[1,2,3,4,5,6,7,8,9,10,11,12],0)
where_box1 = tf.reshape(K.sum(where_box1_temp,axis=3),[batch_size,5,5])
This allows me to have a tensor of my desired shape where all background/zero-prediction values are 0, without having to use K.switch and without running into trouble with empty dimensions or anything like that.
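A small, self-contained sketch of the same idea (shapes and values are made up; it assumes the TF 2.x broadcasting behavior of tf.where): passing x/y arguments to tf.where keeps the full input shape, so an all-zero prediction simply yields zeros instead of an empty tensor.
import tensorflow as tf

y_pred_slice = tf.zeros([2, 5, 5, 12])           # stand-in for y_pred[..., C+1:C+13], all background here
channel_ids = tf.range(1, 13, dtype=tf.float32)
where_temp = tf.where(tf.cast(y_pred_slice, tf.bool), channel_ids, 0.0)   # shape (2, 5, 5, 12)
result = tf.reshape(tf.reduce_sum(where_temp, axis=3), [2, 5, 5])         # all zeros, shape preserved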
Following the example from PyTorch docs I am trying to solve a problem where the padding is inconsistent rather than at the end of the tensor for each batch (in other words, no pun intended, I have a left-censored and right-censored problem across my batches):
# Data structure example from docs
seq = torch.tensor([[1,2,0], [3,0,0], [4,5,6]])
# Data structure of my problem
inconsistent_seq = torch.tensor([[1,2,0], [0,3,0], [0,5,6]])
lens = ...?
packed = pack_padded_sequence(seq, lens, batch_first=True, enforce_sorted=False)
How can I solve the problem of masking these padded 0’s when running them through an LSTM using (preferably) PyTorch functionality?
I "solved" this by essentially reindexing my data and padding left-censored data with 0's (which makes sense for my problem). I also injected an extra tensor into the input dimension to track this padding. I then masked the right-censored data using the pack_padded_sequence method from the PyTorch library. Found a good source here:
https://www.kdnuggets.com/2018/06/taming-lstms-variable-sized-mini-batches-pytorch.html
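For illustration, here is a minimal sketch of the re-indexing step (it treats 0 strictly as padding, which only works if 0 never occurs as real data): left-align each row so all padding ends up on the right, compute the lengths, and pack as usual.
import torch
from torch.nn.utils.rnn import pack_padded_sequence

inconsistent_seq = torch.tensor([[1, 2, 0], [0, 3, 0], [0, 5, 6]])

lens = (inconsistent_seq != 0).sum(dim=1)      # tensor([2, 1, 2])
aligned = torch.zeros_like(inconsistent_seq)
for i, row in enumerate(inconsistent_seq):
    vals = row[row != 0]                       # keep the non-padding entries in order
    aligned[i, :len(vals)] = vals

packed = pack_padded_sequence(aligned, lens, batch_first=True, enforce_sorted=False)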
I am reading the "Deep Learning for Coders with fastai & PyTorch" book. I'm still a bit confused as to what the Embedding module does. It seems like a short and simple network, except I can't seem to wrap my head around what Embedding does differently than Linear without a bias. I know it does some faster computational version of a dot product where one of the matrices is a one-hot encoded matrix and the other is the embedding matrix. It does this, in effect, to select a piece of data? Please point out where I am wrong. Here is one of the simple networks shown in the book.
class DotProduct(Module):
    def __init__(self, n_users, n_movies, n_factors):
        self.user_factors = Embedding(n_users, n_factors)
        self.movie_factors = Embedding(n_movies, n_factors)

    def forward(self, x):
        users = self.user_factors(x[:,0])
        movies = self.movie_factors(x[:,1])
        return (users * movies).sum(dim=1)
Embedding
[...] what Embedding does differently than Linear without a bias.
Essentially everything. torch.nn.Embedding is a lookup table; it works the same as torch.Tensor but with a few twists (like the possibility of using a sparse embedding or a default value at a specified index).
For example:
import torch
embedding = torch.nn.Embedding(3, 4)
print(embedding.weight)
print(embedding(torch.tensor([1])))
Would output:
Parameter containing:
tensor([[ 0.1420, -0.1886, 0.6524, 0.3079],
[ 0.2620, 0.4661, 0.7936, -1.6946],
[ 0.0931, 0.3512, 0.3210, -0.5828]], requires_grad=True)
tensor([[ 0.2620, 0.4661, 0.7936, -1.6946]], grad_fn=<EmbeddingBackward>)
So we took the row at index 1 of the embedding (its second row). It does nothing more than that.
Where is it used?
Usually when we want to encode some meaning (as in word2vec) for each row (e.g. words that are semantically close are close in Euclidean space) and possibly train them.
Linear
torch.nn.Linear (without bias) also holds a torch.Tensor (its weight), but it performs an operation on it (and on the input), which is essentially:
output = input.matmul(weight.t())
every time you call the layer (see source code and functional definition of this layer).
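A quick sketch verifying that relationship (the layer sizes here are arbitrary):
import torch

linear = torch.nn.Linear(4, 3, bias=False)
x = torch.randn(2, 4)
assert torch.allclose(linear(x), x.matmul(linear.weight.t()))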
Code snippet
The layer in your code snippet does this:
creates two lookup tables in __init__
the layer is called with input of shape (batch_size, 2):
first column contains indices of user embeddings
second column contains indices of movie embeddings
these embeddings are multiplied element-wise and summed, returning a tensor of shape (batch_size,) (so it differs from nn.Linear, which would return (batch_size, out_features) and perform a matrix product with the weight instead of the element-wise multiplication followed by summation used here)
This is probably used to train both representations (of users and movies) for some recommender-like system.
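A hypothetical usage sketch (it assumes the fastai Module/Embedding imports the book uses and the DotProduct class defined above; the sizes and indices are made up):
import torch

model = DotProduct(n_users=100, n_movies=50, n_factors=8)
x = torch.tensor([[3, 10],
                  [7, 42]])    # column 0: user indices, column 1: movie indices
print(model(x).shape)          # torch.Size([2]) -- one dot product per (user, movie) pair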
Other stuff
I know it does some faster computational version of a dot product where one of the matrices is a one-hot encoded matrix and the other is the embedding matrix.
No, it doesn't. torch.nn.Embedding can be one-hot encoded and might also be sparse, but depending on the algorithms (and whether they support sparsity) there may or may not be a performance boost.
I was wondering if there is a way to know the list of inputs and outputs for a particular node in tflite? I know that I can get input/output details, but this does not allow me to reconstruct the computation process that happens inside an Interpreter. So what I do is:
interpreter = tf.lite.Interpreter(model_path=model_path)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
interpreter.get_tensor_details()
The last 3 commands basically give me dictionaries which don't seem to have the necessary information.
So I was wondering if there is a way to know where each node's outputs go? Surely the Interpreter knows this somehow. Can we? Thanks.
Note: this answer was written for Tensorflow 1.x and, while the concept and core idea remains the same in TensorFlow 2.x, the commands in this answer might be deprecated.
The mechanism of TF-Lite makes the whole process of inspecting the graph and getting the intermediate values of inner nodes a bit tricky. The get_tensor(...) method suggested by the other answer does not work.
How to visualize TF-Lite inference graph?
TensorFlow Lite models can be visualized using the visualize.py script in the TensorFlow Lite repository. You just need to:
Clone the TensorFlow repository
Run the visualize.py script with bazel:
bazel run //tensorflow/lite/tools:visualize \
model.tflite \
visualized_model.html
Do the nodes in my TF model have an equivalent in TF-Lite?
NO! In fact, TF-Lite can modify your graph so that it becomes more optimal. Here are some words about it from the TF-Lite documentation:
A number of TensorFlow operations can be processed by TensorFlow Lite even though they have no direct equivalent. This is the case for operations that can be simply removed from the graph (tf.identity), replaced by tensors (tf.placeholder), or fused into more complex operations (tf.nn.bias_add). Even some supported operations may sometimes be removed through one of these processes.
Moreover, the TF-Lite API currently doesn't allow you to get the node correspondence; it's hard to interpret the inner format of TF-Lite. So you can't get the intermediate outputs of arbitrary nodes, even setting aside the additional issue below...
Can I get intermediate values of some TF-Lite nodes?
NO! Here, I will explain why get_tensor(...) doesn't work in TF-Lite. Suppose that in the inner representation, the graph contains 3 tensors, together with some dense operations (nodes) in between (you can think of tensor1 as the input and tensor3 as the output of your model). During inference of this particular graph, TF-Lite only needs 2 buffers; let's show how.
First, use tensor1 to compute tensor2 by applying dense operation. This only requires 2 buffers to store the values:
            dense              dense
[tensor1] -------> [tensor2] -------> [tensor3]
 ^^^^^^^            ^^^^^^^
 bufferA            bufferB
Second, use the value of tensor2 stored in bufferB to compute tensor3... but wait! We don't need bufferA anymore, so let's use it to store the value of tensor3:
            dense              dense
[tensor1] -------> [tensor2] -------> [tensor3]
 ^^^^^^^            ^^^^^^^
 bufferB            bufferA
Now is the tricky part. The "output value" of tensor1 will still point to bufferA, which now holds the values of tensor3. So if you call get_tensor(...) for the 1st tensor, you'll get incorrect values. The documentation of this method even states:
This function cannot be used to read intermediate results.
How to get around this?
Easy but limited way. During conversion, you can specify the names of the nodes whose output tensors' values you want to get:
tflite_convert \
-- # other options of your model
--output_arrays="output_node,intermediate/node/n1,intermediate/node/n2"
Hard but flexible way. You can compile TF-Lite with Bazel (following these instructions). Then you can actually inject some logging code into Interpreter::Invoke() in the file tensorflow/lite/interpreter.cc. An ugly hack, but it works.
As #FalconUA has pointed out, we cannot directly get intermediate inputs and outputs from a TFLite model. But we can get the inputs and outputs of layers by modifying the model buffer. This repo shows how it is done. We need to modify the flatbuffer schema for this to work; the modified TFLite schema (the tflite folder in the repo) is available in the repo.
For the completeness of the answer, below is the relevant code:
def buffer_change_output_tensor_to(model_buffer, new_tensor_i):
    # from https://github.com/raymond-li/tflite_tensor_outputter
    # Set subgraph 0's output(s) to new_tensor_i
    # Reads model_buffer as a proper flatbuffer file and gets the offset programmatically
    # It might be much more efficient if Model.subgraphs[0].outputs[] was set to a list of all the tensor indices.
    fb_model_root = tflite_model.Model.GetRootAsModel(model_buffer, 0)
    # OutputsOffset is a custom added function that returns the file offset to this vector
    output_tensor_index_offset = fb_model_root.Subgraphs(0).OutputsOffset(0)
    # print("buffer_change_output_tensor_to. output_tensor_index_offset: ")
    # print(output_tensor_index_offset)
    # output_tensor_index_offset = 0x5ae07e0 # address offset specific to inception_v3.tflite
    # output_tensor_index_offset = 0x16C5A5c # address offset specific to inception_v3_quant.tflite
    # Flatbuffer scalars are stored in little-endian.
    new_tensor_i_bytes = bytes([
        new_tensor_i & 0x000000FF,
        (new_tensor_i & 0x0000FF00) >> 8,
        (new_tensor_i & 0x00FF0000) >> 16,
        (new_tensor_i & 0xFF000000) >> 24
    ])
    # Replace the 4 bytes corresponding to the first output tensor index
    return model_buffer[:output_tensor_index_offset] + new_tensor_i_bytes + model_buffer[output_tensor_index_offset + 4:]

def get_tensor(path_tflite, tensor_id):
    with open(path_tflite, 'rb') as fp:
        model_buffer = fp.read()
    model_buffer = buffer_change_output_tensor_to(model_buffer, int(tensor_id))
    interpreter = tf.lite.Interpreter(model_content=model_buffer)
    interpreter.allocate_tensors()
    tensor_details = interpreter._get_tensor_details(tensor_id)
    tensor_name = tensor_details['name']
    input_details = interpreter.get_input_details()
    # input_tensor is assumed to be defined elsewhere (the model's preprocessed input)
    interpreter.set_tensor(input_details[0]['index'], input_tensor)
    interpreter.invoke()
    tensor = interpreter.get_tensor(tensor_id)
    return tensor
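A hypothetical call, just to show how the helper would be used (the model path, tensor index, and input shape are placeholders; tflite_model is the modified schema module from the repo, and input_tensor must hold your real preprocessed input):
import numpy as np
import tensorflow as tf

input_tensor = np.zeros((1, 224, 224, 3), dtype=np.float32)   # placeholder input; replace with real data
intermediate = get_tensor('model.tflite', 17)                 # both arguments are placeholders
print(intermediate.shape)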
I'm working with MNIST and I have a tensor of gradients with size [?,28,28,1] and I want to zero out a few of the [28,28,1] sub-tensors inside it, how should I accomplish this?
I know the indices (as a list) where I need to zero out the sub-tensors. I tried doing something like this (given below), but tf.scatter_update can only modify variables, not tensors. I also tried stacking up the required sub-tensors of zeros and ones but couldn't build up the required result.
dy_dx, = tf.gradients(loss, x_adv)
zeroes = tf.zeros(dy_dx[0].get_shape(), tf.float32)
dy_dx = tf.scatter_update(dy_dx, indices, zeroes)
Thanks!
I'd suggest creating a TensorFlow constant with zeros at the locations you want to zero out and ones everywhere else. Then you could create an op that uses tf.multiply to do elementwise multiplication of the constant and dy_dx. Depending on the structure of your graph, you might need to feed the result to dy_dx in your next call to session.run; you can replace any Tensor with feed data, including variables and constants.
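A minimal sketch of that idea (the batch size and index list below are made up; in practice the mask has to line up with the examples in the batch you feed):
import numpy as np
import tensorflow as tf

batch_size = 8                      # assumed fixed here for illustration
indices_to_zero = [0, 3]            # hypothetical positions of the sub-tensors to zero out

mask_np = np.ones((batch_size, 1, 1, 1), dtype=np.float32)
mask_np[indices_to_zero] = 0.0
mask = tf.constant(mask_np)

masked_dy_dx = tf.multiply(dy_dx, mask)   # broadcasts over the [?, 28, 28, 1] gradient tensor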
Incidentally, if you just want to apply dropout to the input layer you can use tf.layers.dropout
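For reference, a one-line sketch of that alternative (the rate is arbitrary; tf.layers.dropout is the TF 1.x API):
x_dropped = tf.layers.dropout(x_adv, rate=0.5, training=True)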