I am using tf.nn.dynamic_rnn to create an LSTM. I have trained this model on some data, and now I want to inspect the values of the trained LSTM's hidden states at each time step when I feed it some input.
After some digging around on SO and on TensorFlow's GitHub page, I saw suggestions that I should write my own LSTM cell that returns whatever I want printed as part of the LSTM's output. However, this does not seem straightforward to me, since the hidden states and the output of the LSTM do not have the same shapes.
My output tensor from the LSTM has shape [16, 1] and the hidden state is a tensor of shape [16, 16]. Concatenating them gives a tensor of shape [16, 17], and when I tried to return that, I got an error saying that some TensorFlow op required a tensor of shape [16, 1].
Does anyone know an easier workaround for this situation? I was wondering if it is possible to use tf.Print to just print the required tensors.
Okay, so the issue was that I was modifying the output but wasn't updating the output_size of the LSTM cell itself, hence the error. It works perfectly fine now. However, I still find this method extremely annoying, so I'm not accepting my own answer in the hope that somebody has a cleaner solution.
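For reference, here is a minimal sketch of the kind of wrapper cell this amounts to. This is not the exact code from this question: ExposeStateWrapper is a made-up name, and it assumes a tuple-state LSTMCell in TensorFlow 1.x. The key point is that output_size must be updated to match the concatenated output:

import tensorflow as tf

class ExposeStateWrapper(tf.nn.rnn_cell.RNNCell):
    """Wraps an LSTM cell so each step's output also carries the hidden state h."""
    def __init__(self, cell):
        super(ExposeStateWrapper, self).__init__()
        self._cell = cell

    @property
    def state_size(self):
        return self._cell.state_size

    @property
    def output_size(self):
        # original output plus the appended hidden state; forgetting to update
        # this is what produces the shape mismatch described above
        return self._cell.output_size + self._cell.state_size.h

    def call(self, inputs, state):
        output, new_state = self._cell(inputs, state)
        return tf.concat([output, new_state.h], axis=-1), new_state

With this, dynamic_rnn returns the concatenated tensor at every time step, so the hidden states can be fetched with an ordinary session.run instead of tf.Print.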
I've been trying to understand LSTM inputs for a while now, and I think I understand, but I keep getting confused about how to implement them.
This is what I think, please correct me if I am wrong.
When specifying an LSTM, you specify the number of cells and the input shape (I've been having issues with the input shape). The number of cells specifies how many cells should look at the given data and does not affect the required input shape. The input shape (when stateful) is given as batch size, timesteps per batch, and features per timestep. A stateful LSTM retains its internal states until they are reset. Is this right?
If so, I'm having a lot of trouble trying to specify the input shape for my network, because I'm trying to upgrade an existing network and I can't figure out how and where to specify the input shape without getting an error.
The way I'm trying to upgrade it: initially I have a CNN feeding into a dense layer. I'm trying to change it so that it adds an LSTM that takes the CNN's flattened 1D output as one batch and one timestep, with the number of features dependent on the size of the CNN's output. The LSTM's output is then concatenated with the CNN's output (the LSTM's input) and fed into the dense layer, so it now behaves like an LSTM with a skip connection. What I can't seem to understand is when and how to specify the LSTM layer's input shape, since it has no input_shape parameter in the functional API. Or maybe I'm just super confused; everyone uses different APIs in different examples, and it gets confusing what is and isn't specified, and how.
Thank you, even if you just help with one of the two parts.
TLDR:
Do I understand LSTM parameters correctly?
How and when do I specify LSTM input shapes, if at all?
The LSTM units argument sets the dimensionality of the LSTM's internal matrices and of its output.
With the functional API you can specify an input shape only for the very first layer. If your LSTM layer follows a CNN, then its input shape is determined automatically from the CNN's output.
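To make that concrete, here is a rough sketch of the architecture described in the question (the image size, filter counts and layer widths are made up for illustration). Note that the LSTM gets no input_shape argument at all; with the functional API it infers its shape from the tensor it is called on:

from keras.layers import Input, Conv2D, Flatten, Reshape, LSTM, Concatenate, Dense
from keras.models import Model

inp = Input(shape=(64, 64, 3))              # only the Input layer needs a shape
x = Conv2D(16, 3, activation='relu')(inp)
flat = Flatten()(x)                         # (batch, features)
seq = Reshape((1, -1))(flat)                # (batch, timesteps=1, features)
lstm_out = LSTM(32)(seq)                    # input shape inferred from `seq`
merged = Concatenate()([lstm_out, flat])    # skip connection around the LSTM
out = Dense(10, activation='softmax')(merged)
model = Model(inp, out)

If you want the stateful behaviour mentioned in the question, you would additionally fix the batch size, e.g. Input(batch_shape=(1, 64, 64, 3)) and LSTM(32, stateful=True), and reset the states yourself between sequences.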
I'm new to Keras and I was wondering if I could do some work on text classification using neural networks.
So I went ahead and got a spam/ham dataset, vectorized the data using tf-idf, converted the labels to a NumPy array using to_categorical(), and split my data into train and test sets, each of which is a NumPy array with around 7k columns.
This is the code I used:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(8, input_dim=7082, activation='relu'))
model.add(Dense(8, input_dim=7082))
model.add(Dense(2, activation='softmax'))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=['accuracy'])
I don't know if I'm doing something totally wrong. Could someone point me in the right direction as to what I should change?
The error thrown:
Error when checking input: expected dense_35_input to have 2 dimensions, but got array with shape ()
Dense layers don't seem to have any input_dim parameter according to the documentation.
input_shape is a tuple and must be used in the first layer of your model. It refers to the shape of the input data.
units refers to the dimensionality of the output space, that is, the shape of each output element produced by the dense layer.
In your case, if your input data is of dimensionality 7082, this should work:
model.add(Dense(8,input_shape=(7082,),activation='relu'))
model.add(Dense(2,activation='softmax'))
model.compile(loss="categorical_crossentropy",optimizer="adam",metrics=['accuracy'])
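If it helps, here is the suggested fix as a self-contained sketch, with random placeholder arrays standing in for the tf-idf features (7082 columns) and the one-hot labels, just to confirm the layer definitions themselves are fine:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

x_train = np.random.rand(100, 7082).astype('float32')   # stand-in for the tf-idf matrix
y_train = np.eye(2)[np.random.randint(0, 2, 100)]        # one-hot labels, as from to_categorical

model = Sequential()
model.add(Dense(8, input_shape=(7082,), activation='relu'))
model.add(Dense(2, activation='softmax'))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=['accuracy'])
model.fit(x_train, y_train, epochs=1, batch_size=32)

If this runs but your real data still triggers the "got array with shape ()" error, the problem is in the arrays you pass to fit rather than in the layer definitions.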
I want to extract four intermediate layers of my model. Here is how I set up my function:
K.function([yolo_model.layers[0].input, yolo_model.layers[4].input, K.learning_phase()],
[yolo_model.layers[1].layers[17].output,
yolo_model.layers[1].layers[27].output,
yolo_model.layers[1].layers[43].output,
yolo_model.layers[1].layers[69].output])
I always get this error: tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input_3' with dtype float and shape [?,416,416,3]
It seems my input has a dimension or type error, but I used this same input for my model's train_on_batch or predict and it worked. I get the same error even when I pass np.zeros((1,416,416,3)) into it. Another weird thing is that I don't have an input_3, since my model only takes two inputs. I have no idea where the input_3 tensor comes from.
Thanks in advance if anyone can give me some hints.
I found out where the problem is. My model is composed of one inner model and several layers. When I built the function, my input came from the outer model but my output came from the inner model, which caused a disconnection between input and output. I just changed the original input to the input layer of the inner model and it works.
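A hedged sketch of that fix, using the layer indices from the snippet above (they are specific to this particular model): take both the inputs and the outputs from the inner model, so everything in the function belongs to one connected subgraph.

from keras import backend as K

inner = yolo_model.layers[1]                 # the inner model from the poster's setup
get_features = K.function(
    inner.inputs + [K.learning_phase()],
    [inner.layers[17].output,
     inner.layers[27].output,
     inner.layers[43].output,
     inner.layers[69].output])

# feats = get_features([image_batch, 0])     # 0 = test phase; image_batch is your input array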
I'm trying to build a fully convolutional neural network. My problem is that at some stage the shapes of the tensors no longer match, causing an exception, and I would like to print the shape of the tensors after each step to be able to pinpoint the problem. The trouble is that tf.Print does not seem to print anything if the graph is broken and an exception is thrown at some point (even if the exception occurs after the print statement in the pipeline). I'm using the code below for printing, and it works fine when I have a working graph. So can tf.Print really only be used with working graphs? If so, how could I print the shape of the tensors, or is the only possibility to use a debugger such as tfdbg?
upsample = custom_layers.crop_center(input_layer, upsample)
upsample_print = tf.Print(upsample, [tf.shape(upsample)], "shape of tensor is ")
logits = tf.reshape(upsample_print, [-1, 2])
...
The error given is
ValueError: Dimension size must be evenly divisible by 2898844 but is 2005644 for 'gradients/Reshape_grad/Reshape' (op: 'Reshape') with input shapes: [1002822,2], [4] and with input tensors computed as partial shapes: input[1] = [?,1391,1042,2].
Try print(upsample.get_shape()).
tf.Print only prints during runtime. It simply adds a node to the graph that upon execution prints something to the console. So, if your graph cannot be constructed, i.e. no computations can be executed, you will never see an output from tf.Print.
At construction time, you can only see the static shapes of your tensors (and, for example, print them with Python's native print statement). I am not aware of any way to get the dynamic shape at construction time (the dynamic shape depends on the actual input you feed, so there is no way of knowing it before you actually feed something, which only happens at runtime). Knowing the static shapes was often enough for my purposes. If that is not the case for you, try making the dynamic dimensions static in a toy example and then Python-print all the shapes to track down the problem.
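A small illustration of the two options (the tensor here is just a placeholder with the shape from the error message above):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 1391, 1042, 2])
upsample = tf.identity(x)

# 1) Construction time: the static shape is available immediately, even if the
#    graph later fails to build because of a mismatch further down the pipeline.
print(upsample.get_shape())        # (?, 1391, 1042, 2)

# 2) Runtime: tf.Print only fires when the node is actually executed in a session,
#    so it can never report anything for a graph that does not build.
upsample_print = tf.Print(upsample, [tf.shape(upsample)], "shape of tensor is ")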
I'm using tensorflow 1.8.0rc1. I'm trying to save a very simple NN model to tflite format, with weight quantization, following this documentation: https://www.tensorflow.org/performance/quantization.
However, when converting with toco, I receive this error:
Array Relu, which is an input to the Div operator producing the output array dropout/div, is lacking min/max data, which is necessary for quantization. Either target a non-quantized output format, or change the input graph to contain min/max information, or pass --default_ranges_min= and --default_ranges_max= if you do not care about the accuracy of results.
And this is the graph:
At some point it was complaining not about RELU but about Assign operations (I fixed that, no idea how), and if I remove the RELU layers it complains about the Add layers. Any idea what's going on?
EDIT:
I just realized that between dropout_1 and activation2 (see picture) there's an act_quant node that must be the fake quantization of activation2 (a RELU). This is not happening in the first layer, between dropout and activation1. I guess that is the problem? According to the TensorFlow quantization tutorial (linked above), the scripts described there should rewrite the graph with all the necessary information for toco to quantize the weights.
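For what it's worth, the rewrite that the quantization guide relies on is the tf.contrib.quantize pass, which is what inserts those act_quant fake-quantization nodes so every activation carries min/max ranges. A rough sketch of how it is invoked (TF 1.x contrib API; exact arguments may differ between versions):

import tensorflow as tf

# Build the model graph first, then rewrite it in place so fake-quant nodes
# record min/max ranges during training.
g = tf.get_default_graph()
tf.contrib.quantize.create_training_graph(input_graph=g, quant_delay=0)

# ... train as usual, then rebuild the graph for inference and call
# tf.contrib.quantize.create_eval_graph(input_graph=g)
# before freezing it and handing the GraphDef to toco.

If an activation still ends up without ranges (like the first RELU here), the fallback the error message itself suggests, passing --default_ranges_min / --default_ranges_max to toco, will at least let the conversion finish, at the cost of accuracy.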