I am working on neural networks and have often run into problems with shapes. TensorFlow provides the keyword None so that we don't have to fix the size of a tensor dimension in advance.
Is there any disadvantage to using None in place of a known numeric value for the shape?
Method 1:
input_placeholder = tf.placeholder(tf.float32, [None, None])
Method 2:
input_placeholder = tf.placeholder(tf.float32, [64, 100])
Will it make any difference when running the code?
TensorFlow's tf.placeholder() tensors do not require a fixed shape. This allows you to pass tensors of different shapes in later tf.Session.run() calls.
So your code will work just fine.
There is no memory disadvantage: when we create a placeholder, TensorFlow doesn't allocate any memory for it. Only when you feed the placeholder, in the call to tf.Session.run(), does TensorFlow allocate appropriately sized memory for the input tensors.
If you later use input_placeholder in other operations, defining it with None, i.e. an unconstrained shape, can cause TensorFlow to perform some shape-related checks dynamically during the Session.run() call rather than at graph-construction time. This is because, while building the graph, TensorFlow does not know the exact shape of your input.
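For illustration, a minimal sketch (TF 1.x style, the variable names are my own) of how an unconstrained placeholder accepts different batch sizes at run time, while a fixed shape does not:

import numpy as np
import tensorflow as tf  # assumes TensorFlow 1.x (tf.placeholder / tf.Session)

# Unconstrained shape: both dimensions are only fixed at feed time.
input_placeholder = tf.placeholder(tf.float32, [None, None])
doubled = input_placeholder * 2.0

with tf.Session() as sess:
    # The same graph accepts different batch sizes and lengths.
    print(sess.run(doubled, feed_dict={input_placeholder: np.ones((64, 100))}).shape)  # (64, 100)
    print(sess.run(doubled, feed_dict={input_placeholder: np.ones((8, 50))}).shape)    # (8, 50)

# With a fixed shape [64, 100], feeding any other shape raises a ValueError
# at run time, since the fed array must match the declared shape exactly.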
Related
I have been trying to understand RNNs better and am creating an RNN from scratch using numpy. I am at the point where I have calculated a loss, but it was suggested that rather than doing the gradient descent and weight-matrix updates myself, I use PyTorch's .backward() function. I started reading the documentation and some posts here about how it works, and it seems that it calculates the gradients wherever a torch tensor has requires_grad=True in the function call.
So it seems that unless I create a torch tensor, I am not able to use .backward(). When I try to call it on my loss scalar, I get a 'numpy.float64' object has no attribute 'backward' error. I just wanted to confirm this. Thank you!
Yes, this will only work on PyTorch Tensors.
If the tensors are on the CPU, they are essentially numpy arrays wrapped in the PyTorch Tensor API (i.e., calling .numpy() on such a tensor returns the underlying data, which can be modified, etc.).
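A minimal sketch of that point, with made-up names: compute the loss from torch tensors (with requires_grad=True on the parameters) instead of numpy scalars, and .backward() becomes available:

import numpy as np
import torch

# Weight matrix as a torch tensor that tracks gradients.
W = torch.tensor(np.random.randn(3, 3), requires_grad=True)
x = torch.tensor(np.random.randn(3))        # input, no gradient needed
target = torch.zeros(3, dtype=torch.float64)

# A loss built from torch ops stays a torch tensor, so .backward() works.
loss = ((W @ x - target) ** 2).mean()
loss.backward()
print(W.grad.shape)  # torch.Size([3, 3])

# By contrast, a plain numpy scalar has no autograd machinery:
# np.float64(1.0).backward()  -> AttributeError, as in the question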
Is it possible to reuse tensors in multiple tf-graphs, even after they are reset?
Problem:
I have a large dataset that I want to evaluate with many different tf-graphs.
For each evaluation, tensorflow is reset with tf.compat.v1.reset_default_graph() and initialized completely from scratch.
Imho, it seems wasteful and slow to run the data-to-tensor procedure every time, so I thought I could just define the data tensor once and reuse it for all future evaluations.
Unfortunately, reusing tensors does not seem to be possible, as 'Tensor must be from the same graph as Tensor'.
ValueError: Tensor("Const:0", shape=(1670,), dtype=float32, device=/device:GPU:0) must be from the same graph as Tensor("Const_1:0", shape=(1670,), dtype=float32).
Is it possible to reuse these tensors somehow?
Check out this answer to another question: https://stackoverflow.com/a/42616834/13514201
TensorFlow stores all operations on a computational graph. This graph defines which outputs feed into which operations, linking everything together so that TensorFlow can follow the steps you have set up to produce your final output. If you try to feed a Tensor or operation from one graph into a Tensor or operation on another graph, it will fail. Everything must be on the same execution graph.
Try removing the with tf.Graph().as_default(): block.
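A hedged sketch of what that implies in practice (the function and names are illustrative): any tensor you want to use after tf.compat.v1.reset_default_graph() has to be recreated inside the fresh graph, because tensors from the old graph cannot be mixed with the new one:

import numpy as np
import tensorflow as tf

tf.compat.v1.disable_eager_execution()      # TF 1.x-style graph mode under TF 2.x
data = np.random.rand(1670).astype(np.float32)

def evaluate(build_graph):
    tf.compat.v1.reset_default_graph()      # the old graph and its tensors are gone
    data_tensor = tf.constant(data)         # recreate the data tensor in the new graph
    output = build_graph(data_tensor)
    with tf.compat.v1.Session() as sess:
        return sess.run(output)

# Reusing a tensor created before reset_default_graph() is what triggers
# "Tensor ... must be from the same graph as Tensor ...".
print(evaluate(lambda t: tf.reduce_sum(t)))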
Note: I already solved my issue, but I'm posting the question in case others have it too and because I don't understand how I solved it.
I was building a Named Entity Classifier (sequence labelling model) in Keras with a TensorFlow backend. When I tried to fit the model, I got this error (which, amazingly, returns only 4 Google results):
"If your data is in the form of symbolic tensors, you should specify the `steps_per_epoch` argument (instead of the batch_size argument, because symbolic tensors are expected to produce batches of input data)."
This Stack Overflow post discussed the issue, and someone suggested to the OP:
one of the data tensors being used by fit() is a symbolic tensor. The one-hot label function returns a symbolic tensor. Try something like:
label_onehot = tf.Session().run(K.one_hot(label, 5))
Then I read on this (not related) site:
The Wolfram System also has powerful algorithms to manipulate algebraic combinations of expressions representing [...] arrays. These expressions are called symbolic arrays or symbolic tensors.
These two sources made me think symbolic arrays (at least in TensorFlow) might be something more like arrays of functions that are yet to be evaluated, rather than actual values.
So, using %whos to view all my variables, I saw that my X and Y data were tensors (rather than arrays, like I normally use for my models). The data/info column had quite a complicated description for them, but I lost it once I solved my issue and I can't work out how to get back to the state where I was getting the error.
In any case, I know I solved the problem by changing my data pre-processing so that the X and y data (i.e. X_train and y_train) were of type <class 'numpy.ndarray'> and had dimensions (num_sents, max_len) for X_train and (num_sents, max_len, 1) for y_train (the trailing 1 is necessary because my final layer expects 3D input). Now the model works fine. But I'm still wondering: what are these symbolic tensors, and how/why is using steps_per_epoch instead of batch_size supposed to help? I tried that too initially but had no luck.
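For what it's worth, a rough sketch of the kind of pre-processing change that fixed it (the sizes and the padded lists below are just stand-ins for my real data):

import numpy as np

num_sents, max_len = 100, 50                              # illustrative sizes
X_padded = [[0] * max_len for _ in range(num_sents)]      # stand-in for padded word ids
y_padded = [[0] * max_len for _ in range(num_sents)]      # stand-in for padded label ids

X_train = np.asarray(X_padded)                            # shape: (num_sents, max_len)
y_train = np.expand_dims(np.asarray(y_padded), axis=-1)   # shape: (num_sents, max_len, 1)

print(type(X_train), X_train.shape, y_train.shape)
# <class 'numpy.ndarray'> (100, 50) (100, 50, 1)
# model.fit(X_train, y_train, batch_size=32, epochs=5)    # batch_size works again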
This can be solved by using the eval() or numpy() method of your tensors.
Check:
How can I convert a tensor into a numpy array in TensorFlow?
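A short sketch of both routes mentioned there (the tensor itself is just an example):

import tensorflow as tf

t = tf.constant([[1.0, 2.0], [3.0, 4.0]])

if tf.executing_eagerly():
    # TF 2.x eager mode: just call .numpy()
    arr = t.numpy()
else:
    # TF 1.x graph mode: evaluate the tensor inside a session
    with tf.compat.v1.Session() as sess:
        arr = t.eval(session=sess)

print(type(arr), arr.shape)  # <class 'numpy.ndarray'> (2, 2)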
So I have a custom layer that does not have any weights.
As a first step, I tried to implement the functions that manipulate the input tensors in Keras, but I did not succeed, for many reasons. My second approach was to implement the functions with numpy operations: since the custom layer I am implementing does not have any weights, my understanding is that I could use numpy operations, as I don't need backpropagation if there are no weights, right? Then I would just convert the output of my layer back to a tensor with:
keras.backend.variable(value=output)
So the main idea is to implement a custom layer that takes tensors, converts them to numpy arrays, operates on them with numpy operations, and then converts the output back to a tensor.
The problem is that I can't seem to use .eval() to convert the input tensors of my layer into numpy arrays, so that they could be manipulated with numpy operations.
Can anybody tell me how I can get around this problem?
As mentioned by Daniel Möller in the comments, Keras needs to be able to backpropagate through your layer in order to calculate the gradients for the previous layers. Your layer therefore needs to be differentiable.
For the same reason you can only use Keras operations, as those can be automatically differentiated with autograd. If your layer is something simple, have a look at the Lambda layer, which lets you implement custom layers quickly, as in the sketch below.
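A minimal Lambda sketch under that constraint (the op itself is just an example): the custom, weight-less step is written with differentiable TensorFlow ops rather than numpy, so gradients can still flow through it:

import tensorflow as tf
from tensorflow.keras import layers, models

def scale_and_clip(x):
    # Weight-less custom op, expressed with differentiable TF ops.
    return tf.clip_by_value(x * 2.0, 0.0, 1.0)

model = models.Sequential([
    layers.Dense(16, activation='relu', input_shape=(10,)),
    layers.Lambda(scale_and_clip),   # no weights, but gradients still flow
    layers.Dense(1),
])
model.compile(optimizer='adam', loss='mse')
model.summary()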
As an aside, the Keras backend functions should cover a lot of use cases, so if you get stuck writing your layer with those, you might want to post another question here.
Hope this helps.
When I read TensorFlow code, I see people define placeholders for the input arguments of their functions and then feed the input data in a session.run() call. A trivial example looks like this:
def sigmoid(z):
    x = tf.placeholder(tf.float32, name='x')
    sigmoid = tf.sigmoid(x)
    with tf.Session() as session:
        result = session.run(sigmoid, feed_dict={x: z})
    return result
I wonder why they don't directly feed z into tf.sigmoid(z) and get rid of the placeholder x?
If this is a best practice, what is the reason behind it?
In your example method sigmoid, you basically build a small computation graph and run it with session.run() in the same method. Yes, in your case using a placeholder adds no benefit.
However, usually people just build the computation graph and execute it with data later. At the time of building the graph, the data is not needed; that's why we use a placeholder to hold the place of the data. In other words, it allows us to define our computing operations without needing any data.
This should also explain why we want to use tf.placeholder instead of tf.Variable for holding training data. In short (see the sketch after the list below):
tf.Variable is for trainable parameters of the model.
tf.placeholder is for training data, which does not change as the model trains.
No initial values are needed for placeholders.
The first dimension of the fed data can be None, thus supporting any batch_size.
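A small sketch of that build-once / feed-many pattern (TF 1.x style, names are my own), in contrast to the sigmoid example above, which rebuilds the graph on every call:

import numpy as np
import tensorflow as tf  # assumes TensorFlow 1.x (tf.placeholder / tf.Session)

# Build the graph once, without any data.
x = tf.placeholder(tf.float32, shape=[None], name='x')   # None -> any batch size
y = tf.sigmoid(x)

with tf.Session() as sess:
    # Feed different data (and batch sizes) through the same graph.
    print(sess.run(y, feed_dict={x: np.array([0.0, 1.0], dtype=np.float32)}))
    print(sess.run(y, feed_dict={x: np.linspace(-3, 3, 7)}))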