I'm playing around with tensorflow and ran into a problem with the following code:
def _init_parameters(self, input_data, labels):
    # the input shape is (batch_size, input_size)
    input_size = tf.shape(input_data)[1]
    # labels in one-hot format have shape (batch_size, num_classes)
    num_classes = tf.shape(labels)[1]
    stddev = 1.0 / tf.cast(input_size, tf.float32)
    w_shape = tf.pack([input_size, num_classes], 'w-shape')
    normal_dist = tf.truncated_normal(w_shape, stddev=stddev, name='normaldist')
    self.w = tf.Variable(normal_dist, name='weights')
(I'm using tf.pack as suggested in this question, since I was getting the same error)
When I run it (from a larger script that invokes this one), I get this error:
ValueError: initial_value must have a shape specified: Tensor("normaldist:0", shape=TensorShape([Dimension(None), Dimension(None)]), dtype=float32)
I tried to replicate the process in the interactive shell. Indeed, the dimensions of normal_dist are unspecified, although the supplied values do exist:
In [70]: input_size.eval()
Out[70]: 4
In [71]: num_classes.eval()
Out[71]: 3
In [72]: w_shape.eval()
Out[72]: array([4, 3], dtype=int32)
In [73]: normal_dist.eval()
Out[73]:
array([[-0.27035281, -0.223277 , 0.14694688],
[-0.16527176, 0.02180306, 0.00807841],
[ 0.22624688, 0.36425814, -0.03099642],
[ 0.25575709, -0.02765726, -0.26169327]], dtype=float32)
In [78]: normal_dist.get_shape()
Out[78]: TensorShape([Dimension(None), Dimension(None)])
This is weird. TensorFlow generates the values but can't say their shape. Am I doing something wrong?
As Ishamael says, all tensors have a static shape, which is known at graph construction time and accessible using Tensor.get_shape(), and a dynamic shape, which is only known at runtime and is accessible by fetching the value of the tensor or passing it to an operator like tf.shape. In many cases the static and dynamic shapes are the same, but they can be different: the static shape can be partially defined, in order to allow the dynamic shape to vary from one step to the next.
In your code normal_dist has a partially-defined static shape, because w_shape is a computed value. (TensorFlow sometimes attempts to evaluate these computed values at graph construction time, but it gets stuck at tf.pack.) It infers the shape TensorShape([Dimension(None), Dimension(None)]), which means "a matrix with an unknown number of rows and columns," because it knows that w_shape is a vector of length 2, so the resulting normal_dist must be 2-dimensional.
You have two options to deal with this. You can set the static shape as Ishamael suggests, but this requires you to know the shape at graph construction time. For example, the following may work:
normal_dist.set_shape([input_data.get_shape()[1], labels.get_shape()[1]])
Alternatively, you can pass validate_shape=False to the tf.Variable constructor. This allows you to create a variable with a partially-defined shape, but it limits the amount of static shape information that can be inferred later on in the graph.
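A minimal sketch of this second option, reusing normal_dist from the question:

# skip static-shape validation so a partially-defined initial value is accepted;
# the trade-off is weaker shape inference for ops that use self.w later
self.w = tf.Variable(normal_dist, name='weights', validate_shape=False)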
A similar question is nicely explained in the TF FAQ:
In TensorFlow, a tensor has both a static (inferred) shape and a dynamic (true) shape. The static shape can be read using the tf.Tensor.get_shape method: this shape is inferred from the operations that were used to create the tensor, and may be partially complete. If the static shape is not fully defined, the dynamic shape of a Tensor t can be determined by evaluating tf.shape(t).
So tf.shape() returns a tensor, which will always have shape (N,), and which can be computed in a session:
a = tf.Variable(tf.zeros(shape=(2, 3, 4)))
with tf.Session() as sess:
    print(sess.run(tf.shape(a)))
On the other hand, you can extract the static shape using x.get_shape().as_list(), and this can be computed anywhere, without a session.
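For contrast, a small sketch of the static route, reusing the variable a from the session example above:

print(a.get_shape().as_list())  # [2, 3, 4], available at graph construction time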
The variable can have a dynamic shape. get_shape() returns the static shape.
In your case you have a tensor that has a dynamic shape, and currently happens to hold a value that is 4x3 (but at some other time it can hold a value with a different shape, because the shape is dynamic). To set the static shape, call set_shape() with a shape that is known at graph construction time (a computed tensor like w_shape won't do). After that, the shape you set will be enforced, and the tensor will be a valid initial_value.
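A minimal sketch, assuming the 4x3 case from the interactive session is the shape you always want:

normal_dist.set_shape([4, 3])  # the static shape must be known here, not computed
self.w = tf.Variable(normal_dist, name='weights')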
Related
What is the difference between these two?
1- tf.reshape(tensor, [-1])
2- tf.reshape(tensor, -1)
I cannot find any difference between these two, but when I use -1 without brackets, an error occurs when trying to map the function over a TensorSliceDataset.
Here is the simplified version of the code:
def reshapeME(tensor):
    reshaped = tf.reshape(tensor, -1)
    return reshaped

new_y_test = y_test.map(reshapeME)
and here is the Error:
ValueError: Shape must be rank 1 but is rank 0 for '{{node Reshape}} = Reshape[T=DT_FLOAT, Tshape=DT_INT32](one_hot, Reshape/shape)' with input shapes: [6], [].
If I add the bracket, there is no error. Also, there is no error when the function is used by calling and feeding a tensor.
tf.reshape expects a tensor or tensor-like value as the shape argument in graph mode:
A Tensor. Must be one of the following types: int32, int64. Defines the shape of the output tensor.
So, simple scalars will not work in this case. The map function of a tf.data.Dataset is always executed in Graph mode:
Note that irrespective of the context in which map_func is defined (eager vs. graph), tf.data traces the function and executes it as a graph.
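So the fix from the question (adding the brackets) is the right one; a minimal sketch of the corrected function:

def reshapeME(tensor):
    # [-1] is a rank-1 shape, so graph-mode tracing sees a valid shape tensor
    return tf.reshape(tensor, [-1])

new_y_test = y_test.map(reshapeME)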
I have an n-D array. I need to create a 1-D range tensor based on its dimensions. For example:
x = tf.placeholder(tf.float32, shape=[None, 4])
r = tf.range(start=0, limit=x.shape[0], delta=1, dtype=tf.int32, name='range')
sess = tf.Session()
result = sess.run(r, feed_dict={x: raw_lidar})
print(r)
The problem is that x.shape[0] is None at the time of building the computational graph, so I cannot build the range tensor. It gives an error:
ValueError: Cannot convert an unknown Dimension to a Tensor: ?
Any suggestion or help with the problem would be appreciated. Thanks in advance.
x.shape[0] might not exist yet when running this code in graph mode. If you want a value available at run time, you need to use tf.shape(x)[0].
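Applied to the snippet from the question, a minimal sketch:

x = tf.placeholder(tf.float32, shape=[None, 4])
# tf.shape(x)[0] is a scalar tensor evaluated at run time, so the unknown
# batch dimension no longer breaks graph construction
r = tf.range(start=0, limit=tf.shape(x)[0], delta=1, dtype=tf.int32, name='range')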
More information about that behaviour in the documentation for tf.Tensor.get_shape. An excerpt (emphasis is mine):
tf.Tensor.get_shape() is equivalent to tf.Tensor.shape.
When executing in a tf.function or building a model using tf.keras.Input, Tensor.shape may return a partial shape (including None for unknown dimensions). See tf.TensorShape for more details.
>>> inputs = tf.keras.Input(shape = [10])
>>> # Unknown batch size
>>> print(inputs.shape)
(None, 10)
The shape is computed using shape inference functions that are registered for each tf.Operation.
The returned tf.TensorShape is determined at build time, without executing the underlying kernel. It is not a tf.Tensor. If you need a shape tensor, either convert the tf.TensorShape to a tf.constant, or use the tf.shape(tensor) function, which returns the tensor's shape at execution time.
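As a small sketch of the two routes the docs mention, reusing the inputs example above:

inputs = tf.keras.Input(shape=[10])
feature_dim = tf.constant(inputs.shape[1])  # fully-known dimension as a constant
dynamic_shape = tf.shape(inputs)            # includes the batch dim, known only at run time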
I am trying to reshape an array of size (14,14,3) to (None, 14,14,3). I have seen that the output of each layer in convolutional neural network has shape in the format(None, n, n, m).
Consider that the name of my array is arr
I tried arr[None,:,:] but it converts it to a dimension of (1,14,14,3).
How should I do it?
https://www.tensorflow.org/api_docs/python/tf/TensorShape
A TensorShape represents a possibly-partial shape specification for a Tensor. It may be one of the following:
Partially-known shape: has a known number of dimensions, and an unknown size for one or more dimension. e.g. TensorShape([None, 256])
That is not possible in numpy. All dimensions of an ndarray are known.
The arr[None,:,:] notation adds a new size-1 dimension, giving (1,14,14,3). Under broadcasting rules, such a dimension may be changed to match a dimension of another array; in that sense we often treat the None as a flexible dimension.
I haven't worked with tensorflow, though I see a lot of questions with both tags. tensorflow should have mechanisms for transferring values to and from tensors. It knows about numpy, but numpy does not 'know' anything about tensorflow.
An ndarray is an object with known values, and its shape is used to access those values in a multidimensional way. In contrast, a tensor does not have values:
https://www.tensorflow.org/api_docs/python/tf/Tensor
It does not hold the values of that operation's output, but instead provides a means of computing those values
It looks like you can create a TensorProto from an array (and return an array from one as well):
https://www.tensorflow.org/api_docs/python/tf/make_tensor_proto
and to make a Tensor from an array:
https://www.tensorflow.org/api_docs/python/tf/convert_to_tensor
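For instance, a tiny sketch of that conversion route, assuming a 14x14x3 array:

import numpy as np
import tensorflow as tf

arr = np.zeros((14, 14, 3), dtype=np.float32)
t = tf.convert_to_tensor(arr)  # a Tensor whose static shape (14, 14, 3) is fully known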
The shape (None, 14, 14, 3) represents (batch_size, imgH, imgW, imgChannel); imgH and imgW can be used interchangeably depending on the network and the problem.
The batch size is given as None in the neural network because we don't want to restrict it to some specific value: the batch size depends on a lot of factors, like the memory available for the model to run in.
So let's say you have 4 images of size 14x14x3. You can append each image into an array, say L1, and L1 will have the shape 4x14x14x3, i.e. you made a batch of 4 images which you can now feed to your neural network.
NOTE: here None will be replaced by 4, and for the whole training process it will be 4. Similarly, when you feed your network only one image, it assumes a batch size of 1 and sets None equal to 1, giving you the shape (1x14x14x3).
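A minimal numpy sketch of that batching, with hypothetical all-zero images standing in for real data:

import numpy as np

images = [np.zeros((14, 14, 3), dtype=np.float32) for _ in range(4)]
batch = np.stack(images, axis=0)         # shape (4, 14, 14, 3): None becomes 4
single = np.expand_dims(images[0], 0)    # shape (1, 14, 14, 3): a batch of one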
I have an existing complex model. Inside there is a tensor x with shape (None, 128, 128, 3). The first axis has a dynamic shape that should be materialized when a batch is passed to feed_dict in session.run. However, when I attempt to define a broadcast operation to the shape of x:
y = tf.broadcast_to(z, (x.shape[0], x.shape[1], x.shape[2], 1))
An exception is raised:
Failed to convert object of type <class 'tuple'> to Tensor. Contents: (Dimension(None), Dimension(128), Dimension(128), 1). Consider casting elements to a supported type.
The exception occurs when creating the model, not when running it. Converting the first element to a number helps, but this is not a real solution.
The .shape attribute gives you the shape known at graph construction time, which is a tf.TensorShape structure. If the shape of x were fully known, you could get your code to work as follows:
y = tf.broadcast_to(z, (x.shape[0].value, x.shape[1].value, x.shape[2].value, 1))
However, in your case x has an unknown first dimension. In order to use the actual tensor shape as a regular tf.Tensor (with value only known at runtime), you can use tf.shape:
x_shape = tf.shape(x)
y = tf.broadcast_to(z, (x_shape[0], x_shape[1], x_shape[2], 1))
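As a quick usage check of the above (z here is a hypothetical stand-in that broadcasts against x):

x = tf.placeholder(tf.float32, shape=(None, 128, 128, 3))
z = tf.zeros((128, 128, 1))
x_shape = tf.shape(x)
y = tf.broadcast_to(z, (x_shape[0], x_shape[1], x_shape[2], 1))
# y's static shape should report the batch dim as unknown, e.g. (?, 128, 128, 1),
# with the actual value resolved at run time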
I am working on OCR software optimized for phone camera images.
Currently, each 300 x 1000 x 3 (RGB) image is reformatted as a 900 x 1000 numpy array. I have plans for a more complex model architecture, but for now I just want to get a baseline working. I want to get started by training a static RNN on the data that I've generated.
Formally, I am feeding in n_t at each timestep t for T timesteps, where n_t is a 900-vector and T = 1000 (similar to reading the whole image left to right). Here is the TensorFlow code in which I create batches for training:
sequence_dataset = tf.data.Dataset.from_generator(example_generator,
                                                  (tf.int32, tf.int32))
sequence_dataset = sequence_dataset.batch(experiment_params['batch_size'])
iterator = sequence_dataset.make_initializable_iterator()
x_batch, y_batch = iterator.get_next()
The tf.nn.static_bidirectional_rnn documentation claims that the input must be a "length T list of inputs, each a tensor of shape [batch_size, input_size], or a nested tuple of such elements." So, I go through the following steps in order to get the data into the correct format.
# Dimensions go from [batch, n, t] -> [t, batch, n]
x_batch = tf.transpose(x_batch, [2, 0, 1])
# Unpack such that x_batch is a length T list with element dims [batch_size, n]
x_batch = tf.unstack(x_batch, experiment_params['example_t'], 0)
Without altering the batch any further, I make the following call:
output, _, _ = tf.nn.static_rnn(lstm_fw_cell, x_batch, dtype=tf.int32)
Note that I do not explicitly tell TensorFlow the dimensions of the matrices (this could be the problem). They all have the same dimensionality, yet I am getting the following error:
ValueError: Input size (dimension 0 of inputs) must be accessible via shape inference, but saw value None.
At which point in my stack should I be declaring the dimensions of my input? Because I am using a Dataset and hoping to get its batches directly to the RNN, I am not sure that the "placeholder -> feed_dict" route makes sense. If that in fact is the method that makes the most sense, let me know what that looks like (I definitely do not know). Otherwise, let me know if you have any other insights to the problem. Thanks!
The reason for the absence of static shape information is that TensorFlow doesn't understand enough about the example_generator function to determine the shapes of the arrays it yields, and so it assumes the shapes can be completely different from one element to the next. The best way to constrain this is to specify the optional output_shapes argument to tf.data.Dataset.from_generator(), which accepts a nested structure of shapes matching the structure of the yielded elements (and the output_types argument).
In this case you'd pass a tuple of two shapes, which can be partially specified. For example, if the x elements are 900 x 1000 arrays and the y elements are scalars:
sequence_dataset = tf.data.Dataset.from_generator(
    example_generator, (tf.int32, tf.int32),
    output_shapes=([900, 1000], []))
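With those shapes declared, the transpose/unstack pipeline from the question should be able to infer T and n statically; a sketch of the expected effect:

sequence_dataset = sequence_dataset.batch(experiment_params['batch_size'])
iterator = sequence_dataset.make_initializable_iterator()
x_batch, y_batch = iterator.get_next()      # x_batch static shape: (?, 900, 1000)
x_batch = tf.transpose(x_batch, [2, 0, 1])  # -> (1000, ?, 900)
x_batch = tf.unstack(x_batch, axis=0)       # T = 1000 inferred from the static shape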