In Keras, how to use Reshape layer with None dimension? - python

In my model, a layer has a shape of [None, None, 40, 64]. I want to reshape this into [None, None, 40*64]. However, if I simply do the following:
reshaped_layer = Reshape((None, None, 40*64))(my_layer)
It throws an error complaining that None values are not supported.
(Just to be clear, this is not tf.keras, this is just Keras).

First of all, the argument you pass to the Reshape layer is the desired shape of one sample in the batch, not of the whole batch of samples. Since each sample in the batch is a 3D tensor, the argument must describe only that 3D tensor (i.e. it excludes the batch axis).
Second, you can use -1 as the size of exactly one axis. It tells the Reshape layer to infer the size of that axis automatically from the sizes of the other axes you provide. Considering these two points, it would be:
reshaped_out = Reshape((-1, 40*64))(layer_out)
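For instance, a minimal sketch of that in the functional API (the input dimensions are hypothetical, chosen to mirror the question):
from keras.layers import Input, Reshape
from keras.models import Model

# per-sample shape; the batch axis is excluded and the first axis has unknown length
inp = Input(shape=(None, 40, 64))
# -1 lets Keras infer the unknown axis; 40*64 collapses the last two axes
out = Reshape((-1, 40 * 64))(inp)
model = Model(inp, out)
model.summary()  # the output shape should be reported as (None, None, 2560)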

Related

Pytorch copy a neuron in a layer

I am using pytorch 0.3.0. I'm trying to selectively copy a neuron and its weights within the same layer, then replace the original neuron with another set of weights. Here's my attempt at that:
reshaped_data2 = data2.unsqueeze(0)
new_layer_data = torch.cat([new_layer.data, reshaped_data2], dim=0)
new_layer_data[i] = data1
new_layer.data.copy_(new_layer_data)
First I unsqueezed data2 to make it a 1×X tensor instead of a plain X-sized one.
Then I concatenate my layer's tensor with the reshaped data2 along dimension 0.
I then replace the original data2 located at index i with data1.
Finally, I copy all of that into my layer.
The error I get is:
RuntimeError: inconsistent tensor size, expected tensor [10 x 128] and src [11 x 128] to have the same number of elements, but got 1280 and 1408 elements respectively at /Users/soumith/code/builder/wheel/pytorch-src/torch/lib/TH/generic/THTensorCopy.c:86
If I do a simple assignment instead of copy I get
RuntimeError: The expanded size of the tensor (11) must match the existing size (10) at non-singleton dimension 1. at /Users/soumith/code/builder/wheel/pytorch-src/torch/lib/TH/generic/THTensor.c:309
I understand the error, but what is the right way to go about this?
You're trying to replace a 10x128 tensor with an 11x128 tensor, which the model doesn't allow. Is new_layer initialised with the size (11, 128)?
If not, try creating your new layer with the desired size (11, 128) and then copy/assign your new_layer_data.
The solution here is to create a new model with the correct size and pass in weights as default values. No dynamic expansion solution was found.
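For example, a minimal sketch of that approach (the layer type, sizes, and index i are assumptions for illustration, not taken from the question):
import torch
import torch.nn as nn

old_layer = nn.Linear(128, 10)  # hypothetical layer: 10 neurons, 128 inputs each
i = 3                           # hypothetical index of the neuron to duplicate

# build the expanded (11, 128) weight matrix: keep the old rows and append a
# copy of row i; do the same for the bias vector
new_weight = torch.cat([old_layer.weight.data,
                        old_layer.weight.data[i].unsqueeze(0)], dim=0)
new_bias = torch.cat([old_layer.bias.data,
                      old_layer.bias.data[i].unsqueeze(0)], dim=0)

# create a fresh layer with the larger size and copy the expanded weights in
new_layer = nn.Linear(128, 11)
new_layer.weight.data.copy_(new_weight)
new_layer.bias.data.copy_(new_bias)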

converting an array of size (n,n,m) to (None,n,n,m)

I am trying to reshape an array of size (14, 14, 3) to (None, 14, 14, 3). I have seen that the output of each layer in a convolutional neural network has a shape in the format (None, n, n, m).
Consider that the name of my array is arr
I tried arr[None,:,:], but that gives it a shape of (1, 14, 14, 3).
How should I do it?
https://www.tensorflow.org/api_docs/python/tf/TensorShape
A TensorShape represents a possibly-partial shape specification for a Tensor. It may be one of the following:
Partially-known shape: has a known number of dimensions, and an unknown size for one or more dimension. e.g. TensorShape([None, 256])
That is not possible in numpy. All dimensions of an ndarray are known.
arr[None,:,:] notation adds a new size 1 dimension, (1,14,14,3). Under broadcasting rules, such a dimension may be changed to match a dimension of another array. In that sense we often treat the None as a flexible dimension.
I haven't worked with tensorflow, though I see a lot of questions with both tags. tensorflow should have mechanisms for transferring values to and from tensors. It knows about numpy, but numpy does not 'know' anything about tensorflow.
An ndarray is an object with known values, and its shape is used to access those values in a multidimensional way. In contrast, a tensor does not have values:
https://www.tensorflow.org/api_docs/python/tf/Tensor
It does not hold the values of that operation's output, but instead provides a means of computing those values
Looks like you can create a TensorProto from an array (and return an array from one as well):
https://www.tensorflow.org/api_docs/python/tf/make_tensor_proto
and to make a Tensor from an array:
https://www.tensorflow.org/api_docs/python/tf/convert_to_tensor
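For instance, a rough sketch of moving an array into tensor form with the functions linked above (not tied to any particular TensorFlow version):
import numpy as np
import tensorflow as tf

arr = np.zeros((14, 14, 3), dtype=np.float32)

# numpy -> Tensor: the static shape is fully known here, no None involved
t = tf.convert_to_tensor(arr)
print(t.get_shape())  # (14, 14, 3)

# numpy -> TensorProto, the serialisable protobuf form
proto = tf.make_tensor_proto(arr)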
The shape (None, 14, 14, 3) represents (batch_size, imgH, imgW, imgChannel); imgH and imgW can be used interchangeably depending on the network and the problem.
The batch size is given as "None" because we don't want to restrict it to one specific value; the batch size you can actually use depends on factors such as the memory available for your model to run.
So let's say you have 4 images of size 14x14x3. You can append each image to an array, say L1, and L1 will then have the shape 4x14x14x3, i.e. you have made a batch of 4 images that you can feed to your neural network.
NOTE: here None will be replaced by 4, and it will stay 4 for the whole training process. Similarly, when you feed your network only one image, it assumes a batch size of 1 and sets None equal to 1, giving you the shape (1, 14, 14, 3).
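A short sketch of that batching in numpy (the random images are just placeholders):
import numpy as np

# four hypothetical 14x14x3 images collected into one batch
images = [np.random.rand(14, 14, 3) for _ in range(4)]
batch = np.stack(images, axis=0)  # shape (4, 14, 14, 3): None becomes 4
print(batch.shape)

# a single image gets a batch axis of 1 via None / np.newaxis
single = images[0][None, ...]     # shape (1, 14, 14, 3): None becomes 1
print(single.shape)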

Unflattening Layer in Keras

I would like to create a simple Keras neural network that accepts an input matrix of dimension (rows, columns) = (n, m), flattens the matrix to a dimension (n*m, 1), sends the flattened matrix through a number of arbitrary layers, and in the final layer, once more unflattens the matrix to a dimension of (n, m) before releasing this final matrix as an output.
The issue I'm having is that I haven't found any documentation for an Unflatten layer on the keras.io page, and I'm wondering whether there is a reason that such a seemingly common layer doesn't exist. Is there a more natural and easy way to do what I'm proposing?
You can use the Reshape layer for this purpose. It accepts the desired output shape as its argument and reshapes the input tensor to that shape. For example:
from keras.layers import Reshape
rsh_inp = Reshape((n*m, 1))(inp) # if you don't want the last axis with dimension 1, you can also use Flatten layer
# rsh_inp goes through a number of arbitrary layers ...
# reshape back the output
out = Reshape((n,m))(out_rsh_inp)
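Putting it together, a minimal end-to-end sketch (the dimensions and the intermediate Dense layers are arbitrary choices for illustration, not part of the question):
from keras.layers import Input, Dense, Flatten, Reshape
from keras.models import Model

n, m = 8, 5  # hypothetical matrix dimensions

inp = Input(shape=(n, m))
x = Flatten()(inp)                   # (n*m,) per sample
x = Dense(64, activation='relu')(x)  # arbitrary intermediate layers
x = Dense(n * m)(x)                  # project back to n*m values
out = Reshape((n, m))(x)             # "unflatten" to the original matrix shape

model = Model(inp, out)
model.summary()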

Concatenation layer in tensorflow

Given two tensors t1 = [?, 1, 1, 1, 2048] and t2 = [?, 3, 1, 1, 256], as seen in the image, how would these be concatenated? Currently, I am using:
tf.concat([t1, t2], 4)
However, given that my architecture has a large amount of layers with many concatenations, I eventually have a tensor that is too large (in terms of channels/features) to initialize. Is this the correct way to implement a concatenation layer?
First of all, the shapes of the tensors in the inception layer are not what you describe. 1x1, 1x3 and 3x1 are the shapes of the filters applied to the image. There are two more parameters in a convolution, padding and striding, and depending on their exact values the resulting shape can be very different.
In this particular case the spatial shape doesn't change; only the channels dimensions differ (2048 and 256), which is why the feature maps can be concatenated. Concatenating your original t1 and t2 as written will result in an error.
Is this the correct way to implement a concatenation layer?
Yes, feature map concatenation is one of the key ideas of the inception network, and its implementation indeed uses tf.concat (e.g. see the inception v1 source code).
Note that this tensor will grow in one direction (channels / features), but contract in spatial dimensions because of downsampling, so it won't get too large. Also note that this tensor is the transformed input data (image), hence unlike the weights, it's not initialized, but rather flows through the network. The weights will be the tensors 1x1x2048=2048, 1x3x224=672, 3x1x256=768, etc - as you can see they are not very big at all, and that's another idea of the inception network.
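As a rough sketch of how such feature maps concatenate along the channels axis once the spatial dimensions agree (the shapes here are made up for illustration):
import tensorflow as tf

# two hypothetical feature maps with matching spatial dimensions
# (batch, height, width, channels)
t1 = tf.zeros([8, 28, 28, 2048])
t2 = tf.zeros([8, 28, 28, 256])

# concatenate along the channels axis, as inception-style blocks do
merged = tf.concat([t1, t2], axis=-1)
print(merged.get_shape())  # (8, 28, 28, 2304)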

Tensor with unspecified dimension in tensorflow

I'm playing around with tensorflow and ran into a problem with the following code:
def _init_parameters(self, input_data, labels):
    # the input shape is (batch_size, input_size)
    input_size = tf.shape(input_data)[1]
    # labels in one-hot format have shape (batch_size, num_classes)
    num_classes = tf.shape(labels)[1]
    stddev = 1.0 / tf.cast(input_size, tf.float32)
    w_shape = tf.pack([input_size, num_classes], 'w-shape')
    normal_dist = tf.truncated_normal(w_shape, stddev=stddev, name='normaldist')
    self.w = tf.Variable(normal_dist, name='weights')
(I'm using tf.pack as suggested in this question, since I was getting the same error)
When I run it (from a larger script that invokes this one), I get this error:
ValueError: initial_value must have a shape specified: Tensor("normaldist:0", shape=TensorShape([Dimension(None), Dimension(None)]), dtype=float32)
I tried to replicate the process in the interactive shell. Indeed, the dimensions of normal_dist are unspecified, although the supplied values do exist:
In [70]: input_size.eval()
Out[70]: 4
In [71]: num_classes.eval()
Out[71]: 3
In [72]: w_shape.eval()
Out[72]: array([4, 3], dtype=int32)
In [73]: normal_dist.eval()
Out[73]:
array([[-0.27035281, -0.223277 , 0.14694688],
[-0.16527176, 0.02180306, 0.00807841],
[ 0.22624688, 0.36425814, -0.03099642],
[ 0.25575709, -0.02765726, -0.26169327]], dtype=float32)
In [78]: normal_dist.get_shape()
Out[78]: TensorShape([Dimension(None), Dimension(None)])
This is weird. Tensorflow generates the vector but can't say its shape. Am I doing something wrong?
As Ishamael says, all tensors have a static shape, which is known at graph construction time and accessible using Tensor.get_shape(), and a dynamic shape, which is only known at runtime and is accessible by fetching the value of the tensor, or by passing it to an operator like tf.shape. In many cases the static and dynamic shapes are the same, but they can be different - the static shape can be partially defined - in order to allow the dynamic shape to vary from one step to the next.
In your code normal_dist has a partially-defined static shape, because w_shape is a computed value. (TensorFlow sometimes attempts to evaluate these computed values at graph construction time, but it gets stuck at tf.pack.) It infers the shape TensorShape([Dimension(None), Dimension(None)]), which means "a matrix with an unknown number of rows and columns," because it knows that w_shape is a vector of length 2, so the resulting normal_dist must be 2-dimensional.
You have two options to deal with this. You can set the static shape as Ishamael suggests, but this requires you to know the shape at graph construction time. For example, the following may work:
normal_dist.set_shape([input_data.get_shape()[1], labels.get_shape()[1]])
Alternatively, you can pass validate_shape=False to the tf.Variable constructor. This allows you to create a variable with a partially-defined shape, but it limits the amount of static shape information that can be inferred later on in the graph.
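Concretely, either option might look like this inside the question's _init_parameters (a sketch only, not tested against the asker's full script):
# option 1: set the static shape from the placeholders' known dimensions
normal_dist.set_shape([input_data.get_shape()[1], labels.get_shape()[1]])
self.w = tf.Variable(normal_dist, name='weights')

# option 2: skip static-shape validation when creating the variable
self.w = tf.Variable(normal_dist, name='weights', validate_shape=False)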
Similar question is nicely explained in TF FAQ:
In TensorFlow, a tensor has both a static (inferred) shape and a
dynamic (true) shape. The static shape can be read using the
tf.Tensor.get_shape method: this shape is inferred from the operations
that were used to create the tensor, and may be partially complete. If
the static shape is not fully defined, the dynamic shape of a Tensor t
can be determined by evaluating tf.shape(t).
So tf.shape() returns a tensor, which will always have shape (N,), and it can be evaluated in a session:
a = tf.Variable(tf.zeros(shape=(2, 3, 4)))
with tf.Session() as sess:
    print(sess.run(tf.shape(a)))
On the other hand you can extract the static shape by using x.get_shape().as_list() and this can be calculated anywhere.
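A small sketch of the difference, in the TF 1.x style used in the question:
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(None, 3))

print(x.get_shape().as_list())  # [None, 3] -- static shape, known at graph time
shape_op = tf.shape(x)          # dynamic shape, only known once data is fed

with tf.Session() as sess:
    print(sess.run(shape_op, feed_dict={x: np.zeros((5, 3))}))  # [5 3]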
The variable can have a dynamic shape. get_shape() returns the static shape.
In your case you have a tensor that has a dynamic shape, and currently happens to hold value that is 4x3 (but at some other time it can hold a value with a different shape -- because the shape is dynamic). To set the static shape, use set_shape(w_shape). After that the shape you set will be enforced, and the tensor will be a valid initial_value.
