I'm trying to implement a TensorFlow Estimator, and I'm getting a shape mismatch error I don't know how to debug. I think I may be misunderstanding how to specify the tf.feature_column's shape. My intention is to create a model with 6010 inputs. Any suggestions would be appreciated.
def train_input_fn():
    with np.load(TRAIN_NN_FEATURES) as train:
        train_features = train['features']
        train_labels = train['labels']
    train_dataset = tf.data.Dataset.from_tensor_slices(
        ({'all_features': train_features}, train_labels))
    train_iterator = train_dataset.make_one_shot_iterator()
    return train_iterator.get_next()
all_features = tf.feature_column.numeric_column(
    'all_features',
    shape=(6010,),
    dtype=tf.float64
)
estimator = tf.estimator.DNNClassifier(
    feature_columns=[all_features],
    hidden_units=[1024, 512, 256]
)
estimator.train(input_fn=train_input_fn)
When I run this, I get the following error:
InvalidArgumentError (see above for traceback): Input to reshape
is a tensor with 6010 values, but the requested shape has 36120100
[[Node: dnn/input_from_feature_columns/input_layer/all_features/Reshape =
Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]
(dnn/input_from_feature_columns/input_layer/all_features/ToFloat,
dnn/input_from_feature_columns/input_layer/all_features/Reshape/shape)]]
The shape of the data is as I expect, but the feature_column seems to be expecting its square.
>>> train_features.shape
(10737, 6010)
>>> train_labels.shape
(10737, 1)
>>> 36120100./6010
6010.0
My understanding is that Dataset.from_tensor_slices takes slices along axis 0 of the given tensor, which corresponds with the error message "Input to reshape is a tensor with 6010 values." But why is a shape with 36120100 values being requested?
I'd still like to know why the above wasn't working, or how to debug it, though.
The problem is with the tensor shapes produced by train_iterator.get_next(). If the batch size is not specified, the iterator returns:
({'all_features': <tf.Tensor 'IteratorGetNext:0' shape=(6010,) dtype=float64>},
<tf.Tensor 'IteratorGetNext:1' shape=(1,) dtype=float64>)
... tuple. As you can see, the features tensor has shape (6010,), which DNNClassifier interprets as batch_size=6010 (by convention, the first dimension is the batch size), while it still expects 6010 features per example. Hence the error: it can't reshape (6010,) to (6010, 6010).
To make it work, you have to either reshape this tensor manually or simply set the batch size by calling:
train_dataset = train_dataset.batch(16)
Even a batch size of 1 will do, as it forces the get_next tensors to be:
({'all_features': <tf.Tensor 'IteratorGetNext:0' shape=(?, 6010) dtype=float64>},
<tf.Tensor 'IteratorGetNext:1' shape=(?, 1) dtype=float64>)
... but you will clearly want to make it larger for efficiency.
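Putting it together, a minimal sketch of the corrected input function (the batch size of 16 is illustrative; tune it for your hardware):

def train_input_fn():
    with np.load(TRAIN_NN_FEATURES) as train:
        train_features = train['features']
        train_labels = train['labels']
    train_dataset = tf.data.Dataset.from_tensor_slices(
        ({'all_features': train_features}, train_labels))
    # Batching makes get_next() yield (?, 6010) features and (?, 1) labels
    # instead of single unbatched examples.
    train_dataset = train_dataset.batch(16)
    train_iterator = train_dataset.make_one_shot_iterator()
    return train_iterator.get_next()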
Related
I have a model that takes two inputs of the same shape (batch_size, 512, 512, 1) and predicts two masks, each of shape (batch_size, 512, 512, 1).
dataset_input = tf.data.Dataset.zip((dataset_img_A, dataset_img_B))
dataset_output = tf.data.Dataset.zip((seg_A, seg_B))
dataset = tf.data.Dataset.zip((dataset_input, dataset_output))
dataset = dataset.repeat()
dataset = dataset.batch(batch_size, drop_remainder=True)
I'm creating a model like so:
image_inputs_A = layers.Input((512,512,1), batch_size=self.batch_size)
image_inputs_B = layers.Input((512,512,1), batch_size=self.batch_size)
output_A = some_layers(image_inputs_A)
output_B = some_layers(image_inputs_B)
model = models.Model([image_inputs_A, image_inputs_B],[output_A, output_B])
However, I'm getting the following error:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays: [<tf.Tensor 'IteratorGetNext:0' shape=(?, 2, ?, ?, ?) dtype=float32>]...
It seems that it's concatenating the inputs to (batch_size, 2, 512, 512, 1) instead of treating them as a tuple of two tensors, each of shape (batch_size, 512, 512, 1). Is this the expected behaviour? How can I use multiple inputs without them being concatenated?
EDIT:
I have tried using a layers.Input with shape (batch_size, 2, 512, 512, 1) and then passing it through two Lambda layers to split the tensor along the second axis; however, I get the following error:
ValueError: Error when checking input: expected input_1 to have 5 dimensions, but got array with shape (None, None, None, None)
EDIT 2:
I've double-checked the data I'm feeding into the model:
INPUT: (512, 512, 1) <dtype: 'float32'>
INPUT: (512, 512, 1) <dtype: 'float32'>
OUTPUT: (512, 512, 1) <dtype: 'int64'>
OUTPUT: (512, 512, 1) <dtype: 'int64'>
SOLVED: It turns out it was an issue with the data augmentation step, where the input tensors were being concatenated. Lesson learnt.
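For reference, a minimal sketch of a two-input pipeline where the zipped dataset keeps the inputs as a separate tuple; the toy data and the single Conv2D per branch are illustrative stand-ins, not the original model:

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

batch_size = 4

# Toy stand-ins for dataset_img_A / dataset_img_B and the two masks.
imgs_a = tf.data.Dataset.from_tensor_slices(np.zeros((8, 512, 512, 1), np.float32))
imgs_b = tf.data.Dataset.from_tensor_slices(np.zeros((8, 512, 512, 1), np.float32))
segs_a = tf.data.Dataset.from_tensor_slices(np.zeros((8, 512, 512, 1), np.float32))
segs_b = tf.data.Dataset.from_tensor_slices(np.zeros((8, 512, 512, 1), np.float32))

# Zipping keeps the two inputs as a tuple; any map() applied afterwards
# (e.g. augmentation) must also return a tuple, not a stacked tensor.
dataset = tf.data.Dataset.zip(((imgs_a, imgs_b), (segs_a, segs_b)))
dataset = dataset.batch(batch_size, drop_remainder=True)

inputs_a = layers.Input((512, 512, 1))
inputs_b = layers.Input((512, 512, 1))
out_a = layers.Conv2D(1, 3, padding='same')(inputs_a)
out_b = layers.Conv2D(1, 3, padding='same')(inputs_b)
model = models.Model([inputs_a, inputs_b], [out_a, out_b])
model.compile(optimizer='adam', loss='mse')
model.fit(dataset, steps_per_epoch=2)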
I want to reshape and resize an image in the first layers, before using Conv2D and other layers. The input will be a flattened array. Here is my code:
# Create a flat example image:
img_test = np.zeros((120,160))
img_test_flat = img_test.flatten()
reshape_model = Sequential()
reshape_model.add(tf.keras.layers.InputLayer(input_shape=(img_test_flat.shape)))
reshape_model.add(tf.keras.layers.Reshape((120, 160,1)))
reshape_model.add(tf.keras.layers.experimental.preprocessing.Resizing(28, 28, interpolation='nearest'))
result = reshape_model(img_test_flat)
result.shape
Unfortunately, this code results in the error shown below. What is the issue, and how do I correctly reshape and resize the flattened array?
WARNING:tensorflow:Model was constructed with shape (None, 19200) for input Tensor("input_13:0", shape=(None, 19200), dtype=float32), but it was called on an input with incompatible shape (19200,).
InvalidArgumentError: Input to reshape is a tensor with 19200 values, but the requested shape has 368640000 [Op:Reshape]
EDIT:
I tried:
reshape_model = Sequential()
reshape_model.add(tf.keras.layers.InputLayer(input_shape=(None, img_test_flat.shape[0])))
reshape_model.add(tf.keras.layers.Reshape((120, 160,1)))
reshape_model.add(tf.keras.layers.experimental.preprocessing.Resizing(28, 28, interpolation='nearest'))
Which gave me:
WARNING:tensorflow:Model was constructed with shape (None, None, 19200) for input Tensor("input_19:0", shape=(None, None, 19200), dtype=float32), but it was called on an input with incompatible shape (19200,).
EDIT2:
I receive the input in C++ as a 1D array and pass it with:
// Copy value to input buffer (tensor)
for (size_t i = 0; i < fb->len; i++) {
    model_input->data.i32[i] = (int32_t) (fb->buf[i]);
}

so what I pass to the model is a flat array.
Your use of shapes simply doesn't make sense here. The first dimension of your input should be the number of samples. Is it supposed to be 19,200 samples, or 1 sample?
input_shape should omit the number of samples, so if you want 1 sample with 19,200 features, input_shape should be (19200,). If you have 19,200 samples of 1 feature each, it should be (1,).
The Reshape layer also omits the number of samples, so Keras is confused. What exactly are you trying to do?
This seems to be roughly what you're trying to achieve, but I would personally resize the image outside of the neural network:
import numpy as np
import tensorflow as tf
img_test = np.zeros((120,160)).astype(np.float32)
img_test_flat = img_test.reshape(1, -1)
reshape_model = tf.keras.Sequential()
reshape_model.add(tf.keras.layers.InputLayer(input_shape=(img_test_flat.shape[1:])))
reshape_model.add(tf.keras.layers.Reshape((120, 160,1)))
reshape_model.add(tf.keras.layers.Lambda(lambda x: tf.image.resize(x, (28, 28))))
result = reshape_model(img_test_flat)
print(result.shape)
TensorShape([1, 28, 28, 1])
Feel free to use the Resizing layer instead of the Lambda layer; I can't use it due to my TensorFlow version.
I'm using Keras' built-in inception_resnet_v2 to train a CNN to recognize images. When training the model, I have a numpy array of data as input, with input shape (1000, 299, 299, 3):
model.fit(x=X, y=Y, batch_size=16, ...) # Output shape `Y` is (1000, 6), for 6 classes
At first, when trying to predict, I passed in a single image of shape (299, 299, 3), but got the error:
ValueError: Error when checking input: expected input_1 to have 4 dimensions, but got array with shape (299, 299, 3)
I reshaped my input with:
x = np.reshape(x, (1, 299, 299, 3))
Now, when I predict,
y = model.predict(x, batch_size=1, verbose=0)
I don't get an error.
I want to make sure I understand batch_size correctly in both training and predicting. My assumptions are:
1) With model.fit, Keras takes batch_size elements from the input array (in this case, it works through my 1000 examples 16 samples at a time)
2) With model.predict, I should reshape my input to be a single 3D array, and I should explicitly set batch_size to 1.
Are these correct assumptions?
Also, would it be better (or even possible) to provide training data to the model so that this sort of reshape before prediction isn't necessary? Thank you for helping me learn this.
No, you got the idea wrong. batch_size specifies how many examples are "forwarded" through the network at once (usually on a GPU).
By default, this value is set to 32 inside model.predict method, but you may specify otherwise (as you did with batch_size=1). Because of this default value you got an error:
ValueError: Error when checking input: expected input_1 to have 4
dimensions, but got array with shape (299, 299, 3)
You should not reshape your input this way; rather, you should provide it with the correct batch size.
Say, for the default case you would pass an array of shape (32, 299, 299, 3); analogously for other values of batch_size, e.g. with batch_size=64 this function requires you to pass an input of shape (64, 299, 299, 3).
EDIT:
It seems you need to reshape your single sample into a batch. I would advise you to use np.expand_dims for improved readability and portability of your code, like this:
y = model.predict(np.expand_dims(x, axis=0), batch_size=1)
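For example, with a single image (model is the trained classifier from the question; the zeros array stands in for a real image):

x = np.zeros((299, 299, 3), dtype=np.float32)  # a single image
batch = np.expand_dims(x, axis=0)              # shape (1, 299, 299, 3)
y = model.predict(batch)                       # y has shape (1, 6)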
I loaded a Keras model and used the output of one of its middle layers as follows:
pretrain_model = load_model(path)
model = Model(inputs=pretrain_model.input, outputs=pretrain_model.layers[-2].output)
When I run:
>>> model.output
<tf.Tensor 'pretrain_variable/dropout_2/cond/Merge:0' shape=(?, ?) dtype=float32>
>>> model.output_shape
(None, 2000)
This influences my next step:
input = tf.placeholder(tf.float32, shape=(None, X_all.shape[1], X_all.shape[2]), name='X')
pretrain_output = pretrain_model(input)  # pretrain_output shape should be (None, 2000); however, it is (None, None)
output_y = Dense(units=y.shape[1])(pretrain_output)  # this works
output_y = tf.keras.layers.Dense(units=y.shape[1])(pretrain_output)  # won't work, because pretrain_output shape is (None, None)
I cannot use tf.keras.layers.Dense directly because the last dimension of pretrain_output is None.
Can anyone teach me how to get the middle layer output with correct shape?
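A minimal sketch of one possible workaround, assuming the lost static shape is the only problem: re-assert it with set_shape before applying the Dense layer (the (None, 2000) comes from model.output_shape above):

input = tf.placeholder(tf.float32, shape=(None, X_all.shape[1], X_all.shape[2]), name='X')
pretrain_output = pretrain_model(input)
# Re-assert the static shape that the dropout cond/Merge discarded.
pretrain_output.set_shape([None, 2000])
output_y = tf.keras.layers.Dense(units=y.shape[1])(pretrain_output)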
I'd like to do transfer learning from a pre-trained model. I'm following the TensorFlow guide for retraining.
However, I'm stuck on the error tensorflow.python.framework.errors_impl.InvalidArgumentError: Shapes must be equal rank, but are 3 and 2 for 'input_1/BottleneckInputPlaceholder' (op: 'PlaceholderWithDefault') with input shapes: [1,?,128].
# Last layer of pre-trained model
# `[<tf.Tensor 'embeddings:0' shape=(?, 128) dtype=float32>]`
with tf.name_scope('input'):
    bottleneck_input = tf.placeholder_with_default(
        bottleneck_tensor,
        shape=[None, 128],
        name='BottleneckInputPlaceholder')
Any ideas?
This is happening because your bottleneck_tensor has shape [1, ?, 128], while you are explicitly stating that the shape should be [?, 128]. You can use tf.squeeze to convert your tensor to the required shape:
tf.squeeze(bottleneck_tensor)
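In context, a minimal sketch (passing axis=0 explicitly is a precaution so only the leading size-1 dimension is removed):

# bottleneck_tensor has shape [1, ?, 128]; drop the leading size-1 axis
# so it matches the [None, 128] placeholder shape.
squeezed = tf.squeeze(bottleneck_tensor, axis=0)

with tf.name_scope('input'):
    bottleneck_input = tf.placeholder_with_default(
        squeezed,
        shape=[None, 128],
        name='BottleneckInputPlaceholder')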