Keras dimensionality in convolutional layer mismatch - python

I'm trying to play around with Keras to build my first neural network. I have zero experience and I can't seem to figure out why my dimensionality isn't right. I can't figure out from the docs what this error is complaining about, or even which layer is causing it.
My model takes in a 32-byte array of numbers and is supposed to give a boolean value on the other side. I want a 1D convolution on the input byte array.
arr1 is the 32-byte array, arr2 is an array of booleans.
inputData = np.array(arr1)
inputData = np.expand_dims(inputData, axis = 2)
labelData = np.array(arr2)
print inputData.shape
print labelData.shape
model = k.models.Sequential()
model.add(k.layers.convolutional.Convolution1D(32,2, input_shape = (32, 1)))
model.add(k.layers.Activation('relu'))
model.add(k.layers.convolutional.Convolution1D(32,2))
model.add(k.layers.Activation('relu'))
model.add(k.layers.convolutional.Convolution1D(32,2))
model.add(k.layers.Activation('relu'))
model.add(k.layers.convolutional.Convolution1D(32,2))
model.add(k.layers.Activation('relu'))
model.add(k.layers.core.Dense(32))
model.add(k.layers.Activation('sigmoid'))
model.compile(loss = 'binary_crossentropy',
optimizer = 'rmsprop',
metrics=['accuracy'])
model.fit(
inputData,labelData
)
The printed shapes are
(1000, 32, 1) and (1000,)
The error I receive is:
Traceback (most recent call last):
  File "cnn/init.py", line 50, in
    inputData,labelData
  File "/home/steve/Documents/cnn/env/local/lib/python2.7/site-packages/keras/models.py", line 863, in fit
    initial_epoch=initial_epoch)
  File "/home/steve/Documents/cnn/env/local/lib/python2.7/site-packages/keras/engine/training.py", line 1358, in fit
    batch_size=batch_size)
  File "/home/steve/Documents/cnn/env/local/lib/python2.7/site-packages/keras/engine/training.py", line 1238, in _standardize_user_data
    exception_prefix='target')
  File "/home/steve/Documents/cnn/env/local/lib/python2.7/site-packages/keras/engine/training.py", line 128, in _standardize_input_data
    str(array.shape))
ValueError: Error when checking target: expected activation_5 to have 3 dimensions, but got array with shape (1000, 1)

Well, it seems to me that you need to google a bit more about convolutional networks :-)
At each step you are applying 32 filters of length 2 over your sequence. So if we follow the dimensions of the tensors after each layer:
Dimensions : (None, 32, 1)
model.add(k.layers.convolutional.Convolution1D(32,2, input_shape = (32, 1)))
model.add(k.layers.Activation('relu'))
Dimensions : (None, 31, 32)
(your filter of length 2 goes over the whole sequence so the sequence is now of length 31)
model.add(k.layers.convolutional.Convolution1D(32,2))
model.add(k.layers.Activation('relu'))
Dimensions : (None, 30, 32)
(you lose again one value because of your filters of length 2, but you still have 32 of them)
model.add(k.layers.convolutional.Convolution1D(32,2))
model.add(k.layers.Activation('relu'))
Dimensions : (None, 29, 32)
(same...)
model.add(k.layers.convolutional.Convolution1D(32,2))
model.add(k.layers.Activation('relu'))
Dimensions : (None, 28, 32)
Now you want to use a Dense layer on top of that... the thing is that the Dense layer works as follows on your 3D input:
model.add(k.layers.core.Dense(32))
model.add(k.layers.Activation('sigmoid'))
Dimensions : (None, 28, 32)
This is your output. First thing that I find weird is that you want 32 outputs out of your dense layer... You should have put 1 instead of 32. But even this will not fix your problem. See what happens if we change the last layer :
model.add(k.layers.core.Dense(1))
model.add(k.layers.Activation('sigmoid'))
Dimensions : (None, 28, 1)
This happens because you apply a Dense layer to a tensor that is 2D per sample. When Dense(1) receives an input of shape [28, 32], it creates a weight matrix of shape (32, 1) and applies it to each of the 28 vectors, so you end up with 28 outputs of size 1.
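To see why Dense(1) leaves the 28 time steps in place, here is a minimal NumPy sketch of what a dense layer does to such an input. NumPy stands in for Keras here, and the batch size of 4, the random weights `W` and `W_flat`, and the bias `b` are placeholders chosen for illustration, not anything Keras would learn.

```python
import numpy as np

rng = np.random.default_rng(0)

# A batch of 4 samples shaped like the output of the last convolution: (batch, 28, 32).
features = rng.normal(size=(4, 28, 32))

# Dense(1) owns one weight matrix of shape (32, 1) and one bias of shape (1,).
W = rng.normal(size=(32, 1))
b = np.zeros(1)

# The same weights are applied to each of the 28 vectors of length 32.
dense_out = features @ W + b
print(dense_out.shape)  # (4, 28, 1)

# Flattening first hands the dense layer one long vector per sample instead.
flat = features.reshape(features.shape[0], -1)
W_flat = rng.normal(size=(flat.shape[1], 1))
flat_out = flat @ W_flat + b
print(flat_out.shape)  # (4, 1)
```

The time axis only disappears when something (Flatten, pooling, etc.) removes it before the dense layer.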
What I propose to fix this is to change the last 2 layers like this :
model = k.models.Sequential()
model.add(k.layers.convolutional.Convolution1D(32,2, input_shape = (32, 1)))
model.add(k.layers.Activation('relu'))
model.add(k.layers.convolutional.Convolution1D(32,2))
model.add(k.layers.Activation('relu'))
model.add(k.layers.convolutional.Convolution1D(32,2))
model.add(k.layers.Activation('relu'))
# Only use one filter so that the output will be a sequence of 28 values, not a matrix.
model.add(k.layers.convolutional.Convolution1D(1,2))
model.add(k.layers.Activation('relu'))
# Change the shape from (None, 28, 1) to (None, 28)
model.add(k.layers.core.Flatten())
# Only one neuron as output to get the binary target.
model.add(k.layers.core.Dense(1))
model.add(k.layers.Activation('sigmoid'))
Now the last three layers (the single-filter convolution, Flatten and Dense) will take your tensor from
(None, 29, 32) -> (None, 28, 1) -> (None, 28) -> (None, 1)
I hope this helps you.
PS: if you were wondering what None is, it's the batch dimension. You don't feed the 1000 samples at once, you feed them batch by batch, and since the value depends on the batch size chosen, by convention we put None.
EDIT :
Explaining a bit more why the sequences length loses one value at each step.
Say you have a sequence of 4 values [x1 x2 x3 x4], and you want to use your filter of length 2 [f1 f2] to convolve over the sequence. The first value will be given by y1 = [f1 f2] * [x1 x2], the second will be y2 = [f1 f2] * [x2 x3], the third will be y3 = [f1 f2] * [x3 x4]. Then you have reached the end of your sequence and cannot go further. The result is a sequence [y1 y2 y3].
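The sliding described above can be written out in a few lines of plain Python. This is only a sketch of the arithmetic for a single filter and a single input channel; the helper name `conv1d_valid` and the example numbers are made up for illustration.

```python
def conv1d_valid(sequence, kernel):
    """Slide `kernel` over `sequence` without padding and return the dot products."""
    k = len(kernel)
    return [
        sum(f * x for f, x in zip(kernel, sequence[i:i + k]))
        for i in range(len(sequence) - k + 1)
    ]

x = [1.0, 2.0, 3.0, 4.0]  # [x1 x2 x3 x4]
f = [10.0, 1.0]           # [f1 f2]

y = conv1d_valid(x, f)
print(y)       # [12.0, 23.0, 34.0]
print(len(y))  # 3
```

A sequence of length 4 and a filter of length 2 leave 4 - 2 + 1 = 3 positions, which is why each layer in the model shortens the sequence by one.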
This is due to the filter length and the effects at the borders of your sequence. There are multiple options; one is to pad the sequence with 0's in order to get an output of exactly the same length. You can choose that option with the parameter 'padding'. You can read more about this here and find the different possible values for the padding argument here. I encourage you to read this last link, it gives information about input and output shapes...
From the doc :
padding: One of "valid" or "same" (case-insensitive). "valid" means "no padding". "same" results in padding the input such that the output has the same length as the original input.
The default is 'valid', so no padding is applied in your example.
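As a rough check of how the two padding modes affect the sequence length, the output length can be computed by hand. The helper below is a hypothetical sketch (not a Keras function) that assumes no dilation.

```python
import math

def conv1d_output_length(input_length, kernel_size, padding="valid", stride=1):
    """Length of the time axis after a 1D convolution (no dilation assumed)."""
    padding = padding.lower()
    if padding == "valid":
        return (input_length - kernel_size) // stride + 1
    if padding == "same":
        return math.ceil(input_length / stride)
    raise ValueError("padding must be 'valid' or 'same'")

# Four stacked convolutions with kernel size 2 and the default 'valid' padding.
length = 32
lengths = [length]
for _ in range(4):
    length = conv1d_output_length(length, kernel_size=2)
    lengths.append(length)
print(lengths)  # [32, 31, 30, 29, 28]

# With 'same' padding the length is preserved.
print(conv1d_output_length(32, kernel_size=2, padding="same"))  # 32
```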
I also recommend upgrading Keras to the latest version. Convolution1D is now Conv1D, so you might otherwise find the docs and tutorials confusing.

Related

TensorFlow keeping shape the same when slicing?

I am trying to take a single element out of one dimension while keeping the number of dimensions the same.
The shape of the tensor is: (BATCH_SIZE, N_STEPS, NUM_FEATURES)
I want to create a new tensor that is (BATCH_SIZE, 1, NUM_FEATURES), where 1 is the final step.
The input tensor shape is (None, 128,16)
I tried to create a new tensor with the following:
X = X[:,-1,:]
X's shape becomes (None, 16) , but I need this to be (None, 1,16)
Update: I got this to work with the following code:
s = tf.shape(X)
X = tf.reshape(X[:,-1,:],shape=[s[0],1,s[2]])
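A shorter alternative is to slice with a range instead of a single index, which keeps the axis. The sketch below uses NumPy as a stand-in, on the assumption that TensorFlow tensors follow the same basic slicing rules; the batch size of 4 is arbitrary. In TensorFlow, `X[:, -1:, :]` or `tf.expand_dims(X[:, -1, :], axis=1)` should give the same shape.

```python
import numpy as np

X = np.zeros((4, 128, 16))  # (BATCH_SIZE, N_STEPS, NUM_FEATURES)

# An integer index removes the axis.
dropped = X[:, -1, :]
print(dropped.shape)  # (4, 16)

# A slice of length 1 keeps the axis.
kept = X[:, -1:, :]
print(kept.shape)  # (4, 1, 16)

# Re-inserting the axis after indexing gives the same result.
expanded = np.expand_dims(X[:, -1, :], axis=1)
print(expanded.shape)  # (4, 1, 16)
```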

errors with tensorflow reshape and resize layer

I want to reshape and resize an image in the first layers before using Conv2D and other layers. The input will be a flattened array. Here is my code:
#Create flat example image:
img_test = np.zeros((120,160))
img_test_flat = img_test.flatten()
reshape_model = Sequential()
reshape_model.add(tf.keras.layers.InputLayer(input_shape=(img_test_flat.shape)))
reshape_model.add(tf.keras.layers.Reshape((120, 160,1)))
reshape_model.add(tf.keras.layers.experimental.preprocessing.Resizing(28, 28, interpolation='nearest'))
result = reshape_model(img_test_flat)
result.shape
Unfortunately this code results in the error below. What is the issue, and how do I correctly reshape and resize the flattened array?
WARNING:tensorflow:Model was constructed with shape (None, 19200) for input Tensor("input_13:0", shape=(None, 19200), dtype=float32), but it was called on an input with incompatible shape (19200,).
InvalidArgumentError: Input to reshape is a tensor with 19200 values, but the requested shape has 368640000 [Op:Reshape]
EDIT:
I tried:
reshape_model = Sequential()
reshape_model.add(tf.keras.layers.InputLayer(input_shape=(None, img_test_flat.shape[0])))
reshape_model.add(tf.keras.layers.Reshape((120, 160,1)))
reshape_model.add(tf.keras.layers.experimental.preprocessing.Resizing(28, 28, interpolation='nearest'))
Which gave me:
WARNING:tensorflow:Model was constructed with shape (None, None, 19200) for input Tensor("input_19:0", shape=(None, None, 19200), dtype=float32), but it was called on an input with incompatible shape (19200,).
EDIT2:
I receive the input in C++ from a 1D array and pass it with
// Copy value to input buffer (tensor)
for (size_t i = 0; i < fb->len; i++){
model_input->data.i32[i] = (int32_t) (fb->buf[i]);
}
so what I pass on to the model is a flat array.
Your use of shapes simply doesn't make sense here. The first dimension of your input should be the number of samples. Is it supposed to be 19,200 samples, or 1 sample?
input_shape should omit the number of samples, so if you want 1 sample with 19,200 values, input_shape should be (19200,). If you have 19,200 samples of one value each, it should be (1,).
The reshaping layer also omits the number of samples, so Keras is confused. What exactly are you trying to do?
This seems to be roughly what you're trying to achieve but I would personally resize the image outside of the neural network:
import numpy as np
import tensorflow as tf
img_test = np.zeros((120,160)).astype(np.float32)
img_test_flat = img_test.reshape(1, -1)
reshape_model = tf.keras.Sequential()
reshape_model.add(tf.keras.layers.InputLayer(input_shape=(img_test_flat.shape[1:])))
reshape_model.add(tf.keras.layers.Reshape((120, 160,1)))
reshape_model.add(tf.keras.layers.Lambda(lambda x: tf.image.resize(x, (28, 28))))
result = reshape_model(img_test_flat)
print(result.shape)
TensorShape([1, 28, 28, 1])
Feel free to use the Resizing layer instead of the Lambda layer, I can't use it due to my Tensorflow version.
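The role of the missing batch axis can be illustrated with NumPy alone. This is a sketch of the shape bookkeeping, not of what Keras does internally. Reading the first axis of the flat array as the batch axis is my interpretation of the error, but the arithmetic does reproduce the number in the message.

```python
import numpy as np

flat = np.zeros(120 * 160, dtype=np.float32)
print(flat.shape)  # (19200,)

# If the only axis is read as the batch axis, Reshape((120, 160, 1)) asks for
# 19200 samples of 120 * 160 * 1 values each.
requested = flat.shape[0] * 120 * 160 * 1
print(requested)  # 368640000

# Adding an explicit batch axis of size 1 makes the reshape well defined.
batch = flat[np.newaxis, :]
print(batch.shape)  # (1, 19200)

image = batch.reshape(-1, 120, 160, 1)
print(image.shape)  # (1, 120, 160, 1)
```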

Python, Keras - ValueError: Cannot feed value of shape (10, 70, 1025) for Tensor u'dense_2_target:0', which has shape '(?, ?)'

I am trying to train a RNN by batches.
The input size is
(10, 70, 3075),
where 10 is the batch size, 70 the time dimension, and 3075 the frequency dimension.
There are three outputs whose size is
(10, 70, 1025)
each, basically 10 spectrograms with size (70,1025).
I would like to train this RNN by regression, whose structure is
input_img = Input(shape=(70,3075 ) )
x = Bidirectional(LSTM(n_hid,return_sequences=True, dropout=0.5, recurrent_dropout=0.2))(input_img)
x = Dropout(0.2)(x)
x = Bidirectional(LSTM(n_hid, dropout=0.5, recurrent_dropout=0.2))(x)
x = Dropout(0.2)(x)
o0 = ( Dense(1025, activation='sigmoid'))(x)
o1 = ( Dense(1025, activation='sigmoid'))(x)
o2 = ( Dense(1025, activation='sigmoid'))(x)
The problem is that output dense layers cannot take into account three dimensions, they want something like (None, 1025), which I don't know how to provide, unless I concatenate along the time dimension.
The following error occurs:
ValueError: Cannot feed value of shape (10, 70, 1025) for Tensor u'dense_2_target:0', which has shape '(?, ?)'
Would the batch_shape option be useful in the input layer? I have actually tried it, but I got the same error.
In this instance the second RNN is collapsing the sequence to a single vector because by default return_sequences=False. To make the model return sequences and run the Dense layer over each timestep separately just add return_sequences=True to the second RNN as well:
x = Bidirectional(LSTM(n_hid, return_sequences=True, dropout=0.5, recurrent_dropout=0.2))(x)
The Dense layers automatically apply to the last dimension so no need to reshape afterwards.
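The effect on the shapes can be sketched with NumPy. Everything here is a stand-in: `n_hid = 8` is an arbitrary choice, the random arrays replace the real RNN output and Dense kernel, and taking the last time step is only a simplification of what a Bidirectional layer returns when return_sequences=False (it combines the final forward and backward states).

```python
import numpy as np

rng = np.random.default_rng(0)
batch, timesteps, n_hid, n_out = 10, 70, 8, 1025  # n_hid is hypothetical

# Bidirectional RNN output with return_sequences=True: forward and backward
# states are concatenated, so there are 2 * n_hid values per time step.
h_seq = rng.normal(size=(batch, timesteps, 2 * n_hid))

# Stand-in for return_sequences=False: only one vector per sample survives.
h_last = h_seq[:, -1, :]

# Placeholder for the Dense(1025) kernel.
W = rng.normal(size=(2 * n_hid, n_out))

out_last = h_last @ W
out_seq = h_seq @ W
print(out_last.shape)  # (10, 1025)
print(out_seq.shape)   # (10, 70, 1025)
```

Only the second shape matches targets of shape (10, 70, 1025).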
Alternatively, to get the right output shape without changing the second RNN, you can use the Reshape layer:
o0 = Dense(70 * 1025, activation='sigmoid')(x)
o0 = Reshape((70, 1025))(o0)
This will output (batch_dim, 70, 1025). You can do exactly the same for the other two outputs.

Shape of 1D convolution output on a 2D data using keras

I am trying to implement a 1D convolution on a time series classification problem using keras. I am having some trouble interpreting the output size of the 1D convolutional layer.
I have my data composed of the time series of different features over a time interval of 128 units and I apply a 1D convolutional layer:
x = Input((n_timesteps, n_features))
cnn1_1 = Conv1D(filters = 100, kernel_size= 10, activation='relu')(x)
which after compilation I obtain the following shapes of the outputs:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_26 (InputLayer) (None, 128, 9) 0
_________________________________________________________________
conv1d_28 (Conv1D) (None, 119, 100) 9100
I was assuming that with 1D convolution, the data is only convolved across the time axis (axis 1) and the size of my output would be:
119, 100*9. But I guess that the network is performing some kind of operation across the feature dimension (axis 2), and I don't know which operation it is performing.
I am saying this because what I interpret as 1d convolution is that the features shapes must be preserved because I am only convolving the time domain: If I have 9 features, then for each filter I have 9 convolutional kernels, each of these applied to a different features and convoluted across the time axis. This should return 9 convoluted features for each filter resulting in an output shape of 119, 9*100.
However the output shape is 119, 100.
Clearly something else is happening and I can't understand it or get it.
Where does my reasoning fail? How is the 1D convolution performed?
I add one more comment which is my comment on one of the answers provided:
I understand the reduction from 128 to 119, but what I don't understand is why the feature dimension changes. For example, if I use
Conv1D(filters = 1, kernel_size= 10, activation='relu')
, then the output dimension is going to be (None, 119, 1), giving rise to only one feature after the convolution. What is going on in this dimension, which operation is performed to go from 9 --> 1?
Conv1D needs a 3D tensor as its input, with shape (batch_size, time_step, feature). Based on your code, the number of filters is 100, which means each window is mapped from 9 feature dimensions to 100 output dimensions. How does this happen? Dot product.
In above, X_i is the concatenation of k words (k = kernel_size), l is number of filters (l=filters), d is the dimension of input word vector, and p_i is output vector for each window of k words.
What happens in your code?
[kernel_size * 9] dot [kernel_size * 9] => [1] => repeat l times => [1 * 100]
do the above for every window in the sequence => [119 * 100] (or [128 * 100] with padding='same')
Another thing that happens here is that you did not specify the padding type. According to the docs, Conv1D uses 'valid' padding by default, which caused your dimension to reduce from 128 to 119. If you need the dimension to be the same as the input, you can choose the 'same' option:
Conv1D(filters = 100, kernel_size= 10, activation='relu', padding='same')
It sums over the last axis, which is the feature axis. You can easily check this by doing the following:
import tensorflow as tf
input_shape = (1, 128, 9)
x = tf.random.normal(input_shape)  # assumed input; any tensor of this shape works
# initialize kernel with ones, and use linear activations
y = tf.keras.layers.Conv1D(1, 3, activation="linear", input_shape=input_shape[1:], kernel_initializer="ones")(x)
x_sum = tf.reduce_sum(x, axis=-1)
Here x_sum is x summed along the feature axis. If you print y and x_sum, you can see that the sum of the first 3 values of x_sum is the first value of the convolution output. I used a kernel size of 3 to make this verification easier.
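The same check can be done without TensorFlow. The following NumPy sketch computes one filter of a 'valid' 1D convolution by hand, with an all-ones kernel and no bias, assuming random input data, and also reproduces the parameter count from the model summary in the question.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(128, 9))  # one sample: (time steps, features)
kernel = np.ones((3, 9))       # one filter: (kernel_size, features)

# Each output value multiplies a window of 3 time steps by the kernel and
# sums over BOTH the window and the 9 features, giving a single number.
y = np.array([np.sum(x[t:t + 3] * kernel) for t in range(x.shape[0] - 3 + 1)])
print(y.shape)  # (126,)

# With a kernel of ones, that equals the sum of the first 3 per-step feature sums.
feature_sum = x.sum(axis=1)
print(np.isclose(y[0], feature_sum[:3].sum()))  # True

# Parameters of Conv1D(filters=100, kernel_size=10) on 9 features:
# one (10, 9) kernel per filter plus one bias per filter.
params = 10 * 9 * 100 + 100
print(params)  # 9100
```

Because every filter spans all input features, the feature axis of the output has one entry per filter, not one per (filter, feature) pair.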

Keras SimpleRNN confusion

...coming from TensorFlow, where pretty much every shape is defined explicitly, I am confused about Keras' API for recurrent models. Getting an Elman network to work in TF was pretty easy, but Keras refuses to accept the correct shapes...
For example:
x = k.layers.Input(shape=(2,))
y = k.layers.Dense(10)(x)
m = k.models.Model(x, y)
...works perfectly and according to model.summary() I get an input layer with shape (None, 2), followed by a dense layer with output shape (None, 10). Makes sense since Keras automatically adds the first dimension for batch processing.
However, the following code:
x = k.layers.Input(shape=(2,))
y = k.layers.SimpleRNN(10)(x)
m = k.models.Model(x, y)
raises an exception ValueError: Input 0 is incompatible with layer simple_rnn_1: expected ndim=3, found ndim=2.
It works only if I add another dimension:
x = k.layers.Input(shape=(2,1))
y = k.layers.SimpleRNN(10)(x)
m = k.models.Model(x, y)
...but now, of course, my input would not be (None, 2) anymore.
model.summary():
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 2, 1) 0
_________________________________________________________________
simple_rnn_1 (SimpleRNN) (None, 10) 120
=================================================================
How can I have an input of type batch_size x 2 when I just want to feed vectors with 2 values to the network?
Furthermore, how would I chain RNN cells?
x = k.layers.Input(shape=(2, 1))
h = k.layers.SimpleRNN(10)(x)
y = k.layers.SimpleRNN(10)(h)
m = k.models.Model(x, y)
...raises the same exception with incompatible dim sizes.
This sample here works:
x = k.layers.Input(shape=(2, 1))
h = k.layers.SimpleRNN(10, return_sequences=True)(x)
y = k.layers.SimpleRNN(10)(h)
m = k.models.Model(x, y)
...but then layer h does not output (None, 10) anymore, but (None, 2, 10) since it returns the whole sequence instead of just the "regular" RNN cell output.
Why is this needed at all?
Moreover: where are the states? Do they just default to 1 recurrent state?
The documentation touches on the expected shapes of recurrent components in Keras, let's look at your case:
Any RNN layer in Keras expects a 3D shape (batch_size, timesteps, features). This means you have timeseries data.
The RNN layer then iterates over the second, time dimension of the input using a recurrent cell, the actual recurrent computation.
If you specify return_sequences then you collect the output for every timestep getting another 3D tensor (batch_size, timesteps, units) otherwise you only get the last output which is (batch_size, units).
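The iteration over the time dimension can be sketched in NumPy. This is a simplified Elman-style forward pass written for illustration, not Keras' implementation. The function `simple_rnn`, the random weights and the batch size of 5 are all assumptions.

```python
import numpy as np

def simple_rnn(inputs, W_x, W_h, b, return_sequences=False):
    """Forward pass with h_t = tanh(x_t @ W_x + h_{t-1} @ W_h + b)."""
    batch_size, timesteps, _ = inputs.shape
    units = W_h.shape[0]
    h = np.zeros((batch_size, units))  # the state starts at zero
    outputs = []
    for t in range(timesteps):         # iterate over the time axis
        h = np.tanh(inputs[:, t, :] @ W_x + h @ W_h + b)
        outputs.append(h)
    return np.stack(outputs, axis=1) if return_sequences else h

rng = np.random.default_rng(0)
features, units = 1, 10
W_x = rng.normal(size=(features, units))  # input-to-hidden weights
W_h = rng.normal(size=(units, units))     # hidden-to-hidden weights
b = np.zeros(units)

x = rng.normal(size=(5, 2, features))     # (batch_size, timesteps, features)
last = simple_rnn(x, W_x, W_h, b)
seq = simple_rnn(x, W_x, W_h, b, return_sequences=True)
print(last.shape)  # (5, 10)
print(seq.shape)   # (5, 2, 10)

# Same count as the 120 parameters reported for SimpleRNN(10) in the summary.
print(W_x.size + W_h.size + b.size)  # 120
```

The last time step of the full sequence is exactly the output returned when return_sequences is off.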
Now returning to your questions:
You mention vectors, but shape=(2,) describes a single vector rather than a sequence of vectors, so this doesn't work. shape=(2,1) works because now you have 2 vectors of size 1; these shapes exclude batch_size. So to feed vectors of size 2 you need shape=(how_many_vectors, 2), where the first dimension is the number of vectors you want your RNN to process, i.e. the timesteps in this case.
To chain RNN layers you need to feed 3D data because that's what RNNs expect. When you specify return_sequences, the RNN layer returns its output at every timestep, so that can be chained to another RNN layer.
States are the collection of vectors that an RNN cell carries between timesteps: LSTM uses 2, while SimpleRNN and GRU have 1 hidden state, which is also the output. They default to 0s but can be specified when calling the layer using initial_state=[...] as a list of tensors.
There is already a post about the difference between RNN layers and RNN cells in Keras which might help clarify the situation further.
