Input shape of arrays for Keras on TensorFlow - python

I have a question about the 4D input tensor for Keras Convolution2D layers.
The Keras doc says:
4D tensor with shape: (samples, channels, rows, cols) if dim_ordering='th' or 4D tensor with shape: (samples, rows, cols, channels) if dim_ordering='tf'.
I use 'tf', so what shape should my input have? When I use (samples, channels, rows, cols), it works, but when I use (samples, rows, cols, channels) as input, I run into problems.

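With dim_ordering='tf', Keras expects (samples, rows, cols, channels). If only (samples, channels, rows, cols) works, the layer is most likely still running with dim_ordering='th'. The dimension ordering is a separate setting from the backend (in Keras 1.x it is read from image_dim_ordering in ~/.keras/keras.json), so running on TensorFlow does not by itself switch it to 'tf'.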
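The two layouts from the question differ only in where the channel axis sits, so an array stored in one layout can be converted to the other with a transpose. A minimal NumPy sketch, assuming a made-up batch of 8 samples with 3 channels and 32x24 images (these sizes are not from the question):

```python
import numpy as np

# Hypothetical batch in channels-first ('th') layout: (samples, channels, rows, cols).
x_th = np.zeros((8, 3, 32, 24), dtype="float32")

# Move the channel axis to the end: channels-last ('tf') layout (samples, rows, cols, channels).
x_tf = np.transpose(x_th, (0, 2, 3, 1))

# And back again: channels-last -> channels-first.
x_back = np.transpose(x_tf, (0, 3, 1, 2))

print(x_th.shape, x_tf.shape, x_back.shape)
```

Whichever layout the data is in, it has to match the ordering the layer is actually configured with.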

Related

Processing 4D input with 2D Convolution in Tensorflow

I am having trouble understanding how 2D Conv calculations are done on 4D inputs. The situation is this: I have an image with height, width, channels = 128, 128, 103. I want each of these 103 channels to be processed individually, as if I were inputting them to the network one by one. Would the following line work?
import tensorflow.keras
from tensorflow.keras.layers import Conv2D
model1 = tensorflow.keras.models.Sequential()
model1.add(Conv2D(1, kernel_size=(3,3), input_shape = (128, 128,103,1), padding='same'))
I want to avoid splitting the image and inputting it into the network as 103 batches of (128,128,1).
As explained in the documentation: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2D?version=nightly
4+D tensor with shape: batch_shape + (channels, rows, cols) if data_format='channels_first' or
4+D tensor with shape: batch_shape + (rows, cols, channels) if data_format='channels_last'.
(by default: data_format='channels_last'.)
With input_shape = (128, 128,103,1) you are passing a 5D tensor of shape (batch_shape, 128, 128, 103, 1).
I suggest you reshape your tensor so that it has the shape (None, 128, 128, 103), i.e. drop the trailing axis of size 1 so that the 103 bands become the channels axis.
Also please change input_shape = (128, 128,103,1) to input_shape = (128, 128,103)
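The reshape can be sketched as follows, assuming a made-up batch of 2 images. One caveat worth checking against the goal in the question: a regular Conv2D sums over all input channels, so the 103 bands are mixed. If each band really should be filtered on its own, tf.keras.layers.DepthwiseConv2D is one possible option; it is shown here only as an illustration, not as something the answer above proposes.

```python
import numpy as np
import tensorflow as tf

# Hypothetical batch of 2 images in the 5D layout from the question.
x5 = np.zeros((2, 128, 128, 103, 1), dtype="float32")

# Drop the trailing singleton axis: (2, 128, 128, 103, 1) -> (2, 128, 128, 103).
x4 = np.squeeze(x5, axis=-1)

# A regular Conv2D treats the 103 bands as input channels and mixes them.
conv = tf.keras.layers.Conv2D(1, kernel_size=(3, 3), padding="same")
y_mixed = conv(x4)

# DepthwiseConv2D applies a separate 3x3 filter to each channel, without mixing.
depthwise = tf.keras.layers.DepthwiseConv2D(kernel_size=(3, 3), padding="same")
y_separate = depthwise(x4)

print(y_mixed.shape, y_separate.shape)
```

The first output has a single channel, while the depthwise output keeps one feature map per input band.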

How to understand dimension of Keras Conv2D layer weights

When I tried to experiment with CNN pruning, I got stuck right at the beginning because I couldn't explain the weight dimensions to myself.
The CNN has the following structure (exported from model.layers):
conv2d (64 filters with filter dimension 5x5)
max_pooling2d
dropout
conv2d (128 filters with filter dimension 5x5)
max_pooling2d
flatten
dense (128 units)
dense (39 classes)
The corresponding weights have the following dimensions (from .get_weights()):
conv2d: shape(5,5,1,64)
max_pooling2d: shape(64,)
dropout: shape(5,5,64,128)
conv2d: shape(128,)
max_pooling2d: shape(6272,128)
flatten: shape(128,)
dense: shape(128,39)
dense: shape(39,)
Please have a look at the Conv2D layers and their parameters and dimensions. The first Conv2D layer (conv2d: shape(5,5,1,64)) seems to have an explainable number of weights: 5 x 5 (filter size), 1 input channel and 64 filters.
What is unclear to me is why the second Conv2D layer (conv2d: shape(128,)) only has 128 entries in its weights array. The dropout layer before it (dropout: shape(5,5,64,128)) seems to have the weight dimensions I would expect the Conv2D layer to have.
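One likely explanation is that model.get_weights() returns a flat list containing only the arrays of layers that have weights (a kernel and a bias for each Conv2D and Dense layer), while pooling, dropout and flatten layers contribute nothing. Lining that list up one-to-one with the layer names would then shift the labels. The sketch below rebuilds a model of the same structure to illustrate this; the input shape (40, 40, 1) and the dropout rate are assumptions, chosen only because this input size reproduces the 6272 flattened features.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Input shape (40, 40, 1) is an assumption; it is not given in the question.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(40, 40, 1)),
    layers.Conv2D(64, (5, 5)),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),  # rate is a placeholder
    layers.Conv2D(128, (5, 5)),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128),
    layers.Dense(39),
])

# Pair every layer with its own weights instead of reading the flat list.
shapes_by_layer = [
    (layer.name, [w.shape for w in layer.get_weights()]) for layer in model.layers
]
for name, shapes in shapes_by_layer:
    print(name, shapes)

# The flat list skips the layers without weights.
flat_shapes = [w.shape for w in model.get_weights()]
print(flat_shapes)
```

Read this way, (5,5,64,128) is the kernel of the second Conv2D layer and (128,) is its bias.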

How to reshape the input to put in Conv2d layers?

I am trying to use Conv2d to train my model, and I have a problem with the input of the Conv2d layers. How can I reshape it so that it can be fed into Conv2d?
I am building a model to classify voice accents using a CNN, and I am using Conv2d for this problem.
The shape of:
X_train: (78952, 26) (26 features)
X_test : (2574, 26)
I reshaped them to (78952, 13, 2, 1) and (2574, 13, 2, 1), and it ran well. But I cannot use kernels such as (3x3), (7x7), ....
How can I reshape the input correctly to feed it into Conv2d layers?
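A probable reason the larger kernels fail is that the reshaped input is only 2 columns wide, so with 'valid' padding a 3x3 or 7x7 kernel has no position where it fits. The helper below (a hypothetical function, not part of Keras) applies the usual output-size formulas for 'valid' and 'same' padding to the (13, 2) grid from the question:

```python
import math

def conv_output_size(size, kernel, stride=1, padding="valid"):
    """Output length along one axis for 'valid' or 'same' padding."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1

# Input reshaped to (13, 2, 1): 13 rows, 2 columns.
for k in (2, 3, 7):
    rows = conv_output_size(13, k)
    cols = conv_output_size(2, k)
    print(f"kernel {k}x{k}: valid output = ({rows}, {cols})")

# With 'same' padding the spatial size is preserved at stride 1.
print(conv_output_size(13, 3, padding="same"), conv_output_size(2, 3, padding="same"))
```

A non-positive size means the layer cannot be built. Possible ways around this, depending on what the 26 features represent, are padding='same', a kernel such as (3, 1), or treating the features as a sequence of shape (26, 1) for a Conv1D layer.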

Keras: Difference between AveragePooling1D layer and GlobalAveragePooling1D layer

I'm a bit confused when it comes to the average pooling layers of Keras. The documentation states the following:
AveragePooling1D: Average pooling for temporal data.
Arguments
pool_size: Integer, size of the average pooling windows.
strides: Integer, or None. Factor by which to downscale. E.g. 2 will halve the input. If None, it will default to pool_size.
padding: One of "valid" or "same" (case-insensitive).
data_format: A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs.
channels_last corresponds to inputs with shape (batch, steps, features) while channels_first corresponds to inputs with shape (batch, features, steps).
Input shape
If data_format='channels_last': 3D tensor with shape: (batch_size, steps, features)
If data_format='channels_first': 3D tensor with shape: (batch_size, features, steps)
Output shape
If data_format='channels_last': 3D tensor with shape: (batch_size, downsampled_steps, features)
If data_format='channels_first': 3D tensor with shape: (batch_size, features, downsampled_steps)
and
GlobalAveragePooling1D: Global average pooling operation for temporal data.
Arguments
data_format: A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs.
channels_last corresponds to inputs with shape (batch, steps, features) while channels_first corresponds to inputs with shape (batch, features, steps).
Input shape
If data_format='channels_last': 3D tensor with shape: (batch_size, steps, features)
If data_format='channels_first': 3D tensor with shape: (batch_size, features, steps)
Output shape
2D tensor with shape: (batch_size, features)
I (think that I) do get the concept of average pooling but I don't really understand why the GlobalAveragePooling1D layer simply drops the steps parameter. Thank you very much for your answers.
GlobalAveragePooling1D is the same as AveragePooling1D with pool_size=steps: for each feature, it takes the average over all time steps. AveragePooling1D would give an output of shape (batch_size, 1, features) (if data_format='channels_last'). GlobalAveragePooling1D simply drops that length-1 steps dimension (the second one, or the third if data_format='channels_first'), which is how you get an output shape of (batch_size, features).
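The equivalence can be checked with plain NumPy, using a mean over the steps axis as a stand-in for the two Keras layers (array sizes are made up for the illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 10, 3))  # (batch_size, steps, features)

# Like AveragePooling1D with pool_size == steps: one window covering every step.
pooled = x.mean(axis=1, keepdims=True)   # shape (4, 1, 3)

# Like GlobalAveragePooling1D: same averages, length-1 steps axis removed.
global_pooled = x.mean(axis=1)           # shape (4, 3)

print(pooled.shape, global_pooled.shape)
print(np.allclose(pooled[:, 0, :], global_pooled))
```

Both results hold the same numbers; only the number of dimensions differs.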

Tensorflow, What is the best way to arrange tensors dimensions?

I am using tensorflow to train an RNN model. I store my input tensors with the shape (Batch Size, Time Steps, 128) where the 128 is the length of the one hot encoding to represent ASCII characters. To input a time step into an RNN I use the following function to reshape it to (Batch Size, 128)...
def getTimeStep(x, t):
    return tf.reshape(x[:, t, :], (-1, 128))
I am wondering if this is the most efficient way to feed my RNN the time steps. I am not sure about how memory is ordered in tensorflow. Here is the rest of my code for a sequence-sequence encoder. Notice that I am saving the output after each timestep since I want to feed it into an attention model in my decoder. Could I be doing something more efficiently?
input_tensor = tf.placeholder(tf.float32, (BATCH_SIZE, TIME_STEPS, 128), 'input_tensor')
expected_output = tf.placeholder(tf.float32, (BATCH_SIZE, TIME_STEPS, 128), 'expected_output')
with tf.variable_scope('encoder') as encode_scope:
    encoder_rnn = rnn.MultiRNNCell([rnn.GRUCell(1024)] * 3)
    encoder_state = tf.zeros((BATCH_SIZE, encoder_rnn.state_size))
    encoder_outputs = [None] * TIME_STEPS
    for t in range(TIME_STEPS):
        encoder_outputs[t], encoder_state = encoder_rnn(getTimeStep(input_tensor, t), encoder_state)
        encode_scope.reuse_variables()
    encoder_outputs = tf.concat(1, [tf.reshape(t, (BATCH_SIZE, 1, 1024)) for t in encoder_outputs])
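On the shape handling alone, two simplifications seem possible. Indexing a single time step already yields a (batch, 128) array, so the extra reshape changes nothing, and per-step outputs can be joined along a new time axis with a stack rather than a reshape followed by a concat. The sketch below shows this with NumPy on small made-up sizes; tf.unstack and tf.stack are the TensorFlow counterparts, and whether they are actually faster here is an assumption that would need measuring.

```python
import numpy as np

BATCH_SIZE, TIME_STEPS, DEPTH, UNITS = 2, 5, 128, 1024  # small made-up sizes
x = np.zeros((BATCH_SIZE, TIME_STEPS, DEPTH), dtype="float32")

# Indexing one time step drops the time axis: the result is already (batch, depth).
step = x[:, 3, :]

# Stand-ins for the per-step RNN outputs, each of shape (batch, units).
outputs = [np.zeros((BATCH_SIZE, UNITS), dtype="float32") for _ in range(TIME_STEPS)]

# Stack along a new axis 1 to get (batch, time_steps, units) in one call.
stacked = np.stack(outputs, axis=1)

print(step.shape, stacked.shape)
```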
