ValueError: expected ndim=3, found ndim=2 after replacing BatchNormalization - python

I'm programming in Python 3.7.5 using Keras and TensorFlow 1.13.1.
I want to remove the batch normalization layers from the model coded below:
from keras import backend as K
from keras.callbacks import *
from keras.layers import *
from keras.models import *
from keras.utils import *
from keras.optimizers import Adadelta, RMSprop, Adam, SGD
from keras.callbacks import ModelCheckpoint
from keras.callbacks import TensorBoard
from config import *
def ctc_lambda_func(args):
    iy_pred, ilabels, iinput_length, ilabel_length = args
    # the 2 is critical here since the first couple outputs of the RNN
    # tend to be garbage:
    iy_pred = iy_pred[:, 2:, :] # no such influence
    return K.ctc_batch_cost(ilabels, iy_pred, iinput_length, ilabel_length)
def CRNN_model(is_training=True):
    inputShape = Input((width, height, 1), name='input') # base on Tensorflow backend
    conv_1 = Conv2D(64, (3, 3), activation='relu', padding='same')(inputShape)
    conv_2 = Conv2D(64, (3, 3), activation='relu', padding='same')(conv_1)
    #batchnorm_2 = BatchNormalization()(conv_2)
    pool_2 = MaxPooling2D(pool_size=(2, 2))(conv_2)
    conv_3 = Conv2D(64, (3, 3), activation='relu', padding='same')(pool_2)
    conv_4 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv_3)
    #batchnorm_4 = BatchNormalization()(conv_4)
    pool_4 = MaxPooling2D(pool_size=(2, 2))(conv_4)
    conv_5 = Conv2D(128, (3, 3), activation='relu', padding='same')(pool_4)
    conv_6 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv_5)
    pool_5 = MaxPool2D(pool_size=(2, 2))(conv_6)
    #batchnorm_6 = BatchNormalization()(conv_6)
    #bn_shape = batchnorm_6.get_shape()
    #x_reshape = Reshape(target_shape=(int(bn_shape[1]), int(bn_shape[2] * bn_shape[3])))(batchnorm_6)
    #drop_reshape = Dropout(0.25, name='d1')(x_reshape)
    fl_1 = Flatten()(pool_5)
    fc_1 = Dense(256, activation='relu')(fl_1)
    bi_LSTM_1 = Bidirectional(LSTM(256, return_sequences=True, kernel_initializer='he_normal'), merge_mode='sum')(fc_1)
    bi_LSTM_2 = Bidirectional(LSTM(128, return_sequences=True, kernel_initializer='he_normal'), merge_mode='concat')(bi_LSTM_1)
    #drop_rnn = Dropout(0.3, name='d2')(bi_LSTM_2)
    fc_2 = Dense(label_classes, kernel_initializer='he_normal', activation='softmax')(bi_LSTM_2)
    base_model = Model(inputs=[inputShape], outputs=fc_2)
    labels = Input(name='the_labels', shape=[label_len], dtype='float32')
    input_length = Input(name='input_length', shape=[1], dtype='int64')
    label_length = Input(name='label_length', shape=[1], dtype='int64')
    loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name='ctc')([fc_2, labels, input_length, label_length])
    if is_training:
        return Model(inputs=[inputShape, labels, input_length, label_length], outputs=[loss_out]), base_model
    return base_model
but I get this error:
Traceback (most recent call last):
File "C:/Users/Babak/PycharmProjects/CRNN-OCR/captcha-recognition-master1/captcha-recognition-master/", line 79, in <module>
model, base_model = CRNN_model(is_training=True)
File "C:\Users\Babak\PycharmProjects\CRNN-OCR\captcha-recognition-master1\captcha-recognition-master\", line 51, in CRNN_model
bi_LSTM_1 = Bidirectional(LSTM(256, return_sequences=True, kernel_initializer='he_normal'), merge_mode='sum')(fc_1)
File "C:\Program Files\Python37\lib\site-packages\keras\layers\", line 437, in __call__
return super(Bidirectional, self).__call__(inputs, **kwargs)
File "C:\Program Files\Python37\lib\site-packages\keras\engine\", line 446, in __call__
File "C:\Program Files\Python37\lib\site-packages\keras\engine\", line 342, in assert_input_compatibility
ValueError: Input 0 is incompatible with layer bidirectional_1: expected ndim=3, found ndim=2
Process finished with exit code 1
How can I remove the batch norm layers, which are commented out? Note that I already removed the dropout layers manually without any problem, so assume they are gone. The problem only appears when I remove the batch normalization layers.

As the error message says, LSTM layers expect 3D input tensors, but Flatten followed by Dense outputs only 2D. Several fixes are possible, but not all will work equally well:
Conv2D outputs 4D tensors, shaped (samples, height, width, channels)
LSTM expects input shaped (samples, timesteps, channels)
Thus, you need to somehow transform the (height, width) dimensions into timesteps
In existing research, image data is flattened and treated sequentially, while the channels remain untouched. Thus, a viable approach is to use Reshape to yield a 3D tensor shaped (samples, height*width, channels). Finally, to apply the same Dense weights at every timestep (dim 1 of the input), wrap the Dense layer in TimeDistributed; in Keras 2 a plain Dense on a 3D tensor also acts on the last axis, so the wrapper mainly makes the intent explicit:
pool_shapes = K.int_shape(pool_5)
fl_1 = Reshape((pool_shapes[1] * pool_shapes[2], pool_shapes[3]))(pool_5)
fc_1 = TimeDistributed(Dense(256, activation='relu'))(fl_1)
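Lastly, return_sequences=True outputs a 3D tensor. If you need a 2D output, either use return_sequences=False or insert a Flatten before the output Dense. Note, however, that K.ctc_batch_cost expects y_pred shaped (samples, timesteps, classes), so for the CTC loss in your model the output has to stay 3D (e.g. by wrapping the output Dense in TimeDistributed as above).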
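As a shape check only, the reshape described above can be sketched with NumPy instead of Keras. The sizes are made up for illustration (a batch of 2 and a pooled feature map of 16x4 with 128 channels); the NumPy reshape stands in for the Reshape layer, and multiplying by one weight matrix along the last axis mimics applying the same Dense weights at every timestep.

```python
import numpy as np

# Hypothetical pooled feature map: (samples, height, width, channels)
pool_5 = np.zeros((2, 16, 4, 128), dtype="float32")
samples, height, width, channels = pool_5.shape

# Merge the two spatial axes into one "timesteps" axis; channels stay untouched
sequence = pool_5.reshape(samples, height * width, channels)
print(sequence.shape)  # (2, 64, 128)

# The same (channels x units) weight matrix is applied at every timestep
weights = np.zeros((channels, 256), dtype="float32")
projected = sequence @ weights
print(projected.shape)  # (2, 64, 256)
```

The result is 3D, which is the rank the Bidirectional LSTM asks for in the error message.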


conv2D does not give expected Output shape

I am trying to copy a model architecture. In the original architecture, the output shape after applying Conv2D is (None, 112, 112, 16) with 432 params, as shown in the attached image:
original Model
But when I apply Conv2D, the output shape I get is (None, 225, 225, 16) with 448 params, as shown in the image of my model:
My model
This is the code I wrote:
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D, ZeroPadding2D, InputLayer, Input
from tensorflow.keras.models import Model
inputs = Input(shape=(224, 224, 3))
x = ZeroPadding2D(padding=((1,0),(1,0)))(inputs)
x = Conv2D(16, kernel_size=(3, 3), padding="same")(x)
x = Dense(1500, activation="relu")(x)
x = Dense(1000, activation="relu")(x)
prediction = Dense(784, activation='softmax')(x)
model_functional = Model(inputs=inputs, outputs=prediction)
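One way to reconcile the two summaries is to redo the arithmetic. The sketch below assumes the original layer uses a stride of 2, padding='valid' and no bias term. Those settings are inferred from the numbers in the question and are not confirmed by the original model's code; the helper names are made up for this illustration.

```python
import math

def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution along one axis (Keras conventions)."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1  # padding == "valid"

def conv_params(kernel_h, kernel_w, in_channels, filters, use_bias):
    """Number of trainable parameters of a 2D convolution."""
    weights = kernel_h * kernel_w * in_channels * filters
    return weights + (filters if use_bias else 0)

padded = 224 + 1  # ZeroPadding2D(((1, 0), (1, 0))) adds one row and one column

# Code from the question: stride 1, padding="same", bias enabled
print(conv_output_size(padded, 3, 1, "same"), conv_params(3, 3, 3, 16, True))    # 225 448

# Assumed original settings: stride 2, padding="valid", no bias
print(conv_output_size(padded, 3, 2, "valid"), conv_params(3, 3, 3, 16, False))  # 112 432
```

Under those assumptions, Conv2D(16, kernel_size=(3, 3), strides=2, padding="valid", use_bias=False) would reproduce the shape and parameter count of the original summary.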

Predict_classes() keras model ValueError: Input 0 of layer sequential_1 is incompatible with the layer:

I'm new to this. I'm trying to classify traffic sign images, following a notebook from Kaggle. After building the Keras model:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropout
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu', input_shape=X_train.shape[1:]))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dense(256, activation='relu'))
model.add(Dense(43, activation='softmax'))
#Compilation of the model
Now for the test part. After fitting the model, the code I found looks like this:
for f in labels:
    image=cv2.imread('C:/Users/Louay/input/Test/'+f.replace('Test/', ''))
    image_from_array = Image.fromarray(image, 'RGB')
    size_image = image_from_array.resize((height, width))
X_test = X_test.astype('float32')/255
pred = model.predict_classes(X_test)
This works and predicts the classes of all images in the test set correctly. My problem appears when I try to predict only one image from the test set. I thought I would repeat the same image processing and then use predict_classes(), so my code would look like this:
image_from_array = Image.fromarray(image, 'RGB')
size_image = image_from_array.resize((height, width))
pred = model.predict_classes(test.astype('float32')/255)
Since I'm working on a single image, I thought I wouldn't need the data[] list to which I append all processed images. But when I run the code I get this error:
ValueError: Input 0 of layer sequential_1 is incompatible with the layer: : expected min_ndim=4, found ndim=3. Full shape received: [None, None, 3]
I know I'm doing something wrong, but before correcting my code I really want to understand why I am getting this error. What is causing it, and what is actually happening?
Conv2D expects 4+D tensor with shape: batch_shape + (channels, rows, cols) if data_format='channels_first' or 4+D tensor with shape: batch_shape + (rows, cols, channels) if data_format='channels_last'.
Add an extra (batch) dimension to your input using tf.expand_dims.
Working sample code:
import tensorflow as tf
image = tf.zeros([10,10,3])
input_shape = tf.expand_dims(image, axis=0).shape.as_list()
x = tf.random.normal(input_shape)
y = tf.keras.layers.Conv2D(
    2, 3, activation='relu', input_shape=input_shape[1:])(x)
print(y.shape)
Output:
(1, 8, 8, 2)
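Applied to the single-image case from the question, the same idea can be sketched with NumPy. The 30x30 image size and the variable names are placeholders, not values from the original post.

```python
import numpy as np

# Placeholder for one preprocessed RGB image: (rows, cols, channels)
single_image = np.zeros((30, 30, 3), dtype="float32")

# Add a leading batch axis so the array looks like a batch containing one image
batch_of_one = np.expand_dims(single_image, axis=0)

print(single_image.shape)  # (30, 30, 3)
print(batch_of_one.shape)  # (1, 30, 30, 3)
```

The 4D batch_of_one is what would be passed to predict_classes() in place of the 3D array.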

ValueError: Input 0 of layer sequential is incompatible with the layer: : expected min_ndim=4, found ndim=2. Full shape received: [None, 2584]

I'm working on a project that isolates the vocal parts from an audio track. I'm using the DSD100 dataset, but for tests I'm using the DSD100subset dataset, of which I only use the mixtures and the vocals. I'm basing this work on this article
First I process the audio files to extract a spectrogram from each one and put it in a list, so that all the audio files form four lists (trainMixed, trainVocals, testMixed, testVocals). Like this:
def to_spec(wav, n_fft=1024, hop_length=256):
    return librosa.stft(wav, n_fft=n_fft, hop_length=hop_length)
def prepareData(filename, sr=22050, hop_length=256, n_fft=1024):
    audio_wav = librosa.load(filename, sr=sr, mono=True, duration=30)[0]
    audio_spec=to_spec(audio_wav, n_fft=n_fft, hop_length=hop_length)
    audio_spec_mag = np.abs(audio_spec)
    maxVal = np.max(audio_spec_mag)
    return audio_spec_mag, maxVal
# FOR EVERY LIST (trainMixed, trainVocals, testMixed, testVocals)
trainMixed = []
trainMixedNum = 0
for (root, dirs, files) in walk('./Dev-subset-mix/Dev/'):
    for d in dirs:
        filenameMix = './Dev-subset-mix/Dev/'+d+'/mixture.wav'
        spec_mag, maxVal = prepareData(filenameMix, n_fft=1024, hop_length=256)
Next I build the model:
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from keras.optimizers import SGD
from keras.layers.advanced_activations import LeakyReLU
model = Sequential()
model.add(Conv2D(16, (3,3), padding='same', input_shape=(513, 25, 1)))
model.add(Conv2D(16, (3,3), padding='same'))
model.add(Conv2D(16, (3,3), padding='same'))
model.add(Conv2D(16, (3,3), padding='same'))
model.add(Dense(1, activation='sigmoid'))
sgd = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss=keras.losses.binary_crossentropy, optimizer=sgd, metrics=['accuracy'])
And run the model:
model.fit(trainMixed, trainVocals, epochs=10, validation_data=(testMixed, testVocals))
But I'm getting this result:
ValueError: in user code:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/ train_function *
return step_function(self, iterator)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/ step_function **
outputs =, args=(data,))
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/ run
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/ call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/ _call_for_each_replica
return fn(*args, **kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/ run_step **
outputs = model.train_step(data)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/ train_step
y_pred = self(x, training=True)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/ __call__
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/ assert_input_compatibility
' input tensors. Inputs received: ' + str(inputs))
ValueError: Layer sequential_1 expects 1 inputs, but it received 2 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, 2584) dtype=float32>, <tf.Tensor 'IteratorGetNext:1' shape=(None, 2584) dtype=float32>]
I am new to this topic, thanks for the help provided in advance.
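It's probably an issue with how the input data is passed to Keras' fit() function. I would recommend using a tf.data.Dataset as input to fit(), like so:
import tensorflow as tf
train_data = tf.data.Dataset.from_tensor_slices((trainMixed, trainVocals))
valid_data = tf.data.Dataset.from_tensor_slices((testMixed, testVocals))
model.fit(train_data, epochs=10, validation_data=valid_data)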
You can then also use functions like shuffle() and batch() on the TF datasets.
EDIT: It also seems like your input shapes are incorrect. The input_shape you specified for the first conv layer is (513, 25, 1), so the input should be a batch tensor of shape (batch_size, 513, 25, 1), whereas you're inputting the shape (batch_size, 2584). So you'll need to reshape and probably cut your inputs to the specified shape, or specify a new shape.
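A minimal sketch of that reshaping, using NumPy only. It assumes each spectrogram has shape (513, 2584), i.e. 513 frequency bins by 2584 frames, which is consistent with n_fft=1024 and the 2584 in the error message. The helper name split_into_patches is made up for this example.

```python
import numpy as np

def split_into_patches(spec_mag, frames_per_patch=25):
    """Cut a (freq_bins, frames) spectrogram into (n, freq_bins, frames_per_patch, 1) patches."""
    freq_bins, frames = spec_mag.shape
    n_patches = frames // frames_per_patch
    # Drop trailing frames that do not fill a whole patch
    trimmed = spec_mag[:, :n_patches * frames_per_patch]
    patches = trimmed.reshape(freq_bins, n_patches, frames_per_patch)
    patches = patches.transpose(1, 0, 2)  # (n, freq_bins, frames_per_patch)
    return patches[..., np.newaxis]       # add the channel axis

spec = np.random.rand(513, 2584).astype("float32")  # stand-in for one spectrogram
patches = split_into_patches(spec)
print(patches.shape)  # (103, 513, 25, 1)
```

Each patch then matches input_shape=(513, 25, 1); the mixture and vocal spectrograms would need to be cut the same way so that inputs and targets stay aligned.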
Basically, no matter how you define the input shape of Conv2D, it requires a 4D array when you feed the input X to it, where X.shape looks like (batch, row, col, channel).
The example below clarifies this for Conv2D:
import numpy as np
from tensorflow.keras import layers  # assumed import; the original snippet did not show it
input_layer = layers.InputLayer(input_shape=(2,2,1))
conv1 = layers.Conv2D(3,(2,2))
X = np.ones((2,2))
X = X.reshape(1,X.shape[0],X.shape[1],1) # shape of X is 4D, (1, 2, 2, 1)
To elaborate on the code above:
input_layer is defined with a 3D input_shape, (2, 2, 1), which does not include the batch dimension, while X is reshaped to the 4D shape (1, 2, 2, 1), which does. So any input X fed to input_layer or Conv2D must be passed with a 4D shape.

How to convert a tensorflow model to a pytorch model?

I'm new to pytorch. Here's an architecture of a tensorflow model and I'd like to convert it into a pytorch model.
I have done most of the codes but am confused about a few places.
1) In tensorflow, the Conv2D function takes filter as an input. However, in pytorch, the function takes the size of input channels and output channels as inputs. So how do I find the equivalent number of input channels and output channels, provided with the size of the filter.
2) In tensorflow, the dense layer has a parameter called 'nodes'. However, in pytorch, the same layer has 2 different inputs (the size of the input parameters and size of the targeted parameters), how do I determine them based on the number of the nodes.
Here's the tensorflow code.
from keras.utils import to_categorical
from keras.models import Sequential, load_model
from keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropout
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu', input_shape=X_train.shape[1:]))
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dense(256, activation='relu'))
model.add(Dense(43, activation='softmax'))
Here's my code:
import torch.nn.functional as F
import torch
import torch.nn as nn
# The network should inherit from the nn.Module
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Define 2D convolution layers
        # 3: input channels, 32: output channels, 5: kernel size, 1: stride
        self.conv1 = nn.Conv2d(3, 32, 5, 1) # The size of input channel is 3 because all images are coloured
        self.conv2 = nn.Conv2d(32, 64, 5, 1)
        self.conv3 = nn.Conv2d(64, 128, 3, 1)
        self.conv4 = nn.Conv2d(128, 256, 3, 1) # was a second self.conv3; forward() uses self.conv4
        # It will 'filter' out some of the input by the probability(assign zero)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        # Fully connected layer: input size, output size
        self.fc1 = nn.Linear(36864, 128)
        self.fc2 = nn.Linear(128, 10)
    # forward() link all layers together,
    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = self.conv3(x)
        x = F.relu(x)
        x = self.conv4(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output
Thanks in advance!
1) In PyTorch, the layer takes the number of input channels and output channels as arguments. In your first layer, the input channels will be the number of color channels in your image. After that, it is always the same as the output channels of the previous layer (output channels are specified by the filters parameter in TensorFlow).
2) PyTorch is slightly annoying in that, when flattening your conv outputs, you have to calculate the shape yourself. You can either use the equation Out = (W − F + 2P)/S + 1 (W: input size, F: kernel size, P: padding, S: stride), or write a shape-calculating function that passes a dummy image through the conv part of the network and reads off the resulting shape. That flattened size is the input-size argument (in_features) of the first nn.Linear; the output-size argument (out_features) is simply the number of nodes you want in that fully connected layer.
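The formula can be wrapped in a small helper to get the input size of the first nn.Linear. The sketch below follows the layer sequence of the Keras model in the question and assumes 30x30 input images, with no padding and stride 1 for the convolutions. The input size is a guess, since X_train.shape is not given in the question.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Out = (W - F + 2P) / S + 1, using floor division."""
    return (size - kernel + 2 * padding) // stride + 1

size = 30                 # assumed input height/width (not stated in the question)
size = conv_out(size, 5)  # Conv2D(32, 5x5) -> 26
size = conv_out(size, 5)  # Conv2D(32, 5x5) -> 22
size = size // 2          # MaxPool 2x2     -> 11
size = conv_out(size, 3)  # Conv2D(64, 3x3) -> 9
size = conv_out(size, 3)  # Conv2D(64, 3x3) -> 7
size = size // 2          # MaxPool 2x2     -> 3

flat_features = 64 * size * size  # channels * height * width after the last pooling
print(flat_features)  # 576
```

With these assumptions, the two Dense layers of the Keras model would map to nn.Linear(576, 256) followed by nn.Linear(256, 43).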

Recurrentshop and Keras: multi-dimensional RNN results in a dimensions mismatch error

I have an issue with Recurrentshop and Keras. I am trying to use Concatenate and multidimensional tensors in a RecurrentModel, and I get a dimension mismatch regardless of how I arrange the Input, shape and batch_shape.
Minimal code:
from keras.layers import *
from keras.models import *
from recurrentshop import *
from keras.layers import Concatenate
x_t = Input(shape=(128,128,3,))
h_tm1 = Input(shape=(128,128,3, ))
h_t1 = Concatenate()([x_t, h_tm1])
last = Conv2D(3, kernel_size=(3,3), strides=(1,1), padding='same', name='conv2')(h_t1)
# Build the RNN
rnn = RecurrentModel(input=x_t, initial_states=[h_tm1], output=last, final_states=[last], state_initializer=['zeros'])
x = Input(shape=(128,128,3, ))
y = rnn(x)
model = Model(x, y)
model.predict(np.random.random((1, 128, 128, 3)))
ValueError: Shape must be rank 3 but it is rank 4 for 'recurrent_model_1/concatenate_1/concat' (op:ConcatV2) with input shapes: [?,128,3], [?,128,128,3], [].
Please help.
Try this (the changed lines are commented):
from recurrentshop import *
from keras.layers import Concatenate
x_t = Input(shape=(128, 128, 3,))
h_tm1 = Input(shape=(128, 128, 3,))
h_t1 = Concatenate()([x_t, h_tm1])
last = Conv2D(3, kernel_size=(3, 3), strides=(1, 1), padding='same', name='conv2')(h_t1)
rnn = RecurrentModel(input=x_t,
x = Input(shape=(1, 128, 128, 3,)) # a series of 3D tensors -> 4D
y = rnn(x)
model = Model(x, y)
model.predict(np.random.random((1, 1, 128, 128, 3))) # a batch of x -> 5D
