I am attempting to train the Keras VGG-19 model on RGB images. When feeding an image forward, this error arises:
ValueError: Input 0 of layer block1_conv1 is incompatible with the layer: expected ndim=4, found ndim=3. Full shape received: [224, 224, 3]
When I reshape the image to (224, 224, 3, 1) to include a batch dimension and then feed it forward as shown in the code below, this error occurs:
ValueError: Dimensions must be equal, but are 1 and 3 for '{{node BiasAdd}} = BiasAdd[T=DT_FLOAT, data_format="NHWC"](strided_slice, Const)' with input shapes: [64,224,224,1], [3]
for idx in tqdm(range(train_data.get_ds_size() // batch_size)):
    # train step
    batch = train_data.get_train_batch()
    for sample, label in zip(batch[0], batch[1]):
        sample = tf.reshape(sample, [*sample.shape, 1])
        label = tf.reshape(label, [*label.shape, 1])
        train_step(idx, sample, label)
vgg is initialized as:
vgg = tf.keras.applications.VGG19(
    include_top=True,
    weights=None,
    input_tensor=None,
    input_shape=[224, 224, 3],
    pooling=None,
    classes=1000,
    classifier_activation="softmax"
)
training function:
@tf.function
def train_step(idx, sample, label):
    with tf.GradientTape() as tape:
        # preprocess for vgg-19
        sample = tf.image.resize(sample, (224, 224))
        sample = tf.keras.applications.vgg19.preprocess_input(sample * 255)
        predictions = vgg(sample, training=True)
        # mean squared error in prediction
        loss = tf.keras.losses.MSE(label, predictions)
    # apply gradients
    gradients = tape.gradient(loss, vgg.trainable_variables)
    optimizer.apply_gradients(zip(gradients, vgg.trainable_variables))
    # update metrics
    train_loss(loss)
    train_accuracy(vgg, predictions)
How should the input be formatted so that the Keras VGG-19 implementation will accept it?
You will have to unsqueeze one dimension to turn your shape into [1, 224, 224, 3]:
for idx in tqdm(range(train_data.get_ds_size() // batch_size)):
    # train step
    batch = train_data.get_train_batch()
    for sample, label in zip(batch[0], batch[1]):
        sample = tf.reshape(sample, [1, *sample.shape])  # added the 1 here
        label = tf.reshape(label, [*label.shape, 1])
        train_step(idx, sample, label)
You are using the wrong dimension for the image batch. Regarding "When reshaping image to (224, 224, 3, 1) to include batch dim": the batch dimension goes first, so the shape should be (x, 224, 224, 3), where x is the number of images in the batch.
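For reference, here is a minimal self-contained sketch of the NHWC layout VGG-19 expects (random weights and a random image here, purely to demonstrate the shapes; tf.expand_dims adds the leading batch axis):

import tensorflow as tf

vgg = tf.keras.applications.VGG19(weights=None, input_shape=[224, 224, 3], classes=1000)

sample = tf.random.uniform([224, 224, 3])  # a single RGB image, no batch axis yet
batch = tf.expand_dims(sample, axis=0)     # NHWC: (1, 224, 224, 3)

predictions = vgg(batch, training=False)
print(predictions.shape)                   # (1, 1000)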
I'm new to TensorFlow and ML, and I'm trying to create a GAN that will generate an array of 3-dimensional points (output shape is (100, 3)).
I have the following discriminator model:
def make_discriminator_model():
    model = tf.keras.Sequential()
    model.add(layers.LeakyReLU(input_shape=(100, 3), name="Input"))
    model.add(layers.Flatten(name="Flatten"))
    model.add(layers.Dense(1, name="Output"))
    return model
When using this model as such it works okay:
if __name__ == "__main__":
    # generate fake data to test the discriminator with
    noise = tf.random.normal([1, 100])
    generator = make_generator_model()
    fake_data = generator(noise, training=False)
    # create the discriminator
    discriminator = make_discriminator_model()
    # test with fake data
    decision = discriminator(fake_data)
    print(decision)
Outputs: tf.Tensor([[0.0120331]], shape=(1, 1), dtype=float32)
However, when training with model.fit, I get ValueError: Input 0 of layer "sequential_1" is incompatible with the layer: expected shape=(None, 100, 3), found shape=(100, 3). This is the training code:
if __name__ == "__main__":
    generator = make_generator_model()
    discriminator = make_discriminator_model()
    # constants
    seed = tf.random.normal([1, 100])
    epochs = 5
    # generate fake data
    fake_data = generator(seed, training=False)
    # load dataset
    dataset = all_data(max_size=10_000)
    # split dataset into training (80%) and testing (20%)
    training_dataset = dataset[:8000]
    test_dataset = dataset[8000:]
    # use optimizer and loss function
    discriminator.compile(
        loss=losses.Hinge(),
        optimizer="adam",
        metrics=tf.metrics.BinaryAccuracy(threshold=0.0)
    )
    discriminator.summary()
    # convert training data into a Dataset
    input_dataset = tf.data.Dataset.from_tensor_slices(training_dataset[:6000])
    input_validation = tf.data.Dataset.from_tensor_slices(training_dataset[6000:])
    # train discriminator
    discriminator.fit(input_dataset, epochs=epochs, validation_data=input_validation)
I understand that it's expecting the shape (None, 100, 3) and is getting the shape (100, 3), but I don't understand why it's adding None to the front of the shape when using model.fit.
The leading None represents the batch dimension. By specifying input_shape=(100, 3), you are telling the model that it should expect a batch of unknown number of samples, where each sample is of shape (100,3).
However, what you are feeding to it is a tensor of shape (100,3), which it interprets as "a batch of 100 samples each of shape (3,)" and complains. If 100 is actually the number of samples in a batch, you'll need to specify input_shape=(3,). Otherwise, if (100,3) is the sample shape and there is indeed only one sample in the batch, you'll need to expand the tensor in the first dimension to make it (1,100,3).
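In this case, because the inputs are built with tf.data.Dataset.from_tensor_slices (which yields individual samples), the usual fix is to batch the dataset, which adds the leading dimension for you. A minimal sketch, with a random stand-in tensor for your training_dataset and an arbitrary batch size of 32:

import tensorflow as tf

data = tf.random.normal([6000, 100, 3])  # stand-in for training_dataset[:6000]

# from_tensor_slices yields elements of shape (100, 3);
# .batch() stacks them into (32, 100, 3), matching the expected (None, 100, 3)
input_dataset = tf.data.Dataset.from_tensor_slices(data).batch(32)

for batch in input_dataset.take(1):
    print(batch.shape)  # (32, 100, 3)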
I am solving the digit recognition task using the MNIST dataset in Keras.
The task itself runs smoothly, but afterwards I tried to use the same model
on some other handwritten digits that I created with 'paint'.
Since the original size was (192, 188, 3), I specifically resized to (28, 28).
However, once I try the model on this newly created digit (see attachment), this is the Warning message I get:
WARNING:tensorflow:Model was constructed with shape (None, 28, 28) for input KerasTensor(type_spec=TensorSpec(shape=(None, 28, 28), dtype=tf.float32, name='flatten_input'), name='flatten_input', description="created by layer 'flatten_input'"), but it was called on an input with incompatible shape (None, 28)
In addition to this error message:
ValueError: Input 0 of layer dense is incompatible with the layer: expected axis -1 of input shape to have value 784 but received input with shape (None, 28)
Here is my code:
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
# %matplotlib inline
import numpy as np
import pandas as pd
import cv2 as cv
(X_train, y_train),(X_test, y_test)=keras.datasets.mnist.load_data()
# Normalize the train dataset
X_train = tf.keras.utils.normalize(X_train, axis=1)
# Normalize the test dataset
X_test = tf.keras.utils.normalize(X_test, axis=1)
#Build the model object
model = tf.keras.models.Sequential()
# Add the Flatten Layer
model.add(tf.keras.layers.Flatten())
# Build the input and the hidden layers
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
# Build the output layer
model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))
# Compile the model
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x=X_train, y=y_train, epochs=20) # Start training process
# Evaluate the model performance
test_loss, test_acc = model.evaluate(x=X_test, y=y_test)
# Print out the model accuracy
print('\nTest accuracy:', test_acc)
predictions = model.predict([X_test]) # Make prediction
# TRY SAME MODEL WITH NEW DIGIT
img_6 = cv.imread("6.png")
img_7 = cv.imread("7.png")
img_2 = cv.imread("2.png")
from tensorflow.keras.preprocessing import image
img = img_7
img = cv.resize(img, X_train[0].shape,
                interpolation=cv.INTER_AREA)
img = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
plt.imshow(img)
plt.show()
img=np.invert(np.array([img]))
img=np.reshape(img, ( 784, 1))
print(img.shape,'fghjkljkhjgfgfgcgvhbjnmnbjv')
plt.imshow(img)
plt.show()
img=np.expand_dims(img, axis=0) # will move it to (1,784)
print(img.shape,'fghjkljkhjgfgfgcgvhbjnmnbjv')
plt.imshow(img)
plt.show()
prediction=model.predict(img) # predict
print ('prediction=',np.argmax(prediction))
plt.imshow(img)
plt.show()
The problem with your code is that your model is expecting a 3-dimensional input (batch_size, width, height), while you're giving it a single 2-dimensional image (width, height).
You can first reshape your input image to the correct shape, like so:
np.reshape(img_6, (1, 28, 28))
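A minimal sketch of that fix, assuming img_6 is already a grayscale 28x28 array and model is the trained model from the question:

import numpy as np

img_batch = np.reshape(img_6, (1, 28, 28))  # add the leading batch axis
prediction = model.predict(img_batch)
print(np.argmax(prediction))                # predicted digit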
The first layer of your model is tf.keras.layers.Flatten(), i.e. it flattens the input into a 1-D array. That array's length is 784 (28 x 28 x 1 ~ length x width x channel). So if you call model.summary(), the first layer is:
Layer (type)                 Output Shape              Param #
=================================================================
flatten (Flatten)            (None, 784)               0
This means that predict expects input data of shape (1, 784). You are on the right track in resizing and graying out the input image; a few more steps are needed. Please refer to the code below and the comment against each line:
from tensorflow.keras.preprocessing import image  # import image preprocessing
img_6 = cv.imread("6.png")  # shape is (352, 324, 3) for a screen snap; this could differ based on the read image
img_6 = cv.resize(img_6, X_train[0].shape,
                  interpolation=cv.INTER_AREA)  # now in shape (28, 28, 3), which is ~2352 (28x28x3)
img_6 = cv.cvtColor(img_6, cv.COLOR_BGR2GRAY)  # now a gray image
img_6 = image.img_to_array(img_6)  # shape (28, 28, 1), i.e. channel 1
img_6 = img_6.flatten()  # flatten it as the model expects (None, 784); this gives (784,), i.e. 28x28x1 = 784
img_6 = np.expand_dims(img_6, axis=0)  # moves it to (1, 784)
prediction = model.predict(img_6)  # predict
print(np.argmax(prediction))
Indeed the Keras model's first layer is Flatten, but since the training is done on X_train of shape (60000, 28, 28) and the first successful prediction is done on X_test of shape (10000, 28, 28), what you need for prediction is a NumPy array of shape (1, 28, 28).
Also be sure that the images in the training MNIST database are on a black background (color 0) and written with white shades (values closer to 1), so you need to normalize the img array with img = (255-img) / 255.
So with the following additional code I could successfully predict the 2 and 6 images:
img = img_2
img = cv.resize(img, X_train[0].shape, interpolation=cv.INTER_AREA)
img = cv.cvtColor(img, cv.COLOR_BGR2GRAY)  # now a gray image (28, 28)
img = (255 - img) / 255  # normalize as white on black
img = np.expand_dims(img, axis=0)  # moves it to (1, 28, 28)
pred = model.predict(img)  # predict
print(pred)
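Note that (255 - img) / 255 both inverts the image and rescales it to the [0, 1] range in a single step, which matches the white-digit-on-black-background convention of the MNIST training data.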
I'm building an RNN and I use LSTM.
The X matrix has dimension (1824, 7), while Y has dimension (1824, 1).
This is my model:
num_units = 64
learning_rate = 0.0001
activation_function = 'sigmoid'
adam = Adam(lr=learning_rate)
loss_function = 'mse'
batch_size = 5
num_epochs = 50
# Initialize the RNN
model = Sequential()
model.add(LSTM(units = num_units, activation=activation_function, input_shape=(1824, 7, )))
model.add(LeakyReLU(alpha=0.5))
model.add(Dropout(0.1))
model.add(Dense(units = 1))
# Compiling the RNN
model.compile(optimizer=adam, loss=loss_function, metrics=['accuracy'])
history = model.fit(
    X,
    y,
    validation_split=0.1,
    batch_size=batch_size,
    epochs=num_epochs,
    shuffle=False
)
I know the error is in the input_shape parameter. When I try to fit the model I get this error:
ValueError: Input 0 of layer sequential is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 7]
I have seen similar questions and I tried to apply some of those changes, such as:
input_dim = X.shape
input_dim=(7,)
input_dim=(1824, 7, 1)
But in every case I got this kind of error. How can I fix it?
As commented by @Nicolas Gervais,
Tensorflow Keras LSTM expects inputs: A 3D tensor with shape [batch, timesteps, feature].
Working sample code
import tensorflow as tf

inputs = tf.random.normal([32, 10, 8])  # (batch, timesteps, features)
print(inputs.shape)
lstm = tf.keras.layers.LSTM(4)          # 4 units -> output shape (batch, 4)
output = lstm(inputs)
print(output.shape)
Output
(32, 10, 8)
(32, 4)
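Applied to your data, one common option is to add a trailing feature axis and pass the per-sample shape to input_shape. A sketch with random stand-in data, under the assumption that each of your 7 columns is one timestep of a single feature (if they are instead 7 features at one timestep, the reshape would be (1824, 1, 7)):

import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

X = np.random.rand(1824, 7)  # stand-in for your X
y = np.random.rand(1824, 1)  # stand-in for your Y

# LSTM wants (batch, timesteps, features): treat the 7 columns as 7 timesteps of 1 feature
X = X.reshape(1824, 7, 1)

model = Sequential([
    LSTM(64, input_shape=(7, 1)),  # input_shape excludes the batch dimension
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=1, batch_size=5)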
I have been trying to train a dataset using TFLearn to implement a convolutional neural network.
I have a dataset of 10 classes with image size 64*32, 3 input channels, and 2 outputs, i.e. image detected/not detected.
Here is my code.
# Load the data set
def read_data():
    with open("deep_logo.pickle", 'rb') as f:
        save = pickle.load(f)
        X = save['train_dataset']
        Y = save['train_labels']
        X_test = save['test_dataset']
        Y_test = save['test_labels']
        del save
    return [X, X_test], [Y, Y_test]

def reformat(dataset, labels):
    dataset = dataset.reshape((-1, 64, 32, 3)).astype(np.float32)
    labels = (np.arange(10) == labels[:, None]).astype(np.float32)
    return dataset, labels

dataset, labels = read_data()
X, Y = reformat(dataset[0], labels[0])
X_test, Y_test = reformat(dataset[1], labels[1])
print('Training set', X.shape, Y.shape)
print('Test set', X_test.shape, Y_test.shape)
# building convolutional layers
network = input_data(shape=[None, 64, 32, 3], data_preprocessing=img_prep,
                     data_augmentation=img_aug)
network = conv_2d(network, 32, 3, activation='relu')
network = max_pool_2d(network, 2)
network = conv_2d(network, 64, 3, activation='relu')
network = conv_2d(network, 128, 3, activation='relu')
network = max_pool_2d(network, 2)
network = fully_connected(network, 512, activation='relu')
network = dropout(network, 0.5)

# Step 8: Fully-connected neural network with two outputs to make the final prediction
network = fully_connected(network, 2, activation='softmax')
network = regression(network, optimizer='adam',
                     loss='categorical_crossentropy',
                     learning_rate=0.001)

# Wrap the network in a model object
model = tflearn.DNN(network, tensorboard_verbose=0,
                    checkpoint_path='logo-classifier.tfl.ckpt')

# Training it. 100 training passes and monitor it as it goes.
model.fit(X, Y, n_epoch=100, shuffle=True, validation_set=(X_test, Y_test),
          show_metric=True, batch_size=64,
          snapshot_epoch=True,
          run_id='logo-classifier')

# Save model when training is complete to a file
model.save("logo-classifier.tfl")
print("Network trained and saved as logo-classifier.tfl!")
I get the following error
ValueError: Cannot feed value of shape (64, 10) for Tensor 'TargetsData/Y:0', which has shape '(?, 2)'
I have X and X_test with the image data and Y and Y_test with the labels in the pickle file. I have tried solutions from similar questions, but they didn't work for me.
Any help would be appreciated.
Thanks.
You've specified your output tensor shape as (?, 2) while your labels are of shape (?, 10). The label and output tensor shapes must be the same.
You are getting that error because there is a mismatch between the shape of what you are feeding and what TensorFlow is expecting. To fix the issue, you might want to reshape your Y, which is currently shaped (64, 10), to (?, 2). For example, you could do the following:
Y = np.reshape(Y, (-1, 2))
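Alternatively, since your reformat function one-hot encodes the labels over 10 classes, a sketch of the other direction (assuming the 10-class labels are what you actually want) is to widen the network's output instead of reshaping Y:

# Match the final layer to the 10-class one-hot labels:
network = fully_connected(network, 10, activation='softmax')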
I'm trying to replicate the CNN described in
https://pdfs.semanticscholar.org/3b57/85ca3c29c963ae396c2f94ba1a805c787cc8.pdf
and I'm stuck at the last layer. I've modeled the CNN like this:
# Model function for CNN
def cnn_model_fn(features, labels, mode):
    # Input Layer
    # Reshape X to 4-D tensor: [batch_size, width, height, channels]
    # Taxes images are 150x150 pixels, and have one color channel
    input_layer = tf.reshape(features, [-1, 150, 150, 1])

    # Convolutional Layer #1
    # Input Tensor Shape: [batch_size, 150, 150, 1]
    # Output Tensor Shape: [batch_size, 144, 144, 20]
    conv1 = tf.layers.conv2d(
        inputs=input_layer,
        filters=20,
        kernel_size=[7, 7],
        padding="valid",
        activation=tf.nn.relu)

    # Pooling Layer #1
    # Input Tensor Shape: [batch_size, 144, 144, 20]
    # Output Tensor Shape: [batch_size, 36, 36, 20]
    pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[4, 4], strides=4)

    # Convolutional Layer #2
    # Input Tensor Shape: [batch_size, 36, 36, 20]
    # Output Tensor Shape: [batch_size, 32, 32, 50]
    conv2 = tf.layers.conv2d(
        inputs=pool1,
        filters=50,
        kernel_size=[5, 5],
        padding="valid",
        activation=tf.nn.relu)

    # Pooling Layer #2
    # Input Tensor Shape: [batch_size, 32, 32, 50]
    # Output Tensor Shape: [batch_size, 8, 8, 50]
    pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[4, 4], strides=4)

    # Flatten tensor into a batch of vectors
    # Input Tensor Shape: [batch_size, 8, 8, 50]
    # Output Tensor Shape: [batch_size, 8 * 8 * 50]
    pool2_flat = tf.reshape(pool2, [-1, 8 * 8 * 50])

    # Dense Layer #1
    # Densely connected layer with 1000 neurons
    # Input Tensor Shape: [batch_size, 8 * 8 * 50]
    # Output Tensor Shape: [batch_size, 1000]
    dense1 = tf.layers.dense(inputs=pool2_flat, units=1000, activation=tf.nn.relu)

    # Dense Layer #2
    # Densely connected layer with 1000 neurons
    # Input Tensor Shape: [batch_size, 1000]
    # Output Tensor Shape: [batch_size, 1000]
    dense2 = tf.layers.dense(inputs=dense1, units=1000, activation=tf.nn.relu)

    # Add dropout operation; 0.5 probability that element will be kept
    dropout = tf.layers.dropout(
        inputs=dense2, rate=0.5, training=mode == learn.ModeKeys.TRAIN)

    # Logits layer
    # Input Tensor Shape: [batch_size, 1000]
    # Output Tensor Shape: [batch_size, 4]
    logits = tf.layers.dense(inputs=dropout, units=nClass)

    loss = None
    train_op = None

    # Calculate Loss (for both TRAIN and EVAL modes)
    if mode != learn.ModeKeys.INFER:
        onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=nClass)
        loss = tf.losses.softmax_cross_entropy(
            onehot_labels=onehot_labels, logits=logits)

    # Configure the Training Op (for TRAIN mode)
    if mode == learn.ModeKeys.TRAIN:
        train_op = tf.contrib.layers.optimize_loss(
            loss=loss,
            global_step=tf.contrib.framework.get_global_step(),
            learning_rate=0.001,
            optimizer="SGD")

    # Generate Predictions
    predictions = {
        "classes": tf.argmax(input=logits, axis=1)
    }

    # Return a ModelFnOps object
    return model_fn_lib.ModelFnOps(
        mode=mode, predictions=predictions, loss=loss, train_op=train_op)
but the final accuracy is really poor (0.25). Then I realized that the paper states that the last layer is a softmax layer, so I tried changing my logits layer to
logits = tf.layers.softmax(dropout)
but when I run it, it says
ValueError: Shapes (?, 1000) and (?, 4) are incompatible
So, what am I missing here?
The original one was correct. The softmax activation is applied while calculating the loss with tf.losses.softmax_cross_entropy. If you want to calculate it separately you should add it after the logits calculation, but without replacing it as you did.
logits = tf.layers.dense(inputs=dropout, units=nClass)
softmax = tf.layers.softmax(logits)
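If you go that route, a sketch of how the separate softmax could be surfaced alongside the class predictions (reusing the names from the predictions dict in your model_fn):

predictions = {
    "classes": tf.argmax(input=logits, axis=1),  # predicted class ids
    "probabilities": softmax                     # per-class probabilities
}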
Or you can combine both in one, but I wouldn't recommend it. It is better to calculate the softmax with the loss.
logits = tf.layers.dense(inputs=dropout, units=nClass, activation=tf.nn.softmax)
Your classifier is not doing better than random, so I would say that the problem lies somewhere else, maybe in the data loading and preprocessing.