Using Keras to design a CNN: Understanding Tensor Shape - python

I'm just starting out with ML and wanted to create my own CNN to detect the orientation of images with faces. I followed a tutorial to accept input images of 64x64x1, and here is my code:
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D
from keras.preprocessing.image import ImageDataGenerator
datagen = ImageDataGenerator()
train_it = datagen.flow_from_directory('firstThousandTransformed/', class_mode='categorical', batch_size=64, color_mode="grayscale")
val_it = datagen.flow_from_directory('validation/', class_mode='categorical', batch_size=64, color_mode="grayscale")
imageInput = Input(shape=(64,64,1))
conv1 = Conv2D(32, kernel_size=4, activation='relu')(imageInput)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
conv2 = Conv2D(16, kernel_size=4, activation='relu')(pool1)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
flat = Flatten()(pool2)
hidden1 = Dense(10, activation='relu')(flat)
output = Dense(4, activation='sigmoid')(hidden1)
model = Model(inputs=imageInput, outputs=output)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(train_it, steps_per_epoch=16, validation_data=val_it, validation_steps=8)
However, I get this error when I try to run:
Input to reshape is a tensor with 3810304 values, but the requested
shape requires a multiple of 2704 [[node model/flatten/Reshape
(defined at c:\Users\cdues\Desktop\kerasTutorial\orentationTry.py:33)
]] [Op:__inference_train_function_836]
Below is my model summary:
I need some help understanding what a tensor shape is and where my code has gone wrong here. Working through the tutorial with Keras, I didn't encounter tensor shapes and now I am sort of lost. Sorry for the basic question, can y'all help a noobie out? Thanks!

Try using the target_size argument while calling flow_from_directory.
train_it = datagen.flow_from_directory('firstThousandTransformed/',
                                        class_mode='categorical',
                                        batch_size=64,
                                        color_mode='grayscale',
                                        target_size=(64,64))
val_it = datagen.flow_from_directory('validation/',
                                      class_mode='categorical',
                                      batch_size=64,
                                      color_mode='grayscale',
                                      target_size=(64,64))
This way the images read from the directories are resized to 64x64 before being fed to the model; without target_size, flow_from_directory defaults to (256, 256), which does not match your Input(shape=(64,64,1)).
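To see where the numbers in the error come from, here is a quick back-of-the-envelope check (a sketch assuming the layer sizes in the question and the default 'valid' padding):

def flat_features(side):
    # Conv2D(kernel_size=4, padding='valid') shrinks each side by 3; MaxPooling2D((2, 2)) halves it.
    side = (side - 3) // 2   # conv1 + pool1
    side = (side - 3) // 2   # conv2 + pool2
    return side * side * 16  # 16 channels after conv2

print(flat_features(64))        # 2704    -> what Flatten/Dense were built for (64x64 input)
print(flat_features(256))       # 59536   -> what a 256x256 image (the default target_size) produces
print(flat_features(256) * 64)  # 3810304 -> one batch of 64, the number in the error message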

First, ImageDataGenerator has a parameter called rescale. Typically, with pixel values in the range 0 to 255, rescale is set to 1/255 so the pixel values fall in the range 0 to 1; I recommend you use that. The documentation for ImageDataGenerator is here.

In flow_from_directory you can specify the image size with the parameter target_size: "Tuple of integers (height, width), default: (256, 256)". Documentation is at the location specified earlier.

In your model you have 4 nodes in the output layer, which implies you are classifying images into one of 4 classes. If that is the case, you should use categorical cross entropy as the loss in model.compile, and change the activation function in your output layer to softmax.
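Put together, the changes suggested above would look roughly like this against the code in the question (a sketch; the directory names, hidden1 and imageInput come from the question's code):

datagen = ImageDataGenerator(rescale=1./255)  # scale 0-255 pixel values to 0-1

train_it = datagen.flow_from_directory('firstThousandTransformed/',
                                        class_mode='categorical',
                                        batch_size=64,
                                        color_mode='grayscale',
                                        target_size=(64, 64))  # match Input(shape=(64,64,1))
val_it = datagen.flow_from_directory('validation/',
                                      class_mode='categorical',
                                      batch_size=64,
                                      color_mode='grayscale',
                                      target_size=(64, 64))

output = Dense(4, activation='softmax')(hidden1)  # softmax for 4 mutually exclusive classes
model = Model(inputs=imageInput, outputs=output)
model.compile(loss='categorical_crossentropy',    # matches the 4-class softmax output
              optimizer='adam',
              metrics=['accuracy'])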

Related

Trying to make machine learning code with Python, but I get an error when the first epoch finishes

I am trying to make a machine learning model that reads images, but I get an error when the first epoch finishes.
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 147456 values, but the requested shape requires a multiple of 12544
Any ideas?
validation_generator = ImageDataGenerator(rescale=1./255)
train_data_gen = train_generator.flow_from_dataframe(
    dataframe=df,
    #directory="CatDog",
    x_col="images",
    y_col="label",
    class_mode="binary",
    batch_size=64,
    target_size=(128,128))
validation_data_gen = validation_generator.flow_from_dataframe(
    dataframe=df,
    #directory='',
    x_col="images",
    y_col="label",
    class_mode="binary",
    batch_size=64,
    target_size=(64,64))
from keras.models import Sequential
from keras.layers import Dense, Conv2D, MaxPool2D, Dropout, Flatten

model = Sequential([
    Conv2D(16, (3,3), activation='relu', input_shape=(128,128,3)),
    MaxPool2D((2,2)),
    Conv2D(32, (3,3), activation='relu'),
    MaxPool2D((2,2)),
    Conv2D(64, (3,3), activation='relu'),
    MaxPool2D((2,2)),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.summary()
history = model.fit(train_data_gen, epochs=2, validation_data=validation_data_gen)
I'm not sure if this is your full code, but there are already some problems.
line 1: rescale=1./255
rescale=1./255 converts your images from 0~255 8-bit integers to 0~1 floats. Make sure your training, validation and test data are all in the same format; that is, you may want to rescale train_generator as well.
line 19: target_size=(64,64)
It is better to use the same target_size for both generators: the validation generator's 64x64 images will not fit the model's input_shape of (128,128,3), and that mismatch is exactly what produces the reshape error when validation starts at the end of the first epoch. If the original images are large enough, set target_size no smaller than the model's input_shape to avoid resizing the images multiple times, which may lose some information.
line 33: Dense(1, activation='sigmoid')
The last Dense layer is usually your classification layer. Dense(1, ...) means you have only one output label. If you want to classify an image as cat or dog, you may set Dense(2, ...).
activation='sigmoid' gives you something like a percentage of how much the model thinks the image matches the label; you may want activation='softmax' to get a classification result over the classes.
Basically, my guess is that the error is caused by the wrong Dense(units, ...).
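A sketch of the generator changes described above, assuming train_generator is created with ImageDataGenerator in the full code (df, train_generator and the column names come from the question):

# Rescale both generators the same way and use one target_size that matches the model's input_shape.
train_generator = ImageDataGenerator(rescale=1./255)
validation_generator = ImageDataGenerator(rescale=1./255)

train_data_gen = train_generator.flow_from_dataframe(
    dataframe=df, x_col="images", y_col="label",
    class_mode="binary", batch_size=64, target_size=(128, 128))

validation_data_gen = validation_generator.flow_from_dataframe(
    dataframe=df, x_col="images", y_col="label",
    class_mode="binary", batch_size=64, target_size=(128, 128))  # same size as input_shape=(128,128,3)

Note that with class_mode="binary", the existing Dense(1, activation='sigmoid') plus binary_crossentropy is already a consistent combination; switching to Dense(2, activation='softmax') as suggested would also mean switching to class_mode="categorical" and categorical_crossentropy.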

Shape incompatible error while using ImageDataGenerator for transfer learning

I want to create a classification model. For this purpose I have collected some images from 3 different classes. First, I implemented an Xception model (froze all layers except the last one). However, it overfitted. Then I decided to use a data augmentation strategy. This is the first time I have used this Keras module for this purpose. I believe I have used it correctly, but I get the error ValueError: Shapes (None, None) and (None, None, None, 3) are incompatible. I have tried what I found on the web, but it did not work. Can anyone point out what I am doing wrong? Here is the code.
from tensorflow import keras
from matplotlib import pyplot as plt
from keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.imagenet_utils import preprocess_input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten, Activation
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.models import Model
# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1./255)

# this is a generator that will read pictures found in
# subfolders of 'data/train', and indefinitely generate
# batches of augmented image data
train_generator = train_datagen.flow_from_directory(
    'data2/train',  # this is the target directory
    target_size=(299, 299),  # all images will be resized to 299x299 for the Xception
    batch_size=32,
    class_mode="categorical")

# this is a similar generator, for validation data
validation_generator = test_datagen.flow_from_directory(
    'data2/validation',
    target_size=(299, 299),
    batch_size=32,
    class_mode="categorical")
Xception = keras.applications.Xception(weights='imagenet', include_top=False)
num_classes=3
inp = Xception.input
new_classification_layer = Dense(num_classes, activation='softmax')
out = new_classification_layer(Xception.layers[-2].output)
model_Xception = Model(inp, out)
model_Xception.summary()
for l, layer in enumerate(model_Xception.layers[:-1]):
    layer.trainable = False
for l, layer in enumerate(model_Xception.layers[-1:]):
    layer.trainable = True
opt=keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07)
model_Xception.compile(loss='categorical_crossentropy',
                       optimizer=opt,
                       metrics=['accuracy'])
model_Xception.summary()
model_Xception.fit_generator(
    train_generator,
    epochs=5,
    validation_data=validation_generator)
model_Xception.save_weights('first_try.h5')
That's because you are feeding a convolution's output straight into a Dense layer.
You need to add one of Flatten, GlobalMaxPooling2D or GlobalAveragePooling2D to transform the output into a 2D tensor of shape (batch_size, features). You can change these lines:
import tensorflow as tf  # needed for the tf.keras references below

inp = Xception.input
out_xception = Xception.layers[-2].output
flatten = tf.keras.layers.Flatten()(out_xception)
new_classification_layer = tf.keras.layers.Dense(num_classes, activation='softmax')
out = new_classification_layer(flatten)
model_Xception = tf.keras.Model(inp, out)
model_Xception.summary()
The second thing: since you did not specify input_shape when defining the Xception model, Flatten will throw an error. Simply change it to:
Xception = tf.keras.applications.Xception(weights='imagenet', include_top=False,
                                          input_shape=(299, 299, 3))
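Another common option from the list above is GlobalAveragePooling2D, which, unlike Flatten, does not require a fixed input_shape (a sketch, not taken from the answer above):

import tensorflow as tf

num_classes = 3
base = tf.keras.applications.Xception(weights='imagenet', include_top=False)
base.trainable = False  # freeze the convolutional base

pooled = tf.keras.layers.GlobalAveragePooling2D()(base.output)  # (batch, 2048); no fixed spatial size needed
out = tf.keras.layers.Dense(num_classes, activation='softmax')(pooled)
model_Xception = tf.keras.Model(base.input, out)
model_Xception.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])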

Matrix size-incompatible - Keras Tensorflow

I'm trying to train a simple model over some picture data that belongs to 10 classes.
The images are in B/W format (not grayscale). I'm using image_dataset_from_directory to import the data into Python and to split it into training/validation sets.
My code is as below:
My Imports
import matplotlib.pyplot as plt
import numpy as np
import os
import PIL
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Dense
Read Image Data
trainDT = tf.keras.preprocessing.image_dataset_from_directory(
    data_path,
    labels="inferred",
    label_mode="categorical",
    class_names=['0','1','2','3','4','5','6','7','8','9'],
    color_mode="grayscale",
    batch_size=4,
    image_size=(256, 256),
    shuffle=True,
    seed=44,
    validation_split=0.1,
    subset='validation',
    interpolation="bilinear",
    follow_links=False,
)
Model Creation/Compile/Fit
model = Sequential([
    Dense(units=128, activation='relu', input_shape=(256,256,1), name='h1'),
    Dense(units=64, activation='relu', name='h2'),
    Dense(units=16, activation='relu', name='h3'),
    layers.Flatten(name='flat'),
    Dense(units=10, activation='softmax', name='out')
], name='1st')
model.summary()
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x=trainDT, validation_data=train_data, epochs=10, verbose=2)
The model training returns an error:
InvalidArgumentError Traceback (most recent call last)
....
/// anaconda paths and anaconda python code snippets in the error reporting \\\
....
InvalidArgumentError: Matrix size-incompatible: In[0]: [1310720,3], In[1]: [1,128]
[[node 1st/h1/Tensordot/MatMul (defined at <ipython-input-38-58d6507e2d35>:1) ]] [Op:__inference_test_function_11541]
Function call stack:
test_function
I don't understand where the size mismatch comes from. I've spent a few hours looking for a solution and trying different things, but nothing seems to work for me.
Appreciate any help, thank you in advance!
Dense layers expect flat input (not a 3D tensor), but you are sending a (256,256,1)-shaped tensor into the first dense layer. If you want to use dense layers from the beginning, you need to move the Flatten to be the first layer, or properly reshape your data.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax")
])
Also, the Flatten between two Dense layers makes no sense, because the output of a Dense layer is flat anyway.
From the structure of your model (especially the Flatten placement), I assume that those dense layers were supposed to be convolutional layers instead.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax")
])
Convolutional layers can process 2D input, and they also produce multi-dimensional output, which you need to flatten before passing it to the dense top (note that you can add more convolutional layers).
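To make the first point concrete: when a Dense layer receives an image tensor, Keras applies it only along the last axis (here the single grayscale channel), so its weight matrix is built as [1, 128] and the spatial dimensions are never collapsed. A quick check (a sketch, independent of the code above):

import tensorflow as tf

x = tf.keras.Input(shape=(256, 256, 1))
y = tf.keras.layers.Dense(128)(x)
print(y.shape)  # (None, 256, 256, 128): Dense acted only on the last axis, not on the whole image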
Hi mhk777, hope you are doing well. Brother, I think you are confusing dense layers with convolutional layers. You have to apply some convolutional layers to the image before giving it to dense layers. If you don't want to apply convolutions, then you have to give a 2D array to the dense layer, i.e. (number of samples, data).
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
model = models.Sequential()
# Here are convolutional Layer
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(256,256,1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
# Here are your dense layers
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))  # softmax output so categorical_crossentropy receives probabilities
model.summary()
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x=trainDT, validation_data=train_data, epochs=10, verbose=2)

Improving a bad CNN- Detecting Image Orientation

I'm just starting out in ML and finally got my first CNN up and running :) except its accuracy is only slightly better than a random guess (~27%). I give the model a set of 2000 pictures of faces sorted by rotation: 0, 90, 180, or 270 degrees. Below is my code:
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D
from keras.preprocessing.image import ImageDataGenerator
#import matplotlib.pyplot as plt
datagen = ImageDataGenerator()
train_it = datagen.flow_from_directory('firstThousandTransformed/', class_mode='categorical', batch_size=64, color_mode="grayscale", target_size=(64,64))
val_it = datagen.flow_from_directory('validation/', class_mode='categorical', batch_size=64, color_mode="grayscale", target_size=(64,64))
test_it = datagen.flow_from_directory('test/', class_mode='categorical', batch_size=64, color_mode='grayscale', target_size=(64,64))
imageInput = Input(shape=(64,64,1))
conv1 = Conv2D(128, kernel_size=8, activation='relu')(imageInput)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
conv2 = Conv2D(64, kernel_size=4, activation='relu')(pool1)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
conv3 = Conv2D(64, kernel_size=4, activation='relu')(pool2)
pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
flat = Flatten()(pool3)
hidden1 = Dense(10, activation='relu')(flat)
output = Dense(4, activation='softmax')(hidden1)
model = Model(inputs=imageInput, outputs=output)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(train_it, steps_per_epoch=16, validation_data=val_it, validation_steps=8)
loss = model.evaluate(test_it, steps=16)
_, accuracy = model.evaluate(train_it)
print('Accuracy: %.2f' % (accuracy*100))
print(model.summary())
The way I envisioned this network working was that the convolution layers might detect some hair or a chin in a certain place and be able to distinguish that hair or chin placement from another image. This clearly is not working. Could you give a noobie some advice? How can I make this better? How can I think about this problem? Am I using the wrong kind of layers? Do I need more pictures?
EDIT:
So I have been playing around with it a little bit by messing with the kernel_sizes (now they are 12, 8, and 4) and changing the number of epochs to 20, and something crazy happened. When I ran the program, I got an accuracy of 99%!! (see screenshot below)
HOWEVER, when I ran it again to double check, it went back to ~27%. What does this mean?

Triplet loss from TensorFlow Addons gives reshape error

I have been getting this error and cannot figure it out.
ValueError: Cannot reshape a tensor with 48032 elements to shape [32,1] (32 elements) for 'Reshape' (op: 'Reshape') with input shapes: [32,1501], [2] and with input tensors computed as partial shapes: input[1] = [32,1].
What I am doing is trying to use a triplet loss function from the tensorflow_addons library, following the example here:
https://www.tensorflow.org/addons/tutorials/losses_triplet
I pretty much copied it and changed the data.
My data set contains 1501 different classes, separated into a folder for each class. I am using a data generator through tf.data.Dataset, which seems to work fine too.
This is what I have
BATCH_SIZE = 32

train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    shear_range=0,
    rotation_range=20,
    zoom_range=0.15,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

ds = tf.data.Dataset.from_generator(generator=train_datagen.flow_from_directory,
                                    args=[train_dir, (224, 224), 'categorical'],
                                    output_types=(tf.float32, tf.float32),
                                    output_shapes=([32, 224, 224, 3], [32, 1501]))
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(filters=64, kernel_size=2, padding='same', activation='relu', input_shape=(224, 224, 3)),
    tf.keras.layers.MaxPooling2D(pool_size=2),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Conv2D(filters=32, kernel_size=2, padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=2),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation=None),  # No activation on final dense layer
    tf.keras.layers.Lambda(lambda x: tf.math.l2_normalize(x, axis=1))  # L2 normalize embeddings
])
model.compile(optimizer=tf.keras.optimizers.Adam(0.001),
              loss=tfa.losses.TripletSemiHardLoss())
history = model.fit(ds, epochs=45, verbose=1, callbacks=None)
It's pretty much a verbatim copy other than the dataset.
Do I have to make a map function like ds.map(function)?
This is exactly the problem I ran into.
As stated in https://www.tensorflow.org/addons/api_docs/python/tfa/losses/TripletSemiHardLoss:
"We expect labels y_true to be provided as 1-D integer Tensor with shape [batch_size] of multi-class integer labels."
With class_mode='categorical', ImageDataGenerator produces a [batch_size, n_classes] one-hot tensor, which has to be converted before it can be fed into TripletSemiHardLoss.
I personally just made my own training loop instead of a single model.fit:
for e in range(EPOCHS):
    print('Epoch', e)
    for b in range(int(STEPS_PER_EPOCH)):
        batch = train_data_gen.next()
        x_batch = batch[0]
        y_batch = np.argmax(batch[1], axis=1)  # <- class labels: y_true as 1-D integers
        history = model.fit(x_batch, y_batch)  # one fit step on this batch
        print(e, b)
This does train the model; however, at the moment I'm struggling with the loss value, which jumps randomly between 0 and 1 at every step. It must be the gradients being lost. Looking into it.
Edit 1:
Actually, this works:
image_generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
train_data_gen = image_generator.flow_from_directory(directory=str(data_dir),
                                                     batch_size=BATCH_SIZE,
                                                     shuffle=True,
                                                     target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                     classes=list(CLASS_NAMES),
                                                     color_mode='grayscale',
                                                     class_mode='sparse')
model.fit(train_data_gen, epochs=10)
The magic was to use class_mode='sparse', which makes the generator yield the 1-D integer labels that TripletSemiHardLoss expects.
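To answer the ds.map question from the post directly: if you would rather keep the tf.data.Dataset built from the 'categorical' generator, a map that converts the one-hot labels to integer labels should also satisfy TripletSemiHardLoss (a sketch, assuming ds yields (images, one_hot_labels) batches as defined in the question):

import tensorflow as tf

# Convert [batch, 1501] one-hot labels into [batch] integer class ids for TripletSemiHardLoss.
ds_sparse = ds.map(lambda images, one_hot: (images, tf.argmax(one_hot, axis=-1)))

history = model.fit(ds_sparse, epochs=45, verbose=1)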
