I have been training a neural network to recognize the difference between a paper with handwriting and a paper with drawings. My images are all (3508, 2480) in size and I'm using a CNN for the task. The problem is that it is taking ages to train. I have 30,000 images belonging to 2 classes, separated into training and validation sets, so I have:
13650 Images of Handwritten Paragraphs for training
13650 Images of Drawings for training
1350 Images of Handwritten Paragraphs for validation
1250 Images of Drawings for validation
If you want to see my architecture, here is my code:
import tensorflow as tf
import os
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
from google.colab import drive
drive.mount('/content/drive')
l0 = tf.keras.layers.Conv2D(32, (60,60), activation='relu', input_shape=(438, 310, 1), name='input')
l1 = tf.keras.layers.Dropout(.3)
l2 = tf.keras.layers.BatchNormalization()
l3 = tf.keras.layers.MaxPool2D(pool_size=(2,2), padding='same')
l12 = tf.keras.layers.Flatten()
l16 = tf.keras.layers.Dense(32, activation='relu')
l17 = tf.keras.layers.Dropout(.5)
l18 = tf.keras.layers.BatchNormalization()
l22 = tf.keras.layers.Dense(1, activation='sigmoid', name='output')
# Assemble the layers above into the model that model.fit uses below
# (this step was missing from the snippet as posted)
model = tf.keras.Sequential([l0, l1, l2, l3, l12, l16, l17, l18, l22])
model.compile(loss='binary_crossentropy', optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001), metrics=['accuracy'])
from keras.preprocessing.image import ImageDataGenerator
trdata = ImageDataGenerator(rescale=1/255)
# class_mode="binary" matches the single sigmoid output unit
traindata = trdata.flow_from_directory("/content/drive/MyDrive/Sae/TesisProgra/DataSets/ParagraphsVsDrawings/Paste/0_Final/Training", target_size=(438, 310), color_mode="grayscale", batch_size=250, class_mode="binary")
valdata = ImageDataGenerator(rescale=1/255)
validationdata = valdata.flow_from_directory("/content/drive/MyDrive/Sae/TesisProgra/DataSets/ParagraphsVsDrawings/Paste/0_Final/Validation", target_size=(438, 310), color_mode="grayscale", batch_size=250, class_mode="binary")
from keras.callbacks import ModelCheckpoint, EarlyStopping
checkpoint = ModelCheckpoint("ParagraphsVsDrawings.h5", monitor='val_accuracy', verbose=1, save_best_only=True, save_weights_only=False, save_freq='epoch', mode='auto')
history = model.fit(traindata, validation_data=validationdata, validation_steps=10, epochs=20, verbose=True, callbacks=[checkpoint])
I'm using Google Colab Pro for the training, with the TPU and high-RAM options activated.
I have trained CNNs before, but they trained really fast. I don't know if it's because my images are too big; maybe I could try resizing them with Pillow, but I'm really lost at this point. I have been waiting 12 hours and it's still on the first epoch.
Your kernel size, 60 by 60, is quite big. Try a 3 by 3 or 5 by 5 kernel. Image size doesn't seem to be the problem, since you are already resizing from (3508, 2480) down to (438, 310).
Also notice that the number of weights you have is very, very large: around 24 million. This is because you are flattening a (189, 125, 32) array and your next layer (Dense) has 32 units, so that layer alone has 189 * 125 * 32 * 32 weights. That will take very, very long to train.
Try to add one or two more conv layers + pooling layers so that the number of weights when flattened is manageable.
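As a rough sketch (assuming the same 438x310 grayscale input from the question), comparing parameter counts with model.summary() makes the difference obvious:

import tensorflow as tf

# One conv block only: the Flatten -> Dense(32) connection dominates the parameter count
wide = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (60, 60), activation='relu', input_shape=(438, 310, 1)),
    tf.keras.layers.MaxPool2D((2, 2), padding='same'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation='relu')])
wide.summary()    # tens of millions of parameters in the Dense layer

# Two extra conv + pool stages shrink the flattened volume dramatically
narrow = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (5, 5), activation='relu', input_shape=(438, 310, 1)),
    tf.keras.layers.MaxPool2D((2, 2), padding='same'),
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu'),
    tf.keras.layers.MaxPool2D((2, 2), padding='same'),
    tf.keras.layers.Conv2D(8, (3, 3), activation='relu'),
    tf.keras.layers.MaxPool2D((2, 2), padding='same'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation='relu')])
narrow.summary()  # far fewer parameters at the Flatten -> Dense boundary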
For anyone who has the same problem, here is my final code for the CNN architecture, which runs much faster than the one I had previously. Credits to users Afif Al Mamun and ntlarry.
Code:
# Two convolutional blocks followed by a small dense head
lc0 = tf.keras.layers.Conv2D(64, (5,5), activation='relu', input_shape=(438, 310, 1), name='input')
lc1 = tf.keras.layers.Dropout(.3)
lc2 = tf.keras.layers.BatchNormalization()
lc3 = tf.keras.layers.MaxPool2D(pool_size=(2,2), padding='same')
lc4 = tf.keras.layers.Conv2D(8, (3,3), activation='relu')
lc5 = tf.keras.layers.Dropout(.3)
lc6 = tf.keras.layers.BatchNormalization()
lc7 = tf.keras.layers.MaxPool2D(pool_size=(4,80), padding='same')  # aggressive pooling keeps the flattened volume small
lf = tf.keras.layers.Flatten()
ld1 = tf.keras.layers.Dense(32, activation='relu')
ld2 = tf.keras.layers.Dropout(.5)
ld3 = tf.keras.layers.BatchNormalization()
lfinal = tf.keras.layers.Dense(1, activation='sigmoid', name='output')
model = tf.keras.Sequential([lc0, lc1, lc2, lc3, lc4, lc5, lc6, lc7, lf, ld1, ld2, ld3, lfinal], name="ParagraphIdentification")
model.compile(loss='binary_crossentropy', optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001), metrics=['accuracy'])
model.summary()
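To reuse the best epoch saved by the ModelCheckpoint callback shown earlier, you can reload the saved file later with load_model:

# Reload the best model saved by the ModelCheckpoint callback
best_model = tf.keras.models.load_model("ParagraphsVsDrawings.h5")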
I'm just starting out in ML and finally got my first CNN up and running :) except its accuracy is only slightly better than a random guess (~27% on four classes). I give the model a set of 2000 pictures of faces sorted into either 0, 90, 180, or 270 degrees of rotation. Below is my code:
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D
from keras.preprocessing.image import ImageDataGenerator
#import matplotlib.pyplot as plt
datagen = ImageDataGenerator()
train_it = datagen.flow_from_directory('firstThousandTransformed/', class_mode='categorical', batch_size=64, color_mode="grayscale", target_size=(64,64))
val_it = datagen.flow_from_directory('validation/', class_mode='categorical', batch_size=64, color_mode="grayscale", target_size=(64,64))
test_it = datagen.flow_from_directory('test/', class_mode='categorical', batch_size=64, color_mode='grayscale', target_size=(64,64))
imageInput = Input(shape=(64,64,1))
conv1 = Conv2D(128, kernel_size=8, activation='relu')(imageInput)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
conv2 = Conv2D(64, kernel_size=4, activation='relu')(pool1)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
conv3 = Conv2D(64, kernel_size=4, activation='relu')(pool2)
pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
flat = Flatten()(pool3)
hidden1 = Dense(10, activation='relu')(flat)
output = Dense(4, activation='softmax')(hidden1)
model = Model(inputs=imageInput, outputs=output)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(train_it, steps_per_epoch=16, validation_data=val_it, validation_steps=8)
loss = model.evaluate(test_it, steps=16)
_, accuracy = model.evaluate(train_it)
print('Accuracy: %.2f' % (accuracy*100))
print(model.summary())
The way I envisioned this network working was that the convolution layers might detect some hair or a chin in a certain place and be able to distinguish that hair or chin placement from another image. This clearly is not working. Could you give a noobie some advice? How can I make this better? How can I think about this problem? Am I using the wrong kind of layers? Do I need more pictures?
EDIT:
So I have been playing around with it a little bit by changing the kernel sizes (now they are 12, 8, and 4) and increasing the number of epochs to 20, and something crazy happened. When I ran the program, I got an accuracy of 99%!!
HOWEVER, when I ran it again to double-check, it went back to ~27%. What does this mean?
I've adapted a simple CNN from a tutorial on Analytics Vidhya.
The problem is that my accuracy on a holdout set is no better than random. I am training on ~8600 images each of cats and dogs, which should be enough data for a decent model, but accuracy on the test set is stuck at 49%. Is there a glaring omission in my code somewhere?
import os
import numpy as np
import keras
from keras.models import Sequential
from sklearn.model_selection import train_test_split
from datetime import datetime
from PIL import Image
from keras.utils.np_utils import to_categorical
from sklearn.utils import shuffle
def main():
    cat = os.listdir("train/cats")
    dog = os.listdir("train/dogs")
    filepath = "train/cats/"
    filepath2 = "train/dogs/"

    print("[INFO] Loading images of cats and dogs each...", datetime.now().time())
    #print("[INFO] Loading {} images of cats and dogs each...".format(num_images), datetime.now().time())
    images = []
    label = []
    for i in cat:
        image = Image.open(filepath + i)
        image_resized = image.resize((300, 300))
        images.append(image_resized)
        label.append(0)  # for cat images
    for i in dog:
        image = Image.open(filepath2 + i)
        image_resized = image.resize((300, 300))
        images.append(image_resized)
        label.append(1)  # for dog images

    images_full = np.array([np.array(x) for x in images])
    label = np.array(label)
    label = to_categorical(label)
    images_full, label = shuffle(images_full, label)

    print("[INFO] Splitting into train and test", datetime.now().time())
    (trainX, testX, trainY, testY) = train_test_split(images_full, label, test_size=0.25)

    filters = 10
    filtersize = (5, 5)
    epochs = 5
    batchsize = 32
    input_shape = (300, 300, 3)
    #input_shape = (30, 30, 3)

    print("[INFO] Designing model architecture...", datetime.now().time())
    model = Sequential()
    model.add(keras.layers.InputLayer(input_shape=input_shape))
    model.add(keras.layers.convolutional.Conv2D(filters, filtersize, strides=(1, 1), padding='same',
                                                data_format="channels_last", activation='relu'))
    model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(units=2, input_dim=50, activation='softmax'))
    #model.add(keras.layers.Dense(units=2, input_dim=5, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

    print("[INFO] Fitting model...", datetime.now().time())
    model.fit(trainX, trainY, epochs=epochs, batch_size=batchsize, validation_split=0.3)
    model.summary()

    print("[INFO] Evaluating on test set...", datetime.now().time())
    eval_res = model.evaluate(testX, testY)
    print(eval_res)

if __name__ == "__main__":
    main()
For me the problem comes from the size of your network: you have only one Conv2D layer, with just 10 filters. That is far too small to learn a deep representation of your images.
Try increasing this a lot by using blocks from common architectures like VGGNet!
Example of a block:
x = Conv2D(32, (3, 3) , padding='SAME')(model_input)
x = LeakyReLU(alpha=0.3)(x)
x = BatchNormalization()(x)
x = Conv2D(32, (3, 3) , padding='SAME')(x)
x = LeakyReLU(alpha=0.3)(x)
x = BatchNormalization()(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
x = Dropout(0.25)(x)
You need to try multiple blocks like that, increasing the number of filters as you go deeper, in order to capture deeper features; see the sketch below.
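For instance, a second block stacked on top of the first with double the filters might look like this (a sketch in the same style; the filter counts are illustrative):

x = Conv2D(64, (3, 3), padding='SAME')(x)   # 32 -> 64 filters in the second block
x = LeakyReLU(alpha=0.3)(x)
x = BatchNormalization()(x)
x = Conv2D(64, (3, 3), padding='SAME')(x)
x = LeakyReLU(alpha=0.3)(x)
x = BatchNormalization()(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
x = Dropout(0.25)(x)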
Another thing: you don't need to specify the input_dim of your Dense layer; Keras automatically takes care of that!
Last but not least, you need a fully connected network (not just a single layer) in order to correctly classify your images.
For example:
x = Flatten()(x)
x = Dense(256)(x)
x = LeakyReLU(alpha=0.3)(x)
x = Dense(128)(x)
x = LeakyReLU(alpha=0.3)(x)
x = Dense(2)(x)
x = Activation('softmax')(x)
Try those changes and keep me posted!
Update after OP's questions
Images are complex; they contain a lot of information, like shapes, edges, colors, etc.
In order to capture the maximum amount of information, you need to pass through multiple convolutions, which will learn the different aspects of the image.
Imagine, for example, that the first convolution learns to recognise squares, the second to recognise circles, the third to recognise edges, etc.
As for my second point, the final fully connected part acts as a classifier: the conv network outputs a vector that "represents" a dog or a cat, and you now need to learn that this kind of vector belongs to one class or the other.
Directly feeding that vector into the final layer is not enough to learn this representation.
Is that clearer?
Last update, for OP's second comment
Here are the two ways of defining a Keras model; both output the same thing!
model_input = Input(shape=(200, 1))
x = Dense(32)(model_input)
x = Dense(16)(x)
x = Activation('relu')(x)
model = Model(inputs=model_input, outputs=x)
model = Sequential()
model.add(Dense(32, input_shape=(200, 1)))
model.add(Dense(16, activation = 'relu'))
Example of architecture:
model = Sequential()
model.add(keras.layers.InputLayer(input_shape=input_shape))
model.add(keras.layers.convolutional.Conv2D(32, (3,3), strides=(2, 2), padding='same', activation='relu'))
model.add(keras.layers.convolutional.Conv2D(32, (3,3), strides=(2, 2), padding='same', activation='relu'))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(keras.layers.convolutional.Conv2D(64, (3,3), strides=(2, 2), padding='same', activation='relu'))
model.add(keras.layers.convolutional.Conv2D(64, (3,3), strides=(2, 2), padding='same', activation='relu'))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
Don't forget to normalize your data before feeding it into your network.
A simple images_full = images_full / 255.0 on your data can boost your accuracy a lot.
Try it with grayscale images too; it's more computationally efficient.
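For the grayscale route, a minimal sketch (assuming the same PIL loading loop as in the question) would be:

# Hypothetical tweak to the loading loop: convert to single-channel
# grayscale with PIL before resizing ('L' = 8-bit grayscale)
image = Image.open(filepath + i).convert('L')
image_resized = image.resize((300, 300))
# ...and change the model's input_shape to (300, 300, 1)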
I am trying to build a multi-class image classifier using a Keras CNN. My input images are (256,256) pixels, but I used (128,128) instead, since it would take a lot of time to process (256,256)-pixel images. But when I test the network on the test set I barely get 50% accuracy, although I get 97% accuracy during training. I think there is a problem with the filters or the number of layers. Can anyone explain how to improve the efficiency of my CNN-based classifier?
I tried changing the number of epochs and used an input shape of (64,64), but these produced only small effects.
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import Dropout
import os
classifier = Sequential()
classifier.add(Conv2D(64,(3,3), input_shape = (128,128,3), activation = "relu"))
classifier.add(Conv2D(64,(3,3), input_shape = (128,128,3), activation = "relu"))
classifier.add(Conv2D(32,(3,3), input_shape = (128,128,3), activation = "relu"))
classifier.add(Conv2D(32,(3,3), input_shape = (128,128,3), activation = "relu"))
classifier.add(MaxPooling2D(pool_size = (2,2)))
classifier.add(Flatten())
classifier.add(Dropout(0.5))
classifier.add(Dense(units= 64, activation = "relu"))
classifier.add(Dense(units= 6, activation = "softmax"))
classifier.compile(optimizer = "adam", loss = "categorical_crossentropy", metrics = ['accuracy'])
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory("/home/user/Documents/final_year_project/dataset/training",
                                                 target_size = (128,128),
                                                 batch_size = 50,
                                                 class_mode="categorical")
test_set = test_datagen.flow_from_directory(
    "/home/user/Documents/final_year_project/dataset/testing/",
    target_size = (128,128),
    batch_size = 32,
    class_mode="categorical")
from IPython.display import display
from PIL import Image
classifier.fit_generator(training_set, steps_per_epoch=98, epochs=18)
target_dir = '/home/user/Documents/model'
if not os.path.exists(target_dir):
os.mkdir(target_dir)
classifier.save('/home/user/Documents/model/model.h5')
classifier.save_weights('/home/user/Documents/model/weights.h5')
print("Training Completed!!")
There are a couple of obvious improvements (to me) that you can do:
Change the batch size to 2 ** n (e.g. 2 to the power of 5: batch_size = 32).
input_shape is reserved for your input layer only (first convolutional layer).
classifier = Sequential()
# Add extraction layers.
classifier.add(Conv2D(64, (3,3), input_shape = (128,128,3),
                      activation="relu"))
classifier.add(Conv2D(64, (3,3), activation="relu"))
classifier.add(MaxPooling2D(pool_size = (2,2)))  # <= this may help as well
classifier.add(Conv2D(32, (3,3), activation="relu"))
classifier.add(Conv2D(32, (3,3), activation="relu"))
classifier.add(MaxPooling2D(pool_size = (2,2)))
# Add classifier layers.
classifier.add(Flatten())
classifier.add(Dropout(0.5))  # might be too big, can try 0.2
classifier.add(Dense(units=64, activation="relu"))
classifier.add(Dense(units=6, activation="softmax"))
classifier.compile(optimizer="adam", loss="categorical_crossentropy",
                   metrics = ['accuracy'])
MOST IMPORTANT: Add validation data to your training. The training:validation ratio is roughly 80:20.
fit_generator(
generator, # *
steps_per_epoch=None, # **
epochs=20,
verbose=1,
callbacks=None,
validation_data=None, # same format as training generator *
validation_steps=None, # same format as steps_per_epoch **
class_weight=None,
max_queue_size=10,
workers=1,
use_multiprocessing=False,
shuffle=True,
initial_epoch=0
)
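A concrete call with the generators from the question might look like this (validation_set is a hypothetical generator built the same way as training_set, from a held-out directory holding roughly 20% of the data):

classifier.fit_generator(training_set,
                         steps_per_epoch=98,
                         epochs=18,
                         validation_data=validation_set,  # hypothetical held-out generator
                         validation_steps=25)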
I am trying to train a deep neural network using transfer learning in Keras with TensorFlow. There are different ways to do that. If your data is small, you can afford to compute features with the pre-trained model for the entire dataset and then use those features to train and test a small network; this is good because you don't need to compute those features for each batch at each epoch. However, if the data is large, it is impossible to compute features for the entire dataset, and in this case we use ImageDataGenerator, flow_from_directory and fit_generator. Here the features are computed each time for each batch at each epoch, which makes things much slower. I was assuming that both approaches produce similar results in terms of accuracy and loss. The problem is that I took a small dataset, tried both approaches, and got completely different results. I would appreciate it if someone could tell me whether something is wrong in the provided code and/or why I am getting different results.
Approach for a large data-set:
from keras.applications.inception_v3 import InceptionV3, preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Model
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dense

datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
train_generator = datagen.flow_from_directory('data/train',
                                              class_mode='categorical',
                                              batch_size=64, ...)
valid_generator = datagen.flow_from_directory('data/valid',
                                              class_mode='categorical',
                                              batch_size=64, ...)

base_model = InceptionV3(weights='imagenet', include_top=False)
x = base_model.output
x = Conv2D(filters=128, kernel_size=(2,2))(x)
x = MaxPooling2D()(x)
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(2, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)

for layer in base_model.layers:
    layer.trainable = False

model.compile(optimizer='rmsprop', loss='categorical_crossentropy', ...)
model.fit_generator(generator=train_generator,
                    steps_per_epoch=len(train_generator),
                    validation_data=valid_generator,
                    validation_steps=len(valid_generator),
                    ...)
Approach for a small data-set:
from keras.applications.inception_v3 import InceptionV3, preprocess_input
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dense
from keras.utils import np_utils

base_model = InceptionV3(weights='imagenet', include_top=False)
train_features = base_model.predict(preprocess_input(train_data))
valid_features = base_model.predict(preprocess_input(valid_data))

model = Sequential()
model.add(Conv2D(filters=128, kernel_size=(2,2),
                 input_shape=(train_features.shape[1],
                              train_features.shape[2],
                              train_features.shape[3])))
model.add(MaxPooling2D())
model.add(GlobalAveragePooling2D())
model.add(Dense(1024, activation='relu'))
model.add(Dense(2, activation='softmax'))

model.compile(optimizer='rmsprop', loss='categorical_crossentropy', ...)
model.fit(train_features, np_utils.to_categorical(y_train, 2),
          validation_data=(valid_features, np_utils.to_categorical(y_valid, 2)),
          batch_size=64, ...)
I am currently developing a convolutional neural network in Keras with the TensorFlow backend that will be used to differentiate between a passed and a failed indicator. The difference between the two (determining whether it is a pass or a fail) is a small colour change within a tube. However, when I train the convolutional neural network on the images (approximately 1500 pictures of each), the network seems to always predict "pass" regardless of the image. My guess is that this is due to the vast similarity between the two classes, but I am not sure why it is unable to detect this colour change as a differentiating feature.
The code that I am currently using to build the classifier is below, to provide a reference for where the classifier may be picking up such a bias.
# Imports from Keras Library to build Network
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Activation
from keras.callbacks import ModelCheckpoint
from keras.layers import BatchNormalization
# Initialising the CNN as a sequential network
classifier = Sequential()
# Addition of convolutional layer
classifier.add(Conv2D(32, kernel_size=(3, 3), input_shape = (356, 356, 3)))
# Adding dropout to prevent over-reliance on certain nodes
# Adding a second/third/fourth convolutional/pooling/dropout layer
classifier.add(BatchNormalization())
classifier.add(Activation("relu"))
classifier.add(Conv2D(32, (3, 3)))
classifier.add(BatchNormalization())
classifier.add(Activation("relu"))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(Dropout(0.25))
classifier.add(Conv2D(32, (3, 3)))
classifier.add(BatchNormalization())
classifier.add(Activation("relu"))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(Dropout(0.25))
classifier.add(Conv2D(64, (3, 3)))
classifier.add(BatchNormalization())
classifier.add(Activation("relu"))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(Dropout(0.25))
# Flattening Layer
classifier.add(Flatten())
# Full connection using dense layers
classifier.add(Dense(units = 128))
classifier.add(BatchNormalization())
classifier.add(Activation("relu"))
classifier.add(Dense(units = 2, activation = 'softmax'))
# Compiling the CNN
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
classifier.summary()
# Fitting the CNN to the images
from keras.preprocessing.image import ImageDataGenerator
# Training image generator (causes variation in how images may appear when trained upon)
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.4,
                                   zoom_range = 0.4,
                                   horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
# Creation of training set
training_set = train_datagen.flow_from_directory('dataset/TrainingSet',
                                                 target_size = (356, 356),
                                                 batch_size = 32,
                                                 class_mode = 'categorical',
                                                 shuffle = True)
# Creation of test set
test_set = test_datagen.flow_from_directory('dataset/TestSet',
                                            target_size = (356, 356),
                                            batch_size = 32,
                                            class_mode = 'categorical',
                                            shuffle = True)
caller = ModelCheckpoint('/Users/anishkhanna/Documents/Work/BI Test/BI Models/part3.weights.{epoch:02d}-{val_loss:.2f}.hdf5', monitor='val_loss', verbose=0, save_best_only=False, save_weights_only=False, mode='auto', period=1)
# Training the model based on above set
# Can also be improved with more images
classifier.fit_generator(training_set,
                         steps_per_epoch = 200,
                         epochs = 200,
                         validation_data = test_set,
                         validation_steps = 15,
                         shuffle = True,
                         callbacks = [caller])
# Creates an HDF5 file to save the information of the model so it can be used later without retraining
classifier.save('BI_Test_Classifier_model.h5')
# Deletes the existing model
del classifier
If there are some improvements to the model that I could make or suggestions to it would be much appreciated.
If your distinguishing feature is mainly the colour, you can pre-process the images to help the neural network. In this case, you can convert RGB into Hue-Saturation-Value (HSV) and use, for example, just the Hue channel, which contains information about the colour of each pixel and ignores shading etc. Here is a post on that, and you can use the conversion as a preprocessing_function for ImageDataGenerator.
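A minimal sketch of such a preprocessing_function (my own assumption: using matplotlib's rgb_to_hsv, and repeating the hue channel so the output shape matches the input, since Keras requires the function to return a tensor of the same shape):

import numpy as np
import matplotlib.colors as mcolors
from keras.preprocessing.image import ImageDataGenerator

def rgb_to_hue(image):
    # rgb_to_hsv expects floats in [0, 1]; ImageDataGenerator supplies 0-255 arrays
    hsv = mcolors.rgb_to_hsv(image / 255.0)
    hue = hsv[..., 0:1]                # keep only the hue channel
    return np.repeat(hue, 3, axis=-1)  # repeat so the output shape matches the input

# No rescale here, since rgb_to_hue already returns values in [0, 1]
train_datagen = ImageDataGenerator(preprocessing_function=rgb_to_hue,
                                   shear_range=0.4,
                                   zoom_range=0.4,
                                   horizontal_flip=True)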