Keras fit_generator is too slow during image processing - python

I'm building a CNN to classify a picture as a dog or a cat, using Keras. My training set has 8000 images, and when I run the code each epoch takes around 30 minutes to complete. The total number of epochs is 25. How can I make the code run faster? By the way, I have 8 GB of RAM.
# Part 1 - Building the CNN
# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
# Initialising the CNN
classifier = Sequential()
# Step 1 - Convolution
classifier.add(Conv2D(32, (3, 3), input_shape=(64, 64, 3), activation='relu'))
# Step 2 - Pooling
classifier.add(MaxPooling2D(pool_size=(2, 2)))
# Step 3 - Flattening
classifier.add(Flatten())
# Step 4 - Full connection
classifier.add(Dense(units=128, activation='relu'))
classifier.add(Dense(units=1, activation='sigmoid'))
# Compiling the CNN
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Part 2 - Fitting the CNN to the images
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
training_set = train_datagen.flow_from_directory(
'dataset/training_set',
target_size=(64, 64),
batch_size=32,
class_mode='binary')
test_set = test_datagen.flow_from_directory(
'dataset/test_set',
target_size=(64, 64),
batch_size=32,
class_mode='binary')
classifier.fit_generator(
training_set,
steps_per_epoch=8000,
epochs=25,
validation_data=test_set,
validation_steps=2000)

Related

Input ran out of Data error but the data is there

Good day. I'm trying to learn CNNs and ran into an issue while running the code below.
from tensorflow.keras.layers import Flatten
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Convolution2D
from tensorflow.keras.layers import MaxPooling2D
import pandas as pd
import numpy as np
import matplotlib.pyplot
%matplotlib inline
model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=(64, 64, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Convolution2D(32, 3, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(units = 128, activation = 'relu'))
model.add(Dense(units = 1, activation = 'sigmoid'))
model.compile(optimizer = 'rmsprop', loss='mse', metrics=['accuracy'])
from tensorflow.keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(
rescale = 1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
training_set = train_datagen.flow_from_directory(
r'C:\Users\Raj Mulati\Downloads\Dev\Machine Learning A-Z New\Part 8 - Deep Learning\Section 40 - Convolutional Neural Networks (CNN)\dataset\training_set',
target_size=(64, 64),
batch_size=32,
class_mode='binary')
test_set = test_datagen.flow_from_directory(
r'C:\\Users\Raj Mulati\\Downloads\\Dev\\Machine Learning A-Z New\Part 8 - Deep Learning\\Section 40 - Convolutional Neural Networks (CNN)\\dataset\\test_set',
target_size=(64, 64),
batch_size=32,
class_mode='binary')
model.fit_generator(
training_set,
steps_per_epoch=8000,
epochs=25,
validation_data=test_set,
validation_steps=2000
)
The error I got was:
Found 8000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.
WARNING:tensorflow:sample_weight modes were coerced from
...
to
['...']
WARNING:tensorflow:sample_weight modes were coerced from
...
to
['...']
Train for 8000 steps, validate for 2000 steps
Epoch 1/25
250/8000 [..............................] - ETA: 14:37 - loss: 0.2485 - accuracy: 0.5340WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 200000 batches). You may need to use the repeat() function when building your dataset.
<tensorflow.python.keras.callbacks.History at 0x234d9fec3c8>
One step takes a complete batch of images, i.e. if your batch_size is 32, you run out of data after 250 steps (250 * 32 = 8000). Set your steps_per_epoch and validation_steps like this:
model.fit_generator(
training_set,
steps_per_epoch=8000//32,
epochs=25,
validation_data=test_set,
validation_steps=2000//32
)
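The arithmetic above can be wrapped in a small helper. This is only a sketch: steps_for is a hypothetical name, not part of Keras, and it rounds up so that a final partial batch still counts as a step, whereas the floor division above drops it.

```python
import math

def steps_for(num_samples, batch_size):
    """Number of batches needed to see every sample once (hypothetical helper)."""
    return math.ceil(num_samples / batch_size)

train_steps = steps_for(8000, 32)  # 8000 / 32 divides evenly
val_steps = steps_for(2000, 32)    # 2000 / 32 = 62.5, so this is rounded up
print(train_steps, val_steps)
```

With these values the generator is asked for exactly one pass over the data per epoch, instead of 8000 batches of 32 images.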

How to set filters for convolutional neural network

I am trying to build a multi-class image classifier using a Keras CNN. My input images are (256,256) pixels, but I used (128,128) instead, since (256,256) images take a long time to process. When I test the network on the test set I barely get 50% accuracy, although I get 97% accuracy during training. I think there is a problem with the filters or the number of layers. Can anyone explain how to improve my CNN-based classifier?
I tried changing the number of epochs and using an input shape of (64,64), but these had only small effects.
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import Dropout
import os
classifier = Sequential()
classifier.add(Conv2D(64,(3,3), input_shape = (128,128,3), activation = "relu"))
classifier.add(Conv2D(64,(3,3), input_shape = (128,128,3), activation = "relu"))
classifier.add(Conv2D(32,(3,3), input_shape = (128,128,3), activation = "relu"))
classifier.add(Conv2D(32,(3,3), input_shape = (128,128,3), activation = "relu"))
classifier.add(MaxPooling2D(pool_size = (2,2)))
classifier.add(Flatten())
classifier.add(Dropout(0.5))
classifier.add(Dense(units= 64, activation = "relu"))
classifier.add(Dense(units= 6, activation = "softmax"))
classifier.compile(optimizer = "adam", loss = "categorical_crossentropy", metrics = ['accuracy'])
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory("/home/user/Documents/final_year_project/dataset/training",
target_size = (128,128),
batch_size = 50,
class_mode="categorical")
test_set = test_datagen.flow_from_directory(
"/home/user/Documents/final_year_project/dataset/testing/",
target_size = (128,128),
batch_size = 32,
class_mode="categorical")
from IPython.display import display
from PIL import Image
classifier.fit_generator(training_set, steps_per_epoch=98, epochs=18)
target_dir = '/home/user/Documents/model'
if not os.path.exists(target_dir):
    os.mkdir(target_dir)
classifier.save('/home/user/Documents/model/model.h5')
classifier.save_weights('/home/user/Documents/model/weights.h5')
print("Training Completed!!")
There are a couple of obvious improvements (to me) that you can make:
Change the batch size to a power of two, 2 ** n (e.g. 2 to the power of 5: batch_size = 32).
input_shape is reserved for your input layer only (the first convolutional layer).
classifier = Sequential()
# Add extraction layers.
classifier.add(Conv2D(64,(3,3), input_shape = (128,128,3),
activation="relu"))
classifier.add(Conv2D(64,(3,3), activation="relu"))
classifier.add(MaxPooling2D(pool_size = (2,2))) # <= this may help as well
classifier.add(Conv2D(32,(3,3), activation="relu"))
classifier.add(Conv2D(32,(3,3), activation="relu"))
classifier.add(MaxPooling2D(pool_size = (2,2)))
# Add classifier layers.
classifier.add(Flatten())
classifier.add(Dropout(0.5)) # might be too big, can try 0.2
classifier.add(Dense(units=64, activation="relu"))
classifier.add(Dense(units=6, activation="softmax"))
classifier.compile(optimizer="adam", loss="categorical_crossentropy",
metrics = ['accuracy'])
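To see why the extra pooling layer may help, one can count how many features reach the first Dense layer in each version. The sketch below assumes the Keras defaults used in the code (3x3 convolutions with 'valid' padding and stride 1, 2x2 max pooling); flattened_size is a hypothetical helper written for this comparison.

```python
def flattened_size(input_size, layers):
    """Features after Flatten for 3x3 'valid' convolutions and 2x2 pooling."""
    size, channels = input_size, 3
    for kind, filters in layers:
        if kind == "conv":
            # A 3x3 'valid' convolution trims one pixel from each border.
            size, channels = size - 2, filters
        elif kind == "pool":
            # 2x2 max pooling halves the spatial size (rounding down).
            size //= 2
    return size * size * channels

original = [("conv", 64), ("conv", 64), ("conv", 32), ("conv", 32), ("pool", None)]
revised = [("conv", 64), ("conv", 64), ("pool", None),
           ("conv", 32), ("conv", 32), ("pool", None)]
print(flattened_size(128, original))  # 60 * 60 * 32 = 115200
print(flattened_size(128, revised))   # 29 * 29 * 32 = 26912
```

The revised stack feeds roughly four times fewer features into the Dense layer, which means fewer weights that can overfit the training set.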
MOST IMPORTANT: Add validation data to your training. The training:validation ratio is roughly 80:20.
fit_generator(
generator, # *
steps_per_epoch=None, # **
epochs=20,
verbose=1,
callbacks=None,
validation_data=None, # same format as training generator *
validation_steps=None, # same format as steps_per_epoch **
class_weight=None,
max_queue_size=10,
workers=1,
use_multiprocessing=False,
shuffle=True,
initial_epoch=0
)
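As a rough illustration of the 80:20 ratio, the sketch below splits a list of file names into training and validation parts. split_files and the file names are made up for the example; in practice the split is usually done per class so that both parts keep the same class mix.

```python
import random

def split_files(files, val_fraction=0.2, seed=0):
    """Shuffle a copy of `files` and split it into (train, validation) lists."""
    shuffled = list(files)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]

# Hypothetical file names, only for illustration.
files = ["img_%03d.jpg" % i for i in range(100)]
train_files, val_files = split_files(files)
print(len(train_files), len(val_files))
```

The validation part would then go into its own directory and be passed to fit_generator through validation_data.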

How to train a cnn with two classes with different frequency?

I am training a simple Convolutional Neural Network (CNN) which should perform a binary classification. The package I am using is keras.
What I need is my training set to be unbalanced. For example, one of the classes should be trained with 900 images, and the other one with only 300 images.
The code I am using is the following:
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
classifier = Sequential()
classifier.add(Conv2D(32, (3, 3),
input_shape=(64, 64, 3),
activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Flatten())
classifier.add(Dense(units=128, activation='relu'))
classifier.add(Dense(units=1, activation='sigmoid'))
classifier.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
training_set = train_datagen.flow_from_directory('dataset/training_set',
target_size=(64, 64),
batch_size=32,
class_mode='binary')
test_set = test_datagen.flow_from_directory('dataset/test_set',
target_size=(64, 64),
batch_size=32,
class_mode='binary')
classifier.fit_generator(training_set,
steps_per_epoch=1200,
epochs=30,
validation_data=test_set,
validation_steps=50)
Right now the model is being trained with a batch_size of 32.
I am guessing that this means that it takes 16 training examples from one of the classes and 16 from the other?
What I need is to take 24 training examples from one of the classes and 8 examples from the other.
Probably, I should amend the flow_from_directory() function concerning the training data set in some way. Unfortunately, there is nothing connected to that in the keras documentation.
Do you have any suggestions?
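As far as I know, flow_from_directory has no option for fixing the number of samples per class in a batch. One possible approach, sketched below under the assumption that the file paths of each class are available as two lists, is to draw every batch by hand. fixed_ratio_batches and the file names are hypothetical, and loading the images and feeding them to the model is left out.

```python
import random

def fixed_ratio_batches(major, minor, n_major=24, n_minor=8, seed=0):
    """Yield (path, label) batches with a fixed number of samples per class."""
    rng = random.Random(seed)
    while True:
        batch = [(p, 0) for p in rng.sample(major, n_major)]
        batch += [(p, 1) for p in rng.sample(minor, n_minor)]
        rng.shuffle(batch)  # mix the two classes inside the batch
        yield batch

# Hypothetical file lists: 900 images of one class, 300 of the other.
major = ["a_%d.jpg" % i for i in range(900)]
minor = ["b_%d.jpg" % i for i in range(300)]
batch = next(fixed_ratio_batches(major, minor))
print(len(batch), sum(label for _, label in batch))
```

Every batch drawn this way holds 24 samples labelled 0 and 8 labelled 1, matching the 900:300 ratio of the training set.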

Python Keras - CNN stuck on epoch 1

from keras import *
import os
import numpy as np
from keras.models import Sequential
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras import optimizers
#from parser import load_data # data loading
# Collecting data:
img_width, img_height = 150, 150
training_data_dir = "train"
testing_data_dir = "test"
# used to rescale the pixel values from [0, 255] to [0, 1] interval
datagen = ImageDataGenerator(rescale=1./255)
# automagically retrieve images and their classes for train and validation sets
train_generator = datagen.flow_from_directory(
training_data_dir,
target_size=(img_width, img_height),
batch_size=16,
class_mode='binary')
test_generator = datagen.flow_from_directory(
testing_data_dir,
target_size=(img_width, img_height),
batch_size=32,
class_mode='binary')
# Building model:
model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=(img_width, img_height,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss="binary_crossentropy",
optimizer="rmsprop",
metrics=["accuracy"])
# Training model:
nb_epoch = 30
nb_train_samples = 2048
nb_validation_samples = 832
model.fit_generator(
train_generator,
samples_per_epoch=nb_train_samples,
nb_epoch=nb_epoch,
validation_data=test_generator,
nb_val_samples=nb_validation_samples)
This is my code for a CNN which is trained using images from the folders train and test. But whenever I try training it, the program seems to get stuck at epoch 1/30. I left it running overnight for 8 hours and it didn't move along at all. Are there any fixes I could try?
Update:
The output of my code currently is:
Using TensorFlow backend.
Found 0 images belonging to 0 classes.
Found 0 images belonging to 0 classes.
image_classifiy.py:78: UserWarning: Update your fit_generator call to the Keras 2 API: fit_generator(<keras_pre..., epochs=30, validation_data=<keras_pre..., validation_steps=832, steps_per_epoch=128)
steps_per_epoch=128)
Epoch 1/30
The message "Found 0 images belonging to 0 classes" shows that the subdirectories for each class have not been created. Keras expects one folder per class, with that class's images inside it. So make sure you create a subdirectory for each class inside the train and test folders.
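A quick way to check the layout before training is to count the image files in each class folder. The sketch below is an illustration only: count_images_per_class is a hypothetical helper, and the cats/dogs folders are created in a temporary directory just to show the expected structure.

```python
import os
import tempfile

def count_images_per_class(root, exts=(".jpg", ".jpeg", ".png")):
    """Map each class subdirectory of `root` to its number of image files."""
    counts = {}
    for name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, name)
        if os.path.isdir(class_dir):
            counts[name] = sum(
                1 for f in os.listdir(class_dir) if f.lower().endswith(exts)
            )
    return counts

# Build a throwaway layout: <root>/cats/*.jpg and <root>/dogs/*.jpg
root = tempfile.mkdtemp()
for cls, n in (("cats", 3), ("dogs", 2)):
    os.makedirs(os.path.join(root, cls))
    for i in range(n):
        open(os.path.join(root, cls, "%d.jpg" % i), "w").close()

counts = count_images_per_class(root)
print(counts)
```

An empty result for the train or test folder corresponds to the "Found 0 images belonging to 0 classes" message.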

Class labels absent from Keras Predictions

I am trying to solve the Cats vs Dogs problem using Keras. Here is the model I am using.
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
from keras import regularizers
from keras.utils import plot_model
img_width, img_height = 150, 150
train_data_dir = 'kateVSdoge/train'
validation_data_dir = 'kateVSdoge/validation'
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 50
batch_size = 16
if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1,kernel_regularizer=regularizers.l2(0.01),
activity_regularizer=regularizers.l1(0.01)))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
train_datagen = ImageDataGenerator(
rescale=1. / 255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
validation_data_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode='binary',
)
xm=model.fit_generator(
train_generator,
steps_per_epoch=nb_train_samples // batch_size,
epochs=epochs,
validation_data=validation_generator,
validation_steps=nb_validation_samples // batch_size)
model.save_weights('first_try3.h5')
model_json=model.to_json()
with open("model3.json","w+") as json_file:
    json_file.write(model_json)
plot_model(model,to_file="model.jpeg")
The model trains well; accuracy at the end is 0.79-0.80. But when I load the model in a predictor script and predict using model.predict_generator(), I seem to be doing something wrong, as I can't get the class names in the prediction. I have tried .predict() and .predict_proba() without any success.
Here is the predictor script:
from keras.models import Sequential, model_from_json
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
import numpy as np
p_model = Sequential();
jsonfile = open('model3.json','r')
model_json = jsonfile.read()
p_model = model_from_json(model_json)
p_model.load_weights('first_try3.h5')
p_model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
img = image.load_img('do.jpg', target_size=(150,150))
x=image.img_to_array(img)
x=x.reshape((1,)+x.shape)
test_datagen = ImageDataGenerator(rescale=1. /255)
m=test_datagen.flow(x,batch_size=1)
preds = p_model.predict_generator(m,1,verbose=1)
print(preds)
I also observed an interesting thing: the image doesn't seem to be rescaled.
I printed out x and m.x; both matrices seem to be equal, and the values are not transformed to lie between 0 and 1.
Here is the output for a cat and a dog's picture respectively.
(myenv)link#zero-VirtualBox:~/myenv/keras_app$ python predictor.py
Using Theano backend.
1/1 [==============================] - 0s
[[ 0.29857877]]
(myenv)link#zero-VirtualBox:~/myenv/keras_app$ python predictor.py
Using Theano backend.
1/1 [==============================] - 0s
[[ 0.77536112]]
I have used the advice given here https://stackoverflow.com/a/41833076/4159447 to introduce regularizers and rescale.
What am I doing wrong? All I want is to get the cat and dog labels against their scores.
The only wrong thing is to expect class names from a classifier. The classifier doesn't know the class names, that is a post-processing step, something like:
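preds = p_model.predict_generator(m,1,verbose=1)[0][0]
# The outputs in the question are ~0.30 for the cat and ~0.78 for the dog,
# so scores above the threshold correspond to "dog".
if preds > 0.5:
    output = "dog"
else:
    output = "cat"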
Note that 0.5 might not be the best threshold, you can also take the class with biggest probability (p vs 1 - p).
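The post-processing step can be written so that it does not depend on which class got index 1. The sketch below assumes a class_indices mapping like the one exposed by the training generator (train_generator.class_indices); the mapping and folder names here are example values, and label_for is a hypothetical helper.

```python
def label_for(score, class_indices, threshold=0.5):
    """Turn a sigmoid score into a (class name, probability) pair."""
    index_to_name = {index: name for name, index in class_indices.items()}
    if score > threshold:
        return index_to_name[1], score
    return index_to_name[0], 1.0 - score

# Example mapping; flow_from_directory assigns indices in alphabetical order.
class_indices = {"cats": 0, "dogs": 1}
for score in (0.29857877, 0.77536112):
    print(label_for(score, class_indices))
```

The second element of the pair is the probability of the returned class, i.e. p or 1 - p, whichever is larger at the default threshold.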
