How do I interpret results from Keras predict_generator? - python

I am performing binary classification of data. I am using predict_generator to obtain the classification results. The input to predict_generator is 44 examples: 22 positive, 22 negative. The following is the output obtained:
[9.98187363e-01 1.81267178e-03]
[5.02341951e-04 9.99497652e-01]
[8.41189444e-01 1.58810586e-01]
[7.26610771e-04 9.99273360e-01]
[9.96826649e-01 3.17337317e-03]
[8.83008718e-01 1.16991334e-01]
[3.84130690e-04 9.99615788e-01]
[8.65039527e-01 1.34960532e-01]
[1.78014021e-03 9.98219788e-01]
[9.96107757e-01 3.89222591e-03]
[6.16264821e-04 9.99383688e-01]
[2.98170745e-03 9.97018337e-01]
[9.92357790e-01 7.64221745e-03]
[9.93237853e-01 6.76209433e-03]
[9.98248339e-01 1.75163767e-03]
[1.17816392e-03 9.98821795e-01]
[9.84322488e-01 1.56775210e-02]
[3.11790430e-03 9.96882081e-01]
[4.62388212e-04 9.99537587e-01]
[1.42699364e-03 9.98572946e-01]
[9.43281949e-01 5.67180961e-02]
[9.98008907e-01 1.99115812e-03]
[4.12312744e-04 9.99587715e-01]
[9.29474115e-01 7.05258474e-02]
[3.37766513e-04 9.99662280e-01]
[1.75693433e-03 9.98243093e-01]
[9.92154586e-04 9.99007881e-01]
[1.87152205e-03 9.98128474e-01]
[9.20654461e-02 9.07934546e-01]
[9.95722532e-01 4.27750358e-03]
[9.96877313e-01 3.12273414e-03]
[9.87601459e-01 1.23985587e-02]
[1.11398198e-01 8.88601840e-01]
[1.48968585e-02 9.85103130e-01]
[6.73048152e-03 9.93269503e-01]
[1.65761902e-03 9.98342395e-01]
[9.94634032e-01 5.36595425e-03]
[5.00697970e-01 4.99302000e-01]
[1.65578525e-03 9.98344183e-01]
[9.68859911e-01 3.11401486e-02]
CODE:
from keras.applications import Xception
from keras.models import Model
from keras.layers import Dense, Input, Dropout
from keras.optimizers import Nadam
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.preprocessing.image import ImageDataGenerator
img_height = 299
img_width = 299
no_of_frames = 15
channels = 3
no_of_epochs = 50
batch_size_value = 60
cnn_base = Xception(input_shape=(img_width, img_height, channels),
weights="imagenet", include_top=False, pooling='avg')
cnn_base.trainable = False
hidden_layer_1 = Dense(activation="relu", units=1024)(cnn_base.output)
drop_layer=Dropout(0.2)(hidden_layer_1)
hidden_layer_2 = Dense(activation="relu", units=512)(drop_layer)
outputs = Dense(2, activation="softmax")(hidden_layer_2)
model = Model(cnn_base.input, outputs)
nadam_optimizer = Nadam(lr=0.0001, beta_1=0.9, beta_2=0.999,
epsilon=1e-08, schedule_decay=0.004)
model.compile(loss="categorical_crossentropy",
optimizer=nadam_optimizer, metrics=["accuracy"])
# for data augmentation
train_datagen = ImageDataGenerator( zoom_range=.1, rotation_range=8,
width_shift_range=.2, height_shift_range=.2)
train_generator = train_datagen.flow_from_directory(
'/home/Train', # this is the target directory
target_size=(img_width, img_height),
batch_size=batch_size_value,
class_mode="categorical")
validation_generator = ImageDataGenerator().flow_from_directory(
'/home/Val',
target_size=(img_width, img_height),
batch_size=batch_size_value,
class_mode="categorical")
history = model.fit_generator(
train_generator,
validation_data=validation_generator,
verbose=1,
epochs=no_of_epochs,
steps_per_epoch=17,
validation_steps=7)
Test_generator = ImageDataGenerator().flow_from_directory(
'/home/Test',
target_size=(img_width, img_height),
batch_size=batch_size_value,
class_mode="categorical")
Prob_val = model.predict_generator(Test_generator)
print(Prob_val)
I assume they are probabilities, but the output contains only 40 rows. How do they correspond to the 44 input examples?
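Each row is the softmax output for one image: column 0 is the probability of the first class, column 1 of the second, and the two values in a row sum to 1. A minimal sketch of turning such rows into class labels (the probabilities below are copied from the output above; the class-name mapping is an assumption and should be read from the generator):

```python
import numpy as np

# A few rows of the softmax output shown above; each row sums to ~1.
prob_val = np.array([
    [9.98187363e-01, 1.81267178e-03],
    [5.02341951e-04, 9.99497652e-01],
    [8.41189444e-01, 1.58810586e-01],
])

# The predicted class index is the column with the highest probability.
pred_idx = np.argmax(prob_val, axis=1)
print(pred_idx)  # [0 1 0]

# flow_from_directory assigns indices alphabetically by folder name;
# Test_generator.class_indices (e.g. {'negative': 0, 'positive': 1})
# gives the actual mapping.
```

As for 40 rows versus 44 inputs: predict_generator returns one row per image actually yielded by the generator, so it is worth checking Test_generator.samples and len(Test_generator) to confirm the generator really sees all 44 files.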

Related

How to train imbalanced data with Keras?

I am trying to experiment on the ISIC 2019 data as a newbie. First, I downloaded the training data and divided it into 3 parts (train, test, and validation), where every dataset folder contains 2 subfolders, benign and malignant. In short, I moved all the categories into the benign folder except the melanoma category, and the melanoma images are inside the malignant folder. After the split I have imbalanced data: in the training dataset I have 16596 benign images and 3629 malignant images. I trained my model but could not get a good result for malignant; my precision for malignant was about 0.18. I used ResNet50 to train my model, and I would like to ask how I can train it without data augmentation or oversampling. I am also trying a decayed learning rate at the moment, and it does not seem to give a good result either.
import os
import tensorflow as tf
import math
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras
# use tensorflow.keras throughout rather than mixing with the standalone keras package
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
from tensorflow.keras.layers import Dense, GlobalMaxPooling2D
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from sklearn.metrics import roc_curve
from tensorflow.keras.preprocessing.image import ImageDataGenerator
train_examples = 20225
test_examples = 2551
validation_examples = 2555
img_height = img_width = 224
channel = 3
batch_size = 32
base_model = ResNet50(weights = 'imagenet' , include_top = False, input_shape = (img_height, img_width, channel))
x = base_model.output
x = GlobalMaxPooling2D()(x)
x = Dense(1, activation= 'sigmoid')(x)
model = Model(
inputs = base_model.input,
outputs = x)
model.summary()
train_datagen = ImageDataGenerator(
rotation_range = 20,
width_shift_range=0.10,
height_shift_range=0.10,
zoom_range = 0.10,
horizontal_flip = True,
preprocessing_function = preprocess_input,
fill_mode='nearest'
)
validation_datagen = ImageDataGenerator(
preprocessing_function = preprocess_input,
)
test_datagen = ImageDataGenerator(
preprocessing_function = preprocess_input,
)
train_gen = train_datagen.flow_from_directory(
"dataset/train/",
target_size = (img_height, img_width),
batch_size = batch_size,
color_mode = "rgb",
class_mode = "binary",
shuffle = True,
seed = 123,
)
validation_gen = validation_datagen.flow_from_directory(
"dataset/validation/",
target_size = (img_height, img_width),
batch_size = batch_size,
color_mode = "rgb",
class_mode = "binary",
shuffle = True,
seed = 123,
)
test_gen = test_datagen.flow_from_directory(
"dataset/test/",
target_size =(img_height, img_width),
batch_size = batch_size,
color_mode = "rgb",
class_mode = "binary",
shuffle = True,
seed = 123,
)
METRICS = [
keras.metrics.Precision(name = "precision"),
keras.metrics.Recall(name = "recall"),
keras.metrics.AUC(name = "auc"),
]
model.compile(
optimizer = Adam(lr = 3e-4),
loss = [keras.losses.BinaryCrossentropy(from_logits = False)],
metrics = METRICS,
)
history = model.fit(train_gen,
epochs=50,
verbose=1,
validation_data=validation_gen,
callbacks=[keras.callbacks.ModelCheckpoint("isic_binary_model")],
)
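Since the question rules out augmentation and oversampling, one common alternative is class weighting: pass class_weight to model.fit so that mistakes on the rare malignant class cost more in the loss. This is a sketch under the assumption of the counts stated above (16596 benign, 3629 malignant) and of the usual alphabetical class indexing (benign = 0, malignant = 1; check train_gen.class_indices):

```python
# Class counts from the question (benign = class 0, malignant = class 1).
counts = {0: 16596, 1: 3629}
total = sum(counts.values())

# A common heuristic: weight = total / (n_classes * count),
# so each class contributes roughly equally to the loss.
class_weights = {c: total / (len(counts) * n) for c, n in counts.items()}
print(class_weights)  # malignant ends up weighted ~4.6x more than benign

# Then, hypothetically:
# history = model.fit(train_gen, epochs=50, validation_data=validation_gen,
#                     class_weight=class_weights)
```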

Image classification in Python with matching score

I have code for image classification. It tells me which class an image belongs to, but I want to print the matching score (percentage match) of the image against all classes, so that I can set some threshold value. Here is my code:
import tensorflow as tf
import numpy as np
import os
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, BatchNormalization, Activation
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.constraints import maxnorm
from keras.utils import np_utils
classifier = Sequential()
classifier.add(Conv2D(32, (3, 3), input_shape = (64,64,3 ),activation="relu"))
classifier.add(MaxPooling2D(pool_size = (2,2)))
classifier.add(Flatten())
classifier.add(Dense(128 , kernel_initializer ='uniform' , activation = 'relu'))
classifier.add(Dense(10 , kernel_initializer ='uniform' , activation = 'softmax'))
classifier.compile(optimizer = 'rmsprop', loss = 'categorical_crossentropy' , metrics = ['accuracy'])
from keras_preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
training_set = train_datagen.flow_from_directory(
'/code/train',
shuffle=True,
target_size=(64,64),
batch_size=5,
class_mode='categorical',
classes=["shiv", "kart", "nall","sur","harshi","nag","saura","rajan","man","abhim"])
test_set = test_datagen.flow_from_directory(
'/code/validation',
shuffle=True,
target_size=(64,64),
batch_size=5,
class_mode='categorical',
classes=["shiv", "kart", "nall","sur","harshi","nag","saura","rajan","man","abhim"])
from IPython.display import display
from PIL import Image
classifier.fit_generator(
training_set,
steps_per_epoch=80,
epochs=12,
validation_data=test_set,
validation_steps=100)
from keras_preprocessing import image
files_dir = '/code/test_image_clasification1'
files = os.listdir(files_dir)
for f in files:
    image_path = files_dir + '/' + f
    test_image = image.load_img(image_path, target_size=(64, 64))
    test_image = image.img_to_array(test_image)
    test_image = np.expand_dims(test_image, axis=0)
    result = classifier.predict(test_image)
    #classes = classifier.predict_classes(test_image)
    #print(classes)
    labels = ["shi", "kart", "nal", "sure", "harshi", "nage", "saura", "rajan", "man", "abhim"]
    indx = np.argmax(result)
    print(f, labels[indx])
Here labels[indx] gives me which class the image belongs to, but I need some function to get the match score of the test image against all classes.
It's already in result: each entry of result[0] is the softmax score for one class.
print(list(zip(labels, result[0])))
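To spell that out: predict returns one softmax row per input image, so the per-class scores are just the entries of result[0]. A minimal sketch with made-up scores (the label names are from the post; the probabilities are illustrative):

```python
import numpy as np

labels = ["shi", "kart", "nal", "sure", "harshi", "nage", "saura", "rajan", "man", "abhim"]

# Illustrative softmax output for one test image: shape (1, 10), the row sums to 1.
result = np.array([[0.02, 0.05, 0.60, 0.03, 0.04, 0.05, 0.06, 0.05, 0.05, 0.05]])

# Pair each label with its score, sorted best-first.
scores = sorted(zip(labels, result[0]), key=lambda p: p[1], reverse=True)
for name, score in scores:
    print(f"{name}: {score:.1%}")

# A threshold can then reject uncertain predictions:
best_name, best_score = scores[0]
prediction = best_name if best_score >= 0.5 else "unknown"
```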

TypeError: flow() missing 1 required positional argument: 'x'

I tried to run this code but I'm still stuck.
In this code I use a pretrained ResNet50 to extract deep features and predict my classes.
If anyone has had this error, please let me know how I can fix it.
Thanks
NUM_CLASSES = 2
CHANNELS = 3
IMAGE_RESIZE = 224
RESNET50_POOLING_AVERAGE = 'avg'
DENSE_LAYER_ACTIVATION = 'softmax'
OBJECTIVE_FUNCTION = 'binary_crossentropy'
LOSS_METRICS = ['accuracy']
NUM_EPOCHS = 10
EARLY_STOP_PATIENCE = 3
STEPS_PER_EPOCH_TRAINING = 10
STEPS_PER_EPOCH_VALIDATION = 10
batch_size = 32
import numpy as np
from keras.models import load_model, Sequential, Model
from keras.applications.resnet50 import ResNet50, preprocess_input
from keras.optimizers import SGD
from keras.preprocessing.image import ImageDataGenerator
BATCH_SIZE_TRAINING = 100
BATCH_SIZE_VALIDATION = 100
image_size = IMAGE_RESIZE
WEIGHTS_PATH = "C:\\Users\\Desktop\\RESNET \\resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5"
model = Sequential()
train_data_dir = "C:\\Users\\Desktop\\RESNET"
model = ResNet50(include_top=True, weights='imagenet')
model.layers.pop()
model = Model(inputs=model.input, outputs=model.layers[-1].output)
model.summary()
model.compile(loss='binary_crossentropy', optimizer=SGD(lr=0.01, momentum=0.9), metrics=['binary_accuracy'])
data_dir = "C:\\Users\\Desktop\\RESNET"
data_generator = ImageDataGenerator(preprocessing_function=preprocess_input)
train_datagenerator = ImageDataGenerator(rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True,
validation_split=0.2)
train_generator = train_datagenerator.flow_from_directory(
train_data_dir,
target_size=(image_size, image_size),
batch_size=BATCH_SIZE_TRAINING,
class_mode='categorical', shuffle=False, subset='training') # set as training data
validation_generator = train_datagenerator.flow_from_directory(
train_data_dir, # same directory as training data kifkif
target_size=(image_size, image_size),
batch_size=BATCH_SIZE_TRAINING,
class_mode='categorical', shuffle=False, subset='validation') # set as validation data
generator = data_generator.flow(batch_size=batch_size)
batch_size = 32
X_train = np.zeros((len(train_generator.images_ids_in_subset),2048))
Y_train = np.zeros((len(train_generator.images_ids_in_subset),2))
nb_batches = int(len(train_generator.images_ids_in_subset) / batch_size) + 1
Let me know if you need any more details about this problem.
Thanks for your help
Delete this line:
generator = data_generator.flow(batch_size=batch_size)
It does nothing if your code ends there.
The flow() method is for transforming data that is already in RAM, passed as a NumPy array x (that is the missing positional argument from the TypeError), but your code never loads such an array; directory-based data goes through flow_from_directory() instead.
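The distinction can be sketched like this (tiny random arrays stand in for real images; names and sizes are illustrative):

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# flow() takes in-memory arrays: x is required, which is what the TypeError complains about.
x = np.random.rand(8, 224, 224, 3).astype("float32")  # 8 fake RGB images
y = np.random.randint(0, 2, size=(8,))                # 8 fake binary labels

datagen = ImageDataGenerator(rescale=1.0 / 255)
gen = datagen.flow(x, y, batch_size=4)

batch_x, batch_y = next(gen)
print(batch_x.shape, batch_y.shape)  # (4, 224, 224, 3) (4,)

# flow_from_directory(), by contrast, needs no x: it reads images from a folder
# tree on disk, e.g. datagen.flow_from_directory(train_data_dir,
# target_size=(224, 224), class_mode='categorical')
```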

ValueError: Error when checking target: expected flatten_1 to have shape (2048,) but got array with shape (2,)

I'm trying to run this code, and I have this error:
ValueError: Error when checking target: expected flatten_4 to have shape (2048,) but got array with shape (2,)
NUM_CLASSES = 2
CHANNELS = 3
IMAGE_RESIZE = 224
RESNET50_POOLING_AVERAGE = 'avg'
DENSE_LAYER_ACTIVATION = 'softmax'
OBJECTIVE_FUNCTION = 'categorical_crossentropy'
NUM_EPOCHS = 10
EARLY_STOP_PATIENCE = 3
STEPS_PER_EPOCH_TRAINING = 10
STEPS_PER_EPOCH_VALIDATION = 10
BATCH_SIZE_TRAINING = 100
BATCH_SIZE_VALIDATION = 100
BATCH_SIZE_TESTING = 1
resnet_weights_path = '../input/resnet50/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5'
import os
from keras.models import Sequential, Model
from keras.applications.resnet50 import ResNet50
from keras import optimizers
from keras.optimizers import SGD
model = Sequential()
train_data_dir = "C:\\Users\\Desktop\\RESNET"
model = ResNet50(include_top=True, weights='imagenet')
model.layers.pop()
model = Model(inputs=model.input, outputs=model.layers[-1].output)
model.summary()
sgd = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='binary_crossentropy', optimizer=SGD(lr=0.01, momentum=0.9), metrics= ['binary_accuracy'])
data_dir = "C:\\Users\\Desktop\\RESNET"
batch_size = 32
from keras.applications.resnet50 import preprocess_input
from keras.preprocessing.image import ImageDataGenerator
image_size = IMAGE_RESIZE
data_generator = ImageDataGenerator(preprocessing_function=preprocess_input)
def append_ext(fn):
return fn+".jpg"
from os import listdir
from os.path import isfile, join
dir_path = os.path.dirname(os.path.realpath(__file__))
train_dir_path = dir_path + '\data'
onlyfiles = [f for f in listdir(dir_path) if isfile(join(dir_path, f))]
data_labels = [0, 1]
t = []
maxi = 25145
LieOffset = 15799
i = 0
while i < maxi:  # t = tuple; `label` is assumed to be defined elsewhere in the original script
    if i <= LieOffset:
        t.append(label['Lie'])
    else:
        t.append(label['Truth'])
    i = i + 1
train_datagenerator = ImageDataGenerator(rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True,
validation_split=0.2)
train_generator = train_datagenerator.flow_from_directory(
train_data_dir,
target_size=(image_size, image_size),
batch_size=BATCH_SIZE_TRAINING,
class_mode='categorical', shuffle=False, subset='training')
validation_generator = train_datagenerator.flow_from_directory(
train_data_dir, # same directory as training data kifkif
target_size=(image_size, image_size),
batch_size=BATCH_SIZE_TRAINING,
class_mode='categorical', shuffle=False, subset='validation')
(BATCH_SIZE_TRAINING, len(train_generator), BATCH_SIZE_VALIDATION, len(validation_generator))
from tensorflow.python.keras.callbacks import EarlyStopping, ModelCheckpoint
cb_early_stopper = EarlyStopping(monitor = 'val_loss', patience = EARLY_STOP_PATIENCE)
cb_checkpointer = ModelCheckpoint(filepath = '../working/best.hdf5', monitor = 'val_loss', save_best_only = True, mode = 'auto')
from sklearn.model_selection import ParameterGrid
param_grid = {'epochs': [5, 10, 15], 'steps_per_epoch' : [10, 20, 50]}
grid = ParameterGrid(param_grid)
# keep the parameter set with the lowest val_loss as the final model
for params in grid:
    print(params)
    fit_history = model.fit_generator(
        train_generator,
        steps_per_epoch=params['steps_per_epoch'],
        epochs=params['epochs'],
        validation_data=validation_generator,
        validation_steps=STEPS_PER_EPOCH_VALIDATION,
        callbacks=[cb_checkpointer, cb_early_stopper])
model.load_weights("../working/best.hdf5")
The error says that your model's output layer should have 2 nodes, whereas yours has 2048, because you are using the output of the avg_pool layer of the ResNet50 model as your model output. You can add a Dense layer with 2 nodes on top of the avg_pool layer to solve the problem:
model = ResNet50(include_top=True, weights='imagenet')
print(model.summary())
x = model.get_layer('avg_pool').output
predictions = Dense(2, activation='sigmoid')(x)
model = Model(inputs=model.input, outputs=predictions)
print(model.summary())
As I'm not quite sure what type of problem you are solving, I assumed multilabel (2-class) classification, since your data label shape is (2,).
However, if you are solving a binary classification problem, you need to change your labels so that each is either 1 or 0: change class_mode='categorical' to class_mode='binary' in both train_generator and validation_generator. In that case the model output layer should have 1 node:
predictions = Dense(1, activation='sigmoid')(x)
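The shape difference between the two setups can be sketched in plain NumPy (values are illustrative):

```python
import numpy as np

# class_mode='binary': one scalar label per image -> shape (n,),
# which matches Dense(1, activation='sigmoid').
binary_labels = np.array([0, 1, 1, 0])

# class_mode='categorical': one one-hot row per image -> shape (n, 2),
# which matches Dense(2, activation='softmax').
categorical_labels = np.eye(2)[binary_labels]

print(binary_labels.shape)       # (4,)
print(categorical_labels.shape)  # (4, 2)

# The original error arose because the targets had shape (2,) per sample,
# while the model ended at avg_pool and so produced 2048 values per sample.
```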

Shape mismatch between InceptionResNetV2 model & weights

I am using InceptionResNetV2 for image classification with its respective weights file, but I get this error:
ValueError: You are trying to load a weight file containing 449 layers into a model with 448 layers.
img_ht = 96
img_wid = 96
img_chnl = 3
import tensorflow as tf
from tensorflow import keras
from keras_preprocessing.image import ImageDataGenerator
# (train_datagen was not defined in the original post; a validation_split is
# needed for the "training"/"validation" subsets below)
train_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)
train_generator = train_datagen.flow_from_directory(
directory = "../input/cassava-disease/train/train/",
subset="training",
batch_size = 49,
seed=42,
shuffle=False,
class_mode="categorical",
target_size=(img_ht, img_wid))
valid_generator = train_datagen.flow_from_directory(
directory = "../input/cassava-disease/train/train/",
subset="validation",
batch_size=49,
seed=42,
shuffle=False,
class_mode="categorical",
target_size = (img_ht, img_wid))
from keras.applications import InceptionResNetV2 as InceptionResNetV2
base_model = keras.applications.InceptionResNetV2(input_shape=(img_ht, img_wid, 3),
include_top = False,
weights = "../input/inception/inception_resnet_v2_weights_tf_dim_ordering_tf_kernels.h5")
base_model.trainable = False
print(base_model.summary())
Got the answer: it's because of the line include_top=False. The weights file being loaded includes the top classification layers, so it has more layers than a model built with include_top=False; the matching _notop weights file should be used instead.
Quite new to Python & machine learning
