Same input gives different predictions on two calls - python

I read an image from a file and call the predict method of the Keras Inception v3 model, and I get two different results from the same input.
from keras.applications.inception_v3 import InceptionV3, decode_predictions
from keras.preprocessing import image
import numpy as np

def model():
    model = InceptionV3(weights='imagenet')
    def predict(x):
        x *= 2
        x -= 1
        return model.predict(np.array([x]))[0]
    return predict
img = image.load_img("2.jpg", target_size=(299, 299))
img = image.img_to_array(img)
img /= 255.
p = model()
print('Predicted:', decode_predictions(np.array([p(img)]), top=3)[0])
print('Predicted:', decode_predictions(np.array([p(img)]), top=3)[0])
The output is
Predicted: [('n01443537', 'goldfish', 0.98162466), ('n02701002', 'ambulance', 0.0010537759), ('n01440764', 'tench', 0.00027527584)]
Predicted: [('n02606052', 'rock_beauty', 0.69015616), ('n01990800', 'isopod', 0.039278224), ('n01443537', 'goldfish', 0.03365362)]
where the first result is correct.

You are modifying your input (img) in the predict function, not just locally as you might expect. That modified input is then used in the next call to predict, where it is modified again. So you are effectively applying the modifications once in your first call to predict, but twice in the second call.
You can find more details about that behavior in this question.
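A minimal fix, as a sketch under the assumption that the in-place NumPy operations are the only culprit, is to work on a copy inside predict so the caller's array is never mutated:

def model():
    model = InceptionV3(weights='imagenet')
    def predict(x):
        x = x.copy()  # work on a copy; the caller's img array stays unmodified
        x *= 2
        x -= 1
        return model.predict(np.array([x]))[0]
    return predict

Equivalently, x = x * 2 - 1 allocates a new array instead of mutating the argument in place.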

Related

How to understand the dimensions of the result array after using model.predict()

I'm reusing some code for item retrieval, but when I debug inside model.predict, I find that the input of this function has dimensions (1, 224, 224, 3) while the output is (1, 7, 7, 2048). Shouldn't the result of model.predict() be a 1D array giving the probability that the object belongs to each category, instead of 4D? How should I understand the dimensions of this result array?
model_features = model.predict(x, batch_size=1)
The concrete code follows (this is only part of the whole code and may not run directly):
import keras.applications.resnet50
import numpy as np
import os
import pickle
import time
import vse
from keras.preprocessing import image
from keras.models import Model, load_model

model = keras.applications.resnet50.ResNet50(include_top=False)
model_extension = "resnet"

def extract_features_cnn(img_path):
    """Returns a normalized features vector for image path and model specified in parameters file"""
    print('Using model', model_extension)
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    if model_extension == "vgg19":
        x = keras.applications.vgg19.preprocess_input(x)
    elif model_extension == "vgg16":
        x = keras.applications.vgg16.preprocess_input(x)
    elif model_extension == "resnet":
        x = keras.applications.resnet50.preprocess_input(x)
    else:
        print('Wrong model name')
    model_features = model.predict(x, batch_size=1)
    x = model_features[0]
    total_sum = sum(model_features[0])
    features_norm = np.array([val / total_sum for val in model_features[0]], dtype=np.float32)
    if model_extension == "resnet":
        print("reshaping resnet")
        features_norm = features_norm.reshape(2048, -1)
    return features_norm
Your question is not entirely clear, but I will try to explain as much as I understand. Your model contains only the convolutional part of ResNet; it has no final dense layer, so it cannot produce class probabilities. Your result is not as strange as you think: in your output shape (1, 7, 7, 2048), the 1 is the batch size (you gave the network one image and got one result back), the 7s are the spatial size of the final 7x7 feature map, and 2048 is the number of output channels. If you want class probabilities, you need the classification head at the end of the ResNet network. You get it with the argument include_top=True, and you can specify the number of classes with classes=1000.
Here is the documentation.
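As a minimal sketch of the difference (the random array below is just a hypothetical stand-in for a preprocessed image):

import numpy as np
import keras.applications.resnet50
from keras.applications.resnet50 import decode_predictions

# Headless network: output is the last convolutional feature map.
feature_model = keras.applications.resnet50.ResNet50(include_top=False)
# Full network: the classification head maps features to 1000 class probabilities.
classifier = keras.applications.resnet50.ResNet50(include_top=True, classes=1000)

x = np.random.rand(1, 224, 224, 3).astype('float32')  # stand-in for a real image
print(feature_model.predict(x).shape)  # (1, 7, 7, 2048)
preds = classifier.predict(x)          # (1, 1000): one probability per class
print(decode_predictions(preds, top=3)[0])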

How to train transfer-learning model on custom dataset? ValueError: Shape must be rank 4

I am trying to build a transfer learning model to classify images. The images are grayscale (2D). Previously I used the image_dataset_from_directory method to read the images and there was no problem. However, I am now trying to use a custom read function to have more control over and access to the data, such as knowing how many images are in each class. When using this custom read function, I get an error (shown below) while trying to train the model. I am not sure what caused it.
part 1: reading the dataset
import numpy as np
import os
import tensorflow as tf
import cv2
from tensorflow import keras
# neural network
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers.experimental import preprocessing

IMG_WIDTH = 160
IMG_HEIGHT = 160
DATA_PATH = r"C:\Users\user\Documents\chest_xray"
TRAIN_DIR = os.path.join(DATA_PATH, 'train')

def create_dataset(img_folder):
    img_data_array = []
    class_name = []
    for dir1 in os.listdir(img_folder):
        for file in os.listdir(os.path.join(img_folder, dir1)):
            image_path = os.path.join(img_folder, dir1, file)
            image = cv2.imread(image_path, 0)
            image = cv2.resize(image, (IMG_HEIGHT, IMG_WIDTH), interpolation=cv2.INTER_AREA)
            image = np.array(image)
            image = image.astype('float32')
            image /= 255
            img_data_array.append(image)
            class_name.append(dir1)
    return img_data_array, class_name

# extract the image array and class name
img_data, class_name = create_dataset(TRAIN_DIR)
target_dict = {k: v for v, k in enumerate(np.unique(class_name))}
target_val = [target_dict[class_name[i]] for i in range(len(class_name))]
This part produces a list of size 5232; inside the list are numpy arrays of size 160x160 (float32).
part 2: creating the model
def build_model():
    inputs = tf.keras.Input(shape=(160, 160, 3))
    x = Sequential(
        [
            preprocessing.RandomRotation(factor=0.15),
            preprocessing.RandomTranslation(height_factor=0.1, width_factor=0.1),
            preprocessing.RandomFlip(),
            preprocessing.RandomContrast(factor=0.1),
        ],
        name="img_augmentation",
    )(inputs)
    # x = img_augmentation(inputs)
    model = tf.keras.applications.EfficientNetB7(include_top=False,
                                                 drop_connect_rate=0.4,
                                                 weights='imagenet',
                                                 input_tensor=x)
    # Freeze the pretrained weights
    model.trainable = False
    # Rebuild top
    x = tf.keras.layers.GlobalAveragePooling2D(name="avg_pool")(model.output)
    x = tf.keras.layers.BatchNormalization()(x)
    top_dropout_rate = 0.2
    x = tf.keras.layers.Dropout(top_dropout_rate, name="top_dropout")(x)
    outputs = tf.keras.layers.Dense(1, name="pred")(x)
    # Compile
    model = tf.keras.Model(inputs, outputs, name="EfficientNet")
    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-2)
    model.compile(
        optimizer=optimizer,
        loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
        metrics=["accuracy"]
    )
    return model

model = build_model()
part 3: train the model
history = model.fit(x=np.array(img_data), y=np.array(target_val), epochs=5)
the error I get:
ValueError: Shape must be rank 4 but is rank 3 for '{{node
EfficientNet/img_augmentation/random_rotation_1/transform/ImageProjectiveTransformV3}} =
ImageProjectiveTransformV3[dtype=DT_FLOAT, fill_mode="REFLECT", interpolation="BILINEAR"]
(IteratorGetNext, EfficientNet/img_augmentation/random_rotation_1/rotation_matrix/concat,
EfficientNet/img_augmentation/random_rotation_1/transform/strided_slice,
EfficientNet/img_augmentation/random_rotation_1/transform/fill_value)' with input shapes:
[?,160,160], [?,8], [2], [].
The problem is that OpenCV reads the image in grayscale, but the grayscale image it returns has shape (160,160), not (160,160,1).
Because of this missing channel axis, the error is thrown.
I managed to replicate your problem by testing it locally.
Say we randomly train on 12 samples.
Possible input formats:
# (1) works
history = model.fit(x=np.random.rand(12,160,160,3), y=np.array([1,1,1,1,1,1,0,0,0,0,0,0]), epochs=5, verbose=1)
# (2) works
history = model.fit(x=np.random.rand(12,160,160,1), y=np.array([1,1,1,1,1,1,0,0,0,0,0,0]), epochs=5, verbose=1)
# (3) fails
history = model.fit(x=np.random.rand(12,160,160), y=np.array([1,1,1,1,1,1,0,0,0,0,0,0]), epochs=5, verbose=1)
(1) and (2) work.
(3) fails, yielding:
ValueError: Shape must be rank 4 but is rank 3 for '{{node
EfficientNet/img_augmentation/random_rotation_4/transform/ImageProjectiveTransformV2}} = ImageProjectiveTransformV2[dtype=DT_FLOAT, fill_mode="REFLECT", interpolation="BILINEAR"](IteratorGetNext,
EfficientNet/img_augmentation/random_rotation_4/rotation_matrix/concat,
EfficientNet/img_augmentation/random_rotation_4/transform/strided_slice)'
with input shapes: [?,160,160], [?,8], [2].
Therefore, ensure that your data format is in the shape (160,160,1) or (160,160,3).
As an alternative, after you read the image with OpenCV, you can use
image = np.expand_dims(image, axis=-1)
to programmatically insert the last (channel) axis.
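Applied to the create_dataset function above, a sketch of the read loop with the channel axis restored (this assumes you keep reading in grayscale; reading in color with cv2.imread(image_path, 1) would give (160, 160, 3) directly):

image = cv2.imread(image_path, 0)  # grayscale read -> shape (160, 160)
image = cv2.resize(image, (IMG_HEIGHT, IMG_WIDTH), interpolation=cv2.INTER_AREA)
image = image.astype('float32') / 255
image = np.expand_dims(image, axis=-1)  # (160, 160) -> (160, 160, 1)
img_data_array.append(image)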

model.evaluate() changes results depending on batch size, when fed by generator

I'm working in Colab with the default tensorflow and keras versions (which print tensorflow 2.2.0-rc2, keras 2.3.0-tf).
I've got a really weird error: the results of model.evaluate() depend on the batch size I'm using, and they change after I shuffle the data, which makes no sense. I've been able to reproduce this in a minimal working example. In my full program (which works in 3D with bigger datasets) the variations are even more significant. I don't know whether this might depend on batch normalization, but I expect that to be fixed when I'm predicting! My full program does multiclass segmentation; my minimal example takes a black image with a white square in a random position, adds a little noise, and tries to segment the same white square out of it.
I'm using a keras Sequence as a generator to feed data to the model, which I guess might be relevant, as I don't see the behaviour when evaluating the data directly.
Here's the code with its output:
#environment setup
%tensorflow_version 2.x
from tensorflow.keras import backend as K
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input,Conv2D, Activation, BatchNormalization
from tensorflow.keras import metrics
#set up a toy model
K.set_image_data_format("channels_last")
inputL = Input([64,64,1])
l1 = Conv2D(4,[3,3],padding='same')(inputL)
l1N = BatchNormalization(axis=-1,momentum=0.9) (l1)
l2 = Activation('relu') (l1N)
l3 = Conv2D(32,[3,3],padding='same')(l2)
l3N = BatchNormalization(axis=-1,momentum=0.9) (l3)
l4 = Activation('relu') (l3N)
l5 = Conv2D(1,[1,1],padding='same',dtype='float32')(l4)
l6 = Activation('sigmoid') (l5)
model = Model(inputs=inputL,outputs=l6)
model.compile(optimizer='sgd',loss='mse',metrics='accuracy' )
#Create random images
import numpy as np
import random
X_train = np.zeros([96,64,64,1])
for imIdx in range(96):
    centPoin = random.randrange(7,50)
    X_train[imIdx,centPoin-5:centPoin+5,centPoin-5:centPoin+5,0] = 1
X_val = X_train[:32,:,:,:]
X_train = X_train[32:,:,:,:]
Y_train = X_train.copy()
X_train = np.random.normal(0.,0.1,size=X_train.shape)+X_train
for imIdx in range(64):
    X_train[imIdx,:,:,:] = X_train[imIdx,:,:,:]+np.random.normal(0,0.2,size=1)
from tensorflow.keras.utils import Sequence
import random
import tensorflow as tf
#setup the data generator
class dataGen(Sequence):
    def __init__(self, x_set, y_set, batch_size):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size
        nSamples = self.x.shape[0]
        patList = np.array(range(nSamples), dtype='int16')
        patList = patList.reshape(nSamples, 1)
        np.random.shuffle(patList)
        self.patList = patList

    def __len__(self):
        return round(self.patList.shape[0] / self.batch_size)

    def __getitem__(self, idx):
        patStart = idx
        batchS = self.batch_size
        listLen = self.patList.shape[0]
        Xout = np.zeros((batchS, 64, 64, 1))
        Yout = np.zeros((batchS, 64, 64, 1))
        for patIdx in range(batchS):
            curPat = (patStart + patIdx) % listLen
            patInd = self.patList[curPat]
            Xout[patIdx, :, :] = self.x[patInd, :, :, :]
            Yout[patIdx, :, :] = self.y[patInd, :, :, :]
        return Xout, Yout

    def on_epoch_end(self):
        np.random.shuffle(self.patList)

    def setBatchSize(self, batchS):
        self.batch_size = batchS
#load the data in the generator
trainGen = dataGen(X_train,Y_train,16)
valGen = dataGen(X_val,X_val,16)
# train the model for two epochs, so that the loss is bad
trainSteps = len(trainGen)
model.fit(trainGen,steps_per_epoch=trainSteps,epochs=32,validation_data=valGen,validation_steps=len(valGen))
trainGen.setBatchSize(4)
model.evaluate(trainGen)
[0.16259156167507172, 0.9870567321777344]
trainGen.setBatchSize(16)
model.evaluate(trainGen)
[0.17035068571567535, 0.9617958068847656]
trainGen.on_epoch_end()
trainGen.setBatchSize(16)
model.evaluate(trainGen)
[0.16663715243339539, 0.9710426330566406]
If I do model.evaluate(X_train, Y_train, batch_size=16) instead, the result does not depend on the batch size.
If I train the model until convergence, where the loss gets to 0.05, the same thing still happens, with the accuracy fluctuating from one evaluation to the next between 0.95 and 0.99.
Why would this happen?
I'd expect this prediction task to be super easy, am I wrong?
You made a small mistake inside the __getitem__ function.
curPat = (patStart+patIdx)
should be changed to
curPat = (patStart*batchS+patIdx)
patStart is equal to idx, the current batch number. If your data set contains 64 samples and your batch size is set to 16, the possible values for idx are 0, 1, 2 and 3.
curPat, on the other hand, refers to the index of the current sample in the shuffled list of sample numbers, so it should be able to take on all values from 0 to 63. In your code that is not the case: with curPat = patStart + patIdx, only positions 0 through idx + batch_size - 1 (at most 18 here) are ever reached. Making the aforementioned change fixes this.
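Put together, a sketch of the corrected method (keeping the % listLen wrap-around from the original code):

def __getitem__(self, idx):
    batchS = self.batch_size
    listLen = self.patList.shape[0]
    Xout = np.zeros((batchS, 64, 64, 1))
    Yout = np.zeros((batchS, 64, 64, 1))
    for patIdx in range(batchS):
        # advance by whole batches per idx, not by single samples
        curPat = (idx * batchS + patIdx) % listLen
        patInd = self.patList[curPat]
        Xout[patIdx, :, :] = self.x[patInd, :, :, :]
        Yout[patIdx, :, :] = self.y[patInd, :, :, :]
    return Xout, Yout

With this indexing, batch 0 serves shuffled positions 0-15, batch 1 serves 16-31, and so on, so every sample is visited exactly once per pass.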

TensorFlow + Keras multi gpu model with inference

I am trying to do image classification using Keras's Xception model, modeled after this code. However, I want to use multiple GPUs to do batch-parallel image classification using this function. I believe it is possible, and I have the original code working without multi-GPU support, but I cannot get the multi_gpu_model function to work as I would expect. I am following this example for the multi-GPU case. This is my code (it is the backend of a Flask app): it instantiates the model, makes a prediction on an example ndarray when the class is created, and then expects a base64-encoded image in the classify function:
import os
from keras.preprocessing import image as preprocess_image
from keras.applications import Xception
from keras.applications.inception_v3 import preprocess_input, decode_predictions
from keras.utils import multi_gpu_model
import numpy as np
import tensorflow as tf
import PIL.Image
from numpy import array

class ModelManager:
    def __init__(self, model_path):
        self.model_name = 'ImageNet'
        self.model_version = '1.0'
        self.batch_size = 32
        height = 224
        width = 224
        num_classes = 1000
        # self.model = tf.keras.models.load_model(os.path.join(model_path, 'ImageNetXception.h5'))
        with tf.device('/cpu:0'):
            model = Xception(weights=None,
                             input_shape=(height, width, 3),
                             classes=num_classes, include_top=True)
        # Replicates the model on 8 GPUs.
        # This assumes that your machine has 8 available GPUs.
        self.parallel_model = multi_gpu_model(model, gpus=8)
        self.parallel_model.compile(loss='categorical_crossentropy',
                                    optimizer='rmsprop')
        print("Loaded Xception model.")
        x = np.empty((1, 224, 224, 3))
        self.parallel_model.predict(x, batch_size=self.batch_size)
        self.graph = tf.get_default_graph()
        self.graph.finalize()

    def classify(self, ids, images):
        results = []
        all_images = np.empty((0, 224, 224, 3))
        # all_images = []
        for image_id, image in zip(ids, images):
            # This does the same as keras.preprocessing.image.load_img
            image = image.convert('RGB')
            image = image.resize((224, 224), PIL.Image.NEAREST)
            x = preprocess_image.img_to_array(image)
            x = np.expand_dims(x, axis=0)
            x = preprocess_input(x)
            all_images = np.append(all_images, x, axis=0)
        # all_images.append(x)
        # a = array(all_images)
        # print(type(a))
        # print(a[0])
        with self.graph.as_default():
            preds = self.parallel_model.predict(all_images, batch_size=288)
        # print(type(preds))
        top3 = decode_predictions(preds, top=3)[0]
        print(top3)
        output = [((t[1],) + t[2:]) for t in top3]
        predictions = [
            {'label': label, 'probability': probability * 100.0}
            for label, probability in output
        ]
        results.append({
            'id': 1,
            'predictions': predictions
        })
        print(len(results))
        return results
The part I am not sure about is what to pass to the predict function. Currently I am creating an ndarray of the images I want classified, after they are preprocessed, and passing that to predict. The function returns, but the preds variable doesn't hold what I expect. I tried to loop through the preds object, but decode_predictions errors when I pass it a single item, while it responds with only one prediction when I pass the whole preds ndarray. In the example code they don't use the decode_predictions function, so I'm not sure how to use it with the response from parallel_model.predict. Any help or resources are appreciated, thanks.
The following site illustrates how to do that correctly: link
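For what it's worth, decode_predictions operates on the whole batch: given preds of shape (num_images, 1000), it returns one list of (class_id, description, score) tuples per image, so indexing [0] only ever decodes the first image. A sketch of per-image decoding, assuming ids lines up with the rows of all_images:

decoded = decode_predictions(preds, top=3)  # one entry per image in the batch
for image_id, top3 in zip(ids, decoded):
    predictions = [{'label': label, 'probability': float(score) * 100.0}
                   for (_, label, score) in top3]
    results.append({'id': image_id, 'predictions': predictions})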

Mnist with Tensorflow Premade Estimator, input dimension mismatch on evaluate

I am rather new to Tensorflow and have been trying to pick up the basics by reading through the guides and documentation on tensorflow.org.
I have learnt the basics of how to use the tf.data and tf.estimator APIs and am trying to get them to work together on a basic classification model for MNIST.
I am using this script to load MNIST: https://github.com/tensorflow/models/blob/master/official/mnist/dataset.py
I made modifications to the dataset function so that it returns a feature dictionary rather than a vector:
def dataset(directory, images_file, labels_file):
    """Download and parse MNIST dataset."""
    images_file = download(directory, images_file)
    labels_file = download(directory, labels_file)
    check_image_file_header(images_file)
    check_labels_file_header(labels_file)

    def decode_image(image):
        # Normalize from [0, 255] to [0.0, 1.0]
        image = tf.decode_raw(image, tf.uint8)
        image = tf.cast(image, tf.float32)
        image = tf.reshape(image, [784])
        return image / 255.0

    def decode_label(label):
        label = tf.decode_raw(label, tf.uint8)  # tf.string -> [tf.uint8]
        label = tf.reshape(label, [])  # label is a scalar
        return tf.to_int32(label)

    images = tf.data.FixedLengthRecordDataset(
        images_file, 28 * 28, header_bytes=16).map(decode_image)
    labels = tf.data.FixedLengthRecordDataset(
        labels_file, 1, header_bytes=8).map(decode_label)
    return tf.data.Dataset.zip(({"image": images}, labels))
My MNIST classifier script using the premade estimator in tf is as follows:
import tensorflow as tf
import dataset as mnist  # the modified dataset.py above, aliased to match the calls below

fc = [tf.feature_column.numeric_column("image", shape=784)]

mnist_classifier = tf.estimator.DNNClassifier(
    hidden_units=[512, 512],
    feature_columns=fc,
    model_dir="models/mnist/dnn",
    n_classes=10)

def input_fn(train=False, batch_size=None):
    if train:
        ds = mnist.train("MNIST-data")
        ds = ds.shuffle(1000).repeat().batch(batch_size)
    else:
        ds = mnist.test("MNIST-data")
    return ds

mnist_classifier.train(
    input_fn=lambda: input_fn(True, 32),
    steps=10000)
eval_results = mnist_classifier.evaluate(input_fn=lambda: input_fn())
The classifier doesn't crash on training, but on evaluate it fails with the following traceback:
ValueError: Cannot reshape a tensor with 784 elements to shape
[784,784] (614656 elements) for
'dnn/input_from_feature_columns/input_layer/image/Reshape' (op:
'Reshape') with input shapes: [784,1], [2] and with input tensors
computed as partial shapes: input[1] = [784,784].
What could be causing the issue here?
I have tried printing the output shapes and types of both train and test datasets, and they are exactly the same.
I have also tried viewing the model on tensorboard, and only the projector tab is available, no scalars or graphs tab.
Thanks!
PS: Any links to TF tutorials using the Datasets and Estimators APIs will be great too.
