CNN batch with images of different size - python

I restored a pre-trained model for face detection which takes a single image at a time and returns bounding boxes. How can I make it take a batch of images if these images have different sizes?

You can use the tf.image.resize_images method to achieve this. According to the docs, tf.image.resize_images:
Resize images to size using the specified method. Resized images will be distorted if their original aspect ratio is not the same as size. To avoid distortions see tf.image.resize_image_with_pad.
How to use it?
import tensorflow as tf
from tensorflow.python.keras.layers import Input
from tensorflow.python.keras.models import Model

# Accept images of any height/width and resize them inside the graph
x = Input(shape=(None, None, 3), name='image_input')
resize_x = tf.image.resize_images(x, [32, 32])
vgg_output = load_vgg()(resize_x)  # load_vgg() stands for your pre-trained model
model = Model(inputs=x, outputs=vgg_output)
model.compile(...)
model.predict(...)
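If aspect-ratio distortion is a concern, the padded variant mentioned in the docs can be swapped in for the resize op; a minimal sketch, assuming the same graph as above:
# Pads instead of distorting when the aspect ratio differs from the target
resize_x = tf.image.resize_image_with_pad(x, target_height=32, target_width=32)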

Related

Use preprocessing function that changes size of input on ImageDataGenerator

I wish to take the FFT of the input dataset loaded using ImageDataGenerator. Taking the FFT will double the number of channels as I stack the real and complex parts of the complex output of the FFT together along the channels dimension. The preprocessing_function attribute of the ImageDataGenerator class should output a Numpy tensor with the same shape as the input, so I could not use that.
I tried applying tf.math.fft2d directly on the ImageDataGenerator.flow_from_directory() output, but it consumes too much RAM, causing the program to crash on Google Colab. Another way I tried was to add a custom layer computing the FFT as the first layer of my neural network, but this adds to the training time. So I wish to do it as a pre-processing step.
Could anyone kindly suggest an efficient way to apply such a function on ImageDataGenerator?
You could write a custom ImageDataGenerator, but I have no reason to think it would be any faster than doing it in the first layer. It seems like a costly operation either way, since tf.signal.fft2d takes complex64 or complex128 dtypes: the input needs casting to complex and then casting back, because neural network weights are tf.float32 and other image processing functions don't take complex dtypes.
import tensorflow as tf

labels = ['Cats', 'Dogs', 'Others']

def read_image(file_name):
    image = tf.io.read_file(file_name)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    image = tf.image.resize_with_pad(image, target_height=224, target_width=224)
    image = tf.cast(image, tf.complex64)            # fft2d requires a complex dtype
    image = tf.signal.fft2d(image)                  # transforms the two innermost dimensions
    label = tf.strings.split(file_name, '\\')[-2]   # folder name is the label (Windows paths)
    label = tf.where(tf.equal(label, labels))
    return image, label

ds = tf.data.Dataset.list_files(r'path\to\my\pictures\*\*.jpg')
ds = ds.map(read_image)
next(iter(ds))
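To get the doubled channel count described in the question, the real and imaginary parts can be stacked back into a float tensor inside the same map function; a minimal sketch using the names from the snippet above:
# After tf.signal.fft2d inside read_image:
real_part = tf.math.real(image)   # float32, shape (224, 224, 3)
imag_part = tf.math.imag(image)   # float32, shape (224, 224, 3)
image = tf.concat([real_part, imag_part], axis=-1)   # shape (224, 224, 6)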

How to generate accuracy from a saved model of Keras?

I already trained my Keras model and saved it as .h5. My model has 6 classes and is able to classify all of them from images, outputting the name of the class it predicts. However, I want to generate an accuracy score when testing the model with an image provided by the user. I have searched everywhere but there is still no answer to this problem.
from keras.models import load_model
from PIL import Image
import numpy

model = load_model('prototype-tl2-80-20.h5')
classes = {1: 'Kacip Fatimah',
           2: 'Mempisang',
           3: 'Misai Adam',
           4: 'Pandan Serapat',
           5: 'Tapak Sulaiman',
           6: 'Tongkat Ali'}

image = Image.open(file_path)
image = image.resize((224, 224))
image = numpy.array(image)                 # convert the PIL image to an array first
image = numpy.expand_dims(image, axis=0)   # then add the batch dimension
pred = model.predict_classes([image])[0]
sign = classes[pred + 1]
print(sign)
To predict an image with a trained model you have to make sure the image is processed exactly as the training images were. The image should have the same size (height, width) as the training images and the same number of colour bands, for example 'rgb' or 'grayscale', with the bands in the same order as used in training. Next you must apply the same preprocessing. For example, if your training images were scaled to be between 0 and 1, then you need to rescale your test image with image = image/255. After that, do
import numpy as np

pred = model.predict(image)        # shape (1, num_classes) of probabilities
index = np.argmax(pred)            # predicted class index (0-based)
class_name = classes[index + 1]    # the classes dict above is keyed from 1
print(index, class_name)
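If by "generate accuracy" the goal is a confidence score for the single user-supplied image, the softmax output from model.predict already contains it; a small sketch building on the lines above (for accuracy over a whole labelled test set you would instead use model.evaluate):
confidence = float(np.max(pred))   # softmax probability of the predicted class
print(f'{class_name}: {confidence:.2%}')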

Black image when image is normalized

Hi, I'm using the following procedure to normalize images, following the tutorial on the TF 2.3 website:
from tensorflow.keras import layers
normalization_layer = tf.keras.layers.experimental.preprocessing.Rescaling(1./255)
normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
image_batch, labels_batch = next(iter(normalized_ds))
first_image = image_batch[0]
# Notice the pixels values are now in `[0,1]`.
print(np.min(first_image), np.max(first_image))
But after this change my vision model doesn't converge during training. It works fine if I don't use the normalization layer, but I'm afraid I'm hurting its performance by training on non-normalized images. Also, when I try to render the normalized images I see only black images.
I use imshow to show the images.
I don't really understand why the images are black when I use imshow(), or why my model doesn't work with the normalized images.
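For reference, the rendering step described above is presumably along these lines (hypothetical code, assuming matplotlib):
import matplotlib.pyplot as plt
# imshow renders RGB float data in [0, 1] directly
plt.imshow(first_image.numpy())
plt.show()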

Add a gaussian noise layer between pretrained network layers in Keras

I would like to add a Gaussian noise layer to a pretrained VGG16 network (after convolutional layer 1) and have the noisy layer active when I pass images through (I would like to obtain the activations of layers after passing through a noise layer during the inference phase).
I have managed to compile a 'new' model which has the Gaussian layer added to it. I also have code for passing images and obtaining their activations (using model.predict(img)). My problem is that when I pass images through the Gaussian noise layer, the activations do not change from those of the layer before it. I have read that Gaussian noise layers are usually used in training (not testing, which is what is relevant for me), but I've also read that you can 'trick' a layer into believing you are in a training state and still have the Gaussian noise active. I don't know how to practically implement this.
Here I am creating my new model with gaussian noise layer:
from keras.applications.vgg16 import VGG16
from keras.layers import GaussianNoise
from keras.models import Model

# load the model
model = VGG16(weights='imagenet')
# summarize the model
model.summary()

# loop over layers and add noise - create a new model
for i in range(1, len(model.layers)):    # loop over all layers
    layer1 = model.layers[i]
    if i == 1:                           # adding the noise after the first conv layer
        noise = GaussianNoise(0.85)
        x = noise(layer1.output)
    if i < len(model.layers) - 1:        # reconnecting the remaining layers
        layer2 = model.layers[i + 1]
        x = layer2(x)
    if i == len(model.layers) - 1:
        predictors = x                   # this is the last layer

# Create a new model
model2 = Model(inputs=model.input, outputs=predictors)
model2.summary()
Here is an example of how I would like to obtain the activations of all layers (including the Gaussian layer).
import numpy as np
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input

layer_names = ['block1_conv1', model2.layers[2].name, 'block1_conv2', 'block2_conv1', 'block2_conv2',
               'block3_conv1', 'block3_conv2', 'block3_conv3', 'block4_conv1', 'block4_conv2',
               'block4_conv3', 'block5_conv1', 'block5_conv2', 'block5_conv3', 'fc1', 'fc2', 'predictions']

for i in range(1, len(layer_names)):   # loop over all layers
    print('Working on ' + layer_names[i])
    layer_name = layer_names[i]
    intermediate_layer_model = Model(inputs=model2.input, outputs=model2.get_layer(layer_name).output)
    # load the image with the required shape
    img = image.load_img('test.BMP', target_size=(224, 224))
    # convert the image to an array
    img = image.img_to_array(img)
    # expand dimensions so that it represents a single 'sample'
    img = np.expand_dims(img, axis=0)
    # prepare the image (e.g. scale pixel values for the vgg)
    img = preprocess_input(img)
    # Now get the activations for this image
    intermediate_output = intermediate_layer_model.predict(img)
The output (intermediate_output) should be different after receiving the output from the block1_conv1 layer vs. the gaussian_noise layer, but it isn't. I guess that's because the Gaussian noise layer is not active, but I don't know how to change that. I know my code is suboptimal (I'm just starting to work with Keras), but any help on how to add noise to a pretrained network and let it 'propagate' through the network would be highly appreciated.
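Regarding the 'trick the layer into a training state' idea mentioned above, a minimal sketch, assuming tf.keras (where a layer can be called with a training argument); older standalone Keras would instead go through K.learning_phase:
# When wiring the new model, call the noise layer with training=True so the
# noise stays active during model.predict / inference
noise = GaussianNoise(0.85)
x = noise(layer1.output, training=True)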

extracting Bottleneck features using pretrained Inceptionv3 - differences between Keras' implementation and Native Tensorflow implementation

(Apologies for the long post)
All,
I want to use the bottleneck features from a pretrained Inceptionv3 model to predict classification for my input images. Before training a model and predicting classification, I tried 3 different approaches for extracting the bottleneck features.
My 3 approaches yielded different bottleneck features (not just in values but even the size was different).
Size of my bottleneck features from Approach 1 and 2: (number of input images) x 3 x 3 x 2048
Size of my bottleneck features from Approach 3: (number of input images) x 2048
Why are the sizes different between the Keras-based InceptionV3 model and the native TensorFlow model? My guess is that when I say include_top=False in Keras, I'm not extracting the 'pool_3/_reshape:0' layer. Is this correct? If yes, how do I extract the 'pool_3/_reshape:0' layer in Keras? If my guess is incorrect, what am I missing?
I compared the bottleneck feature values from Approach 1 and 2 and they were significantly different. I think I'm feeding them the same input images because I resize and rescale my images before I even read them as input for my script. I have no options set for my ImageDataGenerator in Approach 1, and according to the documentation for that function all the default values do not change my input image. I have set shuffle to false, so I assumed that predict_generator and predict are reading images in the same order. What am I missing?
Please note:
My input images are in RGB format (so number of channels = 3) and I resized all of them to 150x150. I used the preprocess_input function in inceptionv3.py to preprocess all my images.
def preprocess_input(image):
    image /= 255.
    image -= 0.5
    image *= 2.
    return image
Approach 1: Used Keras with tensorflow as backend, an ImageDataGenerator to read my data and model.predict_generator to compute bottleneck features
I followed the example (section "Using the bottleneck features of a pre-trained network: 90% accuracy in a minute") from Keras' blog. Instead of the VGG model listed there I used InceptionV3. Below is the snippet of code I used.
(Code not shown here, but before the code below I: read all input images, resized them to 150x150x3, rescaled them according to the preprocess_input function mentioned above, and saved the resized and rescaled images.)
train_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_directory(my_input_dir, target_size=(150,150),shuffle=False, batch_size=16)
# get bottleneck features
# use pre-trained model and exclude top layer - which is used for classification
pretrained_model = InceptionV3(include_top=False, weights='imagenet', input_shape=(150,150,3))
bottleneck_features_train_v1 = pretrained_model.predict_generator(train_generator,len(train_generator.filenames)//16)
Approach 2: Used Keras with tensorflow as backend, my own reader and model.predict to compute bottleneck features
The only difference between this approach and the earlier one is that I used my own reader to read the input images.
(Code not shown here, but before the code below I: read all input images, resized them to 150x150x3, rescaled them according to the preprocess_input function mentioned above, and saved the resized and rescaled images.)
# inputImages is a numpy array of size <number of input images x 150 x 150 x 3>
inputImages = readAllJPEGsInFolderAndMergeAsRGB(my_input_dir)
# get bottleneck features
# use pre-trained model and exclude top layer - which is used for classification
pretrained_model = InceptionV3(include_top=False, weights='imagenet', input_shape=(img_width, img_height, 3))
bottleneck_features_train_v2 = pretrained_model.predict(inputImages, batch_size=16)
Approach 3: Used TensorFlow (NO KERAS) to compute bottleneck features
I followed retrain.py to extract bottleneck features for my input images. Please note that the weights for that script can be obtained from http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz
As mentioned in that example, I used bottleneck_tensor_name = 'pool_3/_reshape:0' as the layer to extract and compute bottleneck features. Similar to the first 2 approaches, I used resized and rescaled images as input to the script, and I called this feature list bottleneck_features_train_v3.
Thank you so much
Different results between 1 and 2
Since you haven't shown that part of your code, I (maybe wrongly) suspect that the problem is that you did not use preprocess_input when declaring the ImageDataGenerator:
from keras.applications.inception_v3 import preprocess_input
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
Make sure, though, that your saved image files range from 0 to 255. (Bit depth 24).
Different shapes between 1 and 3
There are three possible types of model in this case:
include_top = True -> this will return classes
include_top = False (only) -> this implies pooling = None (no final pooling layer)
include_top = False, pooling='avg' or ='max' -> has a pooling layer
So, since your model is declared without an explicit pooling=something, it doesn't have the final pooling layer in Keras, and the outputs still have the spatial dimensions.
Solve that simply by adding a pooling at the end. One of these:
pretrained_model = InceptionV3(include_top=False, pooling = 'avg', weights='imagenet', input_shape=(img_width, img_height, 3))
pretrained_model = InceptionV3(include_top=False, pooling = 'max', weights='imagenet', input_shape=(img_width, img_height, 3))
Not sure which one the model in the tgz file is using.
As an alternative, you can also get another layer from the Tensorflow model, the one coming immediately before 'pool_3'.
You can look into the Keras implementation of inceptionv3 here:
https://github.com/keras-team/keras/blob/master/keras/applications/inception_v3.py
So, the default parameters are:
def InceptionV3(include_top=True,
                weights='imagenet',
                input_tensor=None,
                input_shape=None,
                pooling=None,
                classes=1000):
Notice that the default is pooling=None; then, when building the model, the code is:
if include_top:
    # Classification block
    x = GlobalAveragePooling2D(name='avg_pool')(x)
    x = Dense(classes, activation='softmax', name='predictions')(x)
else:
    if pooling == 'avg':
        x = GlobalAveragePooling2D()(x)
    elif pooling == 'max':
        x = GlobalMaxPooling2D()(x)

# Ensure that the model takes into account
# any potential predecessors of `input_tensor`.
if input_tensor is not None:
    inputs = get_source_inputs(input_tensor)
else:
    inputs = img_input

# Create model.
model = Model(inputs, x, name='inception_v3')
So if you do not specify pooling, the bottleneck features are extracted without any pooling; you need to specify whether you want average or max pooling on top of these features.
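If you want to check the correspondence numerically, the (N, 3, 3, 2048) features from Approaches 1 and 2 can be collapsed to (N, 2048) by averaging over the spatial axes, which is what GlobalAveragePooling2D does (a sketch, assuming the TensorFlow graph's pool_3 layer is an average pool):
import numpy as np
# Collapse the 3x3 spatial grid by averaging, mimicking pooling='avg'
pooled_features = bottleneck_features_train_v1.mean(axis=(1, 2))   # shape (N, 2048)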
