I am working on a modified ResNet and want to insert dropout after the activation layers.
I tried the following, but because the model is not sequential, it did not work:
def add_dropouts(model, probability=0.5):
    print("Adding Dropouts")
    updated_model = tf.keras.models.Sequential()
    for layer in model.layers:
        print("layer = ", layer)
        updated_model.add(layer)
        if isinstance(layer, tf.keras.layers.Activation):
            updated_model.add(tf.keras.layers.Dropout(probability))
    print("updated model summary = ", updated_model.summary())
    print("model summary = ", model.summary())
    model = updated_model
    return model

base_model = tf.keras.applications.ResNet50V2(include_top=False, input_shape=input_img_shape, pooling='avg')
base_model = add_dropouts(base_model, probability=0.5)
Then I tried my own version using the functional API, but this method doesn't work either and raises a ValueError saying the Tensor doesn't have an output attribute:
prev_layer = base_model.layers[0]
for layer in base_model.layers:
    next_layer = layer(prev_layer.output)
    if isinstance(layer, tf.keras.layers.Activation):
        next_layer = Dropout(0.5)(next_layer.output)
    prev_layer = next_layer
Does anyone know how to add dropout layers into ResNet or any other pretrained network?
So eventually I figured out how to do it, but it's very hacky. Go to:
C:\ProgramData\Anaconda3\envs\<your env name>\Lib\site-packages\tensorflow\python\keras\applications
Open resnet.py. This will also change ResNetV2 instances, because it is based on the original ResNet. Just Ctrl+F for 'activation', and wherever you see an activation layer (usually in the format x = Layer(x), building the model one layer at a time), add:
x = Dropout(prob)(x)
Here is an example:
if not preact:
    x = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name='conv1_bn')(x)
    x = layers.Activation('relu', name='conv1_relu')(x)  # insert layer after each of these
    x = layers.Dropout(prob)(x)  # added dropout
Do this for all similar search results for 'activation'.
Then you will see the dropout added in your model summary.
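For what it's worth, a less hacky alternative is to rebuild the graph with tf.keras.models.clone_model, which replays the model's layer graph (skip connections included) through a clone_function. The sketch below is untested here and insert_dropout is a hypothetical helper name; it wraps every Activation in a small Sequential that appends a Dropout:
import tensorflow as tf

def insert_dropout(model, rate=0.5):
    # clone_function is called once per layer while clone_model replays
    # the original graph, so ResNet's skip connections are preserved
    def clone_fn(layer):
        if isinstance(layer, tf.keras.layers.Activation):
            # a Sequential is itself a layer, so it can stand in for the
            # activation and apply dropout right after it
            return tf.keras.Sequential([
                tf.keras.layers.Activation(layer.activation),
                tf.keras.layers.Dropout(rate),
            ], name=layer.name + "_dropout")
        return layer.__class__.from_config(layer.get_config())

    new_model = tf.keras.models.clone_model(model, clone_function=clone_fn)
    # Activation and Dropout carry no weights, so the flattened weight
    # lists of the two models should still line up one-to-one
    new_model.set_weights(model.get_weights())
    return new_model

base_model = insert_dropout(tf.keras.applications.ResNet50V2(include_top=False, pooling='avg'), rate=0.5)
If you try this, it's worth spot-checking model.summary() and a few layer weights after the copy.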
I have a question.
Working with a CNN, is there a way to get the output of a hidden layer during classification?
Example:
common_input = layers.Input(shape=(224, 224, 3))
x = model0(common_input)  # model0 is a pretrained model on ImageNet
x = layers.Flatten()(x)
p = layers.Dense(768, activation="relu")(x)
p = layers.Dropout(0.3)(p)
p = layers.Dense(8, activation="softmax", name="fc_out")(p)
model = Model(inputs=common_input, outputs=p)
My task is classification, so if I have 2000 images I will get a 2000x8 matrix. If I need the output of the Dense layer, is there a way to get a 2000x768 matrix as well (both in the same computation)?
Thanks
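One common approach (a minimal sketch building on the snippet above; the layer name "fc_dense", the probe model, and the images array are placeholder names, not from the original post) is to name the 768-unit Dense layer and then build a second Model over the same layer graph that returns both tensors, so a single predict call computes both matrices in one forward pass:
x = layers.Flatten()(model0(common_input))
h = layers.Dense(768, activation="relu", name="fc_dense")(x)
p = layers.Dropout(0.3)(h)
p = layers.Dense(8, activation="softmax", name="fc_out")(p)
model = Model(inputs=common_input, outputs=p)

# a second Model sharing the same layers adds no new weights;
# it just exposes an extra output tensor from the existing graph
probe = Model(inputs=common_input,
              outputs=[model.get_layer("fc_out").output,
                       model.get_layer("fc_dense").output])
predictions, features = probe.predict(images)  # shapes (2000, 8) and (2000, 768)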
I get an error using gradient visualization with transfer learning in TF 2.0. The gradient visualization works on a model that does not use transfer learning.
When I run my code I get the error:
assert str(id(x)) in tensor_dict, 'Could not compute output ' + str(x)
AssertionError: Could not compute output Tensor("block5_conv3/Identity:0", shape=(None, 14, 14, 512), dtype=float32)
When I run the code below, it errors. I think there's an issue with the naming conventions, or with connecting the inputs and outputs of the base model, vgg16, to the layers I'm adding. I'd really appreciate your help!
"""
Broken example when grad_model is created.
"""
!pip uninstall tensorflow
!pip install tensorflow==2.0.0
import cv2
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers
import matplotlib.pyplot as plt
IMAGE_PATH = '/content/cat.3.jpg'
LAYER_NAME = 'block5_conv3'
model_layer = 'vgg16'
CAT_CLASS_INDEX = 281
imsize = (224,224,3)
img = tf.keras.preprocessing.image.load_img(IMAGE_PATH, target_size=(224, 224))
plt.figure()
plt.imshow(img)
img = tf.io.read_file(IMAGE_PATH)
img = tf.image.decode_jpeg(img)
img = tf.cast(img, dtype=tf.float32)
# img = tf.keras.preprocessing.image.img_to_array(img)
img = tf.image.resize(img, (224,224))
img = tf.reshape(img, (1, 224,224,3))
input = layers.Input(shape=(imsize[0], imsize[1], imsize[2]))
base_model = tf.keras.applications.VGG16(include_top=False, weights='imagenet',
                                         input_shape=(imsize[0], imsize[1], imsize[2]))
# base_model.trainable = False
flat = layers.Flatten()
dropped = layers.Dropout(0.5)
global_average_layer = tf.keras.layers.GlobalAveragePooling2D()
fc1 = layers.Dense(16, activation='relu', name='dense_1')
fc2 = layers.Dense(16, activation='relu', name='dense_2')
fc3 = layers.Dense(128, activation='relu', name='dense_3')
prediction = layers.Dense(2, activation='softmax', name='output')
for layr in base_model.layers:
    if 'block5' in layr.name:
        layr.trainable = True
    else:
        layr.trainable = False
x = base_model(input)
x = global_average_layer(x)
x = fc1(x)
x = fc2(x)
x = prediction(x)
model = tf.keras.models.Model(inputs = input, outputs = x)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss='binary_crossentropy',
              metrics=['accuracy'])
This portion of the code is where the error lies. I'm not sure what is the correct way to label inputs and outputs.
# Create a graph that outputs target convolution and output
grad_model = tf.keras.models.Model(inputs=[model.input, model.get_layer(model_layer).input],
                                   outputs=[model.get_layer(model_layer).get_layer(LAYER_NAME).output,
                                            model.output])
print(model.get_layer(model_layer).get_layer(LAYER_NAME).output)
# Get the score for target class
with tf.GradientTape() as tape:
    conv_outputs, predictions = grad_model(img)
    loss = predictions[:, 1]
The section below is for plotting a heatmap of gradcam.
print('Prediction shape:', predictions.get_shape())
# Extract filters and gradients
output = conv_outputs[0]
grads = tape.gradient(loss, conv_outputs)[0]
# Apply guided backpropagation
gate_f = tf.cast(output > 0, 'float32')
gate_r = tf.cast(grads > 0, 'float32')
guided_grads = gate_f * gate_r * grads
# Average gradients spatially
weights = tf.reduce_mean(guided_grads, axis=(0, 1))
# Build a ponderated map of filters according to gradients importance
cam = np.ones(output.shape[0:2], dtype=np.float32)
for index, w in enumerate(weights):
    cam += w * output[:, :, index]
# Heatmap visualization
cam = cv2.resize(cam.numpy(), (224, 224))
cam = np.maximum(cam, 0)
heatmap = (cam - cam.min()) / (cam.max() - cam.min())
cam = cv2.applyColorMap(np.uint8(255 * heatmap), cv2.COLORMAP_JET)
output_image = cv2.addWeighted(cv2.cvtColor(img[0].numpy().astype('uint8'), cv2.COLOR_RGB2BGR), 0.5, cam, 1, 0)  # drop the batch dimension before converting
plt.figure()
plt.imshow(output_image)
plt.show()
I also asked this to the tensorflow team on github at https://github.com/tensorflow/tensorflow/issues/37680.
I figured it out. If you set up the model by extending the vgg16 base model with your own layers, rather than inserting the base model into a new model as if it were a layer, then it works.
First set up the model and be sure to declare the input_tensor.
inp = layers.Input(shape=(imsize[0], imsize[1], imsize[2]))
base_model = tf.keras.applications.VGG16(include_top=False, weights='imagenet', input_tensor=inp,
                                         input_shape=(imsize[0], imsize[1], imsize[2]))
This way we don't have to include a line like x=base_model(inp) to show what input we want to put in. That's already included in tf.keras.applications.VGG16(...).
Instead of putting this vgg16 base model inside another model, it's easier to do gradcam by adding layers to the base model itself. I grab the output of the last layer of VGG16 (with the top removed), which is the pooling layer.
block5_pool = base_model.get_layer('block5_pool')
x = global_average_layer(block5_pool.output)
x = fc1(x)
x = prediction(x)
model = tf.keras.models.Model(inputs = inp, outputs = x)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss='binary_crossentropy',
              metrics=['accuracy'])
Now, I grab the layer for visualization, LAYER_NAME='block5_conv3'.
# Create a graph that outputs target convolution and output
grad_model = tf.keras.models.Model(inputs=[model.input],
                                   outputs=[model.output, model.get_layer(LAYER_NAME).output])
print(model.get_layer(LAYER_NAME).output)
# Get the score for target class
with tf.GradientTape() as tape:
    predictions, conv_outputs = grad_model(img)
    loss = predictions[:, 1]
print('Prediction shape:', predictions.get_shape())
# Extract filters and gradients
output = conv_outputs[0]
grads = tape.gradient(loss, conv_outputs)[0]
We (I plus a number of team members developing a project) found a similar problem with a code implementing Grad-CAM that we found in a tutorial.
That code didn't work with a model consisting of a VGG19 base model plus a few extra layers added on top of it. The problem was that the VGG19 base model was inserted as a "layer" inside our model, and apparently the Grad-CAM code didn't know how to deal with it: we were getting a "Graph disconnected..." error. After some debugging (carried out by another team member, not me), we managed to modify the original code to make it work for this kind of model that contains another model inside it. The idea is to pass the inner model as an extra argument to the GradCAM class. Since this may be helpful to others, I am including the modified code below (we also renamed the GradCAM class to My_GradCAM).
class My_GradCAM:
    def __init__(self, model, classIdx, inner_model=None, layerName=None):
        self.model = model
        self.classIdx = classIdx
        self.inner_model = inner_model
        if self.inner_model is None:
            self.inner_model = model
        self.layerName = layerName

[...]

gradModel = tensorflow.keras.models.Model(inputs=[self.inner_model.inputs],
                                          outputs=[self.inner_model.get_layer(self.layerName).output,
                                                   self.inner_model.output])
Then the class can be instantiated by adding the inner model as the extra argument, e.g.:
cam = My_GradCAM(model, None, inner_model=model.get_layer("vgg19"), layerName="block5_pool")
I hope this helps.
Edit: Credit to Mirtha Lucas for doing the debugging and finding the solution.
After a lot of struggle, I condensed the way to draw the heat map when you are using transfer learning. Here is the Keras official tutorial.
The issue I encountered is that when I try to draw the heat map from my model, the DenseNet can only be seen as a functional layer inside my model, so make_gradcam_heatmap cannot reach the layers nested inside that functional layer (it shows up as the 5th layer in the model summary).
Therefore, to mirror the Keras official tutorial, I need to use only the DenseNet as the model for visualization. Here are the steps:
1. Take the inner model out of your model:
dense_model = dense_model.get_layer('densenet121')
2. Copy the weights from the inner model to your newly initialized model:
inputs = tf.keras.Input(shape=(224, 224, 3))
model = model_builder(weights="imagenet", include_top=True, input_tensor=inputs)
for layer, dense_layer in zip(model.layers[1:], dense_model.layers[1:]):
    layer.set_weights(dense_layer.get_weights())
relu = model.get_layer('relu')
x = tf.keras.layers.GlobalAveragePooling2D()(relu.output)
outputs = tf.keras.layers.Dense(5)(x)
model = tf.keras.models.Model(inputs = inputs, outputs = outputs)
3. Draw the heat map:
preprocess_input = keras.applications.densenet.preprocess_input
img_array = preprocess_input(get_img_array(img_path, size=(224, 224)))
heatmap = make_gradcam_heatmap(img_array, model, 'bn')
plt.matshow(heatmap)
plt.show()
get_img_array, make_gradcam_heatmap and save_and_display_gradcam are kept unchanged. Follow the Keras tutorial and you are good to go.
I saw this code at https://github.com/raducrs/Applications-of-Deep-Learning/blob/master/Image%20captioning%20Flickr8k.ipynb and tried to run it in Google Colab, but when I run the code below it gives me an error. It says:
Merge is deprecated
I wonder how I can run this code with the latest version of Keras.
LSTM_CELLS_CAPTION = 256
LSTM_CELLS_MERGED = 1000
image_pre = Sequential()
image_pre.add(Dense(100, input_shape=(IMG_FEATURES_SIZE,), activation='relu', name='fc_image'))
image_pre.add(RepeatVector(MAX_SENTENCE,name='repeat_image'))
caption_model = Sequential()
caption_model.add(Embedding(VOCABULARY_SIZE, EMB_SIZE,
                            weights=[embedding_matrix],
                            input_length=MAX_SENTENCE,
                            trainable=False, name="embedding"))
caption_model.add(LSTM(EMB_SIZE, return_sequences=True, name="lstm_caption"))
caption_model.add(TimeDistributed(Dense(100, name="td_caption")))
combined = Sequential()
combined.add(Merge([image_pre, caption_model], mode='concat', concat_axis=1,name="merge_models"))
combined.add(Bidirectional(LSTM(256,return_sequences=False, name="lstm_merged"),name="bidirectional_lstm"))
combined.add(Dense(VOCABULARY_SIZE,name="fc_merged"))
combined.add(Activation('softmax',name="softmax_combined"))
predictive = Model([image_pre.input, caption_model.input],combined.output)
Merge(mode='concat') is now Concatenate(axis=1).
The following generates the graph correctly on Colab.
import numpy as np
from tensorflow import keras
from tensorflow.keras.layers import *
from tensorflow.keras.models import Model, Sequential
IMG_FEATURES_SIZE = 10
MAX_SENTENCE = 80
VOCABULARY_SIZE = 1000
EMB_SIZE = 100
embedding_matrix = np.zeros((VOCABULARY_SIZE, EMB_SIZE))
LSTM_CELLS_CAPTION = 256
LSTM_CELLS_MERGED = 1000
image_pre = Sequential()
image_pre.add(Dense(100, input_shape=(IMG_FEATURES_SIZE,), activation='relu', name='fc_image'))
image_pre.add(RepeatVector(MAX_SENTENCE,name='repeat_image'))
caption_model = Sequential()
caption_model.add(Embedding(VOCABULARY_SIZE, EMB_SIZE,
                            weights=[embedding_matrix],
                            input_length=MAX_SENTENCE,
                            trainable=False, name="embedding"))
caption_model.add(LSTM(EMB_SIZE, return_sequences=True, name="lstm_caption"))
caption_model.add(TimeDistributed(Dense(100, name="td_caption")))
merge = Concatenate(axis=1,name="merge_models")([image_pre.output, caption_model.output])
lstm = Bidirectional(LSTM(256,return_sequences=False, name="lstm_merged"),name="bidirectional_lstm")(merge)
output = Dense(VOCABULARY_SIZE, name="fc_merged", activation='softmax')(lstm)
predictive = Model([image_pre.input, caption_model.input], output)
predictive.compile('sgd', 'binary_crossentropy')
predictive.summary()
Description:
This is a model with 2 inputs per sample: an image and a caption (a sequence of words).
The input graphs merge at the concatenation point (name='merge_models').
The image is processed simply by a Dense layer (you may want to add convolutions to the image branch); the output of this dense layer is then copied MAX_SENTENCE times in preparation for the merge.
The captions are processed by an LSTM and a Dense layer.
The merge concatenates along the time axis, so it produces 2*MAX_SENTENCE time-steps: the repeated image features followed by the caption features, each with 100 features per step.
The combined branch then ends up predicting one class out of VOCABULARY_SIZE.
The model.summary() is a good way to understand the graph.
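To make the two-input structure concrete, here is a hypothetical smoke test with random placeholder arrays (not part of the original post), confirming that the graph accepts one image-feature vector and one caption sequence per sample:
import numpy as np

n = 32  # a hypothetical batch of samples
image_features = np.random.rand(n, IMG_FEATURES_SIZE)
captions = np.random.randint(0, VOCABULARY_SIZE, size=(n, MAX_SENTENCE))
# one-hot targets matching the Dense(VOCABULARY_SIZE) output
targets = np.zeros((n, VOCABULARY_SIZE), dtype='float32')
targets[np.arange(n), np.random.randint(0, VOCABULARY_SIZE, size=n)] = 1.0

predictive.fit([image_features, captions], targets, epochs=1)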
I am trying to figure out how to train a CNN in Keras without using ImageDataGenerator. Essentially I'm trying to figure out the magic behind the ImageDataGenerator class so that I don't have to rely on it for all my projects.
I have a dataset organized into 2 folders: training_set and test_set. Each of these folders contains 2 sub-folders: cats and dogs.
I am loading them all into memory using Keras' load_img function in a for loop, as follows:
trainingImages = []
trainingLabels = []
validationImages = []
validationLabels = []
imgHeight = 32
imgWidth = 32
inputShape = (imgHeight, imgWidth, 3)
print('Loading images into RAM...')
for path in imgPaths:
    classLabel = path.split(os.path.sep)[-2]
    classes.add(classLabel)
    img = img_to_array(load_img(path, target_size=(imgHeight, imgWidth)))
    if path.split(os.path.sep)[-3] == 'training_set':
        trainingImages.append(img)
        trainingLabels.append(classLabel)
    else:
        validationImages.append(img)
        validationLabels.append(classLabel)
trainingImages = np.array(trainingImages)
trainingLabels = np.array(trainingLabels)
validationImages = np.array(validationImages)
validationLabels = np.array(validationLabels)
When I print the shapes of trainingImages and trainingLabels, I get:
Shape of trainingImages: (8000, 32, 32, 3)
Shape of trainingLabels: (8000,)
My model looks like this:
model = Sequential()
model.add(Conv2D(32, (3, 3), padding="same", input_shape=inputShape))
model.add(Activation("relu"))
model.add(Flatten())
model.add(Dense(len(classes)))
model.add(Activation("softmax"))
And when I compile and try to fit the data, I get:
ValueError: Error when checking target: expected activation_2 to have shape (2,) but got array with shape (1,)
Which tells me my data is not input into the system correctly. How can I properly prepare my data arrays without using ImageDataGenerator?
The error is caused by your model definition, not by ImageDataGenerator (which I don't see used in the code you have posted). I am assuming that len(classes) = 2 because of the error message you are getting. You are getting the error because the last layer of your model expects trainingLabels to contain a vector of size 2 for each data point, but your trainingLabels is a 1-D array.
To fix this, you can either change your last layers to a single unit with a sigmoid activation, since it's binary classification (a softmax over a single unit would always output 1), train with loss='binary_crossentropy', and encode the labels as a single 0/1 column:
model.add(Dense(1))
model.add(Activation("sigmoid"))
or you can change your training and validation labels to vectors using one hot encoding:
from keras.utils import to_categorical
training_labels_one_hot = to_categorical(trainingLabels)
validation_labels_one_hot = to_categorical(validationLabels)
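One caveat (an assumption on my part, based on the loading loop in the question, where classLabel is a folder name such as 'cats' or 'dogs'): to_categorical expects integer class indices, so string labels have to be mapped to integers first, for example:
import numpy as np
from keras.utils import to_categorical

# hypothetical mapping from folder names to integer class indices
class_names = sorted(classes)  # e.g. ['cats', 'dogs']
label_to_index = {name: i for i, name in enumerate(class_names)}

training_labels_int = np.array([label_to_index[l] for l in trainingLabels])
validation_labels_int = np.array([label_to_index[l] for l in validationLabels])

training_labels_one_hot = to_categorical(training_labels_int)
validation_labels_one_hot = to_categorical(validation_labels_int)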