How to get data from within a Keras model for visualization?

I am using TensorFlow 1.12, which has Keras integrated, together with Python 3.6.x.
I wish to use Keras for its simplicity of model building, but I would also like to use the data at intermediate layers to visualize feature maps and kernels, to better understand how machine learning works (even though this is admittedly not so evident).
I am using the MNIST database and a very basic Keras model to try to do what I want to do.
Here is the code
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

print(tf.VERSION)
print(tf.keras.__version__)

tf.keras.backend.clear_session()

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train_shaped = np.expand_dims(x_train, axis=3) / 255.0
x_test_shaped = np.expand_dims(x_test, axis=3) / 255.0
def create_model():
    model = tf.keras.models.Sequential([
        keras.layers.Conv2D(32, kernel_size=(4, 4), strides=(1, 1), activation='relu', input_shape=(28, 28, 1)),
        keras.layers.Dropout(0.5),
        keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
        keras.layers.Conv2D(24, kernel_size=(8, 8), strides=(1, 1)),
        keras.layers.Flatten(),
        keras.layers.Dropout(0.5),
        keras.layers.Dense(128, activation=tf.nn.relu),
        keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(),
                  loss=tf.keras.losses.sparse_categorical_crossentropy,
                  metrics=['accuracy'])
    return model
The above sets up the dataset and the model.
Next I define my session for TensorFlow and do the training.
This all works fine, but now I want to get the data of, for example, the first layer out, ideally as a NumPy array on which I can do the visualization.
model.layers[0].output gives me a Tensor of shape (?, 25, 25, 32) as expected, and I then try an eval() and thereafter a .numpy() call to get my result.
The error message is
You must feed a value for placeholder tensor 'conv2d_6_input' with dtype float and shape [?,28,28,1]
I am looking for help on how to get my data (32 feature maps of 25x25 pixels) out as a NumPy array for visualization.
sess = tf.Session(graph=tf.get_default_graph())
tf.keras.backend.set_session(sess)

with sess.as_default():
    model = create_model()
    model.summary()
    model.fit(x_train_shaped[:10000], y_train[:10000], epochs=2,
              batch_size=64, validation_split=.2)
    model.layers[0].output
    print(model.layers[0].output.shape)
    my_array = model.layers[0].output
    my_array.eval()

tf.keras.backend.clear_session()
sess.close()

First of all, note that getting the output of a model or a layer only makes sense when you feed the input layers some data. You give the model something (i.e. input data) and you get something in return (i.e. the output, feature map, or activation map). That's why it produces the following error:
You must feed a value for placeholder tensor 'conv2d_6_input'
You haven't fed the baby, so it would cry :)
Now, you may object that building a new Keras model is counterproductive: when you already have a large model, you would rather plug in some ready-made code that fetches and visualizes the feature maps, so this route seems uninteresting. But I think you are mistakenly assuming that constructing a new model out of the layers of another model clones the whole model. That's not the case, since the parameters of the layers are shared.
Concretely, what you are looking for can be achieved like this:
from tensorflow.keras.models import Model

viz_conv = Model(model.input, model.layers[0].output)
# my_input_data is a numpy array of shape (num_samples, 28, 28, 1)
conv_active = viz_conv.predict(my_input_data)
All the parameters of viz_conv are shared with model; they have not been copied. Under the hood they use the same weight tensors.
Alternatively, you could define a backend function to do this:
from tensorflow.keras import backend as K

viz_func = K.function([model.input], [model.layers[0].output])  # list whichever layer outputs you like
output = viz_func([my_input_data])
This is covered in the Keras documentation as well, and I highly recommend reading it.
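For the visualization itself, here is a minimal sketch (my own addition, assuming matplotlib and the viz_conv model defined above) that plots the 32 first-layer feature maps of one test image:
import matplotlib.pyplot as plt

conv_active = viz_conv.predict(x_test_shaped[:1])   # shape (1, 25, 25, 32)
fig, axes = plt.subplots(4, 8, figsize=(12, 6))
for i, ax in enumerate(axes.flat):
    ax.imshow(conv_active[0, :, :, i], cmap='gray') # one feature map per panel
    ax.axis('off')
plt.show()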

Related

What is the most efficient way to modify a Keras Model?

Is there a way to add nodes to a layer in an existing Keras model? If so, what is the most efficient way to do so?
Also, is it possible to do the same with layers, i.e. add a new layer to an existing Keras model (for example, right after the input layer)?
One way I know of is to use the Keras functional API, iterating over and cloning each layer of the model in order to create a "copy" of the original model with the desired changes, but is that the most efficient way to accomplish this task?
You can take the output of a layer in a model and build another model starting from it:
import tensorflow as tf
# One simple model
inputs = tf.keras.Input(shape=(3,))
x = tf.keras.layers.Dense(4, activation='relu')(inputs)
outputs = tf.keras.layers.Dense(5, activation='softmax')(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
# Make a second model starting from layer in previous model
x2 = tf.keras.layers.Dense(8, activation='relu')(model.layers[1].output)
outputs2 = tf.keras.layers.Dense(7, activation='softmax')(x2)
model2 = tf.keras.Model(inputs=model.input, outputs=outputs2)
Note that in this case model and model2 share the same input layer and first dense layer objects (model.layers[0] is model2.layers[0] and model.layers[1] is model2.layers[1]).
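As for inserting a layer right after the input, a hedged sketch under the same setup: the inserted layer must produce the shape the next existing layer was built for (here 3 units, since the reused Dense layer's kernel is already (3, 4)):
# insert a new Dense layer between the input and the first existing Dense layer
new_input = tf.keras.Input(shape=(3,))
x = tf.keras.layers.Dense(3, activation='relu')(new_input)  # newly inserted layer
for layer in model.layers[1:]:  # skip the original InputLayer
    x = layer(x)                # reuse the existing layer objects and their weights
model3 = tf.keras.Model(inputs=new_input, outputs=x)
This reuses (not clones) the original layers, so model3 shares weights with model.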

Adding a Concatenated layer to TensorFlow 2.0 (using Attention)

In building a model that uses the TensorFlow 2.0 Attention layer, I followed the example given in the TF docs: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Attention
The last line in the example is
input_layer = tf.keras.layers.Concatenate()(
    [query_encoding, query_value_attention])
Then the example has the comment
# Add DNN layers, and create Model.
# ...
So it seemed logical to do this
model = tf.keras.Sequential()
model.add(input_layer)
This produces the error
TypeError: The added layer must be an instance of class Layer.
Found: Tensor("concatenate/Identity:0", shape=(None, 200), dtype=float32)
UPDATE (after #thushv89 response)
What I am trying to do, in the end, is add an attention layer to the following model, which works well (or convert the model to an attention model).
model = tf.keras.Sequential()
model.add(layers.Embedding(vocab_size, embedding_nodes, input_length=max_length))
model.add(layers.LSTM(20))
#add attention here?
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_error', metrics=['accuracy'])
My data looks like this
4912,5059,5079,0
4663,5145,5146,0
4663,5145,5146,0
4840,5117,5040,0
The first three columns are the inputs and the last column is binary; the goal is classification. The data was prepared similarly to this example, which has a similar purpose (binary classification): https://machinelearningmastery.com/use-word-embedding-layers-deep-learning-keras/
So, the first thing is that Keras has three APIs when it comes to creating models:
Sequential - (which is what you're using here)
Functional - (which is what I'm using in the solution)
Subclassing - (creating Python classes to represent custom models/layers)
The model created in the tutorial is not meant for the Sequential API but for the functional API. So you have to do the following. Note that I've taken the liberty of defining the dense layers with arbitrary parameters (e.g. the number of output classes, which you can change as needed).
import tensorflow as tf
# Variable-length int sequences.
query_input = tf.keras.Input(shape=(None,), dtype='int32')
value_input = tf.keras.Input(shape=(None,), dtype='int32')
# ... the code in the middle
# Concatenate query and document encodings to produce a DNN input layer.
input_layer = tf.keras.layers.Concatenate()(
    [query_encoding, query_value_attention])
# Add DNN layers, and create Model.
# ...
dense_out = tf.keras.layers.Dense(50, activation='relu')(input_layer)
pred = tf.keras.layers.Dense(10, activation='softmax')(dense_out)
model = tf.keras.models.Model(inputs=[query_input, value_input], outputs=pred)
model.summary()
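As for the UPDATE: here is a hedged sketch (my own, not from the TF docs) of one way to rewrite the Embedding/LSTM model with the functional API so that a self-attention layer fits after the LSTM; vocab_size, embedding_nodes and max_length are assumed to be defined as in the question:
inputs = tf.keras.Input(shape=(max_length,))
embedded = tf.keras.layers.Embedding(vocab_size, embedding_nodes)(inputs)
# return_sequences=True keeps the full sequence so attention has something to attend over
lstm_out = tf.keras.layers.LSTM(20, return_sequences=True)(embedded)
attention = tf.keras.layers.Attention()([lstm_out, lstm_out])  # self-attention: [query, value]
pooled = tf.keras.layers.GlobalAveragePooling1D()(attention)
output = tf.keras.layers.Dense(1, activation='sigmoid')(pooled)
model = tf.keras.Model(inputs=inputs, outputs=output)
model.compile(loss='mean_squared_error', metrics=['accuracy'])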

How to bypass portion of neural network in TensorFlow for some (but not all) features

In my TensorFlow model I have some data that I feed into a stack of CNNs before it goes into a few fully connected layers. I have implemented that with Keras' Sequential model. However, I now have some data that should not go into the CNN and should instead be fed directly into the first fully connected layer, because that data contains some values and labels that are part of the input but should not undergo convolutions, as it is not image data.
Is such a thing possible with tensorflow.keras, or should I do that with tensorflow.nn instead? As far as I understand Keras' Sequential models, the input goes in one end and comes out the other, with no special wiring in the middle.
Am I correct that to do this I have to use tensorflow.concat on the data from the last CNN layer and the data that bypasses the CNNs, before feeding it into the first fully connected layer?
Here is a simple example in which the operation is to sum the activations from different subnets:
import keras
import numpy as np
import tensorflow as tf
from keras.layers import Input, Dense, Activation

tf.reset_default_graph()

# this represents your cnn model
def nn_model(input_x):
    feature_maker = Dense(10, activation='relu')(input_x)
    feature_maker = Dense(20, activation='relu')(feature_maker)
    feature_maker = Dense(1, activation='linear')(feature_maker)
    return feature_maker

# a list of input layers; of course the input shapes can be different
input_layers = [Input(shape=(3, )) for _ in range(2)]
coupled_feature = [nn_model(input_x) for input_x in input_layers]

# assume you take the sum of the outputs
coupled_feature = keras.layers.Add()(coupled_feature)
prediction = Dense(1, activation='relu')(coupled_feature)
model = keras.models.Model(inputs=input_layers, outputs=prediction)
model.compile(loss='mse', optimizer='adam')
# example training set
x_1 = np.linspace(1, 90, 270).reshape(90, 3)
x_2 = np.linspace(1, 90, 270).reshape(90, 3)
y = np.random.rand(90)
inputs_x = [x_1, x_2]
model.fit(inputs_x, y, batch_size=32, epochs=10)
You can actually plot the model to gain more intuition
from keras.utils.vis_utils import plot_model
plot_model(model, show_shapes=True)
The plot for the above code shows the two subnet branches merging into the Add layer before the final Dense layer.
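To mirror the question's bypass scenario more literally, here is a hedged sketch that swaps the Add for a Concatenate (the Keras counterpart of tf.concat), with an image branch and a branch that skips the CNN; the shapes are assumptions:
from keras.layers import Conv2D, MaxPooling2D, Flatten, Concatenate

image_input = Input(shape=(28, 28, 1))  # goes through the CNN
extra_input = Input(shape=(10,))        # bypasses the CNN entirely

x = Conv2D(16, (3, 3), activation='relu')(image_input)
x = MaxPooling2D()(x)
x = Flatten()(x)

merged = Concatenate()([x, extra_input])  # bypass data joins here
hidden = Dense(32, activation='relu')(merged)
prediction = Dense(1, activation='sigmoid')(hidden)
bypass_model = keras.models.Model(inputs=[image_input, extra_input], outputs=prediction)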
With a little remodeling and the functional API you can:
#create the CNN - it can also be a sequential
cnn_input = Input(image_shape)
cnn_output = Conv2D(...)(cnn_input)
cnn_output = Conv2D(...)(cnn_output)
cnn_output = MaxPooling2D()(cnn_output)
....
cnn_model = Model(cnn_input, cnn_output)
#create the FC model - can also be a sequential
fc_input = Input(fc_input_shape)
fc_output = Dense(...)(fc_input)
fc_output = Dense(...)(fc_output)
fc_model = Model(fc_input, fc_output)
There is a lot of space for creativity, this is just one of the ways.
#create the full model
full_input = Input(image_shape)
full_output = cnn_model(full_input)
full_output = fc_model(full_output)
full_model = Model(full_input, full_output)
You can use any of the three models in any way you want. They share the layers and the weights, so internally they are the same.
Saving and loading the full model might be quirky. I'd probably save the other two separately and when loading create the full model again.
Notice also that if you save two models that share the same layers, after loading they will probably not share these layers anymore. (Another reason for saving/loading only fc_model and cnn_model, while creating full_model again from code)
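A small sketch of that save/load strategy, assuming the keras imports from the snippets above (file names are placeholders):
from keras.models import load_model

cnn_model.save('cnn_part.h5')
fc_model.save('fc_part.h5')

# later: load the two parts and rebuild full_model from code
cnn_model = load_model('cnn_part.h5')
fc_model = load_model('fc_part.h5')
full_input = Input(image_shape)
full_model = Model(full_input, fc_model(cnn_model(full_input)))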

Why do I need to compile and fit a pretrained model in Keras?

I want to load in a pretrained model and start testing on images.
This is the code that I thought would work:
from __future__ import absolute_import, division, print_function

from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

base_model = InceptionV3(weights='imagenet', include_top=False)
#Preprocessing
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
train_images = train_images / 255.0
test_images = test_images / 255.0
#Preprocessing
test_loss, test_acc = base_model.evaluate(test_images, test_labels)
print('Test accuracy:', test_acc)
Instead it says: "You must compile a model before training/testing"
Looking here https://keras.io/applications/ at InceptionV3: they seem to be compiling and fitting the model after importing it. Why do they do this?
The InceptionV3 model was trained on very different images from Fashion MNIST. What you are seeing in the tutorial is an instance of transfer learning. Roughly, in transfer learning you can split the model up into a feature extraction module and a classification module. The goal of the convolutional and pooling layers is to automate the feature extraction, so that we get an ideal transformation from raw image pixels to a representative set of features that describes the images well.
These features are then fed to a classification module, whose goal is to take the features and actually perform classification. That's the job of the dense layers attached after the convolution and pooling. Also note that the InceptionV3 model was trained on ImageNet images, which have 1000 classes. To successfully apply the ImageNet-trained model to the Fashion MNIST dataset, you will need to retrain the Dense layers, so that the conv and pooling layers can extract features from the images and the Dense layers can classify on top of them. Therefore, set include_top=False as you have done, but you'll also have to attach some Dense layers and retrain those. Also make sure you give the last layer 10 outputs, matching the Fashion MNIST classes.
However, one gotcha is that InceptionV3 takes in 299 x 299 images, whereas Fashion MNIST images are 28 x 28. You will need to resize the images, and also artificially pad them in the third dimension so that they are RGB. Because going from 28 x 28 to 299 x 299 requires a factor-of-10 increase in both dimensions, images resized to that resolution will probably not look good perceptually. InceptionV3 does let you load a model with a different expected input image size; the smallest allowed size for InceptionV3 is unfortunately 75 x 75, so we'll use that and resize up to 75 x 75. To resize the images, you can use scikit-image's resize method from skimage.transform. Also, if you plan on using InceptionV3, you'll need to preprocess the input images the same way its training images were preprocessed.
Therefore:
from __future__ import absolute_import, division, print_function
from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.applications.inception_v3 import preprocess_input # New
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

base_model = InceptionV3(weights='imagenet', include_top=False, input_shape=(75, 75, 3))

# Now add some Dense layers - let's also add in a Global Average Pooling layer too
# as a better way to "flatten"
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a softmax layer -- 10 classes
predictions = Dense(10, activation='softmax')(x)

# Create new model
model = Model(inputs=base_model.input, outputs=predictions)

# Make sure we set the convolutional layers and pooling layers so that they're not trainable
for layer in base_model.layers:
    layer.trainable = False

# Preprocessing
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

train_images = train_images.astype(np.float) / 255.0 # Change
test_images = test_images.astype(np.float) / 255.0 # Change

# Preprocessing the images
from skimage.transform import resize

train_images_preprocess = np.zeros((train_images.shape[0], 75, 75, 3), dtype=np.float32)
for i, img in enumerate(train_images):
    img_resize = resize(img, (75, 75), anti_aliasing=True)
    img_resize = preprocess_input(img_resize).astype(np.float32)
    train_images_preprocess[i] = np.dstack([img_resize, img_resize, img_resize])
del train_images

test_images_preprocess = np.zeros((test_images.shape[0], 75, 75, 3), dtype=np.float32)
for i, img in enumerate(test_images):
    img_resize = resize(img, (75, 75), anti_aliasing=True)
    img_resize = preprocess_input(img_resize).astype(np.float32)
    test_images_preprocess[i] = np.dstack([img_resize, img_resize, img_resize])
del test_images

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train it
model.fit(train_images_preprocess, train_labels, epochs=15)

# Now evaluate the model - note that we're evaluating on the new model, not the old one
test_loss, test_acc = model.evaluate(test_images_preprocess, test_labels)
print('Test accuracy:', test_acc)
Take note that I had to change the expected input shape of the data for the InceptionV3 base model to 75 x 75 x 3, the 3 because it expects colour images. Also, I had to convert your data to floating point before dividing by 255, or the data would still be unsigned 8-bit integers and the only values would be 0 or 1, significantly decreasing your accuracy. I also created new arrays that store the RGB versions of the images, which are not only resized to 75 x 75 but are also preprocessed using the same method InceptionV3 applies to its training images.
One other thing I should mention is that we must set the layers prior to the Dense layers to be non-trainable, so that we don't train them; we want those layers to provide the feature descriptors for the images that get pumped into the Dense layers to do the classification.
Finally, take note that the labels for the training and test data are enumerated from 0 - 9. Therefore, the loss function you need is sparse categorical cross-entropy, which is designed to take single-valued integer labels; the categorical cross-entropy loss function expects one-hot encoding.
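To illustrate the difference between the two label formats (a hedged aside, not part of the original code):
# sparse labels: one integer class id per sample -> sparse_categorical_crossentropy
sparse_labels = np.array([3, 0, 9])
# one-hot labels: one row of 10 indicators per sample -> categorical_crossentropy
onehot_labels = keras.utils.to_categorical(sparse_labels, num_classes=10)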
We finally compile the model so we can set it up for training, then train it. Finally we evaluate the test data accuracy. This will of course require some tuning, especially the number of Dense layers you want, and the number of epochs to choose for training.
Warning
The resizing of the images and the creation of new arrays for them will take some time, as we're looping over 60000 training images and 10000 test images. You'll need to be patient here. To conserve memory, I delete the original training and test images to make room for the preprocessed images.
Ending Note
Because the Fashion MNIST dataset has considerably fewer degrees of freedom than ImageNet, you can get away with a high-accuracy model using fewer layers than normal. The ImageNet database consists of images with varying levels of distortion, object orientation, placement and size. If you constructed a model consisting of just a few conv and pool layers, combined with a flattening and a couple of dense layers, not only would it take less time to train, you'd also get a decently performing model.
Most pre-trained image classification models are pre-trained on the ImageNet dataset, so you're loading the parameter weights from that training when you call base_model = InceptionV3(weights='imagenet', include_top=False). The include_top=False parameter actually chops off the prediction layer from the model, which you are expected to add on and train on your own dataset, Fashion MNIST in this case.
The transfer learning approach doesn't completely get rid of all training, but makes it so you only need to fine tune the model based on your dataset's specific data. Since the model has already learned how to recognize basic and even somewhat complex shapes by training on ImageNet, now it just needs to be trained to recognize what certain combinations of shapes mean in the context of your data.
That being said, I believe you should still be able to call model.predict(x) on some preprocessed image, x, if you change include_top=False to include_top=True, although the model will try to classify the image into one of ImageNet's 1000 classes and not into one of Fashion MNIST's classes.
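For instance, a hedged sketch of that idea (some_image_batch is a hypothetical array of shape (n, 299, 299, 3)):
from keras.applications.inception_v3 import InceptionV3, preprocess_input, decode_predictions

full_model = InceptionV3(weights='imagenet', include_top=True)  # expects 299 x 299 RGB input
preds = full_model.predict(preprocess_input(some_image_batch))
print(decode_predictions(preds, top=3))  # top-3 ImageNet labels, not Fashion MNIST classes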
The example you are showing from the Keras documentation does not do the same thing you want to do. They fit a model in order to perform transfer learning.
You seem to want to just load a pretrained model and then evaluate its loss/accuracy on some dataset. The problem is that in order to call model.evaluate, you first need to define a loss and metrics (including accuracy), and for that you need to call model.compile(loss=..., metrics=..., optimizer=...), simply because it is the only Keras call that sets the loss and metrics of a model.
If for some reason you don't want to do that, you can just call y_pred = model.predict with your dataset and use any Python implementation of the losses and metrics you want on y_true and y_pred. This does not require compiling the model, since you externalize the evaluation.
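A minimal sketch of that externalized evaluation, reusing the preprocessed test set from the answer above:
import numpy as np

y_pred = model.predict(test_images_preprocess)  # class probabilities, shape (n, 10)
y_pred_labels = np.argmax(y_pred, axis=1)       # predicted class per sample
accuracy = np.mean(y_pred_labels == test_labels)
print('Test accuracy:', accuracy)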

What is the significance of the extracted weights of a Keras layer for doing a forward pass without using the Keras API?

My intention is to know the kernel weights used during convolution and then do the forward pass on an image to classify it. This would be easy to do using the Keras API, but that is not an option for my master's thesis: I want to build a CNN model on an FPGA, used only for testing/classification.
Instead of using the Keras API:
1/ I will write plain code where I give my preprocessed image as the input
2/ I will write the convolution algorithm and give it the extracted kernel information to do the convolution
3/ I will write the algorithm for Flatten, and
4/ using the Dense algorithm I want to predict the class
My queries are:
1/ What information does layer.get_weights() actually give? Is it the kernel weights that will be used for the convolution?
2/ If I want to do the classification with the help of the extracted weights, how should I approach it?
The following is my model (for simplicity I have written a model with minimal layers):
def cnn_model():
    model = Sequential()
    model.add(Conv2D(1, (3, 3), padding='same',
                     input_shape=input_shape,
                     activation='relu'))
    model.add(Flatten())
    model.add(Dense(num_classes, activation='softmax'))
    return model

model = cnn_model()

lr = 0.01
sgd = SGD(lr=lr, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])
The input images are grayscale, with width and height 80 x 80.
I have trained my model using the following code:
def lr_schedule(epoch):
    return lr * (0.1 ** int(epoch / 10))

batch_size = batch_size
epochs = nb_epoch

model.fit(X_train, Y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(X_test, Y_test),
          #np.resize(img, (-1, <image shape>)
          callbacks=[LearningRateScheduler(lr_schedule),
                     ModelCheckpoint('path_to_save_model/model.h5',
                                     save_best_only=True)])
I have extracted the layers' weights using:
from keras.models import load_model
import pandas as pd

weight_list = []
for lay in model.layers:
    name = lay.name
    weight = lay.get_weights()
    print(name, " layer weight is:\n\n", weight, "\n\n")
    weight_list.append(weight)

weight_array = np.array(weight_list)
print("weight_array's first element is: \n\n", weight_array[0], "\n\n")
The output of weight_array[0] is:
[array([[[[ 0.3856341 ]],
[[-0.35276324]],
[[-0.51678646]]],
[[[-0.62636113]],
[[ 0.43428165]],
[[-0.26765126]]],
[[[ 0.461921 ]],
[[-0.14468761]],
[[-0.3061749 ]]]], dtype=float32), array([-0.1087065], dtype=float32)]
Any suggestions would be appreciated.
1) What information does layer.get_weights() actually give? Is it the kernel weights that will be used for the convolution?
For convolutional layers, layer.get_weights() returns [kernel, bias].
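For the minimal model above, a quick check of what comes back (the shapes match the printed weight_array[0]):
kernel, bias = model.layers[0].get_weights()
print(kernel.shape)  # (3, 3, 1, 1): (kernel_h, kernel_w, in_channels, out_channels)
print(bias.shape)    # (1,): one bias per output channel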
2) If I want to do the classification with the help of the extracted weights, how should I approach it?
Replicate each stage of your network; the mathematics of each is well documented. Pay close attention to the exact operation being performed: for example, 'convolution' in deep learning is not quite the same as in standard maths (the kernel is not flipped, so it is really cross-correlation). I suggest you pass a known input through the network and check that you get the same answers at every stage.
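As a concrete starting point, here is a hedged plain-NumPy sketch of the convolution stage, using weights extracted with layer.get_weights() from the Conv2D above ('same' padding, ReLU, and cross-correlation, i.e. no kernel flip):
import numpy as np

def conv2d_forward(image, kernel, bias):
    # image: (h, w, in_ch); kernel: (kh, kw, in_ch, out_ch); bias: (out_ch,)
    kh, kw, in_ch, out_ch = kernel.shape
    ph, pw = kh // 2, kw // 2  # 'same' padding for odd kernel sizes
    padded = np.pad(image, ((ph, ph), (pw, pw), (0, 0)))
    h, w = image.shape[:2]
    out = np.zeros((h, w, out_ch))
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + kh, j:j + kw, :]
            for k in range(out_ch):
                # cross-correlation: the kernel is applied as-is, not flipped
                out[i, j, k] = np.sum(patch * kernel[:, :, :, k]) + bias[k]
    return np.maximum(out, 0)  # ReLU

# usage (test_image is a hypothetical array of shape (80, 80, 1)):
# kernel, bias = model.layers[0].get_weights()
# feature_map = conv2d_forward(test_image, kernel, bias)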
Let me explain the solution I arrived at and what I have understood regarding my question. Most of the information came from here.
A model file stores the convolution kernel weights, the convolution biases, the Dense layer weights, and the Dense layer biases. If anyone wants to write a forward pass from scratch in NumPy/Python, or as a function in C++, these kernel weights are the important part. Detailed information is given in the GitHub link.
