Get decoder from trained autoencoder model in Keras - python

I am training a deep autoencoder to map human faces to a 128-dimensional latent space, and then decode them back to their original 128x128x3 format.
I was hoping that after training the autoencoder, I would somehow be able to 'slice' off the second half of the autoencoder, i.e. the decoder network responsible for mapping the latent space (128,) back to the image space (128, 128, 3), by using the functional Keras API and autoenc_model.get_layer().
Here are the relevant layers of my model:
INPUT_SHAPE=(128,128,3)
input_img = Input(shape=INPUT_SHAPE, name='enc_input')
#1
x = Conv2D(64, (3, 3), padding='same', activation='relu')(input_img)
x = BatchNormalization()(x)
# Many Conv2D, BatchNormalization, MaxPooling layers
.
.
.
#Flatten
fc_input = Flatten(name='enc_output')(x)
y = Dropout(DROP_RATE)(fc_input)
y = Dense(128, activation='relu')(y)
y = Dropout(DROP_RATE)(y)
fc_output = Dense(128, activation='linear')(y)
#Reshape
decoder_input = Reshape((8, 8, 2), name='decoder_input')(fc_output)
#Decoder part
#UnPooling-1
z = UpSampling2D()(decoder_input)
# Many Conv2D, BatchNormalization, UpSampling2D layers
.
.
.
#16
decoder_output = Conv2D(3, (3, 3), padding='same', activation='linear', name='decoder_output')(z)
autoenc_model = Model(input_img, decoder_output)
Here is the notebook containing the entire model architecture.
To get the decoder network from the trained autoencoder, I have tried using:
dec_model = Model(inputs=autoenc_model.get_layer('decoder_input').input, outputs=autoenc_model.get_layer('decoder_output').output)
and
dec_model = Model(autoenc_model.get_layer('decoder_input'), autoenc_model.get_layer('decoder_output'))
neither of which seems to work.
I need to extract the decoder layers out of the autoencoder as I want to train the entire autoencoder model first, then use the encoder and the decoder independently.
I could not find a satisfactory answer anywhere else. The Keras blog article on building autoencoders only covers how to extract the decoder for a two-layer autoencoder.
The decoder's input and output shapes should be (128,) and (128, 128, 3), i.e. the input shape of the 'decoder_input' layer and the output shape of the 'decoder_output' layer, respectively.

A couple of changes are needed. First, change:
z = UpSampling2D()(decoder_input)
to
direct_input = Input(shape=(8,8,2), name='d_input')
#UnPooling-1
z = UpSampling2D()(direct_input)
and, second, change:
autoenc_model = Model(input_img, decoder_output)
to
dec_model = Model(direct_input, decoder_output)
autoenc_model = Model(input_img, dec_model(decoder_input))
Now, you can train on the auto encoder and predict using the decoder.
import numpy as np
autoenc_model.fit(np.ones((5,128,128,3)), np.ones((5,128,128,3)))
dec_model.predict(np.ones((1,8,8,2)))
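If you also want the encoder on its own (I'm assuming that is part of your goal, reusing the tensors already defined in your model), no restructuring is needed, since its input is already the model input:
enc_model = Model(input_img, fc_output)         # maps images to the 128-dim latent vector
# or, if you want an output you can feed straight into dec_model:
enc_reshaped = Model(input_img, decoder_input)  # (8, 8, 2) reshaped latent
latent = enc_model.predict(np.ones((1, 128, 128, 3)))  # shape (1, 128)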
You can also refer to this self-contained example:
https://github.com/keras-team/keras/blob/master/examples/variational_autoencoder.py

My solution isn't very elegant, and there are probably better solutions out there, but since no one has replied yet, I'll post it (I was actually hoping someone would, so that I could improve my own implementation, as you'll see below).
So what I did was build a network that can take a secondary input, directly into the latent space.
Unfortunately, both inputs are obligatory, so I end up with a network that requires dummy arrays full of zeros for the 'unwanted' input (as you'll see in a moment).
Using Keras functional API:
image_input = Input(shape=image_shape)
conv1 = Conv2D(...,activation='relu')(image_input)
...
dense_encoder = Dense(...)(<layer>)
z_input = Input(shape=(n_latent,))
decoder_entry = Dense(...,activation='relu')(Add()([dense_encoder,z_input]))
...
decoder_output = Conv2DTranspose(...)(<layer>)
model = Model(inputs=[image_input,z_input], outputs=decoder_output)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
encoder = Model(inputs=image_input,outputs=dense_encoder)
decoder = Model(inputs=[z_input,image_input], outputs=decoder_output)
Note that you shouldn't compile the encoder and decoder.
(some code is either omitted or left with ... for you to fill in your specific needs).
Finally, to train you'll have to provide a dummy array of zeros for the latent input. So, to train the entire autoencoder (images plays the role of X here):
model.fit([images, np.zeros((len(images), n_latent))], images)
And then you can get the latent features using:
latent_features = encoder.predict(images)
Or use the decoder with latent input and dummy variables (note the order of inputs above):
decoder.predict([Z_inputs,np.zeros(shape=images.shape)])
Finally, another solution I haven't tried is to build two parallel models with the same architecture, one being the full autoencoder and the second only the decoder part, and then use:
decoder_layer.set_weights(model_layer.get_weights())
It should work, but I haven't confirmed it. It does have the disadvantage of having to copy the weights again every time you train the autoencoder model.
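For reference, a minimal sketch of that weight-copying idea, assuming standalone_decoder (a hypothetical model built with only the decoder architecture) uses the same layer names as the corresponding layers in the full model above:
# Copy weights layer by layer, matching layers by name (names assumed identical in both models)
for layer in standalone_decoder.layers:
    if layer.weights:  # skip Input and other weight-less layers
        layer.set_weights(model.get_layer(layer.name).get_weights())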
So to conclude, I am aware of the many problems here, but again, I only posted this because no one else had replied, and I was hoping it would still be of some use to you.
Please comment if something is not clear.

An option is to define a function which uses get_layer and then reconstructs the decoder part inside it. For example, consider a simple autoencoder with the architecture [n_inputs, 500, 100, 500, n_outputs], where you want to run some inputs through the second half only (i.e. run the 100-dimensional bottleneck through the 500-unit layer and the output layer):
# Function to get outputs from a given set of bottleneck inputs
def bottleneck_to_outputs(bottleneck_inputs, autoencoder):
    # Run bottleneck_inputs (e.g. 100 units) through the decoder layer (e.g. 500 units)
    x = autoencoder.get_layer('decoder')(bottleneck_inputs)
    # Run x (e.g. 500 units) through the output layer (n units = n features)
    x = autoencoder.get_layer('output')(x)
    return x
For your example, the following function should work (assuming you have given your layers the names referenced here).
def decoder_part(autoenc_model, image):
    #UnPooling-1
    z = autoenc_model.get_layer('upsampling1')(image)
    #9
    z = autoenc_model.get_layer('conv2d1')(z)
    z = autoenc_model.get_layer('batchnorm1')(z)
    #10
    z = autoenc_model.get_layer('conv2d2')(z)
    z = autoenc_model.get_layer('batchnorm2')(z)
    #UnPooling-2
    z = autoenc_model.get_layer('upsampling2')(z)
    #11
    z = autoenc_model.get_layer('conv2d3')(z)
    z = autoenc_model.get_layer('batchnorm3')(z)
    #12
    z = autoenc_model.get_layer('conv2d4')(z)
    z = autoenc_model.get_layer('batchnorm4')(z)
    #UnPooling-3
    z = autoenc_model.get_layer('upsampling3')(z)
    #13
    z = autoenc_model.get_layer('conv2d5')(z)
    z = autoenc_model.get_layer('batchnorm5')(z)
    #14
    z = autoenc_model.get_layer('conv2d6')(z)
    z = autoenc_model.get_layer('batchnorm6')(z)
    #UnPooling-4
    z = autoenc_model.get_layer('upsampling4')(z)
    #15
    z = autoenc_model.get_layer('conv2d7')(z)
    z = autoenc_model.get_layer('batchnorm7')(z)
    #16
    decoder_output = autoenc_model.get_layer('decoder_output')(z)
    return decoder_output
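To turn this into the standalone decoder model you are after, you can wrap the function with a fresh Input (a minimal sketch; the (8, 8, 2) shape comes from your 'decoder_input' Reshape layer and the name is arbitrary):
dec_input = Input(shape=(8, 8, 2), name='standalone_dec_input')
dec_model = Model(dec_input, decoder_part(autoenc_model, dec_input))
dec_model.summary()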
Given this function, it would make sense to also have a way to test if it is working correctly. In order to do this, define another model which gets you from inputs to the bottleneck (latent space), such as:
bottleneck_layer = Model(inputs=input_img, outputs=decoder_input)
Then, as a test, run a vector of ones through the first part of the model and obtain the latent space:
import numpy as np
ones_image = np.ones((128,128,3))
bottleneck_ones = bottleneck_layer(ones_image.reshape(1,128,128,3))
And then run that latent space through the function defined above to create a variable which you will test against the output of full network:
decoded_test = decoder_part(autoenc_model, bottleneck_ones)
Now, run the ones_image through the whole network and verify that you get the same results:
import tensorflow as tf
model_test = autoenc_model.predict(ones_image.reshape(1,128,128,3))
tf.debugging.assert_equal(model_test, decoded_test, message='Tensors are not equivalent')
If the assert_equal line does not throw an error, your decoder is working correctly.

Related

Image sequence detection with Keras, Convolutional and Stateful Neural Network

I am trying to write a pretty complicated (at least for me) neural network in Keras that needs to combine both a common CNN structure and an LSTM/GRU layer.
Basically, I have a dataset of climatological maps of the Mediterranean Sea; each map details wind, pressure and other parameters. I am studying Medicanes (Mediterranean hurricanes) and my goal is to create a neural network that can classify each map with a label of zero if there is no trace of such a hurricane, or one if the map contains one.
In order to achieve that I need a network with two parts:
feature extractor (normal CNN).
temporal layer (LSTM/GRU).
The main cause of this is that each map is correlated with the previous one because the formation and life cycle of a Medicane can take several days to complete.
Important note: the dataset is too big to be uploaded all at once so I have to work one batch at a time.
I am working with Keras and I found it pretty challenging to adapt its standard framework to my needs so I have come up with some peculiar flow to feed my data into the network.
In particular, I found it hard to pass both my batch size and my time-step parameter to the GRU layer using a more standard alternative.
This is what I tried:
I am positively sure I have overcomplicated the task, but, as I said I am not very proficient with Keras and TensorFlow.
The main problem was that I could not find a way to import the data both in a batch (for RAM reasons) and in a sequence of 10-15 pictures (to be used as the time steps in the GRU layer).
I solved this by importing batches of 120 maps in order (no shuffle), creating a way to turn these batches into the sequences of images I needed, and then re-batching the sequences and feeding them to the model manually.
Data Import
batch_size=120
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "./Figures_1/Train",
    validation_split=None,
    subset=None,
    labels="inferred",
    label_mode="binary",
    color_mode="rgb",
    interpolation='bilinear',
    batch_size=batch_size,
    image_size=(600, 600),
    shuffle=False,
    seed=123
)
Get a sequence of Images
Here, I break the 120-map batches down into sequences of 60 observations, and I return each sequence one at a time.
sequence_length = 60
def sequence_x(train_dataset):
    x_numpy = np.asarray(list(map(lambda x: x[0], tfds.as_numpy(train_dataset))), dtype=object)
    for element in range(0, x_numpy.shape[0]):
        for i in range(0, x_numpy.shape[0], sequence_length):
            x_seq = x_numpy[element][i:i+sequence_length]
            yield x_seq
def sequence_y(train_dataset):
    y_numpy = np.asarray(list(map(lambda x: x[1], tfds.as_numpy(train_dataset))), dtype=object)
    for element in range(0, y_numpy.shape[0]):
        for i in range(0, y_numpy.shape[0], sequence_length):
            y_seq = y_numpy[element][i:i+sequence_length]
            yield y_seq
CNN Model
I build the CNN model based on a pre-trained DenseNet
from keras.layers import TimeDistributed, GRU
def build_convnet(shape=(600, 600, 3)):
    inputs = keras.Input(shape=shape)
    x = inputs
    # preprocessing
    x = keras.applications.densenet.preprocess_input(x)
    # Convbase
    x = convBase(x)
    x = layers.Flatten()(x)
    # Fine tuning
    x = keras.layers.Dense(1024, activation='relu')(x)
    x = layers.Dropout(0.2)(x)
    x = keras.layers.Dense(512, activation='relu')(x)
    x = keras.layers.GlobalMaxPool2D()
    return x
GRU Model
I build the time part of the network with a GRU layer
def action_model(shape=(15, 600, 600, 3), nbout=15):
    # Create our convnet with (112, 112, 3) input shape
    convnet = build_convnet(shape[1:]) #[1:]
    # then create our final model
    model = keras.Sequential()
    # add the convnet with (5, 112, 112, 3) shape
    model.add(TimeDistributed(convnet, input_shape=shape))
    # here, you can also use GRU or LSTM
    model.add(GRU(64))
    # and finally, we make a decision network
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(15, activation='softmax'))
    return model
Transfer Learning
I unfreeze and retrain part of the DenseNet convolutional base:
convBase = DenseNet121(include_top=False, weights=None, input_shape=(600, 600, 3), pooling="avg")
for layer in convBase.layers:
    if 'conv5' in layer.name:
        layer.trainable = True
for layer in convBase.layers:
    if 'conv4' in layer.name:
        layer.trainable = True
Model Compile
Model compilation (image size = 600x600x3)
INSHAPE = (15, 600, 600, 3) # (5, 112, 112, 3)
model = action_model(INSHAPE, 1)
optimizer = keras.optimizers.Adam(0.001)
model.compile(
    optimizer,
    'categorical_crossentropy',
    metrics='accuracy'
)
Model Fit
Here I manually batch my data: I turn a (60, 600, 600, 3) array into a (4, 15, 600, 600, 3) array, meaning 4 batches, each containing a 15-map-long sequence (see the sketch after the loop below for a possible tf.data alternative).
epochs = 10
for value in range(0, epochs):
    train_x, train_y = sequence_x(train_ds), sequence_y(train_ds)
    val_x, val_y = sequence_x(validation_ds), sequence_y(validation_ds)
    for i in range(0, 278):
        x = next(train_x, "none")
        y = next(train_y, "none")
        if (x != "none" or y != "none"):
            if (np.any(x) and np.any(y)):
                x_stack = np.stack((x[:15], x[15:30], x[30:45], x[45:]))
                y_stack = np.stack((y[:15], y[15:30], y[30:45], y[45:]))
                y_stack = y_stack.reshape(4, 15)
                model.fit(x=x_stack, y=y_stack,
                          validation_data=None,
                          batch_size=None,
                          shuffle=False)
            else:
                continue
        else:
            continue
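For what it's worth, a possible simpler alternative to the manual re-batching above (a sketch, not a drop-in replacement: it assumes shuffle=False and that each 120-map batch divides evenly into 15-map sequences) is to let tf.data build the sequences:
# Unbatch to single maps, group them into sequences of 15, then into batches of 4 sequences
seq_ds = (train_ds
          .unbatch()
          .batch(15, drop_remainder=True)   # each element: images (15, 600, 600, 3), labels (15, 1)
          .batch(4, drop_remainder=True))   # each element: images (4, 15, 600, 600, 3)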
The idea is to get a model that, when presented with a sequence of images, can categorize each one of them with a 0 or a 1 if they have a Medicane or not.
The model does compile without any errors, but the results it provides are horrible.
What am I doing incorrectly? Is there a more effective way to write all of this?

Using Keras VGG19 preprocess_input function during model training

I am using the VGG19 model pretrained on the ImageNet dataset, with the top layers (from the flatten layer up to the final output layer) removed.
The problem I am solving takes multiple inputs that come from the ModelNet dataset (a modified version, actually). So what I am doing is using VGG19 to extract features from these images, then concatenating the outputs and feeding them to the rest of the network, i.e. the top of the model that I have created according to my needs and number of classes (20 in my case). This is the code for the model I am using (for reference):
import tensorflow as tf
from keras.layers import Input, Dense, Dropout, Flatten, concatenate
from keras.applications.vgg19 import VGG19
input_1 = Input(shape=(224, 224, 3), name='image1')
input_2 = Input(shape=(224, 224, 3), name='image2')
input_3 = Input(shape=(224, 224, 3), name='image3')
input_4 = Input(shape=(224, 224, 3), name='image4')
base_model = VGG19(weights='imagenet', input_shape=(224, 224, 3), include_top=False)
base_model.trainable = False
x1 = base_model(input_1, training=False)
x2 = base_model(input_2, training=False)
x3 = base_model(input_3, training=False)
x4 = base_model(input_4, training=False)
x = concatenate([x1, x2, x3, x4])
x = Flatten()(x)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(256, activation='relu')(x)
outputs = Dense(20, activation='softmax', name='class_out')(x)
model = tf.keras.models.Model([input_1, input_2, input_3, input_4], outputs)
print(model.summary())
Now, I want to know: is it necessary to apply keras.applications.vgg19.preprocess_input to my input images before training the model, or should I use this function only when predicting?
As for my inputs, they are already normalized values that I feed in for model training via a custom data-loader function, which simply returns a generator of 4 normalized input images and a class_out label, like this:
# The output from data-loader function
# where x_batch are values normalized by x_batch[i,j]/255
yield {'image1': x_batch[:, 0],
'image2': x_batch[:, 1],
'image3': x_batch[:, 2],
'image4': x_batch[:, 3], }, {'class_out': y_batch}
However, whenever I use preprocess_input instead of plain normalization, my output image is a weird-looking image like this:
I do not understand this behaviour; if anyone can help me with this, please do.
VGG19 expects BGR input images, not RGB. Accordingly, keras.applications.vgg19.preprocess_input converts the input images from RGB to BGR and then zero-centers each color channel with respect to the ImageNet dataset, without scaling. See the Keras documentation for the VGG19 model and for the preprocessing function for more details.
So, if you are already converting your images from RGB to BGR (and zero-centering them) before passing them to the model, then you don't need to use keras.applications.vgg19.preprocess_input on your input images as well.
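If you decide to rely on preprocess_input instead of dividing by 255, here is a minimal sketch of how a data loader could apply it (the function and variable names are hypothetical; x is assumed to hold raw RGB pixel values in the 0-255 range with shape (N, 4, 224, 224, 3)):
import numpy as np
from tensorflow.keras.applications.vgg19 import preprocess_input

def data_loader(x, y, batch_size=8):
    for start in range(0, len(x), batch_size):
        x_batch = x[start:start + batch_size].astype('float32')
        # RGB -> BGR plus ImageNet mean subtraction, applied along the last (channel) axis
        x_batch = preprocess_input(x_batch)
        yield {'image1': x_batch[:, 0],
               'image2': x_batch[:, 1],
               'image3': x_batch[:, 2],
               'image4': x_batch[:, 3]}, {'class_out': y[start:start + batch_size]}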

How to get the latent vector as an output from a cnn model before training to the fully connected layer?

I am working on a CNN model using TensorFlow in Google Colab. I am unable to extract the latent vectors from the convolutional layers; I want to extract the output of the convolutional layers, i.e. the layers before the fully connected layer.
I have tried with the following code
a = dropout()(classifier_model.output)
print(a)
I am unable to understand the solution suggested in the linked Stack Overflow answer about printing the value of a TensorFlow object after applying a conv-pool layer.
Does anyone have any suggestions?
You can use the get_layer method of the Model class to get a layer by its name. Below is an example with a dummy 1D CNN and a binary classifier:
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv1D, ReLU, MaxPool1D, Flatten, Dense
from tensorflow.keras.models import Model

timesteps = 100
nfeatures = 2
# build the model using the functional API
# example of a 1D CNN inspired by the your stack overflow link, but using a model instead of successive *raw* layers
# the values of the Conv1D filters and kernels are different
input = Input((timesteps, nfeatures))
p = Conv1D(filters=16, kernel_size=10)(input)
p = ReLU()(p)
p = MaxPool1D(pool_size=2)(p)
p = Conv1D(filters=32, kernel_size=10)(p)
p = ReLU()(p)
p = MaxPool1D(pool_size=2)(p)
p = Conv1D(filters=64, kernel_size=10)(p)
p = ReLU()(p)
p = MaxPool1D(pool_size=2, name='conv1Dfeat')(p) # give a name to the CNN output
# fully connected part
p = Flatten()(p)
p = Dense(10)(p)
# could add a dropout layer to ease optimization
finaloutput = Dense(1, activation='sigmoid')(p)
# full model
model = Model(inputs=input, outputs=finaloutput)
# compile network, i.e. define optimizer, loss and metrics
model.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
You need to train the model using the fit method with some data. Then you can get the output of the layer named conv1Dfeat (the last layer of the convolutional part) by defining the model:
modelCNN = Model(inputs=input, outputs=model.get_layer('conv1Dfeat').output)
modelCNN.summary()
If you want to get the output of the convolutional part for, say, a single NumPy input array of shape (timesteps, nfeatures), you can use the predict method of the Model class on batched data:
data = np.random.normal(size=(timesteps, nfeatures)) # dummy data
data_tf = tf.expand_dims(data, axis=0) # convert to TF tensor and add batch dimension at the same time
cnn_out_np = modelCNN.predict(data_tf)
cnn_out_np = np.squeeze(cnn_out_np, axis=0) # remove batch dimension
print(cnn_out_np.shape)
(4, 64)

failing to load weights into model XCeption-CNN

I make the same model and load the weights. It had worked before, but now it does not. Maybe I'm missing something obvious, but I thought I checked everything.
def make_model(trainable=False):
    xception = Xception(include_top=False, weights='imagenet', input_shape=input_shape)
    xception.trainable = trainable
    inputs = Input(shape=input_shape, name='xception_input')
    x = xception(inputs, training=False)
    x = Flatten()(x)
    x = Dense(256, activation='swish', name='xception_int', trainable=True)(x)
    x = Dropout(0.6)(x)
    outputs = Dense(17, activation='softmax', name='xception_output')(x)
    model = Model(inputs, outputs)
    return model
Then I load the weights:
model.load_weights('models/xception2_weights.h5')
The end of the error message differs depending on the platform. On Mac it says:
ValueError: Shapes (32,) and (3, 3, 32, 64) are incompatible
on Windows:
ValueError: axes don't match array
From what it says on the Mac, I guess it has something to do with the Xception part. I had used load_weights in the same way successfully before; I don't know why it behaves like this now.
If anyone can help, that would be great.
I found that a model created via tf.keras.applications.Xception with the functional API has a different structure (and hence different weight shapes) than a model that includes tf.keras.applications.Xception inside a Sequential model.
It also happens if you remove the top layer, add a new classifier, and then go back to the original Xception shape with a Dense(1000) output.
If your initial model was generated with a different classifier, and you added more layers and then went back, the weights will also mismatch. Check the layers of the original model and the new one and compare their shapes.
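A minimal sketch of how you might compare the two, assuming old_model is rebuilt with the original architecture and new_model is the current one (both names are hypothetical):
# Print weight shapes side by side to spot where the two architectures diverge
for old_layer, new_layer in zip(old_model.layers, new_model.layers):
    old_shapes = [w.shape for w in old_layer.get_weights()]
    new_shapes = [w.shape for w in new_layer.get_weights()]
    if old_shapes != new_shapes:
        print(old_layer.name, old_shapes, '!=', new_layer.name, new_shapes)
If the layer names still match, model.load_weights(path, by_name=True, skip_mismatch=True) may also let you load just the layers that line up, though whether that helps depends on how the weights were saved.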

How to use Bidirectional RNN and Conv1D in keras when shapes are not matching?

I am brand new to deep learning, so I'm reading through Deep Learning with Keras by Antonio Gulli and learning a lot. I want to start using some of the concepts. I want to try to implement a neural network with a 1-dimensional convolutional layer that feeds into a bidirectional recurrent layer (like the paper below). All the tutorials or code snippets I've encountered either do not implement anything remotely similar to this (e.g. image recognition) or use an older version of Keras with different functions and usage.
What I'm trying to do is a variation of this paper:
(1) convert DNA sequences to one-hot encoding vectors; ✓
(2) use a 1 dimensional convolutional neural network; ✓
(3) with max pooling; ✓
(4) send the output to a bidirectional RNN; ⓧ
(5) classify the input;
I cannot figure out how to get the shapes to match up on the Bidirectional RNN. I can't even get an ordinary RNN to work at this stage. How can I restructure the incoming layers to work with a Bidirectional RNN?
Note:
The original code came from https://github.com/uci-cbcl/DanQ/blob/master/DanQ_train.py but I simplified the output layer to just do binary classification. This process was described (kind of) in https://github.com/fchollet/keras/issues/3322 but I cannot get it to work with the updated Keras. The original code (and the 2nd link) work on a very large dataset, so I am generating some fake data to illustrate the concept. They also use an older version of Keras in which key functionality has since changed.
# Imports
import tensorflow as tf
import numpy as np
from tensorflow.python.keras._impl.keras.layers.core import *
from tensorflow.python.keras._impl.keras.layers import Conv1D, MaxPooling1D, SimpleRNN, Bidirectional, Input
from tensorflow.python.keras._impl.keras.models import Model, Sequential
# Set up TensorFlow backend
K = tf.keras.backend
K.set_session(tf.Session())
np.random.seed(0) # For keras?
# Constants
NUMBER_OF_POSITIONS = 40
NUMBER_OF_CLASSES = 2
NUMBER_OF_SAMPLES_IN_EACH_CLASS = 25
# Generate sequences
https://pastebin.com/GvfLQte2
# Build model
# ===========
# Input Layer
input_layer = Input(shape=(NUMBER_OF_POSITIONS,4))
# Hidden Layers
y = Conv1D(100, 10, strides=1, activation="relu", )(input_layer)
y = MaxPooling1D(pool_size=5, strides=5)(y)
y = Flatten()(y)
y = Bidirectional(SimpleRNN(100, return_sequences = True, activation="tanh", ))(y)
y = Flatten()(y)
y = Dense(100, activation='relu')(y)
# Output layer
output_layer = Dense(NUMBER_OF_CLASSES, activation="softmax")(y)
model = Model(input_layer, output_layer)
model.compile(optimizer="adam", loss="categorical_crossentropy", )
model.summary()
# ~/anaconda/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/layers/recurrent.py in build(self, input_shape)
# 1049 input_shape = tensor_shape.TensorShape(input_shape).as_list()
# 1050 batch_size = input_shape[0] if self.stateful else None
# -> 1051 self.input_dim = input_shape[2]
# 1052 self.input_spec[0] = InputSpec(shape=(batch_size, None, self.input_dim))
# 1053
# IndexError: list index out of range
You don't need to restructure anything at all to get the output of a Conv1D layer into an LSTM layer.
So, the problem is simply the presence of the Flatten layer, which destroys the shape.
These are the shapes used by Conv1D and LSTM:
Conv1D: (batch, length, channels)
LSTM: (batch, timeSteps, features)
Length is the same as timeSteps, and channels is the same as features.
Using the Bidirectional wrapper won't change a thing either. It will only duplicate your output features.
Classifying.
If you're going to classify the entire sequence as a whole, your last LSTM must use return_sequences=False. (Or you may use some Flatten + Dense layers after it instead.)
If you're going to classify each step of the sequence, all your LSTMs should have return_sequences=True. You should not flatten the data after them.
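A minimal sketch of the hidden layers from your model without the problematic Flatten, classifying the whole sequence (so the recurrent layer keeps its default return_sequences=False; swap SimpleRNN for LSTM/GRU as you prefer):
# Keep the 3D (batch, steps, channels) shape all the way into the RNN
y = Conv1D(100, 10, strides=1, activation="relu")(input_layer)
y = MaxPooling1D(pool_size=5, strides=5)(y)               # (batch, 6, 100)
y = Bidirectional(SimpleRNN(100, activation="tanh"))(y)   # return_sequences=False -> (batch, 200)
y = Dense(100, activation='relu')(y)
output_layer = Dense(NUMBER_OF_CLASSES, activation="softmax")(y)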
