Keras CNN Autoencoder input shape is wrong - python

I have built a CNN autoencoder using Keras and it worked fine for the MNIST test data set. I am now trying it with a different data set collected from another source. These are plain images that I have to read in using cv2, which works fine. I then convert the images into a numpy array, which again I think works fine. But when I call the .fit method it gives me this error:
Error when checking target: expected conv2d_39 to have shape (100, 100, 1) but got array with shape (100, 100, 3)
I tried converting the images to grayscale, but then they get the shape (100, 100) and not (100, 100, 1), which is what the model wants. What am I doing wrong here?
Here is the code that I am using:
def read_in_images(path):
    images = []
    for files in os.listdir(path):
        img = cv2.imread(os.path.join(path, files))
        if img is not None:
            images.append(img)
    return images
train_images = read_in_images(train_path)
test_images = read_in_images(test_path)
x_train = np.array(train_images)
x_test = np.array(test_images) # (36, 100, 100, 3)
input_img = Input(shape=(100,100,3))
x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(168, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder.fit(x_train, x_train,
                epochs=25,
                batch_size=128,
                shuffle=True,
                validation_data=(x_test, x_test),
                callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])
The model works fine with the MNIST data set but not with my own data set. Any help will be appreciated.

Your input and output shapes are different, and that is what triggers the error: an autoencoder's target is its own input, so the decoder must produce the same shape it was fed.
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
should be
decoded = Conv2D(num_channels, (3, 3), activation='sigmoid', padding='same')(x)  # num_channels = 3 for your RGB data

I ran some tests. Load the data in grayscale like this:
img = cv2.imread(os.path.join(path, files), 0)
then expand the dims of the final loaded array:
x_train = np.expand_dims(x_train, -1)
and finally normalize your data with a simple:
x_train = x_train / 255.
(The input of your model must then be input_img = Input(shape=(100, 100, 1)).)
The loss becomes normal again and the model runs well!
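Putting those steps together, a minimal loading sketch (my consolidation of the above, assuming the question's folder layout and 100x100 images; read_in_images_gray is a hypothetical variant of the asker's helper):
import os
import cv2
import numpy as np

def read_in_images_gray(path):
    # flag 0 tells cv2.imread to load the image as single-channel grayscale
    images = []
    for files in os.listdir(path):
        img = cv2.imread(os.path.join(path, files), 0)
        if img is not None:
            images.append(img)
    return images

x_train = np.array(read_in_images_gray(train_path), dtype='float32')
x_train = np.expand_dims(x_train, -1)  # (N, 100, 100) -> (N, 100, 100, 1)
x_train = x_train / 255.               # scale to [0, 1] for the sigmoid output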
UPDATE after comment
In order to keep all the RGB channels through the network, you need an output that matches your input shape.
Here, if you feed images with shape (100, 100, 3), you need an output of (100, 100, 3) from your decoder.
The layer decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x) shrinks the output to shape (100, 100, 1).
So you simply need to change the number of filters; we want 3 color channels, so the conv must be:
decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
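A quick way to catch this kind of mismatch before training is to assert that the model's output shape equals its input shape (a minimal check, assuming the autoencoder has been built as above):
# The target of an autoencoder is its own input, so the shapes must match exactly.
assert autoencoder.output_shape[1:] == autoencoder.input_shape[1:], \
    (autoencoder.input_shape, autoencoder.output_shape)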

Related

Negative Loss In Keras Convolutional AutoEncoder

I am trying to implement the convolutional autoencoder from the Keras documentation (https://blog.keras.io/building-autoencoders-in-keras.html). In the example they use it on the MNIST dataset with a single channel, but I want to use all 3 RGB channels, and the dataset I am using has different dimensions. So what I tried is to change the relevant parts of the code so the decoder outputs an image with dims = (128, 128, 3), but the problem is that I get a negative loss (deeply negative) and I do not know what is happening. This is the chunk of code where I do so:
input_img = keras.Input(shape= input_dim)
x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(encoded)
x = layers.UpSampling2D((2, 2))(x)  # UpSampling3D would reject this 4D tensor
x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = layers.UpSampling2D((2, 2))(x)
decoded = layers.Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = keras.Model(input_img, decoded)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.fit(x_train, x_train,
                epochs=50,
                batch_size=128,
                shuffle=True,
                validation_data=(x_test, x_test))
input_dim is equal to (128, 128, 3), and my data dimensions are x_train.shape == (6000, 128, 128, 3) and x_test.shape == (1200, 128, 128, 3).
Thanks in advance!

Inverse of keras.Flatten for building autoencoder

My goal is to build a convolutional autoencoder that encodes an input image to a flat vector of size (10, 1). I followed the example from the Keras documentation and modified it for my purposes. Unfortunately, a model like this:
input_img = Input(shape=(28, 28, 1))
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Flatten()(x)
encoded = Dense(units = 10, activation = 'relu')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
gives me
ValueError: Input 0 is incompatible with layer conv2d_39: expected ndim=4, found ndim=2
I think I should add some layer to my decoder to invert the effect of Flatten, but I am not really sure which one. Can you help?
Why do you want the vector to have specifically the shape (10, 1)?
You are then trying to run convolutions on it with a 3x3 kernel, which does not really make sense.
A convolutional layer expects an input with height, width and channels, so the output of the dense layer has to be reshaped first, which can be done with a Reshape layer.
You can reshape it to, for example, 5x2 with a single channel:
encoded = Reshape((5, 2, 1))(encoded)
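A common alternative (my sketch, not part of the original answer) is to mirror the Flatten with a Dense layer that projects back to the pre-Flatten volume, which is 7x7x8 for the encoder above, and then Reshape:
from keras.layers import Dense, Reshape

# Bottleneck: Flatten gives 7*7*8 = 392 values, compressed to a 10-dimensional code
encoded = Dense(units=10, activation='relu')(x)

# Decoder entry: project back up to 392 values, then restore the 4D shape
x = Dense(units=7 * 7 * 8, activation='relu')(encoded)
x = Reshape((7, 7, 8))(x)  # (7, 7, 8): ready for the Conv2D/UpSampling2D stack
Note that after restoring (7, 7, 8) you may also need padding='same' on the decoder's Conv2D(16, ...) layer so the output comes back to (28, 28, 1).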

Autoencoder algorithm and principle and why encoder part is blurry

Trying to understand autoencoders, I am looking for an algorithm for an autoencoder and the principle behind how autoencoders work, mathematical formulas, etc.
I am currently working on autoencoders and trying to take the encoder output, the compressed data, and I am not sure if I have the right result.
Am I on the right track?
code :
input_img = Input(shape=(64, 64, 3))
x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
encoder_mode = Model(input_img, encoded)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
...
autoencoder.fit
...
...
encoded_imgs = autoencoder.predict(X_test)
plt.imshow(encoded_imgs[i])
Is this the encoded input? It should be the characteristic, compressed representation of the data, right?
Autoencoders are used to encode the main features of the input data. You can think of one as a feature extractor. The result is blurred because data is lost when you encode; the principle is to represent the input with less data.
Your input data is 64x64x3 = 12288 values, and your encoding is 8x8x64 = 4096, which is 1/3 of the input data.
Encoded size calculation:
input_img = Input(shape=(64, 64, 3)) # 64x64x3
x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img) # 64x64x32
x = MaxPooling2D((2, 2), padding='same')(x) # 32x32x32
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x) # 32x32x64
x = MaxPooling2D((2, 2), padding='same')(x) # 16x16x64
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x) # 16x16x64
encoded = MaxPooling2D((2, 2), padding='same')(x) # 8x8x64
So you are reconstructing the original image from 33% of its data. It is quite impressive, and of course there will be a little blur.
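To sanity-check the arithmetic, you can build just the encoder and compute the ratio from its output shape (a minimal sketch reusing the layers above):
import numpy as np
from keras.models import Model

encoder = Model(input_img, encoded)              # same idea as the question's encoder_mode
n_input = int(np.prod(encoder.input_shape[1:]))  # 64 * 64 * 3 = 12288
n_code = int(np.prod(encoder.output_shape[1:]))  # 8 * 8 * 64  = 4096
print(n_code / n_input)                          # 0.333... -> 1/3 of the input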

ValueError: Error when checking target: expected conv2d_21 to have 4 dimensions, but got array with shape (26, 1)

I have images with shape (3600, 3600, 3). I'd like to use an autoencoder on them. My code is:
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras import backend as K
from keras.preprocessing.image import ImageDataGenerator
input_img = Input(shape=(3600, 3600, 3))
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
batch_size=2
datagen = ImageDataGenerator(rescale=1. / 255)
# dimensions of our images.
img_width, img_height = 3600, 3600
train_data_dir = 'train'
validation_data_dir = 'validation'
generator_train = datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
)
generator_valid = datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode=None,
    shuffle=False)
autoencoder.fit_generator(generator=generator_train,
                          validation_data=generator_valid)
When I run the code I get this error message:
ValueError: Error when checking target: expected conv2d_21 to have 4 dimensions, but got array with shape (26, 1)
I know the problem is somewhere in the shape of the layers, but I couldn't find it. Can someone please help me and explain the solution?
There are the following issues in your code:
Pass class_mode='input' to the flow_from_directory method so that the input images are used as the labels as well (since you are building an autoencoder).
Pass padding='same' to the third Conv2D layer in the decoder:
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
Use three filters in the last layer, since your images are RGB:
decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
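Putting the first fix into the pipeline, the generator setup might look like this (a sketch; directory and size variables as defined in the question):
datagen = ImageDataGenerator(rescale=1. / 255)

generator_train = datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='input')       # each batch of images doubles as its own target

generator_valid = datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='input',
    shuffle=False)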

How to get a tensor value in Keras?

I want to compare two images.
The approach I adopt is to encode them both and then compute the angle between the two encoded vectors as a similarity measure.
The code below encodes and then decodes images using a CNN with Keras.
However, I need to get the value of the tensor encoded.
How can I achieve that?
Thank you very much.
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras import backend as K
input_img = Input(shape=(28, 28, 1))
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
#----------------------------------------------------------------#
# How to get the values of the tensor "encoded"? #
#----------------------------------------------------------------#
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
.....
autoencoder.fit(x_train, x_train,
                epochs=50,
                batch_size=128,
                shuffle=True,
                validation_data=(x_test, x_test),
                callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])
In order to get an intermediate output, you need to create a separate model that contains the computation graph up to that point. In your case, you can:
encoder = Model(input_img, encoded)
After training the autoencoder is complete, you can call encoder.predict, which will return the intermediate encoded result. You can also save the models separately, as you would any other model, and not have to train every time. In short, a Model is a container for layers that construct a computation graph.
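For example (a sketch, assuming the training above has finished):
encoder = Model(input_img, encoded)
codes = encoder.predict(x_test)          # shape (N, 4, 4, 8) for the model above
codes = codes.reshape(len(codes), -1)    # one 128-dimensional vector per image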
If I understand your question correctly, you would like to get the 128-dimensional encoded representation from the convolutional autoencoder for image comparison?
What you can do is create a reference to the encoder part of the network, train the whole autoencoder, and then encode images with the weights of that encoder reference. Note that get_layer('encoded') only works if the bottleneck layer was given that name, e.g. encoded = MaxPooling2D((2, 2), padding='same', name='encoded')(x).
Put this:
encoder = Model(inputs=autoencoder.input, outputs=autoencoder.get_layer('encoded').output)
after autoencoder.compile()
and create encodings with:
encoded_img = encoder.predict(input_batch)
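With the encoder in hand, the angle-based comparison from the question can be computed as a cosine similarity over the flattened codes (a sketch; img_a and img_b are hypothetical preprocessed inputs of shape (1, 28, 28, 1)):
import numpy as np

def cosine_similarity(a, b):
    # cos(theta) between two flattened code tensors; 1.0 means identical direction
    a, b = a.ravel(), b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

code_a = encoder.predict(img_a)   # img_a, img_b: hypothetical (1, 28, 28, 1) batches
code_b = encoder.predict(img_b)
print(cosine_similarity(code_a, code_b))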
