Trying to understand autoencoders, I am looking for the algorithm of an autoencoder and the principle behind how autoencoders work, including the mathematical formulas.
I'm currently working on autoencoders and trying to take the compressed data from the encoder output, but I'm not sure I'm getting the right result.
Am I on the right track?
Code:
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

input_img = Input(shape=(64, 64, 3))
x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
encoder_model = Model(input_img, encoded)  # sub-model that outputs the compressed representation
x = Conv2D(64, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
...
autoencoder.fit
...
...
encoded_imgs = autoencoder.predict(X_test)
plt.imshow(encoded_imgs[i])
Is this the encoded representation? It should be the characteristic, compressed data, shouldn't it?
Autoencoders are used to encode the main features of the input data. You can think of it as a feature extractor. The result will be blurred because there is data loss when you encode. The principle is to represent the input with less data.
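In symbols (this is the standard autoencoder formulation, not anything specific to this code): an encoder $f_\theta$ maps the input to a code and a decoder $g_\phi$ maps the code back, and training minimizes the reconstruction error:

$$z = f_\theta(x), \qquad \hat{x} = g_\phi(z), \qquad \min_{\theta,\phi} \; \frac{1}{n} \sum_{i=1}^{n} \lVert x_i - g_\phi(f_\theta(x_i)) \rVert^2$$

With pixel values in [0, 1] and a sigmoid output layer, as in the code above, binary cross-entropy is often used in place of the squared error.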
Your input data is 64x64x3 = 12288 values, and your encoded representation is 8x8x64 = 4096 values, which is one third of the input.
Encoded size calculation:
input_img = Input(shape=(64, 64, 3)) # 64x64x3
x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img) # 64x64x32
x = MaxPooling2D((2, 2), padding='same')(x) # 32x32x32
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x) # 32x32x64
x = MaxPooling2D((2, 2), padding='same')(x) # 16x16x64
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x) # 16x16x64
encoded = MaxPooling2D((2, 2), padding='same')(x) # 8x8x64
So you are reconstructing the original image from 33% of its data. That is quite impressive, and of course there will be a little blur.
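If the goal is to look at the compressed representation itself, note that autoencoder.predict returns the reconstruction, not the encoding; predict with the encoder sub-model instead. A minimal sketch (reusing encoder_model, X_test, plt, and i from the code above; the 8x8x64 encoding is not an RGB image, so view it one channel at a time):

decoded_imgs = autoencoder.predict(X_test)         # reconstructions, shape (n, 64, 64, 3)
encoded_imgs = encoder_model.predict(X_test)       # encodings, shape (n, 8, 8, 64)
plt.imshow(encoded_imgs[i][:, :, 0], cmap='gray')  # visualize a single 8x8 feature channel
plt.show()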
Related
I have a convolutional neural network that does a better job than others on my dataset. The problem is with the ZeroPadding2D layer I need to include to account for the down/up sampling; it creates artifacts (zero samples) in the output. How can I avoid the ZeroPadding2D layer without changing the network structure (number of layers)? I need to maintain the structure as it is, but I may change:
1- the filters
2- the kernel size
3- the first dimension of my data (e.g. 96)
4- any other options
Below is my CNN:
input_img = Input(shape=(96, 44, 1), name='full')
x = GaussianNoise(.1)(input_img)
x = Conv2D(64, (5, 5), activation='relu', padding='same')(x)
x = AveragePooling2D((2, 2), padding='same')(x)
x = Dropout(0.1)(x)
x = Conv2D(128, (5, 5), activation='relu', padding='same')(x)
x = AveragePooling2D((2, 2), padding='same')(x)
x = Dropout(0.2)(x)
x = Conv2D(512, (5, 5), activation='relu', padding='same')(x)
encoded = AveragePooling2D((2, 2), padding='same')(x)
x = Dropout(0.2)(x)  # note: this result is never used; the decoder below consumes 'encoded'
# at this point the representation is (12, 6, 512)
x = Conv2D(512, (5, 5), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Dropout(0.2)(x)
x = Conv2D(128, (5, 5), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Dropout(0.12)(x)
x = Conv2D(64, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
x = Dropout(0.12)(x)
x = ZeroPadding2D(((4, 0), (0, 0)))(x)
decoded = Conv2D(1, (5, 5), activation='tanh', padding='same',
name='out')(x)
autoencoder = Model(input_img, decoded)
I guess that if you replace your UpSampling2D + ZeroPadding2D sections with Conv2DTranspose layers, that might solve your issue.
Have a look here:
https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2DTranspose
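As a minimal sketch of that idea (the encoded shape (12, 6, 512) follows from the three 2x2 poolings on a (96, 44, 1) input; the filter counts mirror your decoder, but everything else is my own choice, not the original network): each UpSampling2D + Conv2D pair becomes a single strided Conv2DTranspose, and the width mismatch is fixed by cropping instead of zero-padding:

from tensorflow.keras.layers import Input, Conv2D, Conv2DTranspose, Cropping2D
from tensorflow.keras.models import Model

encoded_in = Input(shape=(12, 6, 512))
x = Conv2DTranspose(512, (5, 5), strides=(2, 2), padding='same', activation='relu')(encoded_in)  # (24, 12, 512)
x = Conv2DTranspose(128, (5, 5), strides=(2, 2), padding='same', activation='relu')(x)           # (48, 24, 128)
x = Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', activation='relu')(x)            # (96, 48, 64)
x = Cropping2D(cropping=((0, 0), (2, 2)))(x)   # (96, 44, 64): crop, do not zero-pad
decoded = Conv2D(1, (5, 5), activation='tanh', padding='same', name='out')(x)
decoder = Model(encoded_in, decoded)

Because the transposed convolutions learn their upsampling filters, no fixed zero samples are inserted, and Cropping2D removes the two extra columns on each side that arise because 44 pools down to 6 under 'same' padding but 6 upsamples back to 48.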
I have been training this model for 1 epoch to see if it will work. What I want to do is take an image and create global features for it. Right now, after each training session, the features are all 0. Can someone please tell me the best way to create global features using Keras and a CNN?
Here is my model so far.
def create_base_network():
    """
    Base network to be shared.
    """
    model_input = Input(shape=(224, 224, 3))
    x = Conv2D(64, (3, 3), activation='relu', padding='same')(model_input)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Dropout(0.5)(x)
    x = BatchNormalization()(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Dropout(0.5)(x)
    x = BatchNormalization()(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Dropout(0.5)(x)
    x = BatchNormalization()(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
    #x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Dropout(0.5)(x)
    #x = Flatten()(x)
    x = Dense(1024, activation='relu')(x)
    x = BatchNormalization()(x)
    x = Dropout(0.7)(x)
    x = Dense(1024, activation='relu')(x)
    x = Dense(1024, activation='relu')(x)
    x = BatchNormalization()(x)
    x = Dropout(0.7)(x)
    x = Dense(4096, activation='relu')(x)
    x = Dense(4096, activation='relu')(x)
    x = BatchNormalization()(x)
    x = Dropout(0.7)(x)
    # This layer is what the features are
    x = Dense(4096, activation='relu')(x)
    model = Model(inputs=model_input, outputs=x)
    return model
Please and thank you.
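One observation on the code above: with Flatten commented out, each Dense layer is applied along the last axis of a 4-D tensor, so the network's output is still spatial (14 x 14 x 4096 after the four active poolings of a 224 x 224 input), not one global vector per image. A minimal sketch of collapsing that into a single global feature vector with GlobalAveragePooling2D (my suggestion, not part of the original post):

from keras.layers import GlobalAveragePooling2D
from keras.models import Model

base = create_base_network()
# average over the two spatial axes -> one 4096-d vector per image
global_features = GlobalAveragePooling2D()(base.output)
feature_model = Model(inputs=base.input, outputs=global_features)

features = feature_model.predict(images)  # 'images' is assumed data of shape (n, 224, 224, 3)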
I have an autoencoder with two outputs, and the first output should be used, after some changes, as the input to the next part of the autoencoder. I put my code here, and I'm a little confused about whether it is logically correct. In the following code there is one output in the decoder part, named act11, which is the output of a sigmoid activation; the second output is in the w-extraction part, named pred_w. I feed bncv11 into the GaussianNoise layer instead of act11. I want to know whether that is correct. Based on back-propagation rules and the structure of the network, is it possible to do this? Can I just use the output of the activation, act11 = Activation('sigmoid', name='imageprim')(bncv11), as an output of the model?
All of my questions are about this part of the code:
decoded = Conv2D(1, (5, 5), padding='same', name='decoder_output', dilation_rate=(2, 2))(BNd)
bncv11 = BatchNormalization()(decoded)
act11 = Activation('sigmoid', name='imageprim')(bncv11)
decoded_noise = GaussianNoise(0.5)(bncv11)
#----------------------w extraction------------------------------------
convw1 = Conv2D(64, (3, 3), activation='relu', padding='same', name='conl1w', dilation_rate=(2, 2))(decoded_noise)
watermark_extraction = Model(inputs=[image, wtm], outputs=[act11, pred_w])
I want to know whether the above code is correct in terms of deep learning.
I use act11 only for the sigmoid activation and as the model output, but feed bncv11 forward during learning.
import keras as Kr
from keras.layers import Input, Conv2D, BatchNormalization, Activation, GaussianNoise
from keras.models import Model

wtm = Input((28, 28, 1))
image = Input((28, 28, 1))
conv1 = Conv2D(64, (5, 5), activation='relu', padding='same', name='convl1e', dilation_rate=(2, 2))(image)
conv2 = Conv2D(64, (5, 5), activation='relu', padding='same', name='convl2e', dilation_rate=(2, 2))(conv1)
conv3 = Conv2D(64, (5, 5), activation='relu', padding='same', name='convl3e', dilation_rate=(2, 2))(conv2)
BN = BatchNormalization()(conv3)
encoded = Conv2D(1, (5, 5), activation='relu', padding='same', name='encoded_I', dilation_rate=(2, 2))(BN)
add_const = Kr.layers.Lambda(lambda x: x[0] + x[1])
encoded_merged = add_const([encoded, wtm])
#-----------------------decoder------------------------------------------------
deconv1 = Conv2D(64, (5, 5), activation='relu', padding='same', name='convl1d', dilation_rate=(2, 2))(encoded_merged)
deconv2 = Conv2D(64, (5, 5), activation='relu', padding='same', name='convl2d', dilation_rate=(2, 2))(deconv1)
deconv3 = Conv2D(64, (5, 5), activation='relu', padding='same', name='convl3d', dilation_rate=(2, 2))(deconv2)
deconv4 = Conv2D(64, (5, 5), activation='relu', padding='same', name='convl4d', dilation_rate=(2, 2))(deconv3)
BNd = BatchNormalization()(deconv3)  # note: deconv4 is defined but unused; BNd consumes deconv3
decoded = Conv2D(1, (5, 5), padding='same', name='decoder_output', dilation_rate=(2, 2))(BNd)
bncv11 = BatchNormalization()(decoded)
act11 = Activation('sigmoid', name='imageprim')(bncv11)
decoded_noise = GaussianNoise(0.5)(bncv11)
#----------------------w extraction------------------------------------
convw1 = Conv2D(64, (3, 3), activation='relu', padding='same', name='conl1w', dilation_rate=(2, 2))(decoded_noise)
convw2 = Conv2D(64, (3, 3), activation='relu', padding='same', name='convl2w', dilation_rate=(2, 2))(convw1)
convw3 = Conv2D(64, (3, 3), activation='relu', padding='same', name='conl3w', dilation_rate=(2, 2))(convw2)
convw4 = Conv2D(64, (3, 3), activation='relu', padding='same', name='conl4w', dilation_rate=(2, 2))(convw3)
convw5 = Conv2D(64, (3, 3), activation='relu', padding='same', name='conl5w', dilation_rate=(2, 2))(convw4)
convw6 = Conv2D(64, (3, 3), activation='relu', padding='same', name='conl6w', dilation_rate=(2, 2))(convw5)
pred_w = Conv2D(1, (1, 1), activation='sigmoid', padding='same', name='reconstructed_W', dilation_rate=(2, 2))(convw6)
watermark_extraction = Model(inputs=[image, wtm], outputs=[act11, pred_w])
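For what it's worth, a graph branched this way is structurally valid for back-propagation: both outputs trace back through bncv11, so the gradients of both losses flow into the shared encoder and decoder weights. A minimal compile sketch, keyed by the two output layer names from the code above (the loss choices and weights are placeholders of mine, not from the original code):

watermark_extraction.compile(
    optimizer='adam',
    loss={'imageprim': 'mse', 'reconstructed_W': 'binary_crossentropy'},
    loss_weights={'imageprim': 1.0, 'reconstructed_W': 1.0})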
I am having trouble getting this model to compile.
I am trying to implement a VGG16, but I will be using a custom loss function. The target variable has a shape of (?, 14, 14, 9, 6), where we only use binary cross-entropy on Y_train[:,:,:,:,0], with Y_train[:,:,:,:,1] as a switch to turn off the loss, effectively making this a mini-batch; the other channels will be used on a separate branch of the neural net. This is a binary classification problem on this branch, so I only want an output of shape (?, 14, 14, 9, 1).
I have listed my error below. Can you please explain, first, what is going wrong and, second, how to mitigate this issue?
Model code
img_input = Input(shape = (224,224,3))
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
# # Block 2
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)
# Block 3
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)
# # Block 4
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)
# # Block 5
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
x = Conv2D(512, (3, 3), padding='same', activation='relu', kernel_initializer='normal', name='rpn_conv1')(x)
x_class = Conv2D(9, (1, 1), activation='sigmoid', kernel_initializer='uniform', name='rpn_out_class')(x)
x_class = Reshape((14,14,9,1))(x_class)
model = Model(inputs=img_input, outputs=x_class)
model.compile(loss=rpn_loss_cls(), optimizer='adam')
Loss function code:
def rpn_loss_cls(lambda_rpn_class=1.0, epsilon=1e-4):
    def rpn_loss_cls_fixed_num(y_true, y_pred):
        return lambda_rpn_class * K.sum(
            y_true[:, :, :, :, 0]
            * K.binary_crossentropy(y_pred[:, :, :, :, :], y_true[:, :, :, :, 1])
        ) / K.sum(epsilon + y_true[:, :, :, :, 0])
    return rpn_loss_cls_fixed_num
Error:
ValueError: logits and labels must have the same shape ((?, ?, ?, ?) vs (?, 14, 14, 9, 1))
Note: I have read multiple questions on this site with the same error, but none of the solutions allowed my model to compile.
Potential solution:
I kept experimenting with this and found that by adding
y_true = K.expand_dims(y_true, axis=-1)
I was able to compile the model. I am still dubious that this will work correctly.
Keras sets the shape of y_true to match the shape of the model's output, which is why your loss function gets a shape-mismatch error. So you need to align the dimensions by using expand_dims. This, however, needs to be done considering your model architecture, data, and loss function. The code below will compile.
def rpn_loss_cls(lambda_rpn_class=1.0, epsilon=1e-4):
    def rpn_loss_cls_fixed_num(y_true, y_pred):
        y_true = tf.keras.backend.expand_dims(y_true, -1)
        return lambda_rpn_class * K.sum(
            y_true[:, :, :, :, 0]
            * K.binary_crossentropy(y_pred[:, :, :, :, :], y_true[:, :, :, :, 1])
        ) / K.sum(epsilon + y_true[:, :, :, :, 0])
    return rpn_loss_cls_fixed_num
I am trying to program a convolutional autoencoder for my 2D array data (28x28).
Here is the link I referred to: https://blog.keras.io/building-autoencoders-in-keras.html
The only difference between the reference and my code is the data: MNIST there, mine here.
The problem should be caused by the data split ("X_train..., X_test...") sections, since I have been having trouble when I use the scikit-learn train_test_split algorithm.
I know what the problem is; I just do not know how to fix it.
Thank you.
import numpy as np
import pandas as pd
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.callbacks import TensorBoard

data1 = pd.read_csv("2D2828.csv")
data2 = data1.values                 # .as_matrix() is deprecated in recent pandas
X = data2.astype(np.float32)
# Conv2D expects (samples, height, width, channels); this reshape assumes
# each CSV row holds one flattened 28x28 image.
X = X.reshape(-1, 28, 28, 1)

# Keep the data array and the symbolic Keras input in separate variables;
# the original code overwrote X with the Input tensor, which loses the data.
input_img = Input(shape=(28, 28, 1))
ae_cnn = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
ae_cnn = MaxPooling2D((2, 2), padding='same')(ae_cnn)
ae_cnn = Conv2D(8, (3, 3), activation='relu', padding='same')(ae_cnn)
ae_cnn = MaxPooling2D((2, 2), padding='same')(ae_cnn)
ae_cnn = Conv2D(8, (3, 3), activation='relu', padding='same')(ae_cnn)
encoded = MaxPooling2D((2, 2), padding='same')(ae_cnn)
ae_cnn = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
ae_cnn = UpSampling2D((2, 2))(ae_cnn)
ae_cnn = Conv2D(8, (3, 3), activation='relu', padding='same')(ae_cnn)
ae_cnn = UpSampling2D((2, 2))(ae_cnn)
ae_cnn = Conv2D(16, (3, 3), activation='relu')(ae_cnn)
ae_cnn = UpSampling2D((2, 2))(ae_cnn)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(ae_cnn)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

X_train = X[1:24]                    # the original line was missing its closing bracket
X_test = X[25:28]
autoencoder.fit(X_train, X_train,
                epochs=2,
                batch_size=2)