I am working on Conditional GANs and my generator and discriminator both have two inputs and use merged models like this:-
z = Input(shape=(100,))
temp = Input(shape=(384,))
generator=Generator()
img = generator([z,temp])
valid = discriminator([img,temp])
combined = Model([z,temp], valid)
combined.compile(loss='binary_crossentropy', optimizer=optimizer)
DCGAN is being used to classify and generate images conditional to the "temp" embedding and I am using Adam "optimizer = Adam(0.0001, 0.5)" for both models.
GEN is like takes input noise "z" and "temp" merges them and makes 128x128x3 images. Disc takes image and performs conv2d on it and then reshapes "temp" to 1,128,3 and merges both and further applies conv2d and outputs a sigmoid unit. My question is that during backprop how are weights updated of a merged model, lets say of Disc here:-
inp1 = Input(shape=(128,128,3),name='inp1')
inp2 = Input(shape=(384,),name='inp2')
d2=Reshape(target_shape=(1,128,3))(inp2)
d1 = Conv2D(16, kernel_size=5, strides=2, padding="same")(inp1)
d1=LeakyReLU(alpha=0.2)(d1)
d1=Dropout(0.25)(d1)
d1 = Conv2D(32, kernel_size=5, strides=2, padding="same")(inp1)
d1=BatchNormalization(momentum=0.8)(d1)
d1=LeakyReLU(alpha=0.2)(d1)
d1=Dropout(0.25)(d1)
d1 = Conv2D(64, kernel_size=5, strides=2, padding="same")(inp1)
d1=BatchNormalization(momentum=0.8)(d1)
d1=LeakyReLU(alpha=0.2)(d1)
d1=Dropout(0.25)(d1)
d1 = Conv2D(128, kernel_size=5, strides=2, padding="same")(inp1)
d1=BatchNormalization(momentum=0.8)(d1)
d1=LeakyReLU(alpha=0.2)(d1)
d1=Dropout(0.25)(d1)
d1 = Conv2D(256, kernel_size=5, strides=2, padding="same")(inp1)
d1=BatchNormalization(momentum=0.8)(d1)
d1=LeakyReLU(alpha=0.2)(d1)
d1=Dropout(0.25)(d1)
d1=Flatten()(d1)
d1=Dense(768, activation="relu")(d1)
d1=Reshape(target_shape=(2,128,3))(d1)
output=concatenate(
[
d1,
d2,
]
,axis=1
)
d1 = Conv2D(64, kernel_size=5, strides=2, padding="same")(output)
d1=BatchNormalization(momentum=0.8)(d1)
d1=LeakyReLU(alpha=0.2)(d1)
d1=Dropout(0.25)(d1)
d1 = Conv2D(128, kernel_size=5, strides=2, padding="same")(inp1)
d1=BatchNormalization(momentum=0.8)(d1)
d1=LeakyReLU(alpha=0.2)(d1)
d1=Dropout(0.25)(d1)
d1 = Conv2D(256, kernel_size=5, strides=2, padding="same")(inp1)
d1=BatchNormalization(momentum=0.8)(d1)
d1=LeakyReLU(alpha=0.2)(d1)
d1=Dropout(0.25)(d1)
d1=Flatten()(d1)
output=Dense(1,activation='sigmoid')(d1)
model=Model(
inputs=[
inp1,
inp2
],
outputs=[
output
]
)
model.summary()
img = Input(shape=(128,128,3))
text=Input(shape=(384,))
validity = model([img,text])
return Model([img,text], validity)
And my disc loss starts from 2.02 and goes to around 6.7 in 150 epochs and loss of Gen decreases from 0.80 to 0.00024 in 150 epochs and I am getting garbage, how can I improve my architecture? And I was wondering that maybe backprop doesn't work well in merged models because it becomes a lot complicated.
I am using batchnorm, leaky relu, conv2d + stride, no pooling layers and label smoothing though.
Related
i have written this NN
decoder_output = Conv2D(64, (3,3), activation='relu', padding='same')(encoder_input)
decoder_output = UpSampling2D((2, 2))(decoder_output)
decoder_output = Conv2D(32, (3,3), activation='relu', padding='same')(decoder_output)
decoder_output = UpSampling2D((2, 2))(decoder_output)
decoder_output = Conv2D(16, (3,3), activation='relu', padding='same')(decoder_output)
decoder_output = UpSampling2D((2, 2))(decoder_output)
decoder_output = Conv2D(2, (3, 3), activation='sigmoid', padding='same')(decoder_output)
decoder_output = UpSampling2D((2, 2))(decoder_output)
decoder_output = Flatten()(decoder_output)
decoder_output = Dense(height*width, activation='relu')(decoder_output)
model = Model(inputs=encoder_input, outputs=decoder_output)
model.compile(optimizer='adam', loss='mse')
clean_images = model.fit(train_images, y_train_red, epochs=10,validation_data=(validation_images,y_validation_red))
which suppose to return an image values.
is there a way to restrict the return values to be int and/or maximize the ouput layer value to 255?
What should happen is that your model will learn to not output values above 255 and below 0. However, in the instances that it does, you could clip the values to be between 0 and 255 when you are predicting. Regarding integer outputs, there isn't a way that I know of. However, you could round the outputs when you are predicting.
I'm trying to separate close objects as was shown in the U-Net paper (here). For this, one generates weight maps which can be used for pixel-wise losses. The following code describes the network I use from this blog post.
x_train_val = # list of images (imgs, 256, 256, 3)
y_train_val = # list of masks (imgs, 256, 256, 1)
y_weights = # list of weight maps (imgs, 256, 256, 1) according to the blog post
# visual inspection confirms the correct calculation of these maps
# Blog posts' loss function
def my_loss(target, output):
return - tf.reduce_sum(target * output,
len(output.get_shape()) - 1)
# Standard Unet model from blog post
_epsilon = tf.convert_to_tensor(K.epsilon(), np.float32)
def make_weighted_loss_unet(input_shape, n_classes):
ip = L.Input(shape=input_shape)
weight_ip = L.Input(shape=input_shape[:2] + (n_classes,))
conv1 = L.Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(ip)
conv1 = L.Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv1)
conv1 = L.Dropout(0.1)(conv1)
mpool1 = L.MaxPool2D()(conv1)
conv2 = L.Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(mpool1)
conv2 = L.Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv2)
conv2 = L.Dropout(0.2)(conv2)
mpool2 = L.MaxPool2D()(conv2)
conv3 = L.Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(mpool2)
conv3 = L.Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv3)
conv3 = L.Dropout(0.3)(conv3)
mpool3 = L.MaxPool2D()(conv3)
conv4 = L.Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(mpool3)
conv4 = L.Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv4)
conv4 = L.Dropout(0.4)(conv4)
mpool4 = L.MaxPool2D()(conv4)
conv5 = L.Conv2D(1024, 3, activation='relu', padding='same', kernel_initializer='he_normal')(mpool4)
conv5 = L.Conv2D(1024, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv5)
conv5 = L.Dropout(0.5)(conv5)
up6 = L.Conv2DTranspose(512, 2, strides=2, kernel_initializer='he_normal', padding='same')(conv5)
conv6 = L.Concatenate()([up6, conv4])
conv6 = L.Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv6)
conv6 = L.Conv2D(512, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv6)
conv6 = L.Dropout(0.4)(conv6)
up7 = L.Conv2DTranspose(256, 2, strides=2, kernel_initializer='he_normal', padding='same')(conv6)
conv7 = L.Concatenate()([up7, conv3])
conv7 = L.Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv7)
conv7 = L.Conv2D(256, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv7)
conv7 = L.Dropout(0.3)(conv7)
up8 = L.Conv2DTranspose(128, 2, strides=2, kernel_initializer='he_normal', padding='same')(conv7)
conv8 = L.Concatenate()([up8, conv2])
conv8 = L.Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv8)
conv8 = L.Conv2D(128, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv8)
conv8 = L.Dropout(0.2)(conv8)
up9 = L.Conv2DTranspose(64, 2, strides=2, kernel_initializer='he_normal', padding='same')(conv8)
conv9 = L.Concatenate()([up9, conv1])
conv9 = L.Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv9)
conv9 = L.Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv9)
conv9 = L.Dropout(0.1)(conv9)
c10 = L.Conv2D(n_classes, 1, activation='softmax', kernel_initializer='he_normal')(conv9)
# Mimic crossentropy loss
c11 = L.Lambda(lambda x: x / tf.reduce_sum(x, len(x.get_shape()) - 1, True))(c10)
c11 = L.Lambda(lambda x: tf.clip_by_value(x, _epsilon, 1. - _epsilon))(c11)
c11 = L.Lambda(lambda x: K.log(x))(c11)
weighted_sm = L.multiply([c11, weight_ip])
model = Model(inputs=[ip, weight_ip], outputs=[weighted_sm])
return model
I then compile and fit the model as is shown below:
model = make_weighted_loss_unet((256, 256, 3), 1) # shape of input, number of classes
model.compile(optimizer='adam', loss=my_loss, metrics=['acc'])
model.fit([x_train_val, y_weights], y_train_val, validation_split=0.1, epochs=1)
The model can then train as usual. However, the loss doesn't seem to improve much. Furthermore, when I try to predict on new images, I obviously don't have the weight maps (because they are calculated on the labeled masks). I tried to use empty / zero arrays shaped like the weight map but that only yields in blank / zero predictions. I also tried different metrics and more standards losses without any success.
Did anyone face the same issue or have an alternative in implementing this weighted loss? Thanks in advance. BBQuercus
A simpler way to write custom loss with pixel weights
In your code, the loss is scattered around, between my_loss and make_weighted_loss_unet functions. You can add targets as an input and use model.add_loss to structure the code better :
def make_weighted_loss_unet(input_shape, n_classes):
ip = L.Input(shape=input_shape)
weight_ip = L.Input(shape=input_shape[:2] + (n_classes,))
targets = L.input(shape=input_shape[:2] + (n_classes,))
# .... rest of your model definition code ...
c10 = L.Conv2D(n_classes, 1, activation='softmax', kernel_initializer='he_normal')(conv9)
model.add_loss(pixel_weighted_cross_entropy(weights_ip, targets, c10))
# .... return Model .... NO NEED to specify loss in model.compile
def pixel_weighted_cross_entropy(weights, targets, predictions)
loss_val = keras.losses.categorical_crossentropy(targets, predictions)
weighted_loss_val = weights * loss_val
return K.mean(weighted_loss_val)
If you don't refactor your code to the above approach, next section shows how to still run inference without issues
How to run your model in inference
Option 1 : Use another Model object for inference
You can create a Model used for training and another used for inference. Both are largely the same except that the inference Model does not take weights_ip, and gives an early output c10.
Here's an example code that adds an argument is_training=True to decide which Model to return :
def make_weighted_loss_unet(input_shape, n_classes, is_training=True):
ip = L.Input(shape=input_shape)
conv1 = L.Conv2D(64, 3, activation='relu', padding='same', kernel_initializer='he_normal')(ip)
# .... rest of your model definition code ...
c10 = L.Conv2D(n_classes, 1, activation='softmax', kernel_initializer='he_normal')(conv9)
if is_training:
# Mimic crossentropy loss
c11 = L.Lambda(lambda x: x / tf.reduce_sum(x, len(x.get_shape()) - 1, True))(c10)
c11 = L.Lambda(lambda x: tf.clip_by_value(x, _epsilon, 1. - _epsilon))(c11)
c11 = L.Lambda(lambda x: K.log(x))(c11)
weight_ip = L.Input(shape=input_shape[:2] + (n_classes,))
weighted_sm = L.multiply([c11, weight_ip])
return Model(inputs=[ip, weight_ip], outputs=[weighted_sm])
else:
return Model(inputs=[ip], outputs=[c10])
return model
Option 2 : Use K.function
If you don't want to mess with your Model definition method (make_weighted_loss_unet) and want to achieve the same result outside, you can use a function that extracts the subgraph relevant for inference.
In your inference function:
from keras import backend as K
model = make_weighted_loss_unet(input_shape, n_classes)
inference_function = K.function([model.get_layer("input_layer").input],
[model.get_layer("output_softmax_layer").output])
predicted_heatmap = inference_function(new_image)
Note that you'll have to give name= to your ip layer and c10 layer to be able to retrieve them via model.get_layer(name) :
ip = L.Input(shape=input_shape, name="input_layer")
and
c10 = L.Conv2D(n_classes, 1, activation='softmax', kernel_initializer='he_normal', name="output_softmax_layer")(conv9)
I am trying to detect the single pixel location of a single object in an image. I have a keras CNN regression network with my image tensor as the input, and a 3 item vector as the output.
First item: Is a 1 (if an object was found) or 0 (no object was found)
Second item: Is a number between 0 and 1 which indicates how far along the x axis is the object
Third item: Is a number between 0 and 1 which indicates how far along the y axis is the object
I have trained the network on 2000 test images and 500 validation images, and the val_loss is far less than 1, and the val_acc is best at around 0.94. Excellent.
But then when I predict the output, I find the values for all three output items are not between 0 and 1, they are actually between -2 and 3 approximately. All three items should be between 0 and 1.
I have not used any non-linear activation functions on the output layer, and have used relus for all non-output layers. Should I be using a softmax, even though it is non-linear? The second and third items are predicting the x and y axis of the image, which appear to me as linear quantities.
Here is my keras network:
inputs = Input((256, 256, 1))
base_kernels = 64
# 256
conv1 = Conv2D(base_kernels, 3, activation='relu', padding='same', kernel_initializer='he_normal')(inputs)
conv1 = BatchNormalization()(conv1)
conv1 = Conv2D(base_kernels, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv1)
conv1 = BatchNormalization()(conv1)
conv1 = Dropout(0.2)(conv1)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
# 128
conv2 = Conv2D(base_kernels * 2, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool1)
conv2 = BatchNormalization()(conv2)
conv2 = Conv2D(base_kernels * 2, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv2)
conv2 = BatchNormalization()(conv2)
conv2 = Dropout(0.2)(conv2)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
# 64
conv3 = Conv2D(base_kernels * 4, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool2)
conv3 = BatchNormalization()(conv3)
conv3 = Conv2D(base_kernels * 4, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv3)
conv3 = BatchNormalization()(conv3)
conv3 = Dropout(0.2)(conv3)
pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
flat = Flatten()(pool3)
dense = Dense(256, activation='relu')(flat)
output = Dense(3)(dense)
model = Model(inputs=[inputs], outputs=[output])
optimizer = Adam(lr=1e-4)
model.compile(optimizer=optimizer, loss='mean_absolute_error', metrics=['accuracy'])
Can anyone please help? Thanks! :)
Chris
The sigmoid activation produces outputs between zero and one, so if you use it as activation of your last layer(the output), the network's output will be between zero and one.
output = Dense(3, activation="sigmoid")(dense)
Here is my discriminator architecture:
def build_discriminator(img_shape,embedding_shape):
model1 = Sequential()
model1.add(Conv2D(32, kernel_size=5, strides=2, input_shape=img_shape, padding="same"))
model1.add(LeakyReLU(alpha=0.2))
model1.add(Dropout(0.25))
model1.add(Conv2D(48, kernel_size=5, strides=2, padding="same"))
#model.add(ZeroPadding2D(padding=((0,1),(0,1))))
model1.add(BatchNormalization(momentum=0.8))
model1.add(LeakyReLU(alpha=0.2))
model1.add(Dropout(0.25))
model1.add(Conv2D(64, kernel_size=5, strides=2, padding="same"))
model1.add(BatchNormalization(momentum=0.8))
model1.add(LeakyReLU(alpha=0.2))
model1.add(Dropout(0.25))
model1.add(Conv2D(128, kernel_size=5, strides=2, padding="same"))
model1.add(BatchNormalization(momentum=0.8))
model1.add(LeakyReLU(alpha=0.2))
model1.add(Dropout(0.25))
model1.add(Conv2D(256, kernel_size=5, strides=2, padding="same"))
model1.add(BatchNormalization(momentum=0.8))
model1.add(LeakyReLU(alpha=0.2))
model1.add(Dropout(0.25))
model1.add(Flatten())
model1.add(Dense(200))
model2=Sequential()
model2.add(Dense(50, input_shape=embedding_shape))
model2.add(Dense(100))
model2.add(Dense(200))
model2.add(Flatten())
merged_model = Sequential()
merged_model.add(Merge([model1, model2], mode='concat'))
merged_model.add(Dense(1, activation='sigmoid', name='output_layer'))
#merged_model.compile(loss='binary_crossentropy', optimizer='adam',
#metrics=['accuracy'])
#model1.add(Dense(1, activation='sigmoid'))
merged_model.summary()
merged_model.input_shape
img = Input(shape=img_shape)
emb = Input(shape=embedding_shape)
validity = merged_model([img,emb])
return Model([img,emb],validity)
and here is the generator architecture:
def build_generator(latent_dim=484):
model = Sequential()
model.add(Dense(624 * 2 * 2, activation="relu", input_dim=latent_dim))
model.add(Reshape((2, 2, 624)))
model.add(UpSampling2D())
model.add(Conv2D(512, kernel_size=5, padding="same"))
model.add(BatchNormalization(momentum=0.8))
model.add(Activation("relu"))
model.add(UpSampling2D())
#4x4x512
model.add(Conv2D(256, kernel_size=5, padding="same"))
model.add(BatchNormalization(momentum=0.8))
model.add(Activation("relu"))
model.add(UpSampling2D())
#8x8x256
model.add(Conv2D(128, kernel_size=5, padding="same"))
model.add(BatchNormalization(momentum=0.8))
model.add(Activation("relu"))
model.add(UpSampling2D())
#16x16x128
model.add(Conv2D(64, kernel_size=5, padding="same"))
model.add(BatchNormalization(momentum=0.8))
model.add(Activation("relu"))
model.add(UpSampling2D())
#32x32x64
model.add(Conv2D(32, kernel_size=5, padding="same"))
model.add(BatchNormalization(momentum=0.8))
model.add(Activation("relu"))
model.add(UpSampling2D())
#64x64x32
model.add(Conv2D(3, kernel_size=5, padding="same"))
model.add(Activation("tanh"))
#128x128x3
noise = Input(shape=(latent_dim,))
img = model(noise)
return Model(noise, img)
and here is how I am making the GAN network:
optimizer = Adam(0.0004, 0.5)
discriminator=build_discriminator((128,128,3),(1,128,3))
discriminator.compile(loss='binary_crossentropy',
optimizer=optimizer,
metrics=['accuracy'])
# Build the generator
generator = build_generator()
# The generator takes noise as input and generates imgs
z = Input(shape=(100+384,))
img = generator(z)
# For the combined model we will only train the generator
discriminator.trainable = False
temp=Input(shape=(1,128,3))
# The discriminator takes generated images as input and determines validity
valid = discriminator([img,temp])
# The combined model (stacked generator and discriminator)
# Trains the generator to fool the discriminator
combined = Model(z, valid)
combined.compile(loss='binary_crossentropy', optimizer=optimizer)
The discriminator have 2 models, and will get as input an image of shape 128x128x3 and an embedding of shape 1x128x3 and both models are merged then. The generator model just gets noise and generates a 128x128x3 image. So at the line combined = Model(z, valid) I am getting the followiing error:
RuntimeError: Graph disconnected: cannot obtain value for tensor Tensor("input_5:0", shape=(?, 1, 128, 3), dtype=float32) at layer "input_5". The following previous layers were accessed without issue: ['input_4', 'model_2']
which I think is because of the fact that discriminator can't find embedding input but I am feeding it a tensor of shape (1,128,3), just like noise is being fed to the generator model. Can anyone please help me where I am doing wrong?
And after everything is set here is how I will generate images from noise and embedding vector merged together and discriminator will take image and vector to identify fakes:
#texts has embedding vectors
pics=np.array(pics) . #images
noise = np.random.normal(0, 1, (batch_size, 100))
j=0
latent_code=[]
for j in range(len(texts)): #appending embedding at the end of noise
n=np.append(noise[j],texts[j])
n=n.tolist()
latent_code.append(n)
latent_code=np.array(latent_code)
gen_imgs = generator.predict(latent_code) #gen making fakes
j=0
vects=[]
for im in gen_imgs:
t=np.array(texts[j])
t=np.reshape(t,[128,3])
t=np.expand_dims(t, axis=0)
vects.append(t)
j+=1
vects=np.array(vects) #vector of ?,1,128,3
#disc marking fakes and reals
d_loss_real = discriminator.train_on_batch([pics,vects], valid)
d_loss_fake = discriminator.train_on_batch([gen_pics,vects], fake)
d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)
g_loss = combined.train_on_batch(latent_code, valid)
You have forgotten to add the temp as one of the inputs of the GAN (that's why the error says it can't feed the corresponding tensor since it is essentially disconnected):
combined = Model([z, temp], valid)
As a side note, I highly recommend to use Keras Functional API for building complicated and multi branch models like your discriminator. It is much easier to use, being more flexible and less error-prone.
For example, this is the descriminator you have written but I have rewritten it using Functional API. I personally think it is much easier to follow:
def build_discriminator(img_shape,embedding_shape):
input_img = Input(shape=img_shape)
x = Conv2D(32, kernel_size=5, strides=2, padding="same")(input_img)
x = LeakyReLU(alpha=0.2)(x)
x = Dropout(0.25)(x)
x = Conv2D(48, kernel_size=5, strides=2, padding="same")(x)
x = BatchNormalization(momentum=0.8)(x)
x = LeakyReLU(alpha=0.2)(x)
x = Dropout(0.25)(x)
x = Conv2D(64, kernel_size=5, strides=2, padding="same")(x)
x = BatchNormalization(momentum=0.8)(x)
x = LeakyReLU(alpha=0.2)(x)
x = Dropout(0.25)(x)
x = Conv2D(128, kernel_size=5, strides=2, padding="same")(x)
x = BatchNormalization(momentum=0.8)(x)
x = LeakyReLU(alpha=0.2)(x)
x = Dropout(0.25)(x)
x = Conv2D(256, kernel_size=5, strides=2, padding="same")(x)
x = BatchNormalization(momentum=0.8)(x)
x = LeakyReLU(alpha=0.2)(x)
x = Dropout(0.25)(x)
x = Flatten()(x)
output_img = Dense(200)(x)
input_emb = Input(shape=embedding_shape)
y = Dense(50)(input_emb)
y = Dense(100)(y)
y = Dense(200)(y)
output_emb = Flatten()(y)
merged = concatenate([output_img, output_emb])
output_merge = Dense(1, activation='sigmoid', name='output_layer')(merged)
return Model([input_img, input_emb], output_merge)
I have Keras model that accepts inputs which have 4D shapes as (n, height, width, channel).
However, my data generator is producing 2D arrays as(n, width*height). So, the predict function of Keras is expecting inputs as 4D. I have no chance to change the data generator because the model will be tested by someone else. So, is there a way to override the predict function of Keras.
My model structure
a = Input(shape=(width*height,))
d1 = 16 # depth of filter kernel each layer
d2 = 16
d3 = 64
d4 = 128
d5 = 256
drop_out = 0.25
patch_size = (3, 3)
k_size = (2, 2)
reshape = Reshape((height, width, 1))(a)
conv1 = Conv2D(filters=d1, kernel_size=patch_size, padding='same', activation='relu')(reshape)
conv1 = MaxPooling2D(pool_size=k_size, padding='same')(conv1)
conv2 = Convolution2D(filters=d2, kernel_size=patch_size, padding='same', activation='relu')(conv1)
conv2 = MaxPooling2D(pool_size=k_size, padding='same')(conv2)
conv3 = Convolution2D(filters=d3, kernel_size=patch_size, padding='same', activation='relu')(conv2)
conv3 = MaxPooling2D(pool_size=k_size, padding='same')(conv3)
conv4 = Convolution2D(filters=d4, kernel_size=patch_size, padding='same', activation='relu')(conv3)
conv4 = MaxPooling2D(pool_size=k_size, padding='same')(conv4)
conv5 = Convolution2D(filters=d5, kernel_size=patch_size, padding='same', activation='relu')(conv4)
conv5 = MaxPooling2D(pool_size=k_size, padding='same')(conv5)
x = Flatten()(conv5)
x = Dropout(drop_out)(x)
node = 32
x_1 = Dense(node, activation='relu')(x) # connect the flatten layer to five classifier,each one comes to a digit.
x_2 = Dense(node, activation='relu')(x)
x_3 = Dense(node, activation='relu')(x)
x_4 = Dense(node, activation='relu')(x)
x_5 = Dense(node, activation='relu')(x)
d1 = Dense(n_class, activation='softmax')(x_1)
d2 = Dense(n_class, activation='softmax')(x_2)
d3 = Dense(n_class, activation='softmax')(x_3)
d4 = Dense(n_class, activation='softmax')(x_4)
d5 = Dense(n_class, activation='softmax')(x_5)
outputs = [d1, d2, d3, d4, d5]
model = Model(a, outputs)
model.compile(loss='categorical_crossentropy', optimizer='adadelta', metrics=['accuracy'])
model.fit(raw_train_data, raw_train_target, batch_size=200, epochs=5, validation_split=0.2)
You don't override the predict, you simply add a Reshape layer at the beginning of your model.
With the functional API:
from keras.layers import *
inp = Input((width*heigth,))
first = Reshape((width,height,1))(inp)
..... other layers.....
model = Model(inp, outputFromTheLastLayer)
With a sequential model:
model = Sequential()
model.add(Reshape((width,height,1), input_shape = (width*height,)))
model.add(otherlayers)
About the output shape.
Since you have 5 outputs, you need your target array to be a list of five arrays:
raw_train_target = [target1,target2,target3,target4,target5]
If you cannot do that, and raw_train_target is one single arary with the targets all following a sequence, you can try to use a concatenate layer at the end:
output = Concatenate()(outputs)