The images I would like to use to train the network are about 4000px × 3000px in size, and there are about 40k of them, sorted into 250 classes.
I have made the CNN shown below:
model = keras.Sequential([
    layers.Input((imgHeight, imgWidth, 1)),
    layers.Conv2D(16, 3, padding='same'),  # filters, kernel_size
    layers.Conv2D(32, 3, padding='same'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(250),
])
How do I figure out what value I need in layers.Conv2D(16, ...)?
How do I figure out what value I need in layers.Dense(250)?
I can't start the training process because I'm running out of memory.
The output shape of the Flatten() layer is 96 million: after one 2×2 max-pool, your 4000×3000 input becomes 2000×1500 with 32 channels, and 2000 · 1500 · 32 = 96,000,000. Connecting each of those values to 250 outputs gives the final dense layer of your model about 24 billion parameters; this is why you are running out of memory. There are some steps you can take to fix this issue:
Try resizing your images to a smaller shape; if 4000x3000x1 isn't necessary, 160x160x1 would be a good choice (a loading-and-resizing sketch follows the example below).
Try using more Conv2D layers, each followed by a MaxPooling2D layer, to decrease the size of the input, and then at the end use a Flatten layer followed by a Dense layer.
For example:
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input((160, 160, 1)),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, 3, padding='same', activation='relu'),
    layers.Conv2D(128, 3, padding='same', activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(256, 3, padding='same', activation='relu'),
    layers.Conv2D(256, 3, padding='same', activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.Dense(250),  # logits for the 250 classes; pair with from_logits=True in the loss
])
This type of architecture will work well for a classification task and will not run out of memory.
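As for the resizing step, here is a minimal loading sketch, assuming the images sit in one subdirectory per class; the "data/train" path and the batch size are placeholders, not from the question:

import tensorflow as tf

# Load the 4000x3000 source images and resize them to 160x160 grayscale on the fly.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train",            # hypothetical path, one subdirectory per class
    image_size=(160, 160),   # every image is resized during loading
    color_mode="grayscale",  # keeps the single channel the model expects
    batch_size=32,
)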
Hello, I am trying to build a multi-label image classifier, but I am having issues with the input shape.
My features.shape is (40000, 28, 28, 1). Each photo shows two letters, ranging from a to g, that are to be classified. I manually added the trailing dimension (1) because, from my understanding, Conv2D needs a 3-dimensional shape per image.
labels.shape is (40000, 2), and it is an array of the two letters associated with each photo.
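For reference, that trailing channel axis is typically added like this (a sketch, assuming features starts out with shape (40000, 28, 28)):

import numpy as np

# Conv2D expects (height, width, channels) per sample,
# so grayscale images get a trailing channel axis.
features = features[..., np.newaxis]  # (40000, 28, 28) -> (40000, 28, 28, 1)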
Here is my model:
model = keras.Sequential([
    Conv2D(32, 3, padding='same', activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Flatten(),
    Dense(256, activation='relu'),
    Dense(7, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
When I train the model I receive the error
ValueError: `logits` and `labels` must have the same shape, received ((None, 7) vs (None, 2)).
I am assuming I need to reshape the labels or features somehow, but I am not sure how. I have been trying multiple different inputs and changes, to no avail. I appreciate any help with this problem.
You are doing it correctly; the problem is with the last Dense layer. Since you have a two-label output, change the last layer to Dense(2, activation='sigmoid') instead of Dense(7, activation='sigmoid').
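A minimal sketch of the corrected model, with everything except the last layer unchanged from the question:

from tensorflow import keras
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = keras.Sequential([
    Conv2D(32, 3, padding='same', activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Flatten(),
    Dense(256, activation='relu'),
    Dense(2, activation='sigmoid')  # 2 outputs, matching labels.shape == (40000, 2)
])
print(model.output_shape)  # (None, 2), now consistent with the (None, 2) labels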
I'm making a simple image classifier in Keras, and I used MaxPooling2D to reduce image sizes. Recently I learned about strides and I want to implement them, but I run into errors. Here's a piece of code which gives errors:
early_stopping = EarlyStopping(monitor='val_loss', min_delta=0.001, patience=20,
                               restore_best_weights=True)

model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv2D(512, (2, 2), input_shape=(X[0].shape), strides=2,
                                 data_format='channels_first', activation='relu'))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(3, 3)))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Conv2D(512, (3, 3), data_format='channels_first', activation='relu'))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(3, 3)))
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Conv2D(128, (3, 3), data_format='channels_first', activation='relu'))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(4, 4)))
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

opt = keras.optimizers.Adam(learning_rate=0.0005)
model.compile(loss='binary_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])

h = model.fit(trainx, trainy, validation_data=(valx, valy), batch_size=64,
              epochs=80, callbacks=[early_stopping], verbose=0)
Here's the error:
ValueError: Negative dimension size caused by subtracting 4 from 2 for '{{node max_pooling2d_35/MaxPool}} = MaxPool[T=DT_FLOAT, data_format="NHWC", explicit_paddings=[], ksize=[1, 4, 4, 1], padding="VALID", strides=[1, 4, 4, 1]](Placeholder)' with input shapes: [?,128,2,46].
When I remove strides = 2, everything works just fine. Why is the strides option causing an input shape error, and how can I prevent it? I couldn't find any info about this.
The stride is how far the kernel is shifted each time it is applied. A stride of 2 essentially cuts the dimensions of the input in half along each axis. Because of your convolutions and strides, at some point you have a feature map of size 128 by 2, and of course you can't place a 4 x 4 pooling window on it when one axis is only 2.
You can use padding here to pad the data (with zeros, I believe) to bring the dimension up from 2 to 4 and avoid the error.
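For instance, a sketch of the first two layers with 'same' padding; the input shape here is a placeholder, and the other settings are as in the question:

import tensorflow as tf

model = tf.keras.Sequential()
# 'same' padding zero-pads the input so the window always fits,
# even after the strided convolution has shrunk the feature map.
model.add(tf.keras.layers.Conv2D(512, (2, 2), strides=2, padding='same',
                                 activation='relu', input_shape=(64, 64, 1)))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(3, 3), padding='same'))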
I'm just a beginner in this field.
I want to apply a Laplacian filter to an image imported through Keras.
What should I do? If you have a good document, please share it.
model = Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', padding='same', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    layers.Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])
model.summary()
input2 = Input(shape=(28, 28, 1))
# Reapply the trained layers one by one to a new input tensor
Cv2D1 = model.layers[0](input2)
MP1 = model.layers[1](Cv2D1)
Cv2D2 = model.layers[2](MP1)
MP2 = model.layers[3](Cv2D2)
Cv2D3 = model.layers[4](MP2)
flt = model.layers[5](Cv2D3)
D1 = model.layers[6](flt)
D2 = model.layers[7](D1)
model1 = Model(input2, D2)
This is part of the code I have.
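One common way to apply a 3x3 Laplacian kernel in Keras is as a fixed, non-trainable convolution in front of the network; a sketch under that assumption (the kernel and the wiring below are an illustration, not from the code above):

import numpy as np
import tensorflow as tf

# Standard 3x3 Laplacian kernel, shaped (kh, kw, in_channels, out_channels).
laplacian = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype='float32').reshape(3, 3, 1, 1)

inp = tf.keras.Input(shape=(28, 28, 1))
lap = tf.keras.layers.Conv2D(1, (3, 3), padding='same',
                             use_bias=False, trainable=False)
filtered = lap(inp)           # calling the layer builds its weights
lap.set_weights([laplacian])  # freeze the Laplacian kernel into the layer

The filtered tensor keeps the (28, 28, 1) shape, so it can be fed through the reused layers the same way input2 is in the code above.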
I am currently developing a precipitation cell displacement prediction model. As the model, I have implemented a variational convolutional autoencoder (I attach the model code). In summary, the model receives a sequence of 5 images and must predict the following 5. The architecture consists of five convolutional layers in the encoder and decoder (Conv Transpose), which greatly reduce the image size and learn spatial details. Between the encoder and decoder there are ConvLSTM layers to learn the temporal sequences. I am working in Python, with TensorFlow and Keras.
The data consists of "images" from a 400x400-pixel rain radar, with a dark background and the rain cells in the center of the frame. The time between frames is 5 minutes (radar configuration). After further processing, the training data is scaled between 0 and 1 and stored in NumPy format (for working with matrices). My training data finally has the form [number of sequences, number of images per sequence, height, width, channel = 1].
[Image: Sequence of precipitation images]
The sequences are made up of 5 inputs and 5 targets, and there are 2111 radar image sequences (I know I don't have much data :( for training); 80% have been taken for training and 20% for validation.
In detail:
train_input = [1688, 5, 400, 400, 1]
train_target = [1688, 5, 400, 400, 1]
valid_input = [423, 5, 400, 400, 1]
valid_target = [423, 5, 400, 400, 1]
The problem is that after training my model, the accuracy value I obtain is very poor, around 8e-05. I've been training for 400 epochs, and the value stays at or around the mentioned value. Also, when I take a sequence of 5 images to predict the next 5, I get very bad results (not even a small formation of "spots" in the center, which would represent the rain). I have already tried reducing the number of layers in the encoder and decoder, as well as changing the optimizer [adam, nadam, adadelta] and the activation function [relu, elu]. I have not obtained any useful results, either in the predicted images or in the accuracy value.
[Image: Loss and accuracy during training]
I am a beginner in deep learning topics; I like the field a lot, but I can't find a solution to this problem. I suspect that my model architecture is not right, and beyond that I should look for a better optimizer or activation function to improve the accuracy value and the predicted images. As a last resort, I could crop the 400x400-pixel images to a central area where the precipitation is, although I would lose training data.
I would appreciate any help in solving this problem, perhaps with ideas for organizing my model architecture or the training data.
Best regards.
# Encoder
seq = Sequential()
seq.add(Input(shape=(5, 400, 400, 1)))
seq.add(Conv3D(filters=32, kernel_size=(11, 11, 5), strides=3,
               padding='same', activation='relu'))
seq.add(BatchNormalization())
seq.add(Conv3D(filters=32, kernel_size=(9, 9, 32), strides=2,
               padding='same', activation='relu'))
seq.add(BatchNormalization())
seq.add(Conv3D(filters=64, kernel_size=(7, 7, 32), strides=2,
               padding='same', activation='relu'))
seq.add(BatchNormalization())
seq.add(Conv3D(filters=64, kernel_size=(5, 5, 64), strides=2,
               padding='same', activation='relu'))
seq.add(BatchNormalization())
seq.add(Conv3D(filters=32, kernel_size=(3, 3, 64), strides=3,
               padding='same', activation='relu'))
seq.add(BatchNormalization())

# ConvLSTM Layers
seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
                   input_shape=(None, 6, 6, 32),
                   padding='same', return_sequences=True))
seq.add(BatchNormalization())
seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
                   padding='same', return_sequences=True))
seq.add(BatchNormalization())
seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
                   padding='same', return_sequences=True))
seq.add(BatchNormalization())
seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
                   padding='same', return_sequences=True))
seq.add(BatchNormalization())
seq.add(Conv3D(filters=32, kernel_size=(3, 3, 3),
               activation='relu',
               padding='same', data_format='channels_last'))

# Decoder
seq.add(Conv3DTranspose(filters=32, kernel_size=(3, 3, 64), strides=(2, 3, 3),
                        input_shape=(1, 6, 6, 32),
                        padding='same', activation='relu'))
seq.add(BatchNormalization())
seq.add(Conv3DTranspose(filters=64, kernel_size=(5, 5, 64), strides=(3, 2, 2),
                        padding='same', activation='relu'))
seq.add(BatchNormalization())
seq.add(Conv3DTranspose(filters=64, kernel_size=(7, 7, 32), strides=(1, 2, 2),
                        padding='same', activation='relu'))
seq.add(BatchNormalization())
seq.add(Conv3DTranspose(filters=32, kernel_size=(9, 9, 32), strides=(1, 2, 2),
                        padding='same', activation='relu'))
seq.add(BatchNormalization())
seq.add(Conv3DTranspose(filters=1, kernel_size=(11, 11, 5), strides=(1, 3, 3),
                        padding='same', activation='relu'))
seq.add(BatchNormalization())
seq.add(Cropping3D(cropping=(0, 16, 16)))
seq.add(Cropping3D(cropping=((0, -5), (0, 0), (0, 0))))

seq.compile(loss='mean_squared_error', optimizer='adadelta', metrics=['accuracy'])
In a regression problem, the metric you want to use is mse (mean_squared_error) or mae (mean_absolute_error); accuracy is a classification metric and is essentially meaningless here, which is why it stays near zero. You may want to use mse in the beginning, as it penalises large errors more heavily than mae does.
You just need to change the code slightly where you compile your model:
seq.compile(loss='mean_squared_error', optimizer='adadelta', metrics=['mse','mae'])
This way you can monitor both the mse and mae metrics during training.
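With those metrics compiled in, Keras reports them per epoch and keeps them in the fit history; a sketch of reading them back (variable names taken from the shapes listed in the question; the epoch count and batch size are placeholders):

history = seq.fit(train_input, train_target,
                  validation_data=(valid_input, valid_target),
                  epochs=10, batch_size=4)
print(history.history['mse'][-1], history.history['val_mae'][-1])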
I am trying to detect the single-pixel location of a single object in an image. I have a Keras CNN regression network with my image tensor as the input and a 3-item vector as the output.
First item: 1 (if an object was found) or 0 (no object was found)
Second item: a number between 0 and 1 indicating how far along the x axis the object is
Third item: a number between 0 and 1 indicating how far along the y axis the object is
I have trained the network on 2000 training images and 500 validation images; the val_loss is far less than 1, and the val_acc peaks at around 0.94. Excellent.
But when I predict the output, I find that the values of all three output items are not between 0 and 1; they actually range from roughly -2 to 3. All three items should be between 0 and 1.
I have not used any non-linear activation function on the output layer and have used ReLUs for all non-output layers. Should I be using a softmax, even though it is non-linear? The second and third items predict positions along the x and y axes of the image, which seem to me to be linear quantities.
Here is my keras network:
inputs = Input((256, 256, 1))
base_kernels = 64
# 256
conv1 = Conv2D(base_kernels, 3, activation='relu', padding='same', kernel_initializer='he_normal')(inputs)
conv1 = BatchNormalization()(conv1)
conv1 = Conv2D(base_kernels, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv1)
conv1 = BatchNormalization()(conv1)
conv1 = Dropout(0.2)(conv1)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
# 128
conv2 = Conv2D(base_kernels * 2, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool1)
conv2 = BatchNormalization()(conv2)
conv2 = Conv2D(base_kernels * 2, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv2)
conv2 = BatchNormalization()(conv2)
conv2 = Dropout(0.2)(conv2)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
# 64
conv3 = Conv2D(base_kernels * 4, 3, activation='relu', padding='same', kernel_initializer='he_normal')(pool2)
conv3 = BatchNormalization()(conv3)
conv3 = Conv2D(base_kernels * 4, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv3)
conv3 = BatchNormalization()(conv3)
conv3 = Dropout(0.2)(conv3)
pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
flat = Flatten()(pool3)
dense = Dense(256, activation='relu')(flat)
output = Dense(3)(dense)
model = Model(inputs=[inputs], outputs=[output])
optimizer = Adam(learning_rate=1e-4)
model.compile(optimizer=optimizer, loss='mean_absolute_error', metrics=['accuracy'])
Can anyone please help? Thanks! :)
Chris
The sigmoid activation produces outputs between zero and one, so if you use it as the activation of your last (output) layer, the network's output will be between zero and one:
output = Dense(3, activation="sigmoid")(dense)
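As for softmax: it would force the three outputs to sum to one, which is wrong here, since the presence flag and the two coordinates are independent quantities. Sigmoid is applied element-wise, squashing each output into (0, 1) independently, which matches the target encoding described in the question.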