Multi Label Image Classifier Input Issues - python

Hello, I am trying to build a multi-label image classifier, but I am having issues with the input shape.
My features.shape is (40000, 28, 28, 1). Each image contains two letters, each ranging from a to g, that are to be classified. I added the last dimension (1) manually because, from my understanding, Conv2D needs a 3-dimensional input shape.
labels.shape is (40000, 2); it is an array holding the two letters associated with each photo.
Here is my model:
model = keras.Sequential([
Conv2D(32, 3, padding='same', activation='relu', input_shape=(28, 28, 1)),
MaxPooling2D(),
Conv2D(64, 3, padding='same', activation='relu'),
MaxPooling2D(),
Conv2D(64, 3, padding='same', activation='relu'),
MaxPooling2D(),
Flatten(),
Dense(256, activation='relu'),
Dense(7, activation='sigmoid')
])
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
When I train the model I receive the error
ValueError: `logits` and `labels` must have the same shape, received ((None, 7) vs (None, 2)).
I am assuming I need to reshape the labels or features somehow but I am not sure.
I have been trying multiple different inputs and changes to no avail. I appreciate any help on this problem.

Your input shape is correct; the problem is the last Dense layer. It outputs 7 values per image, while each label row has 2 values, so the loss cannot compare them. Since you have two label outputs, change the last layer to Dense(2, activation='sigmoid') instead of Dense(7, activation='sigmoid').
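Another way to make the two shapes agree, if the letters are meant to be treated as 7 independent yes/no classes, is to keep the 7-unit sigmoid layer and turn each pair of letters into a 7-element multi-hot vector. The following is only a sketch of that idea: the `multi_hot` helper and the sample label pairs are hypothetical, and it assumes the labels are stored as lowercase letters a to g.

```python
CLASSES = "abcdefg"  # assumed label alphabet (a-g), 7 classes

def multi_hot(pair, classes=CLASSES):
    """Return a 0/1 list with a 1 at the position of every letter in `pair`."""
    vector = [0] * len(classes)
    for letter in pair:
        vector[classes.index(letter)] = 1
    return vector

# Hypothetical label rows, one pair of letters per image.
labels = [("a", "c"), ("g", "b"), ("d", "d")]
encoded = [multi_hot(pair) for pair in labels]

for pair, row in zip(labels, encoded):
    print(pair, "->", row)
```

Each encoded row has 7 entries, which matches a (None, 7) model output. Note that a repeated letter such as ("d", "d") collapses to a single 1, so this encoding keeps neither the order of the letters nor duplicates.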

Related

Data dimensionality problem with tensorflow sequential model

Being new to Keras sequential models is causing me a few troubles!
I have an x_train of shape : 17755 x 500 x 12
and y_train of shape: 17755 x 15 (labels are already one-hot encoded)
And I made the next model to be trained on this data:
model = Sequential()
model.add(Conv2D(32,3,padding="same", activation="relu", input_shape=(17755,500,12)))
model.add(MaxPool2D())
model.add(Conv2D(32, 3, padding="same", activation="relu"))
model.add(MaxPool2D())
model.add(Conv2D(64, 3, padding="same", activation="relu"))
model.add(MaxPool2D())
model.add(Dropout(0.4))
model.add(Flatten())
model.add(Dense(128,activation="relu"))
model.add(Dense(15, activation="sigmoid"))
model.compile(optimizer ='adam', loss='categorical_crossentropy', metrics = ['Accuracy'])
history = model.fit(x_train, y_train, epochs=5)
1- when I don’t use np.expand_dims to add an axis for batch, I get this error:
ValueError: Input 0 of layer "sequential" is incompatible with the
layer: expected shape=(None, 17755, 500, 12), found shape=(None, 500,
12)
2- when I do use np.expand_dims and the shape of x_train became: 1x17755x500x12
I get this error:
Data cardinality is ambiguous:
x sizes: 1
y sizes: 17755
Make sure all arrays contain the same number of samples.
3- when I use np.expand_dims for y_train too and its shape became: 1x17755x15
I get this error:
ValueError: Shapes (None, 17755, 15) and (None, 15) are incompatible
I know I’m doing something fundamentally wrong, but what is it? Can anyone please help me out with the shape of the data?
Regarding x_train, try adding a new dimension at the end to represent the channel dimension that Conv2D layers need. Note also that input_shape must describe a single sample, so it should not include the number of samples. Here is a working example:
import tensorflow as tf
import numpy as np
x_train = np.random.random((17755,500,12))
x_train = np.expand_dims(x_train, axis=-1)
y_train = np.random.random((17755,15))
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv2D(32,3,padding="same", activation="relu", input_shape=(500, 12, 1)))
model.add(tf.keras.layers.MaxPool2D())
model.add(tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"))
model.add(tf.keras.layers.MaxPool2D())
model.add(tf.keras.layers.Conv2D(64, 3, padding="same", activation="relu"))
model.add(tf.keras.layers.MaxPool2D())
model.add(tf.keras.layers.Dropout(0.4))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128,activation="relu"))
model.add(tf.keras.layers.Dense(15, activation="sigmoid"))
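# Lowercase 'accuracy' lets Keras choose the accuracy variant matching the loss;
# the capitalized name may resolve to the exact-match Accuracy class instead.
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])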
history = model.fit(x_train, y_train, epochs=5)
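The three errors in the question all come down to which axis counts the samples. The NumPy-only sketch below illustrates this with a hypothetical 8-sample stand-in array instead of the real data: appending the channel axis keeps one row per sample, while prepending an axis turns the whole dataset into a single sample.

```python
import numpy as np

# Small stand-in for the (17755, 500, 12) array so the sketch runs quickly.
x_train = np.zeros((8, 500, 12))
y_train = np.zeros((8, 15))

# axis=-1 appends the channel axis and keeps one row per sample.
x_channels_last = np.expand_dims(x_train, axis=-1)

# axis=0 prepends an axis instead, so the whole dataset becomes one sample.
x_wrong = np.expand_dims(x_train, axis=0)

# input_shape describes one sample: the array shape minus the first axis.
input_shape = x_channels_last.shape[1:]

print(x_channels_last.shape)  # (8, 500, 12, 1)
print(x_wrong.shape)          # (1, 8, 500, 12)
print(input_shape)            # (500, 12, 1)
print(x_channels_last.shape[0] == y_train.shape[0])  # True
```

The first axis of x and y must agree because it is the sample count; everything after it is what input_shape should contain.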

Why should the input_shape property of a Conv2D layer be specified only for the first Conv2D layer?

I am new to AI/ML stuff. I'm learning TensorFlow. In a tutorial, I noticed that the input_shape argument was specified only for the first Conv2D layer. The code looked roughly like this:
model = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(16, (3,3), activation='relu',
input_shape=(300,300,3)),
tf.keras.layers.MaxPooling2D(2, 2),
tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2,2),
tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2,2),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(512, activation='relu'),
tf.keras.layers.Dense(1, activation='sigmoid')
])
In many examples, not only the one above, the instructor didn't include that argument in the later layers. Is there any reason for that?
The next layers derive the required shape from the output of the previous layer. That is, the MaxPooling2D layer derives its input shape based on the output of the Conv2D layer and so on. Note that in your sequential model, you don't even need to define an input_shape in the first layer. It is able to derive the input_shape if you feed it real data, which gives you a bit more flexibility since you don't have to hard-code the input shape:
import tensorflow as tf
tf.random.set_seed(1)
model = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(16, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2, 2),
tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2,2),
tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2,2),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(512, activation='relu'),
tf.keras.layers.Dense(1, activation='sigmoid')
])
print(model(tf.random.normal((1, 300, 300, 3))))
tf.Tensor([[0.6059081]], shape=(1, 1), dtype=float32)
If data with an incorrect shape, for example (300, 3) instead of (300, 300, 3), is passed to your model, an error occurs because a Conv2D layer requires a 3D input excluding the batch dimension.
If your model has no input_shape and has not yet been called on data, however, model.summary() cannot report the layer shapes of your network, because they are not known yet. You first have to build your model with an input shape:
model.build(input_shape=(1, 300, 300, 3))
model.summary()
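The shape derivation described in this answer can be traced by hand. The sketch below is plain Python with two hypothetical helper functions (not Keras APIs); it assumes the defaults used in the model above, that is stride-1 convolutions with 'valid' padding and 2x2 pooling that drops partial windows.

```python
def conv2d_valid(shape, filters, kernel=3):
    """Output shape of a stride-1 convolution with 'valid' padding."""
    height, width, _ = shape
    return (height - kernel + 1, width - kernel + 1, filters)

def max_pool(shape, pool=2):
    """Output shape of non-overlapping pooling; partial windows are dropped."""
    height, width, channels = shape
    return (height // pool, width // pool, channels)

shape = (300, 300, 3)  # one sample, without the batch axis
for filters in (16, 32, 64):
    shape = conv2d_valid(shape, filters)
    print("Conv2D       ->", shape)
    shape = max_pool(shape)
    print("MaxPooling2D ->", shape)

flattened = shape[0] * shape[1] * shape[2]
print("Flatten      ->", flattened)
```

Only the very first shape has to come from outside, either as input_shape or from real data; every later layer's input is the previous layer's output.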

Keras Error: Data cardinality is ambiguous

I am trying the following snippet on 64 images of size 28,28,1, but it throws
ValueError: Data cardinality is ambiguous
even though I think the dimensions of the tensors are correct.
loss = model_net.train_on_batch(Batch_X, Batch_Y)
print(type(Batch_X))        # <class 'list'>
print(type(Batch_X[1][0]))  # <class 'numpy.ndarray'>
print(type(Batch_Y))        # <class 'numpy.ndarray'>
print(np.shape(Batch_X))    # (2, 64, 28, 28, 1)
print(np.shape(Batch_Y))    # (64,)
Model is
input_shape=(28,28,1)
left_input = Input(input_shape)
right_input = Input(input_shape)
model = Sequential([
Conv2D(filters=64, kernel_size=(3, 3), activation='relu',input_shape=(28, 28, 1)),
MaxPool2D(pool_size=(2, 2), strides=2),
Conv2D(filters=64, kernel_size=(3, 3), activation='relu'),
MaxPool2D(pool_size=(2, 2), strides=2),
Flatten(),
Dense(units=4096, activation='sigmoid')])
model.summary()
encoded_l = model(left_input)
encoded_r = model(right_input)
subtracted = keras.layers.Subtract()([encoded_l, encoded_r])
prediction = Dense(1, activation='sigmoid')(subtracted)
model_net = Model(inputs=[left_input, right_input], outputs=prediction)
optimizer= Adam(learning_rate=0.0006)
model_net.compile(loss='binary_crossentropy', optimizer=optimizer)
plot_model(model_net, show_shapes=True, show_layer_names=True)
As you can see above, the model takes in 2 images simultaneously for the forward pass; thus every element in Batch_X comprises 2 matrices. Any suggestions as to where I am making a mistake?
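No answer is recorded for this question here. One possible cause, offered only as an assumption based on the printed types, is that Batch_X is a list of two lists of 64 separate arrays rather than a list of two arrays, so the number of samples per input is not what the model expects. A two-input model is normally fed one array per input, each with the samples on the first axis. The NumPy-only sketch below uses a hypothetical stand-in for the batch to show that conversion.

```python
import numpy as np

# Hypothetical stand-in for the batch in the question: a list of two lists,
# each holding 64 separate (28, 28, 1) arrays.
Batch_X = [[np.zeros((28, 28, 1)) for _ in range(64)] for _ in range(2)]
Batch_Y = np.zeros((64,))

# Stack each inner list into one array so every input has a leading sample axis.
left = np.stack(Batch_X[0])
right = np.stack(Batch_X[1])

# Every input and the targets should agree on the number of samples.
sample_counts = {
    "left": left.shape[0],
    "right": right.shape[0],
    "y": Batch_Y.shape[0],
}
print(left.shape, right.shape, sample_counts)
```

With arrays like these, the call would become model_net.train_on_batch([left, right], Batch_Y). Whether this resolves the original error depends on how Batch_X was actually built.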

Probability for Tensorflow Binary Image Classification

I am trying to follow the Image Classification Tutorial, but unfortunately it doesn't tell you how to use the model after you've created it.
The code I currently use to create the model is:
model = Sequential([
tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu', input_shape=(IMG_SIZE, IMG_SIZE ,3)),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Dropout(0.1),
tf.keras.layers.Conv2D(32, 3, padding='same', activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Conv2D(64, 3, padding='same', activation='relu'),
tf.keras.layers.MaxPooling2D(),
tf.keras.layers.Dropout(0.1),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(512, activation='relu'),
tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam',
loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
metrics=['accuracy'])
On my first attempt I did not have activation='sigmoid' on the last Dense layer, but then the predictions I got from the model were, for example, [[332.9539]], which I don't know how to interpret.
After I read this answer, I added the sigmoid activation to get a value between 0 and 1, but unfortunately the accuracy is now stuck at 0.5 during training, while it worked before.
What am I doing wrong?
If you add the sigmoid activation to the last layer, then you need to remove the from_logits=True from the loss instance, since your model is no longer producing logits:
model.compile(optimizer='adam',
loss=tf.keras.losses.BinaryCrossentropy(),
metrics=['accuracy'])
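To see why the mismatch matters, it can help to look at what happens when values that are already probabilities are treated as logits: the sigmoid is effectively applied a second time. The sketch below is plain arithmetic with a hypothetical `sigmoid` helper, not a reproduction of what any particular TensorFlow version does internally.

```python
import math

def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

logits = [-10.0, -1.0, 0.0, 1.0, 10.0]

# What the model outputs once its last layer has a sigmoid activation.
probabilities = [sigmoid(z) for z in logits]

# What a from_logits=True loss would work with if it treated those
# probabilities as logits and applied sigmoid again.
squashed_twice = [sigmoid(p) for p in probabilities]

for z, p, q in zip(logits, probabilities, squashed_twice):
    print(f"logit={z:6.1f}  sigmoid={p:.4f}  sigmoid(sigmoid)={q:.4f}")
```

Because the first sigmoid confines its output to (0, 1), the second one can only produce values between 0.5 and about 0.731, so the loss never sees a confident prediction for either class.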

Python: How to solve the low accuracy of a Variational Autoencoder Convolutional Model developed to predict a sequence of future frames?

I am currently developing a model to predict the displacement of precipitation cells. I chose to implement a variational convolutional autoencoder (the model code is attached below). In summary, the model receives a sequence of 5 images and must predict the following 5. The architecture consists of five convolutional layers in the encoder and five transposed-convolution layers (Conv3DTranspose) in the decoder, intended to greatly reduce the image size and learn spatial details. Between the encoder and decoder there are ConvLSTM layers to learn the temporal sequences. I am working in Python with TensorFlow and Keras.
The data consists of rain radar "images" of 400x400 pixels, with a dark background and the rain cells in the center of the frame. The time between frames is 5 minutes (the radar configuration). After further processing, the training data is scaled between 0 and 1 and stored in NumPy format (for working with matrices). My training data finally has the shape [number of sequences, number of images per sequence, height, width, channel = 1].
[Figure: Sequence of precipitation images]
Each sequence is made up of 5 inputs and 5 targets. There are 2111 radar image sequences (I know that is not much data for training :( ); 80% are used for training and 20% for validation.
To detail:
train_input = [1688, 5, 400, 400, 1]
train_target = [1688, 5, 400, 400, 1]
valid_input = [423, 5, 400, 400, 1]
valid_target = [423, 5, 400, 400, 1]
The problem is that after training, the accuracy of my model is very poor, around 8e-05. I have trained for 400 epochs, and the value stays at or near that figure. Also, when I take a sequence of 5 images to predict the next 5, I get very bad results (not even a small formation of "spots" in the center, which would represent the rain). I have already tried reducing the number of layers in the encoder and decoder, different optimizers [adam, nadam, adadelta], and different activation functions [relu, elu]. None of this has improved the predicted images or the accuracy value.
[Figure: Loss and accuracy during training]
I am a beginner in deep learning; I like it a lot, but I can't find a solution to this problem. I suspect that my model architecture is not right, and that I should also look for a better optimizer or activation function to improve the accuracy value and the predicted images. As a last resort, I could crop the 400x400 images to a central area where the precipitation is, although I would lose training data.
I would appreciate any help solving this problem, perhaps some ideas for organizing my model architecture or the training data.
Best regards.
# Encoder
seq = Sequential()
seq.add(Input(shape=(5, 400, 400,1)))
seq.add(Conv3D(filters=32, kernel_size=(11, 11, 5), strides=3,
padding='same', activation ='relu'))
seq.add(BatchNormalization())
seq.add(Conv3D(filters=32, kernel_size=(9, 9, 32), strides=2,
padding='same', activation ='relu'))
seq.add(BatchNormalization())
seq.add(Conv3D(filters=64, kernel_size=(7, 7, 32), strides=2,
padding='same', activation ='relu'))
seq.add(BatchNormalization())
seq.add(Conv3D(filters=64, kernel_size=(5, 5, 64), strides=2,
padding='same', activation ='relu'))
seq.add(BatchNormalization())
seq.add(Conv3D(filters=32, kernel_size=(3, 3, 64), strides=3,
padding='same', activation ='relu'))
seq.add(BatchNormalization())
# ConvLSTM Layers
seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
input_shape=(None, 6, 6, 32),
padding='same', return_sequences=True))
seq.add(BatchNormalization())
seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
padding='same', return_sequences=True))
seq.add(BatchNormalization())
seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
padding='same', return_sequences=True))
seq.add(BatchNormalization())
seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
padding='same', return_sequences=True))
seq.add(BatchNormalization())
seq.add(Conv3D(filters=32, kernel_size=(3, 3, 3),
activation='relu',
padding='same', data_format='channels_last'))
# Decoder
seq.add(Conv3DTranspose(filters=32, kernel_size=(3, 3, 64), strides=(2,3,3),
input_shape=(1, 6, 6, 32),
padding='same', activation ='relu'))
seq.add(BatchNormalization())
seq.add(Conv3DTranspose(filters=64, kernel_size=(5, 5, 64), strides=(3,2,2),
padding='same', activation ='relu'))
seq.add(BatchNormalization())
seq.add(Conv3DTranspose(filters=64, kernel_size=(7, 7, 32), strides=(1,2,2),
padding='same', activation ='relu'))
seq.add(BatchNormalization())
seq.add(Conv3DTranspose(filters=32, kernel_size=(9, 9, 32), strides=(1,2,2),
padding='same', activation ='relu'))
seq.add(BatchNormalization())
seq.add(Conv3DTranspose(filters=1, kernel_size=(11, 11, 5), strides=(1,3,3),
padding='same', activation ='relu'))
seq.add(BatchNormalization())
seq.add(Cropping3D(cropping = (0,16,16)))
seq.add(Cropping3D(cropping = ((0,-5),(0,0),(0,0))))
seq.compile(loss='mean_squared_error', optimizer='adadelta', metrics=['accuracy'])
Accuracy is a classification metric, so it says little about how close predicted pixel values are to the targets. For a regression problem like this one, the metrics to monitor are mse (mean_squared_error) or mae (mean_absolute_error).
You may want to use mse in the beginning, as it penalises large errors more heavily than mae does.
You only need to change the line where you compile your model:
seq.compile(loss='mean_squared_error', optimizer='adadelta', metrics=['mse','mae'])
This way you can monitor both the mse and mae metrics during training.
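The difference between the two metrics can be checked with a few numbers. The sketch below uses plain Python with hypothetical `mse` and `mae` helpers and made-up values: both predictions have the same mean absolute error, but the one that concentrates the error in a single large miss gets twice the mean squared error.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [0.0, 1.0]
pred_spread = [0.3, 0.7]  # two moderate errors of 0.3 each
pred_single = [0.0, 0.4]  # one exact value and one large error of 0.6

for name, pred in (("spread", pred_spread), ("single", pred_single)):
    print(f"{name}: mae={mae(y_true, pred):.3f}  mse={mse(y_true, pred):.3f}")
```

Here both cases give an mae of about 0.3, while the mse is about 0.09 for the spread errors and about 0.18 for the single large error.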
