Training simple CNN-LSTM model - python

I have a task for my project paper and I do not understand how to train the model. The model is supposed to take an image and segment it into different classes. The hard part is that the different structures look the same, but I would like to differentiate between them. When I build a model with convolutional layers and an LSTM, the model only predicts the class of the background.
Here is my model:
def LSTMconv10x9(input_size=(200, 9, 10, 1)):
    input = Input(input_size)
    # per-frame feature extraction
    conv1 = TimeDistributed(Conv2D(32, 3, padding="same", activation="relu"))(input)
    conv2 = TimeDistributed(Conv2D(64, 3, padding="same", activation="relu"))(conv1)
    # temporal context across the 200 frames
    lstm = ConvLSTM2D(32, 3, return_sequences=True, padding="same", activation="softmax")(conv2)
    conv4 = TimeDistributed(Conv2D(64, 3, padding="same", activation="relu"))(lstm)
    conv5 = TimeDistributed(Conv2D(32, 3, padding="same", activation="relu"))(conv4)
    # one logit per class (11 = 10 structures + background) per pixel
    output = ConvLSTM2D(11, 1, return_sequences=True, padding="same", activation=None)(conv5)
    model = Model(inputs=input, outputs=output)
    model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  optimizer=tf.keras.optimizers.Adam(),
                  metrics=["accuracy"], sample_weight_mode='temporal')
    return model
And the way I train the model:
weights = np.where(train_y == 0, 0.1, 0.9)
model1 = LSTMconv10x9()
model1.fit(train_x, train_y, epochs=20, batch_size=32,
           validation_data=(test_x, test_y), sample_weight=weights)
The training set has shape (2000, 200, 9, 10, 1) and the validation set (1000, 200, 9, 10, 1), i.e. 2000 videos of 200 frames each in the training set. The videos show 10 structures that look the same, but I would like to number them so that they count as different structures. This is a segmentation problem.
The data is very unbalanced: the objects I want to separate appear in every video, but the background makes up about 90% of each video. I have tried weighting with the sample_weight_mode='temporal' option in TensorFlow, but it did not seem to work. The most important thing for the model is to find the structures.
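One hedged direction (a sketch, not from the original post): Keras's 'temporal' sample-weight mode expects weights of shape (samples, timesteps), so a full per-pixel weight array may not be applied the way it looks. An alternative is a class-weighted loss; a minimal version, assuming 11 classes with class 0 as background and labels carrying a trailing channel dimension of 1:

import tensorflow as tf

def weighted_sparse_cce(class_weights):
    # one weight per class, indexed by class id
    weights = tf.constant(class_weights, dtype=tf.float32)
    def loss(y_true, y_pred):
        y_true = tf.cast(tf.squeeze(y_true, axis=-1), tf.int32)
        # per-pixel unweighted loss on the logits
        cce = tf.keras.losses.sparse_categorical_crossentropy(
            y_true, y_pred, from_logits=True)
        # scale each pixel's loss by the weight of its true class
        return cce * tf.gather(weights, y_true)
    return loss

# background (class 0) down-weighted, the 10 structures up-weighted
model1.compile(loss=weighted_sparse_cce([0.1] + [0.9] * 10),
               optimizer=tf.keras.optimizers.Adam(), metrics=["accuracy"])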
Does anyone have any solutions to my problems?

Related

Why is the tuned number of hidden layers (shows 3) not the same as the number of units parameters (shows 4) when tuning an ANN model using KerasTuner?

I am currently using KerasTuner to tune my artificial neural network (ANN) deep learning model for a binary classification project (tabular dataset). Below is my function to build the model:
def build_model(hp):
    # Create a Sequential model
    model = tf.keras.Sequential()
    # Input layer: the model will take as input arrays of shape (None, 67)
    model.add(tf.keras.Input(shape=(X_train.shape[1],)))
    # Tune the number of hidden layers and the number of neurons per layer
    for i in range(hp.Int('num_layers', min_value=1, max_value=4)):
        hp_units = hp.Int(f'units_{i}', min_value=64, max_value=512, step=5)
        model.add(Dense(units=hp_units, activation='relu'))
    # Output layer
    model.add(Dense(units=1, activation='sigmoid'))
    # Compile the model
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate),
                  loss=keras.losses.BinaryCrossentropy(),
                  metrics=["accuracy"])
    return model
Code for creating the tuner:
import os
import keras_tuner as kt  # older releases used "import kerastuner as kt"

# Hyperband algorithm from KerasTuner
hpb_tuner = kt.Hyperband(
    hypermodel=build_model,
    objective='val_accuracy',
    max_epochs=500,
    seed=42,
    executions_per_trial=3,
    directory=os.getcwd(),
    project_name="Medical Claim (ANN)",
)
hpb_tuner.search_space_summary()
The best result shows that I should use 3 hidden layers. Why, then, is there a total of 4 units parameters shown?
If I haven't misunderstood, the num_layers parameter indicates how many hidden layers to use in my ANN, and the parameters units_0 to units_3 indicate how many neurons to use in each hidden layer, where units_0 refers to the first hidden layer, units_1 to the second, and so forth. The input layer of my ANN equals the number of features in my dataset, which is 67 as shown in the build_model function above, so I believe units_0 does not refer to the number of neurons in the input layer.
Is there something wrong with my code? I hope some gurus here can resolve my doubt!
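As a hedged illustration (assuming the hpb_tuner object above): KerasTuner keeps every hyperparameter that any trial has registered, so once some trial tried num_layers = 4, a units_3 entry shows up in every summary, even though build_model only consumes units_0 through units_{num_layers - 1}. One way to read just the active values:

# fetch the winning configuration and keep only the layers it actually builds
best_hp = hpb_tuner.get_best_hyperparameters(num_trials=1)[0]
n_layers = best_hp.get('num_layers')
active_units = [best_hp.get(f'units_{i}') for i in range(n_layers)]
print(n_layers, active_units)  # units_i for i >= n_layers are inactive leftovers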

How should I define the loss and performance metric for this CNN?

I have implemented a CNN with two output layers for the GTSRB dataset problem. One output layer classifies images into their respective classes and the second predicts the bounding-box coordinates. In the dataset, the upper-left and lower-right coordinates are provided for the training images, and we have to predict the same for the test images. How do I define the loss metric (MSE or any other) and the performance metric (R-squared or any other) for the regression layer, since it outputs 4 values (x and y coordinates for the upper-left and lower-right points)? Below is the code of the model.
from tensorflow.keras.layers import Input, Conv2D, MaxPool2D, Flatten, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.activations import relu
from tensorflow.keras.initializers import he_normal, zeros
from tensorflow.keras.regularizers import l2

def get_model():
    # Input layer
    input_layer = Input(shape=(IMG_HEIGHT, IMG_WIDTH, N_CHANNELS,), name="input_layer", dtype='float32')
    # Convolution, maxpool and dropout layers
    conv_1 = Conv2D(filters=8, kernel_size=(3, 3), activation=relu,
                    kernel_initializer=he_normal(seed=54), bias_initializer=zeros(),
                    name="first_convolutional_layer")(input_layer)
    maxpool_1 = MaxPool2D(pool_size=(2, 2), name="first_maxpool_layer")(conv_1)
    # Fully connected layers
    flat = Flatten(name="flatten_layer")(maxpool_1)
    d1 = Dense(units=64, activation=relu, kernel_initializer=he_normal(seed=45),
               bias_initializer=zeros(), name="first_dense_layer", kernel_regularizer=l2(0.001))(flat)
    d2 = Dense(units=32, activation=relu, kernel_initializer=he_normal(seed=47),
               bias_initializer=zeros(), name="second_dense_layer", kernel_regularizer=l2(0.001))(d1)
    # Two heads: class scores and bounding-box coordinates
    classification = Dense(units=43, activation=None, name="classification")(d2)
    regression = Dense(units=4, activation='linear', name="regression")(d2)
    # Model
    model = Model(inputs=input_layer, outputs=[classification, regression])
    model.summary()
    return model
For the classification output, you need to use softmax:
classification = Dense(units=43, activation='softmax', name="classification")(d2)
You should use categorical_crossentropy loss for the classification output (or sparse_categorical_crossentropy if your labels are integers).
For regression, you can use mse loss.
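A minimal sketch of wiring both heads up, assuming the layer names "classification" and "regression" from get_model() above and one-hot class labels; RMSE stands in here as a simple regression metric:

import tensorflow as tf

model = get_model()
model.compile(
    optimizer='adam',
    # one loss per named output head
    loss={'classification': 'categorical_crossentropy',
          'regression': 'mse'},
    metrics={'classification': 'accuracy',
             'regression': tf.keras.metrics.RootMeanSquaredError()},
)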

Modifying TensorFlow Neural Network connections

I am using Python 3.X along with TensorFlow 2.0 to create a toy neural network model, which is as follows:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(units=2, activation='relu',
                kernel_initializer=tf.keras.initializers.GlorotNormal(),
                input_shape=(2,)))
model.add(Dense(units=2, activation='relu',
                kernel_initializer=tf.keras.initializers.GlorotNormal()))
model.add(Dense(units=1, activation='sigmoid'))
I now want to modify the weights/biases of the model in a layer-wise manner. The code I have come up with is meant to zero out every randomly initialized weight/bias connection whose magnitude is less than 0.5, while leaving the others unchanged:
for layer in model.trainable_weights:
    layer = tf.where(tf.less(layer, 0.5), 0, layer)
However, this code does not change the connections as I want. What should I do?
Thanks!
Your code simply creates new tensors that hold the desired values and binds them to the Python variable layer; it doesn't change the TensorFlow variables. You need to use the assign method of the Variable class:
for layer in model.trainable_weights:
    # assign writes the new values back into the tf.Variable in place;
    # tf.abs matches the stated "magnitude less than 0.5" intent
    layer.assign(tf.where(tf.less(tf.abs(layer), 0.5), tf.zeros_like(layer), layer))
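A quick sanity check (a sketch, assuming the model above) that the variables really changed:

for layer in model.trainable_weights:
    # count the entries that were zeroed out in each variable
    n_zero = int(tf.reduce_sum(tf.cast(tf.equal(layer, 0.0), tf.int32)))
    print(layer.name, n_zero, "entries set to zero")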

Getting an array output but I want a single output with sparse_categorical loss

I am trying to fit a simple neural net with Keras. I have inputs, and I would like a single integer output that represents a class of its own, in the range 0-13. However, when the last layer is set to 1 unit, it gives me an error:
InvalidArgumentError: Received a label value of 12 which is outside the valid range of [0, 1). Label values:
This is what I have so far for compiling the neural net
import keras
from keras.models import Sequential
from keras.layers import Dense

classifier = Sequential()
classifier.add(Dense(units=10, kernel_initializer='uniform',
                     activation='relu', input_dim=10))
classifier.add(Dense(units=11, kernel_initializer='uniform', activation='relu'))
classifier.add(Dense(units=8, kernel_initializer='uniform', activation='relu'))
classifier.add(Dense(units=1, kernel_initializer='uniform', activation='sigmoid'))
classifier.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
                   metrics=['accuracy'])
classifier.fit(X_train, y_train, batch_size=2000, epochs=20)
My training inputs are arrays of arrays, and the labels are just an array with values from 0-12.
Let's understand sparse categorical cross-entropy:
it just gives you the ability to measure the error via integer labels (instead of one-hot arrays).
So why the error?
Your labels span 14 classes (0 through 13). What one-hot encoding would do with 14 columns still has to happen inside the network (you are not feeding one-hot labels; this is just a flashback to that method to remind us what to do): you need 14 output neurons, one score per class. Therefore,
the last layer should look like this:
classifier.add(Dense(units=14, kernel_initializer='uniform', activation='softmax'))
And by the way, it's good practice to use metrics=['sparse_categorical_accuracy'] with this loss.
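A hedged sketch of the matching compile call (softmax outputs pair with sparse_categorical_crossentropy when the labels are plain integers):

classifier.compile(optimizer='adam',
                   loss='sparse_categorical_crossentropy',
                   metrics=['sparse_categorical_accuracy'])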
If you want integer outputs, there are 2 options (to the best of my knowledge):
y_pred = np.argmax(classifier.predict(X_test), axis=1)
or simply:
y_pred = classifier.predict_classes(X_test)
(note that predict_classes was removed in later TensorFlow versions, so the argmax route is the safer one)
If you want integer outputs in the range [0, 13], this corresponds to 14 output classes (indices 0 to 13, starting from zero), so you need to configure the network appropriately:
classifier.add(Dense(units=14, kernel_initializer='uniform', activation='softmax'))
After training, when the model makes a prediction, you will get a probability distribution over integers [0, 13]. To get the encoded integer, you have to take the index with the maximum probability, for example:
pred = classifier.predict(some_data)
integer = np.argmax(pred, axis=-1)
This will produce the predicted integer label.

Setting up a CNN network with multi-label classification

I have a set of 100x100 images, and an output array corresponding to the size of the input (i.e. length of 10000), where each element can be a 1 or 0.
I am trying to write a Python program using TensorFlow/Keras to train a CNN on this data; however, I am not sure how to set up the layers to handle it, or which type of network to use.
Currently, I am doing the following (based on the TensorFlow tutorials):
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(100, 100)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10000, activation=tf.nn.softmax)
])
model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
However, I can't seem to find what type of activation I should be using for the output layer to enable multiple output values. How would I set that up?
I am not sure how to set up the layers to handle it.
Your code is one way to handle that, but as you might read in the literature, it is not the best one. State-of-the-art models usually use 2D convolutional neural networks, e.g.:
img_input = keras.layers.Input(shape=img_shape)
conv1 = keras.layers.Conv2D(16, 3, activation='relu', padding='same')(img_input)
pol1 = keras.layers.MaxPooling2D(2)(conv1)
conv2 = keras.layers.Conv2D(32, 3, activation='relu', padding='same')(pol1)
pol2 = keras.layers.MaxPooling2D(2)(conv2)
conv3 = keras.layers.Conv2D(64, 3, activation='relu', padding='same')(pol2)
pol3 = keras.layers.MaxPooling2D(2)(conv3)
flatten = keras.layers.Flatten()(pol3)
dens1 = keras.layers.Dense(512, activation='relu')(flatten)
dens2 = keras.layers.Dense(512, activation='relu')(dens1)
drop1 = keras.layers.Dropout(0.2)(dens2)
output = keras.layers.Dense(10000, activation='softmax')(drop1)
I can't seem to find what type of activation I should be using for the output layer to enable me to have multiple output values
Softmax is a good choice. It squashes a K-dimensional vector of arbitrary real values to a K-dimensional vector of real values in the range (0, 1) that sum to 1.
You can pass the output of your softmax to the top_k function to extract the top k predictions:
softmax_out = tf.nn.softmax(logit)
tf.nn.top_k(softmax_out, k=5, sorted=True)
If you need multi-label classification, you should change the above network: the last activation function becomes sigmoid (and the loss becomes binary_crossentropy, since each of the 10000 labels is an independent yes/no decision):
output = keras.layers.Dense(10000, activation='sigmoid')(drop1)
Then use tf.where to extract the predicted labels, i.e. the indices of every entry whose probability exceeds 0.5:
indices = tf.where(output > 0.5)
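A minimal sketch of wiring the multi-label variant up, assuming the img_input and sigmoid output layers above and 0/1 target vectors of length 10000:

model = keras.Model(img_input, output)
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=[keras.metrics.BinaryAccuracy()])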
