Attention layer in Deep Learning classification - python

I am trying to have a normal classification model with a tabular dataset. I came across the Attention layer and I would like to use it to improve my model's accuracy.
input_features_size = X_train.shape[1]
layers = [
tf.keras.Input(shape = input_features_size),
tf.keras.layers.Dense(64, activation = 'relu', name = 'first_layer'),
tf.keras.layers.Dense(128, activation = 'relu', name = 'second_layer'),
tf.keras.layers.BatchNormalization(axis = 1),
tf.keras.layers.Dense(1, activation = 'sigmoid', name = 'output_layer')
]
metrics = [
tf.keras.metrics.BinaryAccuracy(name = 'accuracy'),
tf.keras.metrics.Precision(name = 'precision'),
tf.keras.metrics.Recall(name = 'recall')
]
NUM_EPOCHS = 20
deep_learning_model = Sequential(layers = layers, name = 'DL_Classifier')
deep_learning_model.compile(
loss = binary_crossentropy,
optimizer = Adam(learning_rate = 1e-4),
metrics = metrics
)
I tried adding Addention layer (tf.keras.layers.Attention()) in the layers list but I am doing some mistake here. I am getting this error : Attention layer must be called on a list of inputs, namely [query, value]
How to add an Attention layer?

Related

Why the tuned number of hidden layers (show 3) is not the same as the units (show 4 units) when tuning an ANN model using KerasTurner?

I am currently using the KerasTuner to tune my Artificial Neural Network (ANN) deep learning model for a binary classification project (tabular dataset ). Below is my function to build the model:
def build_model(hp):
# Create a Sequential model
model = tf.keras.Sequential()
# Input Layer: The now model will take as input arrays of shape (None, 67)
model.add(tf.keras.Input(shape = (X_train.shape[1],)))
# Tune number of hidden layers and number of neurons
for i in range(hp.Int('num_layers', min_value = 1, max_value = 4)):
hp_units = hp.Int(f'units_{i}', min_value = 64, max_value = 512, step = 5)
model.add(Dense(units = hp_units, activation = 'relu'))
# Output Layer
model.add(Dense(units = 1, activation='sigmoid'))
# Compile the model
hp_learning_rate = hp.Choice('learning_rate', values = [1e-2, 1e-3, 1e-4])
model.compile(optimizer = keras.optimizers.Adam(learning_rate = hp_learning_rate),
loss = keras.losses.BinaryCrossentropy(),
metrics = ["accuracy"]
)
return model
Codes of creating tuner:
import os
# HyperBand algorithm from keras tuner
hpb_tuner = kt.Hyperband(
hypermodel = build_model,
objective = 'val_accuracy',
max_epochs = 500,
seed = 42,
executions_per_trial = 3,
directory = os.getcwd(),
project_name = "Medical Claim (ANN)",
)
hpb_tuner.search_space_summary()
The best result shows that I have to use 3 hidden layers. However, why there is a total of 4 hidden layers shown?
If I didn't misunderstand, the num_layers parameter indicates how many hidden layers I have to use in my ANN, and parameters units_0 to units_3 indicate how many neurons I have to use in each hidden layer where units_0 refers to the first hidden layer, units_1 refers to the second hidden layer and so forth. The input layer of my ANN should equal the number of features in my dataset which is 67 as shown in my code above (within the build_model function), so I believe the units_0 does not refer to the number of neurons in the input layer.
Is there something wrong with my code? Hope any gurus here can solve my doubt and problem!

How should i define loss and performance metric for this CNN?

I have implemented a CNN with two output layers for GTSRB Dataset problem. One output layer classifies images into their respective classes and second layer predicts bounding box coordinates. In dataset, the upper left and lower right coordinate is provided for training images. We have to predict the same for the test images. How do i define the loss metric(MSE or any other) and performance metric(R-Squared or any other) for regression layer since it outputs 4 values(x and y coordinates for upper left and lower right point)? Below is the code of model.
def get_model() :
#Input layer
input_layer = Input(shape=(IMG_HEIGHT, IMG_WIDTH, N_CHANNELS, ), name="input_layer", dtype='float32')
#Convolution, maxpool and dropout layers
conv_1 = Conv2D(filters=8, kernel_size=(3,3), activation=relu,
kernel_initializer=he_normal(seed=54), bias_initializer=zeros(),
name="first_convolutional_layer") (input_layer)
maxpool_1 = MaxPool2D(pool_size=(2,2), name = "first_maxpool_layer")(conv_1)
#Fully connected layers
flat = Flatten(name="flatten_layer")(maxpool_1)
d1 = Dense(units=64, activation=relu, kernel_initializer=he_normal(seed=45),
bias_initializer=zeros(), name="first_dense_layer", kernel_regularizer = l2(0.001))(flat)
d2 = Dense(units=32, activation=relu, kernel_initializer=he_normal(seed=47),
bias_initializer=zeros(), name="second_dense_layer", kernel_regularizer = l2(0.001))(d1)
classification = Dense(units = 43, activation=None, name="classification")(d2)
regression = Dense(units = 4, activation = 'linear', name = "regression")(d2)
#Model
model = Model(inputs = input_layer, outputs = [classification, regression])
model.summary()
return model
For classification output, you need to use softmax.
classification = Dense(units = 43, activation='softmax', name="classification")(d2)
You should use categorical_crossentropy loss for the classification output.
For regression, you can use mse loss.

Loading weights TensorFlow 2.0 model error

I am using Python 3.X and TensorFlow 2.0 along with "tensorflow_model_optimization" package for neural network pruning. The code I have is as follows-
from tensorflow_model_optimization.sparsity import keras as sparsity
l = tf.keras.layers
# Original model without pruning-
model = Sequential()
model.add(l.InputLayer(input_shape = (784, )))
model.add(Flatten())
model.add(Dense(units = 300, activation='relu', kernel_initializer = tf.initializers.GlorotUniform()))
model.add(l.Dropout(0.2))
model.add(Dense(units = 100, activation='relu', kernel_initializer = tf.initializers.GlorotUniform()))
model.add(l.Dropout(0.1))
model.add(Dense(units = num_classes, activation='softmax'))
# Define callbacks-
callbacks = [
# tf.keras.callbacks.TensorBoard(log_dir=logdir, profile_batch = 0),
tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience = 3)
]
# Compile designed Neural Network-
model.compile(
loss = tf.keras.losses.categorical_crossentropy,
optimizer = 'adam',
metrics = ['accuracy'])
# Save untrained and initial weights to disk-
model.save_weights("Initial_non_trained_weights.h5")
epochs = 12
num_train_samples = X_train.shape[0]
end_step = np.ceil(1.0 * num_train_samples / batch_size).astype(np.int32) * epochs
print("end_step parameter for this dataset = {0}".format(end_step))
# end_step = 5628
# Specify the parameters to be used for layer-wise pruning:
pruning_params = {
'pruning_schedule': sparsity.PolynomialDecay(
initial_sparsity=0.50, final_sparsity=0.90,
begin_step=2000, end_step=end_step, frequency=100)
}
# Neural network which is to be pruned-
pruned_model = Sequential()
pruned_model.add(l.InputLayer(input_shape=(784, )))
pruned_model.add(Flatten())
pruned_model.add(sparsity.prune_low_magnitude(Dense(units = 300, activation='relu', kernel_initializer=tf.initializers.GlorotUniform()),
**pruning_params))
pruned_model.add(l.Dropout(0.2))
pruned_model.add(sparsity.prune_low_magnitude(Dense(units = 100, activation='relu', kernel_initializer=tf.initializers.GlorotUniform()),
**pruning_params))
pruned_model.add(l.Dropout(0.1))
pruned_model.add(sparsity.prune_low_magnitude(Dense(units = num_classes, activation='softmax'), **pruning_params))
# Compile pruned CNN-
pruned_model.compile(
loss=tf.keras.losses.categorical_crossentropy,
optimizer='adam',
metrics=['accuracy'])
# Load weights from before-
pruned_model.load_weights("Initial_non_trained_weights.h5")
This last line of loading initial weights into the pruned model gives me error:
ValueError: Layer #0 (named "prune_low_magnitude_dense_9" in the current model) was found to correspond to layer dense in the save file.
However the new layer prune_low_magnitude_dense_9 expects 5 weights, but the saved weights have 2 elements.
What's going wrong?
Thanks!

Print/Save autoencoder generated features in Keras

I have this autoencoder:
input_dim = Input(shape=(10,))
encoded1 = Dense(30, activation = 'relu')(input_dim)
encoded2 = Dense(20, activation = 'relu')(encoded1)
encoded3 = Dense(10, activation = 'relu')(encoded2)
encoded4 = Dense(6, activation = 'relu')(encoded3)
decoded1 = Dense(10, activation = 'relu')(encoded4)
decoded2 = Dense(20, activation = 'relu')(decoded1)
decoded3 = Dense(30, activation = 'relu')(decoded2)
decoded4 = Dense(ncol, activation = 'sigmoid')(decoded3)
autoencoder = Model(input = input_dim, output = decoded4)
autoencoder.compile(-...)
autoencoder.fit(...)
Now I would like print or save the features generate in encoded4.
Basically, starting from a huge dataset I would like to extract the features generated by autoencoder, after the training part, to obtain a restricted representation of my dataset.
Could you help me?
You can do it by creating the "encoder" model:
encoder = Model(input = input_dim, output = encoded4)
This will use the same layers instances that you trained with the autoencoder and should be producing the feature if you use it in "inference mode" like encoder.predict()
I hope this helps :)
So, basically, by creating an encoder like this:
encoder = Model (input_dim,encoded4)
encoded_input=Input(shape=(6,))
and then using:
encoded_data=encoder.predict(data)
where the data within the predict function is the dataset, the output generate by
print encoded_data
is the restricted representation of my dataset.
It is right?
Thanks

Keras - All layer names should be unique

I combine two VGG net in keras together to make classification task. When I run the program, it shows an error:
RuntimeError: The name "predictions" is used 2 times in the model. All layer names should be unique.
I was confused because I only use prediction layer once in my code:
from keras.layers import Dense
import keras
from keras.models import Model
model1 = keras.applications.vgg16.VGG16(include_top=True, weights='imagenet',
input_tensor=None, input_shape=None,
pooling=None,
classes=1000)
model1.layers.pop()
model2 = keras.applications.vgg16.VGG16(include_top=True, weights='imagenet',
input_tensor=None, input_shape=None,
pooling=None,
classes=1000)
model2.layers.pop()
for layer in model2.layers:
layer.name = layer.name + str("two")
model1.summary()
model2.summary()
featureLayer1 = model1.output
featureLayer2 = model2.output
combineFeatureLayer = keras.layers.concatenate([featureLayer1, featureLayer2])
prediction = Dense(1, activation='sigmoid', name='main_output')(combineFeatureLayer)
model = Model(inputs=[model1.input, model2.input], outputs= prediction)
model.summary()
Thanks for #putonspectacles help, I follow his instruction and find some interesting part. If you use model2.layers.pop() and combine the last layer of two models using "model.layers.keras.layers.concatenate([model1.output, model2.output])", you will find that the last layer information is still showed using the model.summary(). But actually they do not exist in the structure. So instead, you can use model.layers.keras.layers.concatenate([model1.layers[-1].output, model2.layers[-1].output]). It looks tricky but it works.. I think it is a problem about synchronization of the log and structure.
First, based on the code you posted you have no layers with a name attribute 'predictions', so this error has nothing to do with your layer
Dense layer prediction: i.e:
prediction = Dense(1, activation='sigmoid',
name='main_output')(combineFeatureLayer)
The VGG16 model has a Dense layer with name predictions. In particular this line:
x = Dense(classes, activation='softmax', name='predictions')(x)
And since you're using two of these models you have layers with duplicate names.
What you could do is rename the layer in the second model to something other than predictions, maybe predictions_1, like so:
model2 = keras.applications.vgg16.VGG16(include_top=True, weights='imagenet',
input_tensor=None, input_shape=None,
pooling=None,
classes=1000)
# now change the name of the layer inplace.
model2.get_layer(name='predictions').name='predictions_1'
You can change the layer's name in keras, don't use 'tensorflow.python.keras'.
Here is my sample code:
from keras.layers import Dense, concatenate
from keras.applications import vgg16
num_classes = 10
model = vgg16.VGG16(include_top=False, weights='imagenet', input_tensor=None, input_shape=(64,64,3), pooling='avg')
inp = model.input
out = model.output
model2 = vgg16.VGG16(include_top=False,weights='imagenet', input_tensor=None, input_shape=(64,64,3), pooling='avg')
for layer in model2.layers:
layer.name = layer.name + str("_2")
inp2 = model2.input
out2 = model2.output
merged = concatenate([out, out2])
merged = Dense(1024, activation='relu')(merged)
merged = Dense(num_classes, activation='softmax')(merged)
model_fusion = Model([inp, inp2], merged)
model_fusion.summary()
Example:
# Network for affine transform estimation
affine_transform_estimator = MobileNet(
input_tensor=None,
input_shape=(config.IMAGE_H // 2, config.IMAGE_W //2, config.N_CHANNELS),
alpha=1.0,
depth_multiplier=1,
include_top=False,
weights='imagenet'
)
affine_transform_estimator.name = 'affine_transform_estimator'
for layer in affine_transform_estimator.layers:
layer.name = layer.name + str("_1")
# Network for landmarks regression
landmarks_regressor = MobileNet(
input_tensor=None,
input_shape=(config.IMAGE_H // 2, config.IMAGE_W // 2, config.N_CHANNELS),
alpha=1.0,
depth_multiplier=1,
include_top=False,
weights='imagenet'
)
landmarks_regressor.name = 'landmarks_regressor'
for layer in landmarks_regressor.layers:
layer.name = layer.name + str("_2")
input_image = Input(shape=(config.IMAGE_H, config.IMAGE_W, config.N_CHANNELS))
downsampled_image = MaxPooling2D(pool_size=(2,2))(input_image)
x1 = affine_transform_estimator(downsampled_image)
x2 = landmarks_regressor(downsampled_image)
x3 = add([x1,x2])
model = Model(inputs=input_image, outputs=x3)
optimizer = Adadelta()
model.compile(optimizer=optimizer, loss=mae_loss_masked)

Categories