I am working on a project in which I am trying to implement transfer learning to classify ECG signals (1-dimensional). I have a pretrained model with pretty good accuracy, but it was trained on a different dataset, with an input shape of (4096, 12) and output shape (6). I want to fine-tune this pre-trained model on my data, which has an input shape of (350, 5).
To do so, I have added some layers before the input of the pretrained model in order to get the shape (4096, 12), and added an output dense layer with shape (5). The code of my model is below:
from tensorflow.keras.layers import (Dense, Input, Conv1D, BatchNormalization,
                                     Activation, Flatten, Reshape, Dropout)
from tensorflow.keras.models import Model
# layers to reshape the input to the shape the pre-trained model expects
new_inp = Input(shape=(300,5))
net = Flatten()(new_inp)
net = Dense(1000, activation='relu')(net)
net = Dropout(0.3)(net)
net = Dense(4096, activation='relu')(net)
net = Dropout(0.3)(net)
net = Reshape([4096,1])(net)
net = Conv1D(filters=64, kernel_size=11, strides=1, padding='same')(net)
net = BatchNormalization()(net)
net = Activation('relu')(net)
net = Conv1D(filters=12, kernel_size=9, strides=1, padding='same')(net)
net = BatchNormalization()(net)
net = Activation('relu')(net)
# pre-trained model
net = mod(net)
# output layer
ll = Dense(4,activation="softmax")(net)
newModel = Model(new_inp, ll)
My training and validation accuracy are not improving; they plateau at around 55%. Any idea what the problem is?
Thank you.
The idea behind transfer learning is that you append new trainable layers to the end of a pre-trained model, freeze the pre-trained layers, and train only the new layers. When you add new layers to the beginning of the pre-trained model and train the whole network, you are essentially overwriting the pre-trained coefficients.
It is possible to add preprocessing layers (or any layer that does not require back-propagation) at the beginning, but you have added a whole DNN.
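For comparison, a conventional transfer-learning setup over the same backbone would look roughly like this (a minimal sketch: mod is the pre-trained model from the question, the head sizes are assumptions, and the (350, 5) data would still need a non-trainable preprocessing step such as resampling/padding to reach (4096, 12)):
from tensorflow.keras.layers import Dense, Dropout, Input
from tensorflow.keras.models import Model

# Freeze the pre-trained model so its weights are not overwritten.
mod.trainable = False

# Keep the input shape the backbone was trained on.
inp = Input(shape=(4096, 12))
net = mod(inp, training=False)  # training=False keeps BatchNorm in inference mode
# New trainable head on top of the frozen backbone.
net = Dense(128, activation='relu')(net)
net = Dropout(0.3)(net)
out = Dense(4, activation='softmax')(net)
newModel = Model(inp, out)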
Related
I am currently using KerasTuner to tune my artificial neural network (ANN) deep learning model for a binary classification project (tabular dataset). Below is my function to build the model:
def build_model(hp):
    # Create a Sequential model
    model = tf.keras.Sequential()
    # Input layer: the model will take arrays of shape (None, 67) as input
    model.add(tf.keras.Input(shape=(X_train.shape[1],)))
    # Tune the number of hidden layers and the number of neurons per layer
    for i in range(hp.Int('num_layers', min_value=1, max_value=4)):
        hp_units = hp.Int(f'units_{i}', min_value=64, max_value=512, step=5)
        model.add(Dense(units=hp_units, activation='relu'))
    # Output layer
    model.add(Dense(units=1, activation='sigmoid'))
    # Compile the model
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=hp_learning_rate),
                  loss=tf.keras.losses.BinaryCrossentropy(),
                  metrics=['accuracy'])
    return model
Code for creating the tuner:
import os
import keras_tuner as kt

# Hyperband algorithm from KerasTuner
hpb_tuner = kt.Hyperband(
    hypermodel=build_model,
    objective='val_accuracy',
    max_epochs=500,
    seed=42,
    executions_per_trial=3,
    directory=os.getcwd(),
    project_name="Medical Claim (ANN)",
)
hpb_tuner.search_space_summary()
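For completeness, the search itself would be launched roughly like this (a minimal sketch; the epoch count, validation split, and early-stopping callback are assumptions):
hpb_tuner.search(X_train, y_train,
                 epochs=50,
                 validation_split=0.2,
                 callbacks=[tf.keras.callbacks.EarlyStopping('val_loss', patience=5)])
best_hps = hpb_tuner.get_best_hyperparameters(num_trials=1)[0]
print(best_hps.values)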
The best result shows that I should use 3 hidden layers. However, why is there a total of 4 hidden layers (units_0 to units_3) shown?
If I haven't misunderstood, the num_layers parameter indicates how many hidden layers I should use in my ANN, and the parameters units_0 to units_3 indicate how many neurons I should use in each hidden layer, where units_0 refers to the first hidden layer, units_1 to the second, and so forth. The input layer of my ANN should equal the number of features in my dataset, which is 67, as shown in the build_model function above, so I believe units_0 does not refer to the number of neurons in the input layer.
Is there something wrong with my code? Hope any gurus here can solve my doubt and problem!
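As an aside, if the goal is to keep hyperparameters of inactive layers out of the summary, KerasTuner's conditional scoping can express that. A sketch of build_model rewritten with it (assuming hp.conditional_scope; units_i is then only registered for trials where layer i is actually active):
def build_model(hp):
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(X_train.shape[1],)))
    num_layers = hp.Int('num_layers', min_value=1, max_value=4)
    for i in range(num_layers):
        # Layer i exists only when num_layers >= i + 1.
        with hp.conditional_scope('num_layers', list(range(i + 1, 5))):
            model.add(Dense(units=hp.Int(f'units_{i}', min_value=64, max_value=512, step=5),
                            activation='relu'))
    model.add(Dense(units=1, activation='sigmoid'))
    model.compile(optimizer=tf.keras.optimizers.Adam(hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model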
I am trying to use the transfer learning example from keras.applications with VGG19, training on the cifar10 dataset (10 classes). My model is (conceptually) simple: it's just VGG19 minus the top three layers, followed by some extra layers that are trainable.
import tensorflow as tf
from keras.utils import to_categorical
from keras.applications import VGG19
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D, Input
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
#%%
# Specify input and number of classes
input_tensor = Input(shape=(32, 32, 3))
num_classes=10
# Load the data (cifar10): 10 classes
(X_train,y_train),(X_test,y_test)=tf.keras.datasets.cifar10.load_data()
#One_Hot_encode y data
y_test=to_categorical(y_test,num_classes=num_classes,dtype='int32')
y_train=to_categorical(y_train,num_classes=num_classes,dtype='int32')
#%%
# create the base pre-trained model
base_model = VGG19(weights='imagenet', include_top=False,
input_tensor=input_tensor)
# Add a fully connected layer and then a logistic layer
x = base_model.output
# let's add a fully-connected layer
x = Dense(1024, activation='relu', name='Fully_Connected')(x)
# and a logistic layer for our 10 classes
predictions = Dense(num_classes, activation='softmax', name='Logistic')(x)
# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional VGG19 layers
for layer in base_model.layers:
    layer.trainable = False
# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='adam', loss='categorical_crossentropy',metrics=['accuracy'])
# train the model on the new data for a few epochs
model.fit(X_train,y_train,epochs=10)
#%%
model.evaluate(X_test,y_test)
Now when I try to train using X_train [dimensions of (50000, 32, 32, 3)] and y_train [dimensions of (50000, 10)], I get the error:
ValueError: Error when checking target: expected Logistic to have 4
dimensions, but got array with shape (50000, 10)
So for some reason the model isn't realizing that its output shape should be a 1x10 vector with one-hot encoding for the 10 classes.
How can I make the dimensions agree? I don't fully understand the output dimensions Keras is expecting here. When I call model.summary(), the Logistic layer's output shape is (None, 1, 1, 10), which when flattened should just give a (None, 10) output.
VGG19 without the top layers does not end in a fully-connected layer; it returns a 2D feature map (the output of a Conv2D/MaxPooling2D block, I believe). You'll probably want to place a Flatten after the VGG; that is the best practical choice, as it will make your output shape (None, 10).
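Concretely, the fix is a one-line insertion into the question's code (a sketch; only the Flatten line is new):
from keras.layers import Flatten

x = base_model.output
x = Flatten()(x)  # collapse the (1, 1, 512) feature map into a flat vector
x = Dense(1024, activation='relu', name='Fully_Connected')(x)
predictions = Dense(num_classes, activation='softmax', name='Logistic')(x)
model = Model(inputs=base_model.input, outputs=predictions)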
Otherwise, you could do
y_train = np.reshape(y_train, (50000,1,1,10))
I load a pretrained model. However, I want to add Spectral Normalization to this model.
I can do it this way if the model is fully sequential:
import tensorflow as tf
from tensorflow_addons.layers import SpectralNormalization  # e.g. TensorFlow Addons' implementation

base_model = tf.keras.applications.VGG16(input_shape=(720, 1280, 3), weights=None, include_top=False)

# Add spectral normalization
base_model_new = tf.keras.models.Sequential()
for layer in base_model.layers:
    if isinstance(layer, tf.keras.layers.Conv2D):
        spec_norm_layer = SpectralNormalization(layer)
        base_model_new.add(spec_norm_layer)
    else:
        base_model_new.add(layer)

input = tf.keras.Input(shape=(720, 1280, 3), name='image')
x = base_model_new(input)
model = tf.keras.models.Model(inputs=[input], outputs=x, name='test')
However, this method does not work when, for example, using a ResNet, where there are Add or Concatenate layers that expect a list of inputs. How can I solve this for ResNets?
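A sketch of one possible approach, assuming tf.keras.models.clone_model with its clone_function argument (clone_model re-traces the whole graph, so Add/Concatenate layers are handled; note the wrapped Conv2D layers are rebuilt from config, so pretrained weights would have to be copied over separately):
import tensorflow as tf
from tensorflow_addons.layers import SpectralNormalization  # assumption: TFA's implementation

def wrap_conv(layer):
    # Rebuild each layer from its config; wrap Conv2D layers in spectral normalization.
    config = layer.get_config()
    if isinstance(layer, tf.keras.layers.Conv2D):
        return SpectralNormalization(tf.keras.layers.Conv2D.from_config(config))
    return layer.__class__.from_config(config)

base_model = tf.keras.applications.ResNet50(input_shape=(720, 1280, 3),
                                            weights=None, include_top=False)
sn_model = tf.keras.models.clone_model(base_model, clone_function=wrap_conv)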
I was trying to retrain Inception-V4 on an image set using unsupervised learning.
First, I loaded the pre-trained weights file for Inception-V4:
out = Dense(nb_classes, activation='softmax')(x)
model = Model(init, out, name='Inception-v4')

if load_weights:
    weights = checkpoint_custom_path
    model.load_weights(weights, by_name=True)
    print("Model weights loaded.")

return model
Then I removed the final fully-connected layer and dense layer to make a feature extractor.
Next, I added a new convolution block to get a feature vector of the desired dimensionality from the model.
Then I saved the model and loaded it again:
model = create_inception_v4(load_weights=check)
model.summary()

model.layers.pop()
model.layers.pop()
model.summary()

x = conv_block(model.layers[-1].output, 512, 1, 1)
x = Flatten()(x)
out = Dense(500, activation='softmax')(x)
out = Activation('sigmoid', name='loss')(out)
model2 = Model(inputs=model.input, outputs=[out])
model2.summary()
model2.save(checkpoint_custom_path)

model = create_custormized_inception_v4(nb_classes=500, load_weights=check_custom)
model.summary()
The problem is retraining this model on my image data set. I don't know the label for each image, so I have to retrain the model with unsupervised learning. The data set contains over 130K images.
How can I retrain the model using unsupervised learning?
I'm using a pre-trained ResNet-50 model and want to feed the outputs of the penultimate layer to an LSTM network. Here is my sample code containing only the CNN (ResNet-50):
from keras.applications import ResNet50
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

N = NUMBER_OF_CLASSES
# img_size = (224, 224, 3), same as that of ImageNet
base_model = ResNet50(include_top=False, weights='imagenet', pooling=None)
x = base_model.output
x = GlobalAveragePooling2D()(x)
predictions = Dense(1024, activation='relu')(x)
model = Model(inputs=base_model.input, outputs=predictions)
Next, I want to feed it to an LSTM network, as follows:
from keras.models import Sequential
from keras.layers import LSTM

final_model = Sequential()
final_model.add(model)
final_model.add(LSTM(64, return_sequences=True, stateful=True))
final_model.add(Dense(N, activation='softmax'))
But I'm confused about how to reshape the CNN output for the LSTM input. My original input to the CNN is (224, 224, 3).
Also, should I use TimeDistributed?
Any kind of help is appreciated.
Adding an LSTM after a CNN does not make a lot of sense, as an LSTM is mostly used for temporal/sequence information, whereas your data seems to be only spatial. However, if you still want to use it, just use
x = Reshape((1024, 1))(x)
This converts the output to a sequence of 1024 samples, with 1 feature each.
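In the context of the question's code, that looks roughly like this (a sketch; the LSTM width and softmax head are assumptions):
from keras.layers import Reshape, LSTM

x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
x = Reshape((1024, 1))(x)   # 1024 timesteps, 1 feature each
x = LSTM(64)(x)             # consume the sequence
predictions = Dense(N, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)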
If you are talking about spatio-temporal data, use TimeDistributed on the ResNet layers, and then you can use ConvLSTM2D.
Example of using a pretrained network with an LSTM:
from keras.applications import VGG16
from keras.layers import Input, TimeDistributed, Flatten, LSTM, Dense
from keras.models import Model

inputs = Input(shape=(config.N_FRAMES_IN_SEQUENCE, config.IMAGE_H, config.IMAGE_W, config.N_CHANNELS))
cnn = VGG16(include_top=False, weights='imagenet', input_shape=(config.IMAGE_H, config.IMAGE_W, config.N_CHANNELS))
x = TimeDistributed(cnn)(inputs)   # run the CNN on every frame in the sequence
x = TimeDistributed(Flatten())(x)  # flatten the per-frame feature maps
x = LSTM(256)(x)                   # aggregate frame features over time
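The snippet stops at the LSTM; a classifier head would typically follow. A sketch of that completion, where config.N_CLASSES and the Model wiring are assumptions since the original example omits them:
outputs = Dense(config.N_CLASSES, activation='softmax')(x)  # hypothetical class count
model = Model(inputs, outputs)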