I'm trying to add convolutional layers to the transfer learning code below, but I'm not sure how to proceed. I want to add one of the following:
conv, max-pooling, 3x3 filter, stride 3, ReLU activation, or
conv, max-pooling, 3x3 filter, stride 3, LReLU activation.
Is this possible, and if so, how?
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.models import Model

CLASSES = 2
# setup model
base_model = MobileNet(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D(name='avg_pool')(x)
x = Dropout(0.4)(x)
predictions = Dense(CLASSES, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)
# transfer learning: freeze the pretrained layers
for layer in base_model.layers:
    layer.trainable = False
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# Data augmentation
# data prep
# Transfer learning
from tensorflow.keras.callbacks import ModelCheckpoint
filepath="mobilenet/my_model.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]
EPOCHS = 1
BATCH_SIZE = 32
STEPS_PER_EPOCH = 5
VALIDATION_STEPS = 32
MODEL_FILE = 'mobilenet/filename.model'
# fit_generator is deprecated in TF 2.x; model.fit accepts generators directly
history = model.fit(
    train_generator,
    epochs=EPOCHS,
    steps_per_epoch=STEPS_PER_EPOCH,
    validation_data=validation_generator,
    validation_steps=VALIDATION_STEPS,
    callbacks=callbacks_list)
model.save(MODEL_FILE)
backup_model = model
model.summary()
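You can do it in several ways. One of them is to wrap the base model in a Sequential model:
model = Sequential([
    base_model,
    # the layers you would like to add on top of the base model;
    # conv and pooling layers need the 4D feature map, so they go before the global pooling
    Conv2D(...),
    MaxPooling2D(...),
    GlobalAveragePooling2D(name='avg_pool'),
    Dropout(0.4),
    ...
])
model.compile(...)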
I think this is what you are after:
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.models import Model

CLASSES = 2
new_filters = 256  # number of filters you want in the added convolutional layer
img_shape = (224, 224, 3)
base_model = tf.keras.applications.mobilenet.MobileNet(include_top=False, input_shape=img_shape, weights='imagenet', dropout=.4)
x = base_model.output
# added layer: 3x3 kernel, stride 3, ReLU activation
x = Conv2D(new_filters, 3, padding='same', strides=(3, 3), activation='relu', name='added')(x)
x = GlobalAveragePooling2D(name='avg_pool')(x)
x = Dropout(0.4)(x)
predictions = Dense(CLASSES, activation='softmax', name='output')(x)
model = Model(inputs=base_model.input, outputs=predictions)
model.summary()
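To see what the added layer does to the feature map, here is a small sketch of the output-size arithmetic. It assumes the base model's final feature map is 7x7 for a 224x224 input (MobileNet's overall stride is 32). `conv_output_size` is a helper name made up for this illustration, not a Keras function, and the max-pooling step is the one asked for in the question rather than part of the code above.

```python
import math

def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a conv or pooling layer along one dimension."""
    if padding == "same":
        # 'same' pads the input, so only the stride shrinks the feature map
        return math.ceil(size / stride)
    if padding == "valid":
        # 'valid' uses no padding; the window must fit entirely inside the input
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding: {padding}")

# Assumption: 224x224 input, include_top=False, overall stride 32
feature_map = 224 // 32
# 3x3 conv, stride 3, padding='same' (as in the 'added' layer)
after_conv = conv_output_size(feature_map, 3, 3, "same")
# hypothetical 3x3 max-pool, stride 3, padding='valid' placed after the conv
after_pool = conv_output_size(after_conv, 3, 3, "valid")
print(feature_map, after_conv, after_pool)
```

With so little spatial resolution left after the base model, a stride of 3 collapses the map very quickly, so a smaller stride may be worth trying.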
Related
Can anyone help me reduce val_loss and increase val_accuracy?
Below is the code of my Inception-V3 model. I added two dense layers with 1024 nodes each and ReLU activation, one GlobalAveragePooling2D layer, and one Dropout layer, followed by a final fully connected layer with softmax activation for classification.
To increase val_accuracy and reduce val_loss, do I need to remove any of the following layers?
pretrained_model = tf.keras.applications.InceptionV3(
    input_shape=(224, 224, 3),
    include_top=False,
    weights='imagenet'
)
pretrained_model.trainable = False
inputs = pretrained_model.input
x = GlobalAveragePooling2D()(pretrained_model.output)
x = tf.keras.layers.Dense(1024, activation='relu')(x)
x = tf.keras.layers.Dense(1024, activation='relu')(x)
x = Dropout(0.5)(x)
outputs = tf.keras.layers.Dense(5, activation='softmax')(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.compile(
    optimizer=Adam(learning_rate=0.0001),
    loss='categorical_crossentropy',
    metrics=['accuracy','AUC']
)
history = model.fit(
    train_images,
    validation_data=val_images,
    batch_size=64,
    epochs=20,
    callbacks=[
        tf.keras.callbacks.EarlyStopping(
            monitor='val_loss',
            patience=5,
            restore_best_weights=True
        )
    ]
)
The graphs below show the accuracy and loss obtained from the trained Inception-V3 model.
[Figure: accuracy]
[Figure: loss]
Problem: I have S sequences of T timesteps each, and each timestep contains F features, so collectively the dataset has shape (S x T x F). Each sequence s in S is described by 2 values (Target_1 and Target_2).
Goal: Train an LSTM-based architecture to learn a function approximator M that, given a sequence s, predicts Target_1 and Target_2.
Something like this:
M(s) ~ (Target_1, Target_2)
I'm really struggling to find a way. Below is a Keras implementation of an example that probably does not work. I made 2 models, one for the first target value and one for the second.
model1 = Sequential()
model1.add(Masking(mask_value=-10.0))
model1.add(LSTM(1, input_shape=(batch, timesteps, features), return_sequences = True))
model1.add(Flatten())
model1.add(Dense(hidden_units, activation = "relu"))
model1.add(Dense(1, activation = "linear"))
model1.compile(loss='mse', optimizer=Adam(learning_rate=0.0001))
model1.fit(x_train, y_train[:,0], validation_data=(x_test, y_test[:,0]), epochs=epochs, batch_size=batch, shuffle=False)
model2 = Sequential()
model2.add(Masking(mask_value=-10.0))
model2.add(LSTM(1, input_shape=(batch, timesteps, features), return_sequences=True))
model2.add(Flatten())
model2.add(Dense(hidden_units, activation = "relu"))
model2.add(Dense(1, activation = "linear"))
model2.compile(loss='mse', optimizer=Adam(learning_rate=0.0001))
model2.fit(x_train, y_train[:,1], validation_data=(x_test, y_test[:,1]), epochs=epochs, batch_size=batch, shuffle=False)
I want to make good use of the LSTM's time-dependent memory in order to achieve a good regression.
IIUC, you can start off with a simple (naive) approach by using two output layers:
import tensorflow as tf
timesteps, features = 20, 5
inputs = tf.keras.layers.Input((timesteps, features))
x = tf.keras.layers.Masking(mask_value=-10.0)(inputs)
x = tf.keras.layers.LSTM(32, return_sequences=False)(x)
x = tf.keras.layers.Dense(32, activation = "relu")(x)
output1 = tf.keras.layers.Dense(1, activation = "linear", name='output1')(x)
output2 = tf.keras.layers.Dense(1, activation = "linear", name='output2')(x)
model = tf.keras.Model(inputs, [output1, output2])
model.compile(loss='mse', optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001))
x_train = tf.random.normal((500, timesteps, features))
y_train = tf.random.normal((500, 2))
model.fit(x_train, [y_train[:,0],y_train[:,1]] , epochs=5, batch_size=32, shuffle=False)
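As a rough sketch of what a single 'mse' loss means for a model with two outputs: Keras applies the loss to each output and trains on the sum, with each term weighted by 1 unless `loss_weights` is passed to `compile`. The plain-Python functions below (`mse`, `total_loss`) and the numbers are made up for illustration and only mimic that arithmetic.

```python
def mse(y_true, y_pred):
    """Mean squared error over one output."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def total_loss(targets, predictions, loss_weights=None):
    """Weighted sum of the per-output losses (weights default to 1.0)."""
    weights = loss_weights or [1.0] * len(targets)
    return sum(w * mse(t, p) for w, t, p in zip(weights, targets, predictions))

# Made-up values for two samples and two outputs
target_1, target_2 = [1.0, 2.0], [0.0, 4.0]
pred_1, pred_2 = [1.5, 2.5], [0.0, 2.0]

unweighted = total_loss([target_1, target_2], [pred_1, pred_2])
reweighted = total_loss([target_1, target_2], [pred_1, pred_2], loss_weights=[1.0, 0.5])
print(unweighted, reweighted)
```

If Target_1 and Target_2 live on very different scales, the larger one dominates this sum, so scaling the targets or setting `loss_weights` may help.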
My goal is to first train using only ResNet152 and then save the learned weights. Then I want to use these weights as a base for a more complex model with added layers, on which I ultimately want to do hyperparameter tuning. The reason for this approach is that doing it all at once takes a very long time. The problem is that my code doesn't seem to work. I don't get an error message, but when I start training the more complex model it seems to start from 0 again instead of using the learned ResNet152 weights.
Here is the code:
First I use only ResNet152 and the output layer:
input_tensor = Input(shape=train_generator.image_shape)
base_model = applications.ResNet152(weights='imagenet', include_top=False, input_tensor=input_tensor)
for layer in base_model.layers[:]:
    layer.trainable = True
x = Flatten()(base_model.output)
predictions = Dense(num_classes, activation= 'softmax')(x)
model = Model(inputs = base_model.input, outputs = predictions)
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=opt,
    metrics=['accuracy'])
model.fit(
    train_generator,
    validation_data=valid_generator,
    epochs=epochs,
    steps_per_epoch=len_train // batch_size,
    validation_steps=len_val // batch_size,
    callbacks=[earlyStopping, reduce_lr]
)
Then I save the weights:
model.save_weights('/content/drive/MyDrive/MODELS_SAVED/model_RESNET152/model_weights.h5')
Adding more layers.
input_tensor = Input(shape=train_generator.image_shape)
base_model = applications.ResNet152(weights='imagenet', include_top=False, input_tensor=input_tensor)
for layer in base_model.layers[:]:
    layer.trainable = False
x = Flatten()(base_model.output)
x = Dense(1024, kernel_regularizer=tf.keras.regularizers.L2(l2=0.01),
          kernel_initializer=tf.keras.initializers.HeNormal(),
          kernel_constraint=tf.keras.constraints.UnitNorm(axis=0))(x)
x = LeakyReLU()(x)
x = BatchNormalization()(x)
x = Dropout(rate=0.1)(x)
x = Dense(512, kernel_regularizer=tf.keras.regularizers.L2(l2=0.01),
          kernel_initializer=tf.keras.initializers.HeNormal(),
          kernel_constraint=tf.keras.constraints.UnitNorm(axis=0))(x)
x = LeakyReLU()(x)
x = BatchNormalization()(x)
predictions = Dense(num_classes, activation= 'softmax')(x)
model = Model(inputs = base_model.input, outputs = predictions)
Loading the weights after the added layers and using by_name=True, both according to the keras tutorial.
model.load_weights('/content/drive/MyDrive/MODELS_SAVED/model_RESNET152/model_weights.h5', by_name=True)
Then I start training again.
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=opt,
    metrics=['accuracy']
)
model.fit(
    train_generator,
    validation_data=valid_generator,
    epochs=epochs,
    steps_per_epoch=len_train // batch_size,
    validation_steps=len_val // batch_size,
    callbacks=[earlyStopping, reduce_lr]
)
But it starts at a very low accuracy, basically from 0 again, so I'm guessing something is wrong here. Any ideas on how to fix this?
When you use Adam and save only the model weights, you have to save and load the optimizer weights as well:
import pickle

# save the optimizer state next to the model weights
weight_values = model.optimizer.get_weights()
with open(output_path+'optimizer.pkl', 'wb') as f:
    pickle.dump(weight_values, f)

# before restoring, run one fit step on dummy data so that the optimizer variables exist
dummy_input = tf.random.uniform(inp_shape) # create a tensor of input shape
dummy_label = tf.random.uniform(label_shape) # create a tensor of label shape
hist = model.fit(dummy_input, dummy_label)
with open(path_to_saved_model+'optimizer.pkl', 'rb') as f:
    weight_values = pickle.load(f)
model.optimizer.set_weights(weight_values)
I'll start with the code and then ask my question.
model1_input= keras.Input(shape=(5,10))
x = layers.Dense(16, activation='relu')(model1_input)
model1_output = layers.Dense(4)(x)
model1= keras.Model(model1_input, model1_output, name='model1')
model1.summary()
# ----
model2_input= keras.Input(shape=(5,10))
y = layers.Dense(16, activation='relu')(model2_input)
model2_output = layers.Dense(4)(y)
model2= keras.Model(model2_input, model2_output, name='model2')
model2.summary()
# ----
model3_input= keras.Input(shape=(5, 10))
layer1 = model1(model3_input)
layer2 = model2(layer1)
model3_output = layers.Dense(1)(layer2)
model3= keras.Model(model3_input, model3_output , name='model3')
model3.summary()
model3.compile(loss='mse', optimizer='adam')
model3.fit(inputs, outputs, epochs=10, batch_size=32)
When I execute this code, what happens to the weights of model1 and model2? Do they stay untrained?
I would like to use the predictions of a trained model1 and a trained model2 to train model3. Can I write something like this?
model1_input= keras.Input(shape=(5,10))
x = layers.Dense(16, activation='relu')(model1_input)
model1_output = layers.Dense(4)(x)
model1= keras.Model(model1_input, model1_output, name='model1')
model1.summary()
model1.compile(loss='mse', optimizer='adam')
model1.fit(model1_inputs, model1_outputs, epochs=10, batch_size=32)
# ----
model2_input= keras.Input(shape=(5,10))
y = layers.Dense(16, activation='relu')(model2_input)
model2_output = layers.Dense(4)(y)
model2= keras.Model(model2_input, model2_output, name='model2')
model2.summary()
model2.compile(loss='mse', optimizer='adam')
model2.fit(model2_inputs, model2_outputs, epochs=10, batch_size=32)
# ----
model3_input= keras.Input(shape=(5, 10))
layer1 = model1(model3_input)
layer2 = model2(layer1)
model3_output = layers.Dense(1)(layer2)
model3= keras.Model(model3_input, model3_output , name='model3')
model3.summary()
model3.compile(loss='mse', optimizer='adam')
model3.fit(inputs, outputs, epochs=10, batch_size=32)
I'm afraid when I train model3 this will change the already trained weights of models 1 and 2. In this case, what will happen with models 1 and 2 weights?
I am not sure whether Keras works that way, but even if it does, training model3 will still change the weights of model1 and model2 as long as their layers are trainable. Try freezing the layers. These links might help you: 1, 2.
Another option will be to branch the layers like this.
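The freezing suggestion can be sketched as follows. This is a minimal illustration under a few assumptions, not the asker's exact setup: `small_model` is a helper name invented here, the training data is random, and model2 is given a (5, 4) input so that it can actually consume model1's output.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def small_model(name, n_features):
    """Two Dense layers, mirroring model1/model2 from the question."""
    inp = keras.Input(shape=(5, n_features))
    x = layers.Dense(16, activation="relu")(inp)
    out = layers.Dense(4)(x)
    return keras.Model(inp, out, name=name)

model1 = small_model("model1", 10)
# Assumption: model2 takes model1's 4 output features as its input
model2 = small_model("model2", 4)

# (model1 and model2 would be trained separately here)

# Freeze both sub-models before building and compiling model3
model1.trainable = False
model2.trainable = False

model3_input = keras.Input(shape=(5, 10))
model3_output = layers.Dense(1)(model2(model1(model3_input)))
model3 = keras.Model(model3_input, model3_output, name="model3")
model3.compile(loss="mse", optimizer="adam")

# Random stand-in data, only to check that the frozen weights do not move
x = np.random.rand(8, 5, 10).astype("float32")
y = np.random.rand(8, 5, 1).astype("float32")
before = [w.copy() for w in model1.get_weights()]
model3.fit(x, y, epochs=1, batch_size=4, verbose=0)
after = model1.get_weights()

frozen_unchanged = all(np.array_equal(b, a) for b, a in zip(before, after))
print(len(model3.trainable_weights), frozen_unchanged)
```

Only the kernel and bias of the final Dense layer remain trainable in model3. Setting `trainable` back to True later requires compiling model3 again for the change to take effect.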
I am trying to learn how VGG16 works. Below is my code, which uses VGG16 for a different classification task.
# Generate the convolutional part of the model (include_top=False drops the top layers)
model_vgg16_conv = VGG16(weights='imagenet', include_top=False)
model_vgg16_conv.summary()
# create your own input format
input = Input(shape=(128,128,3),name = 'image_input')
# Use the generated model
output_vgg16_conv = model_vgg16_conv(input)
# Add the fully-connected layers
x = Flatten(name='flatten')(output_vgg16_conv)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(5, activation='softmax', name='predictions')(x)
#Create your own model
model = Model(input=input, output=x)
#In the summary, weights and layers from VGG part will be hidden, but they will be fit during the training
model.summary()
# Specify an optimizer to use
adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
# Choose loss function, optimization method, and metrics (which results to display)
model.compile(
    optimizer = adam,
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
model.fit(X_train,y_train,epochs=10,batch_size=10,verbose=2)
# model.fit(X_train,y_train,epochs=30,batch_size=100,verbose=2)
result = model.predict(y_test) # same result
For some reason, using different numbers of epochs and different batch sizes generates exactly the same result. Am I doing something wrong?