I am using progressive learning for making a model for classification. I first used vgg16 weights and add a Dense layer at the end to train the model.
prior = keras.applications.VGG16(
include_top=False,
weights='imagenet',
input_shape=(48, 48, 3)
)
model = Sequential()
model.add(prior)
model.add(Flatten())
model.add(Dense(256, activation='relu', name='Dense_Intermediate'))
model.add(Dropout(0.1, name='Dropout_Regularization'))
model.add(Dense(2, activation='sigmoid', name='Output'))
for cnn_block_layer in model.layers[0].layers:
cnn_block_layer.trainable = False
model.layers[0].trainable = False
I froze the vgg layers so their weights do not change. After training, I saved the weights.
model.save('model65x65.h5')
I increased the dimension of the images using ImageDataGenerator and loaded previously saved model weights and I defined a new model
model = Sequential()
model.add(Conv2D(64, kernel_size=(3, 3), input_shape=(96, 96, 3), activation='relu', padding='same'))
model.add(Conv2D(64, kernel_size=(3, 3), input_shape=(96, 96, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
prior = load_model('./model65x65.h5')
Now I wanted to add all but the first two layers of VGG16 to the new model, strip the input layer out and also strip out the first convolutional layer using
for layer in prior.layers[0].layers[2:]:
model.add(layer)
# re-add the feedforward layers on top
for layer in prior.layers[1:]:
model.add(layer)
# the pretrained CNN layers are already marked non-trainable
# mark off the top layers as well
for layer in prior.layers[-4:]:
layer.trainable = False
As there are convolutional layers in the VGG16, I keep getting this error when I try to iterate over the prior model.
AttributeError: 'Conv2D' object has no attribute 'layers'
How can I iterate over the Conv2D layers?
Related
import tensorflow_datasets as tfds
import tensorflow as tf
# Load the Oxford-IIIT Pet Dataset
dataset, dataset_info = tfds.load('oxford_iiit_pet', with_info=True, as_supervised=True)
train_dataset, test_dataset = dataset['train'], dataset['test']
# Load the VGG16 model
vgg16 = tf.keras.applications.VGG16(weights='imagenet', include_top=False, input_shape=(224,224,3))
# Freeze the weights of the model
vgg16.trainable = False
# Build the U-Net model using the encoder part of the VGG16 model
inputs = tf.keras.layers.Input(shape=(224,224,3))
x = vgg16(inputs)
# Add the U-Net decoder part
x = tf.keras.layers.Conv2D(512, (3,3), activation='relu', padding='same')(x)
x = tf.keras.layers.Conv2D(512, (3,3), activation='relu', padding='same')(x)
x = tf.keras.layers.Conv2D(512, (3,3), activation='relu', padding='same')(x)
x = tf.keras.layers.Conv2D(256, (3,3), activation='relu', padding='same')(x)
x = tf.keras.layers.Conv2D(128, (3,3), activation='relu', padding='same')(x)
x = tf.keras.layers.Conv2D(64, (3,3), activation='relu', padding='same')(x)
x = tf.keras.layers.Conv2D(32, (3,3), activation='relu', padding='same')(x)
# Add the final layer for binary segmentation
outputs = tf.keras.layers.Conv2D(1, (1,1), activation='sigmoid')(x)
# Compile the model
model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
This is what I need to do:
You are asked to build a solution to a segmentation problem while using Learning Transfer
The Oxford-IIIT Pet Dataset: array data
(https://www.tensorflow.org/datasets/catalog/oxford_iiit_pet)
Steps of the process:
Reading the dataset
Definition of a Learning Transfer process that includes the steps we discussed in the lecture, and in particular:
Selecting a convolution network, selecting a number of layers in the selected network, etc.
Use the VGG network due to its simplicity network due to its size.
Converting a network from the previous section into a semantic segmentation network using the UNET approach
Defining a LOSS function of your choice.
Network training.
I have a CNN-LSTM that looks as follows;
SEQUENCE_LENGTH = 32
BATCH_SIZE = 32
EPOCHS = 30
n_filters = 64
n_kernel = 1
n_subsequences = 4
n_steps = 8
def DNN_Model(X_train):
model = Sequential()
model.add(TimeDistributed(
Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu', input_shape=(n_subsequences, n_steps, X_train.shape[3]))))
model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mse', optimizer='adam')
return model
I'm using this CNN-LSTM for a multivariate time series forecasting problem. the CNN-LSTM input data comes in the 4D format: [samples, subsequences, timesteps, features]. For some reason, I need TimeDistributed Layers; or I get errors like ValueError: Input 0 of layer conv1d is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 4, 8, 35]. I think this has to do with the fact that Conv1D is officially not meant for time series, so to preserve time-series data shape we need to use a wrapper layer like TimeDistributed. I don't really mind using TimeDistributed layers - They're wrappers and if they make my model work I am happy. However, when I try to visualize my model with
file = 'CNN_LSTM_Visualization.png'
tf.keras.utils.plot_model(model, to_file=file, show_layer_names=False, show_shapes=False)
The resulting visualization only shows the Sequential():
I suspect this has to do with the TimeDistributed layers and the model not being built yet. I cannot call model.summary() either - it throws ValueError: This model has not yet been built. Build the model first by calling build()or callingfit()with some data, or specify aninput_shape argument in the first layer(s) for automatic build Which is strange because I have specified the input_shape, albeit in the Conv1D layer and not in the TimeDistributed wrapper.
I would like a working model together with a working tf.keras.utils.plot_model function. Any explanation as to why I need TimeDistributed and why it makes the plot_model function behave weirdly would be greatly awesome.
An alternative to using an Input layer is to simply pass the input_shape to the TimeDistributed wrapper, and not the Conv1D layer:
def DNN_Model(X_train):
model = Sequential()
model.add(TimeDistributed(
Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu'), input_shape=(n_subsequences, n_steps, X_train.shape[3])))
model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mse', optimizer='adam')
return model
Add your input layer at the beginning. Try this
def DNN_Model(X_train):
model = Sequential()
model.add(InputLayer(input_shape=(n_subsequences, n_steps, X_train)))
model.add(TimeDistributed(
Conv1D(filters=n_filters, kernel_size=n_kernel,
activation='relu')))
model.add(TimeDistributed(Conv1D(filters=n_filters,
kernel_size=n_kernel, activation='relu')))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
....
Now, you can plot and get a summary properly.
DNN_Model(3).summary() # OK
tf.keras.utils.plot_model(DNN_Model(3)) # OK
I have an alexnet neural network that I wrote it from scratch using tensorflow and I used 6000 images as train_data.
but while training, the validation accuracy is not changing and it is greater than training accuracy, I guess it is overfitting. In addition, validation loss is increasing.
Is it possible to solve overfitting problem with 1000 data for saving time?
How can I prevent overfitting?
I attached my alexnet code below.
Thanks
def CreateModel():
model = Sequential()
# 1st Convolutional Layer
model.add(Conv2D(filters=96, input_shape=(227,227,3), kernel_size=(11,11), strides=(4,4), padding='valid'))
model.add(Activation('relu'))
# Max Pooling
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='valid'))
# 2nd Convolutional Layer
model.add(Conv2D(filters=256, kernel_size=(11,11), strides=(1,1), padding='valid'))
model.add(Activation('relu'))
# Max Pooling
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='valid'))
# 3rd Convolutional Layer
model.add(Conv2D(filters=384, kernel_size=(3,3), strides=(1,1), padding='valid'))
model.add(Activation('relu'))
# 4th Convolutional Layer
model.add(Conv2D(filters=384, kernel_size=(3,3), strides=(1,1), padding='valid'))
model.add(Activation('relu'))
# 5th Convolutional Layer
model.add(Conv2D(filters=256, kernel_size=(3,3), strides=(1,1), padding='valid'))
model.add(Activation('relu'))
# Max Pooling
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='valid'))
# Passing it to a Fully Connected layer
model.add(Flatten())
# 1st Fully Connected Layer
model.add(Dense(4096, input_shape=(224*224*3,)))
model.add(Activation('relu'))
# Add Dropout to prevent overfitting
model.add(Dropout(0.5))
# 2nd Fully Connected Layer
model.add(Dense(4096))
model.add(Activation('relu'))
# Add Dropout
model.add(Dropout(0.5))
# 3rd Fully Connected Layer
model.add(Dense(1000))
model.add(Activation('relu'))
# Add Dropout
model.add(Dropout(0.5))
# Output Layer
model.add(Dense(2))
model.add(Activation('softmax'))
model.summary()
return model
alexNet_model = CreateModel()
alexNet_model.compile(loss='sparse_categorical_crossentropy' , optimizer='adam', metrics=["accuracy"])
batch_size = 4
epochs = 5
history = alexNet_model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1,
validation_data=(x_validation, y_validation))
To mitigate overfitting. You can try to implement below steps
1.Shuffle the Data, by using shuffle=True in alexNet_model.fit. Code is shown below:
history = alexNet_model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1,
validation_data=(x_validation, y_validation), shuffle = True)
2.Use Early Stopping. Code is shown below
callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=15)
3.Use Regularization. Code for Regularization is shown below (You can try l1 Regularization or l1_l2 Regularization as well):
from tensorflow.keras.regularizers import l2
Regularizer = l2(0.001)
alexNet_model.add(Conv2D(96,11, 11, input_shape = (227,227,3),strides=(4,4), padding='valid', activation='relu', data_format='channels_last',
activity_regularizer=Regularizer, kernel_regularizer=Regularizer))
alexNet_model.add(Dense(units = 2, activation = 'sigmoid',
activity_regularizer=Regularizer, kernel_regularizer=Regularizer))
4.You can try using BatchNormalization.
5.Perform Image Data Augmentation using ImageDataGenerator. Refer this link for more info about that.
6.If the Pixels are not Normalized, Dividing the Pixel Values with 255 also helps
I'm getting a list of images to train my CNN.
model = Sequential()
model.add(Dense(32, activation='tanh', input_dim=100))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
data, labels = ReadImages(TRAIN_DIR)
# Train the model, iterating on the data in batches of 32 samples
model.fit(np.array(data), np.array(labels), epochs=10, batch_size=32)
But I faced this error:
'with shape ' + str(data_shape))
ValueError: Error when checking input: expected dense_1_input to have 2 dimensions, but got array with shape (391, 605, 700, 3)
You are feeding images to the Dense Layer. Either flatten the images using .flatten() or use a model with CNN Layers. The shape (391,605,700,3) means you have 391 images of size 605x700 having 3 dimensions(rgb).
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(605, 700, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
This link has good explanations for basic CNN.
This is no CNN. A Convolutional Neural Network is defined by having Conv Layer. Those Layers work with imput shapes in 4D (Batchsize, ImageDimX, ImageDimY, ColorChannels). The Dense Layers(aka. Fully connected) you are using 2D input (Batchsize, DataAsAVector)
You need to first flatten the image if you want to pass the image directly to dense layers as Dense layer takes input in 2 dimensions only and since you are passing whole image there is 4 dimensions in it i.e. Number of images X Height X Width X Number of channels (391, 605, 700, 3).
You are not actually doing any convolutions on the image. To do convolutions you need to add CNN layers after initialising the model as sequential.
To add dense layer :
model = Sequential()
model.add(Flatten())
model.add(Dense(32, activation='tanh', input_dim=100))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
To add CNN layer and then flatten it :
model = Sequential()
model.add(Conv2D(input_shape=(605,700,3), filters=64, kernel_size=(3,3),
padding="same",activation="relu"))
model.add(Flatten())
model.add(Dense(32, activation='tanh', input_dim=100))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
I have a CNN and like to change this to a LSTM, but when I modified my code I receive the same error: ValueError: Input 0 is incompatible with layer gru_1: expected ndim=3, found ndim=4
I already change ndim but didn't work.
follow my cnn
def build_model(X,Y,nb_classes):
nb_filters = 32 # number of convolutional filters to use
pool_size = (2, 2) # size of pooling area for max pooling
kernel_size = (3, 3) # convolution kernel size
nb_layers = 4
input_shape = (1, X.shape[2], X.shape[3])
model = Sequential()
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
border_mode='valid', input_shape=input_shape))
model.add(BatchNormalization(axis=1))
model.add(Activation('relu'))
for layer in range(nb_layers-1):
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(BatchNormalization(axis=1))
model.add(ELU(alpha=1.0))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation("softmax"))
return model
and follow how i like to did my LSTM
data_dim = 41
timesteps = 20
num_classes = 10
model = Sequential()
model.add(LSTM(256, return_sequences=True, input_shape=(timesteps, data_dim)))
model.add(Dropout(0.5))
model.add(LSTM(128, return_sequences=True, input_shape=(timesteps, data_dim)))
model.add(Dropout(0.25))
model.add(LSTM(64))
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))
What I was doing wrong?
Thanks
The LSTM code is fine, it executes with no errors for me.
The error you are seeing is related to internal incompatibility of the tensors within the model itself, not related to training data, in which case you'll get an "Exception: Invalid input shape"
What's confusing in your error is that it refers to a GRU layer, which isn't contained anywhere in your model definition. If your model only contains LSTM, you should get an error that calls out the LSTM layer that it conflicts with.
Perhaps check
model.get_config()
and make sure all the layers and configs are what you intended.
In particular, the first layer should say this:
batch_input_shape': (None, 20, 41)