I'm following this tutorial here.
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(32, 32, 3)))
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(10, activation='softmax'))
I am trying to understand the given code, which uses the CIFAR-10 dataset.
Why is he using kernel_initializer='he_uniform'?
Why did he choose 128 for the dense layer?
What will happen if we add more dense layers to the code, like:
model.add(Dense(512, activation='relu', kernel_initializer='he_uniform'))
Is there any way to increase the accuracy of the model?
What would be a suitable dropout rate?
Why is he using kernel_initializer='he_uniform'?
The weights in a layer of a neural network are initialized randomly. How, though? Which distribution should they follow? he_uniform is one strategy for initializing those weights: it samples from a uniform distribution whose limits are scaled by the layer's number of inputs (fan-in), which makes it well suited to ReLU activations (He et al., 2015).
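As a quick illustration (a minimal sketch with tf.keras in TF 2.x; the shape below is just an example), he_uniform samples from U(-limit, limit) with limit = sqrt(6 / fan_in):
from tensorflow.keras import initializers

# the fan-in scaling keeps the variance of ReLU activations roughly constant across layers
init = initializers.HeUniform(seed=42)
weights = init(shape=(27, 32))  # e.g. fan_in = 3*3*3 = 27 for the first conv layer above
print(weights.numpy().min(), weights.numpy().max())  # bounded by ±sqrt(6/27) ≈ ±0.47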
Why did he choose 128 for the dense layer?
This was chosen arbitrarily; the width of that layer is a hyperparameter you would normally tune by experiment.
What will happen if we add more dense layers to the code, like:
model.add(Dense(512, activation='relu', kernel_initializer='he_uniform'))
I assume you mean to add it where the other 128-neuron Dense layer is (there it won't break the model). The model will become deeper and have a much higher number of parameters (i.e. your model will become more complex), with whatever positives or negatives come along with that.
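For illustration, a sketch of where it would slot into the model from the question:
model.add(Flatten())                      # 4*4*128 = 2048 features at this point
model.add(Dense(512, activation='relu', kernel_initializer='he_uniform'))  # the new layer: 2048*512 + 512 parameters
model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(10, activation='softmax'))
model.summary()  # compare the total parameter count with and without the extra layer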
What would be a suitable dropout rate?
Usually you see rates in the range [0.2, 0.5]. Higher rates reduce overfitting more aggressively but can make training less stable.
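A sketch of how that could look in the tutorial's model (the placement and the exact rates are my own choice, not from the tutorial; assumes Dropout is imported from keras.layers):
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.2))   # lighter dropout after the conv blocks
...
model.add(Flatten())
model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
model.add(Dropout(0.5))   # heavier dropout just before the output layer
model.add(Dense(10, activation='softmax'))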
Related
I am using keras to add layers, for example:
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', padding="same", input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu', padding="same", input_shape=(16, 16, 32)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu', padding="same"))
model.add(BatchNormalization())
model.add(layers.Dropout(0.5))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(BatchNormalization())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(10))
Now I am implementing LRN (local response normalization). However, as far as I know, Keras does not provide an LRN layer. The lower-level tf.nn module does have one, called tf.nn.local_response_normalization.
Is it possible to mix tf.nn with keras?
Yes, tf.nn.local_response_normalization can be used in a Lambda layer. See the code below:
...
model.add(BatchNormalization())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(10))
model.add(layers.Lambda(tf.nn.local_response_normalization))
...
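Note that the snippet above only shows the mechanics, with the Lambda sitting after the final Dense layer. In architectures such as AlexNet, LRN is applied after a convolutional layer, so a more typical placement would be something like this (a sketch; assumes import tensorflow as tf alongside the Keras imports from the question):
model.add(layers.Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(layers.Lambda(tf.nn.local_response_normalization))  # LRN right after the conv layer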
I have a classifier model that I trained using the 'theano' backend. The model works properly and I got the expected classification performance. The tensor size is Nx3x28x112. However, I would like to use the same classifier in another file (main_file.py), which contains a GANs implementation (with the 'tensorflow' backend). So I want to reuse the classifier in main_file.py and change the input tensor size to Nx28x112x3 (the proper ordering for the tensorflow backend). When training starts, the performance is nowhere near what I got with 'theano'; it is close to random. My model looks like:
def createModel():
model = Sequential()
# The first two layers with 28 filters of window size 3x3
model.add(Conv2D(28, (3, 3), padding='same', activation='relu', input_shape=(28, 112, 3)))
# or input_shape = (3, 28, 112) in case of theano backend
model.add(Conv2D(28, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(nClasses, activation='softmax'))
return model
What should I do to make the model perform properly? Is there any fundamental difference when the backend changes, apart from the ordering of the input tensor dimensions?
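As a starting point on the data side, the reordering itself is a single transpose (x_theano is a placeholder name for your existing Nx3x28x112 array):
import numpy as np

# from Theano's channels-first layout (N, 3, 28, 112)
# to TensorFlow's channels-last layout (N, 28, 112, 3)
x_tf = np.transpose(x_theano, (0, 2, 3, 1))
Bear in mind that the saved convolution kernels also follow the backend's conventions, so transposing the inputs alone may not be enough; if I remember correctly, older Keras versions ship a convert_all_kernels_in_model utility for exactly this case.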
I'm trying to load the AlexNet weights from 'alexnet_weights.h5' into the model built by the code below, and I get an error saying the file matches an 11-layer model while my model has only 8 layers.
# Instantiate an empty model
model = Sequential()
# 1st Convolutional Layer
model.add(Conv2D(filters=96, input_shape=(227, 227, 3), kernel_size=(11, 11), strides=(4, 4), padding='valid'))
model.add(Activation('relu'))
# Max Pooling
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='valid'))
# 2nd Convolutional Layer
model.add(Conv2D(filters=256, kernel_size=(5, 5), strides=(1, 1), padding='same'))
model.add(Activation('relu'))
# Max Pooling
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='valid'))
# 3rd Convolutional Layer
model.add(Conv2D(filters=384, kernel_size=(3, 3), strides=(1, 1), padding='same'))
model.add(Activation('relu'))
# 4th Convolutional Layer
model.add(Conv2D(filters=384, kernel_size=(3, 3), strides=(1, 1), padding='same'))
model.add(Activation('relu'))
# 5th Convolutional Layer
model.add(Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), padding='valid'))
model.add(Activation('relu'))
# Max Pooling
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'))
# Passing it to a Fully Connected layer
model.add(Flatten())
# 1st Fully Connected Layer
model.add(Dense(4096, input_shape=(227 * 227 * 3,)))
model.add(Activation('relu'))
# Add Dropout to prevent overfitting
model.add(Dropout(0.4))
# 2nd Fully Connected Layer
model.add(Dense(4096))
model.add(Activation('relu'))
# Add Dropout
model.add(Dropout(0.4))
# 3rd Fully Connected Layer
model.add(Dense(1000))
model.add(Activation('relu'))
# Add Dropout
model.add(Dropout(0.4))
# # Output Layer
# model.add(Dense(17))
# model.add(Activation('softmax'))
model.summary()
model.load_weights(params["weights_path"])
model.summary()
The error:
ValueError: You are trying to load a weight file containing 11 layers into a model with 8 layers.
The file is supposed to match the AlexNet from convnets-keras (from here: https://github.com/heuritech/convnets-keras/blob/master/convnetskeras/convnets.py), which also seems to have 8 layers (5 conv, 3 dense, since pooling layers don't have any parameters).
Any idea what the problem is?
Thanks
I guess it's the layer structure rather than the weights themselves. The file you posted splits convolutions 2, 4 and 5 into two convolutions each and then concatenates them along the feature axis. While this is in principle the same as doing one larger convolution, it is probably coded this way for legacy reasons (the original AlexNet was split across two GPUs). When you then load the weights, the kernels don't fit and your loading function blows up.
You could just use their code to instantiate the AlexNet, or you can add those splits to your own code.
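If you go the first route, the repository's README uses something along these lines (a sketch from memory; check the repo for the exact signature):
from convnetskeras.convnets import convnet

# builds the split-convolution AlexNet and loads the matching weights
model = convnet('alexnet', weights_path='alexnet_weights.h5', heatmap=False)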
Before diving into the code, here are my laptop specs:
Windows 10 Pro
GPU: GTX 970M (3 GB VRAM)
CPU: i7 6700HQ with 16 GB RAM (if it matters at all)
Software versions:
Python 3.6.2
Tensorflow 1.5.0
Keras 2.1.3
The paper I am trying to replicate - https://www.sciencedirect.com/science/article/pii/S1877050916311929
My code -
from keras.models import Sequential
from keras.layers import Conv2D, MaxPool2D, Dense
from keras.layers import BatchNormalization, Dropout, Flatten
from keras.preprocessing.image import ImageDataGenerator
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(3, 3),
activation='relu', input_shape=(768, 768, 3)))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(3, 3), strides=(2, 2)))
model.add(Conv2D(filters=32, kernel_size=(3, 3),
activation='relu'))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(3, 3), strides=(2, 2)))
model.add(Conv2D(filters=64, kernel_size=(3, 3),
activation='relu'))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(3, 3), strides=(2, 2)))
model.add(Conv2D(filters=64, kernel_size=(3, 3),
activation='relu'))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(3, 3), strides=(2, 2)))
model.add(Conv2D(filters=128, kernel_size=(3, 3),
activation='relu'))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(3, 3), strides=(2, 2)))
model.add(Conv2D(filters=128, kernel_size=(3, 3),
activation='relu'))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(3, 3), strides=(2, 2)))
model.add(Conv2D(filters=256, kernel_size=(3, 3),
activation='relu'))
model.add(Conv2D(filters=256, kernel_size=(3, 3),
activation='relu'))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(3, 3), strides=(2, 2)))
model.add(Dropout(rate=0.5))
model.add(Flatten())
model.add(Dense(units=1024, activation='relu'))
model.add(Dropout(rate=0.5))
model.add(Dense(units=1024, activation='relu'))
model.add(Dense(units=5, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy',
metrics=['accuracy'])
batch_size = 8
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory('D:/Downloads/EyePACS_crop/',target_size=(768, 768),
batch_size=batch_size)
model.fit_generator(train_generator,steps_per_epoch=35126//batch_size,
epochs=50)
model.save_weights('npdr.h5')
I cannot understand why: even with 3 GB of VRAM (TensorFlow reports 2.47 GB free) I cannot use a batch size of even 8 images without getting a ResourceExhaustedError. The complete output of the program (showing each chunk allocation; I suspect a severe memory leak) is here on pastebin: https://pastebin.com/dRx54brC.
If someone could help me with this, I would be grateful. If it's a bug in Keras/TensorFlow, I can switch to another framework immediately.
Thanks
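For context, a rough back-of-the-envelope estimate of just the first layer's forward activations at this resolution (my own arithmetic, not from the question):
# first Conv2D, 'valid' padding, 3x3 kernel: output is 766 x 766 x 32
h = w = 768 - 2
channels = 32
bytes_per_image = h * w * channels * 4     # float32
print(bytes_per_image / 2**20)             # ~71.6 MiB per image
print(8 * bytes_per_image / 2**20)         # ~573 MiB for a batch of 8, one layer only
Backpropagation additionally keeps the activations of every layer plus gradients and optimizer state, so 3 GB fills up very quickly at 768x768; reducing target_size or the batch size is the usual remedy.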
Hello, I am a newcomer to Keras. I chose Keras to implement this paper: http://mmlab.ie.cuhk.edu.hk/projects/TCDCN.html. I just changed the input size to 48x48; for the output I only need the 68 landmark coordinates (136 values). Here is my network:
def mtfl40New(size):
model = Sequential()
model.add(Conv2D(16, (5, 5), padding='valid', input_shape=(3, size, size)))
model.add(Activation('tanh'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(48, (3, 3), padding='valid'))
model.add(Activation('tanh'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), padding='valid'))
model.add(Activation('tanh'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (2, 2), padding='valid'))
model.add(Activation('tanh'))
model.add(Flatten())
model.summary()
#model.count_params()
model.add(Dense(100, kernel_initializer="normal", input_shape=(576,)))
model.add(Activation('tanh'))
model.add(Dense(136, kernel_initializer="normal"))
model.add(Activation('tanh'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
return model
However, I get this error:
Can anyone help?
Thank you.
This is again an incompatibility between your input shape and the format in which it is interpreted. Your Keras configuration has the image ordering set to channels last, while your input shape puts the channels first. To fix it, simply replace this line:
model.add(Conv2D(16, (5, 5), padding='valid', input_shape=(3, size, size)))
With:
model.add(Conv2D(16, (5, 5), padding='valid', input_shape=(size, size, 3)))
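Alternatively, you could keep the channels-first input shape and tell Keras to interpret inputs that way (a sketch; the same setting lives as image_data_format in ~/.keras/keras.json):
from keras import backend as K

# switch the global image ordering instead of editing the model
K.set_image_data_format('channels_first')
print(K.image_data_format())  # 'channels_first'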