I am running into problems with my code after I removed the loss function of the compile step (set it equal to loss=None) and added one with the intention of adding another, loss function through the add_loss method. I can call fit and it trains for one epoch but then I get this error:
ValueError: operands could not be broadcast together with shapes (128,) (117,) (128,)
My batch size is 128. It looks like 117 is somehow dependent on the number of examples that I am using. When I vary the number of examples, I get different numbers from 117. They are all my number of examples mod my batch size. I am at a loss about how to fix this issue. I am using tf.data.TFRecordDataset as input.
I have the following simplified model:
class MyModel(Model):
def __init__(self):
super(MyModel, self).__init__()
encoder_input = layers.Input(shape=INPUT_SHAPE, name='encoder_input')
x = encoder_input
x = layers.Conv2D(64, (3, 3), activation='relu', padding='same', strides=2)(x)
x = layers.BatchNormalization()(x)
x = layers.Conv2D(32, (3, 3), activation='relu', padding='same', strides=2)(x)
x = layers.BatchNormalization()(x)
x = layers.Flatten()(x)
encoded = layers.Dense(LATENT_DIM, name='encoded')(x)
self.encoder = Model(encoder_input, outputs=[encoded])
self.decoder = tf.keras.Sequential([
layers.Input(shape=LATENT_DIM),
layers.Dense(32 * 32 * 32),
layers.Reshape((32, 32, 32)),
layers.Conv2DTranspose(32, kernel_size=3, strides=2, activation='relu', padding='same'),
layers.Conv2DTranspose(64, kernel_size=3, strides=2, activation='relu', padding='same'),
layers.Conv2D(3, kernel_size=(3, 3), activation='sigmoid', padding='same')])
def call(self, x):
encoded = self.encoder(x)
decoded = self.decoder(encoded)
# Loss function. Has to be here because I intend to add another, more layer-interdependent, loss function.
r_loss = tf.math.reduce_sum(tf.math.square(x - decoded), axis=[1, 2, 3])
self.add_loss(r_loss)
return decoded
def read_tfrecord(example):
example = tf.io.parse_single_example(example, CELEB_A_FORMAT)
image = decode_image(example['image'])
return image, image
def load_dataset(filenames, func):
dataset = tf.data.TFRecordDataset(
filenames
)
dataset = dataset.map(partial(func), num_parallel_calls=tf.data.AUTOTUNE)
return dataset
def train_autoencoder():
filenames_train = glob.glob(TRAIN_PATH)
train_dataset_x_x = load_dataset(filenames_train[:4], func=read_tfrecord)
autoencoder = Autoencoder()
# The loss function used to be defined here and everything worked fine before.
def r_loss(y_true, y_pred):
return tf.math.reduce_sum(tf.math.square(y_true - y_pred), axis=[1, 2, 3])
optimizer = tf.keras.optimizers.Adam(1e-4)
autoencoder.compile(optimizer=optimizer, loss=None)
autoencoder.fit(train_dataset_x_x.batch(AUTOENCODER_BATCH_SIZE),
epochs=AUTOENCODER_NUM_EPOCHS,
shuffle=True)
If you only want to get rid of the error and don't care about the last "remainder" batch of your dataset, you can use the keyword argument drop_remainder=True inside of train_dataset_x_x.batch(), that way all of your batches will be the same size.
FYI, it's usually better practice to batch your dataset outside of the function call for fit:
data = data.batch(32)
model.fit(data)
the loss function can not be set in the call method .
the call method is intended to make a forward pass not to copute the loss .
u need to add the loss function in the compile method or after it
Related
I'm trying to implement my own detection model that tries to find objects in a grayscale image by its coordinates, for this I created a custom dataset, defined by a custom data generator:
class DataGenerator(tf.compat.v2.keras.utils.Sequence):
def __init__(self, X_data , y_data, batch_size, shuffle = True):
self.batch_size = batch_size
self.X_data = X_data
self.labels = y_data
self.y_data = y_data
self.shuffle = shuffle
self.n = 0
self.dim = (480, 848)
self.list_IDs = np.arange(len(self.X_data))
self.on_epoch_end()
def __next__(self):
# Get one batch of data
data = self.__getitem__(self.n)
# Batch index
self.n += 1
# If we have processed the entire dataset then
if self.n >= self.__len__():
self.on_epoch_end
self.n = 0
return data
def __len__(self):
# Return the number of batches of the dataset
return math.ceil(len(self.indexes)/self.batch_size)
def __getitem__(self, index):
# Generate indexes of the batch
indexes = self.indexes[index*self.batch_size:
(index+1)*self.batch_size]
# Find list of IDs
list_IDs_temp = [self.list_IDs[k] for k in indexes]
X = self._generate_x(list_IDs_temp)
y = self._generate_y(list_IDs_temp)
return X, y
def on_epoch_end(self):
self.indexes = np.arange(len(self.X_data))
if self.shuffle:
np.random.shuffle(self.indexes)
def _generate_x(self, list_IDs_temp):
X = np.empty((self.batch_size, *self.dim))
for i, ID in enumerate(list_IDs_temp):
X[i,] = cv2.imread(self.X_data[ID],0)
X = (X/255).astype('float32') # Normalize data
return X[:,:,:, np.newaxis]
def _generate_y(self, list_IDs_temp):
y = np.empty((self.batch_size, 2))
for i, ID in enumerate(list_IDs_temp):
y[i] = self.y_data[ID]
return y
When called it gives the following output:
val_generator = DataGenerator(x_test, y_test, batch_size=4, shuffle=False)
images, labels = next(val_generator)
print(labels.shape)
>>>> (4, 2)
Which is what you would expect for a batch size of 4 with an x and an y as coordinates in the image.
The model looks as follows:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
batch_input_shape=(4, 480, 848, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(15, activation="relu"))
model.compile(loss=simple_loss, optimizer=keras.optimizers.Adam(lr=0.001))
Calling the model:
steps_per_epoch = len(train_generator)
validation_steps = len(val_generator)
model.fit(train_generator,
steps_per_epoch=steps_per_epoch,
epochs=10,
validation_data=val_generator,
validation_steps=validation_steps)
My custom but simple loss function to test if the model is running gave some errors:
def simple_loss(yTrue, yPred):
probs_logs, coords = yPred[:,:5], yPred[:,5:]
coords_2d = tf.reshape(coords, [4, 10]) # 4 batch size and 10 coords flatten out
_abs = tf.math.abs(yTrue, coords_2d )
return tf.sqrt(_abs)
I first was thinking yPred contained errors, but that produced the following tensor:
Tensor("simple_loss/Reshape:0", shape=(4, 10), dtype=float32)
Then I looked at yTrue and found the shape was (None, None):
Tensor("IteratorGetNext:1", shape=(None, None), dtype=float32)
So I guess there is something wrong with my generator, I have no clue what, so I was wondering if any of you could help me out?
Thanks
The shape for ground truths is supposed to be None since they have not been evaluated yet when the code is first run.
Moreover, the "yTrue" and "yPred" are purely tensors, so be sure to use tensorflow functions for each and every operation on them.
I faced same error just because I inferred the length of tensor by using tensor.shape which gave me float not Tensor.
Fixing that fixed the problem, which was using tf.size() function. This returns length of tensor as Tensor object.
I'm new to pytorch. Here's an architecture of a tensorflow model and I'd like to convert it into a pytorch model.
I have done most of the codes but am confused about a few places.
1) In tensorflow, the Conv2D function takes filter as an input. However, in pytorch, the function takes the size of input channels and output channels as inputs. So how do I find the equivalent number of input channels and output channels, provided with the size of the filter.
2) In tensorflow, the dense layer has a parameter called 'nodes'. However, in pytorch, the same layer has 2 different inputs (the size of the input parameters and size of the targeted parameters), how do I determine them based on the number of the nodes.
Here's the tensorflow code.
from keras.utils import to_categorical
from keras.models import Sequential, load_model
from keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropout
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu', input_shape=X_train.shape[1:]))
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(rate=0.25))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(rate=0.25))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(rate=0.5))
model.add(Dense(43, activation='softmax'))
Here's my code.:
import torch.nn.functional as F
import torch
# The network should inherit from the nn.Module
class Net(nn.Module):
def __init__(self):
super(Net, self).__init__()
# Define 2D convolution layers
# 3: input channels, 32: output channels, 5: kernel size, 1: stride
self.conv1 = nn.Conv2d(3, 32, 5, 1) # The size of input channel is 3 because all images are coloured
self.conv2 = nn.Conv2d(32, 64, 5, 1)
self.conv3 = nn.Conv2d(64, 128, 3, 1)
self.conv3 = nn.Conv2d(128, 256, 3, 1)
# It will 'filter' out some of the input by the probability(assign zero)
self.dropout1 = nn.Dropout2d(0.25)
self.dropout2 = nn.Dropout2d(0.5)
# Fully connected layer: input size, output size
self.fc1 = nn.Linear(36864, 128)
self.fc2 = nn.Linear(128, 10)
# forward() link all layers together,
def forward(self, x):
x = self.conv1(x)
x = F.relu(x)
x = self.conv2(x)
x = F.relu(x)
x = F.max_pool2d(x, 2)
x = self.dropout1(x)
x = self.conv3(x)
x = F.relu(x)
x = self.conv4(x)
x = F.relu(x)
x = F.max_pool2d(x, 2)
x = self.dropout1(x)
x = torch.flatten(x, 1)
x = self.fc1(x)
x = F.relu(x)
x = self.dropout2(x)
x = self.fc2(x)
output = F.log_softmax(x, dim=1)
return output
Thanks in advance!
1) In pytorch, we take input channels and output channels as an input. In your first layer, the input channels will be the number of color channels in your image. After that it's always going to be the same as the output channels from your previous layer (output channels are specified by the filters parameter in Tensorflow).
2). Pytorch is slightly annoying in the fact that when flattening your conv outputs you'll have to calculate the shape yourself. You can either use an equation to calculate this (𝑂𝑢𝑡=(𝑊−𝐹+2𝑃)/𝑆+1), or make a shape calculating function to get the shape of a dummy image after it's been passed through the conv part of the network. This parameter will be your size of input argument; the size of your output argument will just be the number of nodes you want in your next fully connected layer.
I am using the custom loss function in addition to the mean squared error loss function in my Keras model. Code for the custom loss function is given below:
def grad1(matrix):
dx = 1.0
u_x = np.gradient(matrix,dx,axis=0)
u_xx = np.gradient(u_x,dx,axis=0)
return u_xx
def artificial_diffusion(y_true, y_pred):
u_xxt = tf.py_func(grad1,[y_true],tf.float32)
u_xxp = tf.py_func(grad1,[y_pred],tf.float32)
lap_mse = tf.losses.mean_squared_error(u_xxt,u_xxp) + K.epsilon()
I have the 1D CNN model.
input_img = Input(shape=(n_states,n_features))
x = Conv1D(32, kernel_size=5, activation='relu', padding='same')(input_img)
x = Conv1D(32, kernel_size=5, activation='relu', padding='same')(x)
x = Conv1D(32, kernel_size=5, activation='relu', padding='same')(x)
decoded1 = Conv1D(n_outputs, kernel_size=3, activation='linear', padding='same',
name='regression')(x)
decoded2 = Conv1D(n_outputs, kernel_size=3, activation='linear', padding='same',
name='diffusion')(x)
model = Model(inputs=input_img, outputs=[decoded1,decoded2])
model.compile(loss=['mse',artificial_diffusion],
loss_weights=[1, 1],
optimizer='adam',metrics=[coeff_determination])
When I compile and run the model, I get an error An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.. If I create the model as model = Model(inputs=input_img, outputs=[decoded1,decoded1]), then there is no error. But, then I can't monitor two losses separately. Am I doing any mistake while constructing the model?
I am new to tensorflow and Semantic segmentation.
I am designing a U-Net for semantic segmentaion. Each image has one object that I want to classify. But in total I have images of 10 different objects. I am confused, how can I prepare my mask input? Is it considered as multi-label segmentation or only for one class?
Should I convert my input to one hot encoded? Should I use to_categorical? I find exaples for multi-class segmentation, but I don't know, If that's the case here. Because in one image I only have one object to detect/classify.
I tried using this as my code for input. But I am not sure, what I am doing is right or not.
#Generation of batches of image and mask
class DataGen(keras.utils.Sequence):
def __init__(self, image_names, path, batch_size, image_size=128):
self.image_names = image_names
self.path = path
self.batch_size = batch_size
self.image_size = image_size
def __load__(self, image_name):
# Path
image_path = os.path.join(self.path, "images/aug_test", image_name) + ".png"
mask_path = os.path.join(self.path, "masks/aug_test",image_name) + ".png"
# Reading Image
image = cv2.imread(image_path, 1)
image = cv2.resize(image, (self.image_size, self.image_size))
# Reading Mask
mask = cv2.imread(mask_path, -1)
mask = cv2.resize(mask, (self.image_size, self.image_size))
## Normalizaing
image = image/255.0
mask = mask/255.0
return image, mask
def __getitem__(self, index):
if(index+1)*self.batch_size > len(self.image_names):
self.batch_size = len(self.image_names) - index*self.batch_size
image_batch = self.image_names[index*self.batch_size : (index+1)*self.batch_size]
image = []
mask = []
for image_name in image_batch:
_img, _mask = self.__load__(image_name)
image.append(_img)
mask.append(_mask)
#This is where I am defining my input
image = np.array(image)
mask = np.array(mask)
mask = tf.keras.utils.to_categorical(mask, num_classes=10, dtype='float32') #Is this true?
return image, mask
def __len__(self):
return int(np.ceil(len(self.image_names)/float(self.batch_size)))
Is this true? If it is, then, to get the label/class as output what should I change in my input? Should I change the value of pixel of my mask according to my class?
Here is my U-Net architecture.
# Convolution and deconvolution Blocks
def down_scaling_block(x, filters, kernel_size=(3, 3), padding="same", strides=1):
conv = keras.layers.Conv2D(filters, kernel_size, padding=padding, strides=strides, activation="relu")(x)
conv = keras.layers.Conv2D(filters, kernel_size, padding=padding, strides=strides, activation="relu")(conv)
pool = keras.layers.MaxPool2D((2, 2), (2, 2))(conv)
return conv, pool
def up_scaling_block(x, skip, filters, kernel_size=(3, 3), padding="same", strides=1):
conv_t = keras.layers.UpSampling2D((2, 2))(x)
concat = keras.layers.Concatenate()([conv_t, skip])
conv = keras.layers.Conv2D(filters, kernel_size, padding=padding, strides=strides, activation="relu")(concat)
conv = keras.layers.Conv2D(filters, kernel_size, padding=padding, strides=strides, activation="relu")(conv)
return conv
def bottleneck(x, filters, kernel_size=(3, 3), padding="same", strides=1):
conv = keras.layers.Conv2D(filters, kernel_size, padding=padding, strides=strides, activation="relu")(x)
conv = keras.layers.Conv2D(filters, kernel_size, padding=padding, strides=strides, activation="relu")(conv)
return conv
def UNet():
filters = [16, 32, 64, 128, 256]
inputs = keras.layers.Input((image_size, image_size, 3))
'''inputs2 = keras.layers.Input((image_size, image_size, 1))
conv1_2, pool1_2 = down_scaling_block(inputs2, filters[0])'''
Input = inputs
conv1, pool1 = down_scaling_block(Input, filters[0])
conv2, pool2 = down_scaling_block(pool1, filters[1])
conv3, pool3 = down_scaling_block(pool2, filters[2])
'''conv3 = keras.layers.Conv2D(filters[2], kernel_size=(3,3), padding="same", strides=1, activation="relu")(pool2)
conv3 = keras.layers.Conv2D(filters[2], kernel_size=(3,3), padding="same", strides=1, activation="relu")(conv3)
drop3 = keras.layers.Dropout(0.5)(conv3)
pool3 = keras.layers.MaxPooling2D((2,2), (2,2))(drop3)'''
conv4, pool4 = down_scaling_block(pool3, filters[3])
bn = bottleneck(pool4, filters[4])
deConv1 = up_scaling_block(bn, conv4, filters[3]) #8 -> 16
deConv2 = up_scaling_block(deConv1, conv3, filters[2]) #16 -> 32
deConv3 = up_scaling_block(deConv2, conv2, filters[1]) #32 -> 64
deConv4 = up_scaling_block(deConv3, conv1, filters[0]) #64 -> 128
outputs = keras.layers.Conv2D(10, (1, 1), padding="same", activation="softmax")(deConv4)
model = keras.models.Model(inputs, outputs)
return model
model = UNet()
model.compile(optimizer='adam', loss="categorical_crossentropy", metrics=["acc"])
train_gen = DataGen(train_img, train_path, image_size=image_size, batch_size=batch_size)
valid_gen = DataGen(valid_img, train_path, image_size=image_size, batch_size=batch_size)
test_gen = DataGen(test_img, test_path, image_size=image_size, batch_size=batch_size)
train_steps = len(train_img)//batch_size
valid_steps = len(valid_img)//batch_size
model.fit_generator(train_gen, validation_data=valid_gen, steps_per_epoch=train_steps, validation_steps=valid_steps,
epochs=epochs)
I hope that I explained my question properly. Any help appriciated!
UPDATE: I changed the value of each pixel in mask as per object class. (If the image contains object which I want to classify as object no. 2, then I changed the value of mask pixel to 2. the whole array of mask will contain 0(bg) and 2(object). Accordingly for each object, the mask will contain 0 and 3, 0 and 10 etc.)
Here I first changed the mask to binary and then if the value of pixel is greater than 1, I changed it to 1 or 2 or 3. (according to object/class no.)
Then I converted them to one_hot with to_categorical as shown in my code. training runs but the network doesnt learn anything. Accuracy and loss keep swinging between two values. What is my mistake here? Am I making a mistake at generating mask (changing the value of pixels?) Or at the function to_categorical?
PROBLEM FOUND:
I was making an error while creating mask.. I was reading image with cv2, which reads image as heightxwidth.. I was creating mask with pixel values according to class, after considering my image dimention as widthxheight.. Which was causing problem and making network not to learn anything.. It is working now..
Each image has one object that I want to classify. But in total I have images of 10 different objects. I am confused, how can I prepare my mask input? Is it considered as multi-label segmentation or only for one class?
If your dataset has N different labels (i.e: 0 - background, 1 - dogs, 2 -cats...), you have a multi class problem, even if your images contain only kind of object.
Should I convert my input to one hot encoded? Should I use to_categorical?
Yes, you should one-hot encode your labels. Using to_categorical boils down to the source format of your labels. Say you have N classes and your labels are (height, width, 1), where each pixel has a value in range [0,N). In that case keras.utils.to_categorical(label, N) will provide a float (height,width,N) label, where each pixel is 0 or 1. And you don't have to divide by 255.
if your source format is different, you may have to use a custom function to get the same output format.
Check out this repo (not my work): keras-unet. The notebooks folder contain two examples to train a u-net on small datasets. They are not multiclass, but it is easy to go step by step to use your own dataset. Star by loading your labels as:
im = Image.open(mask).resize((512,512))
im = to_categorical(im,NCLASSES)
reshape and normalize like this:
x = np.asarray(imgs_np, dtype=np.float32)/255
y = np.asarray(masks_np, dtype=np.float32)
y = y.reshape(y.shape[0], y.shape[1], y.shape[2], NCLASSES)
x = x.reshape(x.shape[0], x.shape[1], x.shape[2], 3)
adapt your model to NCLASSES
model = custom_unet(
input_shape,
use_batch_norm=False,
num_classes=NCLASSES,
filters=64,
dropout=0.2,
output_activation='softmax')
select the correct loss:
from keras.losses import categorical_crossentropy
model.compile(
optimizer=SGD(lr=0.01, momentum=0.99),
loss='categorical_crossentropy',
metrics=[iou, iou_thresholded])
Hope it helps
If I have a keras layer L, and I want to stack N versions of this layer (with different weights) in a keras model, what's the best way to do that? Please note that here N is large and controlled by a hyper param. If N is small then this not a problem (we can just manually repeat a line N times). So let's assume N > 10 for example.
If the layer has only one input and one output, I can do something like:
m = Sequential()
for i in range(N):
m.add(L)
But this is not working if my layer actually takes multiple inputs. For example, if my layer has the form z = L(x, y), and I would like my model to do:
x_1 = L(x_0, y)
x_2 = L(x_1, y)
...
x_N = L(x_N-1, y)
Then Sequential wouldn't do the job. I think I can subclass a keras model, but I don't know what's the cleanest way to put N layers into the class. I can use a list, for example:
class MyModel(Model):
def __init__(self):
super(MyModel, self).__init__()
self.layers = []
for i in range(N):
self.layers.append(L)
def call(self, inputs):
x = inputs[0]
y = inputs[1]
for i in range(N):
x = self.layers[i](x, y)
return x
But this is not ideal, as keras won't recognize these layers (it seems not thinking list of layers as "checkpointables"). For example, MyModel.variables would be empty, and MyModel.Save() won't save anything.
I also tried to define the model using the functional API, but it won't work in my case as well. In fact if we do
def MyModel():
input = Input(shape=...)
output = SomeLayer(input)
return Model(inputs=input, outputs=output)
It won't run if SomeLayer itself is a custom model (it raises NotImplementedError).
Any suggestions?
Not sure if I've got your question right, but I guess that you could use the functional API and concatenate or add layers as it is shown in Keras applications, like, ResNet50 or InceptionV3 to build "non-sequential" networks.
UPDATE
In one of my projects, I was using something like this. I had a custom layer (it was not implemented in my version of Keras, so I've just manually "backported" the code into my notebook).
class LeakyReLU(Layer):
"""Leaky version of a Rectified Linear Unit backported from newer Keras
version."""
def __init__(self, alpha=0.3, **kwargs):
super(LeakyReLU, self).__init__(**kwargs)
self.supports_masking = True
self.alpha = K.cast_to_floatx(alpha)
def call(self, inputs):
return tf.maximum(self.alpha * inputs, inputs)
def get_config(self):
config = {'alpha': float(self.alpha)}
base_config = super(LeakyReLU, self).get_config()
return dict(list(base_config.items()) + list(config.items()))
def compute_output_shape(self, input_shape):
return input_shape
Then, the model:
def create_model(input_shape, output_size, alpha=0.05, reg=0.001):
inputs = Input(shape=input_shape)
x = Conv2D(16, (3, 3), padding='valid', strides=(1, 1),
kernel_regularizer=l2(reg), kernel_constraint=maxnorm(3),
activation=None)(inputs)
x = BatchNormalization()(x)
x = LeakyReLU(alpha=alpha)(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
x = Conv2D(32, (3, 3), padding='valid', strides=(1, 1),
kernel_regularizer=l2(reg), kernel_constraint=maxnorm(3),
activation=None)(x)
x = BatchNormalization()(x)
x = LeakyReLU(alpha=alpha)(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
x = Conv2D(64, (3, 3), padding='valid', strides=(1, 1),
kernel_regularizer=l2(reg), kernel_constraint=maxnorm(3),
activation=None)(x)
x = BatchNormalization()(x)
x = LeakyReLU(alpha=alpha)(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
x = Conv2D(128, (3, 3), padding='valid', strides=(1, 1),
kernel_regularizer=l2(reg), kernel_constraint=maxnorm(3),
activation=None)(x)
x = BatchNormalization()(x)
x = LeakyReLU(alpha=alpha)(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
x = Conv2D(256, (3, 3), padding='valid', strides=(1, 1),
kernel_regularizer=l2(reg), kernel_constraint=maxnorm(3),
activation=None)(x)
x = BatchNormalization()(x)
x = LeakyReLU(alpha=alpha)(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
x = Flatten()(x)
x = Dense(500, activation='relu', kernel_regularizer=l2(reg))(x)
x = Dense(500, activation='relu', kernel_regularizer=l2(reg))(x)
x = Dense(500, activation='relu', kernel_regularizer=l2(reg))(x)
x = Dense(500, activation='relu', kernel_regularizer=l2(reg))(x)
x = Dense(500, activation='relu', kernel_regularizer=l2(reg))(x)
x = Dense(500, activation='relu', kernel_regularizer=l2(reg))(x)
x = Dense(output_size, activation='linear', kernel_regularizer=l2(reg))(x)
model = Model(inputs=inputs, outputs=x)
return model
Finally, a custom metric:
def root_mean_squared_error(y_true, y_pred):
return K.sqrt(K.mean(K.square(y_pred - y_true), axis=-1))
I was using the following snippet to create and compile the model:
model = create_model(input_shape=X.shape[1:], output_size=y.shape[1])
model.compile(loss=root_mean_squared_error, optimizer='adamax')
As usual, I was using a checkpoint callback to save the model. To load the model, you need to pass the custom layers classes and metrics as well into load_model function:
def load_custom_model(path):
return load_model(path, custom_objects={
'LeakyReLU': LeakyReLU,
'root_mean_squared_error': root_mean_squared_error
})
Does it help?
After having researched this to a great extent: I'm certain that there is no built-in, universal way to do this in Tensorflow/Keras.
There are, however, still ways to achieve the same goals but in a different way. The problem is that there's no universal solution to this in Tensorflow with any Keras layer, and so you'll have to approach it on a case-by-case basis.
So for example, if what you wanted to do is stack a bunch of Dense layers and then have some dimension of your input that would correspond to each one (simple example), what you would instead want to do is construct a custom Dense layer and add extra dimensions to its weights and biases, and then do the appropriate operations given some extra dimension in your input.
So ultimately the same (desired) operations would be performed here in the way that you want them to be, each input along some dimension would be put through a separate Dense layer with separate weights/biases: but it would be done concurrently, without any python looping. In essence, you would be reducing the size and complexity of the graph and performing the same operations in a more concurrent way; this ought to be much more efficient.
The strategy outlined here generalises to any layer/input type. It's not great news, in that it would be of very high value to us (users) if there were some standardised Keras-friendly way of stacking a bunch of layers and then passing input to them in a more concurrent way that didn't involve python looping but rather concatenating the internal parameters into a new dimension and managing alignment between an extra 'stacking' dimension in both the inputs and parameters.
Like, in the same way we have tf.keras.Sequential we could also benefit from something like tf.keras.Parallel as a universal solution to this common ML need.
If I understand your question correctly, you can solve this problem simply by using a for-loop when building the model. I'm not sure if you need any special layer so I will assume you only use Dense here:
def MyModel():
input = Input(shape=...)
x = input
for i in range(N):
x = Dense(number_of_nodes, name='dense %i' %i)(x)
// Or some custom layers
output = Dense(number_of_output)(x)
return Model(inputs=input, outputs=output)