I'm completely new to Keras and AI. I have Keras 2.9 with Python 3.8.10 under Ubuntu 20.04. I have a model trained using two X inputs and a Y, and technically the training runs. Now I want to predict the Y using two inputs, but it fails. The training is done using this code fragment (I think only the input and output format is relevant here):
def generate(aBatchSize:int=32, aRepeatParameter:int=2, aPort:int=12345):
    dim = (512, 512)
    paraShape = (aRepeatParameter * 2,)

    def generator():
        while True:
            # fill variables
            yield ((xParameter, xImage), y)

    dataset = tensorflow.data.Dataset.from_generator(generator,
        output_signature=(
            (tensorflow.TensorSpec(shape=paraShape, dtype=tensorflow.float32),
             tensorflow.TensorSpec(shape=dim, dtype=tensorflow.float32)),
            tensorflow.TensorSpec(shape=(1,), dtype=tensorflow.float32)
        ))
    dataset = dataset.batch(aBatchSize)
    return dataset
repeatParameter = 2
batchSize = 16
model.fit(landscapeGenerator.generate(batchSize, repeatParameter, port),
          validation_data=landscapeGenerator.generate(batchSize, repeatParameter, port),
          epochs=50, steps_per_epoch=math.ceil(sampleSize / batchSize),
          validation_steps=math.ceil(validationSize / batchSize))
Printing the model input and output from training code yields this:
model.input [<KerasTensor: shape=(None, 4) dtype=float32 (created by layer 'input_1')>, <KerasTensor: shape=(None, 512, 512, 1) dtype=float32 (created by layer 'input_2')>]
model.output KerasTensor(type_spec=TensorSpec(shape=(None, 1), dtype=tf.float32, name=None), name='dense_4/BiasAdd:0', description="created by layer 'dense_4'")
This is the failing inference code:
image = numpy.multiply(imageio.imread(filename), 1.0 / 255.0)
model = tensorflow.keras.models.load_model(modelDir)
repeatParameter = 2
paraShape = (repeatParameter * 2,)
parameter = numpy.empty(paraShape, dtype=float)
# fill parameters
tempDiff = 5.0 * model.predict((parameter, image))
It fails with the following error, because predict does not understand that the model has two inputs of different sizes:
ValueError: Data cardinality is ambiguous:
x sizes: 4, 512
Make sure all arrays contain the same number of samples.
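The ambiguity arises because predict treats the first axis of every array as the sample axis, and parameter has leading size 4 while image has leading size 512. A minimal sketch of the fix (anticipating the working solution at the end) is to add an explicit batch axis to each input:

parameter = numpy.expand_dims(parameter, axis=0)  # (4,) -> (1, 4): one sample
image = numpy.expand_dims(image, axis=0)          # (512, 512) -> (1, 512, 512)
tempDiff = 5.0 * model.predict([parameter, image])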
I also tried to make the prediction using a generator, because there I know how to provide shape information, but with no success:
def generate(aParameter, aParaShape, aImage):
    dim = (512, 512)

    def generator():
        while True:
            yield (aParameter, aImage)

    dataset = tensorflow.data.Dataset.from_generator(generator,
        output_signature=(
            tensorflow.TensorSpec(shape=paraShape, dtype=tensorflow.float32),
            tensorflow.TensorSpec(shape=dim, dtype=tensorflow.float32)
        ))
    dataset = dataset.batch(1)
    return dataset
image = numpy.multiply(imageio.imread(filename), 1.0 / 255.0)
model = tensorflow.keras.models.load_model(modelDir)
repeatParameter = 2
paraShape = (repeatParameter * 2,)
parameter = numpy.empty(paraShape, dtype=float)
# fill parameters
tempDiff = 5.0 * model.predict(generate(parameter, paraShape, image), batch_size=1, steps=1)
This one complains:
ValueError: Layer "model_2" expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, 4) dtype=float32>]
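The cause here is different: when predict receives a tf.data.Dataset whose elements are 2-tuples, Keras unpacks each element as (inputs, targets), so only the (None, 4) parameter tensor reaches the model as input. A sketch of a generator that keeps both tensors on the input side (same shapes as above, only the nesting changes) yields a 1-tuple instead:

def generator():
    while True:
        yield ((aParameter, aImage),)  # 1-tuple: both tensors are model inputs

dataset = tensorflow.data.Dataset.from_generator(generator,
    output_signature=((
        tensorflow.TensorSpec(shape=paraShape, dtype=tensorflow.float32),
        tensorflow.TensorSpec(shape=dim, dtype=tensorflow.float32)),
    ))

Batched with dataset.batch(1), such a dataset can then be passed to predict as before.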
EDIT: current model generation
def createMlp(aRepeatParameter:int):
    vectorSize = aRepeatParameter * 2
    inputs = Input(shape=(vectorSize,))
    x = inputs
    # do not process now, raw data are better: x = Dense(vectorSize, activation="relu")(x)
    return Model(inputs, x)
def createCnn():
    filters = (256, 64, 16)
    inputShape = (512, 512, 1)
    chanDim = -1
    inputs = Input(shape=inputShape)
    x = inputs
    for (i, f) in enumerate(filters):
        x = Conv2D(f, (3, 3), padding="same")(x)
        x = Activation("relu")(x)
        x = BatchNormalization(axis=chanDim)(x)
        x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Flatten()(x)
    x = Dense(512, activation='relu')(x)
    x = Dense(16, activation='relu')(x)
    x = BatchNormalization(axis=chanDim)(x)
    x = Dropout(0.5)(x)
    x = Dense(4)(x)
    x = Activation("relu")(x)
    return Model(inputs, x)
repeatParameter:int = 2
mlp = createMlp(repeatParameter)
cnn = createCnn()
combinedInput = Concatenate(axis=1)([mlp.output, cnn.output])
x = Dense(4, activation="relu")(combinedInput)
x = Dense(1, activation="linear")(x)
model = Model(inputs=[mlp.input, cnn.input], outputs=x)
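A quick sanity check of the combined model (a sketch using dummy tensors that match the model.input shapes printed above):

import tensorflow as tf

dummyParameter = tf.zeros((1, 4))          # matches input_1: (None, 4)
dummyImage = tf.zeros((1, 512, 512, 1))    # matches input_2: (None, 512, 512, 1)
print(model([dummyParameter, dummyImage]).shape)  # expected: (1, 1)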
It turned out that I needed to reshape my inputs, and even that attempt had a typo in it. The working solution is:
def generate(aParameter, aParaShape, aImage):
    dim = (512, 512)

    def generator():
        while True:
            yield (aParameter, aImage)

    dataset = tensorflow.data.Dataset.from_generator(generator,
        output_signature=(
            tensorflow.TensorSpec(shape=paraShape, dtype=tensorflow.float32),
            tensorflow.TensorSpec(shape=dim, dtype=tensorflow.float32)
        ))
    dataset = dataset.batch(1)
    return dataset
image = numpy.multiply(imageio.imread(filename), 1.0 / 255.0)
model = tensorflow.keras.models.load_model(modelDir)
repeatParameter = 2
paraShape = (repeatParameter * 2,)
parameter = numpy.empty(paraShape, dtype=float)
# fill it
parameter = numpy.reshape(parameter, (1, 4))
image = numpy.reshape(image, (1, 512, 512))
tempDiff = 5.0 * model.predict([parameter, image], batch_size=1, steps=1)
Related
I'm working on a project using Keras Model Subclassing to create a model with 2 inputs and 2 outputs. The training data for this model is essentially a dataset of other image classification datasets, with each image being paired with its corresponding label; a dataset of datasets. One input of the network receives the label, the other receives the image.
train_img = generate_tensors(train, 0)
train_ans = generate_tensors(train, 1)
val_img = generate_tensors(val, 0)
val_ans = generate_tensors(val, 1)
train_img_b = train_img.batch(batch_size) # b for batched
train_ans_b = train_ans.batch(batch_size)
structuremodel = StructureModel()
hnet_output, anet_output = structuremodel([train_img_b, train_ans_b])
In the above code, I'm trying to perform a single forward propagation on my custom "StructureModel" class. "train_img" and "train_ans" are of shapes (None, 100, 224, 224, 1) and [insert shape] respectively. I have set the batch_size to 1.
The model itself is defined as follows:
class StructureModel(keras.Model):
    num_images = 100  # images per timestep
    resolution = [224, 224]
    hnet_pred_vars = 9
    anet_pred_vars = 25  # the thing on my whiteboard didn't include a stopping node
    alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!\"#$%&'()*+,-./:;<=>?#[\\]^_`{|}~ "

    def __init__(self):
        super().__init__()
        self.anet_layer = ArchitectureNet(self.anet_pred_vars)

    def call(self, inputs):
        # CNN-RNN/CNN-LSTM for processing images and corresponding answers
        # Copied VGG16 for structure
        # Image processing
        # shape=(timesteps, resolution, resolution, rgb channels)
        images = inputs[0]
        answers = inputs[1]
        x = TimeDistributed(Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu'))(images)
        x = TimeDistributed(Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu'))(x)
        x = TimeDistributed(MaxPooling2D(pool_size=(2, 2), strides=2))(x)
        filters_convs = [(128, 2), (256, 3), (512, 3), (512, 3)]
        for n_filters, n_convs in filters_convs:
            for _ in range(n_convs):
                x = TimeDistributed(Conv2D(filters=n_filters, kernel_size=(3, 3), padding='same', activation='relu'))(x)
            x = TimeDistributed(MaxPooling2D(pool_size=(2, 2), strides=2))(x)
        x = TimeDistributed(Flatten())(x)
        img_embed = TimeDistributed(Dense(units=1000), name='Image_Preprocessing')(x)
        # Answer embedding
        # Number of image-answer pairs, characters in answer, single character
        x = TimeDistributed(LSTM(units=500))(answers)  # All answers, shape (100, None, 95)
        answer_embed = TimeDistributed(Dense(units=1000), name='Answer_Preprocessing/Embed')(x)
        # Combines both models
        merge = Concatenate(axis=2)([img_embed, answer_embed])
        x = LSTM(units=100)(merge)
        dataset_embed = Dense(units=100, activation='relu', name='Dataset_Embed')(x)
        # hnet
        x = Dense(units=50)(dataset_embed)
        hnet_output = Dense(units=self.hnet_pred_vars, name='Hyperparameters')(x)
        # anet
        anet_output = self.anet_layer(dataset_embed)
        return hnet_output, anet_output
There's a lot of extra fluff in it, and I'm sure there are many other errors in the model, but the main one I care about is the TypeError that I keep receiving. Without resolving that, I can't get to debugging anything else. The error is as follows:
File ~\Documents\Programming\Python\HYPAT\NetworksV2.py:83 in call
x = TimeDistributed(Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu'))(images)
TypeError: Exception encountered when calling layer "structure_model_7" (type StructureModel).
'<' not supported between instances of 'NoneType' and 'int'
Call arguments received by layer "structure_model_7" (type StructureModel):
• inputs=['<BatchDataset element_spec=TensorSpec(shape=(None, 100, 224, 224, 1), dtype=tf.float32, name=None)>', '<BatchDataset element_spec=TensorSpec(shape=(None, 100, 2, 95), dtype=tf.float64, name=None)>']
If it would be of any use, here's the entirety of the code.
import keras
from keras.layers import TimeDistributed, Conv2D, Dense, MaxPooling2D, Flatten, LSTM, Concatenate
from tensorflow.keras.utils import plot_model
import pickle
import tqdm
import tensorflow as tf
from varname import nameof

# constants/hyperparameters
batch_size = 1
epochs = 10
train_test_split = 0.25

with open("datasets", "rb") as fp:
    datasets = pickle.load(fp)
class ArchitectureNet(keras.layers.Layer):
    def __init__(self, anet_pred_vars, **kwargs):
        super().__init__()
        self.anet_pred_vars = anet_pred_vars
        self.concat = Concatenate(axis=1)
        self.dense1 = Dense(units=50, activation='relu')
        self.dense2 = Dense(units=50, activation='relu')
        self.anet_output = Dense(units=self.anet_pred_vars, name='Architecture')
        self.stopping_node = Dense(units=1, activation='sigmoid')

    def call(self, prev_output, dataset_embed):
        x = self.concat([prev_output, dataset_embed])
        x = self.dense1(x)
        x = self.dense2(x)
        anet_output = self.anet_output(x)
        stop_node_output = self.stopping_node(x)
        print(tf.make_ndarray(stop_node_output))
        return anet_output
class StructureModel(keras.Model):
    num_images = 100  # images per timestep
    resolution = [224, 224]
    hnet_pred_vars = 9
    anet_pred_vars = 25  # the thing on my whiteboard didn't include a stopping node
    alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!\"#$%&'()*+,-./:;<=>?#[\\]^_`{|}~ "

    def __init__(self):
        super().__init__()
        self.anet_layer = ArchitectureNet(self.anet_pred_vars)

    def call(self, inputs):
        # CNN-RNN/CNN-LSTM for processing images and corresponding answers
        # Copied VGG16 for structure
        # Image processing
        # shape=(timesteps, resolution, resolution, rgb channels)
        images = inputs[0]
        answers = inputs[1]
        x = TimeDistributed(Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu'))(images)
        x = TimeDistributed(Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu'))(x)
        x = TimeDistributed(MaxPooling2D(pool_size=(2, 2), strides=2))(x)
        filters_convs = [(128, 2), (256, 3), (512, 3), (512, 3)]
        for n_filters, n_convs in filters_convs:
            for _ in range(n_convs):
                x = TimeDistributed(Conv2D(filters=n_filters, kernel_size=(3, 3), padding='same', activation='relu'))(x)
            x = TimeDistributed(MaxPooling2D(pool_size=(2, 2), strides=2))(x)
        x = TimeDistributed(Flatten())(x)
        img_embed = TimeDistributed(Dense(units=1000), name='Image_Preprocessing')(x)
        # Answer embedding
        # Number of image-answer pairs, characters in answer, single character
        x = TimeDistributed(LSTM(units=500))(answers)  # All answers, shape (100, None, 95)
        answer_embed = TimeDistributed(Dense(units=1000), name='Answer_Preprocessing/Embed')(x)
        # Combines both models
        merge = Concatenate(axis=2)([img_embed, answer_embed])
        x = LSTM(units=100)(merge)
        dataset_embed = Dense(units=100, activation='relu', name='Dataset_Embed')(x)
        # hnet
        x = Dense(units=50)(dataset_embed)
        hnet_output = Dense(units=self.hnet_pred_vars, name='Hyperparameters')(x)
        # anet
        anet_output = self.anet_layer(dataset_embed)
        return hnet_output, anet_output

    def compile(self):
        super().compile()
# Reserve 10,000 samples for validation
ratio = int(train_test_split * len(datasets))
val = datasets[:ratio]
train = datasets[ratio:]
if len(val) == 0:  # look at me mom i'm a real programmer
    raise IndexError('List "x_val" is empty; "train_test_split" is set too small')
# Prepare the training and testing datasets
def generate_tensors(data, img_or_ans):  # 0 for image, 1 for ans
    # technically the images aren't ragged arrays but for simplicity's sake we'll keep them all as ragged tensors
    column = [i[img_or_ans] for i in data]
    tensor_data = tf.ragged.constant(column)
    tensor_data = tensor_data.to_tensor()
    tensor_dataset = tf.data.Dataset.from_tensor_slices(tensor_data)
    return tensor_dataset
train_img = generate_tensors(train, 0)
train_ans = generate_tensors(train, 1)
val_img = generate_tensors(val, 0)
val_ans = generate_tensors(val, 1)
# TODO: Test if CIFAR 100 dataset (which has variable length answers) will work
#train_dataset = tf.data.Dataset.zip((train_img, train_ans))
#train_dataset = train_dataset.batch(batch_size)
train_img_b = train_img.batch(batch_size) # b for batched
train_ans_b = train_ans.batch(batch_size)
structuremodel = StructureModel()
hnet_output, anet_output = structuremodel([train_img_b, train_ans_b])
plot_model(StructureModel, to_file='aeu.png', show_shapes=True)
"""
for epoch in tqdm.trange(epochs, desc="Epochs"):
# Iterate over the batches of the dataset.
for step, (x_batch_train, y_batch_train) in tqdm(enumerate(train_dataset), leave=False):
# Open a GradientTape to record the operations run
# during the forward pass, which enables auto-differentiation.
with tf.GradientTape() as tape:
# Run the forward pass of the layer.
# The operations that the layer applies
# to its inputs are going to be recorded
# on the GradientTape.
# Logits for this minibatch
logits = model(x_batch_train, training=True)
# Compute the loss value for this minibatch.
loss_value = los5s_fn(y_batch_train, logits)
# Use the gradient tape to automatically retrieve
# the gradients of the trainable variables with respect to the loss.
grads = tape.gradient(loss_value, model.trainable_weights)
# Run one step of gradient descent by updating
# the value of the variables to minimize the loss.
optimizer.apply_gradients(zip(grads, model.trainable_weights))
# Log every 200 batches.
if step % 200 == 0:
print(
"Training loss (for one batch) at step %d: %.4f"
% (step, float(loss_value))
)
print("Seen so far: %s samples" % ((step + 1) * batch_size))
"""
You cannot feed tf.data.Datasets directly to keras layers. Try this:
dataset1 = tf.data.Dataset.from_tensor_slices(tf.random.uniform((5, 100, 224, 224, 1))).batch(1)
dataset2 = tf.data.Dataset.from_tensor_slices(tf.random.uniform((5, 100, 2, 95))).batch(1)
structuremodel = StructureModel()
for (x1, x2) in zip(dataset1.take(1), dataset2.take(1)):
    hnet_output, anet_output = structuremodel([x1, x2])
Note, however, that StructureModel is buggy, but I'm sure you know that.
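If you prefer to stay within tf.data, an equivalent sketch (assuming the same dataset1 and dataset2) zips the two datasets so each element already carries both inputs:

paired = tf.data.Dataset.zip((dataset1, dataset2))
for (x1, x2) in paired.take(1):
    hnet_output, anet_output = structuremodel([x1, x2])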
I want to concatenate the output from an embedding layer with a custom tensor (myarr / myconst). I can specify everything with a fixed batch size as follows:
import numpy as np
import tensorflow as tf
BATCH_SIZE = 100
myarr = np.ones((10, 5))
myconst = tf.constant(np.tile(myarr, (BATCH_SIZE, 1, 1)))
# Model definition
inputs = tf.keras.layers.Input((10,), batch_size=BATCH_SIZE)
x = tf.keras.layers.Embedding(10, 5)(inputs)
x = tf.keras.layers.Concatenate(axis=1)([x, myconst])
model = tf.keras.models.Model(inputs=inputs, outputs=x)
However, if I don't specify batch size and tile my array, i.e. just the following...
myarr = np.ones((10, 5))
myconst = tf.constant(myarr)
# Model definition
inputs = tf.keras.layers.Input((10,))
x = tf.keras.layers.Embedding(10, 5)(inputs)
x = tf.keras.layers.Concatenate(axis=1)([x, myconst])
model = tf.keras.models.Model(inputs=inputs, outputs=x)
... I get an error specifying that shapes [(None, 10, 5), (10, 5)] can't be concatenated. Is there a way to add this None / batch_size axis to avoid tiling?
Thanks in advance
You want to concatenate a constant of shape (10, 5) to a 3D tensor of shape (batch, 10, 5) along the batch dimension. To do this, your constant must be 3D, so you have to reshape it to (1, 10, 5) and repeat it along axis=0 at runtime in order to match the shape (batch, 10, 5) before concatenating.
We do this inside a Lambda layer:
X = np.random.randint(0, 10, (100, 10))
Y = np.random.uniform(0, 1, (100, 20, 5))

myarr = np.ones((1, 10, 5)).astype('float32')
myconst = tf.convert_to_tensor(myarr)

def repeat_const(tensor, myconst):
    # repeat myconst along axis 0 to match the dynamic batch size of tensor
    shapes = tf.shape(tensor)
    return tf.repeat(myconst, shapes[0], axis=0)

inputs = tf.keras.layers.Input((10,))
x = tf.keras.layers.Embedding(10, 5)(inputs)
xx = tf.keras.layers.Lambda(lambda x: repeat_const(x, myconst))(x)
x = tf.keras.layers.Concatenate(axis=1)([x, xx])
model = tf.keras.models.Model(inputs=inputs, outputs=x)
model.compile('adam', 'mse')
model.fit(X, Y, epochs=3)
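As a quick check of the result (a sketch): the embedding output (None, 10, 5) concatenated along axis=1 with the repeated constant (None, 10, 5) gives (None, 20, 5), which matches the shape of Y above:

print(model.output_shape)  # (None, 20, 5)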
File "C:\Users\J2\Desktop\Pytorchseries\thenn.py", line 50, in
net = Net()
TypeError: new(): argument 'size' must be tuple of ints, but found element of type NoneType at pos 2
If it helps I was following the sentdex pytorch tutorial. Any help would be appreciated. I am new to machine learning, and I was hoping that this would work. Please help me out!
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import matplotlib.pyplot as plt
import tqdm
training_data = np.load('training_data.npy', allow_pickle=True)
print(len(training_data))
X = torch.Tensor([i[0] for i in training_data]).view(-1,50,50)
X = X/255.0
y = torch.Tensor([i[1] for i in training_data])
plt.imshow(X[0], cmap='gray')
print(y[0])
class Net(nn.Module):
    def __init__(self):
        super().__init__()  # just run the init of parent class (nn.Module)
        self.conv1 = nn.Conv2d(1, 32, 5)  # input is 1 image, 32 output channels, 5x5 kernel / window
        self.conv2 = nn.Conv2d(32, 64, 5)  # input is 32, bc the first layer output 32. Then we say the output will be 64 channels, 5x5 kernel / window
        self.conv3 = nn.Conv2d(64, 128, 5)
        x = torch.randn(50, 50).view(-1, 1, 50, 50)
        self._to_linear = None
        self.convs(x)
        self.fc1 = nn.Linear(self._to_linear, 512)  # flattening.
        self.fc2 = nn.Linear(512, 2)  # 512 in, 2 out bc we're doing 2 classes (dog vs cat).

    def convs(self, x):
        # max pooling over 2x2
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        x = F.max_pool2d(F.relu(self.conv2(x)), (2, 2))
        x = F.max_pool2d(F.relu(self.conv3(x)), (2, 2))

    def forward(self, x):
        x = self.convs(x)
        x = x.view(-1, self._to_linear)  # .view is reshape ... this flattens X before
        x = F.relu(self.fc1(x))
        x = self.fc2(x)  # bc this is our output layer. No activation here.
        return F.softmax(x, dim=1)
        if self._to_linear is None:
            self._to_linear = x[0].shape[0]*x[0].shape[1]*x[0].shape[2]
        return x
net = Net()
print(net)

import torch.optim as optim

optimizer = optim.Adam(net.parameters(), lr=0.001)
loss_function = nn.MSELoss()

X = torch.Tensor([i[0] for i in training_data]).view(-1, 50, 50)
X = X/255.0
y = torch.Tensor([i[1] for i in training_data])

VAL_PCT = 0.1  # lets reserve 10% of our data for validation
val_size = int(len(X)*VAL_PCT)
print(val_size)

train_X = X[:-val_size]
train_y = y[:-val_size]
test_X = X[-val_size:]
test_y = y[-val_size:]
print(len(train_X), len(test_X))

BATCH_SIZE = 100
EPOCHS = 1

for epoch in range(EPOCHS):
    for i in tqdm(range(0, len(train_X), BATCH_SIZE)):  # from 0, to the len of x, stepping BATCH_SIZE at a time. [:50] ..for now just to dev
        #print(f"{i}:{i+BATCH_SIZE}")
        batch_X = train_X[i:i+BATCH_SIZE].view(-1, 1, 50, 50)
        batch_y = train_y[i:i+BATCH_SIZE]
        net.zero_grad()
        outputs = net(batch_X)
        loss = loss_function(outputs, batch_y)
        loss.backward()
        optimizer.step()  # Does the update
    print(f"Epoch: {epoch}. Loss: {loss}")

correct = 0
total = 0
with torch.no_grad():
    for i in tqdm(range(len(test_X))):
        real_class = torch.argmax(test_y[i])
        net_out = net(test_X[i].view(-1, 1, 50, 50))[0]  # returns a list,
        predicted_class = torch.argmax(net_out)
        if predicted_class == real_class:
            correct += 1
        total += 1
print("Accuracy: ", round(correct/total, 3))
The issue is with self._to_linear. You use it in __init__ as:
self._to_linear = None
self.convs(x)
self.fc1 = nn.Linear(self._to_linear, 512) #flattening.
The call to nn.Linear has it as a parameter. This parameter should equal the number of input features in the linear layer, and cannot be None, since the value will determine the shape of the layer (number of weights and biases). How to fix this depends on what you're trying to achieve.
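For reference, a sketch of one way to fix it, matching the sentdex tutorial the asker was following: the if-block moves into convs before a return x, so the probe call in __init__ fills in self._to_linear before nn.Linear is constructed:

def convs(self, x):
    # max pooling over 2x2
    x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
    x = F.max_pool2d(F.relu(self.conv2(x)), (2, 2))
    x = F.max_pool2d(F.relu(self.conv3(x)), (2, 2))
    # record the flattened feature size the first time convs runs
    if self._to_linear is None:
        self._to_linear = x[0].shape[0]*x[0].shape[1]*x[0].shape[2]
    return x

def forward(self, x):
    x = self.convs(x)
    x = x.view(-1, self._to_linear)  # flatten
    x = F.relu(self.fc1(x))
    x = self.fc2(x)
    return F.softmax(x, dim=1)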
I have the following code trying to perform predictions on part of a ResNet model. However, I am getting an error.
def layer_input_shape(Model, layer_index):
    input_shape = np.array(Model.layers[layer_index - 1].output_shape)
    input_shape = np.ndarray.tolist(np.delete(input_shape, 0))
    return input_shape

def resnet50_Model(Model, trainable=True):
    input_shape = layer_input_shape(Model, 1)
    input = tf.keras.layers.Input(shape=input_shape)
    first_layer = Model.layers[0]
    first_layer.trainable = trainable
    out = first_layer(input)
    for i in range(1, 12):
        layer_i = Model.layers[i]
        layer_i.trainable = trainable
        out = layer_i(out)
    out = Conv2D(filters=2, kernel_size=2, strides=(2, 2), activation='relu')(out)
    out = Flatten()(out)
    out = Dense(units=2, activation='softmax')(out)
    result_model = tf.keras.models.Model(inputs=[input], outputs=out)
    return result_model
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
img='/content/elephant.jpg'
img = image.load_img(img, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = resnet_skip_model.predict(x)
print('Predicted:', decode_predictions(preds, top=3)[0])
I get the following error:
ValueError: `decode_predictions` expects a batch of predictions (i.e. a 2D array of shape (samples,
1000)). Found array with shape: (1, 3)
I had added a two-unit output dense layer so that I could predict only two classes, but decode_predictions expects the last dense layer to output the 1000 ImageNet classes; I therefore changed the units from two to 1000:
out = Dense(units=1000, activation='softmax')(out)
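Alternatively, if the goal really is a two-class model, decode_predictions (which is specific to the 1000-class ImageNet head) can be skipped and the prediction mapped by hand; a sketch with hypothetical class names:

class_names = ['class_a', 'class_b']  # hypothetical labels for the 2-unit head
preds = resnet_skip_model.predict(x)  # shape (1, 2)
print('Predicted:', class_names[int(np.argmax(preds, axis=-1)[0])])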
I'm trying to improve the stability of my GAN model by adding a standard deviation variable to my layer's feature map. I'm following the example set in the GANs-in-Action git. The math itself makes sense to me, as do the mechanics of my model and the reasons why this addresses mode collapse. However, a shortcoming of the example is that it never actually shows how this code is executed.
def minibatch_std_layer(layer, group_size=4):
    group_size = keras.backend.minimum(group_size, tf.shape(layer)[0])
    shape = list(keras.backend.int_shape(input))
    shape[0] = tf.shape(input)[0]
    minibatch = keras.backend.reshape(layer, (group_size, -1, shape[1], shape[2], shape[3]))
    minibatch -= tf.reduce_mean(minibatch, axis=0, keepdims=True)
    minibatch = tf.reduce_mean(keras.backend.square(minibatch), axis=0)
    minibatch = keras.backend.square(minibatch + 1e8)
    minibatch = tf.reduce_mean(minibatch, axis=[1, 2, 4], keepdims=True)
    minibatch = keras.backend.tile(minibatch, [group_size, 1, shape[2], shape[3]])
    return keras.backend.concatenate([layer, minibatch], axis=1)
def build_discriminator():
    const = ClipConstraint(0.01)
    discriminator_input = Input(shape=(4000, 3), batch_size=BATCH_SIZE, name='discriminator_input')
    x = discriminator_input
    x = Conv1D(64, 3, strides=1, padding="same", kernel_constraint=const)(x)
    x = BatchNormalization()(x)
    x = LeakyReLU(0.3)(x)
    x = Dropout(0.25)(x)
    x = Conv1D(128, 3, strides=2, padding="same", kernel_constraint=const)(x)
    x = LeakyReLU(0.3)(x)
    x = Dropout(0.25)(x)
    x = Conv1D(256, 3, strides=3, padding="same", kernel_constraint=const)(x)
    x = LeakyReLU(0.3)(x)
    x = Dropout(0.25)(x)
    # Trying to add it to the feature map here
    x = minibatch_std_layer(Conv1D(256, 3, strides=3, padding="same", kernel_constraint=const)(x))
    x = Flatten()(x)
    x = Dense(1000)(x)
    discriminator_output = Dense(1, activation='sigmoid')(x)
    return Model(discriminator_input, discriminator_output, name='discriminator_model')

d = build_discriminator()
No matter how I structure it, I can't get the discriminator to build. It keeps returning different types of AttributeErrors, and I've been unable to understand what it wants. Searching the issue, there were lots of Medium posts showing a high-level overview of what this does in a progressive GAN, but nothing I could find showed its application.
Does anyone have any suggestions about how the above code is added to a layer?
For those who want to use Minibatch Standard Deviation as a Keras layer, here is the code:
# mini-batch standard deviation layer
class MinibatchStdev(layers.Layer):
    def __init__(self, **kwargs):
        super(MinibatchStdev, self).__init__(**kwargs)

    # calculate the mean standard deviation across each pixel coord
    def call(self, inputs):
        mean = K.mean(inputs, axis=0, keepdims=True)
        mean_sq_diff = K.mean(K.square(inputs - mean), axis=0, keepdims=True) + 1e-8
        mean_pix = K.mean(K.sqrt(mean_sq_diff), keepdims=True)
        shape = K.shape(inputs)
        output = K.tile(mean_pix, [shape[0], shape[1], shape[2], 1])
        return K.concatenate([inputs, output], axis=-1)

    # define the output shape of the layer
    def compute_output_shape(self, input_shape):
        input_shape = list(input_shape)
        input_shape[-1] += 1
        return tuple(input_shape)
From: How to Train a Progressive Growing GAN in Keras for Synthesizing Faces
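A minimal usage sketch (assuming the usual imports, tensorflow.keras.layers as layers and tensorflow.keras.backend as K, and 4D image-style feature maps, since the layer tiles across three leading axes):

import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras import backend as K

inp = layers.Input(shape=(64, 64, 128))
x = MinibatchStdev()(inp)  # appends one stddev feature map to the channels
model = tf.keras.Model(inp, x)
print(model.output_shape)  # (None, 64, 64, 129)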
This is my proposal...
The problem is related to the minibatch_std_layer function. First of all, your network deals with 3D data while the original minibatch_std_layer deals with 4D data, so you need to adapt it. Secondly, the input variable defined in this function is unknown (also in the source code you cited), so I think the most obvious and logical solution is to consider it as the layer variable (the input of minibatch_std_layer). With this in mind, the modified minibatch_std_layer becomes:
def minibatch_std_layer(layer, group_size=4):
    group_size = K.minimum(4, layer.shape[0])
    shape = layer.shape
    minibatch = K.reshape(layer, (group_size, -1, shape[1], shape[2]))
    minibatch -= tf.reduce_mean(minibatch, axis=0, keepdims=True)
    minibatch = tf.reduce_mean(K.square(minibatch), axis=0)
    minibatch = K.square(minibatch + 1e-8)  # epsilon=1e-8
    minibatch = tf.reduce_mean(minibatch, axis=[1, 2], keepdims=True)
    minibatch = K.tile(minibatch, [group_size, 1, shape[2]])
    return K.concatenate([layer, minibatch], axis=1)
which we can put inside our model in this way:
def build_discriminator():
    # const = ClipConstraint(0.01)
    discriminator_input = Input(shape=(4000, 3), batch_size=32, name='discriminator_input')
    x = discriminator_input
    x = Conv1D(64, 3, strides=1, padding="same")(x)
    x = BatchNormalization()(x)
    x = LeakyReLU(0.3)(x)
    x = Dropout(0.25)(x)
    x = Conv1D(128, 3, strides=2, padding="same")(x)
    x = LeakyReLU(0.3)(x)
    x = Dropout(0.25)(x)
    x = Conv1D(256, 3, strides=3, padding="same")(x)
    x = LeakyReLU(0.3)(x)
    x = Dropout(0.25)(x)
    # Trying to add it to the feature map here
    x = Conv1D(256, 3, strides=3, padding="same")(x)
    x = Lambda(minibatch_std_layer)(x)
    x = Flatten()(x)
    x = Dense(1000)(x)
    discriminator_output = Dense(1, activation='sigmoid')(x)
    return Model(discriminator_input, discriminator_output, name='discriminator_model')
I don't know what ClipConstraint is, but it doesn't seem problematic. I ran the code with TF 2.2, but I also think it's quite easy to make it run with TF 1 (if you are using it). Here is the running code: https://colab.research.google.com/drive/1A6UNYkveuHPF7r4-XAe8MuCHZJ-1vcpl?usp=sharing