I am trying to use this DSNT-layer from GitHub:
https://github.com/ashwhall/dsnt/
The implementation seems to have a problem with placeholders derived from the input size and the batch size.
My understanding is that the batch size is usually unknown during graph construction, unless a fixed batch value is defined in the input layer, and only becomes known once training begins.
Based on the batch size, the dsnt layer creates tensors as seen below:
batch_count = tf.shape(norm_heatmap)[0]
height = tf.shape(norm_heatmap)[1]
width = tf.shape(norm_heatmap)[2]
# TODO scalars for the new coord system
scalar_x = ((2 * width) - (width + 1)) / width
scalar_x = tf.cast(scalar_x, tf.float32)
scalar_y = ((2 * height) - (height + 1)) / height
scalar_y = tf.cast(scalar_y, tf.float32)
# Build the DSNT x, y matrices
dsnt_x = tf.tile([[(2 * tf.range(1, width+1) - (width + 1)) / width]], [batch_count, height, 1]) # <-- point of error
dsnt_x = tf.cast(dsnt_x, tf.float32)
dsnt_y = tf.tile([[(2 * tf.range(1, height+1) - (height + 1)) / height]], [batch_count, width, 1])
dsnt_y = tf.cast(tf.transpose(dsnt_y, perm=[0, 2, 1]), tf.float32)
When I run this code, I get the following error message:
raise e.with_traceback(filtered_tb) from None
ValueError: Shape [1,2,3,4,5,...,64] is too large (more than 2**63 - 1 entries) for '{{node Placeholder}} = Placeholder[dtype=DT_INT32, shape=[1,2,3,4,5,..., 64]]()' with input shapes: .
I found answers on Stack Overflow recommending tf.shape to avoid problems with unknown dimensions, but that does not seem to be enough here.
If an input with shape (None, None, 1) is used, the code runs. It also runs on TensorFlow 2.5.3 or lower.
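My suspicion is that wrapping the dynamically shaped row vector in nested Python lists before tf.tile is what produces the bogus Placeholder. A rewrite along these lines (just a sketch of what I mean, not the library's code) would avoid the list wrapping:

# Sketch: build the row as a tensor and add the leading axes with tf.reshape,
# so no dynamic-shape tensor gets wrapped in Python lists
xs = (2.0 * tf.cast(tf.range(1, width + 1), tf.float32)
      - tf.cast(width + 1, tf.float32)) / tf.cast(width, tf.float32)
dsnt_x = tf.tile(tf.reshape(xs, [1, 1, -1]), [batch_count, height, 1])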
My Question:
How do I use unknown values that are only defined once training starts?
I attached a minimal example (Python 3.10, TensorFlow 2.8).
The input is an image of a certain size, e.g. 128x64x1, and the output is the normalized coordinate of the center of mass.
# models.py
import tensorflow
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D
from tensorflow.keras.models import Model
from tensorflow.keras.initializers import he_uniform
import dsnt

def Minimal_Model(_):
    input_shape = (128, 64, 1)
    X_input = Input(shape=input_shape)
    X_out = Conv2D(filters=1, kernel_size=(1, 1), strides=(1, 1), padding='valid',
                   name="conv", kernel_initializer=he_uniform())(X_input)
    norm_heatmap, coordinates = dsnt.dsnt(X_out)
    model = Model(inputs=X_input, outputs=coordinates, name='Test-DSNT')
    model.compile(optimizer=tensorflow.keras.optimizers.Adam(0.0001),
                  loss=[tf.keras.losses.MeanSquaredError(), tf.keras.losses.MeanSquaredError()],
                  metrics=[tf.keras.metrics.MeanSquaredError()])
    return model
import tensorflow as tf
from models import Minimal_Model
from keras_tuner.tuners import BayesianOptimization
import time
tf.get_logger().setLevel('DEBUG')
MAX_TRIALS = 10
EXECUTION_PER_TRIAL = 1
BATCH_SIZE = 8
EPOCHS = 10
LOG_DIR = 'results-random' + f"{int(time.time())}"
train_images = tf.random.uniform((2000, 64, 128, 1), minval=0, dtype=tf.float32, maxval=1)
test_images = tf.random.uniform((200, 64, 128, 1), minval=0, dtype=tf.float32, maxval=1)
train_labels = tf.random.uniform((2000, 2, 1), minval=0, dtype=tf.float32, maxval=1)
test_labels = tf.random.uniform((200, 2, 1), minval=0, dtype=tf.float32, maxval=1)
tuner = BayesianOptimization(
    Minimal_Model,
    seed=1,
    objective='val_mean_squared_error',
    max_trials=MAX_TRIALS,
    executions_per_trial=EXECUTION_PER_TRIAL,
    directory=LOG_DIR,
    project_name="project"
)
tuner.search(train_images, train_labels, epochs=EPOCHS, batch_size=BATCH_SIZE,
             validation_data=(test_images, test_labels),
             callbacks=[tf.keras.callbacks.EarlyStopping(monitor='val_mean_squared_error', restore_best_weights=True,
                                                         patience=3, mode='min')])
# Show a summary of the search
tuner.results_summary(num_trials=1)
Related
I'm completely new to Keras and AI. I have Keras 2.9 with Python 3.8.10 under Ubuntu 20.04. I have a model trained using two X inputs and a Y, and technically the training runs. Now I want to predict the Y using two inputs, but it fails. The training is done using this code fragment (I think only the input and output format is interesting here):
def generate(aBatchSize:int=32, aRepeatParameter:int=2, aPort:int=12345):
    dim = (512, 512)
    paraShape = (aRepeatParameter * 2,)
    def generator():
        while True:
            # fill variables
            yield ((xParameter, xImage), y)
    dataset = tensorflow.data.Dataset.from_generator(generator,
        output_signature=(
            (tensorflow.TensorSpec(shape=paraShape, dtype=tensorflow.float32),
             tensorflow.TensorSpec(shape=dim, dtype=tensorflow.float32)),
            tensorflow.TensorSpec(shape=(1), dtype=tensorflow.float32)
        ))
    dataset = dataset.batch(aBatchSize)
    return dataset
repeatParameter = 2
batchSize = 16
model.fit(landscapeGenerator.generate(batchSize, repeatParameter, port),
          validation_data=landscapeGenerator.generate(batchSize, repeatParameter, port),
          epochs=50, steps_per_epoch=math.ceil(sampleSize / batchSize), validation_steps=validationSize/batchSize)
Printing the model input and output from training code yields this:
model.input [<KerasTensor: shape=(None, 4) dtype=float32 (created by layer 'input_1')>, <KerasTensor: shape=(None, 512, 512, 1) dtype=float32 (created by layer 'input_2')>]
model.output KerasTensor(type_spec=TensorSpec(shape=(None, 1), dtype=tf.float32, name=None), name='dense_4/BiasAdd:0', description="created by layer 'dense_4'")
This is the failing inference code:
image = numpy.multiply(imageio.imread(filename), 1.0 / 255.0)
model = tensorflow.keras.models.load_model(modelDir)
repeatParameter = 2
paraShape = (repeatParameter * 2,)
parameter = numpy.empty(paraShape, dtype=float)
# fill parameters
tempDiff = 5.0 * model.predict((parameter, image))
It fails because predict does not understand that the model has two inputs; it treats the first axis of each array as the batch axis, so it sees 4 samples in parameter and 512 in image:
ValueError: Data cardinality is ambiguous:
x sizes: 4, 512
Make sure all arrays contain the same number of samples.
I also wanted to make the prediction using a generator, because there I know how to provide shape info, but with no success:
def generate(aParameter, aParaShape, aImage):
    dim = (512, 512)
    def generator():
        while True:
            yield (aParameter, aImage)
    dataset = tensorflow.data.Dataset.from_generator(generator,
        output_signature=(
            tensorflow.TensorSpec(shape=paraShape, dtype=tensorflow.float32),
            tensorflow.TensorSpec(shape=dim, dtype=tensorflow.float32)
        ))
    dataset = dataset.batch(1)
    return dataset
image = numpy.multiply(imageio.imread(filename), 1.0 / 255.0)
model = tensorflow.keras.models.load_model(modelDir)
repeatParameter = 2
paraShape = (repeatParameter * 2,)
parameter = numpy.empty(paraShape, dtype=float)
# fill parameters
tempDiff = 5.0 * model.predict(generate(parameter, paraShape, image), batch_size=1, steps=1)
This one complains: ValueError: Layer "model_2" expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, 4) dtype=float32>]
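One possible explanation (my guess): from_generator with a flat two-element output_signature yields plain 2-tuples, and Keras unpacks a 2-tuple dataset element as (x, y), so only the first tensor arrives as an input. Nesting the pair one level deeper should deliver both tensors as a single x; a sketch with the same names as above:

def generator():
    while True:
        # the extra tuple makes Keras treat (aParameter, aImage) as one x
        yield ((aParameter, aImage),)

dataset = tensorflow.data.Dataset.from_generator(generator,
    output_signature=(
        (tensorflow.TensorSpec(shape=paraShape, dtype=tensorflow.float32),
         tensorflow.TensorSpec(shape=dim, dtype=tensorflow.float32)),
    ))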
EDIT: current model generation
def createMlp(aRepeatParameter:int):
    vectorSize = aRepeatParameter * 2
    inputs = Input(shape=(vectorSize,))
    x = inputs
    # do not process now, raw data are better: x = Dense(vectorSize, activation="relu")(x)
    return Model(inputs, x)

def createCnn():
    filters = (256, 64, 16)
    inputShape = (512, 512, 1)
    chanDim = -1
    inputs = Input(shape=inputShape)
    x = inputs
    for (i, f) in enumerate(filters):
        x = Conv2D(f, (3, 3), padding="same")(x)
        x = Activation("relu")(x)
        x = BatchNormalization(axis=chanDim)(x)
        x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Flatten()(x)
    x = Dense(512, activation='relu')(x)
    x = Dense(16, activation='relu')(x)
    x = BatchNormalization(axis=chanDim)(x)
    x = Dropout(0.5)(x)
    x = Dense(4)(x)
    x = Activation("relu")(x)
    return Model(inputs, x)

repeatParameter:int = 2
mlp = createMlp(repeatParameter)
cnn = createCnn()
combinedInput = Concatenate(axis=1)([mlp.output, cnn.output])
x = Dense(4, activation="relu")(combinedInput)
x = Dense(1, activation="linear")(x)
model = Model(inputs=[mlp.input, cnn.input], outputs=x)
It turned out I needed to reshape my inputs, and even that had a typo in it. The working solution is:
def generate(aParameter, aParaShape, aImage):
    dim = (512, 512)
    def generator():
        while True:
            yield (aParameter, aImage)
    dataset = tensorflow.data.Dataset.from_generator(generator,
        output_signature=(
            tensorflow.TensorSpec(shape=paraShape, dtype=tensorflow.float32),
            tensorflow.TensorSpec(shape=dim, dtype=tensorflow.float32)
        ))
    dataset = dataset.batch(1)
    return dataset
image = numpy.multiply(imageio.imread(filename), 1.0 / 255.0)
model = tensorflow.keras.models.load_model(modelDir)
repeatParameter = 2
paraShape = (repeatParameter * 2,)
parameter = numpy.empty(paraShape, dtype=float)
# fill it
parameter = numpy.reshape(parameter, (1, 4))
image = numpy.reshape(image, (1, 512, 512))
tempDiff = 5.0 * model.predict([parameter, image], batch_size=1, steps=1)
I'm working on a project using Keras Model Subclassing to create a model with 2 inputs and 2 outputs. The training data for this model is essentially a dataset of other image classification datasets, with each image paired with its corresponding label; a dataset of datasets. One input of the network receives the label, the other receives the image.
train_img = generate_tensors(train, 0)
train_ans = generate_tensors(train, 1)
val_img = generate_tensors(val, 0)
val_ans = generate_tensors(val, 1)
train_img_b = train_img.batch(batch_size) # b for batched
train_ans_b = train_ans.batch(batch_size)
structuremodel = StructureModel()
hnet_output, anet_output = structuremodel([train_img_b, train_ans_b])
In the above code, I'm trying to perform a single forward propagation on my custom "StructureModel" class. "train_img" and "train_ans" are of shapes (None, 100, 224, 224, 1) and [insert shape] respectively. I have set the batch_size to 1.
The model itself is defined as follows:
class StructureModel(keras.Model):
    num_images = 100  # images per timestep
    resolution = [224, 224]
    hnet_pred_vars = 9
    anet_pred_vars = 25  # the thing on my whiteboard didn't include a stopping node
    alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!\"#$%&'()*+,-./:;<=>?#[\\]^_`{|}~ "
    def __init__(self):
        super().__init__()
        self.anet_layer = ArchitectureNet(self.anet_pred_vars)
    def call(self, inputs):
        # CNN-RNN/CNN-LSTM for processing images and corresponding answers
        # Copied VGG16 for structure
        # Image processing
        # shape=(timesteps, resolution, resolution, rgb channels)
        images = inputs[0]
        answers = inputs[1]
        x = TimeDistributed(Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu'))(images)
        x = TimeDistributed(Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu'))(x)
        x = TimeDistributed(MaxPooling2D(pool_size=(2, 2), strides=2))(x)
        filters_convs = [(128, 2), (256, 3), (512, 3), (512, 3)]
        for n_filters, n_convs in filters_convs:
            for _ in range(n_convs):
                x = TimeDistributed(Conv2D(filters=n_filters, kernel_size=(3, 3), padding='same', activation='relu'))(x)
            x = TimeDistributed(MaxPooling2D(pool_size=(2, 2), strides=2))(x)
        x = TimeDistributed(Flatten())(x)
        img_embed = TimeDistributed(Dense(units=1000), name='Image_Preprocessing')(x)
        # Answer embedding
        # Number of image-answer pairs, characters in answer, single character
        x = TimeDistributed(LSTM(units=500))(answers)  # All answers, shape (100, None, 95)
        answer_embed = TimeDistributed(Dense(units=1000), name='Answer_Preprocessing/Embed')(x)
        # Combines both models
        merge = Concatenate(axis=2)([img_embed, answer_embed])
        x = LSTM(units=100)(merge)
        dataset_embed = Dense(units=100, activation='relu', name='Dataset_Embed')(x)
        # hnet
        x = Dense(units=50)(dataset_embed)
        hnet_output = Dense(units=self.hnet_pred_vars, name='Hyperparameters')(x)
        # anet
        anet_output = self.anet_layer(dataset_embed)
        return hnet_output, anet_output
There's a lot of extra fluff in it, and I'm sure there are many other errors in the model, but the main one I care about is the TypeError that I keep receiving. Without resolving that, I can't get to debugging anything else. The error is as follows:
File ~\Documents\Programming\Python\HYPAT\NetworksV2.py:83 in call
x = TimeDistributed(Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu'))(images)
TypeError: Exception encountered when calling layer "structure_model_7" (type StructureModel).
'<' not supported between instances of 'NoneType' and 'int'
Call arguments received by layer "structure_model_7" (type StructureModel):
• inputs=['<BatchDataset element_spec=TensorSpec(shape=(None, 100, 224, 224, 1), dtype=tf.float32, name=None)>', '<BatchDataset element_spec=TensorSpec(shape=(None, 100, 2, 95), dtype=tf.float64, name=None)>']
If it would be of any use, here's the entirety of the code.
import keras
from keras.layers import TimeDistributed, Conv2D, Dense, MaxPooling2D, Flatten, LSTM, Concatenate
from tensorflow.keras.utils import plot_model
import pickle
import tqdm
import tensorflow as tf
from varname import nameof
# constants/hyperparameters
batch_size = 1
epochs = 10
train_test_split = 0.25
with open("datasets", "rb") as fp:
datasets = pickle.load(fp)
class ArchitectureNet(keras.layers.Layer):
    def __init__(self, anet_pred_vars, **kwargs):
        super().__init__()
        self.anet_pred_vars = anet_pred_vars
        self.concat = Concatenate(axis=1)
        self.dense1 = Dense(units=50, activation='relu')
        self.dense2 = Dense(units=50, activation='relu')
        self.anet_output = Dense(units=self.anet_pred_vars, name='Architecture')
        self.stopping_node = Dense(units=1, activation='sigmoid')
    def call(self, prev_output, dataset_embed):
        x = self.concat([prev_output, dataset_embed])
        x = self.dense1(x)
        x = self.dense2(x)
        anet_output = self.anet_output(x)
        stop_node_output = self.stopping_node(x)
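        # NOTE: tf.make_ndarray expects a TensorProto, not a Tensor; in eager mode
        # stop_node_output.numpy() would be the usual way to inspect the value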
        print(tf.make_ndarray(stop_node_output))
        return anet_output
class StructureModel(keras.Model):
    num_images = 100  # images per timestep
    resolution = [224, 224]
    hnet_pred_vars = 9
    anet_pred_vars = 25  # the thing on my whiteboard didn't include a stopping node
    alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!\"#$%&'()*+,-./:;<=>?#[\\]^_`{|}~ "
    def __init__(self):
        super().__init__()
        self.anet_layer = ArchitectureNet(self.anet_pred_vars)
    def call(self, inputs):
        # CNN-RNN/CNN-LSTM for processing images and corresponding answers
        # Copied VGG16 for structure
        # Image processing
        # shape=(timesteps, resolution, resolution, rgb channels)
        images = inputs[0]
        answers = inputs[1]
        x = TimeDistributed(Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu'))(images)
        x = TimeDistributed(Conv2D(filters=64, kernel_size=(3, 3), padding='same', activation='relu'))(x)
        x = TimeDistributed(MaxPooling2D(pool_size=(2, 2), strides=2))(x)
        filters_convs = [(128, 2), (256, 3), (512, 3), (512, 3)]
        for n_filters, n_convs in filters_convs:
            for _ in range(n_convs):
                x = TimeDistributed(Conv2D(filters=n_filters, kernel_size=(3, 3), padding='same', activation='relu'))(x)
            x = TimeDistributed(MaxPooling2D(pool_size=(2, 2), strides=2))(x)
        x = TimeDistributed(Flatten())(x)
        img_embed = TimeDistributed(Dense(units=1000), name='Image_Preprocessing')(x)
        # Answer embedding
        # Number of image-answer pairs, characters in answer, single character
        x = TimeDistributed(LSTM(units=500))(answers)  # All answers, shape (100, None, 95)
        answer_embed = TimeDistributed(Dense(units=1000), name='Answer_Preprocessing/Embed')(x)
        # Combines both models
        merge = Concatenate(axis=2)([img_embed, answer_embed])
        x = LSTM(units=100)(merge)
        dataset_embed = Dense(units=100, activation='relu', name='Dataset_Embed')(x)
        # hnet
        x = Dense(units=50)(dataset_embed)
        hnet_output = Dense(units=self.hnet_pred_vars, name='Hyperparameters')(x)
        # anet
        anet_output = self.anet_layer(dataset_embed)
        return hnet_output, anet_output
    def compile(self):
        super().compile()
# Reserve 10,000 samples for validation
ratio = int(train_test_split * len(datasets))
val = datasets[:ratio]
train = datasets[ratio:]
if len(val) == 0:  # look at me mom i'm a real programmer
    raise IndexError('List "x_val" is empty; "train_test_split" is set too small')
# Prepare the training and testing datasets
def generate_tensors(data, img_or_ans):  # 0 for image, 1 for ans
    # technically the images aren't ragged arrays, but for simplicity's sake we'll keep them all as ragged tensors
    column = [i[img_or_ans] for i in data]
    tensor_data = tf.ragged.constant(column)
    tensor_data = tensor_data.to_tensor()
    tensor_dataset = tf.data.Dataset.from_tensor_slices(tensor_data)
    return tensor_dataset
train_img = generate_tensors(train, 0)
train_ans = generate_tensors(train, 1)
val_img = generate_tensors(val, 0)
val_ans = generate_tensors(val, 1)
# TODO: Test if CIFAR 100 dataset (which has variable length answers) will work
#train_dataset = tf.data.Dataset.zip((train_img, train_ans))
#train_dataset = train_dataset.batch(batch_size)
train_img_b = train_img.batch(batch_size) # b for batched
train_ans_b = train_ans.batch(batch_size)
structuremodel = StructureModel()
hnet_output, anet_output = structuremodel([train_img_b, train_ans_b])
plot_model(StructureModel, to_file='aeu.png', show_shapes=True)
"""
for epoch in tqdm.trange(epochs, desc="Epochs"):
    # Iterate over the batches of the dataset.
    for step, (x_batch_train, y_batch_train) in tqdm(enumerate(train_dataset), leave=False):
        # Open a GradientTape to record the operations run
        # during the forward pass, which enables auto-differentiation.
        with tf.GradientTape() as tape:
            # Run the forward pass of the layer.
            # The operations that the layer applies
            # to its inputs are going to be recorded
            # on the GradientTape.
            # Logits for this minibatch
            logits = model(x_batch_train, training=True)
            # Compute the loss value for this minibatch.
            loss_value = loss_fn(y_batch_train, logits)
        # Use the gradient tape to automatically retrieve
        # the gradients of the trainable variables with respect to the loss.
        grads = tape.gradient(loss_value, model.trainable_weights)
        # Run one step of gradient descent by updating
        # the value of the variables to minimize the loss.
        optimizer.apply_gradients(zip(grads, model.trainable_weights))
        # Log every 200 batches.
        if step % 200 == 0:
            print(
                "Training loss (for one batch) at step %d: %.4f"
                % (step, float(loss_value))
            )
            print("Seen so far: %s samples" % ((step + 1) * batch_size))
"""
You cannot feed tf.data.Datasets directly to keras layers. Try this:
dataset1 = tf.data.Dataset.from_tensor_slices((tf.random.uniform((5, 100, 224, 224, 1)))).batch(1)
dataset2 = tf.data.Dataset.from_tensor_slices((tf.random.uniform((5, 100, 2, 95)))).batch(1)
structuremodel = StructureModel()
for (x1, x2) in zip(dataset1.take(1), dataset2.take(1)):
    hnet_output, anet_output = structuremodel([x1, x2])
Note, however, that StructureModel is buggy, but I'm sure you know that.
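If you want to iterate over more than a single batch, the two datasets can also be zipped, along the lines of the zip you already have commented out (a sketch):

# Sketch: iterate both inputs together instead of taking one batch from each
combined = tf.data.Dataset.zip((dataset1, dataset2))
for x1, x2 in combined:
    hnet_output, anet_output = structuremodel([x1, x2])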
I am getting a value error while trying to make a GAN work on RGB photos in TensorFlow.
In the video that I'm following it works in black and white (59:50): https://www.youtube.com/watch?v=LZov6445YAY&list=WL&index=4&t=3426s&ab_channel=SundogEducationwithFrankKane
I am trying to make it work with RGB color channels instead of black and white, but I get the error below.
I have changed:
tensor = tf.io.decode_image(dataset, channels=1, dtype=tf.dtypes.float32)
to:
tensor = tf.io.decode_image(dataset, channels=3, dtype=tf.dtypes.float32)
tensor = tf.io.decode_image(img, channels=1, dtype=tf.dtypes.float32)
to:
tensor = tf.io.decode_image(img, channels=3, dtype=tf.dtypes.float32)
dataset = np.reshape(dataset, (-1, 28, 28, 1))
to:
dataset = np.reshape(dataset, (-1, 28, 28, 3))
keras.layers.InputLayer(input_shape=(28, 28, 1)),
to:
keras.layers.InputLayer(input_shape=(28, 28, 3)),
Full error:
Traceback (most recent call last):
File "C:\Users\m8\Desktop\idek1.py", line 153, in <module>
dLoss = trainDStep(batch)
File "C:\Users\m8\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "C:\Users\m8\AppData\Local\Temp\__autograph_generated_filegoihnx0n.py", line 15, in tf__trainDStep
x = ag__.converted_call(ag__.ld(tf).concat, ([ag__.ld(data), ag__.ld(fake)],), dict(axis=0), fscope)
ValueError: in user code:
File "C:\Users\m8\Desktop\idek1.py", line 94, in trainDStep *
x = tf.concat([data, fake], axis=0)
ValueError: Dimension 2 in both shapes must be equal, but are 3 and 1. Shapes are [28,28,3] and [28,28,1]. for '{{node concat_1}} = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32](data, sequential/conv2d_transpose_2/Sigmoid, concat_1/axis)' with input shapes: [16,28,28,3], [16,28,28,1], [] and with computed input tensors: input[2] = <0>.
Full code:
import tensorflow as tf
import os
import pathlib
import numpy as np
tf.random.set_seed(1)
print(len(tf.config.list_physical_devices("GPU")))
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.compat.v1.Session(config=config)
data_dir = "C:/Users/m8/Desktop/test_fro_tensorflow/train/training/"
dataset = tf.io.read_file("C:/Users/m8/Desktop/test_fro_tensorflow/val/validation/img (1).jpg")
tensor = tf.io.decode_image(dataset, channels=3, dtype=tf.dtypes.float32)
tensor = tf.image.resize(tensor, [28, 28])
dataset = tf.expand_dims(tensor, axis=0)
for file in os.listdir(data_dir):
    f = os.path.join(data_dir, file)
    full_path = data_dir + file
    img = tf.io.read_file(full_path)
    tensor = tf.io.decode_image(img, channels=3, dtype=tf.dtypes.float32)
    tensor = tf.image.resize(tensor, [28, 28])
    new_tensor = tf.expand_dims(tensor, axis=0)
    dataset = np.concatenate([dataset, new_tensor])
print(dataset.shape)
dataset = np.expand_dims(dataset, -1).astype("float32") / 255
BATCH_SIZE = 16
dataset = np.reshape(dataset, (-1, 28, 28, 3))
dataset = tf.data.Dataset.from_tensor_slices(dataset)
dataset = dataset.shuffle(buffer_size=1024).batch(BATCH_SIZE)
from tensorflow import keras
from tensorflow.keras import layers
NOISE_DIM = 150
generator = keras.models.Sequential([
    keras.layers.InputLayer(input_shape=(NOISE_DIM,)),
    layers.Dense(7*7*256),
    layers.Reshape(target_shape=(7, 7, 256)),
    layers.Conv2DTranspose(256, 3, activation="LeakyReLU", strides=2, padding="same"),
    layers.Conv2DTranspose(128, 3, activation="LeakyReLU", strides=2, padding="same"),
    layers.Conv2DTranspose(1, 3, activation="sigmoid", padding="same"),
])
generator.summary()
discriminator = keras.models.Sequential([
    keras.layers.InputLayer(input_shape=(28, 28, 3)),
    layers.Conv2D(256, 3, activation="relu", strides=2, padding="same"),
    layers.Conv2D(128, 3, activation="relu", strides=2, padding="same"),
    layers.Dense(64, activation="relu"),
    layers.Flatten(),
    layers.Dropout(0.2),
    layers.Dense(1, activation="sigmoid")
])
discriminator.summary()
optimizerG = keras.optimizers.Adam(learning_rate=0.001, beta_1=0.5)
optimizerD = keras.optimizers.Adam(learning_rate=0.003, beta_1=0.5)
lossFn = keras.losses.BinaryCrossentropy(from_logits=True)
gAccMetric = tf.keras.metrics.BinaryAccuracy()
dAccMetric = tf.keras.metrics.BinaryAccuracy()
@tf.function
def trainDStep(data):
    batchSize = tf.shape(data)[0]
    noise = tf.random.normal(shape=(batchSize, NOISE_DIM))
    y_true = tf.concat(
        [
            tf.ones(batchSize, 1),
            tf.zeros(batchSize, 1)
        ],
        axis=0
    )
    with tf.GradientTape() as tape:
        fake = generator(noise)
        x = tf.concat([data, fake], axis=0)
        y_pred = discriminator(x)
        discriminatorLoss = lossFn(y_true, y_pred)
    grads = tape.gradient(discriminatorLoss, discriminator.trainable_weights)
    optimizerD.apply_gradients(zip(grads, discriminator.trainable_weights))
    dAccMetric.update_state(y_true, y_pred)
    return {
        "discriminator_loss": discriminatorLoss,
        "discriminator_accuracy": dAccMetric.result()
    }
@tf.function
def trainGStep(data):
    batchSize = tf.shape(data)[0]
    noise = tf.random.normal(shape=(batchSize, NOISE_DIM))
    y_true = tf.ones(batchSize, 1)
    with tf.GradientTape() as tape:
        y_pred = discriminator(generator(noise))
        generatorLoss = lossFn(y_true, y_pred)
    grads = tape.gradient(generatorLoss, generator.trainable_weights)
    optimizerG.apply_gradients(zip(grads, generator.trainable_weights))
    gAccMetric.update_state(y_true, y_pred)
    return {
        "generator_loss": generatorLoss,
        "generator_accuracy": gAccMetric.result()
    }
from matplotlib import pyplot as plt
def plotImages(model):
    images = model(np.random.normal(size=(4, NOISE_DIM)))
    plt.figure(figsize=(9, 9))
    for i, image in enumerate(images):
        plt.subplot(2, 2, i+1)
        plt.imshow(np.squeeze(image, -1), cmap="Greys_r")
        plt.axis("off")
    plt.show()
for epoch in range(50):
    dLossSum = 0
    gLossSum = 0
    dAccSum = 0
    gAccSum = 0
    cnt = 0
    for batch in dataset:
        dLoss = trainDStep(batch)
        dLossSum += dLoss["discriminator_loss"]
        dAccSum += dLoss["discriminator_accuracy"]
        gLoss = trainGStep(batch)
        gLossSum += dLoss["discriminator_loss"]
        gAccSum += dLoss["discriminator_accuracy"]
        cnt += 1
    print("E:{}, Loss G:{:0.4f}, Loss D:{:0.4f}, Acc G:%{:0.2f}, Acc D:%{:0.2f}".format(
        epoch,
        gLossSum/cnt,
        dLossSum/cnt,
        100 * gAccSum/cnt,
        100 * dAccSum/cnt
    ))
    if epoch % 2 == 0:
        plotImages(generator)
Let us start by inspecting your error alongside the code you have provided.
x = tf.concat([data, fake], axis=0)
ValueError: Dimension 2 in both shapes must be equal, but are 3 and 1. Shapes are [28,28,3] and [28,28,1]. for '{{node concat_1}} = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32](data, sequential/conv2d_transpose_2/Sigmoid, concat_1/axis)' with input shapes: [16,28,28,3], [16,28,28,1], [] and with computed input tensors: input[2] = <0>.
The main takeaway from this error is that data and fake cannot be concatenated, because their shapes don't match. As the error states:
data has the shape [28,28,3] (which is expected, as you have made the changes for RGB inputs)
fake has the shape [28,28,1], which is not the same as the shape of data
Our solution is to fix the fake variable's shape to match that of data.
We see that fake is created in the code in the line
fake = generator(noise)
And generator is defined as
generator = keras.models.Sequential([
keras.layers.InputLayer(input_shape=(NOISE_DIM,)),
...
...
layers.Conv2DTranspose(1, 3, activation="sigmoid", padding="same"),
])
The last layer of the generator seems to be a Conv2DTranspose but it is using just 1 output channel (the first argument). Here is our error!
To fix it, we just need to change the layer to output 3 channels rather than 1:
layers.Conv2DTranspose(3, 3, activation="sigmoid", padding="same"),
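One more spot worth checking (an observation beyond the quoted error, so treat it as an assumption): plotImages squeezes the channel axis, which only works for single-channel images. With a 3-channel generator, the plotting loop would need something like:

# Sketch: show the RGB image directly; np.squeeze(image, -1) fails on 3 channels
plt.imshow(image)
plt.axis("off")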
I've been using VGG16 to create a model that can classify images into two categories, which works perfectly fine. Now I want to create a function to localize anomalies through heat maps, as shown here: https://towardsdatascience.com/anomaly-detection-in-images-777534980aeb
Unfortunately that's not working for me: the array containing my test image either has the wrong size (ValueError: Error when checking input: expected input_1 to have 4 dimensions, but got array with shape (1, 1, 224, 224, 3) or ValueError: Error when checking input: expected input_1 to have 4 dimensions, but got array with shape (224, 224, 3)), or I get ValueError: cannot reshape array of size 0 into shape (224,512). Here you can see a code snippet:
[...]
test_image_generator = ImageDataGenerator(preprocessing_function=preprocess_input)
test_data_gen = test_image_generator.flow_from_directory(batch_size=batch_size,
                                                         directory=test_dir,
                                                         shuffle=False,
                                                         target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                         class_mode='binary')
# create the model/ import vgg16
vgg_conv = vgg16.VGG16(weights='imagenet', include_top=False, input_shape = (224, 224, 3))
# Freeze the layers except the last 4 layers
for layer in vgg_conv.layers[:-8]:
    layer.trainable = False
# Check the trainable status of the individual layers
for layer in vgg_conv.layers:
    print(layer, layer.trainable)
# modify vgg structure
x = vgg_conv.output
x = GlobalAveragePooling2D()(x)
x = Dense(1, activation="sigmoid")(x)
model = Model(vgg_conv.input, x)
model.compile(loss = "binary_crossentropy", optimizer = optimizers.SGD(lr=0.00001, momentum=0.9), metrics=["accuracy"])
model.summary()
[...]
### CREATE FUNCTION TO DRAW ANOMALIES ###
def plot_activation(img):
    pred = model.predict(img[:,:,:,:])
    pred_class = np.argmax(pred)
    weights = model.layers[-1].get_weights()[0]  # weights of the last classification layer
    class_weights = weights[:, pred_class]
    intermediate = Model(model.input, model.get_layer("block5_conv3").output)
    conv_output = intermediate.predict(img)
    conv_output = np.squeeze(conv_output)
    h = int(img.shape[0] / conv_output.shape[0])
    w = int(img.shape[1] / conv_output.shape[1])
    activation_maps = sp.ndimage.zoom(conv_output, (h, w, 1), order=1)
    out = np.dot(activation_maps.reshape((img.shape[0] * img.shape[1], 512)),
                 class_weights).reshape(img.shape[0], img.shape[1])
    plt.imshow(img.astype('float32').reshape(img.shape[0], img.shape[1], 3))
    plt.imshow(out, cmap='jet', alpha=0.35)
    plt.title('Crack' if pred_class == 1 else 'No Crack')
    # return out, pred_class
img_width, img_height = 224, 224
img = image.load_img('/Volumes/test_image.jpg', target_size = (img_width, img_height))
img = image.img_to_array(img)
img = np.expand_dims(img, axis = 0)
plot_activation(img)
With that code I'm getting the following error message:
Traceback (most recent call last):
File "test.py", line 285, in <module>
plot_activation(img)
File "test.py", line 251, in plot_activation
out = np.dot(activation_maps.reshape((img.shape[0] * img.shape[1], 512)), class_weights).reshape(img.shape[0],
ValueError: cannot reshape array of size 0 into shape (224,512)
I've tried to resize my array, but probably haven't done that the right way, as I'm only getting different errors (see above). Maybe it's got something to do with how I've been preprocessing the images, but I'm not sure. Can anyone tell me how to fix this?
Additionally, do I need to undo the preprocessing to plot the heat maps correctly? If so, how can I do that?
If there are any questions left, feel free to ask; I'm happy to answer.
EDIT:
This is my adjusted code, which now works fine:
### CREATE FUNCTION TO DRAW ANOMALIES ###
def plot_activation(img):
    pred = model.predict(img[np.newaxis,:,:,:])
    # pred_class = np.argmax(pred)
    pred_class = np.argmax(pred, axis=-1)
    weights = model.layers[-1].get_weights()[0]  # weights of the last classification layer
    class_weights = weights[:, pred_class]
    intermediate = Model(model.input, model.get_layer("block5_conv3").output)
    conv_output = intermediate.predict(img[np.newaxis,:,:,:])
    conv_output = np.squeeze(conv_output)
    h = int(img.shape[0] / conv_output.shape[0])
    w = int(img.shape[1] / conv_output.shape[1])
    activation_maps = sp.ndimage.zoom(conv_output, (h, w, 1), order=1)
    out = np.dot(activation_maps.reshape((img.shape[0] * img.shape[1], 512)),
                 class_weights).reshape(img.shape[0], img.shape[1])
    plt.imshow(img.astype('float32').reshape(img.shape[0], img.shape[1], 3))
    plt.imshow(out, cmap='jet', alpha=0.35)
    plt.title('Crack' if pred_class == 1 else 'No Crack')
    plt.show()
    # return out, pred_class
test_images = test_data_gen[0][0][0]
plot_activation(test_images)
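On the remaining question of undoing the preprocessing for display: vgg16.preprocess_input uses caffe-style preprocessing (RGB to BGR plus ImageNet mean subtraction), so an approximate inverse looks like this (a sketch, not verified against my pipeline):

def deprocess_vgg16(x):
    # Approximate inverse of vgg16.preprocess_input (caffe mode):
    # add back the ImageNet per-channel means, then flip BGR -> RGB
    x = x.copy()
    x[..., 0] += 103.939
    x[..., 1] += 116.779
    x[..., 2] += 123.68
    x = x[..., ::-1]
    return np.clip(x, 0, 255).astype('uint8')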
I'm new to NNs and trying to create a simple NN for image understanding.
I tried using the triplet loss method, but I keep getting errors that made me think I'm missing some fundamental concept.
My code is:
def triplet_loss(x):
    anchor, positive, negative = tf.split(x, 3)
    pos_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)), 1)
    neg_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)), 1)
    basic_loss = tf.add(tf.subtract(pos_dist, neg_dist), ALPHA)
    loss = tf.reduce_mean(tf.maximum(basic_loss, 0.0), 0)
    return loss
def build_model(input_shape):
    K.set_image_data_format('channels_last')
    positive_example = Input(shape=input_shape)
    negative_example = Input(shape=input_shape)
    anchor_example = Input(shape=input_shape)
    embedding_network = create_embedding_network(input_shape)
    positive_embedding = embedding_network(positive_example)
    negative_embedding = embedding_network(negative_example)
    anchor_embedding = embedding_network(anchor_example)
    merged_output = concatenate([anchor_embedding, positive_embedding, negative_embedding])
    loss = Lambda(triplet_loss, (1,))(merged_output)
    model = Model(inputs=[anchor_example, positive_example, negative_example],
                  outputs=loss)
    model.compile(loss='mean_absolute_error', optimizer=Adam())
    return model
def create_embedding_network(input_shape):
    input_shape = Input(input_shape)
    x = Conv2D(32, (3, 3))(input_shape)
    x = PReLU()(x)
    x = Conv2D(64, (3, 3))(x)
    x = PReLU()(x)
    x = Flatten()(x)
    x = Dense(10, activation='softmax')(x)
    model = Model(inputs=input_shape, outputs=x)
    return model
Every image is read using:
imageio.imread(imagePath, pilmode="RGB")
And the shape of each image:
(1024, 1024, 3)
Then I use my own triplet method (just creating triplets of anchor, positive and negative):
triplets = get_triplets(data)
triplets.shape
The shape is (number of examples, triplet, x_image, y_image, number of channels (RGB)):
(20, 3, 1024, 1024, 3)
Then I use the build_model function:
model = build_model((1024, 1024, 3))
And the problem starts here:
model.fit(triplets, y=np.zeros(len(triplets)), batch_size=1)
When I run this line of code to train my model, I get this error:
For more details, my code is in this Colab notebook.
The pictures I used can be found in this Drive folder.
For this to run seamlessly, place that folder under
My Drive/Colab Notebooks/images/
For anyone else struggling:
My problem was actually the dimension of each observation. The fix was changing the dimension of each input, as suggested in the comments, to
(?, 1024, 1024, 3)
The Colab notebook has been updated with the solution.
P.S. - I also changed the size of the pictures to 256 x 256 so that the code runs much faster on my PC.
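To make the dimension fix concrete, here is a sketch of the reshaping (assuming triplets has shape (N, 3, 1024, 1024, 3) and the model's input order is anchor, positive, negative, as in build_model):

# Sketch: slice the triplet axis so each of the three inputs has shape (?, 1024, 1024, 3)
anchors = triplets[:, 0]
positives = triplets[:, 1]
negatives = triplets[:, 2]
model.fit([anchors, positives, negatives], y=np.zeros(len(triplets)), batch_size=1)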