Siamese network assigns same label to every couple of images

Siamese network assigns same label to every couple of images - python

I'm working on a siamese network to identify if the input images are the same picture or not.
The problem is that the network keep assigning the label 0 (rather than 0 and 1) resulting in 0.5 accuracy every time but I don't know why.
Here's the model:
def Model():
input_dim = (200, 200, 1)
img_a = Input(shape = input_dim)
img_b = Input(shape = input_dim)
base_net = build_base_network(input_dim)
features_a = base_net(img_a)
features_b = base_net(img_b)
distance = Lambda(euclidean_distance, output_shape = eucl_dist_output_shape)([features_a, features_b])
model = Model(inputs=[img_a, img_b], outputs=distance)
return model
And here's the distance
def contrastive_loss(y_true, y_pred):
margin = 1
return K.mean(y_true * K.square(y_pred) + (1 - y_true) * K.square(K.maximum(margin - y_pred, 0)))
#Optimizer
rms = RMSprop()
#Distance
def euclidean_distance(vects):
x, y = vects
return K.sqrt(K.sum(K.square(x - y), axis=1, keepdims=True))
def eucl_dist_output_shape(shapes):
shape1, shape2 = shapes
return (shape1[0], 1)
Training and X/y shapes
#Pair_equal --> every element is a tuple of numpy arrays representing same images
#Pair_diff --> every element is a tuple of numpy arrays representing different images
#y_equal --> for every Pair equal's element, contains 0
#y_diff --> for every Pair equal's element, contains 1
if len(Pair_equal) > len(Pair_diff):
Pair_equal = Pair_equal[0:len(Pair_diff)]
y_equal = y_equal[0:len(y_diff)]
elif len(Pair_equal) < len(Pair_diff):
Pair_diff = Pair_diff[0:len(Pair_equal)]
y_diff = y_diff[0:len(y_equal)]
Pair_equal = np.array(Pair_equal)
Pair_diff = np.array(Pair_diff)
y_equal = np.array(y_equal)
y_diff = np.array(y_diff)
X = np.concatenate([Pair_equal, Pair_diff], axis=0)
y = np.concatenate([y_equal, y_diff], axis=0)
y = y.reshape(-1, 1)
#index shuffling
indices = np.arange(X.shape[0])
np.random.shuffle(indices)
X = X[indices]
y = y[indices]
#X shape: (32, 2, 200, 200, 1)
#y shape: (32, 1)
return [X[:,0,...], X[:,1,...]], y
Thank you in advance for the help.

Related

Keras tensor operation dimensionality problem when implementing inductive bias

I am trying to implement the following inductive bias (the following is for a batch size/sample size of one, the first array dimension):
A function g takes five variables/features 100 times (second array dimension in X below). I then find the average of these hundred function evaluations and feed it into a second function f. Using the Keras-only API, I tried implementing this inductive bias in the following way:
def call(self, x):
y_i = self.g(x)[:,:, 0]
y = keras.backend.sum(y_i, axis=1, keepdims=True) / y_i.shape[1]
z = self.f(y)
return z[:, 0]
Sadly, I get the following error.
NotImplementedError: Unable to build a Value from a 'tuple' instance
which is a Keras problem when dimensionality does not match. If I write instead:
def call(self, x):
y_i = self.g(x)[:,:, 0]
return y_i
skipping f entirely, the whole model runs fine. I therefore suspect that my summing y = keras.backend.sum(y_i, axis=1, keepdims=True) / y_i.shape[1] is wrong.
For reproduction, here is the entire runnable code. You can skip import os and os.environ... if you are not on a Mac with PlaidML and AMD.
Code
# %%
import os
os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"
import numpy as np
import keras
def mlp2(size_in, size_out):
hidden = 128
inputs = keras.Input(shape=(size_in,))
x = keras.layers.Dense(hidden, name='layer1', activation='relu')(inputs)
x = keras.layers.Dense(hidden, name='layer2', activation='relu')(x)
x = keras.layers.Dense(hidden, name='layer3', activation='relu')(x)
outputs = keras.layers.Dense(size_out, name='layer4', activation='relu')(x)
m = keras.Model(inputs, outputs)
return m
class SumNet(keras.models.Model):
def __init__(self):
super(SumNet, self).__init__()
########################################################
# The same inductive bias as above!
self.g = mlp2(5, 1)
self.f = mlp2(1, 1)
def call(self, x):
y_i = self.g(x)[:,:, 0]
y = keras.backend.sum(y_i, axis=1, keepdims=True) / y_i.shape[1]
z = self.f(y)
return z[:,]
# %%
###### np.random.seed(0)
factoring = SumNet()
# check if there is an argument for a maximum learning rate, set to default 10^-3
# check difference of epochs vs total steps in pytorch "scheduler" object
optimizer = keras.optimizers.Adam(lr=1e-3)
factoring.compile(optimizer, loss=keras.losses.mean_squared_error)
# %%
N = 100000
Nt = 100
X = 6 * np.random.rand(N, Nt, 5) - 3
y_i = X[..., 0] ** 2 + 6 * np.cos(2 * X[..., 2])
y = np.sum(y_i, axis=1) / y_i.shape[1]
z = y ** 2
X.shape, y.shape
# create array along first dim of X
f_dim = np.arange(len(X))
training_indeces = np.random.choice(f_dim, int(.8*f_dim.shape[0]), replace=False)
# include_idx = set(training_indeces) #Set is more efficient, but doesn't reorder your elements if that is desireable
mask = np.array([(i in training_indeces) for i in np.arange(len(X))])
Xtrain = X[mask]
ztrain = z[mask]
Xtest = X[~mask]
ztest = z[~mask]
# %%
factoring.fit(Xtrain, ztrain, batch_size=64, epochs=3, validation_split=.05)
results = factoring.evaluate(Xtest, ztest, batch_size=64)
print("test loss, test acc:", results)

Visualizing the attention map of a multihead attention in ViT

I'm trying to visualize the attention map of mit Visual Transformer architecture in keras/tensorflow. For this I was able to implement the ViT model the following way:
def model():
input_layer = layers.Input(shape=input_shape)
#image_patches = create_patches(input_layer)
#print(input_layer.shape)
image_patches = Patches(patch_size)(input_layer)
#print(image_patches.shape)
encoded_patches = PatchEncoder(num_patch, projection_dim)(image_patches)
#print(encoded_patches.shape)
#for i in range(transformer_blocks):
x1 = layers.LayerNormalization()(encoded_patches)
x1 = layers.MultiHeadAttention(num_heads=num_heads, key_dim=projection_dim, name='MHA_1')(x1, x1)
x = layers.Add()([x1, encoded_patches])
x2 = layers.LayerNormalization()(x)
x2 = mlp_head(x2, transformer_units)
encoded_patches = layers.Add()([x2, x])
x = layers.LayerNormalization()(encoded_patches)
x = layers.Flatten()(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(2)(x)
model = tf.keras.Model(inputs=input_layer, outputs=x)
print(model.summary())
return model
I'm now trying to visualize the attention map based on an input image and my model output. For this I first try to predict the outcome and reshape the weights:
def attention_map(model, image):
size = model.input_shape[1]
grid_size = int(np.sqrt(model.layers[4].output_shape[-2] - 1))
# Prepare the input
X = preprocess_inputs(cv2.resize(image, (size, size)))#[np.newaxis, :] # type: ignore
# Get the attention weights from each transformer.
outputs = [
l.output[1] for l in model.layers if isinstance(l, layers.MultiHeadAttention)
]
weights = np.array(
tf.keras.models.Model(inputs=model.inputs, outputs=outputs).predict(X_test)
)
print(weights.shape)
num_layers = weights.shape[0]
num_heads = weights.shape[1]
reshaped = weights.reshape(
(num_layers, num_heads, grid_size ** 2 + 1, grid_size ** 2 + 1)
)
# From Appendix D.6 in the paper ...
# Average the attention weights across all heads.
reshaped = reshaped.mean(axis=1)
# From Section 3 in https://arxiv.org/pdf/2005.00928.pdf ...
# To account for residual connections, we add an identity matrix to the
# attention matrix and re-normalize the weights.
reshaped = reshaped + np.eye(reshaped.shape[1])
reshaped = reshaped / reshaped.sum(axis=(1, 2))[:, np.newaxis, np.newaxis]
# Recursively multiply the weight matrices
v = reshaped[-1]
for n in range(1, len(reshaped)):
v = np.matmul(v, reshaped[-1 - n])
# Attention from the output token to the input space.
mask = v[0, 1:].reshape(grid_size, grid_size)
mask = cv2.resize(mask / mask.max(), (image.shape[1], image.shape[0]))[
..., np.newaxis
]
return (mask * image).astype("uint8")
However my problem is now to reshape my weight matrix getting in mismatch. Can someone give me a hint on why this is occuring? A hint based on the output dimension given by
weights = np.array(
tf.keras.models.Model(inputs=model.inputs, outputs=outputs).predict(X_test)
)
would also help.

Dimensions must be equal issue

I created an Autoencoder model to minimise a PAPR and BER metrics in the same time i use Categorical Cross entropy for BER and a function 'PAPRLoss' for PAPR reduction i have a problem with the shapes
def paprLoss(y_true, y_pred):
sigR = tf.math.real(y_pred)
sigI = tf.math.imag(y_pred)
sigR = tf.reshape(sigR,(1,(72*btch)))
sigI = tf.reshape(sigI,(1,(72*btch)))
sigRI = tf.concat((sigR, sigI), 0)
yPower = K.sqrt(K.sum(K.square(sigRI), axis=-1))
yMax = K.max(yPower, axis=-1)
yMean = K.mean(yPower, axis=-1)
yPAPR = 10 * tf.experimental.numpy.log10(yMax/yMean)
return yMax
#generating data of size N
N = 1024000
label = np.random.randint(M,size=N)
# creating one hot encoded vectors
data = []
for i in label:
temp = np.zeros(M)
temp[i] = 1
data.append(temp)
data = np.array(data)
n_channel=2
R = k/n_channel
input_signal = Input(shape=(M,))
encoded = Dense(10*M, activation='relu')(input_signal)
encoded1 = Dense(10*M, activation='relu')(encoded)
encoded1 = Dense(n_channel, activation='linear')(encoded1)
encoded2=Lambda(lambda x:x / K.sqrt(K.mean(x**2)))(encoded1)#Average Power constraint
encoded3 = Lambda(lambda x: OFDM_mod(x, Mfft, CP))(encoded2)
SNRdB_train = 55
SNR_Linear = 10**(SNRdB_train/10)
encoded51 = Lambda(lambda x: OFDM_demod(x, Mfft, CP))(encoded3)
decoded0 = Dense(10*M,activation='relu')(encoded51)
decoded = Dense(10*M,activation='relu')(decoded0)
decoded1 = Dense(M, activation='softmax')(decoded)
autoencoder= Model(inputs=input_signal, outputs=[decoded1,encoded3], name="PAPRnet_Encoder")
autoencoder.compile(optimizer='adamax', loss=['categorical_crossentropy',paprLoss],loss_weights=[1.0,0.1], metrics=['accuracy'])
print (autoencoder.summary())
#training phase
history=autoencoder.fit(data, [data,data],
epochs=100,
batch_size=btch*Mfft,
validation_freq=1,)
I got the following error. Shapes of y_true and y_pred are not equal
Dimensions must be equal, but are 512000 and 8000 for '{{node Equal_1}} = Equal[T=DT_INT64, incompatible_shape_error=true](ArgMax_2, ArgMax_3)' with input shapes: [512000], [8000].

IndexError: Dimension out of range - PyTorch dimension expected to be in range of [-1, 0], but got 1

Despite already numerous answers on this very topic, failing to see in the example below (extract from https://gist.github.com/lirnli/c16ef186c75588e705d9864fb816a13c on Variational Recurrent Networks) which input and output dimensions trigger the error.
Having tried to change dimensions in torch.cat and also suppress the call to squeeze(), the error persists,
<ipython-input-51-cdc928891ad7> in generate(self, hidden, temperature)
56 x_sample = x = x_out.div(temperature).exp().multinomial(1).squeeze()
57 x = self.phi_x(x)
---> 58 tc = torch.cat([x,z], dim=1)
59
60 hidden_next = self.rnn(tc,hidden)
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
Thus how to shape the dimensions in x and z in tc = torch.cat([x,z], dim=1)?
Note the code as follows,
import torch
from torch import nn, optim
from torch.autograd import Variable
class VRNNCell(nn.Module):
def __init__(self):
super(VRNNCell,self).__init__()
self.phi_x = nn.Sequential(nn.Embedding(128,64), nn.Linear(64,64), nn.ELU())
self.encoder = nn.Linear(128,64*2) # output hyperparameters
self.phi_z = nn.Sequential(nn.Linear(64,64), nn.ELU())
self.decoder = nn.Linear(128,128) # logits
self.prior = nn.Linear(64,64*2) # output hyperparameters
self.rnn = nn.GRUCell(128,64)
def forward(self, x, hidden):
x = self.phi_x(x)
# 1. h => z
z_prior = self.prior(hidden)
# 2. x + h => z
z_infer = self.encoder(torch.cat([x,hidden], dim=1))
# sampling
z = Variable(torch.randn(x.size(0),64))*z_infer[:,64:].exp()+z_infer[:,:64]
z = self.phi_z(z)
# 3. h + z => x
x_out = self.decoder(torch.cat([hidden, z], dim=1))
# 4. x + z => h
hidden_next = self.rnn(torch.cat([x,z], dim=1),hidden)
return x_out, hidden_next, z_prior, z_infer
def calculate_loss(self, x, hidden):
x_out, hidden_next, z_prior, z_infer = self.forward(x, hidden)
# 1. logistic regression loss
loss1 = nn.functional.cross_entropy(x_out, x)
# 2. KL Divergence between Multivariate Gaussian
mu_infer, log_sigma_infer = z_infer[:,:64], z_infer[:,64:]
mu_prior, log_sigma_prior = z_prior[:,:64], z_prior[:,64:]
loss2 = (2*(log_sigma_infer-log_sigma_prior)).exp() \
+ ((mu_infer-mu_prior)/log_sigma_prior.exp())**2 \
- 2*(log_sigma_infer-log_sigma_prior) - 1
loss2 = 0.5*loss2.sum(dim=1).mean()
return loss1, loss2, hidden_next
def generate(self, hidden=None, temperature=None):
if hidden is None:
hidden=Variable(torch.zeros(1,64))
if temperature is None:
temperature = 0.8
# 1. h => z
z_prior = self.prior(hidden)
# sampling
z = Variable(torch.randn(z_prior.size(0),64))*z_prior[:,64:].exp()+z_prior[:,:64]
z = self.phi_z(z)
# 2. h + z => x
x_out = self.decoder(torch.cat([hidden, z], dim=1))
# sampling
x_sample = x = x_out.div(temperature).exp().multinomial(1).squeeze()
x = self.phi_x(x)
# 3. x + z => h
# hidden_next = self.rnn(torch.cat([x,z], dim=1),hidden)
tc = torch.cat([x,z], dim=1)
hidden_next = self.rnn(tc,hidden)
return x_sample, hidden_next
def generate_text(self, hidden=None,temperature=None, n=100):
res = []
hidden = None
for _ in range(n):
x_sample, hidden = self.generate(hidden,temperature)
res.append(chr(x_sample.data[0]))
return "".join(res)
# Test
net = VRNNCell()
x = Variable(torch.LongTensor([12,13,14]))
hidden = Variable(torch.rand(3,64))
output, hidden_next, z_infer, z_prior = net(x, hidden)
loss1, loss2, _ = net.calculate_loss(x, hidden)
loss1, loss2
hidden = Variable(torch.zeros(1,64))
net.generate_text()

The error
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
means that you're trying to access an index that doesn't exist in the tensor. For instance, the following code would cause the same IndexError you're experiencing.
# sample input tensors
In [210]: x = torch.arange(4)
In [211]: z = torch.arange(6)
# trying to concatenate along the second dimension
# but the tensors have only one dimension (i.e., `0`).
In [212]: torch.cat([x, z], dim=1)
So, one way to overcome this is to promote the tensors to higher dimensions before concatenation, if that is what you need.
# promoting tensors to 2D before concatenation
In [216]: torch.cat([x[None, :], z[None, :]], dim=1)
Out[216]: tensor([[0, 1, 2, 3, 0, 1, 2, 3, 4, 5]])
Thus, in your case, you've to analyze and understand what shape you need for x so that it can be concatenated with z along dimension 1 and then the tc passed as input to self.rnn() along with hidden.
As far as I can see, x[None, :] , z[None, :] should work.
Debugging for successful training
The code you posted has been written for PyTorch v0.4.1. A lot has changed in the PyTorch Python API since then, but the code was not updated.
Below are the changes you need to make the code run and train successfully. Copy the below functions and paste it at appropriate places in your code.
def generate(self, hidden=None, temperature=None):
if hidden is None:
hidden=Variable(torch.zeros(1,64))
if temperature is None:
temperature = 0.8
# 1. h => z
z_prior = self.prior(hidden)
# sampling
z = Variable(torch.randn(z_prior.size(0),64))*z_prior[:,64:].exp()+z_prior[:,:64]
z = self.phi_z(z)
# 2. h + z => x
x_out = self.decoder(torch.cat([hidden, z], dim=1))
# sampling
x_sample = x = x_out.div(temperature).exp().multinomial(1).squeeze()
x = self.phi_x(x)
# 3. x + z => h
x = x[None, ...] # changed here
xz = torch.cat([x,z], dim=1) # changed here
hidden_next = self.rnn(xz,hidden) # changed here
return x_sample, hidden_next
def generate_text(self, hidden=None,temperature=None, n=100):
res = []
hidden = None
for _ in range(n):
x_sample, hidden = self.generate(hidden,temperature)
res.append(chr(x_sample.data)) # changed here
return "".join(res)
for epoch in range(max_epoch):
batch = next(g)
loss_seq = 0
loss1_seq, loss2_seq = 0, 0
optimizer.zero_grad()
for x in batch:
loss1, loss2, hidden = net.calculate_loss(Variable(x),hidden)
loss1_seq += loss1.data # changed here
loss2_seq += loss2.data # changed here
loss_seq = loss_seq + loss1+loss2
loss_seq.backward()
optimizer.step()
hidden.detach_()
if epoch%100==0:
print('>> epoch {}, loss {:12.4f}, decoder loss {:12.4f}, latent loss {:12.4f}'.format(epoch, loss_seq.data, loss1_seq, loss2_seq)) # changed here
print(net.generate_text())
print()
Note: After these changes, the training loop at my end proceeds without any errors on PyTorch v1.7.1. Have a look at the comments with # changed here to understand the changes.

Developing an encoder/decoder for image modification

On the project I am currently working on, my goal is to train a neural network to convert images of circles to ellipses in a way that models convolution/blurring in real imaging processes.
What remains is to construct a neural network, preferably a CNN, that has the desired results - i.e. takes an image with circles as an input and returns an image with ellipses. However, I have not been able to do this. At best, neural nets (including CNNs) that I have used so far have at best returned blurred images of the circles. I can't tell whether it is the fault of the neural network or the fault of the preprocessing code I am using.
I am a beginner in machine learning, and I read that convolutional neural networks are the best architecture to use for image processing. Hence, I have attempted to develop a CNN for this purpose. Somebody suggested to use an encoder/decoder model for this problem, but I do not know how to do this.
#First, importing the necessary modules:
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Activation, Reshape
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
import numpy as np
import pandas as pd
from collections import OrderedDict
import itertools
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import random
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
import math
from math import sqrt
from keras.models import Model, load_model
#Next, creating and storing the input (circle) and output (ellipse) images:
def create_blank_image(size):
data = np.ndarray(shape=(size, size))
for i in range(0, size):
for j in range(0, size):
data[[i], [j]] = 0
#print(data)
return data
def circle_randomizer():
number_of_circles = random.randint(4,10)
intensity = np.ndarray(shape=(128, 128))
#print(number_of_circles)
radius_list = []
for i in range(number_of_circles):
radius_list.append(random.uniform(8, 10))
#print(radius_list)
center_coords = np.zeros((2,1))
center_coords[[0],[0]] = random.uniform(0,size)
center_coords[[1],[0]] = random.uniform(0,size)
for i in range(number_of_circles):
#temp_array = np.ndarray(shape=(2,1))
#temp_array[[0],[0]] = random.uniform(0,size)
#temp_array[[1],[0]] = random.uniform(0,size)
if i > 0:
j = 0
#print(i,j)
while j in range(i):
#print(i,j)
#print(center_coords)
temp_array = np.ndarray(shape=(2,1))
temp_array[[0],[0]] = random.uniform(0,size)
temp_array[[1],[0]] = random.uniform(0,size)
#while sqrt((center_coords[[0],[i]] - center_coords[[0],[j]])**2 + (center_coords[[1],[i]] - center_coords[[1],[j]])**2) < radius_list[i] + radius_list[j]:
while sqrt((temp_array[[0],[0]] - center_coords[[0],[j]])**2 + (temp_array[[1],[0]] - center_coords[[1],[j]])**2) < radius_list[i] + radius_list[j]:
temp_array[[0],[0]] = random.uniform(0,size)
temp_array[[1],[0]] = random.uniform(0,size)
j = 0
center_coords = np.concatenate((center_coords,temp_array), axis = 1)
j = j + 1
#print('loop ran ' + str(j) + ' times')
return radius_list, center_coords
def image_creator(centers, radii, img_data, size):
x = np.arange(1, size, 1)
y = np.arange(1, size, 1)
for c in range(len(centers)):
x0 = centers[[c],[0]]
y0 = centers[[c],[1]]
radius = radii[c]
for i in range(0, size-1):
for j in range(0, size-1):
height2 = radius**2 - (x[i]-x0)**2 - (y[j]-y0)**2
if height2 >= 0:
img_data[[i], [j]] = sqrt(radius**2 - (x[i]-x0)**2 - (y[j]-y0)**2)
return img_data
def make_ellipses(size, radii, center_coords):
# idea: use a random number generator to create a random rotation of the x,y axes for the ellipse
# size is the length of a side of the square
# length is the length of the ellipse
# defined as equal to the radius of the circle later
my_label = np.ndarray(shape=(size, size))
x = np.arange(1, size, 1)
y = np.arange(1, size, 1)
# inefficiently zero the array
for i in range(0, size):
for j in range(0, size):
my_label[[i], [j]] = 0
# print(my_label)
for c in range(len(center_coords)):
x0 = center_coords[[c],[0]]
y0 = center_coords[[c],[1]]
#theta = random.uniform(0, 6.28318)
theta = 0.775
for i in range(0, size - 1):
for j in range(0, size - 1):
xprime = (x[i] - x0) * math.cos(theta) + (y[j] - y0) * math.sin(theta)
yprime = -(x[i] - x0) * math.sin(theta) + (y[j] - y0) * math.cos(theta)
height2 = (0.5 * radii[c]) ** 2 - 0.25 * xprime ** 2 - yprime ** 2
if height2 >= 0:
my_label[[i], [j]] = sqrt((0.5 * radii[c]) ** 2 - 0.25 * xprime ** 2 - yprime ** 2)
return my_label
size = 128
radii, centers = circle_randomizer()
#print(radii)
#print(centers)
#Make labels and samples consistent with rest of code
N = 100
circle_images = []
ellipse_images = []
coords = []
for sample in range(0, N):
blank_image = create_blank_image(size)
radii, centers = circle_randomizer()
temp_image = image_creator(centers, radii, blank_image, size)
circle_images.append(temp_image)
temp_output = make_ellipses(size, radii, centers)
ellipse_images.append(temp_output)
coords.append(centers)
#print(labels)
#print(samples[0][40])
#Storing the images in files:
filenames = []
for i in range(0,N):
np.save('ellipses_' + str(i) + '.npy', ellipse_images[i])
filenames.append('ellipses_' + str(i) + '.npy')
np.save('circles_' + str(i) + '.npy', circle_images[i])
circles_stack = np.stack(circle_images,axis=0)
ellipses_stack = np.stack(ellipse_images,axis=0)
np.save('ellipses_stack.npy', ellipses_stack)
np.save('circles_stack.npy', circles_stack)
#Loading the images:
# load training images and corresponding "labels"
# training samples
training_images_path = 'circles_stack.npy'
labels_path = 'ellipses_stack.npy'
X = np.load(training_images_path,'r')/20.
y = np.load(labels_path,'r')/20.
#Defining the image preprocessing functions:
#(I'm not sure why preprocessing_X and preprocessing_y are different; this is
#code I've partially adopted from a research paper.)
# Preprocessing for training images
def preprocessing_X(image_data, image_size):
image_data = image_data.reshape(image_data.shape[0], image_size[0], image_size[1], 1)
image_data = image_data.astype('float32')
image_data = (image_data - np.amin(image_data))/(np.amax(image_data) - np.amin(image_data))
return image_data

# preprocessing for "labels" (ground truth)
def preprocessing_Y(image_data, image_size):
n_images = 0
label = np.array([])
for idx in range(image_data.shape[0]):
img = image_data[idx,:,:]
n, m = img.shape
img = np.array(OneHotEncoder(n_values=nb_classes).fit_transform(img.reshape(-1,1)).todense())
img = img.reshape(n, m, nb_classes)
label = np.append(label, img)
n_images += 1
label_4D = label.reshape(n_images, image_size[0], image_size[1], nb_classes)
return label_4D
Preprocessing the images:
# Split into train/test and make the shapes of tensors compatible with tensorflow format
nb_classes = 10
target_size = (128, 128)
#Below line randomizes which images are picked for train/test sets. ~20% will go to test.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 42)
X_train = preprocessing_X(X_train, target_size)
X_test = preprocessing_X(X_test, target_size)
y_train = preprocessing_Y(y_train, target_size)
y_test = preprocessing_Y(y_test, target_size)
#The encoder-decoder model that I'm using right now:
def model_shape(input_img, nb_classes = 2):
x = Convolution2D(2, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Convolution2D(4, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Convolution2D(4, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(2, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(nb_classes, (3, 3), activation = 'linear', padding='same')(x)
x = Convolution2D(nb_classes, (1, 1), activation = 'linear', padding='same')(x)
#x = Reshape((target_size[0] * target_size[1], nb_classes))(x)
output = Activation('softmax')(x)
return Model(input_img, output)
#Defining dice loss:
smooth = 1
def dice_coef(y_true, y_pred):
y_true_f = K.flatten(y_true)
y_pred_f = K.flatten(y_pred)
intersection = K.sum(y_true_f * y_pred_f)
return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)
def dice_coef_loss(y_true, y_pred):
return -dice_coef(y_true, y_pred)
#Compiling the model:
nb_classes = 2
input_img = Input(shape=(target_size[0], target_size[1], 1))
model = model_shape(input_img)
model.compile(optimizer='adam', loss=dice_coef_loss, metrics = [dice_coef])
callback_tb = TensorBoard(log_dir='/tmp/Deconvoluter', histogram_freq=0,
write_graph=True, write_images=False)
model.fit(X_train, y_train, epochs=10, batch_size=32,
validation_data=(X_test, y_test))
model.save("/content/artificial_label_train.h5")
model.save_weights("/content/artificial_label_train_weights.h5")
print('Saved model and weights to disk.\n')
The code will compile up until this point, but returns an error after the following block of code:
# Loading models and obtaining softmax output (pixel-wise predictions)
def get_decoded_imgs(input_imgs, filepath, nb_channels = 2):
model = load_model(filepath)
decoded_imgs = model.predict(input_imgs)
decoded_imgs = decoded_imgs.reshape(input_imgs.shape[0], target_size[0], target_size[1], nb_channels)
print("FCN output obtained\n")
return decoded_imgs
decoded_imgs = get_decoded_imgs(X_test, '/content/artificial_label_train.h5')
Outputs = {}
for i in range(X_test.shape[0]):
decoded_img = decoded_imgs[i,:,:,1]
#dictionary = OrderedDict()
Outputs[i] = decoded_img
print('Plotting the results...\n')
plt.figure(figsize=(16, 8), dpi = 96)
for i in range(1, 5):
FCN_output = Outputs[i-1]
ax = plt.subplot(3, 5, i)
plt.imshow(output,cmap='gray')
print(FCN_output.shape)
plt.title('output {0}'.format(i), fontsize = 10)
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.tight_layout()
print('output' + str(i) + '\n' + str(output))
print(np.max(y_test[i-1]),np.max(output))
plt.show()
The error is "Unknown loss function:dice_coef_loss" even though dice_coef_loss is clearly defined.
The input is supposed to be images of circles; the ground truth on which the neural network is trained is supposed to be images of ellipses.
The most recent encoder-decoder Keras model that I have used so far is shown in the code above; it returns ~55% val_dice_coeff, but this can probably be improved by adding more epochs. I used to use a simple CNN with mse loss which only returned lower-resolution images of the inputs.
What I am having problems with now is understanding why I am getting the error when I try to use model.predict() with dice loss as the loss function.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Siamese network assigns same label to every couple of images - python

Related

Keras tensor operation dimensionality problem when implementing inductive bias

Visualizing the attention map of a multihead attention in ViT

Dimensions must be equal issue

IndexError: Dimension out of range - PyTorch dimension expected to be in range of [-1, 0], but got 1

Developing an encoder/decoder for image modification

Categories

Resources