Why is the accuracy of my autoencoder zero? - python

I am using an autoencoder to reduce the dimensionality of a matrix, and I find that the reported accuracy is zero, which seems to indicate that the data reconstructed by the autoencoder is very different from the original data. Why?
Part of the code is shown below.
self.autoencoder = Model(inputs=self.input_factor, outputs=self.decoded)
self.autoencoder.compile(optimizer='adam', loss='mse', metrics=['accuracy'])
Additional details:
I'm working on a cancer dataset.
The input to the encoder is a matrix of shape (141, 39505).
The goal is to reduce the matrix to shape (141, 100).
The problem is that the accuracy reported for the autoencoder is always zero.
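A possible reading, offered as a sketch rather than a confirmed diagnosis: accuracy is a classification metric that counts matching labels, so it says little about how close a real-valued reconstruction is to its input. Regression-style measures such as MSE or R² are usually more informative here. The NumPy example below uses made-up data; `x`, `x_hat` and `reconstruction_report` are hypothetical names, and the small matrix only stands in for the (141, 39505) one.

```python
import numpy as np

def reconstruction_report(x, x_hat):
    """Return MSE and R^2 between an input matrix and its reconstruction."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    mse = float(np.mean((x - x_hat) ** 2))
    ss_res = float(np.sum((x - x_hat) ** 2))            # residual sum of squares
    ss_tot = float(np.sum((x - x.mean(axis=0)) ** 2))   # variance around per-feature means
    return {"mse": mse, "r2": 1.0 - ss_res / ss_tot}

rng = np.random.default_rng(0)
x = rng.normal(size=(141, 500))                      # hypothetical stand-in for the real matrix
x_hat = x + rng.normal(scale=0.05, size=x.shape)     # a reconstruction with small added noise
report = reconstruction_report(x, x_hat)
print(report)
```

In the Keras model from the question, the analogous change would be to track something like `metrics=['mae']` in `compile` instead of `'accuracy'`, and to judge the autoencoder by its reconstruction loss.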

Related

In UNet model of prediction to test dataset, how to get multiple inference probabilities for each pixel?

I have trained a U-Net model and used it to predict on a test dataset. The patch size is 256×256 in both training and testing, but at test time the stride is 64, so neighbouring patches overlap. After predicting all test patches, how do I merge the per-patch predictions into a single map of category labels in post-processing, especially in the overlapping regions?
How can I implement this in Python? Thanks.
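One common approach, sketched here under assumptions rather than taken from an answer in this thread: accumulate the per-class probabilities of every patch into a full-size buffer, divide by the number of patches covering each pixel, and take the argmax of the averaged probabilities. The function names and the 384×384 two-class example are hypothetical.

```python
import numpy as np

def patch_positions(height, width, patch=256, stride=64):
    """Top-left corners of a sliding window (assumes the window tiles the image exactly)."""
    rows = range(0, height - patch + 1, stride)
    cols = range(0, width - patch + 1, stride)
    return [(r, c) for r in rows for c in cols]

def stitch_probabilities(patch_probs, positions, image_shape, num_classes):
    """Average per-class probabilities of overlapping patches, then take the argmax.

    patch_probs: list of arrays shaped (patch_h, patch_w, num_classes)
    positions:   list of (row, col) top-left corners, one per patch
    """
    height, width = image_shape
    prob_sum = np.zeros((height, width, num_classes), dtype=np.float64)
    counts = np.zeros((height, width, 1), dtype=np.float64)
    for probs, (r, c) in zip(patch_probs, positions):
        ph, pw = probs.shape[:2]
        prob_sum[r:r + ph, c:c + pw] += probs
        counts[r:r + ph, c:c + pw] += 1.0
    mean_probs = prob_sum / np.maximum(counts, 1.0)  # guard against uncovered pixels
    labels = mean_probs.argmax(axis=-1)
    return mean_probs, labels

# Made-up probabilities cut into overlapping patches, then stitched back together.
rng = np.random.default_rng(0)
true_probs = rng.random((384, 384, 2))
positions = patch_positions(384, 384)
patches = [true_probs[r:r + 256, c:c + 256] for r, c in positions]
mean_probs, labels = stitch_probabilities(patches, positions, (384, 384), num_classes=2)
```

In practice `patches` would hold the softmax outputs of the trained network for each test patch. Averaging probabilities rather than voting on hard labels keeps the model's confidence in play where patches disagree.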

Can a trained ANN (tensorflow) model be made predictable?

I'm new to ANNs, but I've managed to train a convolutional model successfully (using some legacy TensorFlow v1 code) up to ~90% accuracy or so on my data. But when I evaluate (test) it on any given batch, the result is somewhat random, even though it's 90% correct. I've tried re-evaluating the data N times and averaging (with N between 1 and 25), but each evaluation still differs from the others on 3% to 10% of the data points.
Is there any way to make the evaluation predictable, so that evaluating an input batch X always yields the exact same result Y every time I run it (once training is done)?
I'm not sure if it's relevant, but my layers are batch normalized like so:
inp = tf.identity(inp)
channels = inp.get_shape()[-1]
offset = tf.compat.v1.get_variable(
    'offset', [channels],
    dtype=tf.float32,
    initializer=tf.compat.v1.zeros_initializer())
scale = tf.compat.v1.get_variable(
    'scale', [channels],
    dtype=tf.float32,
    initializer=tf.compat.v1.random_normal_initializer(1.0, 0.02))
mean, variance = tf.nn.moments(x=inp, axes=[0, 1], keepdims=False)
variance_epsilon = 1e-5
normalized = tf.nn.batch_normalization(
    inp, mean, variance, offset, scale, variance_epsilon=variance_epsilon)
The scale part is initialized with random data, but I assume that gets loaded when I do tf.compat.v1.train.Saver().restore(session, checkpoint_fname)?
I am assuming you are testing the model on your training batches?
You can't equate the accuracy of a portion of your total training dataset to the accuracy of the whole.
Think of it like a regression problem. If you only take a part of the dataset, there is no guarantee that it would average out close to the full dataset.
If you want consistent accuracy, evaluate on the full dataset.
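One further thing that may be worth checking (an assumption on my part, not something stated in the answer above): the snippet in the question computes the mean and variance with `tf.nn.moments` from the current batch. If that also happens at evaluation time, the output for a given sample depends on what else is in its batch, so results can shift whenever batches are composed differently between runs, for example through shuffling. The NumPy sketch below uses made-up data and hypothetical helper names to show the effect, and how frozen statistics remove it.

```python
import numpy as np

def batch_stat_norm(batch, eps=1e-5):
    """Normalize with the mean/variance of the current batch."""
    mean = batch.mean(axis=0)
    var = batch.var(axis=0)
    return (batch - mean) / np.sqrt(var + eps)

def fixed_stat_norm(batch, mean, var, eps=1e-5):
    """Normalize with statistics frozen after training (hypothetical saved values)."""
    return (batch - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
sample = rng.normal(size=(1, 4))
# The same sample placed in two batches with different companions.
batch_a = np.vstack([sample, rng.normal(size=(7, 4))])
batch_b = np.vstack([sample, rng.normal(loc=2.0, size=(7, 4))])

# With batch statistics, the sample's normalized value depends on its batch.
same_with_batch_stats = np.allclose(batch_stat_norm(batch_a)[0],
                                    batch_stat_norm(batch_b)[0])

# With frozen statistics, it does not.
frozen_mean, frozen_var = np.zeros(4), np.ones(4)
same_with_fixed_stats = np.allclose(fixed_stat_norm(batch_a, frozen_mean, frozen_var)[0],
                                    fixed_stat_norm(batch_b, frozen_mean, frozen_var)[0])
print(same_with_batch_stats, same_with_fixed_stats)
```

Other sources of run-to-run variation, such as dropout left active at test time or non-deterministic GPU kernels, are not covered by this sketch.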

FCN with patches creates boundary

I am trying to train a U-Net model to do per-pixel regression on images. To do this, I split my large image (1000x1000) into 200x200 pixel squares and use them to train an FCN model with a linear final layer. The loss function is MSE. In the prediction stage, I extract the same boxes, run the model on each, and stitch the results together to obtain the final output image. The problem is that there are discontinuities at the boundaries between boxes (I can clearly see the boxes).
I've tried to deal with this by feeding 250x250 boxes to my FCN and calculating the loss only on the 200x200 centre region. I do the same in the prediction stage: extract 250x250 patches, crop the 200x200 centre region, and stitch the image back together. Please see some code below:
Loss Function:
criterion = nn.MSELoss()
optimizer = optim.Adam(self.model.parameters(), lr=LR)
for inputs, labels in train_loader:
    inputs, labels = inputs.to(device), labels.to(device)
    optimizer.zero_grad()
    output = model(inputs)
    output = output.squeeze()
    _, dimx, dimy = output.shape
    loss = criterion(output[:, 25:dimx-25, 25:dimy-25], labels[:, 25:dimx-25, 25:dimy-25])
    loss.backward()
    optimizer.step()
My code for predictions is as follows:
pred = np.zeros((height, width))
for i in range(25, height, 200):
    for j in range(25, width, 200):
        patch = img[:, i-25:i+225, j-25:j+225]
        patch = torch.from_numpy(patch)
        patch = patch.unsqueeze(dim=0).to(device)
        out = model(patch)
        out = out[0, 0, 25:225, 25:225]
        pred[i:i+200, j:j+200] = out.cpu().numpy()
I'm not sure if my problem makes complete sense. I can provide more clarification if necessary but I have been stuck on this for a while now.
It makes sense to see discontinuities near the boundaries, because nothing in the training requires the network to produce smooth predictions across boxes.
I assume you have limited GPU memory, so you take only 200x200 pixels as input at a time. I would suggest the following two possible workarounds.
First, you could use torchvision.transforms.RandomCrop to generate 200x200 cropped regions as training inputs. At test time, you feed the whole image to the model directly. The intuition is that the model sees images at full resolution, the same as the test data, while consuming less GPU memory during training. In this case, you should also expect the model to need more time to learn all the patterns in the training data, because it only sees part of the data at a time.
Second, you could simply downsample the training data, say by 0.5x, and keep the output size at 1x. For example, in your case, after downsampling the input image to 200x200, the model takes it and predicts 1000x1000 pixel-level labels (you could use bilinear upsampling or deconv layers). This workaround has been used in some segmentation implementations (AdaptSeg, DISE).
After some troubleshooting I realized that I had this problem because I was performing batch normalization between each convolutional layer. Removing that step solved the discontinuity problem.
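Independently of the batch normalization fix, the stitching loop in the question starts at offset 25, so as far as I can tell the first 25 rows and columns of `pred` are never filled and tiles at the far edge get a truncated context window. A minimal NumPy sketch of the same crop-the-centre idea that covers the whole image by reflect-padding it first is given below; `predict_tiled` and `predict_fn` are hypothetical names, with `predict_fn` standing in for the model call.

```python
import numpy as np

def predict_tiled(img, predict_fn, tile=200, halo=25):
    """Predict a (C, H, W) image tile by tile, keeping only each tile's centre.

    predict_fn is a hypothetical callable mapping a (C, h, w) patch to an (h, w) map.
    """
    _, height, width = img.shape
    # Extra padding so that the image size becomes a multiple of the tile size.
    pad_h = (-height) % tile
    pad_w = (-width) % tile
    # Reflect-pad so every tile, including border tiles, has a full halo of context.
    padded = np.pad(img, ((0, 0), (halo, halo + pad_h), (halo, halo + pad_w)),
                    mode="reflect")
    pred = np.zeros((height + pad_h, width + pad_w), dtype=np.float32)
    for i in range(0, height + pad_h, tile):
        for j in range(0, width + pad_w, tile):
            patch = padded[:, i:i + tile + 2 * halo, j:j + tile + 2 * halo]
            out = predict_fn(patch)
            # Discard the halo and write only the centre region.
            pred[i:i + tile, j:j + tile] = out[halo:halo + tile, halo:halo + tile]
    return pred[:height, :width]

# Sanity check with an identity "model" that returns the first channel.
img = np.random.default_rng(0).random((3, 450, 430)).astype(np.float32)
stitched = predict_tiled(img, lambda patch: patch[0])
```

With a real PyTorch model, `predict_fn` would wrap the tensor conversion and a forward pass, ideally under `model.eval()` and `torch.no_grad()`.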

Simple autoencoder keeping constant tensor as predict in keras

I'm new to Keras and to deep learning. I want to build a dense vector for each document in my data, so I built a simple autoencoder using Keras.
The input data are embedded using Word2vec with an embedding size of 200, and all features are between -1 and 1. I prepared a 3D tensor that contains 137 samples (the number of documents) with 469 columns (the maximum number of words); the third dimension is the embedding size. I used the MSE loss function and a GRU as the recurrent layer. The autoencoder predicts the same vector for all documents, while the loss starts at a very low value and becomes constant after a few epochs.
I tried different numbers of epochs but got the same result. I also tried changing the batch size, with no change. Can anyone help me find the problem, please?
input = Input(shape=(469,200))
encoder = GRU(120,activation='sigmoid',dropout=0.2)(input)
neck = Dense(20)(encoder)
decoder1 = RepeatVector(469)(neck)
decoder1 = GRU(120,return_sequences=True,activation='sigmoid',dropout=0.2)(decoder1)
decoder1 = TimeDistributed(Dense(200,activation='tanh'))(decoder1)
model = Model(inputs=input, outputs=decoder1)
model.compile(optimizer='adam', loss='mse')
history = model.fit(x_train, x_train, validation_data=(x_test, x_test), epochs=10, batch_size=8)
This is the input data "x_train":
print(model.predict(x_train)) returns these values (the same vector for every sample):
Why does "model.predict(x_train)" return the same vector for all 137 samples?
Thank you in advance.
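A low, flat loss combined with identical outputs is consistent with the model having learned only the average of its inputs, although that is a guess rather than a confirmed diagnosis. One way to check is to compare the model's MSE with the trivial baseline that predicts the mean sequence for every sample. The sketch below uses made-up data; `collapse_report` is a hypothetical helper and the small tensor stands in for the (137, 469, 200) one.

```python
import numpy as np

def collapse_report(x, x_pred):
    """Compare a reconstruction with the trivial 'predict the mean' baseline."""
    x = np.asarray(x, dtype=float)
    x_pred = np.asarray(x_pred, dtype=float)
    model_mse = float(np.mean((x - x_pred) ** 2))
    baseline = x.mean(axis=0, keepdims=True)           # mean sequence over all samples
    baseline_mse = float(np.mean((x - baseline) ** 2))
    # Spread of predictions across samples; near 0 means every sample gets the same output.
    pred_spread = float(x_pred.std(axis=0).mean())
    return {"model_mse": model_mse, "baseline_mse": baseline_mse, "pred_spread": pred_spread}

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(137, 10, 5))              # small stand-in for the real tensor
# Simulate a collapsed model that outputs the same mean sequence for every sample.
collapsed = np.repeat(x.mean(axis=0, keepdims=True), len(x), axis=0)
report = collapse_report(x, collapsed)
print(report)
```

If `model_mse` from the real model is about the same as `baseline_mse` and `pred_spread` is close to zero, the network is not using its input in any meaningful way, which would point at the architecture or training setup rather than at the number of epochs or the batch size.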

Training E-net on human segmentation

I am trying to train a semantic-segmentation network (E-Net) in particular for high-quality human segmentation. For that, I have collected the "Supervisely Person" data-set and extracted the annotation masks using the provided API. This data-set holds high quality masks, thus I think it will provide better results in comparison to e.g. COCO data-set.
Supervisely - Example below : original image - ground truth.
First I want to give some details of the model. The network itself (Enet_arch) returns logits from the last convolution layer and probabilities which are produced through tf.nn.sigmoid(logits,name='logits_to_softmax').
I am using sigmoid cross-entropy between the ground truth and the returned logits, with momentum and exponential decay of the learning rate. The model instance and the training pipeline are as follows.
self.global_step = tf.Variable(0, name='global_step', trainable=False)
self.momentum = tf.Variable(0.9, trainable=False)
# introducing weight decay
#with slim.arg_scope(ENet_arg_scope(weight_decay=2e-4)):
self.logits, self.probabilities = Enet_arch(inputs=self.input_data, num_classes=self.num_classes, batch_size=self.batch_size) # returns logits (2d), probabilities (2d)
#self.gt is int32 with values 0 or 1 (coming from read_tfrecords.Read_TFRecords annotation images + placeholder defined to int)
self.gt = self.input_masks
# self.probabilities is output of sigmoid, pixel-wise between probablities [0, 1].
# self.predictions is filtered probabilities > 0.5 = 1 else 0
self.predictions = tf.to_int32(self.probabilities > 0.5)
# capture segmentation accuracy
self.accuracy, self.accuracy_update = tf.metrics.accuracy(labels=self.gt, predictions=self.predictions)
# losses and updates
# calculate cross entropy loss on logits
loss = tf.losses.sigmoid_cross_entropy(multi_class_labels=self.gt, logits=self.logits)
# add the loss to total loss and average (?)
self.total_loss = tf.losses.get_total_loss()
# decay_steps = depend on the number of epochs
self.learning_rate = tf.train.exponential_decay(self.starter_learning_rate, global_step=self.global_step, decay_steps=123893, decay_rate=0.96, staircase=True)
#Now we can define the optimizer
#optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate, epsilon=1e-8)
optimizer = tf.train.MomentumOptimizer(self.learning_rate, self.momentum)
#Create the train_op.
self.train_op = optimizer.minimize(loss, global_step=self.global_step)
I first tried to over-fit the model on a single image to see how much detail this network can capture. To increase the output quality I resized all the images to 1080p before feeding them to the network. In this trial I trained the network for 10K iterations and the total error reached ~30% (captured from tf.losses.get_total_loss()).
The results when training on a single image are pretty good, as you can see below.
Supervisely - Example below : (1) Loss (2) input (before resizing) | ground truth (before resizing) | 1080p out
Later, I tried to train on the whole data-set, but the training loss oscillates a lot. That means the network performs well on some images and poorly on others. As a result, after 743360 iterations (which is 160 epochs, since the training set holds 4646 images) I stopped training, since there is obviously something wrong with the hyper-parameters I selected.
Supervisely - Example below : (1) Loss (2) learning rate (3) input (before resizing) | ground truth (before resizing) | 1080p out
On the other hand, on some images from the training set the network produces fair (though not very good) results, like the one below.
Supervisely - Example below : input (before resizing) | ground truth (before resizing) | 1080p out
Why do I see such differences between these training instances? Are there any obvious changes that I should make to the model or to the hyper-parameters? Is it possible that this model is just not suitable for this use case (e.g. low network capacity)?
Thanks in advance.
It turns out that the problem here is indeed the E-Net architecture. I replaced it with DeepLabV3 and saw a big difference in loss behaviour and performance, even at low resolution!
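As a side note on the learning-rate schedule in the question, a quick check of the arithmetic may be useful. With `staircase=True`, the documented rule for `tf.train.exponential_decay` is the starting rate multiplied by `decay_rate` raised to the integer quotient of `global_step` and `decay_steps`. The sketch below plugs in the numbers from the question; the starting rate of 1e-3 is a hypothetical value, since `starter_learning_rate` is not given.

```python
def staircase_exponential_decay(starter_lr, global_step, decay_steps, decay_rate):
    """lr = starter_lr * decay_rate ** floor(global_step / decay_steps)."""
    return starter_lr * decay_rate ** (global_step // decay_steps)

starter_lr = 1e-3            # hypothetical; the question does not state starter_learning_rate
total_steps = 743360         # 160 epochs * 4646 images, from the question
decay_steps = 123893         # from the question
decay_rate = 0.96            # from the question

num_decays = total_steps // decay_steps
final_lr = staircase_exponential_decay(starter_lr, total_steps, decay_steps, decay_rate)
ratio = final_lr / starter_lr
print(num_decays, round(ratio, 4))
```

Under these settings the rate is reduced only six times over the whole run and ends at roughly 78% of its starting value, so the schedule is close to a constant learning rate. Whether that contributed to the oscillating loss cannot be determined from the information given.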
