Weird Discrepancies in Layer Shapes when Calling Model - python

I am trying to use the output of a variational autoencoder to aid in classifying images. I have pre-trained the autoencoder and am now loading its weights in another script so that I can use the encoder for prediction. When I call the encoder on a sample, I get an error saying the shapes are incompatible:
ValueError: Input 0 of layer dense is incompatible with the layer: expected axis -1 of input shape to have value 1048576 but received input with shape (256, 8192)
This is confusing because the model pre-trained without problems and I instantiate it here exactly as I did before (I copied and pasted the code). I have based my model on this YouTube tutorial.
I will also paste in my code:
########## Library Imports ##########
import os, sys
import tensorflow as tf
import numpy as np
from tensorflow.keras.layers import Conv2D, Input, Flatten, Dense, Lambda, Reshape, Conv2DTranspose
import keras
import keras.backend as K
from keras.models import Model
from PIL import Image
print(tf.version.VERSION)
img_height = 256 #chosen
img_width = 256
num_channels = 1 #grayscale
input_shape = (img_height, img_width, num_channels)
########## Load VAE Weights ##########
vae_path = os.path.join(os.getcwd(), 'vae_training')
checkpoint_path = os.path.join(vae_path, 'cp.ckpt')
print('vae_path listdir\n', os.listdir(vae_path))
#load patches
#patch_locs = sys.argv[1] #path to the patch folders
patch_locs = r'C:\Users\Daniel\Documents\GitHub\endo_git_v2\patches\single_wsi_for_local_parent'
patch_folders = os.listdir(patch_locs)
print(patch_folders)
########## INSTANTIATE MODEL AND LOAD WEIGHTS ##########
# REPARAMETERIZATION TRICK
# Define a sampling function to sample from the latent distribution.
# Reparameterize the sample based on the process defined by Gunderson and Huang:
# z = mu + exp(log_var / 2) * eps, i.e. mu + sigma * eps
# This keeps the sampling step differentiable so gradients can be estimated through it.
def sample_z(args):
    z_mu, z_sigma = args
    z_mu = tf.cast(z_mu, dtype=tf.float32)
    z_sigma = tf.cast(z_sigma, dtype=tf.float32)
    eps = K.random_normal(shape=(K.shape(z_mu)[0], K.int_shape(z_mu)[1]))
    out = z_mu + K.exp(z_sigma / 2) * eps
    return out
#Define custom loss
#VAE is trained using two loss functions reconstruction loss and KL divergence
#Let us add a class to define a custom layer with loss
class CustomLayer(keras.layers.Layer):
    def vae_loss(self, x, z_decoded):
        x = K.flatten(x)
        z_decoded = K.flatten(z_decoded)
        # Reconstruction loss (as we used sigmoid activation we can use binarycrossentropy)
        recon_loss = keras.metrics.binary_crossentropy(x, z_decoded)
        recon_loss = tf.cast(recon_loss, dtype=tf.float32)
        # KL divergence
        kl_loss = -5e-4 * K.mean(1 + z_sigma - K.square(z_mu) - K.exp(z_sigma), axis=-1)
        kl_loss = tf.cast(kl_loss, dtype=tf.float32)
        return K.mean(recon_loss + kl_loss)
    # add custom loss to the class
    def call(self, inputs):
        x = inputs[0]
        z_decoded = inputs[1]
        loss = self.vae_loss(x, z_decoded)
        self.add_loss(loss, inputs=inputs)
        return x
# # ================= #############
# # Encoder
#Let us define 4 conv2D, flatten and then dense
# # ================= ############
latent_dim = 256 # Number of latent dim parameters
input_img = Input(shape=input_shape, name='encoder_input')
print(input_img.shape)
x = Conv2D(32, 3, padding='same', activation='relu')(input_img)
print(x.shape)
x = Conv2D(64, 3, padding='same', activation='relu',strides=(2, 2))(x)
print(x.shape)
x = Conv2D(64, 3, padding='same', activation='relu')(x)
print(x.shape)
x = Conv2D(64, 3, padding='same', activation='relu')(x)
print(x.shape)
conv_shape = K.int_shape(x) #Shape of conv to be provided to decoder (taken after all the conv layers)
print(conv_shape)
#Flatten
x = Flatten()(x)
print(x.shape)
x = Dense(32, activation='relu')(x)
print(x.shape)
# Two outputs, for latent mean and log variance (std. dev.)
#Use these to sample random variables in latent space to which inputs are mapped.
z_mu = Dense(latent_dim, name='latent_mu')(x) #Mean values of encoded input
z_sigma = Dense(latent_dim, name='latent_sigma')(x) #Std dev. (variance) of encoded
z_mu = tf.cast(z_mu, dtype=tf.float32)
z_sigma = tf.cast(z_sigma, dtype=tf.float32)
print('z_mu.dtype:', z_mu.dtype)
print('z_sigma.dtype:', z_sigma.dtype)
# sample vector from the latent distribution
# z is the labda custom layer we are adding for gradient descent calculations
# using mu and variance (sigma)
z = Lambda(sample_z, output_shape=(latent_dim, ), name='z')([z_mu, z_sigma])
print('z.dtype:', z.dtype)
#Z (lambda layer) will be the last layer in the encoder.
# Define and summarize encoder model.
encoder = Model(input_img, [z_mu, z_sigma, z], name='encoder')
print(encoder.summary())
# ================= ###########
# Decoder
#
# ================= #################
# decoder takes the latent vector as input
decoder_input = Input(shape=(latent_dim, ), name='decoder_input')
# Need to start with a shape that can be remapped to original image shape as
#we want our final utput to be same shape original input.
#So, add dense layer with dimensions that can be reshaped to desired output shape
x = Dense(conv_shape[1]*conv_shape[2]*conv_shape[3], activation='relu')(decoder_input)
# reshape to the shape of last conv. layer in the encoder, so we can
x = Reshape((conv_shape[1], conv_shape[2], conv_shape[3]))(x)
# upscale (conv2D transpose) back to original shape
# use Conv2DTranspose to reverse the conv layers defined in the encoder
x = Conv2DTranspose(32, 3, padding='same', activation='relu',strides=(2, 2))(x)
#Can add more conv2DTranspose layers, if desired.
#Using sigmoid activation
x = Conv2DTranspose(num_channels, 3, padding='same', activation='sigmoid', name='decoder_output')(x)
# Define and summarize decoder model
decoder = Model(decoder_input, x, name='decoder')
decoder.summary()
# apply the decoder to the latent sample
z_decoded = decoder(z)
# apply the custom loss to the input images and the decoded latent distribution sample
y = CustomLayer()([input_img, z_decoded])
# y is basically the original image after encoding input img to mu, sigma, z
# and decoding sampled z values.
#This will be used as output for vae
vae = Model(input_img, y, name='vae')
# Compile VAE
vae.compile(optimizer='adam', loss=None, experimental_run_tf_function=False)
vae.summary()
model_weights_dir = r'C:\Users\Daniel\Documents\GitHub\endo_git_v2\vae_training'
checkpoint_path = os.path.join(model_weights_dir, 'cp.ckpt')
print(os.listdir(model_weights_dir))
#vae.load_weights(checkpoint_path)
##################################################################
########## Open all WSI, then Open all Patches ##########
#for wsi in patch_folders: #loops through all the wsi folders
wsi = patch_folders[0]
#start of wsi loop
print('wsi:', wsi)
current_wsi_directory = os.path.join(patch_locs, wsi) #take the current wsi
print('current_wsi_directory:', current_wsi_directory)
patches = os.listdir(current_wsi_directory)
latent_shape = (203, 147, 256)
latent_wsi = np.zeros(latent_shape) #initialized placeholders for latent representations
row = 0
col = 0
for i in range(1):#len(patches)): #should be 29841 every time
    #load patch as numpy array
    patch_path = os.path.join(current_wsi_directory, '{}_{}.jpeg'.format(wsi, i)) #numerical order not alphabetical
    print('patch_path:', patch_path)
    image = Image.open(patch_path)
    data = np.asarray(image)
    #emulate rescale of 1/.255
    data = data / 255.
    data = np.expand_dims(data, axis=-1)
    print('data.shape:', data.shape)
    encoder(data, training=False)
Any help or tips are very much appreciated

I solved my issue. Long story short, it was a silly mistake on my part: I was passing in a NumPy array of shape (256, 256, 1), so the batch dimension was missing. Reshaping it to (1, 256, 256, 1), where the leading 1 is the batch dimension, fixed the error.
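The fix can be sketched with NumPy alone. The random array below is a hypothetical stand-in for one loaded patch, and the variable names are illustrative rather than taken from the original script. The arithmetic at the end shows where the 1048576 in the error message comes from; the reading of the (256, 8192) shape is an assumption about how the unbatched array was interpreted, not something the error message states.

```python
import numpy as np

# Hypothetical stand-in for one grayscale patch scaled to [0, 1).
patch = np.random.rand(256, 256).astype("float32")

# Channel axis only: this is the shape that triggered the error.
sample = np.expand_dims(patch, axis=-1)      # (256, 256, 1)

# Add the batch axis in front so the encoder sees a batch of one image.
batch = np.expand_dims(sample, axis=0)       # (1, 256, 256, 1)

# Flattened size the first Dense layer was built for: the single stride-2
# convolution halves height and width, and the last conv has 64 filters.
expected_features = (256 // 2) * (256 // 2) * 64

# Assumption: without a batch axis, the first 256 is read as the batch size,
# leaving a 256 x 1 "image" per sample, which flattens to 128 * 1 * 64.
observed_features = (256 // 2) * 1 * 64

print(sample.shape, batch.shape)
print(expected_features, observed_features)
```

With the batch axis in place, the call in the question would become encoder(batch, training=False).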

Related

Masking input for ConvLSTM1D

I am doing a binary regression problem using keras.
The input shape is: (None, 2, 94, 3) (channels is the last dimension)
I have the following architecture:
input1 = Input(shape=(time, n_rows, n_channels))
masking = Masking(mask_value=-999)(input1)
convlstm = ConvLSTM1D(filters=16, kernel_size=15,
data_format='channels_last',
activation="tanh")(masking)
dropout = Dropout(0.2)(convlstm)
flatten1 = Flatten()(dropout)
outputs = Dense(n_outputs, activation='sigmoid')(flatten1)
model = Model(inputs=input1, outputs=outputs)
model.compile(loss=keras.losses.BinaryCrossentropy(),
optimizer=tf.keras.optimizers.Adam(learning_rate=0.01))
However when training I get this error: Dimensions must be equal, but are 94 and 80 for '{{node conv_lstm1d/while/SelectV2}} = SelectV2[T=DT_FLOAT](conv_lstm1d/while/Tile, conv_lstm1d/while/mul_5, conv_lstm1d/while/Placeholder_2)' with input shapes: [?,94,16], [?,80,16], [?,80,16].
If I remove the masking layer this error disappears, what is the masking doing that triggers this error? Also the only way I was able to run the above architecture was with a kernel_size of 1.
Seems like the ConvLSTM1D layer needs a mask with the shape (samples, timesteps) according to the docs. The mask you are calculating has the shape (samples, time, rows). Here is one solution to fix your problem but I am not sure if it is the 'correct' way to go:
import tensorflow as tf
input1 = tf.keras.layers.Input(shape=(2, 94, 3))
masking = tf.keras.layers.Masking(mask_value=-999)(input1)
convlstm = tf.keras.layers.ConvLSTM1D(filters=16, kernel_size=15,
data_format='channels_last',
activation="tanh")(inputs = masking, mask = tf.reduce_all(masking._keras_mask, axis=-1))
dropout = tf.keras.layers.Dropout(0.2)(convlstm)
flatten1 = tf.keras.layers.Flatten()(dropout)
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(flatten1)
model = tf.keras.Model(inputs=input1, outputs=outputs)
model.compile(loss=tf.keras.losses.BinaryCrossentropy(),
optimizer=tf.keras.optimizers.Adam(learning_rate=0.01))
This line mask = tf.reduce_all(masking._keras_mask, axis=-1) essentially reduces your mask to (samples, timesteps) by applying an AND operation to the last dimension of the mask. Alternatively, you could just create your own custom mask layer:
import tensorflow as tf
class Reduce(tf.keras.layers.Layer):
    def __init__(self):
        super(Reduce, self).__init__()
    def call(self, inputs):
        return tf.reduce_all(tf.reduce_any(tf.not_equal(inputs, -999), axis=-1, keepdims=False), axis=1)
input1 = tf.keras.layers.Input(shape=(2, 94, 3))
reduce_layer = Reduce()
boolean_mask = reduce_layer(input1)
convlstm = tf.keras.layers.ConvLSTM1D(filters=16, kernel_size=15,
data_format='channels_last',
activation="tanh")(inputs = input1, mask = boolean_mask)
dropout = tf.keras.layers.Dropout(0.2)(convlstm)
flatten1 = tf.keras.layers.Flatten()(dropout)
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(flatten1)
model = tf.keras.Model(inputs=input1, outputs=outputs)
model.compile(loss=tf.keras.losses.BinaryCrossentropy(),
optimizer=tf.keras.optimizers.Adam(learning_rate=0.01))
print(model.summary(expand_nested=True))
x = tf.random.normal((50, 2, 94, 3))
y = tf.random.uniform((50, ), maxval=3, dtype=tf.int32)
model.fit(x, y)
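To see what the mask reduction does to the shapes, here is a minimal NumPy sketch of the same logic. It does not call Keras; it assumes the Masking layer marks a position as valid when any value along the last axis differs from the mask value, and the array contents are made up for illustration.

```python
import numpy as np

MASK_VALUE = -999.0

# Made-up batch: 4 samples, 2 timesteps, 94 rows, 3 channels.
x = np.random.rand(4, 2, 94, 3)
x[0, 1] = MASK_VALUE          # sample 0: second timestep fully masked
x[1, 0, :10] = MASK_VALUE     # sample 1: first timestep partially masked

# Assumed Masking behaviour: valid where any channel differs from the mask value.
per_row_mask = np.any(x != MASK_VALUE, axis=-1)       # (samples, time, rows)

# AND over rows, as in tf.reduce_all(..., axis=-1): (samples, time)
timestep_mask = np.all(per_row_mask, axis=-1)

print(per_row_mask.shape, timestep_mask.shape)
print(timestep_mask[:2])
```

Note that with an AND reduction a timestep is dropped as soon as one of its rows is fully masked, which is why the partially masked timestep of sample 1 also comes out as False. An OR reduction (np.any here) would keep it; which one is appropriate depends on the data.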

ValueError: `decode_predictions` expects a batch of predictions

I have the following code, which tries to run predictions on part of a ResNet model. However, I am getting an error.
def layer_input_shape(Model, layer_index):
    input_shape = np.array(Model.layers[layer_index - 1].output_shape)
    input_shape = np.ndarray.tolist(np.delete(input_shape, 0))
    return input_shape
def resnet50_Model(Model, trainable=True):
    input_shape = layer_input_shape(Model, 1)
    input = tf.keras.layers.Input(shape=input_shape)
    first_layer = Model.layers[0]
    first_layer.trainable = trainable
    out = first_layer(input)
    for i in range(1, 12):
        layer_i = Model.layers[i]
        layer_i.trainable = trainable
        out = layer_i(out)
    out = Conv2D(filters=2, kernel_size=2, strides=(2,2), activation='relu')(out)
    out = Flatten()(out)
    out = Dense(units=2,activation='softmax')(out)
    result_model = tf.keras.models.Model(inputs=[input], outputs=out)
    return result_model
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
img='/content/elephant.jpg'
img = image.load_img(img, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = resnet_skip_model.predict(x)
print('Predicted:', decode_predictions(preds, top=3)[0])
I get the error below:
ValueError: `decode_predictions` expects a batch of predictions (i.e. a 2D array of shape (samples,
1000)). Found array with shape: (1, 3)
I had added a final Dense layer with two units so that the model predicts only two classes, but decode_predictions expects the 1000-class output of the last Dense layer. I therefore changed units from 2 to 1000:
out = Dense(units=1000,activation='softmax')(out)
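If the goal is to keep a two-class head rather than widen it to 1000 units, the predictions can be decoded by hand instead of with decode_predictions. The sketch below is one way to do that with NumPy; decode_custom and the class names are hypothetical and not part of Keras.

```python
import numpy as np

def decode_custom(preds, class_names, top=2):
    """Return the top (name, score) pairs per sample for a custom class list."""
    preds = np.asarray(preds)
    if preds.ndim != 2 or preds.shape[1] != len(class_names):
        raise ValueError(
            "expected shape (samples, %d), got %s" % (len(class_names), preds.shape)
        )
    results = []
    for row in preds:
        # Indices of the highest scores, best first.
        top_idx = np.argsort(row)[::-1][:top]
        results.append([(class_names[i], float(row[i])) for i in top_idx])
    return results

# Made-up softmax output for one image and two placeholder class names.
preds = np.array([[0.2, 0.8]])
decoded = decode_custom(preds, ["class_a", "class_b"])
print(decoded[0])
```

This keeps the output layer matched to the real number of classes, whereas a 1000-unit head only makes the ImageNet label lookup succeed.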

Why does my DataGenerator iterate over more data than the size of the dataset and raise IndexError: list index out of range?

I'm trying to implement a network with Keras and the TensorFlow back-end, using a transfer learning model (VGG16). My dataset is a medical imaging dataset, so instead of a single image I have a series of slices per exam. The dataset is organized in a folder, and each series is a np.array() of size (nb_slices, 512, 512, 3).
My dataset is composed of 1130 train samples and 120 validation samples, so I don't think the data is the problem.
I created a DataGenerator to load my image series into the model without a batch-size problem (I used this Training a Keras model from batches of .npy files using generator? to make my generator class), and I reshaped my volumes to size (nb_slices, 224, 224, 3).
Then I used transfer learning and customized a VGG16 network with one more convolution layer, MaxPooling, Flatten, Dense, Dropout and a final Dense layer.
When I start training there seems to be no problem, but at some point it raises IndexError: list index out of range. I saw that the DataGenerator iterates over more items than the dataset contains, but I don't know why.
Which part could cause this?
Here is my DataGenerator
INPUT_DIM = 224
MAX_PIXEL_VAL = 255
MEAN = 58.09
STDDEV = 49.73
class DataGenerator(keras.utils.Sequence):
    def __init__(self, file_list, labels, data_loc):
        self.listIDs = file_list
        self.labels = labels
        self.data_loc = data_loc
        self.on_epoch_end()
    def __len__(self):
        return int(len(self.listIDs))
    def __getitem__(self, index):
        indexes = self.indexes[index:(index + 1)]
        list_IDS_temp = [self.listIDs[k] for k in indexes]
        X, y = self.__data_generation(list_IDS_temp)
        return X, y
    def on_epoch_end(self):
        self.indexes = np.arange(len(self.listIDs))
    def __data_generation(self, list_IDS_temp):
        for ID in list_IDS_temp:
            vol = np.load(self.data_loc + ID + '.npy')
            nb_slices = vol.shape[0]
            pad = int((vol.shape[2] - INPUT_DIM) / 2)
            vol = vol[:, pad:-pad, pad:-pad]
            # standardize
            vol = (vol - np.min(vol)) / (np.max(vol) - np.min(vol)) * MAX_PIXEL_VAL
            # normalize
            vol = (vol - MEAN) / STDDEV
            # convert to RGB
            vol = np.stack((vol,) * 3, axis=3)
            y = np.empty(nb_slices, dtype=int)
            # y = self.labels[ID]
            for i in range(nb_slices):
                y[i] = self.labels[int(ID)]
        return vol, keras.utils.to_categorical(y, num_classes=2)
and my model:
train_set = DataGenerator(df_train['exams'].tolist(), df_train['labels'].tolist(), all_file_loc_train)
valid_set = DataGenerator(df_val['exams'].tolist(), df_val['labels'].tolist(), all_file_loc_val)
model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3)) # , input_shape=(224, 224, 3)
layer_dict = dict([(layer.name, layer) for layer in model.layers])
x = layer_dict['block2_pool'].output
x = Conv2D(filters=64, kernel_size=(3, 3), activation='relu')(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
x = Flatten()(x)
x = Dense(256, activation='relu')(x)
x = Dropout(0.4)(x)
x = Dense(2, activation='softmax')(x)
custom_model = Model(inputs=model.input, outputs=x)
for layer in custom_model.layers[:7]:
layer.trainable = False
custom_model.compile(loss='categorical_crossentropy', optimizer=optimizers.SGD(lr=0.0001, momentum=0.9),
metrics=["accuracy"])
results = custom_model.fit_generator(generator=train_set, validation_data=valid_set, epochs=50, verbose=2)
I expected an accuracy of around 80-90%, but something seems to go wrong in my DataGenerator and I don't know what. Any help would be appreciated.

keras model equivalent of tf.depth_to_space

I want to accomplish the equivalent of tf.depth_to_space in a Keras model. Specifically, the data in the Keras model is shaped H x W x 4 (i.e., depth of 4) and I want to permute the data so that the output is sized 2H x 2W x 1, with the mapping done by viewing the 4 input channels as 2x2 blocks; i.e.,
input location is y, x, k
output location is 2*y+(k//2), 2*x+(k%2), 1
I know that I can get the correct shape with:
outputs = keras.layers.Reshape((H*2,W*2,1), input_shape=(H,W,4))(inputs)
But I think that the mapping will be
input location is y, x, k
Linear_address is y*W*4+x*4+k
output location is Linear_address//(W*2), Linear_address % (W*2), 1
which is not what I want.
I tried using tf.depth_to_space directly:
outputs = tf.depth_to_space(inputs, 2)
but that led to an error:
TypeError: Output tensors to a Model must be Keras tensors. Found Tensor("DepthToSpace:0", shape=(?, 1024, 1024, 1), dtype=float32)
The problem can be seen with this simple function:
def simple_net(H=512, W=512):
    inputs = keras.layers.Input((H, W, 4))
    # gets the correct shape but not the correct order
    outputs = keras.layers.Reshape((H*2,W*2,1), input_shape=(H,W,4))(inputs)
    # Run time error message
    #outputs = tf.depth_to_space(output_planes, 2)
    model = keras.models.Model(inputs, outputs)
    return model
You should use a Keras Lambda layer:
from keras.layers import Lambda
import tensorflow as tf
Subpixel_layer = Lambda(lambda x:tf.nn.depth_to_space(x,scale))
x = Subpixel_layer(inputs=x)
MINIMAL MODEL
import tensorflow as tf
from keras.layers import Input, Lambda, Conv2D
from keras.models import Model
# `in` is a reserved word in Python, so the input tensor is named `inp`
inp = Input(shape=(32, 32, 3))
x = Conv2D(32, (3,3), activation='relu')(inp)
x = Conv2D(32, (3,3), activation='relu')(x)
# block size 2 needs a channel count divisible by 4 (32 here)
sub_layer = Lambda(lambda x: tf.nn.depth_to_space(x, 2))
x = sub_layer(x)
model = Model(inputs=inp, outputs=x)
# model.compile(optimizer = Adam(), loss = mean_squared_error)
model.summary()
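The difference between a plain reshape and the block mapping described in the question can be checked on a tiny array. The following is a NumPy sketch of the depth-to-space rearrangement for channels-last data, not the TensorFlow implementation itself; depth_to_space_nhwc is an illustrative helper name.

```python
import numpy as np

def depth_to_space_nhwc(x, block=2):
    """NumPy sketch of depth-to-space for NHWC arrays (illustrative helper)."""
    b, h, w, c = x.shape
    out_c = c // (block * block)
    # Split channels into (dy, dx, out_c), then interleave dy with rows
    # and dx with columns.
    x = x.reshape(b, h, w, block, block, out_c)
    x = x.transpose(0, 1, 3, 2, 4, 5)
    return x.reshape(b, h * block, w * block, out_c)

H, W = 2, 2
inputs = np.arange(H * W * 4).reshape(1, H, W, 4)

shuffled = depth_to_space_nhwc(inputs)        # each pixel becomes a 2x2 block
plain = inputs.reshape(1, H * 2, W * 2, 1)    # same shape, different order

print(shuffled[0, :, :, 0])
print(plain[0, :, :, 0])
```

Both results have shape (1, 2H, 2W, 1), but only the first places input element (y, x, k) at output position (2*y + k//2, 2*x + k%2), which is the mapping asked for.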

TensorFlow model gets zero loss

import tensorflow as tf
import numpy as np
import os
import re
import PIL
def read_image_label_list(img_directory, folder_name):
    # Input:
    # -Name of folder (test/train)
    # Output:
    # -List of names of files in folder
    # -Label associated with each file
    cat_label = 1
    dog_label = 0
    filenames = []
    labels = []
    dir_list = os.listdir(os.path.join(img_directory, folder_name)) # List of all image names in 'folder_name' folder
    # Loop through all images in directory
    for i, d in enumerate(dir_list):
        if re.search("train", folder_name):
            if re.search("cat", d): # If image filename contains 'cat', then true
                labels.append(cat_label)
            else:
                labels.append(dog_label)
        # use the function argument rather than the global img_dir
        filenames.append(os.path.join(img_directory, folder_name, d))
    return filenames, labels
# Define convolutional layer
def conv_layer(input, channels_in, channels_out):
    w_1 = tf.get_variable("weight_conv", [5,5, channels_in, channels_out], initializer=tf.contrib.layers.xavier_initializer())
    b_1 = tf.get_variable("bias_conv", [channels_out], initializer=tf.zeros_initializer())
    conv = tf.nn.conv2d(input, w_1, strides=[1,1,1,1], padding="SAME")
    activation = tf.nn.relu(conv + b_1)
    return activation
# Define fully connected layer
def fc_layer(input, channels_in, channels_out):
    w_2 = tf.get_variable("weight_fc", [channels_in, channels_out], initializer=tf.contrib.layers.xavier_initializer())
    b_2 = tf.get_variable("bias_fc", [channels_out], initializer=tf.zeros_initializer())
    activation = tf.nn.relu(tf.matmul(input, w_2) + b_2)
    return activation
# Define parse function to read an image path and decode it into a tensor
def _parse_function(img_path, label):
    img_file = tf.read_file(img_path)
    img_decoded = tf.image.decode_image(img_file, channels=3)
    img_decoded.set_shape([None,None,3])
    img_decoded = tf.image.resize_images(img_decoded, (28, 28), method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
    img_decoded = tf.image.per_image_standardization(img_decoded)
    img_decoded = tf.cast(img_decoded, dtype=tf.float32)
    label = tf.one_hot(label, 1)
    return img_decoded, label
tf.reset_default_graph()
# Define parameters
EPOCHS = 10
BATCH_SIZE_training = 64
learning_rate = 0.001
img_dir = 'C:/Users/tharu/PycharmProjects/cat_vs_dog/data'
batch_size = 128
# Define data
features, labels = read_image_label_list(img_dir, "train")
# Define dataset
dataset = tf.data.Dataset.from_tensor_slices((features, labels)) # Takes slices in 0th dimension
dataset = dataset.map(_parse_function)
dataset = dataset.batch(batch_size)
iterator = dataset.make_initializable_iterator()
# Get next batch of data from iterator
x, y = iterator.get_next()
# Create the network (use different variable scopes for reuse of variables)
with tf.variable_scope("conv1"):
    conv_1 = conv_layer(x, 3, 32)
    pool_1 = tf.nn.max_pool(conv_1, ksize=[1,2,2,1], strides=[1,2,2,1], padding="SAME")
with tf.variable_scope("conv2"):
    conv_2 = conv_layer(pool_1, 32, 64)
    pool_2 = tf.nn.max_pool(conv_2, ksize=[1,2,2,1], strides=[1,2,2,1], padding="SAME")
    flattened = tf.contrib.layers.flatten(pool_2)
with tf.variable_scope("fc1"):
    fc_1 = fc_layer(flattened, 7*7*64, 1024)
with tf.variable_scope("fc2"):
    logits = fc_layer(fc_1, 1024, 1)
# Define loss function
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf.cast(y, dtype=tf.int32)))
# Define optimizer
train = tf.train.AdamOptimizer(learning_rate).minimize(loss)
with tf.Session() as sess:
    # Initialize all the variables
    sess.run(tf.global_variables_initializer())
    # Train the network
    for i in range(EPOCHS):
        # Initialize iterator so that it starts at beginning of training set for each epoch
        sess.run(iterator.initializer)
        print("EPOCH", i)
        while True:
            try:
                _, epoch_loss = sess.run([train, loss])
            except tf.errors.OutOfRangeError: # Error given when out of data
                if i % 2 == 0:
                    # [train_accuracy] = sess.run([accuracy])
                    # print("Step ", i, "training accuracy = %{}".format(train_accuracy))
                    print(epoch_loss)
                break
I've spent a few hours trying to work out systematically why I get 0 loss when I run this model.
Features = list of file locations for each image (e.g. ['\data\train\cat.0.jpg', '\data\train\cat.1.jpg'])
Labels = [Batch_size, 1] one_hot vector
Initially I thought there was something wrong with my data, but I've viewed the images after resizing and they seem fine.
Then I tried a few different loss functions, because I thought I might be misunderstanding what the TensorFlow function softmax_cross_entropy does, but that didn't fix anything.
I've tried running just the 'logits' section to see what the output is. This is just a small sample and the numbers seem fine to me:
[[0.06388957]
[0. ]
[0.16969752]
[0.24913025]
[0.09961276]]
Surely then the softmax_cross_entropy function should be able to compute this loss given that the corresponding labels are 0 or 1? I'm not sure if I'm missing something. Any help would be greatly appreciated.
As documented:
logits and labels must have the same shape, e.g. [batch_size, num_classes] and the same dtype (either float16, float32, or float64).
Since you mentioned that your labels are a "[Batch_size, 1] one_hot vector", I assume both your logits and your labels have shape [Batch_size, 1]. This will certainly lead to zero loss. Conceptually, you have only one class (num_classes=1), so the softmax always assigns it probability 1 and the prediction cannot be wrong (loss=0).
So at least for your labels, you should transform them with tf.one_hot(indices=labels, depth=num_classes). Your prediction logits should also have shape [batch_size, num_classes].
Alternatively, you can use sparse_softmax_cross_entropy_with_logits, where:
A common use case is to have logits of shape [batch_size, num_classes] and labels of shape [batch_size]. But higher dimensions are supported.
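The zero loss can be reproduced without TensorFlow. The sketch below reimplements softmax cross-entropy in NumPy purely for illustration (it is not the TensorFlow function), using logits like the sample in the question and made-up labels.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Per-sample softmax cross-entropy, written out in NumPy for illustration."""
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.float64)
    # Subtract the row maximum for numerical stability before the log-sum-exp.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -(labels * log_probs).sum(axis=-1)

# One "class": softmax over a single logit is always 1, so log(1) = 0.
one_class_logits = np.array([[0.06388957], [0.0], [0.16969752]])
one_class_labels = np.array([[1.0], [0.0], [1.0]])
one_class_loss = softmax_cross_entropy(one_class_logits, one_class_labels)

# Two classes: the loss now depends on the logits (values are made up).
two_class_logits = np.array([[2.0, 0.5]])
two_class_labels = np.array([[0.0, 1.0]])
two_class_loss = softmax_cross_entropy(two_class_logits, two_class_labels)

print(one_class_loss, two_class_loss)
```

Whatever the single logit is, the one-class loss stays at zero, so the optimizer has nothing to learn from; with two output units the loss becomes informative.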
