Classify single image based on trained tensorflow model - python

I'm using convolutional neural networks to classify images into 3 labels. I did all the training and testing and got an accuracy of 60%. Then I saved this model, and now I want to load a single image and classify it into one of those labels. The code I'm using:
X_new = process_data()  # That's my input image after some processing
pred = convolutional_neural_network(x)  # That's my CNN

with tf.Session() as sess:
    # Here I restore the trained model
    saver = tf.train.import_meta_graph('modelo.meta')
    saver.restore(sess, 'modelo')
    print('Model loaded')
    sess.run(tf.initialize_all_variables())
    # Here I'm trying to predict the label of my image
    c = sess.run(pred, feed_dict={x: X_new})
    print(c)
When I print c, it returns me something like this:
[[ 1.5495030e+07 -2.3345528e+08 -1.5847101e+08]]
But I couldn't find out what that means and what I should do with it.
Anyway, what I'm trying to do is get a percentage indicating how likely the image is to belong to each label. If anyone can help me with that, I will be so grateful! I'm new to tensorflow and I'm having difficulties.
Thank you so much!
EDIT:
The convolutional_neural_network method:
def convolutional_neural_network(x):
    weights = {'W_conv1': tf.Variable(tf.random_normal([3, 3, 3, 1, 32])),
               'W_conv2': tf.Variable(tf.random_normal([3, 3, 3, 32, 64])),
               'W_fc': tf.Variable(tf.random_normal([54080, 1024])),
               'out': tf.Variable(tf.random_normal([1024, n_classes]))}

    biases = {'b_conv1': tf.Variable(tf.random_normal([32])),
              'b_conv2': tf.Variable(tf.random_normal([64])),
              'b_fc': tf.Variable(tf.random_normal([1024])),
              'out': tf.Variable(tf.random_normal([n_classes]))}

    x = tf.reshape(x, shape=[-1, IMG_PX_SIZE, IMG_PX_SIZE, HM_SLICES, 1])

    conv1 = tf.nn.relu(conv3d(x, weights['W_conv1']) + biases['b_conv1'])
    conv1 = maxpool3d(conv1)

    conv2 = tf.nn.relu(conv3d(conv1, weights['W_conv2']) + biases['b_conv2'])
    conv2 = maxpool3d(conv2)

    fc = tf.reshape(conv2, [-1, 54080])
    fc = tf.nn.relu(tf.matmul(fc, weights['W_fc']) + biases['b_fc'])
    fc = tf.nn.dropout(fc, keep_rate)

    output = tf.matmul(fc, weights['out']) + biases['out']
    return output

What the values mean depends on the type of activation used at the last layer of the model. I am assuming softmax, as there are 3 labels to predict. Based on the values, I think the network was not able to predict the label with much certainty, hence the values you see. If you just want to know the label of the image, use NumPy's argmax function like this:
import numpy as np

c = np.argmax(c)
print(c)
Hope this helps.

Right now you seem to just have a linear regression output layer (or logits layer), so it is hard to interpret the outputs. We would also need to see your loss function to understand the outputs you are seeing.
However, for classification into one of 3 labels you would normally want to terminate your network with a softmax layer; then the outputs will be interpretable as probabilities for each label (multiply by 100 if you need percentages).
probabilities = tf.nn.softmax(output)
You would still train your classifier using your output/logits layer, with softmax cross-entropy loss against your ground-truth 'y' values.
losses = tf.nn.softmax_cross_entropy_with_logits(logits=output, labels=y)
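A minimal end-to-end sketch of the prediction step. This assumes the graph is rebuilt with the same convolutional_neural_network code (so a plain tf.train.Saver can restore the variables, rather than import_meta_graph), and that x is the input placeholder and X_new the preprocessed image from the question:
import numpy as np
import tensorflow as tf

logits = convolutional_neural_network(x)   # rebuild the same graph
probabilities = tf.nn.softmax(logits)      # turn logits into per-label probabilities

saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, 'modelo')          # restore trained weights; do not re-run the initializer
    probs = sess.run(probabilities, feed_dict={x: X_new})
    print(probs * 100)                     # percentage for each of the 3 labels
    print(np.argmax(probs, axis=1))        # index of the most likely label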


Add non-image features to the Inception network

I am training a binary classifier using the Inception V3 model and I would like to feed some of the non-image features of my dataset into the network.
I have previously trained a logistic regression model with these features which performed well, and I would like to see if I can improve my CNN by combining these models.
It looks like Inception has a fully connected layer (logits layer) just before the softmax, and I believe I should concatenate some nodes onto that layer to feed in my features. I have never done this, however.
The logits layer is built here (a snippet of the Inception code):
# Final pooling and prediction
with tf.variable_scope('logits'):
    shape = net.get_shape()
    net = ops.avg_pool(net, shape[1:3], padding='VALID', scope='pool')
    # 1 x 1 x 2048
    net = ops.dropout(net, dropout_keep_prob, scope='dropout')
    net = ops.flatten(net, scope='flatten')
    # 2048
    logits = ops.fc(net, num_classes, activation=None, scope='logits',
                    restore=restore_logits)
    # 1000
    end_points['logits'] = logits
    if FLAGS.mode == '0_softmax':
        end_points['predictions'] = tf.nn.softmax(logits, name='predictions')
The function to make the fully connected layer:
@scopes.add_arg_scope
def fc(inputs,
       num_units_out,
       activation=tf.nn.relu,
       stddev=0.01,
       bias=0.0,
       weight_decay=0,
       batch_norm_params=None,
       is_training=True,
       trainable=True,
       restore=True,
       scope=None,
       reuse=None):
    """Adds a fully connected layer followed by an optional batch_norm layer.

    FC creates a variable called 'weights', representing the fully connected
    weight matrix, that is multiplied by the input. If `batch_norm` is None, a
    second variable called 'biases' is added to the result of the initial
    vector-matrix multiplication.

    Args:
        inputs: a [B x N] tensor where B is the batch size and N is the number of
            input units in the layer.
        num_units_out: the number of output units in the layer.
        activation: activation function.
        stddev: the standard deviation for the weights.
        bias: the initial value of the biases.
        weight_decay: the weight decay.
        batch_norm_params: parameters for the batch_norm. If is None don't use it.
        is_training: whether or not the model is in training mode.
        trainable: whether or not the variables should be trainable or not.
        restore: whether or not the variables should be marked for restore.
        scope: Optional scope for variable_scope.
        reuse: whether or not the layer and its variables should be reused. To be
            able to reuse the layer scope must be given.

    Returns:
        the tensor variable representing the result of the series of operations.
    """
    with tf.variable_scope(scope, 'FC', [inputs], reuse=reuse):
        num_units_in = inputs.get_shape()[1]
        weights_shape = [num_units_in, num_units_out]
        weights_initializer = tf.truncated_normal_initializer(stddev=stddev)
        l2_regularizer = None
        if weight_decay and weight_decay > 0:
            l2_regularizer = losses.l2_regularizer(weight_decay)
        weights = variables.variable('weights',
                                     shape=weights_shape,
                                     initializer=weights_initializer,
                                     regularizer=l2_regularizer,
                                     trainable=trainable,
                                     restore=restore)
        if batch_norm_params is not None:
            outputs = tf.matmul(inputs, weights)
            with scopes.arg_scope([batch_norm], is_training=is_training,
                                  trainable=trainable, restore=restore):
                outputs = batch_norm(outputs, **batch_norm_params)
        else:
            bias_shape = [num_units_out,]
            bias_initializer = tf.constant_initializer(bias)
            biases = variables.variable('biases',
                                        shape=bias_shape,
                                        initializer=bias_initializer,
                                        trainable=trainable,
                                        restore=restore)
            outputs = tf.nn.xw_plus_b(inputs, weights, biases)
        if activation:
            outputs = activation(outputs)
        return outputs
My model has 10 non-image features, so I suppose I will use num_units_out + 10? I am not sure what to do with the inputs. I assume I will add the feature data directly into this layer, by adding it to the input already coming from the previous layers. So in essence I will have two input layers.
Add your features just before the FC layer:
net = ops.flatten(net, scope='flatten')
# assuming both tensors have a shape like <batch>x<features>
net = tf.concat([net, my_other_features], axis=-1)
This will combine the existing FC part with an Input->FC->Sigmoid part into a single layer. Another way of saying it is that the final logistic layer (FC->Sigmoid) will get a feature vector input that consists of both your features and the features calculated by the CNN from the image.
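For concreteness, a sketch of how this could look in the snippet above. The my_other_features placeholder and its width of 10 are illustrative assumptions, matching the 10 non-image features mentioned in the question:
# hypothetical placeholder for the 10 non-image features, fed in alongside the images
my_other_features = tf.placeholder(tf.float32, [None, 10], name='extra_features')

with tf.variable_scope('logits'):
    shape = net.get_shape()
    net = ops.avg_pool(net, shape[1:3], padding='VALID', scope='pool')
    net = ops.dropout(net, dropout_keep_prob, scope='dropout')
    net = ops.flatten(net, scope='flatten')
    # 2048 image features + 10 extra features feed the final FC layer
    net = tf.concat([net, my_other_features], axis=-1)
    logits = ops.fc(net, num_classes, activation=None, scope='logits',
                    restore=restore_logits)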

Differentiating a trained simple neural net with tensorflow gives the wrong result and changes model weights

I am trying to teach a neural net a simple function (f(t) = 2t) and then compute the derivative with respect to the input (df/dt = 2). I use a net with one dense layer and no activation:
model = Sequential()
model.add(Dense(output_dim=1, input_shape=(1,), bias_initializer='ones'))
opt = RMSprop(lr=0.01, rho=0.9, epsilon=None, decay=0.0)
model.compile(optimizer=opt, loss='mse', metrics=['mae'])
model.summary()
My data consist of pairs t -> f(t), where t is chosen randomly on [0, 1]. To compute df/dt of my net I found this code:
https://groups.google.com/forum/#!msg/keras-users/g2JmncAIT9w/36MJZI7NBQAJ
and https://colab.research.google.com/drive/1l9YdIa2N40Fj3Y09qb3r3RhqKPXoaVJC
This is my full code on colab:
model.fit(train_x, train_y, epochs=100, validation_data=(test_x, test_y), shuffle=False, batch_size=32)
model.layers[0].get_weights()  # this displays 2.0069, quite right

outputTensor = model.output
listOfVariableTensors = model.inputs[0]
gradients = k.gradients(outputTensor, listOfVariableTensors)

sess = tf.InteractiveSession()
sess.run(tf.initialize_all_variables())
evaluated_gradients = sess.run(gradients, feed_dict={model.input: np.array([[10]])})
evaluated_gradients  # this displays kinda random number
model.layers[0].get_weights()  # this displays same number as above
I believe my model performs a simple w*t + b transform, so its derivative should be just w. But the code I found gives wrong results and breaks the trained weights. I actually think it resets them to the initial weights, because if I initialize the dense layer weights with kernel_initializer="ones", the code returns 1 as the derivative.
So, I need help with correct derivation of neural net.
The problem is that your code creates a brand-new session after training and runs tf.initialize_all_variables() in it, which resets the variables to fresh initial values before the gradient is evaluated. Create the session before anything else happens, and drop the re-initialization:
sess = tf.InteractiveSession()  # run this before anything happens
#....
model.fit(train_x, train_y, epochs=100, validation_data=(test_x, test_y), shuffle=False, batch_size=32)
model.layers[0].get_weights()  # this displays 2.0069, quite right
outputTensor = model.output
listOfVariableTensors = model.inputs[0]
gradients = k.gradients(outputTensor, listOfVariableTensors)
evaluated_gradients = sess.run(gradients, feed_dict={model.input: np.array([[10]])})
evaluated_gradients
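An alternative sketch (my own suggestion, assuming Keras with the TF 1.x backend): instead of managing a session yourself, build a backend function so the gradient is evaluated in the session Keras already trained in, leaving the weights untouched.
import numpy as np
from keras import backend as K

grads = K.gradients(model.output, model.inputs)   # d(output)/d(input)
get_grads = K.function(model.inputs, grads)       # callable over the trained graph
print(get_grads([np.array([[10.0]])]))            # should be close to [[2.0]] for f(t) = 2t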

Doing Multi-Label classification with BERT

I want to use BERT model to do multi-label classification with Tensorflow.
To do so, I want to adapt the example run_classifier.py from BERT github repository, which is an example on how to use BERT to do simple classification, using the pre-trained weights given by Google Research. (For example with BERT-Base, Cased)
I have X different labels which have a value of either 0 or 1, so I want to add to the original BERT model a new Dense layer of size X and use sigmoid_cross_entropy_with_logits as the loss function.
So, for the theoretical part I think I am OK.
The problem is that I don't know how I can append a new output layer and retrain only this new layer with my dataset, using the existing BertModel class.
Here is the original create_model() function from run_classifier.py where I guess I have to do my modifications. But I am a bit lost on what to do.
def create_model(bert_config, is_training, input_ids, input_mask, segment_ids,
                 labels, num_labels, use_one_hot_embeddings):
    """Creates a classification model."""
    model = modeling.BertModel(
        config=bert_config,
        is_training=is_training,
        input_ids=input_ids,
        input_mask=input_mask,
        token_type_ids=segment_ids,
        use_one_hot_embeddings=use_one_hot_embeddings)

    output_layer = model.get_pooled_output()
    hidden_size = output_layer.shape[-1].value

    output_weights = tf.get_variable(
        "output_weights", [num_labels, hidden_size],
        initializer=tf.truncated_normal_initializer(stddev=0.02))
    output_bias = tf.get_variable(
        "output_bias", [num_labels], initializer=tf.zeros_initializer())

    with tf.variable_scope("loss"):
        if is_training:
            # I.e., 0.1 dropout
            output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

        logits = tf.matmul(output_layer, output_weights, transpose_b=True)
        logits = tf.nn.bias_add(logits, output_bias)
        probabilities = tf.nn.softmax(logits, axis=-1)
        log_probs = tf.nn.log_softmax(logits, axis=-1)

        one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)
        per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
        loss = tf.reduce_mean(per_example_loss)

        return (loss, per_example_loss, logits, probabilities)
And here is the same function, with some of my modifications, but where there are things missing (and wrong things too?):
def create_model(bert_config, is_training, input_ids, input_mask, segment_ids, labels, num_labels):
    """Creates a classification model."""
    model = modeling.BertModel(
        config=bert_config,
        is_training=is_training,
        input_ids=input_ids,
        input_mask=input_mask,
        token_type_ids=segment_ids)

    output_layer = model.get_pooled_output()
    hidden_size = output_layer.shape[-1].value

    output_weights = tf.get_variable("output_weights", [num_labels, hidden_size],
                                     initializer=tf.truncated_normal_initializer(stddev=0.02))
    output_bias = tf.get_variable("output_bias", [num_labels], initializer=tf.zeros_initializer())

    with tf.variable_scope("loss"):
        if is_training:
            # I.e., 0.1 dropout
            output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

        logits = tf.matmul(output_layer, output_weights, transpose_b=True)
        logits = tf.nn.bias_add(logits, output_bias)
        probabilities = tf.nn.softmax(logits, axis=-1)
        log_probs = tf.nn.log_softmax(logits, axis=-1)

        per_example_loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)
        loss = tf.reduce_mean(per_example_loss)

        return (loss, per_example_loss, logits, probabilities)
The other things I have adapted in the code, and for which I had no problem:
DataProcessor to load and parse my custom dataset
Changing the type of the labels variable from numerical values to arrays everywhere it is used
So, if anyone knows what I should do to resolve my problem, or can even point out some obvious mistake I may have made, I would be glad to hear it.
Notes:
I found this article that corresponds pretty well to what I am trying to do, but it uses PyTorch, and I cannot translate it into TensorFlow.
You want to replace the softmax, which models a single distribution over the possible outputs (all scores sum up to one), with a sigmoid, which models an independent distribution for each class (there is a yes/no distribution for each output).
So, you correctly change the loss function, but you also need to change how you compute the probabilities. It should be:
probabilities = tf.sigmoid(logits)
In this case, you don't need the log_probs.
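Putting both changes together, the loss block inside create_model could look like this sketch (labels is assumed to already be a float tensor of shape [batch_size, num_labels] holding 0/1 values, per the multi-label setup described in the question):
with tf.variable_scope("loss"):
    if is_training:
        output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)

    # independent per-label probabilities instead of one softmax distribution
    probabilities = tf.sigmoid(logits)

    per_example_loss = tf.nn.sigmoid_cross_entropy_with_logits(
        labels=tf.cast(labels, tf.float32), logits=logits)
    per_example_loss = tf.reduce_sum(per_example_loss, axis=-1)
    loss = tf.reduce_mean(per_example_loss)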

U-Net with Pixel-wise weighted cross entropy: Input dimension errors

I have been using Zhixuhao's implementation of U-Net to try to do semantic binary segmentation and I modified it slightly using suggestions from this Stackoverflow answer:
Keras, binary segmentation, add weight to loss function
to be able to do a pixel-wise weighted binary cross-entropy, as they do in the original U-Net paper (see page 5), to force my U-Net to learn border pixels. Essentially the idea is to add a lambda layer that computes the pixel-wise weighted cross-entropy within the model itself and then use an "identity loss" that just copies the output of the network.
Here is what my input data looks like:
(figures: input image, groundtruth mask, weight map)
And here is what my code looks like:
def unet(pretrained_weights=None, input_size=(256,256,1)):
    inputs = Input(input_size)
    # [... U-Net architecture from Zhixuhao's model.py file ...]
    conv10 = Conv2D(1, 1, activation='sigmoid', name='true_output')(conv9)

    mask_weights = Input(input_size)
    true_masks = Input(input_size)
    loss1 = Lambda(weighted_binary_loss, output_shape=input_size, name='loss_output')([conv10, mask_weights, true_masks])

    model = Model(inputs=[inputs, mask_weights, true_masks], outputs=loss1)
    model.compile(optimizer=Adam(lr=1e-4), loss=identity_loss)
And I added these two functions:
def weighted_binary_loss(X):
    y_pred, weights, y_true = X
    loss = keras.losses.binary_crossentropy(y_pred, y_true)
    loss = multiply([loss, weights])
    return loss

def identity_loss(y_true, y_pred):
    return y_pred
And finally here is the relevant part of my main.py:
input_size = (256,256,1)
target_size = (256,256)
myGene = trainGenerator(5,'data/moma/train','img','seg','wei',data_gen_args,save_to_dir=None,target_size=target_size)
model = unet(input_size=input_size)
model_checkpoint = ModelCheckpoint('unet_moma_weights.hdf5',monitor='loss',verbose=1, save_best_only=True)
model.fit_generator(myGene,steps_per_epoch=300,epochs=5,callbacks=[model_checkpoint])
Now this code runs fine, I can train my U-Net and it does learn border pixels, but only if I resize my input images to be 256*256 in size. If I instead use input_size=(256,32,1) and target_size=(256,32) in main.py , which is the relevant dimensions for my data and that allows me to use bigger batch sizes, I get the following error:
ValueError: Operands could not be broadcast together with shapes (256, 32, 1) (256, 32)
for the line loss = multiply([loss, weights]). And indeed the weights have one extra singleton dimension. I don't understand why the error is not raised when I use 256*256 inputs. I tried to make both inputs the same dimensions with either k.expand_dims() or Reshape(); the code then runs without error and the loss converges, but when I test my network on extra inputs I get blank outputs (i.e. fully grey, white or black images, or output that has nothing to do with my inputs).
So this is a lot of text for the following question: why does multiply() issue an error in the 256*32 case and not the 256*256 one, and why does creating/removing dimensions on the inputs not help?
Thanks!
ps: In order to get the network to output the actual prediction instead of the pixel-wise loss after training, I remove the loss layer and the two extra input layers with the following code:
new_model = Model(inputs=model.inputs,outputs=model.get_layer("true_output").output)
new_model.compile(optimizer = Adam(lr = 1e-4), loss = 'binary_crossentropy')
new_model.set_weights(model.get_weights())
This works fine (again in the 256*256 case at least)
So for anyone who stumbles upon this question, here is how I implemented the loss function:
# imports assumed by this function:
import tensorflow as tf
from keras import backend as K
from tensorflow.python.ops import array_ops, math_ops

def pixelwise_weighted_binary_crossentropy(y_true, y_pred):
    '''
    Pixel-wise weighted binary cross-entropy loss.
    The code is adapted from the Keras TF backend.
    (see their github)

    Parameters
    ----------
    y_true : Tensor
        Stack of groundtruth segmentation masks + weight maps.
    y_pred : Tensor
        Predicted segmentation masks.

    Returns
    -------
    Tensor
        Pixel-wise weighted binary cross-entropy between inputs.
    '''
    try:
        # The weights are passed as part of the y_true tensor:
        [seg, weight] = tf.unstack(y_true, 2, axis=-1)
        seg = tf.expand_dims(seg, -1)
        weight = tf.expand_dims(weight, -1)
    except:
        pass

    # Transform the predicted probabilities back into logits:
    epsilon = tf.convert_to_tensor(K.epsilon(), y_pred.dtype.base_dtype)
    y_pred = tf.clip_by_value(y_pred, epsilon, 1. - epsilon)
    y_pred = tf.math.log(y_pred / (1 - y_pred))

    zeros = array_ops.zeros_like(y_pred, dtype=y_pred.dtype)
    cond = (y_pred >= zeros)
    relu_logits = math_ops.select(cond, y_pred, zeros)
    neg_abs_logits = math_ops.select(cond, -y_pred, y_pred)
    entropy = math_ops.add(relu_logits - y_pred * seg,
                           math_ops.log1p(math_ops.exp(neg_abs_logits)), name=None)

    # This is essentially the only part that is different from the Keras code:
    return K.mean(math_ops.multiply(weight, entropy), axis=-1)
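A hypothetical usage sketch (array and model names are assumptions): the weight map is stacked onto the groundtruth mask along the last axis so both arrive in y_true, and the network is the plain single-input U-Net ending in the sigmoid 'true_output' layer, so the Lambda loss layer and the two extra inputs are no longer needed.
import numpy as np
from keras.optimizers import Adam

# seg and weight are assumed to be float arrays of shape (batch, H, W, 1)
y_true = np.concatenate([seg, weight], axis=-1)   # shape (batch, H, W, 2)

model.compile(optimizer=Adam(lr=1e-4), loss=pixelwise_weighted_binary_crossentropy)
model.fit(imgs, y_true, batch_size=8, epochs=5)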

tensorflow neural network model based on mnist with two outputs

I want to train a neural network with 12 inputs and 2 outputs. Here I have a simple tensorflow neural network that has two outputs. When I run the code it always consistently gives one output. That is, if the two outputs are labeled 'l1' and 'l2', the model always chooses 'l1' for its output. Is this a problem with my input (that it doesn't vary enough between 'l1' and 'l2'), or is this a problem with choosing to use just two outputs? This is my question. If it's the latter, what do I do to remedy this? My model is supposed to detect skin tones in a photo ('l1' = skin tone, 'l2' = not skin tone). I'm not sure this makes sense. It is adapted from the mnist example, but that code has ten outputs.
def nn_setup(self):
    input_num = 4 * 3
    mid_num = 3
    output_num = 2

    x = tf.placeholder(tf.float32, [None, input_num])
    W_1 = tf.Variable(tf.zeros([input_num, mid_num]))
    b_1 = tf.Variable(tf.zeros([mid_num]))
    y_mid = tf.nn.softmax(tf.matmul(x, W_1) + b_1)

    W_2 = tf.Variable(tf.zeros([mid_num, output_num]))
    b_2 = tf.Variable(tf.zeros([output_num]))
    y = tf.nn.softmax(tf.matmul(y_mid, W_2) + b_2)

    y_ = tf.placeholder(tf.float32, [None, output_num])
    cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y, y_))
    train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

    init = tf.initialize_all_variables()
    self.sess = tf.Session()
    self.sess.run(init)

    for i in range(1000):
        batch_xs, batch_ys = self.get_nn_next_train()
        self.sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    self.nn_test.images, self.nn_test.labels = self.get_nn_next_test()
    print(self.sess.run(accuracy, feed_dict={x: self.nn_test.images, y_: self.nn_test.labels}))
There are a few "odd" things with your network, such as having softmax in your middle layer.
You have two major issues I can find with your implementation.
1. Weight initialisation
W_1 = tf.Variable(tf.zeros([input_num, mid_num]))
W_2 = tf.Variable(tf.zeros([mid_num, output_num]))
This will initialise the weights to be identical. So they will have identical gradient values, and be changed at each step identically.
Effectively by doing this you have created a network with one neuron in each layer (which is then copied to create the layer matrix that you use).
Use a different initial value; it is usual to take a small random matrix like this:
W_1 = tf.Variable(tf.random_normal([input_num, mid_num], stddev=0.5))
In general you will want a smaller standard deviation the larger your layers are. You don't have to do this for biases as well, but you can if you like.
This won't fix everything with your network, but it should at least start to calculate different values from input data and train a little.
2. Use of cost function
You have used this loss function incorrectly:
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(y, y_))
... because softmax_cross_entropy_with_logits is designed to work with the input to softmax, not the output. So your cost function is incorrect. Instead, compute y_logits where you currently calculate y, and reference it in the loss:
y_logits = tf.matmul(y_mid, W_2) + b_2
y = tf.nn.softmax(y_logits)
Then your cross-entropy would be
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=y_logits, labels=y_))
After the hidden layer initialization, you have calculated softmax of the logits for the hidden layer: y_mid = tf.nn.softmax(tf.matmul(x, W_1) + b_1). In a classification problem, softmax should be applied to the values obtained from the output layer. Try something like y_mid = tf.nn.relu(tf.matmul(x, W_1) + b_1) to compute the activations from the hidden layer and see if your classification improves. If that does not solve your problem, check the proportion of 'l1' and 'l2' in your training data. If your training data is highly skewed towards 'l1', you will always get 'l1' as the output. You may consider minority oversampling or undersampling techniques to resolve the class imbalance problem.
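Putting the suggestions above together, a sketch of the corrected graph-construction part of nn_setup might look like this (the stddev values are illustrative, not prescriptive):
x = tf.placeholder(tf.float32, [None, input_num])
y_ = tf.placeholder(tf.float32, [None, output_num])

# random (non-identical) initial weights so the neurons can diverge during training
W_1 = tf.Variable(tf.random_normal([input_num, mid_num], stddev=0.5))
b_1 = tf.Variable(tf.zeros([mid_num]))
y_mid = tf.nn.relu(tf.matmul(x, W_1) + b_1)      # hidden layer, no softmax here

W_2 = tf.Variable(tf.random_normal([mid_num, output_num], stddev=0.5))
b_2 = tf.Variable(tf.zeros([output_num]))
y_logits = tf.matmul(y_mid, W_2) + b_2           # raw logits
y = tf.nn.softmax(y_logits)                      # probabilities, used for predictions

cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=y_logits, labels=y_))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)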
