I'm using Keras' pre-trained model VGG16, following this link: Transfer learning. I'm trying to predict the content of an image:
# example of using a pre-trained model as a classifier
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.applications.vgg16 import preprocess_input
from keras.applications.vgg16 import decode_predictions
from keras.applications.vgg16 import VGG16
# load an image from file
image = load_img('dog.jpg', target_size=(224, 224))
# convert the image pixels to a numpy array
image = img_to_array(image)
# reshape data for the model
image = image.reshape((1, image.shape[0], image.shape[1], image.shape[2]))
# prepare the image for the VGG model
image = preprocess_input(image)
# load the model
model = VGG16()
# predict the probability across all output classes
yhat = model.predict(image)
# convert the probabilities to class labels
label = decode_predictions(yhat)
# retrieve the most likely result, e.g. highest probability
label = label[0][0]
# print the classification
print('%s (%.2f%%)' % (label[1], label[2]*100))
Full Error Output:
ValueError: decode_predictions expects a batch of predictions (i.e. a 2D array of shape (samples, 2622)) for V1 or (samples, 8631) for V2.Found array with shape: (1, 1000)
This is a link to a seemingly similar question on SO.
Any comments and suggestions highly appreciated. Thank you!
I ran your code and it works properly. Since I do not have your image dog.jpg, I used a color JPG image of an Afghan dog and the network identified it correctly as an Afghan Hound. So I suspect there is something amiss with your image. yhat is a 1 x 1000 array as expected. Ensure your image is an RGB image.
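A quick way to verify that is with Pillow (a minimal sketch, assuming dog.jpg is in the working directory):
from PIL import Image

img = Image.open('dog.jpg')
print(img.mode)            # 'RGB' for a color image; 'L' means grayscale, 'RGBA' has an alpha channel
img = img.convert('RGB')   # force a three-channel RGB image if needed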
Thank you for your help. I was running this in Colab and had earlier test code where, in a different cell, I had imported:
from keras_vggface.vggface import VGGFace
from keras_vggface.utils import preprocess_input
from keras_vggface.utils import decode_predictions
That was the reason for the error: the keras_vggface versions of preprocess_input and decode_predictions shadowed the VGG16 ones.
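One way to avoid this kind of name clash is to import the modules rather than the bare function names and qualify each call (a sketch, assuming both packages are installed):
from keras.applications import vgg16
from keras_vggface import utils as vggface_utils

image = vgg16.preprocess_input(image)   # ImageNet VGG16 preprocessing
label = vgg16.decode_predictions(yhat)  # decodes against the 1000 ImageNet classes
# vggface_utils.decode_predictions(...) is for VGGFace models (2622/8631 classes)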
I want to use the trained model from https://www.kaggle.com/janeresh/imageprocess. I copied the notebook and tested it; it works perfectly. But if I download model.h5 and use it in Google Colab to predict masked or non-masked images, it always returns 0.
I use this code to predict:
from tensorflow.keras.models import load_model
from keras.preprocessing import image
import numpy as np
import cv2
model = load_model('model.h5',compile=True)
model.summary()
img = cv2.imread('masked_image.png')
img = cv2.resize(img,(75,75))
img = np.reshape(img,[1,75,75,3])
prediction = model.predict_classes(img)
prediction
and the return is
array([[0]], dtype=int32)
Thanks
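As a side note, predict_classes was removed from recent TensorFlow/Keras releases; the equivalent (a sketch, assuming a single sigmoid output unit, typical for a binary mask/no-mask classifier) is:
prediction = (model.predict(img) > 0.5).astype('int32')
# or, for a two-unit softmax output:
# prediction = np.argmax(model.predict(img), axis=-1)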
I would like to train with a different custom image augmentation during each epoch of the training.
The wrong solution would be to save the augmented images and run the training on the saved images, because if you try to load hundreds of thousands of images for the training, you will get a memory error.
The right solution has to apply the augmentation during the fit routine.
Can you please show me how to do it, pointing out a working example?
It won't create many images, and you won't get a memory error. While iterating over the dataset, it applies random transformations to each image on the fly, without "creating" new images that would be kept in memory. So just do it normally:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import tensorflow as tf
import tensorflow_datasets as tfds
[train_set_raw] = tfds.load('cats_vs_dogs', split=['train[:100]'], as_supervised=True)
def augment(tensor):
    # cast to float, convert to grayscale, and resize to a fixed size
    tensor = tf.cast(x=tensor, dtype=tf.float32)
    tensor = tf.image.rgb_to_grayscale(images=tensor)
    tensor = tf.image.resize(images=tensor, size=(96, 96))
    # scale pixel values to [0, 1]
    tensor = tf.divide(x=tensor, y=tf.constant(255.))
    # random transformations, re-drawn on every pass over the dataset
    tensor = tf.image.random_flip_left_right(image=tensor)
    tensor = tf.image.random_brightness(image=tensor, max_delta=2e-1)
    tensor = tf.image.random_crop(value=tensor, size=(64, 64, 1))
    return tensor
train_set_raw = train_set_raw.shuffle(128).map(lambda x, y: (augment(x), y)).batch(16)
import matplotlib.pyplot as plt
plt.imshow((next(iter(train_set_raw))[0][0][..., 0].numpy()*255).astype(int))
plt.show()
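Because the map runs again on every pass over the dataset, each epoch sees a freshly randomized variant of every image, which is exactly the per-epoch augmentation asked for. Training then consumes the pipeline directly (a minimal sketch; the tiny model is only a placeholder):
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(64, 64, 1)),
    tf.keras.layers.Dense(2, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.fit(train_set_raw, epochs=5)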
Total newbie here, I'm using this PyTorch SegNet implementation with a '.pth' file containing weights from a 50-epoch training.
How can I load a single test image and see the net prediction?
I know this may sound like a stupid question but I'm stuck.
What I've got is:
from segnet import SegNet
import torch
model = SegNet(2)
model.load_state_dict(torch.load('./model_segnet_epoch50.pth'))
How do I "use" the net on a single test picture?
I provide an example using the ResNet152 pre-trained model.
from PIL import Image
import numpy as np
import torch
from torchvision import models, transforms

def image_loader(loader, image_name):
    image = Image.open(image_name)
    image = loader(image).float()
    image = torch.tensor(image, requires_grad=True)
    image = image.unsqueeze(0)   # add the batch dimension
    return image

data_transforms = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor()
])

model_ft = models.resnet152(pretrained=True)
model_ft.eval()
print(np.argmax(model_ft(image_loader(data_transforms, $FILENAME)).detach().numpy()))
$FILENAME is the path and name of your image to be loaded. I got the necessary help from this post.
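Since SegNet is a segmentation network, its output is a per-pixel score map rather than a single class vector, so the last step differs from the classification example above (a sketch, assuming SegNet(2) returns a tensor of shape (1, 2, H, W)):
output = model(image)             # shape (1, 2, H, W): one score map per class
mask = output.argmax(dim=1)[0]    # (H, W) tensor of per-pixel class indices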
output = model(image)
Note that the image should be a Variable object and that the output will be as well.
If your image is, for example, a Numpy array, you can convert it like so:
var_image = Variable(torch.Tensor(image))
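Note that Variable was merged into Tensor in PyTorch 0.4, so with a recent version a plain tensor works directly; inference is then usually wrapped in no_grad (a minimal sketch, where np_image is a hypothetical HxWxC NumPy array):
import torch

with torch.no_grad():
    tensor = torch.from_numpy(np_image).float().permute(2, 0, 1)  # to CxHxW
    output = model(tensor.unsqueeze(0))                           # add batch dimension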
I use the pre-trained VGG-16 model from Keras.
My working source code so far is like this:
from keras.applications.vgg16 import VGG16
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.applications.vgg16 import preprocess_input
from keras.applications.vgg16 import decode_predictions
model = VGG16()
print(model.summary())
image = load_img('./pictures/door.jpg', target_size=(224, 224))
image = img_to_array(image) #output Numpy-array
image = image.reshape((1, image.shape[0], image.shape[1], image.shape[2]))
image = preprocess_input(image)
yhat = model.predict(image)
label = decode_predictions(yhat)
label = label[0][0]
print('%s (%.2f%%)' % (label[1], label[2]*100))
I found out that the model is trained on 1000 classes. Is there any way to get the list of the classes this model is trained on? Printing out the prediction labels is not an option, because only 5 are returned.
Thanks in advance
You could use decode_predictions and pass the total number of classes via the top=1000 parameter (its default value is only 5).
Or you could look at how Keras does this internally: It downloads the file imagenet_class_index.json (and usually caches it in ~/.keras/models/). This is a simple json file containing all class labels.
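For the second option, the cached file can be read directly once any decode_predictions call has downloaded it (a sketch; the path is the usual Keras cache location and may differ on your system):
import json, os

path = os.path.expanduser('~/.keras/models/imagenet_class_index.json')
with open(path) as f:
    class_index = json.load(f)   # {"0": ["n01440764", "tench"], "1": [...], ...}

classes = [class_index[str(i)][1] for i in range(1000)]
print(classes[:3])   # ['tench', 'goldfish', 'great_white_shark']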
I think if you do something like this:
import numpy as np
import keras

vgg16 = keras.applications.vgg16.VGG16(include_top=True,
                                       weights='imagenet',
                                       input_tensor=None,
                                       input_shape=None,
                                       pooling=None,
                                       classes=1000)
keras.applications.vgg16.decode_predictions(np.expand_dims(np.arange(1000), 0), top=1000)
Substitute your prediction array for np.arange(1000); decode_predictions expects a 2D batch of predictions, hence the expand_dims. Untested code so far.
Link to training labels here I think: http://image-net.org/challenges/LSVRC/2014/browse-synsets
If you edit your code a little bit you can get a list of all top predictions for the example you provided. TensorFlow's decode_predictions returns a list of lists of class prediction tuples. So first, add the top=1000 argument as @YSelf recommended: label = decode_predictions(yhat, top=1000). Then change label = label[0][0] to label = label[0][:] to select all of the predictions. label would look like this:
[('n04252225', 'snowplow', 0.4144803),
('n03796401', 'moving_van', 0.09205707),
('n04461696', 'tow_truck', 0.08912289),
('n03930630', 'pickup', 0.07173037),
('n04467665', 'trailer_truck', 0.048759833),
('n02930766', 'cab', 0.043586567),
('n04037443', 'racer', 0.036957625),....)]
From here you just need tuple unpacking. If you want the list of 1000 class names you can call [y for (x, y, z) in label] and you will have your list of all 1000 classes. The output would look like this:
['snowplow',
'moving_van',
'tow_truck',
'pickup',
'trailer_truck',
'cab',
'racer',....]
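A small variation on the same comprehension gives a name-to-probability mapping instead of a bare list (using the label list from above):
probs = {name: float(score) for (_, name, score) in label}
print(probs['snowplow'])   # ~0.414 for this example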
This single line will print out all class names and indices:
decode_predictions(np.expand_dims(np.arange(1000), 0), top=1000)
I have been trying to use the pre-trained inception_resnet_v2 model released by Google. I am using their model definition (https://github.com/tensorflow/models/blob/master/slim/nets/inception_resnet_v2.py) and the given checkpoint (http://download.tensorflow.org/models/inception_resnet_v2_2016_08_30.tar.gz) to load the model in TensorFlow as below [download and extract the checkpoint file, and download the sample images dog.jpg and panda.jpg to test this code]:
import tensorflow as tf
slim = tf.contrib.slim
from PIL import Image
from inception_resnet_v2 import *
import numpy as np
checkpoint_file = 'inception_resnet_v2_2016_08_30.ckpt'
sample_images = ['dog.jpg', 'panda.jpg']
# placeholder for the input image batch
input_tensor = tf.placeholder(tf.float32, shape=(None, 299, 299, 3))
#Load the model
sess = tf.Session()
arg_scope = inception_resnet_v2_arg_scope()
with slim.arg_scope(arg_scope):
    logits, end_points = inception_resnet_v2(input_tensor, is_training=False)
saver = tf.train.Saver()
saver.restore(sess, checkpoint_file)
for image in sample_images:
    im = Image.open(image).resize((299, 299))
    im = np.array(im)
    im = im.reshape(-1, 299, 299, 3)
    predict_values, logit_values = sess.run([end_points['Predictions'], logits], feed_dict={input_tensor: im})
    print(np.max(predict_values), np.max(logit_values))
    print(np.argmax(predict_values), np.argmax(logit_values))
However, the results from this model code do not give the expected results (class no. 918 is predicted irrespective of the input image). Can someone help me understand where I am going wrong?
The Inception networks expect the input image to have color channels scaled to [-1, 1], as seen here.
You could either use the existing preprocessing, or in your example just scale the images yourself: im = 2*(im/255.0)-1.0 before feeding them to the network.
Without scaling, the input range [0, 255] is much larger than the network expects, and the biases all work to very strongly predict category 918 (comic books).
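Applied to the question's loop, that is a one-line change before the reshape (a sketch of the answer's own suggestion):
for image in sample_images:
    im = Image.open(image).resize((299, 299))
    im = np.array(im, dtype=np.float32)
    im = 2 * (im / 255.0) - 1.0   # scale [0, 255] -> [-1, 1]
    im = im.reshape(-1, 299, 299, 3)
    predict_values, logit_values = sess.run([end_points['Predictions'], logits], feed_dict={input_tensor: im})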