Load a single image in a pretrained pytorch net - python

Total newbie here. I'm using this PyTorch SegNet implementation with a '.pth' file containing the weights from 50 epochs of training.
How can I load a single test image and see the net prediction?
I know this may sound like a stupid question but I'm stuck.
What I've got is:
from segnet import SegNet
import torch
model = SegNet(2)
model.load_state_dict(torch.load('./model_segnet_epoch50.pth'))
How do I "use" the net on a single test picture?

Here is an example with a pre-trained ResNet152 model.
from PIL import Image
import numpy as np
import torch
from torchvision import models, transforms

def image_loader(loader, image_name):
    image = Image.open(image_name)
    image = loader(image).float()
    image = torch.tensor(image, requires_grad=True)
    image = image.unsqueeze(0)  # add a batch dimension
    return image

data_transforms = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor()
])

model_ft = models.resnet152(pretrained=True)
model_ft.eval()
print(np.argmax(model_ft(image_loader(data_transforms, $FILENAME)).detach().numpy()))
$FILENAME is the path and name of the image to be loaded. I got the necessary help from this post.

output = model(image)
Note that the image should be a Variable object and that the output will be as well.
If your image is, for example, a Numpy array, you can convert it like so:
var_image = Variable(torch.Tensor(image))
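Putting this together for the SegNet case from the question, a minimal inference sketch might look like the following; the resize value and the test image path ('test.jpg') are assumptions, so use whatever preprocessing the network was trained with.
import torch
import torchvision.transforms as transforms
from PIL import Image
from segnet import SegNet

model = SegNet(2)
model.load_state_dict(torch.load('./model_segnet_epoch50.pth'))
model.eval()  # disable dropout / batch-norm updates for inference

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),  # assumed training size
    transforms.ToTensor()
])

image = Image.open('test.jpg').convert('RGB')  # 'test.jpg' is a hypothetical path
batch = preprocess(image).unsqueeze(0)         # shape (1, 3, H, W)

with torch.no_grad():
    output = model(batch)                      # per-pixel class scores, shape (1, 2, H, W)

prediction = output.argmax(dim=1)              # per-pixel class labels, shape (1, H, W)
print(prediction.shape)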

Related

How to generate accuracy from a saved model of Keras?

I have already trained my Keras model and saved it as a .h5 file. The model uses 6 classes and is able to classify all of them from images, outputting the name of the class it identifies. However, I want to generate the accuracy when testing the model with an image supplied by the user. I have searched everywhere, but there is still no answer to this problem.
from keras.models import load_model
from PIL import Image
import numpy

model = load_model('prototype-tl2-80-20.h5')
classes = { 1:'Kacip Fatimah',
            2:'Mempisang',
            3:'Misai Adam',
            4:'Pandan Serapat',
            5:'Tapak Sulaiman',
            6:'Tongkat Ali'}
image = Image.open(file_path)
image = image.resize((224,224))
image = numpy.array(image)
image = numpy.expand_dims(image, axis=0)  # add a batch dimension
pred = model.predict_classes([image])[0]
sign = classes[pred+1]
print(sign)
To predict an image using a trained model, you have to be careful to make sure the image is processed exactly as the training images were. The image should have the same size (height, width) as the training images and the same number of color bands, for example 'rgb' or 'grayscale', with the bands in the same order as used in training. Next, you must apply the same preprocessing. For example, if your training images were scaled to be between 0 and 1, then you need to rescale your test image with image = image/255. After that, do:
pred = model.predict(image)
index = np.argmax(pred)
class_name = classes[index + 1]  # 'class' is a reserved word; the classes dict above is 1-indexed
print(index, class_name)
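As a concrete illustration of the above, here is a sketch of the whole pipeline, assuming the model was trained on 224x224 RGB images rescaled to [0, 1]; adjust the size and rescaling to match your actual training preprocessing (file_path is the same variable used in the question).
import numpy as np
from PIL import Image
from keras.models import load_model

model = load_model('prototype-tl2-80-20.h5')

image = Image.open(file_path).convert('RGB')        # same color bands as training
image = image.resize((224, 224))                    # same size as training
image = np.array(image, dtype=np.float32) / 255.0   # same rescaling as training (assumed)
image = np.expand_dims(image, axis=0)               # add a batch dimension

pred = model.predict(image)
index = np.argmax(pred)
class_name = classes[index + 1]                     # classes dict from the question is 1-indexed
print(index, class_name)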

Keras: ValueError: decode_predictions expects a batch of predictions

I'm using Keras' pre-trained model VGG16, following this link: Transfer learning. I'm trying to predict the content of an image:
# example of using a pre-trained model as a classifier
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.applications.vgg16 import preprocess_input
from keras.applications.vgg16 import decode_predictions
from keras.applications.vgg16 import VGG16
# load an image from file
image = load_img('dog.jpg', target_size=(224, 224))
# convert the image pixels to a numpy array
image = img_to_array(image)
# reshape data for the model
image = image.reshape((1, image.shape[0], image.shape[1], image.shape[2]))
# prepare the image for the VGG model
image = preprocess_input(image)
# load the model
model = VGG16()
# predict the probability across all output classes
yhat = model.predict(image)
# convert the probabilities to class labels
label = decode_predictions(yhat)
# retrieve the most likely result, e.g. highest probability
label = label[0][0]
# print the classification
print('%s (%.2f%%)' % (label[1], label[2]*100))
Full Error Output:
ValueError: decode_predictions expects a batch of predictions (i.e. a 2D array of shape (samples, 2622)) for V1 or (samples, 8631) for V2. Found array with shape: (1, 1000)
This is link to a seemingly similar question on SO.
Any comments and suggestions highly appreciated. Thank you!
I ran your code and it works properly. Since I do not have your image dog.jpg, I used a color jpg image of an Afghan dog and the network identified it correctly as an Afghan Hound. So I suspect there is something amiss with your image. yhat is a 1 x 1000 array, as expected. Ensure your image is an RGB image.
Thank you for your help. I was running this in Colab and had earlier test code where, in a different cell, I had imported:
from keras_vggface.vggface import VGGFace
from keras_vggface.utils import preprocess_input
from keras_vggface.utils import decode_predictions
That was the reason for the error.
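In other words, the keras_vggface imports shadowed the VGG16 versions of preprocess_input and decode_predictions. One way to avoid the clash is to keep each package under its own module name; a sketch, reusing dog.jpg from the question:
from keras.applications import vgg16
from keras.preprocessing.image import load_img, img_to_array
from keras_vggface import utils as vggface_utils

model = vgg16.VGG16()

image = img_to_array(load_img('dog.jpg', target_size=(224, 224)))
image = image.reshape((1,) + image.shape)
image = vgg16.preprocess_input(image)                    # explicitly the ImageNet VGG16 version
label = vgg16.decode_predictions(model.predict(image))   # explicitly the ImageNet VGG16 version
print(label[0][0])

# The keras_vggface equivalents stay under their own module name:
# vggface_utils.preprocess_input(...), vggface_utils.decode_predictions(...)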

How to remove certain layers from Faster-RCNN in PyTorch?

Target: I want to use the pretrained Faster-RCNN model to extract features from an image.
What I have tried: I use the code below to build the model:
import torchvision.models as models
from PIL import Image
import torchvision.transforms as T
import torch
# download the pretrained fasterrcnn model
model = models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()
model.cuda()
# remove [2:] layers
modules = list(model.children())[:2]
model_t=torch.nn.Sequential(*modules)
# load image and extract features
img = Image.open('data/person.jpg')
transform = T.Compose([T.ToTensor()])
img_t = transform(img)
batch_t = torch.unsqueeze(img_t, 0).cuda()
ft = model_t(batch_t)
Error: But I got the following error: TypeError: conv2d(): argument 'input' (position 1) must be Tensor, not tuple
Please help! Thank you!
Use print(model.modules) to get the layer names. Then delete a layer with:
del model.my_layer_name
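If the goal is image features rather than deleting layers by name, one option is to call the detector's FPN backbone directly instead of slicing model.children() (the first child is the transform module, which returns a tuple and causes the conv2d error above). A minimal sketch, assuming the ImageNet normalization statistics used by the detector's internal transform:
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

model = models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

transform = T.Compose([
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406],   # assumed ImageNet statistics
                std=[0.229, 0.224, 0.225])
])

img = Image.open('data/person.jpg').convert('RGB')
batch = torch.unsqueeze(transform(img), 0)

with torch.no_grad():
    features = model.backbone(batch)  # OrderedDict of FPN feature maps

for name, fmap in features.items():
    print(name, fmap.shape)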

CNN batch with images of different size

I restored a pre-trained model for face detection which takes a single image at a time and returns bounding boxes. How can I make it take a batch of images if these images have different sizes?
You can use the tf.image.resize_images method to achieve this. According to the docs for tf.image.resize_images:
Resize images to size using the specified method. Resized images will be distorted if their original aspect ratio is not the same as size. To avoid distortions see tf.image.resize_image_with_pad.
How to use it?
import tensorflow as tf
from tensorflow.python.keras.layers import Input
from tensorflow.python.keras.models import Model
x = Input(shape=(None, None, 3), name='image_input')
resized_x = tf.image.resize_images(x, [32, 32])
vgg_output = load_vgg()(resized_x)  # load_vgg() returns the pre-trained model being wrapped
model = Model(inputs=x, outputs=vgg_output)
model.compile(...)
model.predict(...)
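As the quoted docs note, plain resizing distorts images whose aspect ratio differs from the target size. If that matters for face detection, a padded resize keeps the aspect ratio; a short sketch using the same TF 1.x API as above:
import tensorflow as tf
from tensorflow.python.keras.layers import Input

x = Input(shape=(None, None, 3), name='image_input')
# Resizes while preserving the aspect ratio and pads the remainder, so faces are not stretched
resized_x = tf.image.resize_image_with_pad(x, target_height=32, target_width=32)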

Using pre-trained inception_resnet_v2 with Tensorflow

I have been trying to use the pre-trained inception_resnet_v2 model released by Google. I am using their model definition (https://github.com/tensorflow/models/blob/master/slim/nets/inception_resnet_v2.py) and the given checkpoint (http://download.tensorflow.org/models/inception_resnet_v2_2016_08_30.tar.gz) to load the model in TensorFlow as below [download and extract the checkpoint file, and download the sample images dog.jpg and panda.jpg to test this code]:
import tensorflow as tf
from PIL import Image
from inception_resnet_v2 import *
import numpy as np

slim = tf.contrib.slim

checkpoint_file = 'inception_resnet_v2_2016_08_30.ckpt'
sample_images = ['dog.jpg', 'panda.jpg']

# Placeholder for the input batch of 299x299 RGB images
input_tensor = tf.placeholder(tf.float32, shape=(None, 299, 299, 3), name='input_image')

# Load the model
sess = tf.Session()
arg_scope = inception_resnet_v2_arg_scope()
with slim.arg_scope(arg_scope):
    logits, end_points = inception_resnet_v2(input_tensor, is_training=False)

saver = tf.train.Saver()
saver.restore(sess, checkpoint_file)

for image in sample_images:
    im = Image.open(image).resize((299, 299))
    im = np.array(im)
    im = im.reshape(-1, 299, 299, 3)
    predict_values, logit_values = sess.run(
        [end_points['Predictions'], logits], feed_dict={input_tensor: im})
    print(np.max(predict_values), np.max(logit_values))
    print(np.argmax(predict_values), np.argmax(logit_values))
However, this model code does not give the expected results (class no. 918 is predicted irrespective of the input image). Can someone help me understand where I am going wrong?
The Inception networks expect the input image to have color channels scaled to [-1, 1]. As seen here.
You could either use the existing preprocessing, or in your example just scale the images yourself: im = 2*(im/255.0)-1.0 before feeding them to the network.
Without scaling, the input [0, 255] is much larger than the network expects and the biases all work to very strongly predict category 918 (comic books).
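Applied to the loop in the question, the suggested scaling is a small change (a sketch; the rest of the loop stays the same):
im = Image.open(image).resize((299, 299))
im = np.array(im, dtype=np.float32)
im = 2 * (im / 255.0) - 1.0   # scale [0, 255] -> [-1, 1] as the Inception networks expect
im = im.reshape(-1, 299, 299, 3)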
