I have already trained my Keras model and saved it as an .h5 file. The model uses 6 classes and can classify all of them from images. It outputs the name of the class it predicts. However, I also want to report the accuracy of the prediction when testing the model on an image supplied by the user. I have searched everywhere but have not found an answer to this problem.
model = load_model('prototype-tl2-80-20.h5')
classes = {1: 'Kacip Fatimah',
           3: 'Misai Adam',
           4: 'Pandan Serapat',
           5: 'Tapak Sulaiman',
           6: 'Tongkat Ali'}
image = Image.open(file_path)
image = image.resize((224,224))
image = numpy.expand_dims(image, axis=0)
image = numpy.array(image)
pred = model.predict_classes([image])[0]
sign = classes[pred+1]
To predict an image using a trained model, you have to make sure the image is processed exactly as the training images were. The image should be the same size (height, width) as the training images and have the same number of color bands, for example 'rgb' or 'grayscale'. Make sure the color bands are in the same order as in training. Next, you must apply the same preprocessing to the image. For example, if your training images were scaled to lie between 0 and 1, then you need to rescale your test image with image=image/255. After that, do:
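pred = model.predict(image)       # shape (1, num_classes): one score per class
index = numpy.argmax(pred[0])     # position of the highest score
probability = pred[0][index]      # a probability only if the last layer is a softmax
print(index, classes[index + 1], probability)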
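A minimal sketch of reading the class name and a confidence value off one prediction, assuming the model ends in a softmax layer so that the scores for an image sum to 1. The `probs` values are made up and stand in for `model.predict(image)[0]`, and `class_names` holds placeholder labels, not the classes from the question.

```python
import numpy as np

def top_prediction(probs, class_names):
    """Return (class name, confidence) for one row of softmax output."""
    probs = np.asarray(probs, dtype=float)
    index = int(np.argmax(probs))          # position of the largest probability
    return class_names[index], float(probs[index])

# Made-up stand-in for model.predict(image)[0]
probs = [0.05, 0.85, 0.10]
class_names = ["class_a", "class_b", "class_c"]   # placeholder labels

name, confidence = top_prediction(probs, class_names)
print(f"{name}: {confidence:.1%}")
```

Note that this number is the model's confidence for a single image. Accuracy in the usual sense is the fraction of correct predictions over a labelled test set, so it cannot be computed from one unlabelled image.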
I'm trying to detect mostly text fields (not OCR, i.e., my goal is not to detect what is written but the areas where some text is written). Some images may matter as well, but less so.
For example:
The output must be a feature array.
I had a VGG19 from Keras working, but the results aren't great, I think because it isn't trained to deal with text fields or documents.
img = image.load_img(imagepath, target_size=(224, 224))
img_data = image.img_to_array(img)
img_data = np.expand_dims(img_data, axis=0)
img_data = preprocess_input(img_data)
features = model.predict(img_data)
features = np.array(features)  # expected output
Is there any CNN already trained to do this? If not, what approach do you suggest?
I am trying to train a U-Net model to do per-pixel regression predictions on images. To do this, I split my large image (1000x1000) into 200x200 pixel squares and use those to train an FCN model with a linear final layer. The loss function is MSE loss. In the prediction stage, I extract the same boxes, run the model on each, and stitch the results together to obtain the final output image. The problem is that there are discontinuities at the boundaries between boxes (I can clearly see the boxes).
I've tried to deal with this by feeding 250x250 boxes to my FCN and calculating the loss only on the 200x200 centre region. I do the same in the prediction stage: extract 250x250 patches, crop the 200x200 centre region, and stitch the image back together. Please see some code below:
Loss Function:
criterion = nn.MSELoss()
optimizer = optim.Adam(self.model.parameters(), lr=LR)
for inputs, labels in train_loader:
    inputs, labels = inputs.to(device), labels.to(device)
    output = model(inputs)
    output = output.squeeze()
    _, dimx, dimy = output.shape
    loss = criterion(output[:,25:dimx-25, 25:dimy-25], labels[:,25:dimx-25, 25:dimy-25])
My code for predictions is as follows:
pred = np.zeros((height, width))
for i in range(25, height, 200):
    for j in range(25, width, 200):
        patch = img[:, i-25:i+225, j-25:j+225]
        patch = torch.from_numpy(patch)
        patch = patch.unsqueeze(dim=0).to(device)
        out = model(patch)
        out = out[0,0,25:225, 25:225]
        pred[i:i+200, j:j+200] = out.cpu().numpy()
I'm not sure if my problem makes complete sense. I can provide more clarification if necessary but I have been stuck on this for a while now.
Discontinuities near the boundaries make sense, because nothing during training requires the network to produce smooth predictions across boxes.
I assume you have limited GPU memory, so you take only 200x200 pixels as input at a time. I would therefore suggest the following two possible workarounds.
First, you could use torchvision.transforms.RandomCrop to generate 200x200 cropped regions as training inputs. At test time, you feed the whole image to the model directly. The intuition is that the model sees images at full resolution, the same as the test data, while consuming less GPU memory during training. In this case, you should also expect the model to need more time to learn all the patterns in the training data, because it only sees part of the data at a time.
Second, you could simply downsample the training data, say by 0.5x, and keep the output size at 1x. For example, in your case, after downsampling the input image to 200x200 (0.2x), the model takes it and predicts 1000x1000 pixel-level labels (you could use bilinear upsampling or deconv layers). This workaround has been used in some segmentation implementations (AdaptSeg, DISE).
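For per-pixel regression, the input and its target have to be cropped at the same location, and a random-crop transform called separately on each would normally draw two different windows. Here is a minimal NumPy sketch of a paired random crop for the first workaround. The function name `random_crop_pair`, the array layouts, and the dummy data are assumptions for illustration, not part of the original code.

```python
import numpy as np

def random_crop_pair(image, label, size, rng):
    """Cut the same random size x size window out of image and label.

    image: array of shape (channels, height, width)
    label: array of shape (height, width)
    """
    _, height, width = image.shape
    top = int(rng.integers(0, height - size + 1))    # upper bound is exclusive
    left = int(rng.integers(0, width - size + 1))
    image_crop = image[:, top:top + size, left:left + size]
    label_crop = label[top:top + size, left:left + size]
    return image_crop, label_crop

rng = np.random.default_rng(0)
image = rng.random((3, 1000, 1000))   # dummy 1000x1000 input
label = image.sum(axis=0)             # dummy per-pixel regression target

image_crop, label_crop = random_crop_pair(image, label, 200, rng)
```

Drawing a fresh window every epoch means the patch borders fall in different places each time, so the network is not trained against one fixed grid.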
After some troubleshooting I realized that I had this problem because I was performing batch normalization between each convolutional layer. Removing that step solved the discontinuity problem.
I'm trying to build a model for image classification but can't figure out how to plot validation images with the predicted class (and probability) like in this guide:
Since I use ImageDataGenerator, I can get the predicted classes and probabilities, but I can't get the images themselves.
for n in range(30):
_ = plt.suptitle("Model predictions")
I get ValueError: could not broadcast input array from shape (400,80,80,3) into shape (400)
I know that the problem is in image_batch[n].
Please advise me how to pass the image itself to imshow().
Code for the image generator I use:
img_generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1/255)
image_batch = img_generator.flow_from_directory(
target_size=(size, size),
I restored a pre-trained model for face detection which takes a single image at a time and returns bounding boxes. How can I make it take a batch of images if these images have different sizes?
You can use the tf.image.resize_images method to achieve this (in TensorFlow 2.x the equivalent is tf.image.resize). According to the docs for tf.image.resize_images:
Resize images to size using the specified method.
Resized images will be distorted if their original aspect ratio is not
the same as size. To avoid distortions see
How to use it?
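import tensorflow as tf
from tensorflow.python.keras.layers import Input, Lambda
from tensorflow.python.keras.models import Model

x = Input(shape=(None, None, 3), name='image_input')
# Wrap the TF op in a Lambda layer so Keras can track it as part of the model
resize_x = Lambda(lambda t: tf.image.resize_images(t, [32, 32]))(x)
# load_vgg() is a placeholder for however you build or restore your own model
outputs = load_vgg()(resize_x)
model = Model(inputs=x, outputs=outputs)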
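Images in one batch still have to share a shape before they can be stacked into a single array. A minimal NumPy sketch of that step, using a simple nearest-neighbour resize as a stand-in for tf.image.resize_images. The helper names `resize_nearest` and `make_batch` and the dummy images are assumptions for illustration.

```python
import numpy as np

def resize_nearest(image, height, width):
    """Nearest-neighbour resize of an (H, W, C) array to (height, width, C)."""
    src_h, src_w = image.shape[:2]
    rows = np.arange(height) * src_h // height   # source row for each output row
    cols = np.arange(width) * src_w // width     # source column for each output column
    return image[rows[:, None], cols]

def make_batch(images, height, width):
    """Resize every image to a common size and stack into (N, height, width, C)."""
    return np.stack([resize_nearest(img, height, width) for img in images])

# Two dummy images with different sizes
images = [np.zeros((480, 640, 3)), np.ones((100, 50, 3))]
batch = make_batch(images, 32, 32)
```

For a detector, boxes predicted on the resized images would then need to be scaled back by each image's original-to-resized ratio.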
I have successfully built a multi-class CNN in Keras for image classification. I am now ready to start predicting, but some of the test images do not belong to any of the labels, and they are still mistakenly classified as one of the labels.
Here is my predict function:
def predict(img):
    x = img.resize((img_width, img_height), Image.ANTIALIAS)
    x = img_to_array(x)
    x = np.expand_dims(x, axis=0)
    array = model.predict(x)
    result = array[0]
    answer = np.argmax(result)
    return answer
I am thinking of discarding the prediction if the maximum value of the prediction array is below a certain threshold, but I am not sure how low I should set it.
You'll need another dataset to estimate the best threshold, or you can train a new model with an extra class for all the images that do not have a label.
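One way the threshold could be estimated from such a held-out set is sketched below, assuming you have collected the maximum softmax value for some images of the trained classes and for some images that belong to none of them. The confidence values are made up, and balanced accuracy is just one possible selection criterion.

```python
import numpy as np

def pick_threshold(known_conf, unknown_conf):
    """Choose the confidence cut-off with the best balanced accuracy.

    known_conf:   max softmax value for images of the trained classes
    unknown_conf: max softmax value for images that belong to no class
    """
    known_conf = np.asarray(known_conf, dtype=float)
    unknown_conf = np.asarray(unknown_conf, dtype=float)
    best_threshold, best_score = 0.0, -1.0
    # Every observed confidence value is a candidate cut-off
    for threshold in np.unique(np.concatenate([known_conf, unknown_conf])):
        kept = np.mean(known_conf >= threshold)        # known images accepted
        rejected = np.mean(unknown_conf < threshold)   # unknown images discarded
        score = (kept + rejected) / 2
        if score > best_score:
            best_threshold, best_score = float(threshold), float(score)
    return best_threshold, best_score

# Made-up confidences for illustration only
known = [0.99, 0.95, 0.90, 0.80]
unknown = [0.40, 0.55, 0.60, 0.85]
threshold, score = pick_threshold(known, unknown)
```

A prediction would then be discarded whenever its maximum value falls below `threshold`. Softmax outputs are often overconfident on unfamiliar inputs, so the extra "none of the above" class may still work better in practice.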