I am trying to feed a grayscale image dataset (Fashion-MNIST) into a MobileNet model to predict handwritten digits, but according to this tutorial only RGB images can be loaded into the model. When I try to feed Fashion-MNIST samples, it gives me the following error:
Error when checking input: expected keras_layer_13_input to have shape
(224, 224, 3) but got array with shape (224, 224, 1)
How can I solve this problem?
A pre-trained MobileNet is probably not suitable for this task. You have two separate problems: MobileNet is made for ImageNet images, which are 224x224 with 3 color channels, while (Fashion-)MNIST images are 28x28 with a single channel. You can repeat the grayscale channel to get RGB:
import numpy as np

# data.shape: [70000, 224, 224, 1] -> [70000, 224, 224, 3]
data = np.repeat(data, 3, -1)
But before that, you need to resize the images, for example with PIL:
from PIL import Image
data = np.array([np.array(Image.fromarray(x).resize((224, 224))) for x in data])
There are some small details here which you should figure out yourself, such as the dtype of the images if you loaded the dataset as NumPy arrays; you may need to convert the values to integers with np.uint8().
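Putting the two steps together, here is a minimal hedged sketch (not from the original answer) that assumes data is a (N, 28, 28) uint8 NumPy array, as Fashion-MNIST is usually loaded:
import numpy as np
from PIL import Image

# data is assumed to be a (N, 28, 28) uint8 array of grayscale Fashion-MNIST images
def to_mobilenet_input(data):
    # resize every 28x28 image to 224x224 with PIL
    resized = np.array([np.array(Image.fromarray(x).resize((224, 224))) for x in data])
    # add the channel axis: (N, 224, 224) -> (N, 224, 224, 1)
    resized = resized[..., np.newaxis]
    # repeat the single channel three times: (N, 224, 224, 1) -> (N, 224, 224, 3)
    return np.repeat(resized, 3, axis=-1)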
MobileNet v2 needs RGB input. You might also be able to use the convert function from PIL.
Try this:
from PIL import Image
x = Image.open(input_image).resize((96, 96)).convert("RGB")
documentation is here: https://pillow.readthedocs.io/en/stable/reference/Image.html
Try this: x = np.stack((x,)*3, axis=-1). Please refer to this link for more details: https://github.com/malnakli/ML/blob/master/tf_serving_keras_mobilenetv2/main.ipynb
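As a quick hedged illustration of what that stacking does to the shapes (the array here is just a dummy example):
import numpy as np

x = np.zeros((224, 224), dtype=np.uint8)   # one grayscale image
x_rgb = np.stack((x,) * 3, axis=-1)        # copy the single plane onto 3 channels
print(x_rgb.shape)                         # (224, 224, 3)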
Related
I am working on a CNN model for the Fashion-MNIST dataset. I have created a successful CNN model, but I want to test the model on another image that I downloaded from the internet.
All my train and test images have the shape (28, 28, 1). For the image I want to predict, I resized it to (28, 28) and converted it to a single grayscale channel using
cv2.cvtColor(load_img_rz, cv2.COLOR_BGR2GRAY)
Now the shape of the image is (28, 28). When I try to feed it into the model, it shows this error:
ValueError: Input 0 of layer sequential_6 is incompatible with the layer: : expected min_ndim=4,
found ndim=3. Full shape received: (None, 28, 3)
I think the shape is the issue. So how can I convert it into the shape (28, 28, 1), if that is the issue?
And does a CNN work better on a single grayscale channel than on 3 RGB channels?
A very useful function for me in deep learning is expand_dims from NumPy.
your_image.shape
>>> (28, 28)
your_new_array = np.expand_dims(your_image, axis=-1)
your_new_array.shape
>>> (28, 28, 1)
You can play around with the axis parameter to get a better feeling of what is going on here.
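Since the error asks for min_ndim=4, the model also expects a batch axis in front of the image axes; a small hedged sketch (the variable names are placeholders):
import numpy as np

img = np.zeros((28, 28))             # your preprocessed grayscale image
img = np.expand_dims(img, axis=-1)   # add the channel axis -> (28, 28, 1)
batch = np.expand_dims(img, axis=0)  # add the batch axis   -> (1, 28, 28, 1)
# prediction = model.predict(batch)  # `model` is your trained CNN
print(batch.shape)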
Since you didn't include your code, I'll assume the problem is with your input layer. You need to specify the number of units and the input dimension in your input layer first:
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(X.shape[1], activation='relu', input_dim=X.shape[1]))  # choose whatever activation you want
and so on
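For completeness, a minimal hedged sketch of that pattern (not the answerer's exact code), assuming X holds flattened 28x28 images and y holds integer class labels:
from keras.models import Sequential
from keras.layers import Dense

# X: (num_samples, 784) flattened images, y: (num_samples,) integer labels 0-9
model = Sequential()
model.add(Dense(128, activation='relu', input_dim=X.shape[1]))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=5)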
It's hard to understand what you are dealing with and what you want to achieve, since you didn't specify or share anything, not even the code.
I have been trying to stack two images.
The end result will be used as the input to my convolutional neural network.
I tried using dstack, and I also tried PIL's Image.blend, but I cannot seem to arrive at my desired result.
Any other ideas I could use would be greatly appreciated.
This could help you out.
from PIL import Image

image1 = Image.open("img1.jpg")
image2 = Image.open("img2.jpg")
# resize both images to the same size
image1 = image1.resize((224, 224))
image2 = image2.resize((224, 224))
image1_size = image1.size
image2_size = image2.size
# paste them side by side on a canvas twice as wide
new_image = Image.new('RGB', (2 * image1_size[0], image1_size[1]), (250, 250, 250))
new_image.paste(image1, (0, 0))
new_image.paste(image2, (image1_size[0], 0))
new_image.save("merged_image.jpg")
Resize them so that they are the same size, and then use np.stack with axis=3 if you are using multi-channel images (else, use axis=2).
Or are you trying to combine them into one image? If so, how? Masking, adding, subtracting?
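As a hedged sketch of that stacking idea (the filenames are placeholders):
import numpy as np
from PIL import Image

# resize both images to the same size first
img1 = np.array(Image.open("img1.jpg").resize((224, 224)))
img2 = np.array(Image.open("img2.jpg").resize((224, 224)))
# RGB images of shape (224, 224, 3): stacking on axis=3 gives (224, 224, 3, 2)
stacked = np.stack([img1, img2], axis=3)
# for grayscale images of shape (224, 224), use axis=2 instead: (224, 224, 2)
print(stacked.shape)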
I have grayscale images of different dimensions, so I need to convert them to the same dimension (say, 28x28) for my experiments. I tried different methods and was able to do it, but I observed that resizing the image led to an increase in the number of channels. I am new to Python and image processing, so please help.
from PIL import Image
image = Image.open('6.tif')
image = image.resize((28, 28), Image.ANTIALIAS)
image.save('6.png', 'PNG', quality=100)
And then the following code shows different dimensions:
import imageio
image_data = imageio.imread("6.tif").astype(float)
print(image_data.shape)
image_data = imageio.imread("6.png").astype(float)
print(image_data.shape)
and result is:
(65, 74)
(28, 28, 4)
I don't need the last dimension. Where is it coming from? I get similar results even with "from resizeimage import resizeimage".
There are a number of issues with your code...
If you are expecting a greyscale image, make sure that is what you get. So, change this:
image = Image.open('6.tif')
to:
image = Image.open('6.tif').convert('L')
When you resize an image, you need to use one of the correct resampling methods:
PIL.Image.NEAREST
PIL.Image.BOX
PIL.Image.BILINEAR
PIL.Image.HAMMING
PIL.Image.BICUBIC
PIL.Image.LANCZOS
So, you need to replace the ANTIALIAS with something from the above list on this line:
image = image.resize((28, 28), Image.ANTIALIAS)
When you save as PNG, it is always lossless. The quality factor does not work the same way as for JPEG images, so you should omit it unless you have a good understanding of how it affects the PNG encoder.
If you make these changes, specifically the first, I think your problem will go away. Bear in mind, though, that the PNG encoder may take an RGB image and save it as a palettised image, or it may take a greyscale image and encode it as RGB or RGB+alpha.
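Putting those points together, a hedged sketch of the corrected pipeline for the question's files:
from PIL import Image

# force a single greyscale channel, resize with an explicit resampling filter,
# and save without a quality hint
image = Image.open('6.tif').convert('L')
image = image.resize((28, 28), Image.LANCZOS)
image.save('6.png', 'PNG')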
I was looking at this Tensorflow tutorial.
In the tutorial the images are magically read like this:
mnist = learn.datasets.load_dataset("mnist")
train_data = mnist.train.images
My images are placed in two directories:
../input/test/
../input/train/
They all have a *.jpg ending.
So how can I read them into my program?
I don't think I can use learn.datasets.load_dataset because this seems to take in a specialized dataset structure, while I only have folders with images.
mnist.train.images is essentially a numpy array of shape [55000, 784], where 55000 is the number of images and 784 is the number of pixels in each image (each image is 28x28).
You need to create a similar numpy array from your data if you want to run this exact code. So you'll need to iterate over all your images, read each image as a numpy array, flatten it, and build a matrix of size [num_examples, image_size].
The following code snippet should do it:
import os
import cv2
import numpy as np

def load_data(img_dir):
    return np.array([cv2.imread(os.path.join(img_dir, img)).flatten()
                     for img in os.listdir(img_dir) if img.endswith(".jpg")])
A more comprehensive code to enable debugging:
import os
import cv2
import numpy as np

list_of_imgs = []
img_dir = "../input/train/"
for img in os.listdir(img_dir):
    img = os.path.join(img_dir, img)
    if not img.endswith(".jpg"):
        continue
    a = cv2.imread(img)
    if a is None:
        print("Unable to read image", img)
        continue
    list_of_imgs.append(a.flatten())
train_data = np.array(list_of_imgs)
Note:
If your images are not 28x28x1 (black-and-white images), you will need to change the neural network architecture (defined in cnn_model_fn). The architecture in the tutorial is a toy architecture which only works for simple images like MNIST. AlexNet may be a good place to start for RGB images.
You can check the answers given in "How do I convert a directory of jpeg images to TFRecords file in tensorflow?". The easiest way is to use the utility provided by TensorFlow, build_image_data.py, which does exactly what you want to do.
I am having an issue feeding my own image into LeNet using the Caffe library. I have deployed the net and initialised the weights obtained through training without difficulty. As the net is trained on 28x28 inputs, I tried resizing the input image to 28x28 and feeding it into the deployed LeNet, but it gave me an "unhashable numpy array" error.
Not only that, I also tried to transpose it with img = img.transpose(img, (2,0,1)) after resizing, but that gave me "TypeError: only length-1 arrays can be converted to Python scalars".
Below is the Python code I have tried so far for pre-processing my image:
img = caffe.io.load_image('number5.png')
img = caffe.io.resize_image(img, (28,28), interp_order=3)
img = img.transpose(img, (2,0,1))
I am a beginner with Caffe and still in the process of learning. I hope someone can give me some examples or insights into how to pre-process an image before feeding it into the net.
Thank You.
Best regards.
Just use this instead:
img = img.transpose((2,0,1))
You can use caffe.io.Transformer
This is used to preprocess the caffe 'data' blob.
Define it as
transformer = caffe.io.Transformer({'data':net.blobs['data'].data.shape})
transformer.set_transpose('data',(2,0,1))
Then you can:
img = caffe.io.load_image('number5.png')
img = caffe.io.resize_image(img, (28,28), interp_order=3)
img_transposed=transformer.preprocess('data',img)
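To complete the picture, a hedged sketch of feeding the preprocessed array into the deployed net (assuming the net object is called net, its 'data' blob has been reshaped for a single image, and the output blob is named 'prob' as in the standard LeNet deploy file):
net.blobs['data'].data[...] = img_transposed  # copy the image into the input blob
output = net.forward()                        # run a forward pass
print(output['prob'].argmax())                # index of the predicted class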