So I have this code in Python using TensorFlow which opens an image and resizes it:
def parse_image(filename):
    label = filename
    image = tf.io.read_file(path + '/' + filename)
    print(image.shape)
    image = tf.image.decode_png(image)
    image = tf.image.convert_image_dtype(image, tf.float32)
    image = tf.image.resize(image, [340, 340])
    print(image.shape)
    return image, label
When I open a test grayscale image using OpenCV:
img = cv2.imread(path+"/test.png")
print(img.shape) #returns (2133,3219,3)
But when I call the above function on the same image:
image, label = parse_image('test.png')
print(image.shape) #returns (340,340,1)
I know the 340 x 340 is the width and height I just set, but why did the number of channels change? I'm trying to calculate the structural similarity, but the test image and the image I want to compare against have different channel counts, which raises an error. The worst part is that it's this specific test image; other grayscale images work fine.
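tf.image.decode_png keeps whatever channel count the PNG file actually stores, while cv2.imread with no flags converts everything to 3-channel 8-bit BGR, so a truly single-channel PNG comes out as (340, 340, 1) from the TensorFlow pipeline and (2133, 3219, 3) from OpenCV. A minimal sketch forcing a consistent channel count on the TensorFlow side, assuming three channels are what's wanted:
import tensorflow as tf
raw = tf.io.read_file(path + '/test.png')
# channels=3 always yields (H, W, 3), replicating gray into RGB;
# channels=1 would always yield (H, W, 1) instead
image = tf.image.decode_png(raw, channels=3)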
I'm trying to print a 15x15 pixel image using skimage in Python.
Here is the image
Note: I resized this image using img.resize((15,15))
When I print(image.shape), the result is (15, 15, 3). How can I remove the 3 so it displays the structure of the image?
After a while I tried with an image I resized using https://www.resizepixel.com/edit and got this image
Then when I print(img.shape), the result is now (15, 15).
Here is the code I used:
image = skimage.io.imread(fname="15x15.png")
print(image.shape)
print(image)
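If the goal is a 2-D (15, 15) array, one approach (a sketch, assuming the file has an RGB or RGBA layout) is to read it as grayscale directly, or to convert after loading:
import skimage.io
import skimage.color
# as_gray=True returns a 2-D float array scaled to [0, 1]
image = skimage.io.imread(fname="15x15.png", as_gray=True)
print(image.shape)  # (15, 15)
# or convert an already-loaded RGB array
rgb = skimage.io.imread(fname="15x15.png")
gray = skimage.color.rgb2gray(rgb[..., :3])  # drop alpha if present
print(gray.shape)  # (15, 15)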
I've converted some images from RGB to grayscale for ML purposes.
However, the converted grayscale image still has a 3-channel shape, the same as the color image.
The code for the conversion:
from PIL import Image
img = Image.open('path/to/color/image')
imgGray = img.convert('L')
imgGray.save('path/to/grayscale/image')
The code to check the shape of the images:
import cv2
im_color = cv2.imread('path/to/color/image')
print(im_color.shape)
im_gray2 = cv2.imread('path/to/grayscale/image')
print(im_gray2.shape)
You did:
im_gray2 = cv2.imread('path/to/grayscale/image')
OpenCV does not inspect the colorness of the image - it assumes the image is color and that the desired output is 8-bit BGR. You need to tell OpenCV that you want the output to be grayscale (a 2D intensity array), as follows:
im_gray2 = cv2.imread('path/to/grayscale/image', cv2.IMREAD_GRAYSCALE)
If you want to know more about reading images, read OpenCV: Getting Started with Images.
cv.imread, without any flags, will always convert any image content to BGR, 8 bits per channel.
If you want any image file, grayscale or color, to be read as grayscale, you can pass the cv.IMREAD_GRAYSCALE flag.
If you want to read the file as it really is, then you need to use cv.IMREAD_UNCHANGED.
im_color = cv2.imread('path/to/color/image', cv2.IMREAD_UNCHANGED)
print(im_color.shape)
im_gray2 = cv2.imread('path/to/grayscale/image', cv2.IMREAD_UNCHANGED)
print(im_gray2.shape)
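For a PNG that really is stored single-channel, the expected shapes (assuming the dimensions from the first question's test image) would be something like:
import cv2
print(cv2.imread('test.png').shape)                        # (2133, 3219, 3) - forced to BGR
print(cv2.imread('test.png', cv2.IMREAD_GRAYSCALE).shape)  # (2133, 3219)
print(cv2.imread('test.png', cv2.IMREAD_UNCHANGED).shape)  # (2133, 3219) - as stored in the file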
I'm converting a .png image to float32 in the following way and obtaining a broken image, as shown below. If I remove the tf.image.convert_image_dtype call, everything goes well.
image = tf.io.read_file(filename)
image = tf.image.decode_png(image, channels=3)
image = tf.image.convert_image_dtype(image, tf.float32)
I've also tried different images in different formats like .bmp and .jpg, but the same thing happens. The code I use to visualize the image generated the above way is just:
a = a.numpy()
a = Image.fromarray(a, 'RGB')
As I've said, if I just remove the tf.image.convert_image_dtype call, everything goes well.
Here are the download links for both images (I have less than 10 reputation here, so I can't upload photos yet).
original_image
obtained_image
You can convert it back to integer like this (Image.fromarray with mode 'RGB' expects 8-bit values, so passing float32 data produces a garbled image):
import tensorflow as tf
import numpy as np
from PIL import Image
image = tf.io.read_file("C:\\<your file>.png")
image = tf.image.decode_png(image, channels=3)
image = tf.image.convert_image_dtype(image, tf.float32)
a = image.numpy()
a = (a * 255 / np.max(a)).astype('uint8')  # rescale floats to the full 0-255 range and cast to integers
a = Image.fromarray(a, 'RGB')
a.show()
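Alternatively, a sketch that lets TensorFlow do the round trip back to uint8 (assuming the float image is still in the [0, 1] range that convert_image_dtype produces):
import tensorflow as tf
from PIL import Image
image = tf.io.read_file("C:\\<your file>.png")
image = tf.image.decode_png(image, channels=3)
image = tf.image.convert_image_dtype(image, tf.float32)  # uint8 -> float in [0, 1]
image = tf.image.convert_image_dtype(image, tf.uint8, saturate=True)  # float -> uint8 in [0, 255]
Image.fromarray(image.numpy(), 'RGB').show()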
I'm using PIL to resize a JPG. I'm expecting the same image, resized, as output, but instead I get a correctly sized black box. The new image file is completely devoid of any information, just an empty file. Here is an excerpt from my script:
basewidth = 300
img = Image.open(path_to_image)
wpercent = (basewidth/float(img.size[0]))
hsize = int((float(img.size[1])*float(wpercent)))
img = img.resize((basewidth,hsize))
img.save(dir + "/the_image.jpg")
I've tried resizing with Image.LANCZOS as the second argument (it defaults to Image.NEAREST with one argument), but it didn't make a difference. I'm running Python 3 on Ubuntu 16.04. Any ideas on why the image file is empty?
I also encountered the same issue when trying to resize an image with a transparent background. The resize works after I add a white background to the image.
Code to add a white background and then resize the image:
from PIL import Image
im = Image.open("path/to/img")
if im.mode == 'RGBA':
    alpha = im.split()[3]                    # extract the alpha channel
    bgmask = alpha.point(lambda x: 255 - x)  # mask covering the transparent areas
    im = im.convert('RGB')
    im.paste((255, 255, 255), None, bgmask)  # paint those areas white
im = im.resize((new_width, new_height), Image.LANCZOS)  # Image.ANTIALIAS in older Pillow
ref:
Other's code for making thumbnail
Python: Image resizing: keep proportion - add white background
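A shorter alternative (a sketch, assuming an RGBA source and the same new_width/new_height as above) is to composite the image onto an opaque white canvas before dropping the alpha channel:
from PIL import Image
im = Image.open("path/to/img").convert("RGBA")
bg = Image.new("RGBA", im.size, (255, 255, 255, 255))  # opaque white canvas
im = Image.alpha_composite(bg, im).convert("RGB")      # flatten, then drop alpha
im = im.resize((new_width, new_height), Image.LANCZOS)
im.save("resized.jpg")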
The simplest way to get to the bottom of this is to post your image! Failing that, we can check the various aspects of your image.
So import Numpy and PIL, open your image, and convert it to a Numpy ndarray; you can then inspect its characteristics:
import numpy as np
from PIL import Image
# Open image
img = Image.open('unhappy.jpg')
# Convert to Numpy Array
n = np.array(img)
Now you can print and inspect the following things:
n.shape # we are expecting something like (1580, 1725, 3)
n.dtype # we expect dtype('uint8')
n.max() # if there's white in the image, we expect 255
n.min() # if there's black in the image, we expect 0
n.mean() # we expect some value between 50-200 for most images
I have a problem with my task: I need to extract text from an image using Python + Tesseract, but the image quality is not high - it's a screenshot.
I'm using the OpenCV library and have tried two variants:
COLOR_BGR2GRAY
ADAPTIVE_THRESH_GAUSSIAN_C | THRESH_BINARY
but neither variant works correctly.
When I binarize the image:
def binarize_image(img_path, threshold=195):
    """Binarize an image."""
    image_file = Image.open(img_path)
    image = image_file.convert('L')  # convert image to monochrome
    image = np.array(image)
    image = binarize_array(image, threshold)
    im = Image.fromarray(image)
    im.save(img_path)
    # imsave(target_path, image)
def binarize_array(numpy_array, threshold=250):
    """Binarize a numpy array."""
    for i in range(len(numpy_array)):
        for j in range(len(numpy_array[0])):
            if numpy_array[i][j] > threshold:
                numpy_array[i][j] = 255
            else:
                numpy_array[i][j] = 0
    return numpy_array
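As an aside, the per-pixel loops above can be replaced with a single vectorized NumPy expression that performs the same thresholding much faster:
import numpy as np
def binarize_array_fast(numpy_array, threshold=250):
    """Vectorized equivalent of binarize_array."""
    return np.where(numpy_array > threshold, 255, 0).astype(np.uint8)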
Tesseract usually fails to extract the text.
How can I resolve this problem?
screenshot example
UPD: I solved my problem; I needed to add some pixels between the letters. In the screenshot the letters are white and the background is black. How can I do that using NumPy?
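One way (a sketch, assuming a 2-D uint8 array with white letters on a black background, as produced by binarize_array above, and a made-up gap width) is to find the all-black columns between letters and pad extra blank columns at the end of each gap:
import numpy as np
def widen_letter_gaps(binary, extra=3):
    """Insert `extra` blank columns at the end of every background gap."""
    is_gap = binary.max(axis=0) == 0  # columns containing no letter pixels
    out_cols = []
    for i in range(binary.shape[1]):
        out_cols.append(binary[:, i])
        # widen each gap once, at its last column
        if is_gap[i] and (i + 1 == binary.shape[1] or not is_gap[i + 1]):
            out_cols.extend([np.zeros(binary.shape[0], dtype=binary.dtype)] * extra)
    return np.stack(out_cols, axis=1)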