I'm new to opencv so don't mind me!
I want to convert a black-and-white image to a grayscale image and save it using cv2.imwrite(). The problem is that after I save it to my local drive and read it back, it comes back as a 3-channel image. What is the problem here?
Here is my code:
import cv2
image = cv2.imread("path/to/image/image01.jpg")
print(image.shape) # return (128, 128, 3)
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
print(gray_image.shape) # return (128, 128)
cv2.imwrite("path/to/dir/gray01.jpg", gray_image)
new_gray_img = cv2.imread("path/to/dir/gray01.jpg")
print(new_gray_img.shape) # return (128, 128, 3)
Here is the image I want to convert to gray.
cv2.imread loads images with 3 channels by default.
You can replace the last two lines of your code with:
new_gray_img = cv2.imread("path/to/dir/gray01.jpg", cv2.IMREAD_GRAYSCALE)
(In old OpenCV versions this flag was called cv2.CV_LOAD_IMAGE_GRAYSCALE; that constant no longer exists.)
print(new_gray_img.shape)
Another method is to load the image with SciPy's imread (note: this function was removed in SciPy 1.2, so it only works on older versions):
from scipy.ndimage import imread
new_gray_image = imread("path/to/dir/gray01.jpg")
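If you are on a current SciPy, a drop-in replacement is the imageio package (assuming it is installed):
import imageio
new_gray_image = imageio.imread("path/to/dir/gray01.jpg")
print(new_gray_image.shape)  # a grayscale JPEG loads as (128, 128)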
Try reading the image directly in grayscale:
cv2.imread("path/to/dir/gray01.jpg", 0)
import cv2
image = cv2.imread("path/to/image/image01.jpg")
print(image.shape) # return (128, 128, 3)
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
print(gray_image.shape) # return (128, 128)
cv2.imwrite("path/to/dir/gray01.jpg", gray_image)
# here is the change
new_gray_img = cv2.imread("path/to/dir/gray01.jpg", 0)
print(new_gray_img.shape) # return (128, 128)
# organizing imports
import cv2
import numpy as np
import os
from PIL import ImageTk, Image
from os import listdir
from PIL import Image as PImage
# path to input image is specified and
# image is loaded with imread command
imagesdog = np.array([])
new_array = np.array([])
rootdirectory2 = os.listdir('alldata/data2/')
## Saves all images in directory to array from RGB values and resizes
for allimages in rootdirectory2:
    image = Image.open('alldata/data2/' + allimages)
    imagesave = np.asarray(image)  ## Filters the image
    smaller = cv2.resize(imagesave, (500, 500))
    data = Image.fromarray(smaller)
    IMG_SIZE = 500
    xyz = cv2.resize(smaller, (IMG_SIZE, IMG_SIZE))
    new_array = np.append(new_array, xyz)
## resized and appended, need to be added as a list
#cv2.imshow('Filtered', new_array[3])
for elements in new_array:
    print(elements)
image1 = cv2.imread('alldata/data/image3.jpeg')
image1 = cv2.resize(image1, (500,500))
# cv2.cvtColor is applied over the
# image input with applied parameters
# to convert the image in grayscale
img = cv2.cvtColor(image1, cv2.COLOR_BGR2GRAY)
# applying different thresholding
# techniques on the input image
# all pixels value above 120 will
# be set to 255
ret, thresh1 = cv2.threshold(img, 120, 255, cv2.THRESH_BINARY)
ret, thresh2 = cv2.threshold(img, 120, 255, cv2.THRESH_BINARY_INV)
ret, thresh3 = cv2.threshold(img, 120, 255, cv2.THRESH_TRUNC)
ret, thresh4 = cv2.threshold(img, 120, 255, cv2.THRESH_TOZERO)
ret, thresh5 = cv2.threshold(img, 120, 255, cv2.THRESH_TOZERO_INV)
# the window showing output images
# with the corresponding thresholding
# techniques applied to the input images
print(len(new_array))
cv2.imshow('Binary Threshold', new_array[499])
cv2.imshow('Binary Threshold Inverted', thresh2)
cv2.imshow('Truncated Threshold', thresh3)
cv2.imshow('Set to 0', thresh4)
cv2.imshow('Set to 0 Inverted', thresh5)
# De-allocate any associated memory usage
if cv2.waitKey(0) & 0xff == 27:
    cv2.destroyAllWindows()
I am working on a project where I intend to load images, reduce them to their RGB values in a list, resize them, and add them to an array to be filtered later on reload.
Right now, I am struggling to get the images separated. I have reduced the code to what is shown above. I keep getting an output of all the RGB values flattened into 40600 separate elements, and I can't pull the individual images back out after the initial appending.
Any help figuring out how to append the resized versions as separate entries and get back a multidimensional array of separated images?
Loading them with the same libraries, I couldn't resize them correctly, so I reduced it down to this method.
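The usual fix (a minimal sketch, not your exact pipeline): np.append flattens its inputs into one long 1-D array, which is why you end up with loose RGB values instead of images. Collect each resized image in a Python list instead, then stack the list into a single multidimensional array at the end.
import os
import cv2
import numpy as np
from PIL import Image

IMG_SIZE = 500
images = []  # keep each image as its own element instead of flattening with np.append
for filename in os.listdir('alldata/data2/'):
    img = np.asarray(Image.open('alldata/data2/' + filename))
    images.append(cv2.resize(img, (IMG_SIZE, IMG_SIZE)))

# one array of shape (num_images, 500, 500, 3), assuming all images load with 3 channels
batch = np.stack(images)
print(batch.shape)
cv2.imshow('Filtered', batch[3])  # now indexes a whole image, not a single scalar
cv2.waitKey(0)
cv2.destroyAllWindows()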
I have a high-resolution image with a size of 1024x1024.
I am trying to downscale it to 256x256 with the code below, but it produces unwanted artifacts in the top row.
Here is the Python code I tried.
img = np.array(Image.open('image.png'))
res = cv2.resize(img, dsize=(256, 256), interpolation=cv2.INTER_AREA)
Lanczos gives the same result:
img = np.array(Image.open('image.png'))
res = cv2.resize(img, dsize=(256, 256), interpolation=cv2.INTER_LANCZOS4)
How should I remove these artifacts?
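One thing worth checking (an assumption, since the file can't be inspected here): if the PNG carries an alpha channel, cv2.resize also interpolates the color values of fully transparent pixels, which can bleed stray values into border rows. Flattening to RGB before resizing sometimes removes such artifacts:
from PIL import Image
import numpy as np
import cv2

img = np.array(Image.open('image.png').convert('RGB'))  # drop any alpha channel first
res = cv2.resize(img, dsize=(256, 256), interpolation=cv2.INTER_AREA)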
The image comes from the front end as a PIL image. I preprocess it, but it gives a different shape than expected.
My code is:
def preprocess(img):
    img = np.array(img)
    resized = cv2.resize(img, (254, 254))
    img = tf.keras.preprocessing.image.img_to_array(resized) / 255
    img = np.array([img])
    return img
This is the PIL image:
<PIL.JpegImagePlugin.JpegImageFile image mode=L size=2144x1805 at 0x229615E2488>
And when I preprocess it, it gives this shape:
the shape of the test image is (1, 254, 254, 1)
When I try the preprocessing code outside of my project, it works fine.
This means that your network expects a color image, but your input is grayscale (note the mode=L in the PIL repr), so you are unknowingly dropping that channel information: resizing a single-channel array keeps it single-channel, and img_to_array then yields shape (254, 254, 1). Note that cv2.resize only accepts a 2-D target size, so something like cv2.resize(img, (254, 254, 3)) would raise an error; convert the image to three channels before resizing instead.
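A minimal sketch of that fix (assuming the network expects three channels; img is the incoming mode-L PIL image):
import numpy as np
import cv2
import tensorflow as tf

def preprocess(img):
    img = np.array(img.convert('RGB'))  # force 3 channels; a mode-L image would otherwise stay single-channel
    resized = cv2.resize(img, (254, 254))
    arr = tf.keras.preprocessing.image.img_to_array(resized) / 255
    return np.array([arr])  # shape (1, 254, 254, 3)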
I have an image that looks like this:
And this is the processed image
I have tried pretty much everything. I processed the image like this:
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) #Converting to GrayScale
(h, w) = gray.shape[:2]
gray = cv2.resize(gray, (w*2, h*2))
thresh = cv2.threshold(gray, 150, 255.0, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
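# rectKernel is assumed to be defined earlier, e.g. rectKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))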
gray = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, rectKernel)
blur = cv2.GaussianBlur(gray,(1,1),cv2.BORDER_DEFAULT)
text = pytesseract.image_to_string(blur, config="--oem 1 --psm 6")
But Tesseract doesn't print out anything. I am using this version of Tesseract:
5.0.0-alpha.20201127
How do I improve its performance? It's highly unreliable.
Edit:
The answer below did a wonderful job on the image above.
But when I apply the same technique to an image like this one, I get the wrong output.
Why is that? They seem roughly the same.
The problem is that the characters are not in the center of the image.
Tesseract sometimes has difficulty recognizing characters or digits when they are not centered.
Therefore my suggestion is:
Center the characters
Up-sample and convert to gray-scale
Centering the characters:
cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=[255])
50 is just a padding value; you can set it to any other value.
The background turns blue because of the value argument: OpenCV stores images in BGR order, so passing [255] is interpreted as [255, 0, 0], which fills the blue channel but leaves green and red at zero.
You can try other values. For me it doesn't matter, since I convert to gray-scale in the next step.
Up-sampling and converting to gray-scale:
The same steps you have already done: the first three lines of your code.
Now when you run the OCR, you get:
MEHVISH MUQADDAS
Code:
import cv2
import pytesseract
# Load the image
img = cv2.imread("onf0D.jpg")
# Center the image
img = cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=[255])
# Up-sample
img = cv2.resize(img, (0, 0), fx=2, fy=2)
# Convert to gray-scale
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# OCR
txt = pytesseract.image_to_string(gry, config="--psm 6")
print(txt)
Read more: tesseract-improve-quality.
You don't need the threshold, GaussianBlur, or morphologyEx steps.
The reasons are:
A simple threshold is used to pull out an image's features, and this input's features are already plainly visible.
You don't have to smooth the image; there is no illumination effect on it.
You don't need to do segmentation, since the background is plain white.
Update-1
The second image requires pre-processing. However, applying simple-threshold won't work on this image. You need to remove the background using a binary mask, then you can apply OCR.
Result of the binary-mask:
Now, if you apply OCR:
IRUM FEROZ
Code:
import cv2
import numpy as np
import pytesseract
# Load the image
img = cv2.imread("jCMft.jpg")
# Center the image
img = cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=[255])
# Up-sample
img = cv2.resize(img, (0, 0), fx=2, fy=2)
# Convert to HSV color-space
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Binary mask via color thresholding
msk = cv2.inRange(hsv, np.array([0, 0, 0]), np.array([179, 255, 130]))
# OCR
txt = pytesseract.image_to_string(msk, config="--psm 6")
print(txt)
Q: How do I find the lower and upper bounds for cv2.inRange?
A: You can use the following script.
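A minimal sketch of such a range-picker script (the window and trackbar names are illustrative; move the sliders until the mask isolates the text):
import cv2
import numpy as np

def nothing(x):
    pass

img = cv2.imread("jCMft.jpg")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

cv2.namedWindow("mask")
for name, maxval in [("H_lo", 179), ("S_lo", 255), ("V_lo", 255),
                     ("H_hi", 179), ("S_hi", 255), ("V_hi", 255)]:
    # "lo" bars start at 0, "hi" bars start at their maximum
    cv2.createTrackbar(name, "mask", maxval if name.endswith("hi") else 0, maxval, nothing)

while True:
    lo = np.array([cv2.getTrackbarPos(n, "mask") for n in ("H_lo", "S_lo", "V_lo")])
    hi = np.array([cv2.getTrackbarPos(n, "mask") for n in ("H_hi", "S_hi", "V_hi")])
    cv2.imshow("mask", cv2.inRange(hsv, lo, hi))
    if cv2.waitKey(30) & 0xFF == 27:  # Esc quits
        break
cv2.destroyAllWindows()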
Q: What did you change in the second image?
A: First I converted the image to the HSV format instead of gray-scale, because I wanted to remove the background. If you experiment with adaptiveThreshold, you will see there are a lot of artifacts on the background that limit Tesseract's recognition. Then I used cv2.inRange to get a binary mask. Feeding the binary mask as the input gave me the desired result.
I have the following code to load an image:
img = imread(os.path.join('./Faces/','10.png'))
print(img.shape)
img = np.mean(img, axis=2)
img = img.astype(int)
print(img.shape)
The output of this code is as follows:
(200, 180, 3)
(200, 180)
I understand that I'm averaging the RGB layers into a single greyscale value, but my Keras input layer is defined with shape (200, 280, 1). Is there a way to change the shape to this? Is there even a functional difference between matrices of the two shapes printed above?
You could use the expand_dims function in numpy (see documentation).
It works as follows in your case:
img = img.astype(int)
print(img.shape)
# Prints (200, 180)
img = np.expand_dims(img, axis=2)
print(img.shape)
# Prints (200, 180, 1)
You shouldn't average the channels. There's a particular weighting of the RGB channels for transforming a picture to grayscale, and it's not a convenient one-third each. It is:
((0.3 * R) + (0.59 * G) + (0.11 * B))
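As a quick worked example of that weighting (assuming an RGB array; OpenCV itself uses the very close coefficients 0.299/0.587/0.114):
import numpy as np

img = np.random.randint(0, 256, (200, 180, 3), dtype=np.uint8)  # stand-in RGB image
gray = img @ np.array([0.3, 0.59, 0.11])  # weighted sum over the last (channel) axis
print(gray.shape)  # (200, 180)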
Instead of averaging or doing it manually, I suggest that you use:
import cv2
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
Then add a dimension:
img = img[..., np.newaxis]
or
img = np.expand_dims(img, -1)
The functional difference is that your CNN obviously will not see color if you turn the image into grayscale, so it won't be able to use that information to classify.
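Putting both steps together (a minimal sketch; the path matches the question's example):
import os
import cv2
import numpy as np

img = cv2.imread(os.path.join('./Faces/', '10.png'))  # BGR, shape (200, 180, 3)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)          # shape (200, 180)
img = gray[..., np.newaxis]                           # shape (200, 180, 1), ready for the Keras input layer
print(img.shape)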