I am trying to OCR pictures of documents, and my current approach is:
Read the image as grayscale
Binarize it with thresholding
Warp perspective along the contours obtained from cv2.findContours()
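In code, the pipeline is roughly the following (a minimal sketch; the filename, threshold value, and output size are placeholders, corner ordering is glossed over, and OpenCV 4.x is assumed):
import cv2
import numpy as np

# step 1: read as grayscale
img = cv2.imread('document.jpg', cv2.IMREAD_GRAYSCALE)
# step 2: binarize with a global threshold
_, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
# step 3: find the paper outline, approximate it to 4 corners, and warp
contours, _ = cv2.findContours(binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
page = max(contours, key=cv2.contourArea)
corners = cv2.approxPolyDP(page, 0.02 * cv2.arcLength(page, True), True)
if len(corners) == 4:
    src = corners.reshape(4, 2).astype(np.float32)
    dst = np.float32([[0, 0], [0, 1000], [1000, 1000], [1000, 0]])
    M = cv2.getPerspectiveTransform(src, dst)
    warped = cv2.warpPerspective(img, M, (1000, 1000))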
The above works well if the image is not shadowed. Now I want to get the contours of shadowed pictures. My first attempt was to use cv2.adaptiveThreshold for step 2. The adaptive threshold successfully weakened the shadow, but the resulting image lost the contrast between the paper and the background, which made it impossible for cv2 to find the contours of the paper. So I need another method to remove the shadow.
Is there any way to remove the shadow while maintaining the background colour?
For reference, here is the sample picture I am processing with various approaches. From left, I did:
grayscale
thresholding
adaptive thresholding
normalization
My goal is to obtain the second picture, but without the shadow.
Please note that I actually have a temporary solution specific to this picture, which is to process the shadowed part of the picture separately. Yet it is not a general solution for shadowed pictures, as its performance depends on the size, shape, and position of the shadow, so please suggest other methods.
This is the original picture.
Here is one way in Python/OpenCV using division normalization, optionally followed by sharpening and/or thresholding.
Input:
import cv2
import numpy as np
import skimage.filters as filters
# read the image
img = cv2.imread('receipt.jpg')
# convert to gray
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# blur
smooth = cv2.GaussianBlur(gray, (95,95), 0)
# divide gray by the blurred image to even out the illumination
division = cv2.divide(gray, smooth, scale=255)
# sharpen using unsharp masking
sharp = filters.unsharp_mask(division, radius=1.5, amount=1.5, multichannel=False, preserve_range=False)  # newer scikit-image versions replace multichannel with channel_axis
sharp = (255*sharp).clip(0,255).astype(np.uint8)
# threshold
thresh = cv2.threshold(sharp, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# save results
cv2.imwrite('receipt_division.png', division)
cv2.imwrite('receipt_division_sharp.png', sharp)
cv2.imwrite('receipt_division_thresh.png', thresh)
# show results
cv2.imshow('smooth', smooth)
cv2.imshow('division', division)
cv2.imshow('sharp', sharp)
cv2.imshow('thresh', thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()
Division:
Sharpened:
Thresholded:
Related
Using the threshold functions in OpenCV on an image to get a binary image: with Otsu's thresholding I get an image that has white spots due to different lighting conditions in parts of the image,
while with adaptive thresholding, which fixes the lighting conditions, it fails to accurately represent the pencil-filled bubbles that Otsu actually can represent.
How can I get both the filled bubbles represented and fixed lighting conditions, without the patches?
Here's the original image
Here is my code
#binary image conversion
thresh2 = cv2.adaptiveThreshold(papergray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, 21, 13)
thresh = cv2.threshold(papergray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
cv2.imshow("Binary", thresh) #Otsu's
cv2.imshow("Adpative",thresh2)
An alternative approach would be to apply a morphological closing, which would remove all the drawing, yielding an estimate of the illumination level. Dividing the image by the illumination level gives you an image of the sheet corrected for illumination:
In this image we can easily apply a global threshold:
I used the following code:
import diplib as dip

img = dip.ImageRead('tlCw6.jpg')(1)  # take one channel (green)
corrected = img / dip.Closing(img, dip.SE(40, 'rectangular'))  # divide by the illumination estimate
out = dip.Threshold(corrected, method='triangle')[0]  # global triangle threshold
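If you don't have diplib, a rough OpenCV equivalent of the same idea (a sketch, not a tested drop-in; the 40x40 rectangular structuring element and the triangle threshold mirror the parameters above):
import cv2

img = cv2.imread('tlCw6.jpg')[:, :, 1]  # one channel, as in the diplib code
# closing removes the dark drawing, leaving an estimate of the illumination
se = cv2.getStructuringElement(cv2.MORPH_RECT, (40, 40))
illumination = cv2.morphologyEx(img, cv2.MORPH_CLOSE, se)
# divide by the illumination estimate, then threshold globally
corrected = cv2.divide(img, illumination, scale=255)
out = cv2.threshold(corrected, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_TRIANGLE)[1]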
This can be done with cv2.ADAPTIVE_THRESH_MEAN_C:
import cv2
img = cv2.imread("omr.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, 31, 10)
cv2.imshow("Mean Adaptive Thresholding", thresh)
cv2.waitKey(0)
The output is:
Problems with your approach:
The methods you have tried out:
Otsu threshold: the threshold is decided based on all the pixel values in the entire image (a global technique). If you look at the bottom left of your image, there is a gray shade which can have an adverse effect on deciding the threshold value.
Adaptive threshold: here is a recent answer on why it isn't helpful. In short, it acts like an edge detector for smaller kernel sizes.
What you can try:
OpenCV's ximgproc module has specialized binary image generation methods. One such method is the popular Niblack threshold technique.
This is a local threshold technique that depends on statistical measures. It divides the image into blocks (sub-images) of a size predefined by the user. For each block, a threshold is set based on the block's mean minus k times the standard deviation of its pixel values, where k is decided by the user.
Code:
import cv2

img = cv2.imread('omr_sheet.jpg')
blur = cv2.GaussianBlur(img, (3, 3), 0)
gray = cv2.cvtColor(blur, cv2.COLOR_BGR2GRAY)
niblack = cv2.ximgproc.niBlackThreshold(gray, 255, cv2.THRESH_BINARY, 41, -0.1, binarizationMethod=cv2.ximgproc.BINARIZATION_NICK)  # NICK, a variant from the Niblack family
Result:
Links:
To know more about cv2.ximgproc.niBlackThreshold
There are other binarization techniques available there that you may want to explore. The page also contains links to research papers that explain each of these techniques in detail.
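For example, reusing gray from the snippet above, the same call can switch to Sauvola's method (the k value here is an assumption to tune per image):
sauvola = cv2.ximgproc.niBlackThreshold(gray, 255, cv2.THRESH_BINARY, 41, 0.2, binarizationMethod=cv2.ximgproc.BINARIZATION_SAUVOLA)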
Edit:
Adaptive threshold actually works if you know what you are working with. You can decide the kernel size beforehand.
See Prashant's answer.
I'm working on a script that uses different OpenCV operations for processing an image of solar panels on a house roof. My original image is the following:
After processing the image, I get the edges of the panels as follows:
It can be seen that some rectangles are broken due to the reflection of the Sun in the picture.
I would like to know if it's possible to fix those broken rectangles, maybe by using the pattern of those which are not broken.
My code is the following:
import cv2
import numpy as np

# Load image
color_image = cv2.imread("google6.jpg")
cv2.imshow("Original", color_image)
# Convert to gray
img = cv2.cvtColor(color_image, cv2.COLOR_BGR2GRAY)
# Apply various filters
img = cv2.GaussianBlur(img, (5, 5), 0)
img = cv2.medianBlur(img, 5)
img = img & 0x88 # 0x88
img = cv2.fastNlMeansDenoising(img, h=10)
# Invert to binary
ret, thresh = cv2.threshold(img, 127, 255, 1)
# Perform morphological erosion
kernel = np.ones((5, 5), np.uint8)
erosion = cv2.morphologyEx(thresh, cv2.MORPH_ERODE, kernel, iterations=2)
# Invert image and blur it
ret, thresh1 = cv2.threshold(erosion, 127, 255, 1)
blur = cv2.blur(thresh1, (10, 10))
# Perform another threshold on blurred image to get the central portion of the edge
ret, thresh2 = cv2.threshold(blur, 145, 255, 0)
# Perform morphological erosion to thin the edge by ellipse structuring element
kernel1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
contour = cv2.morphologyEx(thresh2, cv2.MORPH_ERODE, kernel1, iterations=2)
# Get edges
final = cv2.Canny(contour, 249, 250)
cv2.imshow("final", final)
I have tried to modify all the filters I'm using in order to reduce as much as possible the effect of the Sun in the original picture, but that is as far as I have been able to go.
I'm in general happy with the result of all those filters (although any advice is welcome), so I'd like to work on the black/white image I showed, which is already smooth enough for the post-processing I need to do.
Thanks!
The pattern is not broken in the original image, so it being broken in your binarized result must mean your binarization is not optimal.
You apply threshold() to binarize the image, and then Canny() to the binary image. The problems here are:
Thresholding removes a lot of information; it should always be the last step of any processing pipeline. Anything you lose here, you've lost for good.
Canny() should be applied to a gray-scale image, not a binary image.
The Canny edge detector is an edge detector, but you want to detect lines, not edges. See here for the difference.
So, I suggest starting from scratch.
The Laplacian of Gaussian is a very simple line detector. I took these steps:
Read in image, convert to grayscale.
Apply Laplacian of Gaussian with sigma = 2.
Invert (negate) the result and then set negative values to 0.
This is the output:
From here, it should be relatively straightforward to identify the grid pattern.
I don't post code because I used MATLAB for this, but you can accomplish the same result in Python with OpenCV; here is a demo for applying the Laplacian of Gaussian in OpenCV.
This is Python + OpenCV code to replicate the above:
import cv2
color_image = cv2.imread("/Users/cris/Downloads/L3RVh.jpg")
img = cv2.cvtColor(color_image, cv2.COLOR_BGR2GRAY)
out = cv2.GaussianBlur(img, (0, 0), 2) # Note! Specify size of Gaussian by the sigma, not the kernel size
out = cv2.Laplacian(out, cv2.CV_32F)
_, out = cv2.threshold(-out, 0, 1e9, cv2.THRESH_TOZERO)
However, it looks like OpenCV doesn't linearize (apply gamma correction) when converting from BGR to gray, whereas the conversion function I used when creating the image above does. I think this gamma correction might have improved the results a bit by reducing the response to the roof tiles.
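If you want to approximate that linearization in OpenCV, here is a minimal sketch (assuming a simple power of 2.2 in place of the exact sRGB transfer curve):
import cv2

# undo gamma before the gray conversion, then proceed as before
color_image = cv2.imread("/Users/cris/Downloads/L3RVh.jpg")
linear = (color_image.astype('float32') / 255.0) ** 2.2
img = cv2.cvtColor(linear, cv2.COLOR_BGR2GRAY)
out = cv2.GaussianBlur(img, (0, 0), 2)
out = cv2.Laplacian(out, cv2.CV_32F)
_, out = cv2.threshold(-out, 0, 1e9, cv2.THRESH_TOZERO)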
I have some cropped images, and I need images that have black text on a white background. First I apply adaptive thresholding, and then I try to remove noise. I tried a lot of noise removal techniques, but when the image changed, the techniques I used failed.
The best method for converting image color to binary for my images is Adaptive Gaussian Thresholding. Here is my code:
im_gray = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)
image = cv2.GaussianBlur(im_gray, (5,5), 1)
th = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 3, 2)
I need smooth values, the decimal separator (dot), and the postfix letters. How can I do this?
Before binarization, it is necessary to correct the nonuniform illumination of the background. For example, like this:
import cv2
image = cv2.imread('9qBsB.jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
se = cv2.getStructuringElement(cv2.MORPH_RECT, (8, 8))
bg = cv2.morphologyEx(image, cv2.MORPH_DILATE, se)
out_gray = cv2.divide(image, bg, scale=255)
out_binary = cv2.threshold(out_gray, 0, 255, cv2.THRESH_OTSU)[1]
cv2.imshow('binary', out_binary)
cv2.imwrite('binary.png',out_binary)
cv2.imshow('gray', out_gray)
cv2.imwrite('gray.png',out_gray)
Result:
You can do slightly better using division normalization in Python/OpenCV.
Input:
import cv2
import numpy as np
# load image
img = cv2.imread("license_plate.jpg")
# convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# blur
blur = cv2.GaussianBlur(gray, (0,0), sigmaX=33, sigmaY=33)
# divide
divide = cv2.divide(gray, blur, scale=255)
# otsu threshold
thresh = cv2.threshold(divide, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1]
# apply morphology
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
morph = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
# write result to disk
cv2.imwrite("hebrew_text_division.jpg", divide)
cv2.imwrite("hebrew_text_division_threshold.jpg", thresh)
cv2.imwrite("hebrew_text_division_morph.jpg", morph)
# display it
cv2.imshow("gray", gray)
cv2.imshow("divide", divide)
cv2.imshow("thresh", thresh)
cv2.imshow("morph", morph)
cv2.waitKey(0)
cv2.destroyAllWindows()
Division Image:
Thresholded Image:
Morphology Cleaned Image:
I'm assuming that you are preprocessing the image for OCR (Optical Character Recognition).
I had a project to detect license plates, and these were the steps I did; you can apply them to your project. After graying the image, try applying histogram equalization to it, which allows areas of the image with lower contrast to gain higher contrast. Then blur the image to reduce the noise in the background. Next, apply edge detection on the image, making sure that noise is sufficiently removed, as ED is susceptible to it. Lastly, apply closing (dilation then erosion) on the image to close all the small holes inside the words. A minimal sketch of these steps follows.
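Here is that pipeline as a rough sketch (the filename, Canny thresholds, and kernel size are assumptions to tune for your images):
import cv2

img = cv2.imread('plate.jpg', cv2.IMREAD_GRAYSCALE)
# equalize histogram: low-contrast areas gain contrast
eq = cv2.equalizeHist(img)
# blur to reduce background noise before edge detection
blur = cv2.GaussianBlur(eq, (5, 5), 0)
# edge detection (susceptible to noise, hence the blur above)
edges = cv2.Canny(blur, 100, 200)
# closing (dilation then erosion) fills small holes inside the words
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
closed = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)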
Instead of separate erode and dilate steps, you can check this, which is basically both in one:
import cv2
from matplotlib import pyplot as plt

kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 2))
morphology_img = cv2.morphologyEx(img_grey, cv2.MORPH_OPEN, kernel, iterations=1)
plt.imshow(morphology_img, 'Greys_r')
See: Morphological Transformations (OpenCV documentation).
I have tried to get the edge of the mask image with the following code:
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread('ISIC_0000000_segmentation.png',0)
edges = cv2.Canny(img,0,255)
plt.subplot(121), plt.imshow(img, cmap='gray')
plt.title('Original Image'), plt.xticks([]), plt.yticks([])
plt.subplot(122), plt.imshow(edges, cmap='gray')
plt.title('Edge Image'), plt.xticks([]), plt.yticks([])
plt.show()
What I get is this:
But the edge isn't smooth for some reason.
My plan was to use the edge image to crop the following picture:
Does anyone know how I could make the edge image better and how I could use this to crop the normal image?
EDIT: @Mark Setchell made a good point: if I could use the mask image directly to crop the image, that would be great.
Also: it may be possible to lay the normal image precisely over the mask image so that the black area on the mask would cover the blue-ish area on the normal picture.
EDIT: @Mark Setchell introduced the idea of multiplying the normal image by the mask image, so that the background would become 0 (black) and the rest would keep its color. Would it be a problem for the multiplication that my mask image is a .png and my normal picture is a .jpg?
EDIT:
I have written the following code to try to multiply two pictures:
# Importing Image and ImageChops module from PIL package
from PIL import Image, ImageChops
# creating an Image object for the photo
im1 = Image.open("ISIC_0000000.jpg")
# creating an Image object for the mask
im2 = Image.open("ISIC_0000000_segmentation.png")
# applying multiply method
im3 = ImageChops.multiply(im1, im2)
im3.show()
But I get the error:
ValueError: images do not match
Does anyone know how I could solve this?
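For what it's worth, ImageChops.multiply() requires both images to share the same mode and size, so an RGB .jpg photo will not match a single-channel mask. A sketch of one possible fix (assuming the mode/size mismatch is indeed the cause), converting the mask before multiplying:
from PIL import Image, ImageChops

im1 = Image.open("ISIC_0000000.jpg")
# bring the mask to the photo's mode and size before multiplying
im2 = Image.open("ISIC_0000000_segmentation.png").convert(im1.mode).resize(im1.size)
im3 = ImageChops.multiply(im1, im2)
im3.show()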
If I understand correctly, you want to extract the object and remove the background. To do this, you can just do a simple cv2.bitwise_and() with the mask and the original input image.
Does anyone know how I could make the edge image better and how I could use this to crop the normal image?
To remove the background, you don't need an edge image; the thresholded image can be used to keep only the desired parts of the image. You can use the mask image directly to crop the image and remove the background. Other approaches to obtaining a binary mask include using a fixed threshold value, adaptive thresholding, or Canny edge detection. Here's a simple example using Otsu's threshold to obtain a binary mask, followed by a bitwise-and operation.
Here's the result with the removed background
You can also turn all the pixels outside the mask white if you want the removed background to be white.
Note: Depending on how "smooth" you want the result, you can apply any blur to the image before thresholding to smooth out the edges; a sketch of this follows the code below. This can include averaging, Gaussian, median, or bilateral filtering.
Code
import cv2
# Load image, grayscale, Otsu's threshold
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Remove background using bitwise-and operation
result = cv2.bitwise_and(image, image, mask=thresh)
result[thresh==0] = [255,255,255] # Turn background white
cv2.imshow('thresh', thresh)
cv2.imshow('result', result)
cv2.waitKey()
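Per the note above, a sketch of adding the optional blur before the threshold (the kernel size is an assumption to tune):
import cv2

image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# smooth before thresholding so the mask boundary is less jagged
gray = cv2.GaussianBlur(gray, (5, 5), 0)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]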
The detected edge isn't smooth because the actual edge in the image isn't smooth. You can try filtering the original image first with low-pass filters.
If you can use contours, the following will work:
import numpy as np
import cv2
from matplotlib import pyplot as plt
# Read in image
imgRaw = cv2.imread('./Misc/edgesImg.jpg',0)
# Blur image
blurSize = 25
blurredImg = cv2.blur(imgRaw,(blurSize,blurSize))
# Convert to Binary
thrImgRaw, binImgRaw = cv2.threshold(imgRaw, 0, 255, cv2.THRESH_OTSU)
thrImgBlur, binImgBlur = cv2.threshold(blurredImg, 0, 255, cv2.THRESH_OTSU)
# Detect the contours in the image (assumes OpenCV 4.x, where findContours returns (contours, hierarchy))
contoursRaw = cv2.findContours(binImgRaw,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
contoursBlur = cv2.findContours(binImgBlur,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
# Draw all the contours
contourImgOverRaw = cv2.drawContours(imgRaw, contoursRaw[0], -1, (0,255,0),5)
contourImgOverBlur = cv2.drawContours(blurredImg, contoursBlur[0], -1, (0,255,0),5)
# Plotting
plt.figure()
plt.subplot(121)
plt.imshow(contourImgOverRaw)
plt.title('Raw Edges'), plt.xticks([]), plt.yticks([])
plt.subplot(122)
plt.imshow(contourImgOverBlur)
plt.title('Edges with {}px Blur'.format(blurSize)), plt.xticks([]), plt.yticks([])
plt.show()
EDIT: Here's more info on getting a mask of an image from contours.
You can use morphological operations to get the edge.
Sorry for using MATLAB:
I = imbinarize(rgb2gray(imread('I.png'))); %Load input image, and convert to binary image.
%Erode the image with mask 3x3
J = imerode(I, ones(3));
%Perform XOR operation (1 xor 1 = 0, 0 xor 0 = 0, 0 xor 1 = 1, 1 xor 0 = 1)
J = xor(I, J);
%Use "skeleton" operation to make sure eage thikness is 1 pixel.
K = bwskel(J);
Result:
As Mark mentioned, you don't need the edges for cropping (unless you are using a special cropping method that I am not aware of).
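A rough Python/OpenCV equivalent of the MATLAB steps above (a sketch; Otsu stands in for imbinarize, and cv2.ximgproc.thinning stands in for bwskel, which assumes opencv-contrib-python is installed):
import cv2
import numpy as np

img = cv2.imread('I.png', cv2.IMREAD_GRAYSCALE)
# binarize the input
binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# erode with a 3x3 mask, then XOR with the original to get the boundary
eroded = cv2.erode(binary, np.ones((3, 3), np.uint8))
edge = cv2.bitwise_xor(binary, eroded)
# thin the boundary to a thickness of 1 pixel
skel = cv2.ximgproc.thinning(edge)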
I have a small project where I need to calculate the area of the hair portion and tell which of two images covers the greater area. I have other code for hair extraction; however, it is also not giving the result I expected, as you may have guessed already from the image below. I will work on it later.
I am trying to calculate the area from the contours, which is giving me an error like:
OpenCV(3.4.4) C:\projects\opencv-python\opencv\modules\imgproc\src\contours.cpp:195: error: (-210:Unsupported format or combination of formats) [Start]FindContours supports only CV_8UC1 images when mode != CV_RETR_FLOODFILL otherwise supports CV_32SC1 images only in function 'cvStartFindContours_Impl'
So, why is findContours not supporting my image?
Another approach:
I only need to find the area of the hair portion, so I thought of calculating the area covered by all the white pixels and then subtracting it from the area of the whole image. In this case, I do not know how to calculate the area covered by all the white pixels. I thought of this approach because the hair color can vary, but the background will always be white.
So, is this technique possible? Or can you suggest some solution for the above-mentioned error?
My image:
My code:
import cv2
import numpy as np
img = cv2.imread("Hair.jpg")
_, contours, _ = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
c = max(contours, key = cv2.contourArea)
cv2.drawContours(img, [c], -1, (255,255, 255), -1)
area = cv2.contourArea(c)
print(area)
cv2.imshow("contour", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Your error already tells you what is wrong, specifically this part:
FindContours supports only CV_8UC1 images when mode
This means that it has to be a greyscale image. You pass an image loaded with:
img = cv2.imread("Hair.jpg")
which by default returns the image as CV_8UC3, or in simple words, in the BGR colorspace, even if your image only contains black and white. The solution is to load it as greyscale:
img = cv2.imread("Hair.jpg", cv2.IMREAD_GRAYSCALE)
Also, I notice that this is a .jpg file, which may introduce some artifacts that you may not like/want. To remove them, use threshold:
ret, thresh1 = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
I hope this helps you; if not, leave a comment.
Update:
The findContours function takes black as the background and white as the foreground. In your case it is the other way around, but there is an easy way to solve this: just invert the image as it is passed:
_, contours, _ = cv2.findContours(255-img, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
255 is the max value the image can have, and this will turn black into white and white into black, giving you the correct contour.
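Putting the pieces together, a sketch of the corrected pipeline (assuming OpenCV 4.x, where findContours returns two values; under OpenCV 3.4.4 keep the three-value unpacking from the question):
import cv2

img = cv2.imread("Hair.jpg", cv2.IMREAD_GRAYSCALE)
# threshold to remove JPEG artifacts, then invert so the hair is white
_, thresh = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
contours, _ = cv2.findContours(255 - thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
c = max(contours, key=cv2.contourArea)
print(cv2.contourArea(c))
# the pixel-counting alternative from the question: after inversion,
# hair pixels are the non-zero ones, so count them directly
print(cv2.countNonZero(255 - thresh))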