How do I remove highlight from a printed textbook in Python?

I'm working on extracting the highlighted text from a textbook. I have already handled locating the highlight and extracting the text inside it. To deal with the highlight, I converted the image to grayscale and used Otsu thresholding to remove the highlighted background color.
This works great when the highlight is a light color like yellow or green, but when the highlight is a dark color the thresholding fails and I get a black background covering most of the text, which hinders the OCR reading.
I have tried normalising the brightness but it does not seem to work.
What I need is some way to determine the foreground and background colors and then remove the background color. Or I need some way to dynamically threshold the image to get black text on a white background.
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# spread the intensity histogram before thresholding
normalized_gray = cv2.equalizeHist(gray)
# Otsu picks the threshold automatically; the 127 argument is ignored
(thresh, processed_image) = cv2.threshold(normalized_gray, 127, 255, cv2.THRESH_OTSU)
The test image:
https://ibb.co/856YtMx
Some test results:
When I run equalizeHist before thresholding:
https://ibb.co/HT0jpKW
When I run equalizeHist after thresholding:
https://ibb.co/ZXSz97J
When I use a binary threshold, the text is blown away:
https://ibb.co/DLXywXz

E.g. something like this should work:
import cv2
import numpy as np
image = cv2.imread('photo-2019-08-12-12-44-59.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# adjust contrast
gray_contrast = cv2.multiply(gray, 1.5)
# create a kernel for the erode
kernel = np.ones((2, 2), np.uint8)
img_eroded = cv2.erode(gray_contrast, kernel, iterations=1)
# binarize with Otsu
(thresh, otsu) = cv2.threshold(img_eroded, 127, 255,
                               cv2.THRESH_BINARY + cv2.THRESH_OTSU)
You can also have a look at the post How to remove shadow from scanned images using OpenCV.
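For reference, the core idea in that post is division normalization: estimate the page background with a heavy blur, then divide the image by it so uneven illumination (or a shadow) flattens out. A minimal sketch, assuming the same test image; the blur size is an assumption to tune:
import cv2
gray = cv2.cvtColor(cv2.imread('photo-2019-08-12-12-44-59.jpg'), cv2.COLOR_BGR2GRAY)
# estimate the slowly-varying background (shadow/highlight) with a large median blur
background = cv2.medianBlur(gray, 21)
# divide out the background so the page becomes roughly uniform white
normalized = cv2.divide(gray, background, scale=255)
# Otsu then separates ink from the flattened page
(thresh, binary) = cv2.threshold(normalized, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)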

Adaptive thresholding is what you need here.
My output, with the code below. It can be fine-tuned.
import cv2
import numpy as np
img = cv2.imread("high.jpg")
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gaus = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 20)
cv2.imshow("Gaussian", gaus)
cv2.waitKey(0)
cv2.imwrite('output.png', gaus)
UPDATE
Changed the parameters of the adaptiveThreshold call for the second image you posted.
gaus = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 8)
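If you want to fine-tune it yourself, the last two arguments are the ones to sweep: blockSize (the odd-sized neighborhood) and C (the constant subtracted from the local weighted mean). A small sketch, with ranges chosen arbitrarily, that writes one result per combination so you can compare them visually:
import cv2
img_gray = cv2.cvtColor(cv2.imread("high.jpg"), cv2.COLOR_BGR2GRAY)
# blockSize must be odd; a larger C removes more faint background
for block in (15, 31, 51):
    for C in (4, 8, 12, 20):
        out = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                    cv2.THRESH_BINARY, block, C)
        cv2.imwrite(f"output_b{block}_C{C}.png", out)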

Related

tesseract detects only 4 words from image

I have very simple Python code:
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = 'C:\\Tesseract-OCR\\tesseract.exe'
img = cv2.imread('1.png')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
hImg, wImg, _ = img.shape
# detecting words
boxes = pytesseract.image_to_data(img)
for x, b in enumerate(boxes.splitlines()):
    if x != 0:
        b = b.split()
        if len(b) == 12:
            x, y, w, h = int(b[6]), int(b[7]), int(b[8]), int(b[9])
            cv2.rectangle(img, (x, y), (w + x, h + y), (0, 0, 255), 3)
cv2.imshow('result', img)
cv2.waitKey(0)
But the result was interesting: it detected only 4 words. What could be the reason?
You'll have better OCR results if you improve the quality of the image you are giving Tesseract.
While tesseract version 3.05 (and older) handles inverted images (dark background and light text) without problems, for the 4.x versions use dark text on a light background.
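If you are on 4.x and the input might be light-on-dark, one simple heuristic (my sketch, not part of the original answer) is to binarize and invert whenever the result is mostly dark:
import cv2
img = cv2.imread('1.png', cv2.IMREAD_GRAYSCALE)
(_, bw) = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# mostly dark pixels suggest light text on a dark background: invert for tesseract 4.x
if bw.mean() < 127:
    bw = cv2.bitwise_not(bw)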
Convert from BGR to HLS to later remove background colors from the numbers in the top half of the image. Then, create a "blue" mask with cv2.inRange and replace anything that's not "blue" with the color white.
hls = cv2.cvtColor(img, cv2.COLOR_BGR2HLS)
# Define lower and upper limits for the number colors.
blue_lo = np.array([114, 70, 70])
blue_hi = np.array([154, 225, 225])
# Mask image to only select "blue"
mask = cv2.inRange(hls, blue_lo, blue_hi)
# Copy the original image and paint everything outside the mask white
img1 = img.copy()
img1[mask == 0] = (255, 255, 255)
Help pytesseract by converting the image to black and white
This is converting an image to black and white. Tesseract does this internally (Otsu algorithm), but the result can be suboptimal, particularly if the page background is of uneven darkness.
rgb = cv2.cvtColor(img1, cv2.COLOR_HLS2RGB)
gray = cv2.cvtColor(rgb, cv2.COLOR_RGB2GRAY)
_, img1 = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imshow('img_to_binary',img1)
Use image_to_data on the previously created img1 and continue with your existing code.
...
hImg, wImg, _ = img.shape
# detecting words
boxes = pytesseract.image_to_data(img1)
for x, b in enumerate(boxes.splitlines()):
    ...
...

How to improve Tesseract's output

I have an image that looks like this:
And this is the processed image
I have tried pretty much everything. I processed the image like this:
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # converting to grayscale
(h, w) = gray.shape[:2]
gray = cv2.resize(gray, (w * 2, h * 2))
thresh = cv2.threshold(gray, 150, 255.0, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
rectKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))  # kernel not shown in the original post; size assumed
gray = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, rectKernel)
blur = cv2.GaussianBlur(gray, (1, 1), cv2.BORDER_DEFAULT)
text = pytesseract.image_to_string(blur, config="--oem 1 --psm 6")
But Tesseract doesn't print out anything. I am using this version of tesseract:
5.0.0-alpha.20201127
How do I improve its performance? It's highly unreliable.
Edit:
The answer below did a wonderful job on the said image.
But when I apply this technique to an image like this one, I get the wrong output.
Why is that? They seem roughly the same.
The problem is that the characters are not in the center of the image.
Sometimes, tesseract has difficulty recognizing characters or digits if they are not in the center.
Therefore my suggestion is:
Center the characters
Up-sample and convert to gray-scale
Centering the characters:
img = cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=[255])
50 is just a padding value; you can set it to any other value.
The background turns blue because of the value argument. OpenCV reads the image in BGR order, so giving [255] as input is the same as [255, 0, 0]: the blue channel is set, but not the green and red channels.
You can try other values. For me it won't matter, since I'll convert the image to gray-scale in the next step.
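For example, passing all three channels explicitly gives a white border instead (purely illustrative):
# all three BGR channels set, so the border is white rather than blue
img = cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=[255, 255, 255])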
Up-sampling and converting to gray-scale:
The same steps you have already done: the first three lines of your code.
Now when you run the OCR, you read:
MEHVISH MUQADDAS
Code:
import cv2
import pytesseract
# Load the image
img = cv2.imread("onf0D.jpg")
# Center the image
img = cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=[255])
# Up-sample
img = cv2.resize(img, (0, 0), fx=2, fy=2)
# Convert to gray-scale
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# OCR
txt = pytesseract.image_to_string(gry, config="--psm 6")
print(txt)
Read more: tesseract-improve-quality.
You don't need threshold, GaussianBlur or morphologyEx here.
The reasons are:
Simple thresholding is used to extract the features of the image, and the input image's features are already clear.
You don't have to smooth the image, as there is no illumination effect in it.
You don't need segmentation, since the background is plain white.
Update-1
The second image requires pre-processing. However, applying a simple threshold won't work on this image. You need to remove the background using a binary mask, and then you can apply OCR.
Result of the binary-mask:
Now, if you apply OCR:
IRUM FEROZ
Code:
import cv2
import numpy as np
import pytesseract
# Load the image
img = cv2.imread("jCMft.jpg")
# Center the image
img = cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=[255])
# Up-sample
img = cv2.resize(img, (0, 0), fx=2, fy=2)
# Convert to HSV color-space
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Binary mask on the HSV image (not an adaptive threshold)
msk = cv2.inRange(hsv, np.array([0, 0, 0]), np.array([179, 255, 130]))
# OCR
txt = pytesseract.image_to_string(msk, config="--psm 6")
print(txt)
Q: How do I find the lower and upper bounds for the cv2.inRange method?
A: You can use the following script.
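(The script itself was not captured in this copy; a typical trackbar-based range picker along those lines might look like the sketch below. The window name and initial ranges are my assumptions, not the original script.)
import cv2
import numpy as np
def nothing(x):
    pass
img = cv2.imread("jCMft.jpg")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
cv2.namedWindow("bounds")
# one trackbar per HSV channel bound (H max is 179 in OpenCV)
for name, init, maxval in [("H_lo", 0, 179), ("S_lo", 0, 255), ("V_lo", 0, 255),
                           ("H_hi", 179, 179), ("S_hi", 255, 255), ("V_hi", 255, 255)]:
    cv2.createTrackbar(name, "bounds", init, maxval, nothing)
while True:
    lo = np.array([cv2.getTrackbarPos(n, "bounds") for n in ("H_lo", "S_lo", "V_lo")])
    hi = np.array([cv2.getTrackbarPos(n, "bounds") for n in ("H_hi", "S_hi", "V_hi")])
    cv2.imshow("bounds", cv2.inRange(hsv, lo, hi))
    if cv2.waitKey(50) & 0xFF == 27:  # press Esc to quit
        break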
Q: What did you change in the second image?
A: First, I converted the image to the HSV format instead of gray-scale, because I wanted to remove the background. If you experiment with adaptiveThreshold, you will see there are a lot of artifacts on the background that limit tesseract's recognition. Then I used cv2.inRange to get a binary mask. Feeding the binary mask to pytesseract gave me the desired result.

Is it possible to extract text inside a colored background region using OpenCV in Python?

I have the following table area from the original image:
I'm trying to extract the text from this table, but when thresholding, the whole gray region gets darkened. For example, like below.
The threshold type I used:
thresh_value = cv2.threshold(original_gray, 128, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
Is it possible to change the gray background to white while leaving the black text pixels as they are?
You should use adaptive thresholding in Python/OpenCV.
Input:
import cv2
import numpy as np
# read image
img = cv2.imread("text_table.jpg")
# convert img to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# do adaptive threshold on gray image
thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 11)
# write results to disk
cv2.imwrite("text_table_thresh.jpg", thresh)
# display it
cv2.imshow("thresh", thresh)
cv2.waitKey(0)
Result

Python OpenCV mask shows a black image

I'm experimenting with OpenCV's find contours for object detection. I want it to keep the blue portion of the Python logo while masking away everything else; however, my masked image is just a black window.
I've tried different lower and upper boundary colours and different image formats, but everything results in a black window.
import cv2
import numpy as np
img = cv2.imread('python logo.png')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV) #IMG now displaying in hsv
lower_sand = np.array([84, 89, 56]) #RGB FORMAT
upper_sand = np.array([178, 175, 104]) # RGB FORMAT
mask = cv2.inRange(hsv, lower_sand, upper_sand)
cv2.imshow("Masked logo", mask)
cv2.imshow("Python Logo", img)

removing pixels less than n size (noise) in an image - OpenCV Python

I am trying to remove noise (groups of pixels smaller than some size) in an image and am currently running this code:
import numpy as np
import argparse
import cv2
from skimage import morphology
# Construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
                help="Path to the image")
args = vars(ap.parse_args())
# Load the image and display it
image = cv2.imread(args["image"])
cv2.imshow("Image", image)
cv2.imwrite("image.jpg", image)
greenLower = np.array([50, 100, 0], dtype = "uint8")
greenUpper = np.array([120, 255, 120], dtype = "uint8")
green = cv2.inRange(image, greenLower, greenUpper)
#green = cv2.GaussianBlur(green, (3, 3), 0)
cv2.imshow("green", green)
cv2.imwrite("green.jpg", green)
cleaned = morphology.remove_small_objects(green, min_size=64, connectivity=2)
cv2.imshow("cleaned", cleaned)
cv2.imwrite("cleaned.jpg", cleaned)
cv2.waitKey(0)
However, the image does not seem to change from "green" to "cleaned" despite using the remove_small_objects function. Why is this, and how do I clean the image up? Ideally I would like to isolate only the image of the cabbage.
My thought process is, after thresholding, to remove pixels less than 100 in size, then smooth the image with a blur and fill up the black holes surrounded by white; that is what I did in MATLAB. If anybody could direct me to get the same results as my MATLAB implementation, that would be greatly appreciated. Thanks for your help.
Edit: I made a few mistakes when changing the code; it is now updated to the current version, and the 3 images are displayed below.
image:
green:
clean:
My goal is to get something like the picture below, from my MATLAB implementation:
Preprocessing
A good idea when you're filtering an image is to lowpass or blur it a bit; that way neighboring pixels become a little more uniform in color, which smooths out brighter and darker spots and keeps holes out of your mask.
import cv2
import numpy as np
img = cv2.imread('image.jpg')
blur = cv2.GaussianBlur(img, (15, 15), 2)
lower_green = np.array([50, 100, 0])
upper_green = np.array([120, 255, 120])
mask = cv2.inRange(blur, lower_green, upper_green)
masked_img = cv2.bitwise_and(img, img, mask=mask)
cv2.imshow('', masked_img)
cv2.waitKey()
Colorspace
Currently, you're trying to threshold the image with a range of colors spanning different brightnesses: you want green pixels, regardless of whether they are dark or light. This is much more easily accomplished in the HSV colorspace. Check out my answer here going in-depth on the HSV colorspace.
img = cv2.imread('image.jpg')
blur = cv2.GaussianBlur(img, (15, 15), 2)
hsv = cv2.cvtColor(blur, cv2.COLOR_BGR2HSV)
lower_green = np.array([37, 0, 0])
upper_green = np.array([179, 255, 255])
mask = cv2.inRange(hsv, lower_green, upper_green)
masked_img = cv2.bitwise_and(img, img, mask=mask)
cv2.imshow('', masked_img)
cv2.waitKey()
Removing noise in a binary image/mask
The answer provided by ngalstyan shows how to do this nicely with morphology. What you want to do is called opening, which is the combined process of eroding (which more or less just removes everything within a certain radius) and then dilating (which adds back to any remaining objects however much was removed). In OpenCV, this is accomplished with cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel). The tutorials on that page show how it works nicely.
img = cv2.imread('image.jpg')
blur = cv2.GaussianBlur(img, (15, 15), 2)
hsv = cv2.cvtColor(blur, cv2.COLOR_BGR2HSV)
lower_green = np.array([37, 0, 0])
upper_green = np.array([179, 255, 255])
mask = cv2.inRange(hsv, lower_green, upper_green)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
opened_mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
masked_img = cv2.bitwise_and(img, img, mask=opened_mask)
cv2.imshow('', masked_img)
cv2.waitKey()
Filling in gaps
In the above, opening was shown as the method to remove small bits of white from your binary mask. Closing is the opposite operation---removing chunks of black from your image that are surrounded by white. You can do this with the same idea as above, but using cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel). This isn't even necessary after the above in your case, as the mask doesn't have any holes. But if it did, you could close them up with closing. You'll notice my opening step actually removed a small bit of the plant at the bottom. You could actually fill those gaps with closing first, and then opening to remove the spurious bits elsewhere, but it's probably not necessary for this image.
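If the mask did have holes, the close-then-open sequence described above would look like this (a short sketch reusing mask and kernel from the previous snippet):
closed_mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # fill black holes surrounded by white
opened_mask = cv2.morphologyEx(closed_mask, cv2.MORPH_OPEN, kernel)  # then remove small white specks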
Trying out new values for thresholding
You might want to get more comfortable playing around with different colorspaces and threshold levels to get a feel for what will work best for a particular image. It's not complete yet and the interface is a bit wonky, but I have a tool you can use online to try out different thresholding values in different colorspaces; check it out here if you'd like. That's how I quickly found values for your image.
The above problem is solved using cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel), but if somebody wants to use morphology.remove_small_objects to remove areas smaller than a specified size, this answer may be helpful.
The code I used to remove noise in the above image is:
import numpy as np
import cv2
from skimage import morphology
# Load the image
image = cv2.imread('im.jpg')
cv2.imshow("Image", image)
#cv2.imwrite("image.jpg", image)
greenLower = np.array([50, 100, 0], dtype = "uint8")
greenUpper = np.array([120, 255, 120], dtype = "uint8")
green = cv2.inRange(image, greenLower, greenUpper)
#green = cv2.GaussianBlur(green, (3, 3), 0)
cv2.imshow("green", green)
cv2.imwrite("green.jpg", green)
imglab = morphology.label(green) # create labels in segmented image
cleaned = morphology.remove_small_objects(imglab, min_size=64, connectivity=2)
img3 = np.zeros((cleaned.shape)) # create array of size cleaned
img3[cleaned > 0] = 255
img3= np.uint8(img3)
cv2.imshow("cleaned", img3)
cv2.imwrite("cleaned.jpg", img3)
cv2.waitKey(0)
Cleaned image is shown below:
To use morphology.remove_small_objects, labeling the blobs first is essential. For that I use imglab = morphology.label(green). Labeling works like this: all pixels of the 1st blob are numbered 1; similarly, all pixels of the 7th blob are numbered 7, and so on. After removing the small areas, the remaining blobs' pixel values should be set to 255 so that cv2.imshow() can display them. For that I create an array img3 of the same size as the cleaned image, and use img3[cleaned > 0] = 255 to set all pixels whose value is greater than 0 to 255.
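As an aside (my addition, not in the original answer): remove_small_objects also accepts a boolean mask directly and labels the connected components internally, which shortens the above to:
mask_bool = green > 0  # boolean mask; connected components are labeled internally
cleaned_bool = morphology.remove_small_objects(mask_bool, min_size=64, connectivity=2)
img3 = cleaned_bool.astype(np.uint8) * 255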
It seems what you want to remove is a disconnected group of small blobs.
I think erode() will do a good job of removing them with the right kernel.
Given an n x n kernel, erode moves the kernel through the image and replaces the center pixel by the minimum pixel value in the kernel.
Then you can dilate() the resulting image to restore the eroded edges of the green part.
Another option would be to use fastNlMeansDenoisingColored.
##### option 1
kernel_size = (5, 5)  # should roughly match the size of the elements you want to remove
kernel_el = cv2.getStructuringElement(cv2.MORPH_RECT, kernel_size)
eroded = cv2.erode(green, kernel_el, (-1, -1))
cleaned = cv2.dilate(eroded, kernel_el, (-1, -1))
##### option 2 (note: fastNlMeansDenoisingColored expects a 3-channel color image, not a binary mask)
cleaned = cv2.fastNlMeansDenoisingColored(green, h=10)
