Is it possible to make all the text in a document black on white after thresholding? I've been looking online a lot but haven't found a solution. My current thresholded image is: https://i.ibb.co/Rpqcp7v/thresh.jpg
The document needs to be read by an OCR engine, so the areas that are currently white on black need to be inverted. How would I go about doing this? My current code:
import cv2

# thresholding
def thresholding(image):
    # thresholds the image into a binary image (black and white)
    return cv2.threshold(image, 120, 255, cv2.THRESH_BINARY)[1]
Use a median filter to estimate the dominant color (background).
Then take the absolute difference between that estimate and the image. Text differs from its local background in both the normal and the inverted regions, so you get white text on a black background everywhere. Invert for black on white.
import cv2 as cv

im = cv.imread("thresh.jpg", cv.IMREAD_GRAYSCALE)
im = cv.pyrDown(cv.pyrDown(im)) # picture too large for stack overflow
bg = cv.medianBlur(im, 51) # suitably large kernel to cover all text
out = 255 - cv.absdiff(bg, im)
How can I grab an image from a screen region and use Tesseract to convert it to text? I currently have this:
import numpy as np
import pytesseract
from PIL import ImageGrab
img = ImageGrab.grab(bbox=(1341, 182, 1778, 213))
tesstr = pytesseract.image_to_string(np.array(img), lang='eng')
print(tesstr)
The issue is that the OCR result is badly wrong because the text in that region is red on a blue background. How can I improve its accuracy? Example of what it's trying to turn from image to text:
You should read the Tesseract documentation page "Improving the quality of the output" and try each of the suggested methods. If you still can't achieve the desired result, look at these other methods:
Thresholding Operations using inRange
Changing Colorspaces
Image segmentation
To get the desired result, you need a binary mask of the image. Neither simple nor adaptive thresholding works for this input image.
To get the binary mask:
Up-sample the input image and convert it to the HSV color space.
Set the lower and upper color boundaries.
Result:
The OCR output for 0.37 version will be:
Day 20204, 16:03:12: Your ‘Metal Triangle Foundation’
was destroved!
Code:
import cv2
import numpy as np
import pytesseract
# Load the image
img = cv2.imread("b.png")
# Up-sample
img = cv2.resize(img, (0, 0), fx=2, fy=2)
# Convert to HSV color-space
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Get the binary mask
msk = cv2.inRange(hsv, np.array([0, 0, 123]), np.array([179, 255, 255]))
# OCR
txt = pytesseract.image_to_string(msk)
print(txt)
# Display
cv2.imshow("msk", msk)
cv2.waitKey(0)
The Tesseract API has an option for the DPI at which the image is examined for text. The higher the DPI, the higher the precision, until diminishing returns set in, and more processing power is required. The DPI should not exceed the original image's DPI.
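With pytesseract this could look roughly like the sketch below. I am assuming the option meant here is Tesseract's `--dpi` command-line flag, passed through the `config` argument. The helper names are made up for illustration, and 300 is just a commonly used value, not a recommendation from the answer.

```python
def build_tesseract_config(dpi=300):
    """Build the option string that declares the input resolution to Tesseract."""
    return f"--dpi {int(dpi)}"


def ocr_with_dpi(image, dpi=300):
    """Run OCR on an image (PIL image or NumPy array) with an explicit DPI."""
    # Imported lazily so the config helper above works without pytesseract.
    import pytesseract
    return pytesseract.image_to_string(
        image, lang="eng", config=build_tesseract_config(dpi)
    )


print(build_tesseract_config(300))
```

Note that the flag only tells Tesseract what resolution to assume. It does not resample the image, so enlarging a small screenshot still has to be done separately (for example with cv2.resize, as in the answer above).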
So I am trying to make a neural network that categorizes resistor strength by recognizing the color bands. Before I get to that step I want to use OpenCV to threshold all the colors except the resistor bands so that it is easier for the neural network to categorize. However I do not know what threshold type is best suited for this.
I tried several ranges of HLS, RGB, and HSV, but they all do not get rid of the background of the resistor.
Note: I have already used contours to get rid of the background, so now all that is left is the resistor with the colored lines on it.
In my case HLS got rid of the colors but kept the resistor background, as shown in the code below:
frame_HLS = cv2.cvtColor(masked_data, cv2.COLOR_BGR2HLS)
frame_threshold = cv2.inRange(frame_HLS, (50, 0, 0), (139, 149, 255))
Here are the original image and the HLS output:
So overall, I am wondering whether other color spaces like LUV work well for this, or whether I will have to use contours or other methods to separate them.
You're on the right track: color thresholding is a great approach to segmenting the resistor. The thresholding is already working correctly; you just need a few simple steps to remove the background.
I tried several ranges of HLS, RGB, and HSV, but they all do not get rid of the background of the resistor.
To remove the background, we can use the binary mask that cv2.inRange() generated. We apply it with cv2.bitwise_and() and then set every pixel that is black on the mask to white with these two lines:
result = cv2.bitwise_and(original, original, mask=frame_threshold)
result[frame_threshold==0] = (255,255,255)
Here's the masked image of what you currently have (left) and after removing the background (right)
import cv2
image = cv2.imread('1.png')
original = image.copy()
frame_HLS = cv2.cvtColor(image, cv2.COLOR_BGR2HLS)
frame_threshold = cv2.inRange(frame_HLS, (50, 0, 0), (139, 149, 255))
result = cv2.bitwise_and(original, original, mask=frame_threshold)
result[frame_threshold==0] = (255,255,255)
cv2.imshow('result', result)
cv2.waitKey()
However I do not know what threshold type is best suited for this.
Right now you're using color thresholding. You could continue with this method and experiment with other ranges in the HLS, RGB, or HSV color space. In all of these cases, you can remove the background by converting all black pixels on the mask to white. If you decide to pivot to another thresholding method, take a look at Otsu's threshold, which calculates a global threshold value automatically, or adaptive thresholding, which calculates a threshold for each pixel neighborhood.
I am inspired by the following blog post, but I am struggling with steps 2 and 3.
I want to create a binary image from a gray image based on threshold values and ultimately display all the white lines in the image. My desired output looks as follows:
First, I want to isolate the soccer field by using colour thresholding and morphology.
import cv2
import numpy as np

def isolate_field(img):
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    # find green pitch
    light_green = np.array([40, 40, 40])
    dark_green = np.array([70, 255, 255])
    mask = cv2.inRange(hsv, light_green, dark_green)
    # removing small noises
    kernel = np.ones((5, 5), np.uint8)
    opening = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    # apply mask over the input image
    return cv2.bitwise_and(img, img, mask=opening)
This gives the following output:
I am happy with the results so far, but because of the large shadow I am struggling with the image-processing when I grayscale the picture. As a result, the binary thresholding is based on the sunny part in the upper-left corner instead of the white lines around the soccer field.
Following the methodology on the tutorials I get the following output for the simple thresholding:
and adaptive thresholding:
and finally, Otsu's thresholding:
How can I make sure that the white lines become more visible? I was thinking about cropping the frame so I only see the field and then use a mask based on the color white. That didn't work out unfortunately.
Help is much appreciated,
You can modify inRange to also exclude saturated colors (meaning the greens). I don't have your original image, so I used your intermediate result:
The result of inRange is the binary image you want. I expect you can achieve better results with the original image. I used this script on the image; it makes it easy to search for good HSV values.
I am working on a puzzle; my final task here is to identify the edge type of each puzzle piece.
As shown in the image above, I have managed to rotate and crop out every edge of the piece at the same angle. My next step is to separate the edge line into a separate image, as shown in the image below,
then to fill one side of the line with a color and process it to decide what type of edge it is.
I don't see a proper way to separate the edge line from the image for now.
My approach:
One way is to scan pixel by pixel and find the black pixels that have a non-black pixel next to them. This is code that I can implement, but it feels like a primitive and time-consuming approach.
So I would appreciate any help or ideas, or any completely different way to detect the hollows and humps.
Thanks in advance.
First convert your color image to grayscale. Then apply a threshold, say zero, to obtain a binary image. You may have to use morphological operations to further process the binary image if there are holes. Then find the contours of this image and draw them to a new image.
A simple example is given below, using OpenCV 4.0.1 with Python 2.7.
import cv2
import numpy as np

bgr = cv2.imread('puzzle.png')
gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
_, roi = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY)
cv2.imwrite('/home/dhanushka/stack/roi.png', roi)
# in OpenCV 4, findContours returns (contours, hierarchy)
cont = cv2.findContours(roi, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
output = np.zeros(gray.shape, dtype=np.uint8)
cv2.drawContours(output, cont[0], -1, (255, 255, 255))
# removing boundary
boundary = 255*np.ones(gray.shape, dtype=np.uint8)
boundary[1:boundary.shape[0]-1, 1:boundary.shape[1]-1] = 0
toremove = output & boundary
output = output ^ toremove
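As an aside, the pixel-by-pixel scan from the question does not have to be slow: the neighbour test can be done for all pixels at once with NumPy slicing. This is a sketch of that alternative, not the contour method above. The function name and the small test array are made up, and it marks the piece's own outline pixels (non-black pixels with a black 4-neighbour).

```python
import numpy as np


def edge_pixels(gray):
    """Mark non-black pixels that touch at least one black 4-neighbour."""
    piece = gray > 0
    # Replicate the border so the image frame itself is not reported as an edge.
    padded = np.pad(piece, 1, mode="edge")
    up = padded[:-2, 1:-1]
    down = padded[2:, 1:-1]
    left = padded[1:-1, :-2]
    right = padded[1:-1, 2:]
    # A pixel is interior if it and all four neighbours belong to the piece.
    interior = piece & up & down & left & right
    return piece & ~interior


gray = np.zeros((6, 6), np.uint8)
gray[2:, :] = 200    # the piece fills rows 2..5
gray[1, 2:4] = 200   # a small "hump" sticking out on row 1
edges = edge_pixels(gray)
print(edges.astype(np.uint8))
```

Because the border is replicated rather than zero-padded, the sides where the crop cuts through the piece are not marked, which plays the same role as the "removing boundary" step above.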
Is there a way to tell whether an image has a white background using Python, and what could be a good strategy to get a "percentage of confidence" for the answer? The literature on the internet doesn't seem to cover exactly this case and I can't find anything strictly related.
The images I want to analyze are typical e-commerce website product pictures, so they should have a single focused object in the middle and white background only at the borders.
Another piece of information that could be available is the maximum percentage of image space the object should occupy.
I would go with something like this.
1. Reduce the contrast of the image by making the brightest, whitest pixel something like 240 instead of 255 so that the whites generally found within the image and within parts of the product are no longer pure white.
2. Put a 1 pixel wide white border around your image - that will allow the floodfill in the next step to "flow" all the way around the edge (even if the "product" touches the edges of the frame) and "seep" into the image from all borders/edges.
3. Floodfill your image starting at the top-left corner (which is necessarily pure white after step 2) and allow a tolerance of 10-20% when matching the white in case the background is off-white or slightly shadowed, and the white will flow into your image all around the edges until it reaches the product in the centre.
4. See how many pure white pixels you have now - these are the background ones. The percentage of pure white pixels will give you an indicator of confidence in the image being a product on a whitish background.
I would use ImageMagick from the command line like this:
convert product.jpg +level 5% -bordercolor white -border 1 \
-fill white -fuzz 25% -draw "color 0,0 floodfill" result.jpg
I will put a red border around the following 2 pictures just so you can see the edges on StackOverflow's white background, and show you the before and after images - look at the amount of white in the resulting images (there is none in the second one because it didn't have a white background) and also at the shadow under the router to see the effect of the -fuzz.
Before
After
If you want that as a percentage, you can make all non-white pixels black and then calculate the percentage of white pixels like this:
convert product.jpg -level 5% \
-bordercolor white -border 1 \
-fill white -fuzz 25% -draw "color 0,0 floodfill" -shave 1 \
-fuzz 0 -fill black +opaque white -format "%[fx:int(mean*100)]" info:
62
Before
After
ImageMagick has Python bindings so you could do the above in Python - or you could use OpenCV and Python to implement the same algorithm.
This question may be years old, but I had a similar task recently. Sharing my answer here might help others who encounter the same task, and the community might help me improve it.
import cv2 as cv
import numpy as np

THRESHOLD_INTENSITY = 230

def has_white_background(img):
    # Read image into org_img variable
    org_img = cv.imread(img, cv.IMREAD_GRAYSCALE)
    # cv.imshow('Original Image', org_img)

    # Create a black blank image for the mask
    mask = np.zeros_like(org_img)

    # Create a thresholded image, I set my threshold to 200 as this is the value
    # I found most effective in identifying light colored object
    _, thres_img = cv.threshold(org_img, 200, 255, cv.THRESH_BINARY_INV)

    # Find the most significant contours
    contours, hierarchy = cv.findContours(thres_img, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_NONE)

    # No contour means no pixel is darker than the threshold, so the whole
    # image is light (added guard: max() would fail on an empty list)
    if not contours:
        return True

    # Get the outermost contours
    outer_contours_img = max(contours, key=cv.contourArea)

    # Get the bounding rectangle of the contours
    x, y, w, h = cv.boundingRect(outer_contours_img)

    # Draw a rectangle based on the bounding rectangle of the contours to our mask
    cv.rectangle(mask, (x, y), (x + w, y + h), (255, 255, 255), -1)

    # Invert the mask so that we create a hole for the detected object in our mask
    mask = cv.bitwise_not(mask)

    # Apply mask to the original image to subtract it and retain only the bg
    img_bg = cv.bitwise_and(org_img, org_img, mask=mask)

    # If the size of the mask is similar to the size of the image then the bg is not white
    if h == org_img.shape[0] and w == org_img.shape[1]:
        return False

    # Create a np array of the remaining background image
    np_array = np.array(img_bg)

    # Remove the zeroes from the "remaining bg image" so that we don't consider the black part,
    # and find the average intensity of the remaining pixels
    ave_intensity = np_array[np.nonzero(np_array)].mean()

    return bool(ave_intensity > THRESHOLD_INTENSITY)
These are the images of the steps from the code above:
Here is the original image. No copyright infringement intended.
(I can't find the URL of the actual image from Unsplash.)
The first step is to convert the image to grayscale.
Apply thresholding to the image.
Get the contours of the thresholded image. Drawing the contours is optional.
From the contours, take the outer contour and find its bounding rectangle. Optionally draw the rectangle on the image so that you can see whether your assumed threshold value fits the object in the rectangle.
Create a mask out of the bounding rectangle.
Lastly, subtract the mask from the grayscale image. What remains is the background image minus the mask.
Finally, to identify whether the background is white, find the average intensity of the background image, excluding the 0 values of the image array. Based on a chosen threshold value, categorize it as white or not.
Hope this helps. If you think it can still be improved, or if there are flaws in my solution, please comment below.
A very common image format for this is .png. A PNG image can have transparency (an alpha channel), and transparent areas often just show the white background of the page. With Pillow it is easy to find out which pixels are transparent.
A good starting point:
from PIL import Image

img = Image.open('image.png')
img = img.convert("RGBA")
pixdata = img.load()

transparent = 0
for y in range(img.size[1]):
    for x in range(img.size[0]):
        pixel = pixdata[x, y]
        # alpha is the fourth channel: 0 is fully transparent, 255 is fully opaque
        if pixel[3] == 0:
            transparent += 1
print(transparent)
Or maybe it's enough to check whether the top-left pixel is white:
pixel = pixdata[0, 0]
if pixel[0] == 255 and pixel[1] == 255 and pixel[2] == 255:
    # it's white
    print("top-left pixel is white")
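A single pixel is a fragile test, so one possible extension is to look at a thin frame around the whole image and report the fraction of near-white pixels there. This is only a sketch of that idea: the function name, the 5 pixel frame width and the 240 cut-off are assumptions, and it expects an RGB array such as np.asarray(img.convert("RGB")).

```python
import numpy as np


def border_white_fraction(rgb, border=5, min_level=240):
    """Fraction of pixels in the outer frame whose channels are all >= min_level."""
    near_white = (rgb >= min_level).all(axis=-1)
    # Select only the frame: everything except the inner rectangle.
    frame = np.ones(near_white.shape, dtype=bool)
    frame[border:-border, border:-border] = False
    return float(near_white[frame].mean())


# White canvas with a dark object in the centre: the frame is fully white.
centered = np.full((50, 50, 3), 255, np.uint8)
centered[15:35, 15:35] = 0
# Left half dark: only half of the frame is white.
half_dark = np.full((50, 50, 3), 255, np.uint8)
half_dark[:, :25] = 0
print(border_white_fraction(centered), border_white_fraction(half_dark))
```

The returned fraction can serve as a crude confidence value, in line with the assumption in the question that the product sits in the middle and the background shows only at the borders.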