Remove features from binarized image - python

I wrote a little script to transform pictures of chalkboards into a form that I can print off and mark up.
I take an image like this:
Auto-crop it, and binarize it. Here's the output of the script:
I would like to remove the largest connected black regions from the image. Is there a simple way to do this?
I was thinking of eroding the image to eliminate the text and then subtracting the eroded image from the original binarized image, but I can't help thinking that there's a more appropriate method.

Sure you can just get connected components (of certain size) with findContours or floodFill, and erase them leaving some smear. However, if you like to do it right you would think about why do you have the black area in the first place.
You did not use adaptive thresholding (locally adaptive) and this made your output sensitive to shading. Try not to get the black region in the first place by running something like this:
Mat img = imread("desk.jpg", 0);
Mat img2, dst;
pyrDown(img, img2);
adaptiveThreshold(255-img2, dst, 255, ADAPTIVE_THRESH_MEAN_C,
THRESH_BINARY, 9, 10); imwrite("adaptiveT.png", dst);
imshow("dst", dst);
waitKey(-1);
In the future, you may read something about adaptive thresholds and how to sample colors locally. I personally found it useful to sample binary colors orthogonally to the image gradient (that is on the both sides of it). This way the samples of white and black are of equal size which is a big deal since typically there are more background color which biases estimation. Using SWT and MSER may give you even more ideas about text segmentation.

I tried this:
import numpy as np
import cv2
im = cv2.imread('image.png')
gray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
grayout = 255*np.ones((im.shape[0],im.shape[1],1), np.uint8)
blur = cv2.GaussianBlur(gray,(5,5),1)
thresh = cv2.adaptiveThreshold(blur,255,1,1,11,2)
contours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
wcnt = 0
for item in contours:
area =cv2.contourArea(item)
print wcnt,area
[x,y,w,h] = cv2.boundingRect(item)
if area>10 and area<200:
roi = gray[y:y+h,x:x+w]
cntd = 0
for i in range(x,x+w):
for j in range(y,y+h):
if gray[j,i]==0:
cntd = cntd + 1
density = cntd/(float(h*w))
if density<0.5:
for i in range(x,x+w):
for j in range(y,y+h):
grayout[j,i] = gray[j,i];
wcnt = wcnt + 1
cv2.imwrite('result.png',grayout)
You have to balance two things, removing the black spots but balance that with not losing the contents of what is on the board. The output I got is this:

Here is a Python numpy implementation (using my own mahotas package) of the method for the top answer (almost the same, I think):
import mahotas as mh
import numpy as np
Imported mahotas & numpy with standard abbreviations
im = mh.imread('7Esco.jpg', as_grey=1)
Load the image & convert to gray
im2 = im[::2,::2]
im2 = mh.gaussian_filter(im2, 1.4)
Downsample and blur (for speed and noise removal).
im2 = 255 - im2
Invert the image
mean_filtered = mh.convolve(im2.astype(float), np.ones((9,9))/81.)
Mean filtering is implemented "by hand" with a convolution.
imc = im2 > mean_filtered - 4
You might need to adjust the number 4 here, but it worked well for this image.
mh.imsave('binarized.png', (imc*255).astype(np.uint8))
Convert to 8 bits and save in PNG format.

Related

Performing OCR of Seven Segment Display images

I'm working on performing OCR of energy meter displays: example 1 example 2 example 3
I tried to use tesseract-ocr with the letsgodigital trained data. But the performance is very poor.
I'm fairly new to the topic and this is what I've done:
import numpy as np
import cv2
import imutils
from skimage import exposure
from pytesseract import image_to_string
import PIL
def process_image(orig_image_arr):
gry_disp_arr = cv2.cvtColor(orig_image_arr, cv2.COLOR_BGR2GRAY)
gry_disp_arr = exposure.rescale_intensity(gry_disp_arr, out_range= (0,255))
#thresholding
ret, thresh = cv2.threshold(gry_disp_arr,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
return thresh
def ocr_image(orig_image_arr):
otsu_thresh_image = process_image(orig_image_arr)
cv2_imshow(otsu_thresh_image)
return image_to_string(otsu_thresh_image, lang="letsgodigital", config="--psm 8 -c tessedit_char_whitelist=.0123456789")
img1 = cv2.imread('test2.jpg')
cnv = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
text = ocr_image(cnv)
This gives very poor results with the example images. I have a couple of questions:
How can I identify the four corners of the display? (Edge detection doesn’t seem to work very well)
Is there any futher preprocessing that I can do to improve the performance?
Thanks for any help.
Notice how your power meters either use blue or green LEDs to light up the display; I suggest you use this color display to your advantage. What I'd do is select only one RGB channel based on the LED color. Then I can threshold it based on some algorithm or assumption. After that, you can do the downstream steps of cropping / resizing / transformation / OCR etc.
For example, using your example image 1, look at its histogram here.
Notice how there is a small peak of green to the right of the 150 mark.
I take advantage of this, and set anything below 150 to zero. My assumption being that the green peak is the bright green LED in the image.
img = cv2.imread('example_1.jpg', 1)
# Get only green channel
img_g = img[:,:,1]
# Set threshold for green value, anything less than 150 becomes zero
img_g[img_g < 150] = 0
This is what I get.
This should be much easier for downstream OCR now.
# You should also set anything >= 150 to max value as well, but I didn't in this example
img_g[img_g >= 150] = 255
The above steps should replace this step
_ret, thresh = cv2.threshold(img_g, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
Here's the output of this step.

Proper image thresholding to prepare it for OCR in python using opencv

I am really new to opencv and a beginner to python.
I have this image:
I want to somehow apply proper thresholding to keep nothing but the 6 digits.
The bigger picture is that I intend to try to perform manual OCR to the image for each digit separately, using the k-nearest neighbours algorithm on a per digit level (kNearest.findNearest)
The problem is that I cannot clean up the digits sufficiently, especially the '7' digit which has this blue-ish watermark passing through it.
The steps I have tried so far are the following:
I am reading the image from disk
# IMREAD_UNCHANGED is -1
image = cv2.imread(sys.argv[1], cv2.IMREAD_UNCHANGED)
Then I'm keeping only the blue channel to get rid of the blue watermark around digit '7', effectively converting it to a single channel image
image = image[:,:,0]
# openned with -1 which means as is,
# so the blue channel is the first in BGR
Then I'm multiplying it a bit to increase contrast between the digits and the background:
image = cv2.multiply(image, 1.5)
Finally I perform Binary+Otsu thresholding:
_,thressed1 = cv2.threshold(image,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
As you can see the end result is pretty good except for the digit '7' which has kept a lot of noise.
How to improve the end result? Please supply the image example result where possible, it is better to understand than just code snippets alone.
You can try to medianBlur the gray(blur) image with different kernels(such as 3, 51), divide the blured results, and threshold it. Something like this:
#!/usr/bin/python3
# 2018/09/23 17:29 (CST)
# (中秋节快乐)
# (Happy Mid-Autumn Festival)
import cv2
import numpy as np
fname = "color.png"
bgray = cv2.imread(fname)[...,0]
blured1 = cv2.medianBlur(bgray,3)
blured2 = cv2.medianBlur(bgray,51)
divided = np.ma.divide(blured1, blured2).data
normed = np.uint8(255*divided/divided.max())
th, threshed = cv2.threshold(normed, 100, 255, cv2.THRESH_OTSU)
dst = np.vstack((bgray, blured1, blured2, normed, threshed))
cv2.imwrite("dst.png", dst)
The result:
Why not just keep values in the image that are above a certain threshold?
Like this:
import cv2
import numpy as np
img = cv2.imread("./a.png")[:,:,0] # the last readable image
new_img = []
for line in img:
new_img.append(np.array(list(map(lambda x: 0 if x < 100 else 255, line))))
new_img = np.array(list(map(lambda x: np.array(x), new_img)))
cv2.imwrite("./b.png", new_img)
Looks great:
You could probably play with the threshold even more and get better results.
It doesn't seem easy to completely remove the annoying stamp.
What you can do is flattening the background intensity by
computing a lowpass image (Gaussian filter, morphological closing); the filter size should be a little larger than the character size;
dividing the original image by the lowpass image.
Then you can use Otsu.
As you see, the result isn't perfect.
I tried a slightly different approach then Yves on the blue channel:
Apply median filter (r=2):
Use Edge detection (e.g. Sobel operator):
Automatic thresholding (Otsu)
Closing of the image
This approach seems to make the output a little less noisy. However, one has to address the holes in the numbers. This can be done by detecting black contours which are completely surrounded by white pixels and simply filling them with white.

Finding bright spots in a image using opencv

I want to find the bright spots in the above image and tag them using some symbol. For this i have tried using the Hough Circle Transform algorithm that OpenCV already provides. But it is giving some kind of assertion error when i run the code. I also tried the Canny edge detection algorithm which is also provided in OpenCV but it is also giving some kind of assertion error. I would like to know if there is some method to get this done or if i can prevent those error messages.
I am new to OpenCV and any help would be really appreciated.
P.S. - I can also use Scikit-image if necessary. So if this can be done using Scikit-image then please tell me how.
Below is my preprocessing code:
import cv2
import numpy as np
image = cv2.imread("image1.png")
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
binary_image = np.where(gray_image > np.mean(gray_image),1.0,0.0)
binary_image = cv2.Laplacian(binary_image, cv2.CV_8UC1)
If you are just going to work with simple images like your example where you have black background, you can use same basic preprocessing/thresholding then find connected components. Use this example code to draw a circle inside all circles in the image.
import cv2
import numpy as np
image = cv2.imread("image1.png")
# constants
BINARY_THRESHOLD = 20
CONNECTIVITY = 4
DRAW_CIRCLE_RADIUS = 4
# convert to gray
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# extract edges
binary_image = cv2.Laplacian(gray_image, cv2.CV_8UC1)
# fill in the holes between edges with dilation
dilated_image = cv2.dilate(binary_image, np.ones((5, 5)))
# threshold the black/ non-black areas
_, thresh = cv2.threshold(dilated_image, BINARY_THRESHOLD, 255, cv2.THRESH_BINARY)
# find connected components
components = cv2.connectedComponentsWithStats(thresh, CONNECTIVITY, cv2.CV_32S)
# draw circles around center of components
#see connectedComponentsWithStats function for attributes of components variable
centers = components[3]
for center in centers:
cv2.circle(thresh, (int(center[0]), int(center[1])), DRAW_CIRCLE_RADIUS, (255), thickness=-1)
cv2.imwrite("res.png", thresh)
cv2.imshow("result", thresh)
cv2.waitKey(0)
Here is resulting image:
Edit: connectedComponentsWithStats takes a binary image as input, and returns connected pixel groups in that image. If you would like to implement that function yourself, naive way would be:
1- Scan image pixels from top left to bottom right until you encounter a non-zero pixel that does not have a label (id).
2- When you encounter a non-zero pixel, search all its neighbours recursively( If you use 4 connectivity you check UP-LEFT-DOWN-RIGHT, with 8 connectivity you also check diagonals) until you finish that region. Assign each pixel a label. Increase your label counter.
3- Continue scanning from where you left.

Remove background of the image using opencv Python

I have two images, one with only background and the other with background + detectable object (in my case its a car). Below are the images
I am trying to remove the background such that I only have car in the resulting image. Following is the code that with which I am trying to get the desired results
import numpy as np
import cv2
original_image = cv2.imread('IMG1.jpg', cv2.IMREAD_COLOR)
gray_original = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)
background_image = cv2.imread('IMG2.jpg', cv2.IMREAD_COLOR)
gray_background = cv2.cvtColor(background_image, cv2.COLOR_BGR2GRAY)
foreground = np.absolute(gray_original - gray_background)
foreground[foreground > 0] = 255
cv2.imshow('Original Image', foreground)
cv2.waitKey(0)
The resulting image by subtracting the two images is
Here is the problem. The expected resulting image should be a car only.
Also, If you take a deep look in the two images, you'll see that they are not exactly same that is, the camera moved a little so background had been disturbed a little. My question is that with these two images how can I subtract the background. I do not want to use grabCut or backgroundSubtractorMOG algorithm right now because I do not know right now whats going on inside those algorithms.
What I am trying to do is to get the following resulting image
Also if possible, please guide me with a general way of doing this not only in this specific case that is, I have a background in one image and background+object in the second image. What could be the best possible way of doing this. Sorry for such a long question.
I solved your problem using the OpenCV's watershed algorithm. You can find the theory and examples of watershed here.
First I selected several points (markers) to dictate where is the object I want to keep, and where is the background. This step is manual, and can vary a lot from image to image. Also, it requires some repetition until you get the desired result. I suggest using a tool to get the pixel coordinates.
Then I created an empty integer array of zeros, with the size of the car image. And then I assigned some values (1:background, [255,192,128,64]:car_parts) to pixels at marker positions.
NOTE: When I downloaded your image I had to crop it to get the one with the car. After cropping, the image has size of 400x601. This may not be what the size of the image you have, so the markers will be off.
Afterwards I used the watershed algorithm. The 1st input is your image and 2nd input is the marker image (zero everywhere except at marker positions). The result is shown in the image below.
I set all pixels with value greater than 1 to 255 (the car), and the rest (background) to zero. Then I dilated the obtained image with a 3x3 kernel to avoid losing information on the outline of the car. Finally, I used the dilated image as a mask for the original image, using the cv2.bitwise_and() function, and the result lies in the following image:
Here is my code:
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Load the image
img = cv2.imread("/path/to/image.png", 3)
# Create a blank image of zeros (same dimension as img)
# It should be grayscale (1 color channel)
marker = np.zeros_like(img[:,:,0]).astype(np.int32)
# This step is manual. The goal is to find the points
# which create the result we want. I suggest using a
# tool to get the pixel coordinates.
# Dictate the background and set the markers to 1
marker[204][95] = 1
marker[240][137] = 1
marker[245][444] = 1
marker[260][427] = 1
marker[257][378] = 1
marker[217][466] = 1
# Dictate the area of interest
# I used different values for each part of the car (for visibility)
marker[235][370] = 255 # car body
marker[135][294] = 64 # rooftop
marker[190][454] = 64 # rear light
marker[167][458] = 64 # rear wing
marker[205][103] = 128 # front bumper
# rear bumper
marker[225][456] = 128
marker[224][461] = 128
marker[216][461] = 128
# front wheel
marker[225][189] = 192
marker[240][147] = 192
# rear wheel
marker[258][409] = 192
marker[257][391] = 192
marker[254][421] = 192
# Now we have set the markers, we use the watershed
# algorithm to generate a marked image
marked = cv2.watershed(img, marker)
# Plot this one. If it does what we want, proceed;
# otherwise edit your markers and repeat
plt.imshow(marked, cmap='gray')
plt.show()
# Make the background black, and what we want to keep white
marked[marked == 1] = 0
marked[marked > 1] = 255
# Use a kernel to dilate the image, to not lose any detail on the outline
# I used a kernel of 3x3 pixels
kernel = np.ones((3,3),np.uint8)
dilation = cv2.dilate(marked.astype(np.float32), kernel, iterations = 1)
# Plot again to check whether the dilation is according to our needs
# If not, repeat by using a smaller/bigger kernel, or more/less iterations
plt.imshow(dilation, cmap='gray')
plt.show()
# Now apply the mask we created on the initial image
final_img = cv2.bitwise_and(img, img, mask=dilation.astype(np.uint8))
# cv2.imread reads the image as BGR, but matplotlib uses RGB
# BGR to RGB so we can plot the image with accurate colors
b, g, r = cv2.split(final_img)
final_img = cv2.merge([r, g, b])
# Plot the final result
plt.imshow(final_img)
plt.show()
If you have a lot of images you will probably need to create a tool to annotate the markers graphically, or even an algorithm to find markers automatically.
The problem is that you're subtracting arrays of unsigned 8 bit integers. This operation can overflow.
To demonstrate
>>> import numpy as np
>>> a = np.array([[10,10]],dtype=np.uint8)
>>> b = np.array([[11,11]],dtype=np.uint8)
>>> a - b
array([[255, 255]], dtype=uint8)
Since you're using OpenCV, the simplest way to achieve your goal is to use cv2.absdiff().
>>> cv2.absdiff(a,b)
array([[1, 1]], dtype=uint8)
I recommend using OpenCV's grabcut algorithm. You first draw a few lines on the foreground and background, and keep doing this until your foreground is sufficiently separated from the background. It is covered here: https://docs.opencv.org/trunk/d8/d83/tutorial_py_grabcut.html
as well as in this video: https://www.youtube.com/watch?v=kAwxLTDDAwU

Convert string of particles into one particle openCV python

I am doing a bit of image processing, and I want to be able to follow a trend in an image. I want to be able to link the individual particles together in that image in order to create one curve. I do have a bit of code* that I use to threshold the original image.
I have tried editing the thresholds in order to get the particles to "mesh together," as a varied threshold does cause the particles to grow and thus almost bridge the gap between them.
If there is a way to detect a correlation between the particles and connect them, I would like to know. I have looked into a concept called blob detection, but I haven't been able to understand it too well.
Any help on how to tackle this problem is greatly appreciated
here's the code from that link:
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread('imgs/image.jpg',0)
#img = cv2.medianBlur(img,5)
ret,th1 = cv2.threshold(img,117,255,cv2.THRESH_BINARY)
th2 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_MEAN_C,\
cv2.THRESH_BINARY,11,2)
th3 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\
cv2.THRESH_BINARY,11,2)
titles = ['Original Image', 'Global Thresholding (v = 127)']
images = [img, th1]
for i in xrange(2):
plt.subplot(2,2,i+1),plt.imshow(images[i],'gray')
plt.title(titles[i])
plt.xticks([]),plt.yticks([])
plt.show()
cv2.imwrite("test_thresh2.png", th1)
Take look on OpenCV Eroding and Dilating where you can use Dilating.
See the example.
Mat src=imread("src.png",1);
int dilation_size = 3;
Mat element = getStructuringElement( MORPH_RECT,
Size( 2*dilation_size + 1, 2*dilation_size+1 ),
Point( dilation_size, dilation_size ) );
Mat dst;
dilate( src, dst, element );
imshow("dst",dst);

Categories