I am working with a binary image in OpenCV (Python) and need to extract the largest region. I used the following code, but I am not getting the desired output.
edged = cv2.Canny(im_bw, 35, 125)
(cnts, _) = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
c = max(cnts, key = cv2.contourArea)
You don't need to use the canny output to do this. Just do findContours on im_bw directly and you should get the desired results. If still not what you want, try to use different threshold values (given that your original image isn't BW itself)
(_, im_bw) = cv2.threshold(frame, 100, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
(cnts, _) = cv2.findContours(im_bw.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
c = max(cnts, key = cv2.contourArea)
You didn't really explain what you are looking for. What do you mean by "largest region"? The code you posted will give you the largest contour found, but you need to understand what an OpenCV contour is here. Depending on your image, there can be a lot of noise, which makes OpenCV return something other than the "region" you are expecting, so you need to reduce the noise. Before applying Canny or the threshold, you can apply blur, erosion and/or dilation to the image.
The algorithm should be like this:
Get the frame / image
Grayscale it
Apply Blur / Erode / Dilate to reduce noise
Apply Canny or threshold
Find contours
Get the largest
Do what you need
Here you'll find good documentation in Python.
I am using the scikit-image package of python which measures the area of the islands and chooses the largest area as follows -
import skimage
from skimage import measure
labels_mask = measure.label(input_mask)
regions = measure.regionprops(labels_mask)
regions.sort(key=lambda x: x.area, reverse=True)
if len(regions) > 1:
    for rg in regions[1:]:
        labels_mask[rg.coords[:,0], rg.coords[:,1]] = 0
labels_mask[labels_mask!=0] = 1
mask = labels_mask
Input image -
Output image -
Related
I want to detect and count the objects in an image that touch each other, while ignoring anything that could be considered a single object. I have the basic image, on which I tried applying cv2.HoughCircles() to identify some circles. I then parsed the returned array and used cv2.circle() to draw them on the image.
However, cv2.HoughCircles() always seems to return too many circles, and I couldn't figure out how to count only the objects that are touching.
This is the image I was working on
My code so far:
import numpy
import matplotlib.pyplot as pyp
import cv2
segmented = cv2.imread('photo')
# HoughCircles expects an 8-bit single-channel image
gray = cv2.cvtColor(segmented, cv2.COLOR_BGR2GRAY)
houghCircles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, 1, 80, param1=450, param2=10, minRadius=30, maxRadius=200)
houghArray = numpy.uint16(houghCircles)[0,:]
for circle in houghArray:
    cv2.circle(segmented, (circle[0], circle[1]), circle[2], (0, 250, 0), 3)
And this is the image I get, which is quite far from what I really want.
How can I properly identify and count said objects?
Here is one way in Python OpenCV: get the area of each contour and the area of its convex hull, then take the ratio (area/convex_hull_area). If the ratio is small enough, the contour is a cluster of blobs. Otherwise it is an isolated blob.
Input:
import cv2
import numpy as np
# read input image
img = cv2.imread('blobs_connected.jpg')
# convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# threshold to binary
thresh = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY)[1]
# find contours
#label_img = img.copy()
contour_img = img.copy()
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
index = 1
isolated_count = 0
cluster_count = 0
for cntr in contours:
    area = cv2.contourArea(cntr)
    convex_hull = cv2.convexHull(cntr)
    convex_hull_area = cv2.contourArea(convex_hull)
    ratio = area / convex_hull_area
    #print(index, area, convex_hull_area, ratio)
    #x,y,w,h = cv2.boundingRect(cntr)
    #cv2.putText(label_img, str(index), (x,y), cv2.FONT_HERSHEY_COMPLEX_SMALL, 1, (0,0,255), 2)
    if ratio < 0.91:
        # cluster contours in red
        cv2.drawContours(contour_img, [cntr], 0, (0,0,255), 2)
        cluster_count = cluster_count + 1
    else:
        # isolated contours in green
        cv2.drawContours(contour_img, [cntr], 0, (0,255,0), 2)
        isolated_count = isolated_count + 1
    index = index + 1
print('number_clusters:',cluster_count)
print('number_isolated:',isolated_count)
# save result
cv2.imwrite("blobs_connected_result.jpg", contour_img)
# show images
cv2.imshow("thresh", thresh)
#cv2.imshow("label_img", label_img)
cv2.imshow("contour_img", contour_img)
cv2.waitKey(0)
Clusters in Red, Isolated blobs in Green:
Textual Information:
number_clusters: 4
number_isolated: 81
Approach it in steps.
Label the connected components. Two connected blobs get the same label because they're connected. So far so good.
Now separate your blobs. Use watershed (first comment) or whatever other method gives you results. I can't fully predict the watershed approach: it might deal with touching blobs of dissimilar size, or it might do something silly. The sample/tutorial also assumes a minimum size (0.7 * max peak); maybe plug in something absolute in pixels instead.
Then, for each separated blob, check which label it sits on (take the coordinates of its centroid to be safe), and note down a +1 for that label (a histogram).
Any label that has more than one separated blob sitting on it is what you are looking for.
I am working on a task to extract the account number from cheque images. My current approach can be divided into 2 steps
Localize account number digits (Printed digits)
Perform OCR using OCR libraries like Tesseract OCR
The second step is straightforward, assuming we have properly localized the account number digits.
I tried to localize the account number digits using OpenCV contour methods and MSER (maximally stable extremal regions), but didn’t get useful results. It’s difficult to generalize a pattern because:
Different bank cheques have variations in template
Account number position is not fixed
How can we approach this problem? Do I have to look at deep-learning-based approaches?
Sample Images
Assuming the account number has the unique purple text color, we can use color thresholding. The idea is to convert the image to HSV color space then define a lower/upper color range and perform color thresholding using cv2.inRange(). From here we filter by contour area to remove small noise. Finally we invert the image since we want the text in black with the background in white. One last step is to Gaussian blur the image before throwing it into Pytesseract. Here's the result:
Result from Pytesseract
30002010108841
Code
import numpy as np
import pytesseract
import cv2
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
image = cv2.imread('1.png')
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
lower = np.array([103,79,60])
upper = np.array([129,255,255])
mask = cv2.inRange(hsv, lower, upper)
cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    area = cv2.contourArea(c)
    if area < 10:
        cv2.drawContours(mask, [c], -1, (0,0,0), -1)
mask = 255 - mask
mask = cv2.GaussianBlur(mask, (3,3), 0)
data = pytesseract.image_to_string(mask, lang='eng',config='--psm 6')
print(data)
cv2.imshow('mask', mask)
cv2.waitKey()
Thanks, everyone, for the suggestions. I ended up training a deep learning object detection model to localize the account number, and it gave very good results compared to the OpenCV-based methods.
I'm currently working on an image in which I have to find the outer region of each box, but I fail to find the regions of the white and black boxes.
input image:
https://i.imgur.com/gec9eP5.png
output image:
https://i.imgur.com/Giz1DAW.png
Update edit:
If I use HLS instead of HSV, I can find 3 more box regions, but 2 are still missing.
Here is the new output:
https://i.imgur.com/eUqltKI.png
and here is my code:
import cv2
import numpy as np
img = cv2.imread("1.png")
imghsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
lower_blue = np.array([0,50,0])
upper_blue = np.array([255,255,255])
mask_blue = cv2.inRange(imghsv, lower_blue, upper_blue)
_, contours, _ = cv2.findContours(mask_blue, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
im = np.copy(img)
cv2.drawContours(im, contours, -1, (0, 255, 0), 2)
cv2.imwrite("contours_blue.png", im)
The mask you're generating with
mask_blue = cv2.inRange(imghsv, lower_blue, upper_blue)
does not include the bottom row at all, so it's impossible to detect these outlines with
_, contours, _ = cv2.findContours(mask_blue, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
You could try to work with multiple masks / thresholds to account for the different color ranges and merge the detected contours.
Color channel threshold is not the optimal solution in cases when you are dealing with objects of many different colors (that are not known in advance) and with background color that is not necessarily distinctly different from all object colors. Combination of multiple thresholds/conditions could solve the job for this particular image but this same combination can fail for slightly different input, so I think this approach is generally not too good.
I think the problem is very elementary in nature, so I would recommend sticking to a simple approach. For example, if you apply the Sobel operator to your image, you get a result like the one below. The intensity of the result is weak on some borders, so I inverted the image colors to make it more visible.
There are tons of tutorials on the Sobel operator on the web, so I won't go into detail here. There is no noise in your input image, so the gradient intensity outside and within the boxes is zero. I would therefore suggest masking out all zero values. If you do contour detection after that, you will end up with two contours per square: one on the inner side of the border and one on the outer side. If you only want to extract the contours on the outer side, see how contour hierarchy works in the OpenCV documentation. If you want the contour exactly on the border, start from the outer contour and use erosion.
I am trying to do OCR on this toy example of a receipt, using Python 2.7 and OpenCV 3.1.
My pipeline is: grayscale + blur + external edge detection + segmentation of each area of the receipt (for example "Category", to see later which option is marked; in this case, cash).
I find it complicated, when the image is skewed, to transform it properly and then automatically segment each area of the receipt.
Example:
Any suggestion?
The code below is an example that gets as far as the edge detection, but only when the receipt looks like the first image. My issue is not the image-to-text step; it is the pre-processing of the image.
Any help more than appreciated! :)
import os
os.chdir(".")  # Replace "." with your own directory
import cv2
import numpy as np
image = cv2.imread("Rent-Receipt.jpg", cv2.IMREAD_GRAYSCALE)
blurred = cv2.GaussianBlur(image, (5, 5), 0)
#blurred = cv2.bilateralFilter(gray,9,75,75)
# apply Canny Edge Detection
edged = cv2.Canny(blurred, 0, 20)
#Find external contour
(_,contours, _) = cv2.findContours(edged, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
A great tutorial on the first step you described is available at pyimagesearch (and they have great tutorials in general)
In short, as described by Ella, you would have to use cv2.CHAIN_APPROX_SIMPLE. A slightly more robust method would be to use cv2.RETR_LIST instead of cv2.RETR_EXTERNAL and then sort the areas, as it should decently work even in white backgrounds/if the page inscribes a bigger shape in the background, etc.
Coming to the second part of your question, a good way to segment the characters would be to use the maximally stable extremal regions (MSER) extractor available in OpenCV. A complete implementation in C++ is available here in a project I was helping out with recently. The Python implementation would go along the lines of the code below (it works for OpenCV 3.0+; for the OpenCV 2.x syntax, look it up online).
import cv2
img = cv2.imread('test.jpg')
mser = cv2.MSER_create()
#Resize the image so that MSER can work better
img = cv2.resize(img, (img.shape[1]*2, img.shape[0]*2))
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
vis = img.copy()
regions = mser.detectRegions(gray)
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions[0]]
cv2.polylines(vis, hulls, 1, (0,255,0))
cv2.namedWindow('img', 0)
cv2.imshow('img', vis)
while(cv2.waitKey()!=ord('q')):
    continue
cv2.destroyAllWindows()
This gives the output as
Now, to eliminate the false positives, you can simply cycle through the points in hulls, and calculate the perimeter (sum of distance between all adjacent points in hulls[i], where hulls[i] is a list of all points in one convexHull). If the perimeter is too large, classify it as not a character.
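The perimeter check described above could look something like this. It is only a sketch: the function names, the max_perimeter cut-off and the two hand-made hulls are assumptions, and the right threshold depends on the character size in your image. cv2.arcLength(hull, True) computes the same closed-polygon length if you prefer the built-in.

```python
import numpy as np


def hull_perimeter(hull):
    """Sum of distances between adjacent hull points, including the closing edge."""
    pts = np.asarray(hull, dtype=np.float64).reshape(-1, 2)
    # Pair each point with the next one, wrapping around to close the polygon
    edges = np.roll(pts, -1, axis=0) - pts
    return float(np.hypot(edges[:, 0], edges[:, 1]).sum())


def filter_hulls(hulls, max_perimeter):
    """Keep only hulls small enough to plausibly be a character."""
    return [h for h in hulls if hull_perimeter(h) <= max_perimeter]


# Two hand-made hulls in the (N, 1, 2) layout that cv2.convexHull returns
small = np.array([[[0, 0]], [[10, 0]], [[10, 20]], [[0, 20]]], dtype=np.int32)
large = np.array([[[0, 0]], [[300, 0]], [[300, 200]], [[0, 200]]], dtype=np.int32)

kept = filter_hulls([small, large], max_perimeter=200)
print(hull_perimeter(small), hull_perimeter(large), len(kept))
```

With the MSER code above you would pass the hulls list to filter_hulls and draw only the hulls that survive.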
The diagonal lines across the image appear because the border of the image is black. They can be removed by adding the following line as soon as the image is read (right after the cv2.imread call):
img = img[5:-5,5:-5,:]
which gives the output
The option off the top of my head requires extracting the 4 corners of the skewed image. This is done by using cv2.CHAIN_APPROX_SIMPLE instead of cv2.CHAIN_APPROX_NONE when finding contours. Afterwards, you could use cv2.approxPolyDP and hopefully be left with the 4 corners of the receipt (if all your images are like this one, there is no reason why it shouldn't work).
Now use cv2.findHomography and cv2.warpPerspective to rectify the image, with the source points being the 4 points extracted from the skewed image and the destination points forming a rectangle, for example the full image dimensions.
Here you could find code samples and more information:
OpenCV-Geometric Transformations of Images
Also this answer may be useful - SO - Detect and fix text skew
EDIT: Corrected the second chain approx to cv2.CHAIN_APPROX_NONE.
Preprocessing the image by converting the desired text in the foreground to black while turning unwanted background to white can help to improve OCR accuracy. In addition, removing the horizontal and vertical lines can improve results. Here's the preprocessed image after removing unwanted noise such as the horizontal/vertical lines. Note the removed border and table lines
import cv2
# Load in image, convert to grayscale, and threshold
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Find and remove horizontal lines
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (35,2))
detect_horizontal = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
cnts = cv2.findContours(detect_horizontal, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(thresh, [c], -1, (0,0,0), 3)
# Find and remove vertical lines
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,35))
detect_vertical = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, vertical_kernel, iterations=2)
cnts = cv2.findContours(detect_vertical, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(thresh, [c], -1, (0,0,0), 3)
# Mask out unwanted areas for result
result = cv2.bitwise_and(image,image,mask=thresh)
result[thresh==0] = (255,255,255)
cv2.imshow('thresh', thresh)
cv2.imshow('result', result)
cv2.waitKey()
Try using Stroke Width Transform. Python 3 implementation of the algorithm is present here at SWTloc
EDIT : v2.0.0 onwards
Install the Library
pip install swtloc
Transform The Image
import numpy as np
import swtloc as swt
imgpath = 'images/path_to_image.jpeg'
swtl = swt.SWTLocalizer(image_paths=imgpath)
swtImgObj = swtl.swtimages[0]
# Perform SWT Transformation with numba engine
swt_mat = swtImgObj.transformImage(text_mode='lb_df', gaussian_blurr=False,
minimum_stroke_width=3, maximum_stroke_width=12,
maximum_angle_deviation=np.pi/2)
Localize Letters
localized_letters = swtImgObj.localizeLetters(minimum_pixels_per_cc=10,
localize_by='min_bbox')
Localize Words
localized_words = swtImgObj.localizeWords(localize_by='bbox')
There are multiple parameters in the .transformImage, .localizeLetters and .localizeWords functions that you can play around with to get the desired results.
Full Disclosure : I am the author of this library
I am working on retinal fundus images. Each image consists of a circular retina on a black background. With OpenCV, I have managed to get a contour that surrounds the whole circular retina. What I need is to crop the circular retina out of the black background.
It is unclear in your question whether you want to actually crop out the information that is defined within the contour or mask out the information that isn't relevant to the contour chosen. I'll explore what to do in both situations.
Masking out the information
Assuming you ran cv2.findContours on your image, you will have received a structure that lists all of the contours available in your image. I'm also assuming that you know the index of the contour that was used to surround the object you want. Assuming this is stored in idx, first use cv2.drawContours to draw a filled version of this contour onto a blank image, then use this image to index into your image to extract out the object. This logic masks out any irrelevant information and only retain what is important - which is defined within the contour you have selected. The code to do this would look something like the following, assuming your image is a grayscale image stored in img:
import numpy as np
import cv2
img = cv2.imread('...', 0) # Read in your image
# contours, _ = cv2.findContours(...) # Your call to find the contours using OpenCV 2.4.x
_, contours, _ = cv2.findContours(...) # Your call to find the contours
idx = ... # The index of the contour that surrounds your object
mask = np.zeros_like(img) # Create mask where white is what we want, black otherwise
cv2.drawContours(mask, contours, idx, 255, -1) # Draw filled contour in mask
out = np.zeros_like(img) # Extract out the object and place into output image
out[mask == 255] = img[mask == 255]
# Show the output image
cv2.imshow('Output', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
If you actually want to crop...
If you want to crop the image, you need to define the minimum spanning bounding box of the area defined by the contour. You can find the top left and lower right corner of the bounding box, then use indexing to crop out what you need. The code will be the same as before, but there will be an additional cropping step:
import numpy as np
import cv2
img = cv2.imread('...', 0) # Read in your image
# contours, _ = cv2.findContours(...) # Your call to find the contours using OpenCV 2.4.x
_, contours, _ = cv2.findContours(...) # Your call to find the contours
idx = ... # The index of the contour that surrounds your object
mask = np.zeros_like(img) # Create mask where white is what we want, black otherwise
cv2.drawContours(mask, contours, idx, 255, -1) # Draw filled contour in mask
out = np.zeros_like(img) # Extract out the object and place into output image
out[mask == 255] = img[mask == 255]
# Now crop
(y, x) = np.where(mask == 255)
(topy, topx) = (np.min(y), np.min(x))
(bottomy, bottomx) = (np.max(y), np.max(x))
out = out[topy:bottomy+1, topx:bottomx+1]
# Show the output image
cv2.imshow('Output', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
The cropping code works as follows: once we have the mask that extracts the area defined by the contour, we find the smallest horizontal and vertical coordinates, which define the top left corner of the bounding box. We similarly find the largest horizontal and vertical coordinates, which define the bottom right corner. We then use indexing with these coordinates to crop what we actually need. Note that this performs the cropping on the masked image, that is, the image with everything removed except the information contained within the selected contour.
Note with OpenCV 3.x
Note that the return value of cv2.findContours differs between OpenCV versions. In OpenCV 2.4.x it returns two values, the contours and the hierarchy. In OpenCV 3.x the output is a three-element tuple whose first element is the source image, while the other two elements are the same as in OpenCV 2.4.x. OpenCV 4.x went back to returning two values. The code above uses the OpenCV 3.x form (with the 2.4.x form left in a comment), which simply ignores the first output:
_, contours, _ = cv2.findContours(...) # Your call to find contours
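If the code has to run on more than one OpenCV version, a small helper can pick the contour list out of whatever cv2.findContours returned. This is a sketch with a made-up function name, and it relies on OpenCV 4.x returning two values like 2.4.x does. The tuples in the usage lines are stand-ins for real return values, so the snippet runs without an image.

```python
def grab_contours(result):
    """Pick the contour list out of whatever cv2.findContours returned."""
    if len(result) == 2:   # OpenCV 2.4.x and 4.x: (contours, hierarchy)
        return result[0]
    if len(result) == 3:   # OpenCV 3.x: (image, contours, hierarchy)
        return result[1]
    raise ValueError("unexpected cv2.findContours return value of length %d" % len(result))


# Stand-in tuples shaped like the two possible return values
two_values = (["contour_a", "contour_b"], "hierarchy")
three_values = ("image", ["contour_a", "contour_b"], "hierarchy")

print(grab_contours(two_values) == grab_contours(three_values))
```

In practice you would write contours = grab_contours(cv2.findContours(...)) and leave the rest of the code unchanged.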
Here's another approach to crop out a rectangular ROI. The main idea is to separate the retina from the black background using Otsu's threshold, find contours, and then extract the ROI using Numpy slicing. Assuming you have an input image like this:
Extracted ROI
import cv2
# Load image, convert to grayscale, and apply Otsu's threshold
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY)[1]
# Find contour and sort by contour area
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
# Find bounding box and extract ROI
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    ROI = image[y:y+h, x:x+w]
    break
cv2.imshow('ROI',ROI)
cv2.imwrite('ROI.png',ROI)
cv2.waitKey()
This is a pretty simple way. Mask the image with transparency.
Read the image
Make a grayscale version.
Otsu Threshold
Apply morphology open and close to thresholded image as a mask
Put the mask into the alpha channel of the input
Save the output
Input
Code
import cv2
import numpy as np
# load image and convert to grayscale
img = cv2.imread('retina.jpeg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# threshold input image using otsu thresholding as mask and refine with morphology
ret, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
kernel = np.ones((9,9), np.uint8)
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
# put mask into alpha channel of result
result = img.copy()
result = cv2.cvtColor(result, cv2.COLOR_BGR2BGRA)
result[:, :, 3] = mask
# save resulting masked image
cv2.imwrite('retina_masked.png', result)
Output