Techniques to merge overlapping bounding boxes not working - python

I am trying to detect text regions in an image with OpenCV (3.0) and Python. So far, I am able to detect individual text characters and draw a bounding box or rectangle around them. My ultimate goal is merging individual text characters into words/text lines and eventually into text blocks or paragraphs.
My approach is to find neighbouring text regions and then form a bounding box around them. So, in my code, I have expanded the bounding box around each text character a bit, so that neighbouring boxes overlap, forming a chain of overlapping bounding boxes (please refer to the image). Now, I would like to merge these overlapping rectangles into a single bounding box based on the overlap ratio between all bounding box pairs.
I am having a very hard time figuring out how to merge overlapping rectangles into a single one. For the last 24 hours, I have tried different techniques without any luck. Among them is cv2.groupRectangles. I think it needs an array of rectangles (i.e. rectList) such as the one returned by cascade.detectMultiScale(); that in turn seems to require a Haar cascade to detect objects in the image (rectangles, in our case). There are a number of Haar cascades in the data folder, and I am confused about which one to use for detecting rectangles.
If cv2.groupRectangles is not applicable in my case, I would like to know what the other options are. If you have encountered similar problems, please share.
Here is my working Python code
import numpy as np
import cv2
im = cv2.imread('headintext.png')
grayImage = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
# grayImage = cv2.imread('gates.jpg', cv2.IMREAD_GRAYSCALE)
# cv2.imwrite('gates.png', grayImage)
# cv2.imshow('image', grayImage)
# cv2.waitKey(0)
# cv2.destroyAllWindows()
_,thresh = cv2.threshold(grayImage, 150, 255, cv2.THRESH_BINARY_INV)
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS,(3,3))
dilated = cv2.dilate(thresh, kernel, iterations = 1) # dilate
_,contours0,_ = cv2.findContours(dilated,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE) # get contours
contours = [cv2.approxPolyDP(cnt, 3, True) for cnt in contours0]
# contours, hierarchy = cv2.findContours(thresh, 1, 2)
# for each contour found, draw a rectangle around it on original image
for contour in contours:
    # get rectangle bounding contour
    [x, y, w, h] = cv2.boundingRect(contour)
    # discard areas that are too large
    if h > 50 and w > 50:
        continue
    # discard areas that are too small
    if h < 5 or w < 5:
        continue
    # slightly expand the rectangle so that neighbouring boxes overlap,
    # forming a chain of overlapping bounding boxes
    pad_w, pad_h = int(0.05 * w), int(0.15 * h)
    # draw rectangle around contour on original image
    cv2.rectangle(im, (x - pad_w, y - pad_h), (x + w + pad_w, y + h + pad_h), (255, 0, 255), 1)
# write original image with added contours to disk
cv2.imwrite("rectangle.png", im)
cv2.imshow('image', im)
cv2.waitKey(0)
cv2.destroyAllWindows()
I have tried the following cv2.groupRectangles code, but it has no effect at all. I have noticed that cascade.detectMultiScale does not find any rectangles.
import numpy as np
import cv2
cascade = cv2.CascadeClassifier('data/haarcascades/haarcascade_frontalface_alt.xml')
img = cv2.imread('rectangle.png')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
rawrects = cascade.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=1, minSize=(5, 5), flags = cv2.CASCADE_SCALE_IMAGE)
rects, weights = cv2.groupRectangles(np.array(rawrects).tolist(), 1, 0.2)
print("nrects %d" % len(rects))
for i, r in enumerate(rects):
    weight = weights[i]
    print("%f %f %f %f %d" % (r[0], r[1], r[2], r[3], weight))
    p1 = (r[0], r[1])
    p2 = (r[0] + r[2], r[1] + r[3])
    cv2.rectangle(img, p1, p2, (0, 0, 255))
cv2.imshow('image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
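(As an aside: cv2.groupRectangles does not require a cascade classifier at all. It takes a plain Python list of [x, y, w, h] rectangles plus a groupThreshold, but it clusters similar rectangles and rejects those without enough neighbours rather than merging chains of overlapping boxes, so it may simply not fit this problem.) A minimal sketch of a direct merge instead, with hypothetical helper names and untested on the image above:
def boxes_overlap(a, b):
    # True if rectangles a and b, given as (x, y, w, h), intersect or touch
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax <= bx + bw and bx <= ax + aw and ay <= by + bh and by <= ay + ah
def merge_boxes(boxes):
    # Repeatedly replace any overlapping pair with its union until stable
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        result = []
        while boxes:
            a = boxes.pop()
            i = 0
            while i < len(boxes):
                if boxes_overlap(a, boxes[i]):
                    b = boxes.pop(i)
                    x, y = min(a[0], b[0]), min(a[1], b[1])
                    x2 = max(a[0] + a[2], b[0] + b[2])
                    y2 = max(a[1] + a[3], b[1] + b[3])
                    a = (x, y, x2 - x, y2 - y)
                    merged = True
                    i = 0
                else:
                    i += 1
            result.append(a)
        boxes = result
    return boxes
Collecting the padded rectangles from the first script into a list instead of drawing them immediately, then running them through merge_boxes, should yield one box per word or text line, which can be drawn with cv2.rectangle as before.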

Related

Quantify longest axis and width of irregular shapes within a single image

Original question
I have about 80-100 images such as (A). Each image is composed of shapes that were filled with black color after marking the outline in ImageJ. I need to extract each individual shape as in (B). I then need to find the widths of the longest axis as in (C).
And I then need to calculate all the possible widths at regular points within the black shape and return a .csv file containing these values.
I have been doing this manually and there must be a quicker way to do it. Any idea how I can go about doing this?
Partial solution as per Epsi's answer
import matplotlib.pyplot as plt
import cv2
import numpy as np
image = cv2.imread("Desktop/Analysis/NormalCol0.tif")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Find contours, fit a rotated rectangle to each, obtain its four vertices, and draw
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
lengths = []
for i in contours:
    rect = cv2.minAreaRect(i)
    x_y, width_height, angle_of_rotation = rect
    height = width_height[1]
    lengths.append(height)
    print(height)
    box = np.int0(cv2.boxPoints(rect))
    # draw the rotated rectangle on the original image
    cv2.drawContours(image, [box], 0, (36,255,12), 3) # OR
    # cv2.polylines(image, [box], True, (36,255,12), 3)
plt.imshow(image)
cv2.imwrite("output.png", image)
Output:
4031.0
20.877727508544922
51.598445892333984
23.852108001708984
21.0
21.21320343017578
19.677398681640625
43.0
I am able to get the length of each shape. I next need to figure out how to get the width of the shape at regular points along its length.
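One hedged way to sample widths at regular points, offered as a sketch rather than part of any answer below: rotate each shape upright using the cv2.minAreaRect angle, then count foreground pixels per column at fixed steps. Note that the minAreaRect angle convention varies between OpenCV versions, so the deskew step may need adjusting.
import cv2
import numpy as np
def widths_at_regular_points(mask, contour, step=10):
    # Isolate this contour on a blank mask so other shapes don't interfere.
    single = np.zeros(mask.shape[:2], np.uint8)
    cv2.drawContours(single, [contour], -1, 255, -1)
    (cx, cy), (w, h), angle = cv2.minAreaRect(contour)
    if w < h:
        angle -= 90  # make the long axis horizontal
    # Rotate the shape upright about the rectangle centre.
    M = cv2.getRotationMatrix2D((cx, cy), angle, 1.0)
    upright = cv2.warpAffine(single, M, (single.shape[1], single.shape[0]))
    x, y, bw, bh = cv2.boundingRect(cv2.findNonZero(upright))
    # Width at a sample point = foreground pixel count in that column.
    return [int(np.count_nonzero(upright[y:y + bh, col]))
            for col in range(x, x + bw, step)]
Calling this for each contour in the loop above (passing thresh as the mask) and writing the lists out with the csv module would give the per-shape values the question asks for.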
As I'm not that good with python, I won't post code, but will post links to explanations and documentation.
How I would do it is as follows:
Invert the image, so that all the white becomes black and black becomes white. https://www.delftstack.com/howto/python/opencv-invert-image/#invert-images-using-numpy.invert-method-in-python
Use findContours to find all the (now) white contours. https://pythonexamples.org/python-opencv-cv2-find-contours-in-image/
Use minAreaRect to create bounding boxes around the contours, positioned so that the area of each box is as small as possible. https://theailearner.com/tag/cv2-minarearect/
From the created bounding boxes, take the longest side; this represents the length of your contour.
You can also get the angle of rotation from the bounding boxes. A compact sketch of these steps follows below.
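A compact sketch of those steps, assuming black shapes on a white background as in the question and a hypothetical filename:
import cv2
img = cv2.imread("shapes.png", cv2.IMREAD_GRAYSCALE)  # hypothetical filename
inverted = cv2.bitwise_not(img)  # step 1: black shapes become white
# step 2: find the now-white contours
contours, _ = cv2.findContours(inverted, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for cnt in contours:
    # step 3: smallest rotated rectangle around the contour
    centre, (w, h), angle = cv2.minAreaRect(cnt)
    # step 4: the longer side is the length; step 5: the angle comes for free
    print(max(w, h), angle)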

How to detect columns of rectangles in mask image?

I have found contours of some rectangles in an image and created a mask as shown below. What I am trying to do is to find those two columns of rectangles as highlighted in the image.
The source image:
Columns highlighted:
Desired output:
I'm not sure one can achieve that with a simple algorithm. If those inline vertical rectangles do not change their position, you could just define a hardcoded ROI (region of interest) for them, based on pixel coordinates, as in the short sketch below.
If you are not using any machine learning to solve this, defining an ROI is your best option.
Feel free to ask if anything is unclear.
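A hardcoded ROI in OpenCV is plain NumPy slicing; the filename and coordinates here are made up for illustration:
import cv2
img = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)  # hypothetical filename
# Hypothetical pixel coordinates of one column of rectangles.
x1, y1, x2, y2 = 100, 50, 180, 700
column_roi = img[y1:y2, x1:x2]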
Assuming the desired columns have significantly more contours than the other columns – as shown in the given example image – simple dilation with some vertical structuring element might be sufficient to find those columns:
Load image as grayscale, get rid of JPG artifacts.
Dilation with "small" vertical structuring element to combine all contours within a column.
Opening with "large" vertical structuring element to neglect all smaller contours. The two columns in question will now have quite large contours.
For safety reasons: Again, dilation with "small" vertical structuring element. Since the columns are slightly rotated, skewed, ... single pixels might've been falsely removed by the opening.
Find remaining contours.
For each contour: Get the bounding boxes in two steps, since we have to refine the bounding box by using the original image (because of step 4).
That'd be the full Python code:
import cv2
import numpy as np
# Read image, get rid of JPG artifacts
img = cv2.imread('gXylF.jpg', cv2.IMREAD_GRAYSCALE)
img = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY)[1]
# Dilating using vertical structuring element to combine contours
mask = cv2.dilate(img, kernel=np.ones((21, 1)))
# Opening using vertical structuring element to neglect small contours
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel=np.ones((201, 1)))
# Dilating using rectangular structuring element; for safety reasons:
# Single pixels might've been falsely removed by the opening
mask = cv2.dilate(mask, kernel=np.ones((11, 11)))
# Find remaining contours w.r.t. the OpenCV version
cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
# Iterate remaining contours
for i, cnt in enumerate(cnts):
    # Get bounding rectangle of contour, and region of interest (ROI)
    (x, y, w, h) = cv2.boundingRect(cnt)
    roi = img[y:y+h, x:x+w]
    # Get bounding rectangle of actual values in ROI; the contour of the
    # mask is larger than the actual contours in the original image
    (x, y, w, h) = cv2.boundingRect(roi)
    roi = roi[y:y+h, x:x+w]
    cv2.imwrite('{}.png'.format(i), roi)
The two exported images look like this:
If you, instead, wanted to mark the regions of interest (ROIs) within the original image, take care to use the correct coordinates of the bounding boxes (the refined coordinates are relative to the first cut ROI).
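For example, a tiny helper to translate the refined box back to full-image coordinates might look like this (a hypothetical sketch, not part of the answer's code):
def absolute_box(outer, inner):
    # outer: (x, y, w, h) of the first ROI within the full image
    # inner: (x, y, w, h) measured inside that ROI
    (ox, oy, _, _), (ix, iy, iw, ih) = outer, inner
    return (ox + ix, oy + iy, iw, ih)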
If your other images show more rotation, skew, ... you might need to re-draw the contours w.r.t. the original image, if neighbouring small contours should be neglected completely.
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.9.1
NumPy: 1.20.1
OpenCV: 4.5.1
----------------------------------------

Detecting and counting blobs/connected objects with opencv

I want to detect and count the objects in an image that touch, while ignoring what could be considered a single object. I have the basic image, on which I tried applying the cv2.HoughCircles() method to identify some circles. I then parsed the returned array and used cv2.circle() to draw them on the image.
However, I always seem to get too many circles back from cv2.HoughCircles() and couldn't figure out how to count only the objects that are touching.
This is the image I was working on
My code so far:
import numpy
import matplotlib.pyplot as pyp
import cv2
segmented = cv2.imread('photo')
gray = cv2.cvtColor(segmented, cv2.COLOR_BGR2GRAY)  # HoughCircles expects a single-channel image
houghCircles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, 1, 80, param1=450, param2=10, minRadius=30, maxRadius=200)
houghArray = numpy.uint16(houghCircles)[0,:]
for circle in houghArray:
    cv2.circle(segmented, (circle[0], circle[1]), circle[2], (0, 250, 0), 3)
And this is the image I get, which is quite a far cry from what I really want.
How can I properly identify and count said objects?
Here is one way in Python/OpenCV: get the contour area and the convex hull area of each contour, then take the ratio (area/convex_hull_area). If the ratio is small enough, the contour is a cluster of blobs; otherwise it is an isolated blob.
Input:
import cv2
import numpy as np
# read input image
img = cv2.imread('blobs_connected.jpg')
# convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# threshold to binary
thresh = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY)[1]
# find contours
#label_img = img.copy()
contour_img = img.copy()
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
index = 1
isolated_count = 0
cluster_count = 0
for cntr in contours:
    area = cv2.contourArea(cntr)
    convex_hull = cv2.convexHull(cntr)
    convex_hull_area = cv2.contourArea(convex_hull)
    ratio = area / convex_hull_area
    #print(index, area, convex_hull_area, ratio)
    #x,y,w,h = cv2.boundingRect(cntr)
    #cv2.putText(label_img, str(index), (x,y), cv2.FONT_HERSHEY_COMPLEX_SMALL, 1, (0,0,255), 2)
    if ratio < 0.91:
        # cluster contours in red
        cv2.drawContours(contour_img, [cntr], 0, (0,0,255), 2)
        cluster_count = cluster_count + 1
    else:
        # isolated contours in green
        cv2.drawContours(contour_img, [cntr], 0, (0,255,0), 2)
        isolated_count = isolated_count + 1
    index = index + 1
print('number_clusters:',cluster_count)
print('number_isolated:',isolated_count)
# save result
cv2.imwrite("blobs_connected_result.jpg", contour_img)
# show images
cv2.imshow("thresh", thresh)
#cv2.imshow("label_img", label_img)
cv2.imshow("contour_img", contour_img)
cv2.waitKey(0)
Clusters in Red, Isolated blobs in Green:
Textual Information:
number_clusters: 4
number_isolated: 81
approach it in steps.
label connected components. two connected blobs get the same label because they're connected. so far so good.
now separate your blobs. use watershed (first comment) or whatever other method gives you results. I can't fully predict the watershed approach. it might deal with touching blobs of dissimilar size or it might do something silly. the sample/tutorial also assumes a minimum size (0.7 * max peak); plug in something absolute in pixels maybe.
then, for each separated blob, check which label it sits on (take coordinates of centroid to be safe), and note down a +1 for that label (a histogram).
any label that has more than one separated blob sitting on it would be what you are looking for. a rough sketch of this pipeline is below.
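a rough sketch of that pipeline (the 0.7 * max distance-transform threshold is the tutorial's default and the 3x3 kernel is a guess, both worth tuning; the filename is borrowed from the answer above):
import cv2
import numpy as np
img = cv2.imread('blobs_connected.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY)[1]
# step 1: label connected components; touching blobs share one label
n_labels, labels = cv2.connectedComponents(thresh)
# step 2: separate blobs with the standard distance-transform watershed
kernel = np.ones((3, 3), np.uint8)
sure_bg = cv2.dilate(thresh, kernel, iterations=3)
dist = cv2.distanceTransform(thresh, cv2.DIST_L2, 5)
sure_fg = np.uint8(dist > 0.7 * dist.max()) * 255  # or an absolute pixel value
unknown = cv2.subtract(sure_bg, sure_fg)
n_seeds, markers = cv2.connectedComponents(sure_fg)
markers = markers + 1          # background becomes 1, seeds 2..n_seeds
markers[unknown == 255] = 0    # let watershed decide the uncertain band
markers = cv2.watershed(img, markers)
# step 3: histogram of separated blobs per original label, using each
# blob's centroid to look up the label it sits on
counts = np.zeros(n_labels, dtype=int)
for seed in range(2, n_seeds + 1):
    ys, xs = np.nonzero(markers == seed)
    if len(xs) == 0:
        continue
    label = labels[int(ys.mean()), int(xs.mean())]
    if label > 0:
        counts[label] += 1
# step 4: any label carrying more than one separated blob is a touching cluster
print('touching clusters:', int(np.sum(counts > 1)))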

Identifying multiple rectangles and draw bounding box around them using OpenCV

I am attempting to draw a bounding box around the two rectangles that are found in the image, but not including the curvy-lined 'noise'.
I have tried multiple methods, including the Hough Line Transform and extracting coordinates, but to no avail. My methods seemed too arbitrary: I tried to find the black space between the true rectangles and the noise at the top of the frames, but couldn't derive a solid general algorithm from that.
This is not so simple to do. You can try to isolate the vertical lines, which are quite distinguishable, dilate/erode to make each rectangle solid, and then find the contours of what is left and filter them accordingly. The code would look like:
import numpy as np
import cv2
minArea = 20 * 20 # area of 20 x 20 pixels
# load image and threshold it
original = cv2.imread("a.png")
img = cv2.cvtColor(original, cv2.COLOR_BGR2GRAY)
ret, thres = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY )
# Get the vertical lines
verticalStructure = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 10))
vertical = cv2.erode(thres, verticalStructure)
vertical = cv2.dilate(vertical, verticalStructure)
# close holes to make it solid rectangle
kernel = np.ones((45,45),np.uint8)
close = cv2.morphologyEx(vertical, cv2.MORPH_CLOSE, kernel)
# get contours
im2, contours, hierarchy = cv2.findContours(close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# draw the contours with area bigger than a minimum and that is almost rectangular
for cnt in contours:
    x,y,w,h = cv2.boundingRect(cnt)
    area = cv2.contourArea(cnt)
    if area > (w*h*.60) and area > minArea:
        original = cv2.rectangle(original, (x,y),(x+w,y+h), (0,0,255), 3)
cv2.imshow("image", original)
cv2.waitKey(0)
cv2.destroyAllWindows()
And the result is:
If it does not work with other images, try adjusting the parameters.

Detecting text in an image using python

I have around 100+ images with 2 different texts on them; examples are below. One says occupied and the other unoccupied.
Is there any way in Python to differentiate these images by detecting the text in them?
If so, I want to identify the occupied images and delete the unoccupied ones.
Since I am new to Python, can anyone help me with this?
This answer is based on the assumption that there are only two different texts on the images, as posted in the question. So I assume that the number of characters and the color of the text are always the same ("Room status: Unoccupied" and "Room status: Occupied", in red). That being said, I would try a simpler way to differentiate between these two types. The images contain characters that are very close to each other, so in my opinion it would be very difficult to separate each character and identify it with an OCR.
I would try a simpler approach instead, like finding the area containing the text and measuring the sheer length of the text: "unoccupied" has two more characters than "occupied" and hence spans a noticeably bigger length. So you can transform the image to the HSV color space and use the cv2.inRange() function to extract the text (red color). Then you can merge the characters into one contour with cv2.morphologyEx() and get its length with cv2.minAreaRect(). Hope it helps or at least gives you a new perspective on how to find your solution. Cheers!
Example code:
import cv2
import numpy as np
# Read the image and transform to HSV colorspace.
img = cv2.imread('ocupied.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Extract the red text.
lower_red = np.array([0,150,50])
upper_red = np.array([40,255,255])
mask_red = cv2.inRange(hsv, lower_red, upper_red)
# Search for contours on the mask.
_, contours, hierarchy = cv2.findContours(mask_red,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
# Create a new mask for further processing.
mask = np.ones(img.shape, np.uint8)*255
# Draw contours on the mask with size and ratio of borders for threshold (to remove other noises from the image).
for cnt in contours:
    size = cv2.contourArea(cnt)
    x,y,w,h = cv2.boundingRect(cnt)
    if 10000 > size > 50 and w*2.5 > h:
        cv2.drawContours(mask, [cnt], -1, (0,0,0), -1)
# Connect neighbour contours and select the biggest one (text).
kernel = np.ones((50,50),np.uint8)
opening = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
gray_op = cv2.cvtColor(opening, cv2.COLOR_BGR2GRAY)
_, threshold_op = cv2.threshold(gray_op, 150, 255, cv2.THRESH_BINARY_INV)
_, contours_op, hierarchy_op = cv2.findContours(threshold_op, cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
cnt = max(contours_op, key=cv2.contourArea)
# Create rotated rectangle to get the 4 points of the rectangle.
rect = cv2.minAreaRect(cnt)
# Create the bounding box and calculate the "length" of the text.
box = np.int0(cv2.boxPoints(rect))
(x1, y1) = (box[:,0].min(), box[:,1].min())
(x2, y2) = (box[:,0].max(), box[:,1].max())
# Draw the rectangle.
cv2.rectangle(img,(x1,y1),(x2,y2),(0,255,0),1)
# Identify the room status.
if x2 - x1 > 200:
    print('unoccupied')
else:
    print('occupied')
# Display the result
cv2.imshow('img', img)
cv2.waitKey(0)
Result:
occupied
unoccupied
Using the Tesseract OCR engine and the Python wrapper pytesseract, it is only a few lines' task:
import pytesseract
from PIL import Image
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"
img = Image.open('D:\\tmp2.jpg').crop((0,0,250,35))
print(pytesseract.image_to_string(img, config='--psm 7'))
I have tested this on Windows 7. Of course, I have assumed that the text appears at the same position in every image (from your example, that does seem to be the case). Otherwise, you need to find a better cropping mechanism; one hedged option is sketched below.
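If the text position does vary, one option (a sketch only, reusing the red-text mask idea from the first answer; the filename and HSV bounds are assumptions) is to locate the red text first and crop around it before running the OCR:
import cv2
import numpy as np
import pytesseract
from PIL import Image
img = cv2.imread('room.jpg')  # hypothetical filename
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Assumed HSV bounds for the red text, borrowed from the first answer.
mask = cv2.inRange(hsv, np.array([0, 150, 50]), np.array([40, 255, 255]))
ys, xs = np.nonzero(mask)
if len(xs) > 0:
    # Crop a small margin around the red pixels and OCR just that region.
    crop = img[max(ys.min() - 5, 0):ys.max() + 5, max(xs.min() - 5, 0):xs.max() + 5]
    rgb = cv2.cvtColor(crop, cv2.COLOR_BGR2RGB)
    print(pytesseract.image_to_string(Image.fromarray(rgb), config='--psm 7'))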
