I'm trying to crop both columns from several pages like this so I can OCR them later, splitting each page along the vertical dividing line.
What I've got so far is finding the header, so that it can be cropped out:
import cv2

image = cv2.imread('014-page1.jpg')
im_h, im_w, im_d = image.shape
base_image = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (7,7), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Create rectangular structuring element and dilate
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (50,10))
dilate = cv2.dilate(thresh, kernel, iterations=1)

# Find contours and draw rectangle
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key=lambda x: cv2.boundingRect(x)[1])

for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    if h < 20 and w > 250:
        cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
How could I split the page vertically, and grab the text in sequence from the columns? Or alternatively, is there a better way to go about this?
Here's my take on the problem. It involves selecting a middle portion of the image, assuming the vertical line runs through the entire image (or at least passes through the middle of the page). I process this Region of Interest (ROI) and then reduce it to a row. Then I get the starting and ending horizontal coordinates of the crop, and with this information I produce the final cropped images.
I tried to make the algorithm general. It can split all the columns if you have more than two columns in the original image. Let's check out the code:
# Imports:
import numpy as np
import cv2
# Image path
path = "D://opencvImages//"
fileName = "pmALU.jpg"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# To grayscale:
grayImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Otsu Threshold:
_, binaryImage = cv2.threshold(grayImage, 0, 255, cv2.THRESH_OTSU)
# Get image dimensions:
(imageHeight, imageWidth) = binaryImage.shape[:2]
# Set middle ROI dimensions:
middleVertical = 0.5 * imageHeight
roiWidth = imageWidth
roiHeight = int(0.1 * imageHeight)
middleRoiVertical = 0.5 * roiHeight
roiY = int(0.5 * imageHeight - middleRoiVertical)
The first portion of the code gets the ROI. I set it to crop around the middle of the image. Let's just visualize the ROI that will be used for processing:
The next step is to crop this:
# Slice the ROI:
middleRoi = binaryImage[roiY:roiY + roiHeight, 0:imageWidth]
# (showImage and writeImage are small custom helpers wrapping
# cv2.imshow and cv2.imwrite)
showImage("middleRoi", middleRoi)
writeImage(path + "middleRoi", middleRoi)
This produces the following crop:
Alright. The idea is to reduce this image to one row. If I take the minimum value of each column and store those in one row, I should get a big white portion where the blank gutter around the vertical line passes through.
Now, there's a problem here. If I directly reduce this image, this would be the result (the following is an image of the reduced row):
The image is a little bit small, but you can see the row produces two black columns at the sides, followed by two white blobs. That's because the image has been scanned; additionally, the text seems to be justified, so some margins appear at the sides. I only need the central white blob, with everything else in black.
I can solve this in two steps: draw a white rectangle around the image before reducing it, which will take care of the black columns. After this, I can flood-fill with black again at both sides of the reduced image:
# White rectangle around ROI:
rectangleThickness = int(0.01 * imageHeight)
cv2.rectangle(middleRoi, (0, 0), (roiWidth, roiHeight), 255, rectangleThickness)

# Image reduction to a row:
reducedImage = cv2.reduce(middleRoi, 0, cv2.REDUCE_MIN)

# Flood fill at the extreme corners:
fillPositions = [0, imageWidth - 1]

for i in range(len(fillPositions)):
    # Get flood-fill coordinate:
    x = fillPositions[i]
    currentCorner = (x, 0)
    fillColor = 0
    cv2.floodFill(reducedImage, None, currentCorner, fillColor)
Now, the reduced image looks like this:
Nice. But there's another problem. The central black line produced a "gap" at the center of the row. Not a problem really, because I can fill that gap with a closing:
# Apply Closing (fills the small black gap):
kernel = np.ones((3, 3), np.uint8)
reducedImage = cv2.morphologyEx(reducedImage, cv2.MORPH_CLOSE, kernel, iterations=2)
This is the result. No more central gap:
Cool. Let's get the horizontal positions (column indices) where the transitions from black to white and vice versa occur, starting at 0:
# Get horizontal transitions:
whiteSpaces = np.where(np.diff(reducedImage, prepend=np.nan))[1]
I now know where to crop. Let's see:
# Crop the image:
colWidth = len(whiteSpaces)
spaceMargin = 0

for x in range(0, colWidth, 2):
    # Get horizontal cropping coordinates:
    if x != colWidth - 1:
        x2 = whiteSpaces[x + 1]
        spaceMargin = (whiteSpaces[x + 2] - whiteSpaces[x + 1]) // 2
    else:
        x2 = imageWidth

    # Set horizontal cropping coordinates:
    x1 = whiteSpaces[x] - spaceMargin
    x2 = x2 + spaceMargin

    # Clamp and crop original input:
    x1 = clamp(x1, 0, imageWidth)
    x2 = clamp(x2, 0, imageWidth)
    currentCrop = inputImage[0:imageHeight, x1:x2]
    cv2.imshow("currentCrop", currentCrop)
    cv2.waitKey(0)
You'll note I calculate a margin, so each crop keeps a little of the white space around its column. I also use a clamp function to make sure the horizontal cropping points always stay within the image dimensions. This is the definition of that function:
# Clamps an integer to a valid range:
def clamp(val, minval, maxval):
    if val < minval: return minval
    if val > maxval: return maxval
    return val
These are the results (resized for the post, open them in a new tab to see the full image):
Let's check out how this scales to more than two columns. This is a modification of the original input, with more columns added manually, just to check out the results:
These are the four images produced:
In order to separate the two columns you have to find the dividing line in the center.
You can use a Sobel derivative filter in the x direction to find the black vertical line. Follow this tutorial for more details on the Sobel operator.
sobel_vertical = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)  # (1,0) for x-direction derivatives
Extract the line position by thresholding the Sobel result. The raw output is signed, so take its absolute value first:
sobel_abs = cv2.convertScaleAbs(sobel_vertical)
ret, sobel_thresh = cv2.threshold(sobel_abs, 127, 255, cv2.THRESH_BINARY)
Then scan the center columns for a column with a high concentration of white values.
One way to do this is a column-wise sum, then taking the column with the maximum sum. But there are other ways to do it.
sum_cols = np.add.reduce(sobel_thresh, axis=0)  # column-wise sum
max_col = np.argmax(sum_cols)
In a case where there is no black dividing line, you can skip the Sobel step. Just resize aggressively and search the center for the column with the highest concentration of white pixels.
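For example, here is a minimal sketch of that fallback (the file name, downscale factor, and middle-half search window are all assumptions): binarize the page so the background is white, then pick the whitest column near the centre as the split point.
import cv2
import numpy as np

img = cv2.imread('page.jpg', cv2.IMREAD_GRAYSCALE)
# Aggressive downscale so the text smears together:
small = cv2.resize(img, None, fx=0.1, fy=0.1, interpolation=cv2.INTER_AREA)
_, binary = cv2.threshold(small, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# Per-column whiteness, searched only in the middle half of the page:
col_sums = binary.sum(axis=0)
w = binary.shape[1]
split_small = w // 4 + np.argmax(col_sums[w // 4 : 3 * w // 4])
split_x = int(split_small / 0.1)  # map back to full resolution
left, right = img[:, :split_x], img[:, split_x:]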
I want to write a tool for finding the number of angles, curves and straight lines within each bounded object in an image.
All input images will be black on white background and all will represent characters.
As illustrated in the image, for each bounded region, each shape occurrence is noted. It would be preferable to be able to have a threshold for how curvy a curve must be to be considered a curve and not an angle etc. And the same for straight lines and angles.
I have used Hough Line Transform for detecting straight lines on other images and it might work in combination with something here I thought.
Am open to other libraries than opencv - this is just what I have some experience with.
Thanks in advance
EDIT:
So based on the answer from Markus, I made a program using the findContours() with CHAIN_APPROX_SIMPLE.
It produces a somewhat weird result when inputting a 'k': it correctly identifies some points around the angles, but then the 'leg' (the lower diagonal part) has many, many points on it. I am unsure how to go about segmenting this to group into straights, angles and curves.
Code:
import cv2
import numpy as np

img = cv2.imread('Helvetica-K.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (3, 3), 0)
edges = cv2.Canny(blurred, 50, 150, apertureSize=3)
ret, thresh = cv2.threshold(gray, 180, 255, cv2.THRESH_BINARY_INV)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
#cv2.drawContours(img, contours, 0, (0,255,0), 1)

# Coordinates of each contour point:
for i in range(len(contours[0])):
    print(contours[0][i][0][0])
    print(contours[0][i][0][1])
    cv2.circle(img, (contours[0][i][0][0], contours[0][i][0][1]), 2, (0,0,255), -1)

cv2.imshow('img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Img example:
You can use findContours with option CHAIN_APPROX_SIMPLE.
A point with an angle less than some threshold is a corner.
A point with an angle more than some threshold is on a straight line and should be removed.
Two adjacent points with a distance of more than some threshold are the ends of a straight line.
Two adjacent points that are identified to be corners are the ends of a straight line.
All other points belong to some curvy detail.
Update:
Here is some code you can start with. It shows how to smoothen the straight lines, how you can merge several corner points into one, and how to calculate distances and angles at each point. There is still some work to be done for you to achieve the required result but I hope it leads in the right direction.
import numpy as np
import numpy.linalg as la
import cv2

def get_angle(p1, p2, p3):
    v1 = np.subtract(p2, p1)
    v2 = np.subtract(p2, p3)
    cos = np.inner(v1, v2) / la.norm(v1) / la.norm(v2)
    rad = np.arccos(np.clip(cos, -1.0, 1.0))
    return np.rad2deg(rad)

def get_angles(p, d):
    n = len(p)
    return [(p[i], get_angle(p[(i-d) % n], p[i], p[(i+d) % n])) for i in range(n)]

def remove_straight(p):
    angles = get_angles(p, 2)                  # approximate angles at points (two steps in path)
    return [p for (p, a) in angles if a < 170] # remove points with almost straight angles

def max_corner(p):
    angles = get_angles(p, 1)                  # get angles at points
    j = 0

    while j < len(angles):                     # for each point
        k = (j + 1) % len(angles)              # and its successor
        (pj, aj) = angles[j]
        (pk, ak) = angles[k]

        if la.norm(np.subtract(pj, pk)) <= 4:  # if points are close
            if aj > ak:                        # remove point with greater angle
                angles.pop(j)
            else:
                angles.pop(k)
        else:
            j += 1

    return [p for (p, a) in angles]

def main():
    img = cv2.imread('abc.png')
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    ret, thresh = cv2.threshold(gray, 180, 255, cv2.THRESH_BINARY_INV)
    contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

    for c in contours:                  # for each contour
        pts = [v[0] for v in c]         # get pts from contour
        pts = remove_straight(pts)      # remove almost straight angles
        pts = max_corner(pts)           # remove nearby points with greater angle
        angles = get_angles(pts, 1)     # get angles at points

        # draw result (convert points to tuples for cv2.circle)
        for (p, a) in angles:
            if a < 120:
                cv2.circle(img, tuple(p), 3, (0, 0, 255), -1)  # corner: red
            else:
                cv2.circle(img, tuple(p), 3, (0, 255, 0), -1)  # curve point: green

    cv2.imwrite('out.png', img)
    cv2.destroyAllWindows()

main()
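To get from the drawn points to the original goal of counting features per character, a small hypothetical helper (not part of the answer's code above) could tally the classified points, reusing the 120-degree threshold from the drawing loop:
def count_features(angles, corner_thresh=120):
    # Tally (point, angle) pairs: sharp angles count as corners,
    # everything else as points on curvy details.
    corners = sum(1 for (_, a) in angles if a < corner_thresh)
    return corners, len(angles) - corners
Calling count_features(angles) inside the contour loop would give per-contour counts; identifying straight lines additionally needs the distance heuristics listed before the code.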
I am trying to take an image of a license plate and do some image processing to draw contours around the plate, which I can then use to warp the perspective and view the plate face-on. Unfortunately, I am getting an error when I try to draw contours around an image I have processed. Specifically, I get an Invalid shape (4, 1, 2) for the image data error. I am not sure how to go about solving this, as I know all the other images I have processed are fine; it's just when I try to draw contours that something goes wrong.
import cv2
import numpy as np
from matplotlib import pyplot as plt

kernel = np.ones((3,3))
image = cv2.imread('NoPlate0.jpg')

def getContours(img):
    biggest = np.array([])
    maxArea = 0
    contours, hierarchy = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    for cnt in contours:
        area = cv2.contourArea(cnt)
        if area > 500:
            cv2.drawContours(imgContour, cnt, -1, (255, 0, 0), 3)
            peri = cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, 0.02*peri, True)
            if area > maxArea and len(approx) == 4:
                biggest = approx
                maxArea = area
    return biggest

imgGray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
imgBlur = cv2.GaussianBlur(imgGray, (5,5), 1)
imgCanny = cv2.Canny(imgBlur, 150, 200)
imgDial = cv2.dilate(imgCanny, kernel, iterations=2)
imgThres = cv2.erode(imgDial, kernel, iterations=2)
imgContour = image.copy()

titles = ['original', 'Blur', 'Canny', 'Dilate', 'Threshold', 'Contours']
images = [image, imgBlur, imgCanny, imgDial, imgThres, getContours(imgThres)]

for i in range(6):
    plt.subplot(3, 3, i+1), plt.imshow(images[i], 'gray')
    plt.title(titles[i])

plt.show()
The exact error I am getting is this:
TypeError: Invalid shape (4, 1, 2) for image data
I am using the following image below as my input:
Your function only returns the actual points along the contour, which you then try to call plt.imshow on. This is why you are getting this error. What you need to do is use cv2.drawContours with this contour to get what you want. In this case, we should restructure your getContours function so that it returns both the coordinates (so you can use these later) and the actual contours drawn on the image itself. Instead of mutating imgContour and treating it like a global variable, only draw to this image once: the largest contour found in the loop.
def getContours(img):
    biggest = np.array([])
    maxArea = 0
    imgContour = img.copy()  # Change - make a copy of the image to return
    contours, hierarchy = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    index = None

    for i, cnt in enumerate(contours):  # Change - also provide index
        area = cv2.contourArea(cnt)
        if area > 500:
            peri = cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, 0.02*peri, True)
            if area > maxArea and len(approx) == 4:
                biggest = approx
                maxArea = area
                index = i  # Also save index to contour

    if index is not None:  # Draw the biggest contour on the image
        cv2.drawContours(imgContour, contours, index, (255, 0, 0), 3)

    return biggest, imgContour  # Change - also return drawn image
Finally we can use this in your overall code in the following way:
import cv2
import numpy as np
from matplotlib import pyplot as plt

kernel = np.ones((3,3))
image = cv2.imread('NoPlate0.jpg')

imgGray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
imgBlur = cv2.GaussianBlur(imgGray, (5,5), 1)
imgCanny = cv2.Canny(imgBlur, 150, 200)
imgDial = cv2.dilate(imgCanny, kernel, iterations=2)
imgThres = cv2.erode(imgDial, kernel, iterations=2)
biggest, imgContour = getContours(imgThres)  # Change

titles = ['original', 'Blur', 'Canny', 'Dilate', 'Threshold', 'Contours']
images = [image, imgBlur, imgCanny, imgDial, imgThres, imgContour]  # Change

for i in range(6):
    plt.subplot(3, 3, i+1), plt.imshow(images[i], 'gray')
    plt.title(titles[i])

plt.show()
As a final note, if you want to warp the license plate image so that it's parallel to the image plane, you can use cv2.getPerspectiveTransform to define a homography going from the original source image (the source points) to the warped image (the destination points), then use cv2.warpPerspective to finally warp the image. Take note that the source and destination points need to be ordered so that their corresponding locations match in perspective. That is, if the first point of the set defining the quadrilateral of your region was the top left, the source and destination points should both define the top-left corner. You can do this by finding the centroid of each quadrilateral, finding the angle subtended from the centroid to each corner, and ordering both sets by sorting on those angles.
Here's the following function I wrote that does this called order_points:
def order_points(pts):
    # Step 1: Find centre of object (mean per coordinate, hence axis=0)
    center = np.mean(pts, axis=0)

    # Step 2: Move coordinate system to centre of object
    shifted = pts - center

    # Step #3: Find angles subtended from centroid to each corner point
    theta = np.arctan2(shifted[:, 0], shifted[:, 1])

    # Step #4: Return vertices ordered by theta
    ind = np.argsort(theta)
    return pts[ind]
Finally, with the corner points you returned, try doing:
src = np.squeeze(biggest).astype(np.float32)  # Source points
height = image.shape[0]
width = image.shape[1]

# Destination points
dst = np.float32([[0, 0], [0, height - 1], [width - 1, 0], [width - 1, height - 1]])

# Order the points correctly
src = order_points(src)
dst = order_points(dst)

# Get the perspective transform
M = cv2.getPerspectiveTransform(src, dst)

# Warp the image
img_shape = (width, height)
warped = cv2.warpPerspective(image, M, img_shape, flags=cv2.INTER_LINEAR)
src are the four corners of the source polygon that encompasses the license plate. Take note that because they're returned from cv2.approxPolyDP, they will be a 4 x 1 x 2 NumPy array of integers. You will need to remove the singleton second dimension and convert them to 32-bit floating point so that they can be used with cv2.getPerspectiveTransform. dst are the destination points, where each corner of the source polygon gets mapped to the corner points of the actual output image, which will be the same size as the input image. One last thing to remember is that with cv2.warpPerspective, you specify the size of the image as (width, height).
If you finally want to integrate this all together and make the getContours function return the warped image, we can do this very easily. We have to modify a few things to get this to work as intended:
getContours will also take in the original RGB image so that we can properly visualise the contour and get a better perspective on how the license plate is being localised.
Add in the logic to warp the image inside getContours as I showed above.
Change the plotting code to also include this warped image as well as return the warped image from getContours.
Modify the plotting code slightly for showing the original image in Matplotlib, as cv2.imread reads in images in BGR format, but Matplotlib expects images to be in RGB format.
Therefore:
import cv2
import numpy as np
from matplotlib import pyplot as plt

def order_points(pts):
    # Step 1: Find centre of object (mean per coordinate, hence axis=0)
    center = np.mean(pts, axis=0)

    # Step 2: Move coordinate system to centre of object
    shifted = pts - center

    # Step #3: Find angles subtended from centroid to each corner point
    theta = np.arctan2(shifted[:, 0], shifted[:, 1])

    # Step #4: Return vertices ordered by theta
    ind = np.argsort(theta)
    return pts[ind]

def getContours(img, orig):  # Change - pass the original image too
    biggest = np.array([])
    maxArea = 0
    imgContour = orig.copy()  # Make a copy of the original image to return
    contours, hierarchy = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    index = None

    for i, cnt in enumerate(contours):  # Change - also provide index
        area = cv2.contourArea(cnt)
        if area > 500:
            peri = cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, 0.02*peri, True)
            if area > maxArea and len(approx) == 4:
                biggest = approx
                maxArea = area
                index = i  # Also save index to contour

    warped = None  # Stores the warped license plate image
    if index is not None:  # Draw the biggest contour on the image
        cv2.drawContours(imgContour, contours, index, (255, 0, 0), 3)

        src = np.squeeze(biggest).astype(np.float32)  # Source points
        height = orig.shape[0]
        width = orig.shape[1]

        # Destination points
        dst = np.float32([[0, 0], [0, height - 1], [width - 1, 0], [width - 1, height - 1]])

        # Order the points correctly
        biggest = order_points(src)
        dst = order_points(dst)

        # Get the perspective transform (use the ordered source points)
        M = cv2.getPerspectiveTransform(biggest, dst)

        # Warp the image
        img_shape = (width, height)
        warped = cv2.warpPerspective(orig, M, img_shape, flags=cv2.INTER_LINEAR)

    return biggest, imgContour, warped  # Change - also return drawn image
kernel = np.ones((3,3))
image = cv2.imread('NoPlate0.jpg')
imgGray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
imgBlur = cv2.GaussianBlur(imgGray,(5,5),1)
imgCanny = cv2.Canny(imgBlur,150,200)
imgDial = cv2.dilate(imgCanny,kernel,iterations=2)
imgThres = cv2.erode(imgDial,kernel,iterations=2)
biggest, imgContour, warped = getContours(imgThres, image) # Change
titles = ['Original', 'Blur', 'Canny', 'Dilate', 'Threshold', 'Contours', 'Warped'] # Change - also show warped image
images = [image[...,::-1], imgBlur, imgCanny, imgDial, imgThres, imgContour, warped] # Change
# Change - Also show contour drawn image + warped image
for i in range(5):
    plt.subplot(3, 3, i+1)
    plt.imshow(images[i], cmap='gray')
    plt.title(titles[i])
plt.subplot(3, 3, 6)
plt.imshow(images[-2])
plt.title(titles[-2])
plt.subplot(3, 3, 8)
plt.imshow(images[-1])
plt.title(titles[-1])
plt.show()
The figure I get is now:
You need to reshape biggest, which is returned by getContours(), to (4, 2). Also, if you want the warped image, you need to import imutils. So to solve your issue, please do the following:
import the four_point_transform function by adding:
from imutils.perspective import four_point_transform
And change the return statement of getContours() function like below:
return four_point_transform(img, biggest.reshape(4, 2))
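For context, imutils' four_point_transform orders the four corners for you and then applies cv2.getPerspectiveTransform and cv2.warpPerspective. A minimal standalone sketch of the same call (the file name and the example corner coordinates are hypothetical; biggest stands in for the 4 x 1 x 2 result of cv2.approxPolyDP from the question's code):
from imutils.perspective import four_point_transform
import cv2
import numpy as np

image = cv2.imread('NoPlate0.jpg')
# Hypothetical 4 x 1 x 2 output of cv2.approxPolyDP for a plate region:
biggest = np.array([[[64, 87]], [[420, 65]], [[440, 220]], [[56, 242]]])
# Reshape to 4 x 2, since four_point_transform expects one point per row:
warped = four_point_transform(image, biggest.reshape(4, 2))
cv2.imshow('Warped', warped)
cv2.waitKey(0)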
One of the first processing steps in a tool I'm coding is to find the coordinates of the outside corners of 4 big black squares. They will then be used to do a homographic transform in order to deskew / unrotate the image (a.k.a. a perspective transform), finally producing a rectangular image. Here is an example of a rotated and noisy input (download link here):
To keep the big squares only, I'm using morphological transformations like closing/opening:
import cv2, numpy as np
img = cv2.imread('rotatednoisy-cropped.png', cv2.IMREAD_GRAYSCALE)
kernel = np.ones((30, 30), np.uint8)
img = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)
cv2.imwrite('output.png', img)
Input file (download link):
Output, after morphological transform:
Problem: the output squares are not square anymore, so the coordinates of the top-left corner of each square will not be precise at all!
I could reduce the kernel size, but then it would keep more unwanted small elements.
Question: how to get a better detection of the corners of the squares?
Note:
As a morphological closing is just a dilation followed by an erosion, I found the culprit:
import cv2, numpy as np
img = cv2.imread('rotatednoisy-cropped.png', cv2.IMREAD_GRAYSCALE)
kernel = np.ones((30, 30), np.uint8)
img = cv2.dilate(img, kernel, iterations = 1)
After this step, it's still ok:
Then
img = cv2.erode(img, kernel, iterations = 1)
gives
and it's not ok anymore!
See this link for detailed explanation on how to de-skew an image.
import cv2
import numpy as np

def corners(box):
    cx, cy, w, h, angle = box[0][0], box[0][1], box[1][0], box[1][1], box[2]
    _angle = angle * np.pi / 180.  # use exact pi instead of the 22/7 approximation
    b = np.cos(_angle) * 0.5
    a = np.sin(_angle) * 0.5
    pt = []
    pt.append((int(cx - a*h - b*w), int(cy + b*h - a*w)))
    pt.append((int(cx + a*h - b*w), int(cy - b*h - a*w)))
    pt.append((int(2*cx - pt[0][0]), int(2*cy - pt[0][1])))
    pt.append((int(2*cx - pt[1][0]), int(2*cy - pt[1][1])))
    return pt

if __name__ == '__main__':
    image = cv2.imread('image.jpg', cv2.IMREAD_UNCHANGED)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    n = 3
    sigma = 0.3 * (n/2 - 1) + 0.8
    gray = cv2.GaussianBlur(gray, ksize=(n,n), sigmaX=sigma)

    ret, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY)
    _, contours, _ = cv2.findContours(binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    contours.sort(key=lambda x: len(x), reverse=True)

    points = []
    for i in range(0, 4):
        shape = cv2.approxPolyDP(contours[i], 0.05*cv2.arcLength(contours[i], True), True)
        if len(shape) == 4:
            points.append(shape)

    points = np.array(points, dtype=np.int32)
    points = np.reshape(points, (-1, 2))
    box = cv2.minAreaRect(points)
    pt = corners(box)

    for i in range(0, 4):
        image = cv2.line(image, (pt[i][0], pt[i][1]), (pt[(i+1)%4][0], pt[(i+1)%4][1]), (0, 0, 255))

    (h, w) = image.shape[:2]
    center = (w//2, h//2)
    angle = box[2]

    if angle < -45:
        angle = (angle + 90)
    else:
        angle = -angle

    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_CONSTANT)

    cv2.imshow('image', image)
    cv2.imshow('rotated', rotated)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
You could try searching for and filtering out your specific contours (the black rectangles) and sorting them with a key. Then select the extreme point for each contour (left, right, top, bottom) and you will get the points. Note that this approach works for this picture only; if the picture were rotated in another direction, you would have to change the code accordingly. I am not an expert, but I hope this helps a bit.
import numpy as np
import cv2

img = cv2.imread("rotate.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, threshold = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY)
im, contours, hierarchy = cv2.findContours(threshold, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
contours.sort(key=lambda c: np.min(c[:,:,1]))

j = 1
if len(contours) > 0:
    for i in range(0, len(contours)):
        size = cv2.contourArea(contours[i])
        if 90 < size < 140:
            if j == 1:
                c1 = contours[i]
                j += 1
            elif j == 2:
                c2 = contours[i]
                j += 1
            elif j == 3:
                c3 = contours[i]
                j += 1
            elif j == 4:
                c4 = contours[i]
                break

Top = tuple(c1[c1[:, :, 1].argmin()][0])
Right = tuple(c2[c2[:, :, 0].argmax()][0])
Left = tuple(c3[c3[:, :, 0].argmin()][0])
Bottom = tuple(c4[c4[:, :, 1].argmax()][0])

cv2.circle(img, Top, 2, (0, 255, 0), -1)
cv2.circle(img, Right, 2, (0, 255, 0), -1)
cv2.circle(img, Left, 2, (0, 255, 0), -1)
cv2.circle(img, Bottom, 2, (0, 255, 0), -1)

cv2.imshow("Image", img)
cv2.waitKey(0)
Result:
You can extract the squares as single blobs after binarization with a suitable threshold, and select the appropriate ones based on size. You can also first denoise with a median filter if you want.
Then a tight rotated bounding rectangle will give you the corners (you can obtain it by running Rotating Calipers on the convex hull), as sketched below.
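A minimal sketch of that pipeline (the file name and the blob-size bounds are assumptions; note that cv2.minAreaRect already runs rotating calipers on the points you give it):
import cv2
import numpy as np

img = cv2.imread('rotatednoisy-cropped.png', cv2.IMREAD_GRAYSCALE)
img = cv2.medianBlur(img, 5)  # optional denoising
# Binarize so the black squares become white blobs:
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Keep only blobs whose area is plausible for the big squares:
squares = [c for c in contours if 500 < cv2.contourArea(c) < 50000]
for c in squares:
    rect = cv2.minAreaRect(cv2.convexHull(c))  # tight rotated bounding rectangle
    print(cv2.boxPoints(rect))                 # its 4 corner points (x, y)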
This is my first question here, so I'm asking for your understanding. I have to process hundreds of satellite images.
I am trying to find the contour of the area of useful data located on the image - only the largest one.
Then I want to save the coordinates of the few points (x, y) corresponding to this contour. In the simplest case the area is a square and can be represented by 4 points, but for more complicated shapes the contour will be approximated by a few more points (preferably no more than ~15). However, I am still not able to find the areas on my images. Sometimes the area touches the edge of the image; therefore, in this script I enlarge the pictures and add additional borders filled with the background color. Examples of the pictures you will find here: satellite1, satellite2, satellite3
As you see, the images can have different background colors, and in addition they contain country borders and a legend. I have tried to use Aidenhjj's tips from OpenCV - using cv2.approxPolyDP() correctly and prepared my script. I tried many approaches, filtering and tuning parameters, but still can't succeed with my data. I am asking you for help.
import numpy as np
import cv2
import matplotlib.pyplot as plt
image = cv2.imread('image1.jpg')
image = cv2.resize(image, None,fx=0.25, fy=0.25, interpolation = cv2.INTER_CUBIC)
ysize, xsize, channels = image.shape
print("Image size: {} x {}".format(xsize, ysize))
#calculate the histograms in r,g,b channels, measure background color
#(note: cv2.split returns the channels in B, G, R order)
b, g, r = cv2.split(image)
image_data = image

histr = cv2.calcHist([r], [0], None, [256], [0,256])
for y in range(0, len(histr)):
    elem = histr[y]
    if elem == histr.max():
        break
else:
    y = None
R = y

histr = cv2.calcHist([g], [0], None, [256], [0,256])
for y in range(0, len(histr)):
    elem = histr[y]
    if elem == histr.max():
        break
else:
    y = None
G = y

histr = cv2.calcHist([b], [0], None, [256], [0,256])
for y in range(0, len(histr)):
    elem = histr[y]
    if elem == histr.max():
        break
else:
    y = None
B = y

color = (R, G, B)
#add borders around the image colorized as background. This will allow me to find closed contour around area with data.
bordersize=100
new_xsize = xsize + bordersize*2
new_ysize = ysize + bordersize*2
#image_border.show()
image_border = cv2.copyMakeBorder(image, top=bordersize, bottom=bordersize, left=bordersize, right=bordersize, borderType=cv2.BORDER_CONSTANT, value=[B, G, R])  # border color in B,G,R order to match the BGR image
#ysizeb, xsizeb, channelsb = image_border.shape
# get a blank canvas for drawing contour on and convert image to grayscale
canvas = np.zeros(image_border.shape, np.uint8)
#imgc = cv2.medianBlur(img,21)
img2gray = cv2.cvtColor(image_border,cv2.COLOR_BGR2GRAY)
# filter out country borders
kernel = np.ones((5,5),np.float32)/25
img2gray = cv2.filter2D(img2gray,-1,kernel)
# threshold the image and extract contours
thresh = cv2.adaptiveThreshold(img2gray,255,cv2.ADAPTIVE_THRESH_MEAN_C,cv2.THRESH_BINARY,11,11)
contours,hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
plt.subplot(111),plt.imshow(thresh,'gray')
plt.show()
# find the biggest area
cnt = contours[0]
max_area = cv2.contourArea(cnt)
for cont in contours:
    if cv2.contourArea(cont) > max_area:
        cnt = cont
        max_area = cv2.contourArea(cont)
perimeter = cv2.arcLength(cnt,True)
epsilon = 0.01*cv2.arcLength(cnt,True)
approx = cv2.approxPolyDP(cnt,epsilon,True)
hull = cv2.convexHull(cnt)
# cv2.isContourConvex(cnt)
cv2.drawContours(canvas, cnt, -1, (0, 255, 0), 3)
cv2.drawContours(canvas, approx, -1, (0, 0, 255), 3)
#cv2.drawContours(canvas, [hull], -1, (0, 0, 255), 3)
cv2.imshow("Contour", canvas)
k = cv2.waitKey(0)
if k == 27:  # wait for ESC key to exit
    cv2.destroyAllWindows()