How to detect color of balls in the given image using python-opencv?
Introduction
I will dismantle the question in the following three sections
Obtain the English name of a color from a RGB or Hex value
Locate the circles on the image
Obtain the English name on per circle
Obtain color name from RGB or Hex
Using the following answer:
Convert RGB color to English color name, like 'green' with Python
We are almost done, except for the small change that cv2 uses BGR instead of RGB, therefore we take RGB[2] (the blue channel) to match the red channel of the webcolors.
def color_rgb_to_name(rgb: tuple[int, int, int]) -> str:
"""
Translates an rgb value to the closest English color name known
Args:
rgb: The rgb value that has to be translated to the color name.
Returns:
The name of the colors that most closely defines the rgb value in CSS3.
"""
min_colours = {}
for key, name in webcolors.CSS3_HEX_TO_NAMES.items():
r_c, g_c, b_c = webcolors.hex_to_rgb(key)
rd = (r_c - rgb[2]) ** 2
gd = (g_c - rgb[1]) ** 2
bd = (b_c - rgb[0]) ** 2
min_colours[(rd + gd + bd)] = name
return min_colours[min(min_colours.keys())]
Which is already enough to solve the question if you only care about the colors that are used in the image.
image = cv2.imread('image.jpg')
colors = set([color_rgb_to_name(val) for val in np.unique(image.reshape(-1, 3), axis=0)])
Colors:
{'firebrick', 'cadetblue', 'peru', 'indianred', 'darkturquoise', 'cyan', 'darkviolet', 'darkorange', 'midnightblue', 'indigo', 'lightseagreen', 'mediumturquoise', 'blue', 'brown', 'chocolate', 'saddlebrown', 'mediumblue', 'darkslateblue', 'turquoise', 'blueviolet', 'sienna', 'black', 'orangered', 'slateblue'}
Notes:
This uses the webcolors package, but you can create your own dictionary. This gives you a higher control on the colors that you allow / disallow.
Locate the Circles
The colors that we found above are all the unique colors that are contained in the image. This is often not really what we want. Instead we want to find the color that is most commonly used inside the circle.
In order to define the color in a circle there are several sources that we can use:
https://www.tutorialspoint.com/find-circles-in-an-image-using-opencv-in-python
https://www.pyimagesearch.com/2014/07/21/detecting-circles-images-using-opencv-hough-circles/
How to find the circle in the given images using opencv python (hough circles )?
Which combines to the following code:
def locate_circles(img: np.ndarray, vmin=10, vmax=30) -> np.ndarray:
"""
Locates circles on a gray image.
Args:
img: a gray image with black background.
vmin: The minimum radius value of the circles.
vmax: The maximum radius value of the circles.
Returns:
A numpy array containing the center location of the circles and the radius.
"""
img = cv2.medianBlur(img, 5)
circles = cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, 1, 20, param1=50, param2=20, minRadius=vmin, maxRadius=vmax)
circles = np.round(circles[0, :]).astype("int")
return circles
I added the medianBlur to increase the consistency in locating the circles, alternatively you could play a bit more with the param values or radius sizes.
Test code:
image = cv2.imread('image.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
for (x, y, r) in locate_circles(gray, vmin=10, vmax=30):
print(x, y, r)
Answers:
262 66 12
186 74 12
136 60 12
Obtain the English name per circle
Now that we know where the circle is located, we can get the average color per circle and combine this with the above code obtain the final result.
The following code locates all x and y values that are inside the circle.
def coordinates(x: int, y: int, r: int, width: int, height: int) -> np.ndarray:
"""
Locates all valid x and y coordinates inside a circle.
Args:
x: Center column position.
y: Center row position.
r: Radius of the circle.
width: the maximum width value that is still valid (in bounds)
height: the maximum height values that is still valid (in bounds)
Returns:
A numpy array with all valid x and y coordinates that fall within the circle.
"""
indices = []
for dx in range(-r, r):
for dy in range(-r, r):
if 0 <= x + dx < width and 0 <= y + dy < height:
indices.append([x + dx, y + dy])
return np.array(indices).T.reshape(2, -1)
Which can then be used to obtain the average color value per circle.
image = cv2.imread('image.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
for (x, y, r) in locate_circles(gray, vmin=10, vmax=30):
columns, rows = coordinates(x, y, r, *gray.shape[:2])
color = np.average(image[rows, columns], axis=0).astype(np.uint8)
name = color_rgb_to_name(color)
# Draw the information on the screen
cv2.putText(image, name, (x - 20, y - 20), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (255, 0, 0), 1)
Answer:
indigo
firebrick
darkturquoise
TL;DR
import cv2
import numpy as np
import webcolors
def imshow(img, delay=0):
cv2.imshow('Test', img)
cv2.waitKey(delay)
def locate_circles(img: np.ndarray, vmin=10, vmax=30) -> np.ndarray:
"""
https://www.tutorialspoint.com/find-circles-in-an-image-using-opencv-in-python
https://www.pyimagesearch.com/2014/07/21/detecting-circles-images-using-opencv-hough-circles/
https://stackoverflow.com/questions/67764821/how-to-find-the-circle-in-the-given-images-using-opencv-python-hough-circles
Locates circles on a gray image.
Args:
img: a gray image with black background.
vmin: The minimum radius value of the circles.
vmax: The maximum radius value of the circles.
Returns:
A numpy array containing the center location of the circles and the radius.
"""
img = cv2.medianBlur(img, 5)
circles = cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, 1, 20, param1=50, param2=20, minRadius=vmin, maxRadius=vmax)
circles = np.round(circles[0, :]).astype("int")
return circles
def coordinates(x: int, y: int, r: int, width: int, height: int) -> np.ndarray:
"""
Locates all valid x and y coordinates inside a circle.
Args:
x: Center column position.
y: Center row position.
r: Radius of the circle.
width: the maximum width value that is still valid (in bounds)
height: the maximum height values that is still valid (in bounds)
Returns:
A numpy array with all valid x and y coordinates that fall within the circle.
"""
indices = []
for dx in range(-r, r):
for dy in range(-r, r):
if 0 <= x + dx < width and 0 <= y + dy < height:
indices.append([x + dx, y + dy])
return np.array(indices).T.reshape(2, -1)
def draw_circles(img: np.ndarray, x: int, y: int, r: int):
"""
draw the circle in the output image, then draw a rectangle corresponding to the center of the circle
Args:
img: Image on which to draw the circle location and center.
x: Center column position.
y: Center row position.
r: Radius of the circle.
Modifies:
The input image by drawing a circle on it and a rectangle on the image.
"""
cv2.circle(img, (x, y), r, (0, 255, 0), 4)
cv2.rectangle(img, (x - 2, y - 2), (x + 2, y + 2), (0, 128, 255), -1)
def color_rgb_to_name(rgb: tuple[int, int, int]) -> str:
"""
https://stackoverflow.com/questions/9694165/convert-rgb-color-to-english-color-name-like-green-with-python
Translates an rgb value to the closest English color name known
Args:
rgb: The rgb value that has to be translated to the color name.
Returns:
The name of the colors that most closely defines the rgb value in CSS3.
"""
min_colours = {}
for key, name in webcolors.CSS3_HEX_TO_NAMES.items():
r_c, g_c, b_c = webcolors.hex_to_rgb(key)
rd = (r_c - rgb[2]) ** 2
gd = (g_c - rgb[1]) ** 2
bd = (b_c - rgb[0]) ** 2
min_colours[(rd + gd + bd)] = name
return min_colours[min(min_colours.keys())]
if __name__ == '__main__':
image = cv2.imread('image.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
for (x, y, r) in locate_circles(gray, vmin=10, vmax=30):
columns, rows = coordinates(x, y, r, *gray.shape[:2])
color = np.average(image[rows, columns], axis=0).astype(np.uint8)
name = color_rgb_to_name(color)
print(name)
# Draw extra information on the screen
# draw_circles(image, x, y, r)
cv2.putText(image, name, (x - 20, y - 20), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (255, 0, 0), 1)
# show the output image
imshow(image)
Related
I have image with with many cars, every car has coordinates of polygon and keypoints. I use this code to crop object by polygon and get new keypoints.
x,y,w,h = cv2.boundingRect(points_poly_int)
cropped_img = img[y:y+h,x:x+w]
head_coords_after_crop = np.asarray([head_coords_old[0] - x, head_coords_old[1] -y])
center_coords_after_crop = np.asarray([center_coords_old[0] - x, center_coords_old[1] -y])
Here example of cropped image and keypoints:
What I need is rotate the whole image by any angle and remap coordinates of polygons and keypoints for every object
Here method which return rotated image and matrix of transformation:
def rotate_image(mat, angle):
"""
Rotates an image (angle in degrees) and expands image to avoid cropping
"""
height, width = mat.shape[:2] # image shape has 3 dimensions
image_center = (width/2, height/2) # getRotationMatrix2D needs coordinates in reverse order (width, height) compared to shape
rotation_mat = cv2.getRotationMatrix2D(image_center, angle, 1.)
# rotation calculates the cos and sin, taking absolutes of those.
abs_cos = abs(rotation_mat[0,0])
abs_sin = abs(rotation_mat[0,1])
# find the new width and height bounds
bound_w = int(height * abs_sin + width * abs_cos)
bound_h = int(height * abs_cos + width * abs_sin)
# subtract old image center (bringing image back to origo) and adding the new image center coordinates
rotation_mat[0, 2] += bound_w/2 - image_center[0]
rotation_mat[1, 2] += bound_h/2 - image_center[1]
# rotate image with the new bounds and translated rotation matrix
rotated_mat = cv2.warpAffine(mat, rotation_mat, (bound_w, bound_h))
return rotated_mat, rotation_mat
What I do next is multiplying old coordinates with matrix of transformation. Here code:
img_roated, C = rotate_image(img, 180)
#Remap polygons coordinates
ones = np.ones((points_poly.shape[0], 1))
new_poly = np.hstack((points_poly,ones))
new_poly = (C # new_poly.T).T
new_poly = new_poly.astype(np.int32)
#Crop by new polygons
x,y,w,h = cv2.boundingRect(new_poly)
cropped_img = img_roated[y:y+h,x:x+w]
#Reamp keypoints coordinates
head_coords_new = np.asarray([756.600, 1687.900, 1])
center_coords_new = np.asarray([762.300, 1708.400, 1])
head_coords_new = (C # head_coords_new.T).T
center_coords_new = (C # center_coords_new.T).T
head_coords_new = np.asarray([head_coords_old[0] - x, head_coords_old[1] - y])
center_coords_new = np.asarray([center_coords_old[0] - x, center_coords_old[1] - y])
head_coords_new = head_coords_new.astype(np.int32)
center_coords_new = center_coords_new.astype(np.int32)
But result is differnt from first picture, Here new picture:
Somehow keypoints shift, and it happens with every angle. And I don't know how to fix it.
Here the source image: https://drive.google.com/file/d/14K_MQHMwtWlw-QCQbaB5ecrREbWwyKhO/view?usp=sharing
And polygons with keypoints:
{'keypoints': [{'id': 'head', 'pos': '756.600;1687.900'},
{'id': 'roof_center', 'pos': '762.300;1708.400'}],
'polygon': '{(759.700;1717.300);(770.000;1714.200);(762.000;1687.400);(756.600;1687.900);(751.200;1690.700);(759.700;1717.300)}'}
If you wish to reproduce the issue.
Thanks in advnced
Here the differnce. Right pic is first image rotated in pic viewer. Left is transformed pic
Hi all as the title said is there a way for this? For example, I want to crop fourth quadrant of an image and the other area will be turned to black while retaining its original size. Currently, I am getting the center width and height of the image then accessing the pixel:
Cropped = I[centerHeight:,centerWidth:]
but that just stores the fourth quadrant cropped image. Thanks!
I don't think there is a function in OpenCV that does that work. This function will solve your problem.
import numpy as np
def crop_image(img, cx, cy, w, h):
"""
args:
cx: x coordinate of center
cy: y coordinate of center
w: width of crop
h: height of crop
"""
result = np.zeros(img.shape, dtype=np.uint8)
result[cx - w//2 :cx + w//2, cy - h//2:cy + h//2] = img[cx - w//2 :cx + w//2, cy - h//2:cy + h//2]
return result
I want to get the shade value of each circles from an image.
I try to detect circles using HoughCircle.
I get the center of each circle.
I put the text (the circle numbers) in a circle.
I set the pixel subset to obtain the shading values and calculate the averaged shading values.
I want to get the results of circle number, the coordinates of the center, and averaged shading values in CSV format.
But, in the 3rd step, the circle numbers were randomly assigned. So, it's so hard to find circle number.
How can I number circles in a sequence?
# USAGE
# python detect_circles.py --image images/simple.png
# import the necessary packages
import numpy as np
import argparse
import cv2
import csv
# define a funtion of ROI calculating the average value in specified sample size
def ROI(img,x,y,sample_size):
Each_circle=img[y-sample_size:y+sample_size, x-sample_size:x+sample_size]
average_values=np.mean(Each_circle)
return average_values
# open the csv file named circles_value
circles_values=open('circles_value.csv', 'w')
# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required = True, help = "Path to the image")
args = vars(ap.parse_args())
# load the image, clone it for output, and then convert it to grayscale
image = cv2.imread(args["image"])
output = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# detect circles in the image
circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, 1.2,50, 100, 1, 1, 20, 30)
# ensure at least some circles were found
if circles is not None:
# convert the (x, y) coordinates and radius of the circles to integers
circles = np.round(circles[0, :]).astype("int")
number=1
font = cv2.FONT_HERSHEY_SIMPLEX
# loop over the (x, y) coordinates and radius of the circles
for (x, y, r) in circles:
# draw the circle in the output image, then draw a rectangle
# corresponding to the center of the circle
number=str(number)
cv2.circle(output, (x, y), r, (0, 255, 0), 4)
cv2.rectangle(output, (x - 10, y - 10), (x + 10, y + 10), (0, 128, 255), -1)
# number each circle, but its result shows irregular pattern
cv2.putText(output, number, (x,y), font,0.5,(0,0,0),2,cv2.LINE_AA)
# get the average value in specified sample size (20 x 20)
sample_average_value=ROI(output, x, y, 20)
# write the csv file with number, (x,y), and average pixel value
circles_values.write(number+','+str(x)+','+str(y)+','+str(sample_average_value)+'\n')
number=int(number)
number+=1
# show the output image
cv2.namedWindow("image", cv2.WINDOW_NORMAL)
cv2.imshow("image", output)
cv2.waitKey(0)
# close the csv file
circles_values.close()
You could sort your circles based on their x, y values, the width of the image and a rough line height, for example:
import numpy as np
import argparse
import cv2
import csv
# define a funtion of ROI calculating the average value in specified sample size
def ROI(img,x,y,sample_size):
Each_circle=img[y-sample_size:y+sample_size, x-sample_size:x+sample_size]
average_values=np.mean(Each_circle)
return average_values
# open the csv file named circles_value
with open('circles_value.csv', 'wb') as circles_values:
csv_output = csv.writer(circles_values)
# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required = True, help = "Path to the image")
args = vars(ap.parse_args())
# load the image, clone it for output, and then convert it to grayscale
image = cv2.imread(args["image"])
output = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# detect circles in the image
circles = cv2.HoughCircles(gray, cv2.cv.CV_HOUGH_GRADIENT, 1.2,50, 100, 1, 1, 20, 30)
# ensure at least some circles were found
if circles is not None:
# convert the (x, y) coordinates and radius of the circles to integers
circles = np.round(circles[0, :]).astype("int")
font = cv2.FONT_HERSHEY_SIMPLEX
height = 40
# loop over the (x, y) coordinates and radius of the circles
for number, (x, y, r) in enumerate(sorted(circles, key=lambda v: v[0] + (v[1] / height) * image.shape[1]), start=1):
text = str(number)
(tw, th), bl = cv2.getTextSize(text, font, 0.5, 2) # So the text can be centred in the circle
tw /= 2
th = th / 2 + 2
# draw the circle in the output image, then draw a rectangle
# corresponding to the center of the circle
cv2.circle(output, (x, y), r, (0, 255, 0), 3)
cv2.rectangle(output, (x - tw, y - th), (x + tw, y + th), (0, 128, 255), -1)
# number each circle, centred in the rectangle
cv2.putText(output, text, (x-tw, y + bl), font, 0.5, (0,0,0), 2, cv2.CV_AA)
# get the average value in specified sample size (20 x 20)
sample_average_value = ROI(output, x, y, 20)
# write the csv file with number, (x,y), and average pixel value
csv_output.writerow([number, x, y, sample_average_value])
# show the output image
cv2.namedWindow("image", cv2.WINDOW_NORMAL)
cv2.imshow("image", output)
cv2.waitKey(0)
Also, it is easier to use Python's CSV library to write entries to your output file. This way you don't need to convert each entry to a string and add commas between each entry. enumerate() can be used to count each circle automatically. Also getTextSize() can be used to determine the dimensions of the text to be printed enabling you to centre it in the rectangle.
This would give you an output as follows:
And a CSV starting as:
1,2,29,nan
2,51,19,nan
3,107,22,100.72437499999999
4,173,23,102.33291666666666
5,233,26,88.244791666666671
6,295,22,92.953541666666666
7,358,28,142.51625000000001
8,418,26,155.12875
9,484,31,127.02541666666667
10,547,25,112.57958333333333
The mistake in your code is that your number is dependent upon the order of circles in list returned from cv2.HoughCircles which can be random, So what I would have done in this situation is to devise a formula which would convert the center(x, y) value of each circle to an ID, and the same circle would yield same ID given its center position remains same:
def get_id_from_center(x, y):
return x + y*50
for (x, y, r) in circles:
number = str(get_id_from_center(x, y))
I need to divide an image to regions of pixels whose RGB value pass a certain test.
I'm OK with scanning the image and checking each pixel's value however the part of clustering them into regions and then getting those regions coordinates (x, y, width, height) leaves me in total dark :)
here's the code I have so far
from PIL import Image
def detectRedRegions(PILImage):
image = PILImage.load()
width, height = PILImage.size
reds = []
h = 0
while h < height:
w = 0
while w < width:
px = image[w, h]
if is_red(px):
reds.append([w, h])
# Here's where I'm being clueless
w +=1
h +=1
I read tons about clustering but just can't wrap my head around this subject any code example s that will fit my needs will be great (and hopefully enlightening
Thanks!
[EDIT]
While the solution below works, it can be made better. Here is a version with better names and better performance:
from itertools import product
from PIL import Image, ImageDraw
def closed_regions(image, test):
"""
Return all closed regions in image who's pixels satisfy test.
"""
pixel = image.load()
xs, ys = map(xrange, image.size)
neighbors = dict((xy, set([xy])) for xy in product(xs, ys) if test(pixel[xy]))
for a, b in neighbors:
for cd in (a + 1, b), (a, b + 1):
if cd in neighbors:
neighbors[a, b].add(cd)
neighbors[cd].add((a, b))
seen = set()
def component(node, neighbors=neighbors, seen=seen, see=seen.add):
todo = set([node])
next_todo = todo.pop
while todo:
node = next_todo()
see(node)
todo |= neighbors[node] - seen
yield node
return (set(component(node)) for node in neighbors if node not in seen)
def boundingbox(coordinates):
"""
Return the bounding box that contains all coordinates.
"""
xs, ys = zip(*coordinates)
return min(xs), min(ys), max(xs), max(ys)
def is_black_enough(pixel):
r, g, b = pixel
return r < 10 and g < 10 and b < 10
if __name__ == '__main__':
image = Image.open('some_image.jpg')
draw = ImageDraw.Draw(image)
for rect in disjoint_areas(image, is_black_enough):
draw.rectangle(boundingbox(region), outline=(255, 0, 0))
image.show()
Unlike disjoint_areas() below, closed_regions() returns sets of pixel coordinates instead of their bounding boxes.
Also, if we use flooding instead of the connected components algorithm, we can make it even simpler and about twice as fast:
from itertools import chain, product
from PIL import Image, ImageDraw
flatten = chain.from_iterable
def closed_regions(image, test):
"""
Return all closed regions in image who's pixel satisfy test.
"""
pixel = image.load()
xs, ys = map(xrange, image.size)
todo = set(xy for xy in product(xs, ys) if test(pixel[xy]))
while todo:
region = set()
edge = set([todo.pop()])
while edge:
region |= edge
todo -= edge
edge = todo.intersection(
flatten(((x - 1, y), (x, y - 1), (x + 1, y), (x, y + 1)) for x, y in edge))
yield region
# rest like above
It was inspired by Eric S. Raymond's version of floodfill.
[/EDIT]
One could probably use floodfill, but I like this:
from collections import defaultdict
from PIL import Image, ImageDraw
def connected_components(edges):
"""
Given a graph represented by edges (i.e. pairs of nodes), generate its
connected components as sets of nodes.
Time complexity is linear with respect to the number of edges.
"""
neighbors = defaultdict(set)
for a, b in edges:
neighbors[a].add(b)
neighbors[b].add(a)
seen = set()
def component(node, neighbors=neighbors, seen=seen, see=seen.add):
unseen = set([node])
next_unseen = unseen.pop
while unseen:
node = next_unseen()
see(node)
unseen |= neighbors[node] - seen
yield node
return (set(component(node)) for node in neighbors if node not in seen)
def matching_pixels(image, test):
"""
Generate all pixel coordinates where pixel satisfies test.
"""
width, height = image.size
pixels = image.load()
for x in xrange(width):
for y in xrange(height):
if test(pixels[x, y]):
yield x, y
def make_edges(coordinates):
"""
Generate all pairs of neighboring pixel coordinates.
"""
coordinates = set(coordinates)
for x, y in coordinates:
if (x - 1, y - 1) in coordinates:
yield (x, y), (x - 1, y - 1)
if (x, y - 1) in coordinates:
yield (x, y), (x, y - 1)
if (x + 1, y - 1) in coordinates:
yield (x, y), (x + 1, y - 1)
if (x - 1, y) in coordinates:
yield (x, y), (x - 1, y)
yield (x, y), (x, y)
def boundingbox(coordinates):
"""
Return the bounding box of all coordinates.
"""
xs, ys = zip(*coordinates)
return min(xs), min(ys), max(xs), max(ys)
def disjoint_areas(image, test):
"""
Return the bounding boxes of all non-consecutive areas
who's pixels satisfy test.
"""
for each in connected_components(make_edges(matching_pixels(image, test))):
yield boundingbox(each)
def is_black_enough(pixel):
r, g, b = pixel
return r < 10 and g < 10 and b < 10
if __name__ == '__main__':
image = Image.open('some_image.jpg')
draw = ImageDraw.Draw(image)
for rect in disjoint_areas(image, is_black_enough):
draw.rectangle(rect, outline=(255, 0, 0))
image.show()
Here, pairs of neighboring pixels that both satisfy is_black_enough() are interpreted as edges in a graph. Also, every pixel is viewed as its own neighbor. Due to this re-interpretation we can use the connected component algorithm for graphs which is quite easy to implement. The result is the sequence of the bounding boxes of all areas who's pixels satisfy is_black_enough().
What you want is called area labeling or connected component detection in image processing.
There is an implementation provided in the scipy.ndimage package.
So the following should work provided you have numpy + scipy installed
import numpy as np
import scipy.ndimage as ndi
import Image
image = Image.load()
# convert to numpy array (no data copy done since both use buffer protocol)
image = np.asarray(image)
# generate a black and white image marking red pixels as 1
bw = is_red(image)
# labeling : each region is associated with an int
labels, n = ndi.label(bw)
# provide bounding box for each region in the form of tuples of slices
objects = ndi.find_objects(labels)
The following picture will tell you what I want.
I have the information of the rectangles in the image (width, height, center point and rotation degree). Now, I want to write a script to cut them out and save them as an image, but straighten them as well. As in, I want to go from the rectangle shown inside the image to the rectangle that is shown outside.
I am using OpenCV Python. Please tell me a way to accomplish this.
Kindly show some code as examples of OpenCV Python are hard to find.
You can use the warpAffine function to rotate the image around a defined center point. The suitable rotation matrix can be generated using getRotationMatrix2D (where theta is in degrees).
You then can use Numpy slicing to cut the image.
import cv2
import numpy as np
def subimage(image, center, theta, width, height):
'''
Rotates OpenCV image around center with angle theta (in deg)
then crops the image according to width and height.
'''
# Uncomment for theta in radians
#theta *= 180/np.pi
shape = ( image.shape[1], image.shape[0] ) # cv2.warpAffine expects shape in (length, height)
matrix = cv2.getRotationMatrix2D( center=center, angle=theta, scale=1 )
image = cv2.warpAffine( src=image, M=matrix, dsize=shape )
x = int( center[0] - width/2 )
y = int( center[1] - height/2 )
image = image[ y:y+height, x:x+width ]
return image
Keep in mind that dsize is the shape of the output image. If the patch/angle is sufficiently large, edges get cut off (compare image above) if using the original shape as--for means of simplicity--done above. In this case, you could introduce a scaling factor to shape (to enlarge the output image) and the reference point for slicing (here center).
The above function can be used as follows:
image = cv2.imread('owl.jpg')
image = subimage(image, center=(110, 125), theta=30, width=100, height=200)
cv2.imwrite('patch.jpg', image)
I had problems with wrong offsets while using the solutions here and in similar questions.
So I did the math and came up with the following solution that works:
def subimage(self,image, center, theta, width, height):
theta *= 3.14159 / 180 # convert to rad
v_x = (cos(theta), sin(theta))
v_y = (-sin(theta), cos(theta))
s_x = center[0] - v_x[0] * ((width-1) / 2) - v_y[0] * ((height-1) / 2)
s_y = center[1] - v_x[1] * ((width-1) / 2) - v_y[1] * ((height-1) / 2)
mapping = np.array([[v_x[0],v_y[0], s_x],
[v_x[1],v_y[1], s_y]])
return cv2.warpAffine(image,mapping,(width, height),flags=cv2.WARP_INVERSE_MAP,borderMode=cv2.BORDER_REPLICATE)
For reference here is an image that explains the math behind it:
Note that
w_dst = width-1
h_dst = height-1
This is because the last coordinate has the value width-1 and not width, or height.
The other methods will work only if the content of the rectangle is in the rotated image after rotation and will fail badly in other situations. What if some of the part are lost? See an example below:
If you are to crop the rotated rectangle text area using the above method,
import cv2
import numpy as np
def main():
img = cv2.imread("big_vertical_text.jpg")
cnt = np.array([
[[64, 49]],
[[122, 11]],
[[391, 326]],
[[308, 373]]
])
print("shape of cnt: {}".format(cnt.shape))
rect = cv2.minAreaRect(cnt)
print("rect: {}".format(rect))
box = cv2.boxPoints(rect)
box = np.int0(box)
print("bounding box: {}".format(box))
cv2.drawContours(img, [box], 0, (0, 0, 255), 2)
img_crop, img_rot = crop_rect(img, rect)
print("size of original img: {}".format(img.shape))
print("size of rotated img: {}".format(img_rot.shape))
print("size of cropped img: {}".format(img_crop.shape))
new_size = (int(img_rot.shape[1]/2), int(img_rot.shape[0]/2))
img_rot_resized = cv2.resize(img_rot, new_size)
new_size = (int(img.shape[1]/2)), int(img.shape[0]/2)
img_resized = cv2.resize(img, new_size)
cv2.imshow("original contour", img_resized)
cv2.imshow("rotated image", img_rot_resized)
cv2.imshow("cropped_box", img_crop)
# cv2.imwrite("crop_img1.jpg", img_crop)
cv2.waitKey(0)
def crop_rect(img, rect):
# get the parameter of the small rectangle
center = rect[0]
size = rect[1]
angle = rect[2]
center, size = tuple(map(int, center)), tuple(map(int, size))
# get row and col num in img
height, width = img.shape[0], img.shape[1]
print("width: {}, height: {}".format(width, height))
M = cv2.getRotationMatrix2D(center, angle, 1)
img_rot = cv2.warpAffine(img, M, (width, height))
img_crop = cv2.getRectSubPix(img_rot, size, center)
return img_crop, img_rot
if __name__ == "__main__":
main()
This is what you will get:
Apparently, some of the parts are cut out! Why do not directly warp the rotated rectangle since we can get its four corner points with cv.boxPoints() method?
import cv2
import numpy as np
def main():
img = cv2.imread("big_vertical_text.jpg")
cnt = np.array([
[[64, 49]],
[[122, 11]],
[[391, 326]],
[[308, 373]]
])
print("shape of cnt: {}".format(cnt.shape))
rect = cv2.minAreaRect(cnt)
print("rect: {}".format(rect))
box = cv2.boxPoints(rect)
box = np.int0(box)
width = int(rect[1][0])
height = int(rect[1][1])
src_pts = box.astype("float32")
dst_pts = np.array([[0, height-1],
[0, 0],
[width-1, 0],
[width-1, height-1]], dtype="float32")
M = cv2.getPerspectiveTransform(src_pts, dst_pts)
warped = cv2.warpPerspective(img, M, (width, height))
Now the cropped image becomes
Much better, isn't it? If you check carefully, you will notice that there are some black area in the cropped image. That is because a small part of the detected rectangle is out of the bound of the image. To remedy this, you may pad the image a little bit and do the crop after that. There is an example illustrated in this answer.
Now, we compare the two methods to crop the rotated rectangle from the image.
This method do not require rotating the image and can deal with this problem more elegantly with less code.
Similar recipe for openCV version 3.4.0.
from cv2 import cv
import numpy as np
def getSubImage(rect, src):
# Get center, size, and angle from rect
center, size, theta = rect
# Convert to int
center, size = tuple(map(int, center)), tuple(map(int, size))
# Get rotation matrix for rectangle
M = cv2.getRotationMatrix2D( center, theta, 1)
# Perform rotation on src image
dst = cv2.warpAffine(src, M, src.shape[:2])
out = cv2.getRectSubPix(dst, size, center)
return out
img = cv2.imread('img.jpg')
# Find some contours
thresh2, contours, hierarchy = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# Get rotated bounding box
rect = cv2.minAreaRect(contours[0])
# Extract subregion
out = getSubImage(rect, img)
# Save image
cv2.imwrite('out.jpg', out)
This is my C++ version that performs the same task. I have noticed it is a bit slow. If anyone sees anything that would improve the performance of this function, then please let me know. :)
bool extractPatchFromOpenCVImage( cv::Mat& src, cv::Mat& dest, int x, int y, double angle, int width, int height) {
// obtain the bounding box of the desired patch
cv::RotatedRect patchROI(cv::Point2f(x,y), cv::Size2i(width,height), angle);
cv::Rect boundingRect = patchROI.boundingRect();
// check if the bounding box fits inside the image
if ( boundingRect.x >= 0 && boundingRect.y >= 0 &&
(boundingRect.x+boundingRect.width) < src.cols &&
(boundingRect.y+boundingRect.height) < src.rows ) {
// crop out the bounding rectangle from the source image
cv::Mat preCropImg = src(boundingRect);
// the rotational center relative tot he pre-cropped image
int cropMidX, cropMidY;
cropMidX = boundingRect.width/2;
cropMidY = boundingRect.height/2;
// obtain the affine transform that maps the patch ROI in the image to the
// dest patch image. The dest image will be an upright version.
cv::Mat map_mat = cv::getRotationMatrix2D(cv::Point2f(cropMidX, cropMidY), angle, 1.0f);
map_mat.at<double>(0,2) += static_cast<double>(width/2 - cropMidX);
map_mat.at<double>(1,2) += static_cast<double>(height/2 - cropMidY);
// rotate the pre-cropped image. The destination image will be
// allocated by warpAffine()
cv::warpAffine(preCropImg, dest, map_mat, cv::Size2i(width,height));
return true;
} // if
else {
return false;
} // else
} // extractPatch
This was a very frustrating endeavor, but finally I solved it based on rroowwllaanndd's answer. I just had to add the angle correction when the width < height. Without this I got very strange results for images which fulfilled this condition.
def crop_image(rect, image):
shape = (image.shape[1], image.shape[0]) # cv2.warpAffine expects shape in (length, height)
center, size, theta = rect
width, height = tuple(map(int, size))
center = tuple(map(int, center))
if width < height:
theta -= 90
width, height = height, width
matrix = cv.getRotationMatrix2D(center=center, angle=theta, scale=1.0)
image = cv.warpAffine(src=image, M=matrix, dsize=shape)
x = int(center[0] - width // 2)
y = int(center[1] - height // 2)
image = image[y : y + height, x : x + width]
return image