Rotate image and crop out black borders - python

My application: I am trying to rotate an image (using OpenCV and Python).
At the moment I have developed the code below, which rotates an input image and pads it with black borders, giving me A. What I want is B: the largest possible crop window within the rotated image. I call this the axis-aligned boundED box.
This is essentially the same as Rotate and crop; however, I cannot get the answer on that question to work. Additionally, that answer is apparently only valid for square images, and my images are rectangular.
Code to give A:
import cv2
import numpy as np

def getTranslationMatrix2d(dx, dy):
    """
    Returns a numpy affine transformation matrix for a 2D translation of
    (dx, dy)
    """
    return np.matrix([[1, 0, dx], [0, 1, dy], [0, 0, 1]])

def rotateImage(image, angle):
    """
    Rotates the given image about its centre
    """
    image_size = (image.shape[1], image.shape[0])
    image_center = tuple(np.array(image_size) / 2)

    rot_mat = np.vstack([cv2.getRotationMatrix2D(image_center, angle, 1.0), [0, 0, 1]])
    trans_mat = np.identity(3)

    w2 = image_size[0] * 0.5
    h2 = image_size[1] * 0.5

    rot_mat_notranslate = np.matrix(rot_mat[0:2, 0:2])

    tl = (np.array([-w2, h2]) * rot_mat_notranslate).A[0]
    tr = (np.array([w2, h2]) * rot_mat_notranslate).A[0]
    bl = (np.array([-w2, -h2]) * rot_mat_notranslate).A[0]
    br = (np.array([w2, -h2]) * rot_mat_notranslate).A[0]

    x_coords = [pt[0] for pt in [tl, tr, bl, br]]
    x_pos = [x for x in x_coords if x > 0]
    x_neg = [x for x in x_coords if x < 0]

    y_coords = [pt[1] for pt in [tl, tr, bl, br]]
    y_pos = [y for y in y_coords if y > 0]
    y_neg = [y for y in y_coords if y < 0]

    right_bound = max(x_pos)
    left_bound = min(x_neg)
    top_bound = max(y_pos)
    bot_bound = min(y_neg)

    new_w = int(abs(right_bound - left_bound))
    new_h = int(abs(top_bound - bot_bound))
    new_image_size = (new_w, new_h)

    new_midx = new_w * 0.5
    new_midy = new_h * 0.5

    dx = int(new_midx - w2)
    dy = int(new_midy - h2)

    trans_mat = getTranslationMatrix2d(dx, dy)
    affine_mat = (np.matrix(trans_mat) * np.matrix(rot_mat))[0:2, :]
    result = cv2.warpAffine(image, affine_mat, new_image_size, flags=cv2.INTER_LINEAR)

    return result

The math behind this solution/implementation is equivalent to this solution of an analogous question, but the formulas are simplified and avoid singularities. This is Python code with the same interface as largest_rotated_rect from the other solution, but giving a bigger area in almost all cases (always the proven optimum):
import math

def rotatedRectWithMaxArea(w, h, angle):
    """
    Given a rectangle of size wxh that has been rotated by 'angle' (in
    radians), computes the width and height of the largest possible
    axis-aligned rectangle (maximal area) within the rotated rectangle.
    """
    if w <= 0 or h <= 0:
        return 0, 0

    width_is_longer = w >= h
    side_long, side_short = (w, h) if width_is_longer else (h, w)

    # since the solutions for angle, -angle and 180-angle are all the same,
    # it suffices to look at the first quadrant and the absolute values of sin, cos:
    sin_a, cos_a = abs(math.sin(angle)), abs(math.cos(angle))
    if side_short <= 2. * sin_a * cos_a * side_long or abs(sin_a - cos_a) < 1e-10:
        # half constrained case: two crop corners touch the longer side,
        # the other two corners are on the mid-line parallel to the longer line
        x = 0.5 * side_short
        wr, hr = (x / sin_a, x / cos_a) if width_is_longer else (x / cos_a, x / sin_a)
    else:
        # fully constrained case: crop touches all 4 sides
        cos_2a = cos_a * cos_a - sin_a * sin_a
        wr, hr = (w * cos_a - h * sin_a) / cos_2a, (h * cos_a - w * sin_a) / cos_2a

    return wr, hr
Here is a comparison of the function with the other solution:
>>> wl,hl = largest_rotated_rect(1500,500,math.radians(20))
>>> print (wl,hl),', area=',wl*hl
(828.2888697391496, 230.61639227890998) , area= 191016.990904
>>> wm,hm = rotatedRectWithMaxArea(1500,500,math.radians(20))
>>> print (wm,hm),', area=',wm*hm
(730.9511000407718, 266.044443118978) , area= 194465.478358
For an angle in [0, pi/2), the bounding box of the rotated image (width w, height h) has these dimensions:

width: w_bb = w*cos_a + h*sin_a
height: h_bb = w*sin_a + h*cos_a

If w_r, h_r are the computed optimal width and height of the cropped image, then the insets from the bounding box are:

in horizontal direction: (w_bb - w_r)/2
in vertical direction: (h_bb - h_r)/2
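For illustration, a short sketch (reusing rotatedRectWithMaxArea from above and the 1500x500 example at 20 degrees) that computes those insets:

import math

w, h = 1500, 500
angle = math.radians(20)
wr, hr = rotatedRectWithMaxArea(w, h, angle)

# dimensions of the bounding box of the rotated image
sin_a, cos_a = abs(math.sin(angle)), abs(math.cos(angle))
w_bb = w * cos_a + h * sin_a
h_bb = w * sin_a + h * cos_a

# insets of the optimal crop, measured from the bounding box edges
inset_x = (w_bb - wr) / 2
inset_y = (h_bb - hr) / 2
print(inset_x, inset_y)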
Proof:
Looking for the axis-aligned rectangle between two parallel lines that has maximal area is an optimization problem with one parameter, e.g. x as in this figure:

Let s denote the distance between the two parallel lines (it will turn out to be the shorter side of the rotated rectangle). Then the sides a, b of the sought-after rectangle have a constant ratio with x and s-x, respectively, namely x = a·sin α and (s-x) = b·cos α:

So maximizing the area a·b means maximizing x·(s-x). For two factors with a fixed sum, the product is maximal when both factors are equal (this is the "theorem of height", i.e. the geometric-mean relation, for right-angled triangles: x·(s-x) = p·q = h²). Hence the maximal area is reached at x = s-x = s/2, i.e. the two corners E, G between the parallel lines are on the mid-line:

This solution is only valid if this maximal rectangle fits into the rotated rectangle. Therefore the diagonal EG must not be longer than the other side l of the rotated rectangle. Since

EG = AF + DH = (s/2)·(cot α + tan α) = s/(2·sin α·cos α) = s/sin 2α

we have the condition s ≤ l·sin 2α, where s and l are the shorter and longer sides of the rotated rectangle.

In case of s > l·sin 2α the parameter x must be smaller (than s/2) and such that all corners of the sought-after rectangle each lie on a side of the rotated rectangle. This leads to the equation

x·cot α + (s-x)·tan α = l

giving x = sin α·(l·cos α - s·sin α)/cos 2α. From a = x/sin α and b = (s-x)/cos α we get the formulas used above.
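For the skeptical, here is a small numerical check of the closed form (my own sketch, reusing rotatedRectWithMaxArea from above): it verifies that the computed crop fits inside the rotated rectangle, and that a slightly larger one does not.

import math

def inside_rotated(wr, hr, w, h, angle, eps=1e-6):
    # check that the axis-aligned wr x hr rectangle (centered at the origin)
    # lies inside the w x h rectangle rotated by 'angle' about the origin:
    # rotate each corner back by -angle and test it against the w x h bounds
    for sx in (-1, 1):
        for sy in (-1, 1):
            x, y = sx * wr / 2, sy * hr / 2
            xb = x * math.cos(-angle) - y * math.sin(-angle)
            yb = x * math.sin(-angle) + y * math.cos(-angle)
            if abs(xb) > w / 2 + eps or abs(yb) > h / 2 + eps:
                return False
    return True

w, h, angle = 1500, 500, math.radians(20)
wr, hr = rotatedRectWithMaxArea(w, h, angle)
print(inside_rotated(wr, hr, w, h, angle))                # True: the crop fits
print(inside_rotated(wr * 1.01, hr * 1.01, w, h, angle))  # False: slightly bigger does not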

So, after investigating many claimed solutions, I have finally found a method that works: the answer by Andri and Magnus Hoff on Calculate largest rectangle in a rotated rectangle.
The below Python code contains the method of interest - largest_rotated_rect - and a short demo.
import math
import cv2
import numpy as np

def rotate_image(image, angle):
    """
    Rotates an OpenCV 2 / NumPy image about its centre by the given angle
    (in degrees). The returned image will be large enough to hold the entire
    new image, with a black background
    """

    # Get the image size
    # No that's not an error - NumPy stores image matrices backwards
    image_size = (image.shape[1], image.shape[0])
    image_center = tuple(np.array(image_size) / 2)

    # Convert the OpenCV 2x3 rotation matrix to 3x3
    rot_mat = np.vstack(
        [cv2.getRotationMatrix2D(image_center, angle, 1.0), [0, 0, 1]]
    )

    rot_mat_notranslate = np.matrix(rot_mat[0:2, 0:2])

    # Shorthand for below calcs
    image_w2 = image_size[0] * 0.5
    image_h2 = image_size[1] * 0.5

    # Obtain the rotated coordinates of the image corners
    rotated_coords = [
        (np.array([-image_w2,  image_h2]) * rot_mat_notranslate).A[0],
        (np.array([ image_w2,  image_h2]) * rot_mat_notranslate).A[0],
        (np.array([-image_w2, -image_h2]) * rot_mat_notranslate).A[0],
        (np.array([ image_w2, -image_h2]) * rot_mat_notranslate).A[0]
    ]

    # Find the size of the new image
    x_coords = [pt[0] for pt in rotated_coords]
    x_pos = [x for x in x_coords if x > 0]
    x_neg = [x for x in x_coords if x < 0]

    y_coords = [pt[1] for pt in rotated_coords]
    y_pos = [y for y in y_coords if y > 0]
    y_neg = [y for y in y_coords if y < 0]

    right_bound = max(x_pos)
    left_bound = min(x_neg)
    top_bound = max(y_pos)
    bot_bound = min(y_neg)

    new_w = int(abs(right_bound - left_bound))
    new_h = int(abs(top_bound - bot_bound))

    # We require a translation matrix to keep the image centred
    trans_mat = np.matrix([
        [1, 0, int(new_w * 0.5 - image_w2)],
        [0, 1, int(new_h * 0.5 - image_h2)],
        [0, 0, 1]
    ])

    # Compute the transform for the combined rotation and translation
    affine_mat = (np.matrix(trans_mat) * np.matrix(rot_mat))[0:2, :]

    # Apply the transform
    result = cv2.warpAffine(
        image,
        affine_mat,
        (new_w, new_h),
        flags=cv2.INTER_LINEAR
    )

    return result

def largest_rotated_rect(w, h, angle):
    """
    Given a rectangle of size wxh that has been rotated by 'angle' (in
    radians), computes the width and height of the largest possible
    axis-aligned rectangle within the rotated rectangle.

    Original JS code by 'Andri' and Magnus Hoff from Stack Overflow
    Converted to Python by Aaron Snoswell
    """

    quadrant = int(math.floor(angle / (math.pi / 2))) & 3
    sign_alpha = angle if ((quadrant & 1) == 0) else math.pi - angle
    alpha = (sign_alpha % math.pi + math.pi) % math.pi

    bb_w = w * math.cos(alpha) + h * math.sin(alpha)
    bb_h = w * math.sin(alpha) + h * math.cos(alpha)

    gamma = math.atan2(bb_w, bb_w) if (w < h) else math.atan2(bb_h, bb_h)
    delta = math.pi - alpha - gamma

    length = h if (w < h) else w
    d = length * math.cos(alpha)
    a = d * math.sin(alpha) / math.sin(delta)

    y = a * math.cos(gamma)
    x = y * math.tan(gamma)

    return (
        bb_w - 2 * x,
        bb_h - 2 * y
    )

def crop_around_center(image, width, height):
    """
    Given a NumPy / OpenCV 2 image, crops it to the given width and height,
    around its centre point
    """

    image_size = (image.shape[1], image.shape[0])
    image_center = (int(image_size[0] * 0.5), int(image_size[1] * 0.5))

    if width > image_size[0]:
        width = image_size[0]

    if height > image_size[1]:
        height = image_size[1]

    x1 = int(image_center[0] - width * 0.5)
    x2 = int(image_center[0] + width * 0.5)
    y1 = int(image_center[1] - height * 0.5)
    y2 = int(image_center[1] + height * 0.5)

    return image[y1:y2, x1:x2]

def demo():
    """
    Demos the largest_rotated_rect function
    """

    image = cv2.imread("lenna_rectangle.png")
    image_height, image_width = image.shape[0:2]

    cv2.imshow("Original Image", image)

    print("Press [enter] to begin the demo")
    print("Press [q] or Escape to quit")

    key = cv2.waitKey(0)
    if key == ord("q") or key == 27:
        exit()

    for i in np.arange(0, 360, 0.5):
        image_orig = np.copy(image)
        image_rotated = rotate_image(image, i)
        image_rotated_cropped = crop_around_center(
            image_rotated,
            *largest_rotated_rect(
                image_width,
                image_height,
                math.radians(i)
            )
        )

        key = cv2.waitKey(2)
        if key == ord("q") or key == 27:
            exit()

        cv2.imshow("Original Image", image_orig)
        cv2.imshow("Rotated Image", image_rotated)
        cv2.imshow("Cropped Image", image_rotated_cropped)

    print("Done")

if __name__ == "__main__":
    demo()
Simply place this image (cropped to demonstrate that it works with non-square images) in the same directory as the above file, then run it.

Congratulations on the great work! I wanted to use your code with the OpenCV C++ library, so I did the conversion that follows. Maybe this approach will be helpful to other people.
#include <iostream>
#include <opencv.hpp>

#define PI 3.14159265359

using namespace std;

double degree_to_radian(double angle)
{
    return angle * PI / 180;
}

cv::Mat rotate_image(cv::Mat image, double angle)
{
    // Rotates an OpenCV 2 image about its centre by the given angle
    // (in degrees). The returned image will be large enough to hold the entire
    // new image, with a black background

    // Note: this conversion keeps the (rows, cols) order of the original
    // answer, i.e. the width/height fields are swapped, but consistently so.
    cv::Size image_size = cv::Size(image.rows, image.cols);
    cv::Point image_center = cv::Point(image_size.height / 2, image_size.width / 2);

    // Convert the OpenCV 2x3 matrix to 3x3
    cv::Mat rot_mat = cv::getRotationMatrix2D(image_center, angle, 1.0);
    double row[3] = {0.0, 0.0, 1.0};
    cv::Mat new_row = cv::Mat(1, 3, rot_mat.type(), row);
    rot_mat.push_back(new_row);

    double slice_mat[2][2] = {
        {rot_mat.col(0).at<double>(0), rot_mat.col(1).at<double>(0)},
        {rot_mat.col(0).at<double>(1), rot_mat.col(1).at<double>(1)}
    };
    cv::Mat rot_mat_nontranslate = cv::Mat(2, 2, rot_mat.type(), slice_mat);

    double image_w2 = image_size.width * 0.5;
    double image_h2 = image_size.height * 0.5;

    // Obtain the rotated coordinates of the image corners
    std::vector<cv::Mat> rotated_coords;

    double image_dim_d_1[2] = { -image_h2, image_w2 };
    cv::Mat image_dim = cv::Mat(1, 2, rot_mat.type(), image_dim_d_1);
    rotated_coords.push_back(cv::Mat(image_dim * rot_mat_nontranslate));

    double image_dim_d_2[2] = { image_h2, image_w2 };
    image_dim = cv::Mat(1, 2, rot_mat.type(), image_dim_d_2);
    rotated_coords.push_back(cv::Mat(image_dim * rot_mat_nontranslate));

    double image_dim_d_3[2] = { -image_h2, -image_w2 };
    image_dim = cv::Mat(1, 2, rot_mat.type(), image_dim_d_3);
    rotated_coords.push_back(cv::Mat(image_dim * rot_mat_nontranslate));

    double image_dim_d_4[2] = { image_h2, -image_w2 };
    image_dim = cv::Mat(1, 2, rot_mat.type(), image_dim_d_4);
    rotated_coords.push_back(cv::Mat(image_dim * rot_mat_nontranslate));

    // Find the size of the new image
    vector<double> x_coords, x_pos, x_neg;
    for (int i = 0; i < rotated_coords.size(); i++)
    {
        double pt = rotated_coords[i].col(0).at<double>(0);
        x_coords.push_back(pt);
        if (pt > 0)
            x_pos.push_back(pt);
        else
            x_neg.push_back(pt);
    }

    vector<double> y_coords, y_pos, y_neg;
    for (int i = 0; i < rotated_coords.size(); i++)
    {
        double pt = rotated_coords[i].col(1).at<double>(0);
        y_coords.push_back(pt);
        if (pt > 0)
            y_pos.push_back(pt);
        else
            y_neg.push_back(pt);
    }

    double right_bound = *max_element(x_pos.begin(), x_pos.end());
    double left_bound = *min_element(x_neg.begin(), x_neg.end());
    double top_bound = *max_element(y_pos.begin(), y_pos.end());
    double bottom_bound = *min_element(y_neg.begin(), y_neg.end());

    int new_w = int(abs(right_bound - left_bound));
    int new_h = int(abs(top_bound - bottom_bound));

    // We require a translation matrix to keep the image centred
    double trans_mat[3][3] = {
        {1, 0, int(new_w * 0.5 - image_w2)},
        {0, 1, int(new_h * 0.5 - image_h2)},
        {0, 0, 1},
    };

    // Compute the transform for the combined rotation and translation
    cv::Mat aux_affine_mat = (cv::Mat(3, 3, rot_mat.type(), trans_mat) * rot_mat);
    cv::Mat affine_mat;  // build the 2x3 transform from the top two rows
    affine_mat.push_back(aux_affine_mat.row(0));
    affine_mat.push_back(aux_affine_mat.row(1));

    // Apply the transform
    cv::Mat output;
    cv::warpAffine(image, output, affine_mat, cv::Size(new_h, new_w), cv::INTER_LINEAR);

    return output;
}

cv::Size largest_rotated_rect(int h, int w, double angle)
{
    // Given a rectangle of size wxh that has been rotated by 'angle' (in
    // radians), computes the width and height of the largest possible
    // axis-aligned rectangle within the rotated rectangle.

    // Original JS code by 'Andri' and Magnus Hoff from Stack Overflow
    // Converted to Python by Aaron Snoswell (https://stackoverflow.com/questions/16702966/rotate-image-and-crop-out-black-borders)
    // Converted to C++ by Eliezer Bernart

    int quadrant = int(floor(angle / (PI / 2))) & 3;
    double sign_alpha = ((quadrant & 1) == 0) ? angle : PI - angle;
    double alpha = fmod((fmod(sign_alpha, PI) + PI), PI);

    double bb_w = w * cos(alpha) + h * sin(alpha);
    double bb_h = w * sin(alpha) + h * cos(alpha);

    double gamma = w < h ? atan2(bb_w, bb_w) : atan2(bb_h, bb_h);
    double delta = PI - alpha - gamma;

    int length = w < h ? h : w;

    double d = length * cos(alpha);
    double a = d * sin(alpha) / sin(delta);
    double y = a * cos(gamma);
    double x = y * tan(gamma);

    return cv::Size(bb_w - 2 * x, bb_h - 2 * y);
}

// for those interested in the actual optimum - contributed by coproc
#include <algorithm>

cv::Size really_largest_rotated_rect(int h, int w, double angle)
{
    // Given a rectangle of size wxh that has been rotated by 'angle' (in
    // radians), computes the width and height of the largest possible
    // axis-aligned rectangle within the rotated rectangle.
    if (w <= 0 || h <= 0)
        return cv::Size(0, 0);

    bool width_is_longer = w >= h;
    int side_long = w, side_short = h;
    if (!width_is_longer)
        std::swap(side_long, side_short);

    // since the solutions for angle, -angle and pi-angle are all the same,
    // it suffices to look at the first quadrant and the absolute values of sin, cos:
    double sin_a = fabs(sin(angle)), cos_a = fabs(cos(angle));
    double wr, hr;
    // the second condition guards the 45-degree case, where cos_2a below is 0
    if (side_short <= 2. * sin_a * cos_a * side_long || fabs(sin_a - cos_a) < 1e-10)
    {
        // half constrained case: two crop corners touch the longer side,
        // the other two corners are on the mid-line parallel to the longer line
        double x = 0.5 * side_short;
        wr = x / sin_a;
        hr = x / cos_a;
        if (!width_is_longer)
            std::swap(wr, hr);
    }
    else
    {
        // fully constrained case: crop touches all 4 sides
        double cos_2a = cos_a * cos_a - sin_a * sin_a;
        wr = (w * cos_a - h * sin_a) / cos_2a;
        hr = (h * cos_a - w * sin_a) / cos_2a;
    }

    return cv::Size(wr, hr);
}

cv::Mat crop_around_center(cv::Mat image, int height, int width)
{
    // Given an OpenCV 2 image, crops it to the given width and height,
    // around its centre point

    cv::Size image_size = cv::Size(image.rows, image.cols);
    cv::Point image_center = cv::Point(int(image_size.height * 0.5), int(image_size.width * 0.5));

    if (width > image_size.width)
        width = image_size.width;

    if (height > image_size.height)
        height = image_size.height;

    int x1 = int(image_center.x - width * 0.5);
    int x2 = int(image_center.x + width * 0.5);
    int y1 = int(image_center.y - height * 0.5);
    int y2 = int(image_center.y + height * 0.5);

    return image(cv::Rect(cv::Point(y1, x1), cv::Point(y2, x2)));
}

void demo(cv::Mat image)
{
    // Demos the largest_rotated_rect function
    int image_height = image.rows;
    int image_width = image.cols;

    for (float i = 0.0; i < 360.0; i += 0.5)
    {
        cv::Mat image_orig = image.clone();
        cv::Mat image_rotated = rotate_image(image, i);

        cv::Size largest_rect = largest_rotated_rect(image_height, image_width, degree_to_radian(i));
        // for those who trust math (added by coproc):
        cv::Size largest_rect2 = really_largest_rotated_rect(image_height, image_width, degree_to_radian(i));
        cout << "area1 = " << largest_rect.height * largest_rect.width << endl;
        cout << "area2 = " << largest_rect2.height * largest_rect2.width << endl;

        cv::Mat image_rotated_cropped = crop_around_center(
            image_rotated,
            largest_rect.height,
            largest_rect.width
        );

        cv::imshow("Original Image", image_orig);
        cv::imshow("Rotated Image", image_rotated);
        cv::imshow("Cropped image", image_rotated_cropped);

        if (char(cv::waitKey(15)) == 'q')
            break;
    }
}

int main(int argc, char* argv[])
{
    cv::Mat image = cv::imread(argv[1]);

    if (image.empty())
    {
        cout << "> The input image was not found." << endl;
        exit(EXIT_FAILURE);
    }

    cout << "Press [s] to begin or restart the demo" << endl;
    cout << "Press [q] to quit" << endl;

    while (true)
    {
        cv::imshow("Original Image", image);
        char opt = char(cv::waitKey(0));
        switch (opt) {
            case 's':
                demo(image);
                break;
            case 'q':
                return EXIT_SUCCESS;
            default:
                break;
        }
    }

    return EXIT_SUCCESS;
}

Rotation and cropping in TensorFlow
I personally needed this function in TensorFlow, and thanks to Aaron Snoswell I could implement it.
import math
import tensorflow as tf

def _rotate_and_crop(image, output_height, output_width, rotation_degree, do_crop):
    """Rotate the given image with the given rotation degree and crop for the black edges if necessary

    Args:
        image: A `Tensor` representing an image of arbitrary size.
        output_height: The height of the image after preprocessing.
        output_width: The width of the image after preprocessing.
        rotation_degree: The degree of rotation on the image.
        do_crop: Do cropping if it is True.

    Returns:
        A rotated image.
    """
    # Rotate the given image with the given rotation degree
    if rotation_degree != 0:
        image = tf.contrib.image.rotate(image, math.radians(rotation_degree), interpolation='BILINEAR')

        # Center crop to omit black noise on the edges
        if do_crop:
            lrr_width, lrr_height = _largest_rotated_rect(output_height, output_width, math.radians(rotation_degree))
            resized_image = tf.image.central_crop(image, float(lrr_height) / output_height)
            image = tf.image.resize_images(resized_image, [output_height, output_width],
                                           method=tf.image.ResizeMethod.BILINEAR, align_corners=False)

    return image

def _largest_rotated_rect(w, h, angle):
    """
    Given a rectangle of size wxh that has been rotated by 'angle' (in
    radians), computes the width and height of the largest possible
    axis-aligned rectangle within the rotated rectangle.

    Original JS code by 'Andri' and Magnus Hoff from Stack Overflow
    Converted to Python by Aaron Snoswell
    Source: http://stackoverflow.com/questions/16702966/rotate-image-and-crop-out-black-borders
    """
    quadrant = int(math.floor(angle / (math.pi / 2))) & 3
    sign_alpha = angle if ((quadrant & 1) == 0) else math.pi - angle
    alpha = (sign_alpha % math.pi + math.pi) % math.pi

    bb_w = w * math.cos(alpha) + h * math.sin(alpha)
    bb_h = w * math.sin(alpha) + h * math.cos(alpha)

    gamma = math.atan2(bb_w, bb_w) if (w < h) else math.atan2(bb_h, bb_h)
    delta = math.pi - alpha - gamma

    length = h if (w < h) else w
    d = length * math.cos(alpha)
    a = d * math.sin(alpha) / math.sin(delta)

    y = a * math.cos(gamma)
    x = y * math.tan(gamma)

    return (
        bb_w - 2 * x,
        bb_h - 2 * y
    )
If you need a further implementation example and visualization in TensorFlow, you can use this repository.
I hope this is helpful to other people.
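A minimal usage sketch (TF 1.x era, matching the tf.contrib API used above; the image size is an illustrative assumption):

import tensorflow as tf  # TF 1.x, where tf.contrib is available

image = tf.placeholder(tf.float32, shape=[224, 224, 3])  # hypothetical input
augmented = _rotate_and_crop(image, 224, 224, rotation_degree=30, do_crop=True)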

Inspired by coproc's amazing work, I wrote a function that, together with coproc's code, forms a complete solution (so it can be used by simple copy and paste). The rotate_max_area function below simply returns a rotated image without black boundary.
import math
import cv2
import numpy as np

def rotate_bound(image, angle):
    # CREDIT: https://www.pyimagesearch.com/2017/01/02/rotate-images-correctly-with-opencv-and-python/
    (h, w) = image.shape[:2]
    (cX, cY) = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(M[0, 0])
    sin = np.abs(M[0, 1])
    nW = int((h * sin) + (w * cos))
    nH = int((h * cos) + (w * sin))
    M[0, 2] += (nW / 2) - cX
    M[1, 2] += (nH / 2) - cY
    return cv2.warpAffine(image, M, (nW, nH))

def rotate_max_area(image, angle):
    """ image: cv2 image matrix object
        angle: in degree
    """
    wr, hr = rotatedRectWithMaxArea(image.shape[1], image.shape[0],
                                    math.radians(angle))
    rotated = rotate_bound(image, angle)
    h, w, _ = rotated.shape
    y1 = h // 2 - int(hr / 2)
    y2 = y1 + int(hr)
    x1 = w // 2 - int(wr / 2)
    x2 = x1 + int(wr)
    return rotated[y1:y2, x1:x2]
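A quick usage sketch (the file names are illustrative assumptions; rotatedRectWithMaxArea from coproc's answer above must be in scope):

img = cv2.imread("input.jpg")  # hypothetical input file
no_black = rotate_max_area(img, 25)  # rotated by 25 degrees, black corners cropped away
cv2.imwrite("rotated_cropped.jpg", no_black)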

A small update for brevity that makes use of the excellent imutils library.
import math
import cv2
import imutils
import numpy as np

def rotated_rect(w, h, angle):
    """
    Given a rectangle of size wxh that has been rotated by 'angle' (in
    radians), computes the width and height of the largest possible
    axis-aligned rectangle within the rotated rectangle.

    Original JS code by 'Andri' and Magnus Hoff from Stack Overflow
    Converted to Python by Aaron Snoswell
    """
    angle = math.radians(angle)
    quadrant = int(math.floor(angle / (math.pi / 2))) & 3
    sign_alpha = angle if ((quadrant & 1) == 0) else math.pi - angle
    alpha = (sign_alpha % math.pi + math.pi) % math.pi

    bb_w = w * math.cos(alpha) + h * math.sin(alpha)
    bb_h = w * math.sin(alpha) + h * math.cos(alpha)

    gamma = math.atan2(bb_w, bb_w) if (w < h) else math.atan2(bb_h, bb_h)
    delta = math.pi - alpha - gamma

    length = h if (w < h) else w
    d = length * math.cos(alpha)
    a = d * math.sin(alpha) / math.sin(delta)
    y = a * math.cos(gamma)
    x = y * math.tan(gamma)

    return (bb_w - 2 * x, bb_h - 2 * y)

def crop(img, w, h):
    x, y = int(img.shape[1] * .5), int(img.shape[0] * .5)
    return img[
        int(np.ceil(y - h * .5)) : int(np.floor(y + h * .5)),
        int(np.ceil(x - w * .5)) : int(np.floor(x + w * .5))
    ]

def rotate(img, angle):
    # rotate, crop and return original size
    (h, w) = img.shape[:2]
    img = imutils.rotate_bound(img, angle)
    img = crop(img, *rotated_rect(w, h, angle))
    img = cv2.resize(img, (w, h), interpolation=cv2.INTER_AREA)
    return img

Swift solution
Thanks to coproc for his great solution. Here is the code in Swift:
// Given a rectangle of size.width x size.height that has been rotated by 'angle' (in
// radians), computes the width and height of the largest possible
// axis-aligned rectangle (maximal area) within the rotated rectangle.
func rotatedRectWithMaxArea(size: CGSize, angle: CGFloat) -> CGSize {
    let w = size.width
    let h = size.height

    if w <= 0 || h <= 0 {
        return CGSize.zero
    }

    let widthIsLonger = w >= h
    let (sideLong, sideShort) = widthIsLonger ? (w, h) : (h, w)

    // since the solutions for angle, -angle and 180-angle are all the same,
    // it suffices to look at the first quadrant and the absolute values of sin, cos:
    let (sinA, cosA) = (abs(sin(angle)), abs(cos(angle)))
    if sideShort <= 2 * sinA * cosA * sideLong || abs(sinA - cosA) < 1e-10 {
        // half constrained case: two crop corners touch the longer side,
        // the other two corners are on the mid-line parallel to the longer line
        let x = 0.5 * sideShort
        let (wr, hr) = widthIsLonger ? (x / sinA, x / cosA) : (x / cosA, x / sinA)
        return CGSize(width: wr, height: hr)
    } else {
        // fully constrained case: crop touches all 4 sides
        let cos2A = cosA * cosA - sinA * sinA
        let (wr, hr) = ((w * cosA - h * sinA) / cos2A, (h * cosA - w * sinA) / cos2A)
        return CGSize(width: wr, height: hr)
    }
}

Perhaps an even simpler solution would be:
import numpy as np

def crop_image(image, angle):
    # note: the formula assumes abs(angle) is below 45 degrees,
    # where tan_a = 1 would make the denominator zero
    h, w = image.shape[:2]
    tan_a = abs(np.tan(angle * np.pi / 180))
    b = int(tan_a / (1 - tan_a ** 2) * (h - w * tan_a))
    d = int(tan_a / (1 - tan_a ** 2) * (w - h * tan_a))
    return image[d:h - d, b:w - b]
Instead of calculating the height and width of the rotated rectangle like many have done, it is sufficient to find the height of the black triangles that form when rotating an image.
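A minimal usage sketch, assuming the image is rotated in place (same canvas size) and the angle stays well below 45 degrees; the file name is an illustrative assumption:

import cv2

img = cv2.imread("input.jpg")  # hypothetical input file
angle = 15
h, w = img.shape[:2]

# rotate within the original canvas, leaving black corner triangles
M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
rotated = cv2.warpAffine(img, M, (w, h))

cropped = crop_image(rotated, angle)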

Correction to the most favored solution above, given by coproc on May 27 2013: when sin_a equals cos_a (i.e. at 45 degrees), the last two lines divide by zero. Solve by adding "or abs(sin_a - cos_a) < 1e-10" to the preceding if condition.
Addition: if you do not know the original non-rotated nx and ny but only have the rotated frame (or image), then find the box just containing it (I do this by removing blank, i.e. monochrome, borders) and first run the program in reverse on its size to find nx and ny. If the image was rotated into a too-small frame so that it was cut along the sides (into an octagonal shape), I first find the x and y extensions to the full containment frame.
However, this also does not work for angles around 45 degrees, where the result becomes square instead of maintaining the non-rotated aspect ratio. For me this routine only works properly up to 30 degrees.
Still a great routine! It solved my nagging problem in astronomical image alignment.

Rotate images to the correct orientation
import re

import cv2
import imutils
import pytesseract

image = cv2.imread('my_pdf_madan_m/page_1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.bitwise_not(gray)

rot_data = pytesseract.image_to_osd(image)
print("[OSD] " + rot_data)
rot = re.search(r'(?<=Rotate: )\d+', rot_data).group(0)

angle = float(rot)

# rotate the image to deskew it
rotated = imutils.rotate_bound(image, angle)
# TODO: Rotated image can be saved here
print(pytesseract.image_to_osd(rotated))

# Run tesseract OCR on image
text = pytesseract.image_to_string(rotated, lang='eng', config="--psm 6")
print(text)

I recently implemented a solution for PyTorch; it might come in handy. It could potentially be used with the RandomRotation transform as well: just read the particular angle used by the transform and then use it with the PyTorch transforms. The function simply takes in a batch of images and does a random rotation with cropping.
import math

import torchvision.transforms as transforms

def _largest_rotated_rect(w, h, angle):
    """
    Given a rectangle of size wxh that has been rotated by 'angle' (in
    radians), computes the width and height of the largest possible
    axis-aligned rectangle within the rotated rectangle.

    Original JS code by 'Andri' and Magnus Hoff from Stack Overflow
    Converted to Python by Aaron Snoswell
    Source: http://stackoverflow.com/questions/16702966/rotate-image-and-crop-out-black-borders
    """
    quadrant = int(math.floor(angle / (math.pi / 2))) & 3
    sign_alpha = angle if ((quadrant & 1) == 0) else math.pi - angle
    alpha = (sign_alpha % math.pi + math.pi) % math.pi

    bb_w = w * math.cos(alpha) + h * math.sin(alpha)
    bb_h = w * math.sin(alpha) + h * math.cos(alpha)

    gamma = math.atan2(bb_w, bb_w) if (w < h) else math.atan2(bb_h, bb_h)
    delta = math.pi - alpha - gamma

    length = h if (w < h) else w
    d = length * math.cos(alpha)
    a = d * math.sin(alpha) / math.sin(delta)

    y = a * math.cos(gamma)
    x = y * math.tan(gamma)

    return (
        bb_w - 2 * x,
        bb_h - 2 * y
    )

def _rotate_and_crop(image, output_height=32, output_width=32):
    """Rotate the given image with a random rotation degree and crop for the black edges if necessary. For my case, image sizes are 32x32.

    Args:
        image: A batch of tensors - normally from a dataloader.
        output_height: The height of the image after preprocessing.
        output_width: The width of the image after preprocessing.

    Returns:
        A rotated image.
    """
    # Rotate the given image with the given rotation degree
    rotation_transform = transforms.RandomRotation((0, 360))
    angle_rot = rotation_transform.angle_rot  # you will have to read it from the pytorch library
    lrr_width, lrr_height = _largest_rotated_rect(output_height, output_width, math.radians(angle_rot))
    cropped_image = transforms.CenterCrop((lrr_height, lrr_width))
    resize_transform = transforms.Resize(size=(output_height, output_width))
    transform = transforms.Compose([rotation_transform, cropped_image, resize_transform,
                                    transforms.RandomHorizontalFlip(),
                                    transforms.ToTensor()])
    image = transform(image)
    return image

By doing the calculations by hand and looking at the original post, I found a minor typo on the gamma calculations. It should actually be:
gamma = math.atan2(bb_w, bb_h) if (w < h) else math.atan2(bb_h, bb_w)

Related

How to find hotspot points when an image is rotated by different angle?

I have 36 images of the same car, each one rotated by ~10 degrees. Now, I need to show the new position of any hotspots marked on the image when it is rotated by 10 degrees. I tried using the formula below, but it's not giving the right coordinates.
function rotate(x, y, xm, ym, a) {
    var cos = Math.cos,
        sin = Math.sin,
        a = a * Math.PI / 180,
        xr = (x - xm) * cos(a) - (y - ym) * sin(a) + xm,
        yr = (x - xm) * sin(a) + (y - ym) * cos(a) + ym;
    return [xr, yr];
}

rotate(526, 327, 640, 379, 10);
The original image size is: 1280x758.
I visualized the images using opencv, below is the result:
img = cv2.imread('img.jpg')
x,y = (526,327)
image = cv2.circle(img, (x,y), radius=5, color=(0, 0, 255), thickness=-1)
cv2_imshow(image)
Image 1 Hotspot(Red circle) at: 526,327
Image 2 Hotspot after rotating by 10 deg at: 536,307
Original Images are here:
https://i.imgur.com/gO4SLZU.jpg
https://i.imgur.com/dJbCmcW.jpg
Can anyone tell me what I am doing wrong here?
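One thing worth checking (an observation, not a verified fix for this exact data): in image coordinates the y-axis points down, so a rotation that is counterclockwise in the usual mathematical convention appears clockwise on screen, and the sign of the angle may need to be flipped. A small sketch for trying both directions:

import math

def rotate_point(x, y, xm, ym, deg):
    # rotate (x, y) about (xm, ym); in image coordinates (y down) a positive
    # angle here appears clockwise on screen, so negate it for the other direction
    a = math.radians(deg)
    xr = (x - xm) * math.cos(a) - (y - ym) * math.sin(a) + xm
    yr = (x - xm) * math.sin(a) + (y - ym) * math.cos(a) + ym
    return xr, yr

print(rotate_point(526, 327, 640, 379, 10))
print(rotate_point(526, 327, 640, 379, -10))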

Difficulty calculating slope for a set of rotated and shifted ellipses, sometimes inverted, sometimes completely wrong

I am using OpenCV-Python to fit an ellipse to the shape of a droplet.
Then I choose a line, which represents the surface the droplet is resting on.
I calculate the tangents at the intersection of the surface and the ellipse to get the contact angle of the droplet.
It works most of the time, but in some cases, the tangents are flipped upside down or just wrong.
It seems that the calculation for the slope of the tangent fails.
Can someone tell me why this happens?
Here you can see how it should look (surface at y=250):
And this is the result when I choose a surface level of y=47:
I did some research: I need to detect which of the two axes maj_ax, min_ax was parallel to the x-axis before the ellipse got rotated by phi, or else the slope calculation algorithm fails.
What am I doing wrong?
Here is a minimal reproducible example:
from math import cos, sin, pi, sqrt, tan, atan2, radians
import cv2

class Droplet():
    def __init__(self):
        self.is_valid = False
        self.angle_l = 0
        self.angle_r = 0
        self.center = (0, 0)
        self.maj = 0
        self.min = 0
        self.phi = 0.0
        self.tilt_deg = 0
        self.foc_pt1 = (0, 0)
        self.foc_pt2 = (0, 0)
        self.tan_l_m = 0
        self.int_l = (0, 0)
        self.line_l = (0, 0, 0, 0)
        self.tan_r_m = 0
        self.int_r = (0, 0)
        self.line_r = (0, 0, 0, 0)
        self.base_diam = 0

def evaluate_droplet(img, y_base) -> Droplet:
    drplt = Droplet()
    crop_img = img[:y_base, :]
    shape = img.shape
    height = shape[0]
    width = shape[1]
    # values only for 8bit images!
    bw_edges = cv2.Canny(crop_img, 76, 179)
    contours, hierarchy = cv2.findContours(bw_edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    if len(contours) == 0:
        raise ValueError('No contours found!')
    edge = max(contours, key=cv2.contourArea)
    (x0, y0), (maj_ax, min_ax), phi_deg = cv2.fitEllipse(edge)
    phi = radians(phi_deg)  # to radians
    a = maj_ax / 2
    b = min_ax / 2
    intersection = calc_intersection_line_ellipse((x0, y0, a, b, phi), (0, y_base))
    if intersection is None:
        raise ValueError('No intersections found')
    # select left and right intersection points
    x_int_l = min(intersection)
    x_int_r = max(intersection)
    foc_len = sqrt(abs(a**2 - b**2))
    # calc slope and angle of tangent
    m_t_l = calc_slope_of_ellipse((x0, y0, a, b, phi), x_int_l, y_base)
    angle_l = pi - atan2(m_t_l, 1)
    m_t_r = calc_slope_of_ellipse((x0, y0, a, b, phi), x_int_r, y_base)
    angle_r = atan2(m_t_r, 1) + pi
    drplt.angle_l = angle_l
    drplt.angle_r = angle_r
    drplt.maj = maj_ax
    drplt.min = min_ax
    drplt.center = (x0, y0)
    drplt.phi = phi
    drplt.tilt_deg = phi_deg
    drplt.tan_l_m = m_t_l
    drplt.tan_r_m = m_t_r
    drplt.line_l = (int(round(x_int_l - (int(round(y_base)) / m_t_l))), 0,
                    int(round(x_int_l + ((height - int(round(y_base))) / m_t_l))), int(round(height)))
    drplt.line_r = (int(round(x_int_r - (int(round(y_base)) / m_t_r))), 0,
                    int(round(x_int_r + ((height - int(round(y_base))) / m_t_r))), int(round(height)))
    drplt.int_l = (x_int_l, y_base)
    drplt.int_r = (x_int_r, y_base)
    drplt.foc_pt1 = (x0 + foc_len * cos(phi), y0 + foc_len * sin(phi))
    drplt.foc_pt2 = (x0 - foc_len * cos(phi), y0 - foc_len * sin(phi))
    drplt.base_diam = x_int_r - x_int_l
    drplt.is_valid = True
    # draw ellipse and lines
    img = cv2.drawContours(img, contours, -1, (100, 100, 255), 2)
    img = cv2.drawContours(img, edge, -1, (255, 0, 0), 2)
    img = cv2.ellipse(img, (int(round(x0)), int(round(y0))), (int(round(a)), int(round(b))),
                      int(round(phi * 180 / pi)), 0, 360, (255, 0, 255), thickness=1, lineType=cv2.LINE_AA)
    y_int = int(round(y_base))
    img = cv2.line(img, (int(round(x_int_l - (y_int / m_t_l))), 0),
                   (int(round(x_int_l + ((height - y_int) / m_t_l))), int(round(height))),
                   (255, 0, 255), thickness=1, lineType=cv2.LINE_AA)
    img = cv2.line(img, (int(round(x_int_r - (y_int / m_t_r))), 0),
                   (int(round(x_int_r + ((height - y_int) / m_t_r))), int(round(height))),
                   (255, 0, 255), thickness=1, lineType=cv2.LINE_AA)
    img = cv2.ellipse(img, (int(round(x_int_l)), y_int), (20, 20), 0, 0,
                      -int(round(angle_l * 180 / pi)), (255, 0, 255), thickness=1, lineType=cv2.LINE_AA)
    img = cv2.ellipse(img, (int(round(x_int_r)), y_int), (20, 20), 0, 180,
                      180 + int(round(angle_r * 180 / pi)), (255, 0, 255), thickness=1, lineType=cv2.LINE_AA)
    img = cv2.line(img, (0, y_int), (width, y_int), (255, 0, 0), thickness=2, lineType=cv2.LINE_AA)
    img = cv2.putText(img, '<' + str(round(angle_l * 180 / pi, 1)), (5, y_int - 5),
                      cv2.FONT_HERSHEY_COMPLEX, .5, (0, 0, 0))
    img = cv2.putText(img, '<' + str(round(angle_r * 180 / pi, 1)), (width - 80, y_int - 5),
                      cv2.FONT_HERSHEY_COMPLEX, .5, (0, 0, 0))
    cv2.imshow('Test', img)
    cv2.waitKey(0)
    return drplt

def calc_intersection_line_ellipse(ellipse_pars, line_pars):
    """
    calculates intersection(s) of an ellipse with a line
    :param ellipse_pars: tuple of (x0, y0, a, b, phi): x0, y0 center of ellipse; a, b semi-axes of ellipse; phi tilt rel to x axis
    :param line_pars: tuple of (m, t): m is the slope and t is the intercept of the intersecting line
    :returns: x-coordinate(s) of intersection as a tuple, a single value, or None if none found
    """
    # -->> http://quickcalcbasic.com/ellipse%20line%20intersection.pdf
    (x0, y0, h, v, phi) = ellipse_pars
    (m, t) = line_pars
    y = t - y0
    try:
        a = v**2 * cos(phi)**2 + h**2 * sin(phi)**2
        b = 2 * y * cos(phi) * sin(phi) * (v**2 - h**2)
        c = y**2 * (v**2 * sin(phi)**2 + h**2 * cos(phi)**2) - (h**2 * v**2)
        det = b**2 - 4 * a * c
        if det > 0:
            x1 = int(round((-b - sqrt(det)) / (2 * a) + x0))
            x2 = int(round((-b + sqrt(det)) / (2 * a) + x0))
            return x1, x2
        elif det == 0:
            x = int(round(-b / (2 * a)))
            return x
        else:
            return None
    except Exception as ex:
        raise ex

def calc_slope_of_ellipse(ellipse_pars, x, y):
    """
    calculates the slope of the tangent at point x, y; the point needs to be on the ellipse!
    :param ellipse_pars: tuple of (x0, y0, a, b, phi): x0, y0 center of ellipse; a, b semi-axes of ellipse; phi tilt rel to x axis
    :param x: x-coord where the slope will be calculated
    :returns: the slope of the tangent
    """
    (x0, y0, a, b, phi) = ellipse_pars
    # transform to non-rotated ellipse
    x_rot = (x - x0) * cos(phi) + (y - y0) * sin(phi)
    y_rot = (x - x0) * sin(phi) + (y - y0) * cos(phi)
    m_rot = -(b**2 * x_rot) / (a**2 * y_rot)  # slope of tangent to unrotated ellipse
    # rotate tangent line back to angle of the rotated ellipse
    m_tan = tan(atan2(m_rot, 1) + phi)
    return m_tan

if __name__ == "__main__":
    im = cv2.imread('untitled1.png')
    # any value below 250 is just the droplet without the substrate
    drp = evaluate_droplet(im, 250)
Original image:
I made a mistake in calc_slope_of_ellipse:
x_rot = (x - x0)*cos(phi) + (y - y0)*sin(phi)
should be
x_rot = (x - x0)*cos(phi) - (y - y0)*sin(phi)
this fixes the wrong sign of the slope at y=47.
I replaced the atan2:
m_rot = -(b**2 * x_rot)/(a**2 * y_rot) # slope of tangent to unrotated ellipse
#rotate tangent line back to angle of the rotated ellipse
m_tan = tan(atan2(m_rot,1) + phi)
with
tan_a = x_rot/a**2
tan_b = y_rot/b**2
#rotate tangent line back to angle of the rotated ellipse
tan_a_r = tan_a*cos(phi) + tan_b*sin(phi)
tan_b_r = tan_b*cos(phi) - tan_a*sin(phi)
m_tan = - (tan_a_r / tan_b_r)
This fixes the weird behaviour for certain cases (y=62).
Complete function:
def calc_slope_of_ellipse(ellipse_pars, x, y):
    (x0, y0, a, b, phi) = ellipse_pars
    # transform to non-rotated ellipse centered at origin
    x_rot = (x - x0) * cos(phi) - (y - y0) * sin(phi)
    y_rot = (x - x0) * sin(phi) + (y - y0) * cos(phi)
    # Ax + By = C
    tan_a = x_rot / a**2
    tan_b = y_rot / b**2
    # rotate tangent line back to angle of the rotated ellipse
    tan_a_r = tan_a * cos(phi) + tan_b * sin(phi)
    tan_b_r = tan_b * cos(phi) - tan_a * sin(phi)
    m_tan = - (tan_a_r / tan_b_r)
    return m_tan

How to use the cv::RotatedRect in Python

I would like to use cv::RotatedRect in Python. However I am unable to find its namespace. Help would be greatly appreciated!
EDIT:
I need this to implement essentially this.
Compute the rotated image's size yourself:
import math
import cv2

def rotate(img, degree):
    h, w = img.shape[:2]
    center = (w // 2, h // 2)
    dst_h = int(w * math.fabs(math.sin(math.radians(degree))) + h * math.fabs(math.cos(math.radians(degree))))
    dst_w = int(h * math.fabs(math.sin(math.radians(degree))) + w * math.fabs(math.cos(math.radians(degree))))
    matrix = cv2.getRotationMatrix2D(center, degree, 1)
    matrix[0, 2] += dst_w // 2 - center[0]
    matrix[1, 2] += dst_h // 2 - center[1]
    dst_img = cv2.warpAffine(img, matrix, (dst_w, dst_h), borderValue=(255, 255, 255))
    return dst_img
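As a side note on the original question: to my knowledge the Python bindings expose no cv2.RotatedRect class; a rotated rectangle is represented as a plain tuple ((cx, cy), (w, h), angle), e.g. as returned by cv2.minAreaRect, and cv2.boxPoints converts it to its four corner points. A short sketch:

import cv2
import numpy as np

# hypothetical contour points
pts = np.array([[10, 10], [100, 30], [90, 120], [5, 95]], dtype=np.float32)

rect = cv2.minAreaRect(pts)    # ((cx, cy), (w, h), angle) - the Python form of cv::RotatedRect
corners = cv2.boxPoints(rect)  # 4x2 array of corner coordinates
print(rect)
print(corners)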

Implementing my own algorithm to scale and rotate images in python

I am trying to implement an algorithm in Python to scale images by a factor or rotate them by a given angle (or both at the same time). I am using OpenCV to handle the images, and I know OpenCV has these functions built in; however, I want to do this myself to better understand image transformations. I believe I calculate the rotation matrix correctly. However, when I try to implement the affine transformation, it does not come out correctly.
import numpy as np
import cv2
import math as m
import sys

img = cv2.imread(sys.argv[1])
angle = sys.argv[2]

#get rotation matrix
def getRMat((cx, cy), angle, scale):
    a = scale*m.cos(angle*np.pi/180)
    b = scale*(m.sin(angle*np.pi/180))
    u = (1-a)*cx-b*cy
    v = b*cx+(1-a)*cy
    return np.array([[a,b,u], [-b,a,v]])

#determine shape of img
h, w = img.shape[:2]
#print h, w
#determine center of image
cx, cy = (w / 2, h / 2)

#calculate rotation matrix
#then grab sine and cosine of the matrix
mat = getRMat((cx,cy), -int(angle), 1)
print mat
cos = np.abs(mat[0,0])
sin = np.abs(mat[0,1])

#calculate new height and width to account for rotation
newWidth = int((h * sin) + (w * cos))
newHeight = int((h * cos) + (w * sin))
#print newWidth, newHeight

mat[0,2] += (newWidth / 2) - cx
mat[1,2] += (newHeight / 2) - cy

#this is how the image SHOULD look
dst = cv2.warpAffine(img, mat, (newWidth, newHeight))

cv2.imshow('dst', dst)
cv2.waitKey(0)
cv2.destroyAllWindows()

#apply transform
#attempt at my own warp affine function...still buggy tho
def warpAff(image, matrix, (width, height)):
    dst = np.zeros((width, height, 3), dtype=np.uint8)
    oldh, oldw = image.shape[:2]
    #print oldh, oldw
    #loop through old img and transform its coords
    for x in range(oldh):
        for y in range(oldw):
            #print y, x
            #transform the coordinates
            u = int(x*matrix[0,0]+y*matrix[0,1]+matrix[0,2])
            v = int(x*matrix[1,0]+y*matrix[1,1]+matrix[1,2])
            #print u, v
            #v -= width / 1.5
            if (u >= 0 and u < height) and (v >= 0 and v < width):
                dst[u,v] = image[x,y]
    return dst

dst = warpAff(img, mat, (newWidth, newHeight))

cv2.imshow('dst', dst)
cv2.waitKey(0)
cv2.destroyAllWindows()
Image I am using for testing
You're applying the rotation backward.
This means that for an angle of 20, instead of rotating 20 degrees clockwise, you rotate 20 degrees counterclockwise. That on its own would be easy to fix—just negate the angle.
But it also means that, for each destination pixel, if no source pixel exactly rotates to it, you end up with an all-black pixel. You could solve that by using any interpolation algorithm, but it's making things more complicated.
If we instead just reverse the process, and instead of calculating the destination (u, v) for each (x, y), we calculate the source (x, y) for every destination (u, v), that solves both problems:
def warpAff(image, matrix, width, height):
    dst = np.zeros((width, height, 3), dtype=np.uint8)
    oldh, oldw = image.shape[:2]
    # Loop over the destination, not the source, to ensure that you cover
    # every destination pixel exactly 1 time, rather than 0-4 times.
    for u in range(width):
        for v in range(height):
            x = u*matrix[0,0]+v*matrix[0,1]+matrix[0,2]
            y = u*matrix[1,0]+v*matrix[1,1]+matrix[1,2]
            intx, inty = int(x), int(y)
            # We could interpolate here by using something like this linear
            # interpolation matrix, but let's keep it simple and not do that.
            # fracx, fracy = x%1, y%1
            # interp = np.array([[fracx*fracy, (1-fracx)*fracy],
            #                    [fracx*(1-fracy), (1-fracx)*(1-fracy)]])
            if 0 < x < oldw and 0 < y < oldh:
                dst[u, v] = image[intx, inty]
    return dst
Now the only remaining problem is that you didn't apply the shift backward, so we end up shifting the image in the wrong direction when we turn everything else around. That's trivial to fix:
mat[0,2] += cx - (newWidth / 2)
mat[1,2] += cy - (newHeight / 2)
You do have one more problem: your code (and this updated code) only works for square images. You're getting height and width backward multiple times, and they almost all cancel out, but apparently one of them doesn't. In general, you're treating your arrays as (width, height) rather than (height, width), but you end up comparing to (original version) or looping over (new version) (height, width). So, if height and width are different, you end up trying to write past the end of the array.
Trying to find all of these and fix them is probably as much work as just starting over and doing it consistently everywhere from the start:
mat = getRMat(cx, cy, int(angle), 1)
cos = np.abs(mat[0,0])
sin = np.abs(mat[0,1])
newWidth = int((h * sin) + (w * cos))
newHeight = int((h * cos) + (w * sin))
mat[0,2] += cx - (newWidth / 2)
mat[1,2] += cy - (newHeight / 2)

def warpAff2(image, matrix, width, height):
    dst = np.zeros((height, width, 3), dtype=np.uint8)
    oldh, oldw = image.shape[:2]
    for u in range(width):
        for v in range(height):
            x = u*matrix[0,0]+v*matrix[0,1]+matrix[0,2]
            y = u*matrix[1,0]+v*matrix[1,1]+matrix[1,2]
            intx, inty = int(x), int(y)
            if 0 < intx < oldw and 0 < inty < oldh:
                pix = image[inty, intx]
                dst[v, u] = pix
    return dst

dst = warpAff2(img, mat, newWidth, newHeight)
It's worth noting that there are much simpler (and more efficient) ways to implement this. If you build a 3x3 square matrix, you can vectorize the multiplication. Also, you can create the matrix more simply by just multiplying a shift matrix @ a rotation matrix @ an unshift matrix instead of manually fixing things up after the fact. But hopefully this version, since it's as close as possible to your original, should be easiest to understand. A sketch of that idea follows.
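For instance, a minimal sketch of that idea (my own illustration, not part of the original answer): build the 3x3 homogeneous matrices, compose them with @, and map all destination coordinates back to source coordinates in one vectorized step.

import numpy as np

def rotate_vectorized(image, angle_deg):
    h, w = image.shape[:2]
    a = np.radians(angle_deg)
    cos, sin = np.cos(a), np.sin(a)

    new_w = int(h * abs(sin) + w * abs(cos))
    new_h = int(h * abs(cos) + w * abs(sin))

    # shift @ rotation @ unshift, all as 3x3 homogeneous matrices;
    # the composite maps destination coords back to source coords
    unshift = np.array([[1, 0, -new_w / 2], [0, 1, -new_h / 2], [0, 0, 1]])
    rot = np.array([[cos, -sin, 0], [sin, cos, 0], [0, 0, 1]])
    shift = np.array([[1, 0, w / 2], [0, 1, h / 2], [0, 0, 1]])
    m = shift @ rot @ unshift

    # all destination pixel coordinates at once
    vs, us = np.indices((new_h, new_w))
    coords = np.stack([us.ravel(), vs.ravel(), np.ones(us.size)])
    xs, ys = (m @ coords)[:2].astype(int)

    dst = np.zeros((new_h, new_w, 3), dtype=image.dtype)
    ok = (0 <= xs) & (xs < w) & (0 <= ys) & (ys < h)
    dst[vs.ravel()[ok], us.ravel()[ok]] = image[ys[ok], xs[ok]]
    return dst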

overlay a smaller image on a larger image python OpenCv

Hi, I am creating a program that replaces a face in an image with someone else's face. However, I am stuck on trying to insert the new face into the original, larger image. I have researched ROI and addWeighted (needs the images to be the same size), but I haven't found a way to do this in Python. Any advice is great. I am new to OpenCV.
I am using the following test images:
smaller_image:
larger_image:
Here is my Code so far... a mixer of other samples:
import cv2
import cv2.cv as cv
import sys
import numpy

def detect(img, cascade):
    rects = cascade.detectMultiScale(img, scaleFactor=1.1, minNeighbors=3, minSize=(10, 10), flags=cv.CV_HAAR_SCALE_IMAGE)
    if len(rects) == 0:
        return []
    rects[:,2:] += rects[:,:2]
    return rects

def draw_rects(img, rects, color):
    for x1, y1, x2, y2 in rects:
        cv2.rectangle(img, (x1, y1), (x2, y2), color, 2)

if __name__ == '__main__':
    if len(sys.argv) != 2:  ## Check for error in usage syntax
        print "Usage : python faces.py <image_file>"
    else:
        img = cv2.imread(sys.argv[1], cv2.CV_LOAD_IMAGE_COLOR)  ## Read image file
        if (img == None):
            print "Could not open or find the image"
        else:
            cascade = cv2.CascadeClassifier("haarcascade_frontalface_alt.xml")
            gray = cv2.cvtColor(img, cv.CV_BGR2GRAY)
            gray = cv2.equalizeHist(gray)
            rects = detect(gray, cascade)

            ## Extract face coordinates
            x1 = rects[0][3]
            y1 = rects[0][0]
            x2 = rects[0][4]
            y2 = rects[0][5]
            y = y2 - y1
            x = x2 - x1
            ## Extract face ROI
            faceROI = gray[x1:x2, y1:y2]
            ## Show face ROI
            cv2.imshow('Display face ROI', faceROI)
            small = cv2.imread("average_face.png", cv2.CV_LOAD_IMAGE_COLOR)
            print "here"
            small = cv2.resize(small, (x, y))
            cv2.namedWindow('Display image')  ## create window for display
            cv2.imshow('Display image', small)  ## Show image in the window
            print "size of image: ", img.shape  ## print size of image
            cv2.waitKey(1000)
A simple way to achieve what you want:
import cv2
s_img = cv2.imread("smaller_image.png")
l_img = cv2.imread("larger_image.jpg")
x_offset=y_offset=50
l_img[y_offset:y_offset+s_img.shape[0], x_offset:x_offset+s_img.shape[1]] = s_img
Update
I suppose you want to take care of the alpha channel too. Here is a quick and dirty way of doing so:
s_img = cv2.imread("smaller_image.png", -1)

y1, y2 = y_offset, y_offset + s_img.shape[0]
x1, x2 = x_offset, x_offset + s_img.shape[1]

alpha_s = s_img[:, :, 3] / 255.0
alpha_l = 1.0 - alpha_s

for c in range(0, 3):
    l_img[y1:y2, x1:x2, c] = (alpha_s * s_img[:, :, c] +
                              alpha_l * l_img[y1:y2, x1:x2, c])
Using @fireant's idea, I wrote up a function to handle overlays. This works well for any position argument (including negative positions).
def overlay_image_alpha(img, img_overlay, x, y, alpha_mask):
    """Overlay `img_overlay` onto `img` at (x, y) and blend using `alpha_mask`.

    `alpha_mask` must have same HxW as `img_overlay` and values in range [0, 1].
    """
    # Image ranges
    y1, y2 = max(0, y), min(img.shape[0], y + img_overlay.shape[0])
    x1, x2 = max(0, x), min(img.shape[1], x + img_overlay.shape[1])

    # Overlay ranges
    y1o, y2o = max(0, -y), min(img_overlay.shape[0], img.shape[0] - y)
    x1o, x2o = max(0, -x), min(img_overlay.shape[1], img.shape[1] - x)

    # Exit if nothing to do
    if y1 >= y2 or x1 >= x2 or y1o >= y2o or x1o >= x2o:
        return

    # Blend overlay within the determined ranges
    img_crop = img[y1:y2, x1:x2]
    img_overlay_crop = img_overlay[y1o:y2o, x1o:x2o]
    alpha = alpha_mask[y1o:y2o, x1o:x2o, np.newaxis]
    alpha_inv = 1.0 - alpha

    img_crop[:] = alpha * img_overlay_crop + alpha_inv * img_crop
Example usage:
import numpy as np
from PIL import Image
# Prepare inputs
x, y = 50, 0
img = np.array(Image.open("img_large.jpg"))
img_overlay_rgba = np.array(Image.open("img_small.png"))
# Perform blending
alpha_mask = img_overlay_rgba[:, :, 3] / 255.0
img_result = img[:, :, :3].copy()
img_overlay = img_overlay_rgba[:, :, :3]
overlay_image_alpha(img_result, img_overlay, x, y, alpha_mask)
# Save result
Image.fromarray(img_result).save("img_result.jpg")
Result:
If you encounter errors or unusual outputs, please ensure:
img should not contain an alpha channel. (e.g. If it is RGBA, convert to RGB first.)
img_overlay has the same number of channels as img.
Based on fireant's excellent answer above, here is the alpha blending, but a bit more human-legible. You may need to swap 1.0-alpha and alpha depending on which direction you're merging (mine is swapped from fireant's answer).

# assumed in scope: s_img (overlay, with alpha channel), l_img (background),
# (ox, oy) offsets into the overlay, (bx, by) offsets into the background,
# and the common blend width/height
for c in range(0, 3):
    alpha = s_img[oy:oy+height, ox:ox+width, 3] / 255.0
    color = s_img[oy:oy+height, ox:ox+width, c] * (1.0 - alpha)
    beta = l_img[by:by+height, bx:bx+width, c] * alpha
    l_img[by:by+height, bx:bx+width, c] = color + beta
Here it is:
import cv2
import numpy as np

def put4ChannelImageOn4ChannelImage(back, fore, x, y):
    rows, cols, channels = fore.shape
    trans_indices = fore[..., 3] != 0  # Where not transparent
    overlay_copy = back[y:y+rows, x:x+cols]
    overlay_copy[trans_indices] = fore[trans_indices]
    back[y:y+rows, x:x+cols] = overlay_copy

# test
background = np.zeros((1000, 1000, 4), np.uint8)
background[:] = (127, 127, 127, 1)
overlay = cv2.imread('imagee.png', cv2.IMREAD_UNCHANGED)
put4ChannelImageOn4ChannelImage(background, overlay, 5, 5)
A simple function that blits an image front onto an image back and returns the result. It works with both 3 and 4-channel images and deals with the alpha channel. Overlaps are handled as well.
The output image has the same size as back, but always 4 channels.
The output alpha channel is given by (u+v)/(1+uv) where u,v are the alpha channels of the front and back image and -1 <= u,v <= 1. Where there is no overlap with front, the alpha value from back is taken.
import cv2

def merge_image(back, front, x, y):
    # convert to rgba
    if back.shape[2] == 3:
        back = cv2.cvtColor(back, cv2.COLOR_BGR2BGRA)
    if front.shape[2] == 3:
        front = cv2.cvtColor(front, cv2.COLOR_BGR2BGRA)

    # crop the overlay from both images
    bh, bw = back.shape[:2]
    fh, fw = front.shape[:2]
    x1, x2 = max(x, 0), min(x + fw, bw)
    y1, y2 = max(y, 0), min(y + fh, bh)
    front_cropped = front[y1-y:y2-y, x1-x:x2-x]
    back_cropped = back[y1:y2, x1:x2]

    alpha_front = front_cropped[:, :, 3:4] / 255
    alpha_back = back_cropped[:, :, 3:4] / 255

    # replace an area in result with overlay
    result = back.copy()
    print(f'af: {alpha_front.shape}\nab: {alpha_back.shape}\nfront_cropped: {front_cropped.shape}\nback_cropped: {back_cropped.shape}')
    result[y1:y2, x1:x2, :3] = alpha_front * front_cropped[:, :, :3] + (1 - alpha_front) * back_cropped[:, :, :3]
    result[y1:y2, x1:x2, 3:4] = (alpha_front + alpha_back) / (1 + alpha_front * alpha_back) * 255

    return result
To just add an alpha channel to s_img, I use cv2.addWeighted before the line

l_img[y_offset:y_offset+s_img.shape[0], x_offset:x_offset+s_img.shape[1]] = s_img

as follows:

s_img = cv2.addWeighted(l_img[y_offset:y_offset+s_img.shape[0], x_offset:x_offset+s_img.shape[1]], 0.5, s_img, 0.5, 0)
If, when attempting to write to the destination image using any of the answers above, you get the following error:

ValueError: assignment destination is read-only

a quick potential fix is to set the WRITEABLE flag to true:

img.setflags(write=1)
A simple 4-on-4 pasting function that works:
def paste(background, foreground, pos=(0, 0)):
    # get position and crop pasting area if needed
    x = pos[0]
    y = pos[1]
    bgWidth = background.shape[0]
    bgHeight = background.shape[1]
    frWidth = foreground.shape[0]
    frHeight = foreground.shape[1]
    width = bgWidth - x
    height = bgHeight - y
    if frWidth < width:
        width = frWidth
    if frHeight < height:
        height = frHeight

    # normalize alpha channels from 0-255 to 0-1
    alpha_background = background[x:x+width, y:y+height, 3] / 255.0
    alpha_foreground = foreground[:width, :height, 3] / 255.0

    # set adjusted colors
    for color in range(0, 3):
        fr = alpha_foreground * foreground[:width, :height, color]
        bg = alpha_background * background[x:x+width, y:y+height, color] * (1 - alpha_foreground)
        background[x:x+width, y:y+height, color] = fr + bg

    # set adjusted alpha and denormalize back to 0-255
    background[x:x+width, y:y+height, 3] = (1 - (1 - alpha_foreground) * (1 - alpha_background)) * 255

    return background
I reworked @fireant's concept to allow for optional alpha masks and allow any x or y, including values outside of the bounds of the image. It will crop to the bounds.
def overlay_image_alpha(img, img_overlay, x, y, alpha_mask=None):
    """Overlay `img_overlay` onto `img` at (x, y) and blend using optional `alpha_mask`.

    `alpha_mask` must have same HxW as `img_overlay` and values in range [0, 1].
    """
    if y < 0 or y + img_overlay.shape[0] > img.shape[0] or x < 0 or x + img_overlay.shape[1] > img.shape[1]:
        y_origin = 0 if y > 0 else -y
        y_end = img_overlay.shape[0] if y < 0 else min(img.shape[0] - y, img_overlay.shape[0])

        x_origin = 0 if x > 0 else -x
        x_end = img_overlay.shape[1] if x < 0 else min(img.shape[1] - x, img_overlay.shape[1])

        img_overlay_crop = img_overlay[y_origin:y_end, x_origin:x_end]
        alpha = alpha_mask[y_origin:y_end, x_origin:x_end] if alpha_mask is not None else None
    else:
        img_overlay_crop = img_overlay
        alpha = alpha_mask

    y1 = max(y, 0)
    y2 = min(img.shape[0], y1 + img_overlay_crop.shape[0])

    x1 = max(x, 0)
    x2 = min(img.shape[1], x1 + img_overlay_crop.shape[1])

    img_crop = img[y1:y2, x1:x2]
    img_crop[:] = alpha * img_overlay_crop + (1.0 - alpha) * img_crop if alpha is not None else img_overlay_crop
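A short usage sketch for the reworked function (the file names are illustrative assumptions; the calling convention matches the earlier example):

import numpy as np
from PIL import Image

img = np.array(Image.open("img_large.jpg"))[:, :, :3].copy()
overlay_rgba = np.array(Image.open("img_small.png"))

# negative offsets are fine; the function crops to the image bounds
overlay_image_alpha(img, overlay_rgba[:, :, :3], -20, -20,
                    alpha_mask=overlay_rgba[:, :, 3] / 255.0)
Image.fromarray(img).save("img_result.jpg")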
