Perspective transform with whole image - python

So I have four points in an array A, denoting the corners of a rectangular object (but not a rectangle when projected onto the image plane). I know the size of the rectangle, so I can calculate the perspective transform with
cv2.getPerspectiveTransform(four_corners, np.array([[0, 0], [0, height], [width, height], [width, 0]], dtype=np.float32))
Then I can transform the image with cv2.warpPerspective.
The problem (which is also demonstrated in another person's question, Counting aspect ratio of Perspective Transform destination image) is that the warped result is cropped. Only the region inside the four corners is in the final image, while I want the whole original image to be warped into the final result.
How do I achieve that?
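One common way to keep the whole image is to warp the image's own corners with the homography first, take their bounding box, and prepend a translation so that the box starts at the origin. Below is a minimal NumPy sketch of that idea. The function name full_view_homography is hypothetical, and it assumes no image corner is mapped behind the camera (otherwise the bounds blow up).

```python
import numpy as np


def full_view_homography(H, width, height):
    """Shift homography H so the whole warped source image has non-negative coordinates.

    Returns the shifted 3x3 matrix and the (out_width, out_height) needed to hold it.
    """
    H = np.asarray(H, dtype=np.float64)
    # Corners of the source image as homogeneous column vectors.
    corners = np.array([[0, 0, 1],
                        [width, 0, 1],
                        [width, height, 1],
                        [0, height, 1]], dtype=np.float64).T
    warped = H @ corners
    warped = warped[:2] / warped[2]  # perspective divide
    x_min, y_min = np.floor(warped.min(axis=1))
    x_max, y_max = np.ceil(warped.max(axis=1))
    # Translation that moves the top-left of the bounding box to (0, 0).
    T = np.array([[1, 0, -x_min],
                  [0, 1, -y_min],
                  [0, 0, 1]], dtype=np.float64)
    return T @ H, (int(x_max - x_min), int(y_max - y_min))
```

The returned matrix and size could then be passed to cv2.warpPerspective(image, H_shifted, out_size) in place of the original transform and (width, height).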

Related

Image Processing: how to imwarp with simple mask on destination?

Following my own question from 4 years ago, this time in Python only:
I am looking for a way to perform texture mapping into a small region in a destination image, defined by 4 corners given as (x, y) pixel coordinates. This region is not necessarily rectangular. It is a perspective projection of some rectangle onto the image plane.
I would like to map some (rectangular) texture into the mask defined by those corners.
Mapping directly by forward-mapping the texture will not work properly, as source pixels will be mapped to non-integer locations in the destination.
This problem is usually solved by inverse-warping from the destination to the source, then coloring according to some interpolation.
OpenCV's warpPerspective doesn't work here, as it can't take a mask as input.
Inverse-warping the entire destination and then masking is not acceptable because the majority of the computation is redundant.
Is there a built-in OpenCV (or other) function that accomplishes the above requirements?
If not, what is a good way to get a list of pixels from my ROI defined by corners, so that I can pass it to projectPoints?
Example background image:
I want to fill the area outlined by the red lines (defined by its corners) with some other texture, say this one
Mapping between them can be obtained by mapping the texture's corners to the ROI corners with cv2.getPerspectiveTransform
For future generations, here is how to warp only the pixels within the bounding box of the warped corner points, as @Micka suggested.
Here banner is the grass image, and banner_coords_2d are the corners of the red region on image, which is meme-man.
import cv2
import numpy as np


def transform_banner(banner_coords_2d, banner, image):
    # show_points_on_image("banner corners", image, banner_coords_2d)
    banner_height, banner_width, _ = banner.shape
    # texture corners; their order must match the order of banner_coords_2d
    src_banner_points = np.float32([
        [0, 0],
        [banner_width - 1, 0],
        [0, banner_height - 1],
        [banner_width - 1, banner_height - 1],
    ])
    # only warp to size of bbox of warped corners, not all of the image
    warped_left = np.round(np.min(banner_coords_2d[:, 0])).astype(int)
    warped_right = np.round(np.max(banner_coords_2d[:, 0])).astype(int)
    warped_top = np.round(np.min(banner_coords_2d[:, 1])).astype(int)
    warped_bottom = np.round(np.max(banner_coords_2d[:, 1])).astype(int)
    warped_width = int(warped_right - warped_left)
    warped_height = int(warped_bottom - warped_top)
    # express the destination corners relative to the bbox origin
    dst_banner_points = banner_coords_2d.astype(np.float32)
    dst_banner_points[:, 0] -= warped_left
    dst_banner_points[:, 1] -= warped_top
    tform = cv2.getPerspectiveTransform(src_banner_points, dst_banner_points)
    warped_banner = cv2.warpPerspective(banner, tform, (warped_width, warped_height))
    # cv2.imshow("warped_banner", warped_banner)
    image_with_banner = image.copy()
    # roi is a view into image_with_banner, so writing to it updates the copy
    roi = image_with_banner[warped_top:warped_bottom, warped_left:warped_right]
    # values the warp left at 0 are treated as lying outside the banner
    mask = warped_banner != 0
    roi[mask] = warped_banner[mask]
    # cv2.imshow("image_with_banner", image_with_banner)
    return image_with_banner
This can likely be done more neatly; I am open to edits.
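For the other half of the question (getting the list of pixels inside the ROI and inverse-mapping only those), here is a rough NumPy-only sketch. All three function names are hypothetical. It assumes the quad is convex with corners given in a consistent winding order, matches them to the texture corners in the order top-left, top-right, bottom-right, bottom-left, and uses nearest-neighbour sampling instead of proper interpolation.

```python
import numpy as np


def homography_from_points(src, dst):
    """Estimate the 3x3 homography mapping src points to dst points (DLT via SVD)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=np.float64))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]


def pixels_in_convex_quad(quad):
    """Return integer (x, y) pixel coordinates lying inside or on a convex quad."""
    quad = np.asarray(quad, dtype=np.float64)
    x0, y0 = np.floor(quad.min(axis=0)).astype(int)
    x1, y1 = np.ceil(quad.max(axis=0)).astype(int)
    xs, ys = np.meshgrid(np.arange(x0, x1 + 1), np.arange(y0, y1 + 1))
    pts = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float64)
    # A point is inside if it is on the same side of all four edges.
    signs = []
    for a, b in zip(quad, np.roll(quad, -1, axis=0)):
        edge, rel = b - a, pts - a
        signs.append(edge[0] * rel[:, 1] - edge[1] * rel[:, 0])
    signs = np.stack(signs)
    inside = np.all(signs >= 0, axis=0) | np.all(signs <= 0, axis=0)
    return pts[inside].astype(int)


def paste_texture(image, texture, quad):
    """Fill the quad in image with texture by inverse-mapping only the quad's pixels."""
    th, tw = texture.shape[:2]
    tex_corners = [(0, 0), (tw - 1, 0), (tw - 1, th - 1), (0, th - 1)]
    H_inv = homography_from_points(quad, tex_corners)  # destination -> texture
    px = pixels_in_convex_quad(quad)
    # Drop pixels that fall outside the destination image.
    keep = ((px[:, 0] >= 0) & (px[:, 0] < image.shape[1]) &
            (px[:, 1] >= 0) & (px[:, 1] < image.shape[0]))
    px = px[keep]
    homog = np.column_stack([px, np.ones(len(px))]) @ H_inv.T
    uv = homog[:, :2] / homog[:, 2:3]
    u = np.clip(np.rint(uv[:, 0]), 0, tw - 1).astype(int)
    v = np.clip(np.rint(uv[:, 1]), 0, th - 1).astype(int)
    out = image.copy()
    out[px[:, 1], px[:, 0]] = texture[v, u]
    return out
```

Unlike the bounding-box version above, this never touches pixels outside the quad, at the cost of doing the sampling by hand.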

How to get Bird's Eye View from KITTI by Projection Matrix?

The goal is to get the Bird's Eye View from KITTI images (dataset), and I have the Projection Matrix (3x4).
There are many ways to generate transformation matrices. For Bird's Eye View I have read about some mathematical expressions, like:
$H_{12} = H_2 H_1^{-1} = A R A^{-1} = P A^{-1}$ in OpenCV - Projection, homography matrix and bird's eye view
and x = Pi * Tr * X in kitti dataset camera projection matrix
but none of these options worked for my purpose.
PYTHON CODE
import numpy as np
import cv2
image = cv2.imread('Data/RGB/000007.png')
maxHeight, maxWidth = image.shape[:2]
# M has 3x4 dimensions
M = np.array(([721.5377, 0.0, 609.5593, 44.85728], [0.0, 721.5377, 72.854, 0.2163791], [0.0, 0.0, 1.0, .002745884]))
# Here an M matrix with 3x3 dimensions is required
warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))
# show the original and warped images
cv2.imshow("Original", image)
cv2.imshow("Warped", warped)
cv2.waitKey(0)
I need to know how to manage the Projection Matrix for getting Bird's Eye View.
So far, everything I've tried throws warped images at me, without information even close to what I need.
This is an example of an image from the KITTI database.
This is another example of an image from the KITTI database.
On the left, images are shown detecting cars in 3D (above) and 2D (below). On the right is the Bird's Eye View that I want to obtain. Therefore, I need to obtain the transformation matrix to transform the coordinates of the boxes that delimit the cars.
Here is my code to manually build a bird's eye view transform:
cv::Mat1d CameraModel::getInversePerspectiveMapping(double pixelPerMeter, cv::Point const & origin) const {
    double f = pixelPerMeter * cameraPosition()[2];
    cv::Mat1d R(3,3);
    R << 0, 1, 0,
         1, 0, 0,
         0, 0, 1;
    cv::Mat1d K(3,3);
    K << f, 0, origin.x,
         0, f, origin.y,
         0, 0, 1;
    cv::Mat1d transformtoGround = K * R * mCameraToCarMatrix(cv::Range(0,3), cv::Range(0,3));
    return transformtoGround * mIntrinsicMatrix.inv();
}
The member variables/functions used inside the functions are
mCameraToCarMatrix: a 4x4 matrix holding the homogeneous rigid transformation from the camera's coordinate system to the car's coordinate system. The camera's axes are x-right, y-down, z-forward. The car's axes are x-forward, y-left, z-up. Within this function only the rotation part of mCameraToCarMatrix is used.
mIntrinsicMatrix: the 3x3 matrix holding the camera's intrinsic parameters
cameraPosition()[2]: the Z-coordinate (height) of the camera in the car's coordinate frame. It's the same as mCameraToCarMatrix(2,3).
The function parameters:
pixelPerMeter: the resolution of the bird's eye view image. A distance of 1 meter on the XY plane will translate to pixelPerMeter pixels in the bird's eye view image.
origin: the camera's position in the bird's eye view image
You can pass the transform matrix to cv::initUndistortRectifyMap() as newCameraMatrix and then use cv::remap to create the bird's eye view image.
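For the Python side of the question, a 3x4 projection matrix can be reduced to a 3x3 homography once you restrict it to points on the ground plane. The sketch below is only an illustration under stated assumptions: flat ground, camera axes x-right, y-down, z-forward, and a camera height above the ground that you have to supply (1.65 m is used in the test purely as an assumed value). The function names and the BEV layout (camera at the bottom centre, z_max metres of road shown) are hypothetical choices.

```python
import numpy as np


def ground_homography(P, cam_height):
    """Map ground-plane coordinates (X, Z, 1) in the camera frame to image pixels.

    Assumes the ground is the plane y = cam_height (y points down).
    """
    P = np.asarray(P, dtype=np.float64)
    # x = P @ [X, h, Z, 1] = X * P[:, 0] + Z * P[:, 2] + (h * P[:, 1] + P[:, 3])
    return np.column_stack([P[:, 0], P[:, 2], cam_height * P[:, 1] + P[:, 3]])


def bev_to_image_homography(P, cam_height, pixels_per_meter, bev_size, z_max):
    """Homography taking a bird's eye view pixel (u, v) to the matching image pixel."""
    bev_w, _ = bev_size
    ppm = float(pixels_per_meter)
    # BEV pixel -> metres: X = (u - bev_w / 2) / ppm, Z = z_max - v / ppm
    bev_to_ground = np.array([
        [1.0 / ppm, 0.0, -bev_w / (2.0 * ppm)],
        [0.0, -1.0 / ppm, z_max],
        [0.0, 0.0, 1.0],
    ])
    return ground_homography(P, cam_height) @ bev_to_ground
```

Because this matrix maps destination (BEV) pixels to source (camera image) pixels, it would be used as cv2.warpPerspective(image, H, bev_size, flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP). Anything that is not on the ground plane, such as the cars themselves, will appear smeared in the result.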

OpenCV and Python -- How to superimpose images by specifying coordinates? [duplicate]

This question already has an answer here:
Displaying stitched images together without cutoff using warpAffine
(1 answer)
Closed 5 years ago.
In short, my question is how do I put an image on top of another by specifying specific coordinates for the added image? I would need to extend the "canvas" of the base image as needed so that the added image doesn't get cropped.
Here's the extended version:
My project is to take pictures extracted from a drone video and make a rough map with them, by aligning one photo with the last. I know there is software I can use to do this, like Agisoft Photoscan, but my goal is to create a more lightweight, rough solution.
So here's my plan, which I intend to do with each frame:
1. Use estimateRigidTransform to generate the transformation matrix that aligns curr_photo with the last photo, base
2. Calculate the bounding rectangle needed to enclose the resulting image (using transformations of the four corners)
3. Modify the transformation matrix so that the top left of the bounding box is at the origin
4. Apply the transformation to the current photo, using the bounding rectangle's width and height to ensure none of the resulting image gets cropped
5. Superimpose the current image on the last image (making sure no cropping of either image occurs), by adding curr_image to base at the proper coordinates. This step is what I am asking about.
Here is the code that does steps one to four.
import numpy as np
import cv2
base = cv2.imread("images/frame_03563.jpg")
curr_photo = cv2.imread("images/frame_03564.jpg")
height, width = curr_photo.shape[:2]
# Step 1
# which transformation is required to go from curr_photo to base?
# (estimateRigidTransform was removed in OpenCV 4; this code assumes OpenCV 3.x or earlier)
transformation = cv2.estimateRigidTransform(curr_photo, base, True)
# Step 2
# add a line to the affine transformation matrix so it can be used by
# perspectiveTransform
three_by_three = np.array([
    transformation[0],
    transformation[1],
    [0, 0, 1]], dtype="float32")
# get corners of curr_photo (to be transformed)
corners = np.array([
    [0, 0],
    [width - 1, 0],
    [width - 1, height - 1],
    [0, height - 1]
])
# where do the corners of the image go
trans_corners = cv2.perspectiveTransform(np.float32([corners]), three_by_three)
# get the bounding rectangle for the four corner points (and thus, the transformed image)
bx, by, bwidth, bheight = cv2.boundingRect(trans_corners)
# Step 3
# modify transformation matrix so that the top left of the bounding box is at the origin
transformation[0][2] = transformation[0][2] - bx
transformation[1][2] = transformation[1][2] - by
# Step 4
# transform the image in a window the size of its bounding rectangle (so no cropping)
mod_curr_photo = cv2.warpAffine(curr_photo, transformation, (bwidth, bheight))
# for viewing
cv2.imshow("base", base)
cv2.imshow("current photo", curr_photo)
cv2.imshow("image2 transformed to image 1", mod_curr_photo)
cv2.waitKey()
I've also attached two sample images. I used the first one as the base, but it works either way.
Edit: I have now turned the answer linked below into a Python module, which you can now grab from GitHub here.
I answered this question a few weeks ago. The answer should contain everything needed to accomplish what you're after; the only thing I don't discuss there is alpha blending or other techniques to blend the borders of the images together as you would with a panorama or similar.
To avoid cropping the warped photo, you need to calculate the required padding beforehand, because the warp can map pixels to negative coordinates, which are not drawn. So calculate the warped locations first, pad your image enough to account for coordinates outside the image bounds, and then add the corresponding translation to your warp matrix so that everything lands at positive coordinates.
This allows you to create an image like this:
Image from Oxford's VGG.
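As a rough illustration of the padding idea applied to step five of the question, here is a NumPy-only sketch that places an already warped image on a canvas large enough for both images. The function name paste_on_extended_canvas is hypothetical, offset_xy corresponds to the (bx, by) of the bounding rectangle in the question's code, and black pixels in the overlay are assumed to be empty, so no blending is done.

```python
import numpy as np


def paste_on_extended_canvas(base, overlay, offset_xy):
    """Superimpose overlay on base, growing the canvas so neither image is cropped.

    offset_xy is the overlay's top-left corner in base coordinates (may be negative).
    """
    bx, by = int(offset_xy[0]), int(offset_xy[1])
    bh, bw = base.shape[:2]
    oh, ow = overlay.shape[:2]
    # Bounding box of both images in base coordinates.
    x_min, y_min = min(0, bx), min(0, by)
    x_max, y_max = max(bw, bx + ow), max(bh, by + oh)
    canvas = np.zeros((y_max - y_min, x_max - x_min) + base.shape[2:], dtype=base.dtype)
    # The base image is shifted by the padding added on the top and left.
    canvas[-y_min:-y_min + bh, -x_min:-x_min + bw] = base
    oy, ox = by - y_min, bx - x_min
    region = canvas[oy:oy + oh, ox:ox + ow]  # view into canvas
    # Treat all-zero overlay pixels as empty so they do not erase the base.
    mask = overlay.any(axis=-1) if overlay.ndim == 3 else overlay != 0
    region[mask] = overlay[mask]
    return canvas
```

With the variables from the question this would be called as paste_on_extended_canvas(base, mod_curr_photo, (bx, by)).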

Scale and Centre image - Skimage

I am trying to scale a set of images in Skimage. I am using the following code, which works well, except that the rescaled image (by a factor of 2) is now anchored at the top-left (see below). I would like the image to remain at the original centre. Is there a simple way to achieve this? My aim is for the saved copy of the image (e.g. as a jpg file) to remain centered. My question does not concern the display of the image through imshow. E.g. when I save the image as shown below, the image is anchored at the upper left, which causes issues with subsequent steps in my code.
# Part of the code
import matplotlib.pyplot as plt
import scipy
import skimage.transform

tform = skimage.transform.SimilarityTransform(scale=2, rotation=0, translation=(0, 0))
rotated = skimage.transform.warp(test, tform)
plt.imshow(rotated)
# note: scipy.misc.imsave was removed in SciPy 1.2
scipy.misc.imsave('rotated.jpg', rotated)
Scaling is itself a subset of affine transformations.
The affine transformation matrix for pure scaling is defined as
s_x, 0, 0
0, s_y, 0
0, 0, 1
where s_x and s_y are the scaling factors in the respective dimensions (defined relative to the origin at (0,0)). If you want your image to be scaled relative to another point rather than the origin, you first translate the image so that the center of scaling is at the origin, then scale, then translate back. This amounts to multiplying the two translation matrices with the scale matrix. I had a similar problem with rotation, which can be found here; the same principle applies to this problem. The result is
s_x, 0, (-s_x*x)+x
0, s_y, (-s_y*y)+y
0, 0, 1
where x and y are half the size of your image in the respective dimensions.
The resulting matrix can be used with:
skimage.transform.AffineTransform(matrix)
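The matrix above can be checked by building it from its three factors. This is a small NumPy sketch with a hypothetical helper name, scale_about_point. Keep in mind that, as far as I understand the scikit-image docs, skimage.transform.warp treats the transform it is given as the inverse map (output coordinates to input coordinates), so you may need to pass the inverse of this matrix to get the direction you expect.

```python
import numpy as np


def scale_about_point(s_x, s_y, x, y):
    """Affine matrix that scales by (s_x, s_y) while keeping the point (x, y) fixed."""
    to_origin = np.array([[1.0, 0.0, -x],
                          [0.0, 1.0, -y],
                          [0.0, 0.0, 1.0]])
    scale = np.diag([s_x, s_y, 1.0])
    back = np.array([[1.0, 0.0, x],
                     [0.0, 1.0, y],
                     [0.0, 0.0, 1.0]])
    # Applied right to left: move centre to origin, scale, move back.
    return back @ scale @ to_origin
```

For a 100x80 image scaled by 2 about its centre, scale_about_point(2, 2, 50, 40) has translation terms (-2*50)+50 and (-2*40)+40, matching the formula above.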

Perspective Warping in OpenCV based on known camera orientation

I am working on a project which attempts to remove the perspective distortion from an image based on the known orientation of the camera. My thinking is that I can create a rotational matrix based on the known X, Y, and Z orientations of the camera. I can then apply those matrices to the image via the WarpPerspective method.
In my script (written in Python) I have created three rotational matrices, each based on an orientation angle. I have gotten to a point where I am stuck on two issues. First, when I load each individual matrix into the WarpPerspective method, it doesn't seem to be working correctly. Whenever I warp an image on one axis it appears to significantly overwarp the image. The contents of the image are only recognizable if I limit the orientation angle to around 1 degree or less.
Secondly, how do I combine the three rotational matrices into a single matrix to be loaded into the WarpPerspective method? Can I import a 3x3 rotational matrix into that method, or do I have to create a 4x4 projective matrix? Below is the code that I am working on.
Thank you for your help.
CR
from numpy import *
import cv
#Sets angle of camera and converts to radians
x = -14 * (pi/180)
y = 20 * (pi/180)
z = 15 * (pi/180)
#Creates the Rotational Matrices
rX = array([[1, 0, 0], [0, cos(x), -sin(x)], [0, sin(x), cos(x)]])
rY = array([[cos(y), 0, -sin(y)], [0, 1, 0], [sin(y), 0, cos(y)]])
rZ = array([[cos(z), sin(z), 0], [-sin(z), cos(z), 0], [0, 0, 1]])
#Converts to CVMat format
X = cv.fromarray(rX)
Y = cv.fromarray(rY)
Z = cv.fromarray(rZ)
#Imports image file and creates destination filespace
im = cv.LoadImage("reference_image.jpg")
dst = cv.CreateImage(cv.GetSize(im), cv.IPL_DEPTH_8U, 3)
#Warps Image
cv.WarpPerspective(im, dst, X)
#Display
cv.NamedWindow("distorted")
cv.ShowImage("distorted", im)
cv.NamedWindow("corrected")
cv.ShowImage("corrected", dst)
cv.WaitKey(0)
cv.DestroyWindow("distorted")
cv.DestroyWindow("corrected")
You are doing several things wrong. First, you can't rotate about the x or y axis without a camera model. Imagine a camera with an incredibly wide field of view. You could hold it really close to an object and see the entire thing, but if that object rotated, its edges would seem to fly towards you very quickly with a strong perspective distortion. On the other hand, a small field of view (think telescope) has very little perspective distortion. A nice place to start is setting your image plane at least as far from the camera as it is wide and putting your object right on the image plane. That is what I did in this example (C++ OpenCV).
The steps are:
1. construct a rotation matrix
2. center the image at the origin
3. rotate the image
4. move the image down the z axis
5. multiply by the camera matrix
6. warp the perspective
// 1
float x = -14 * (M_PI/180);
float y = 20 * (M_PI/180);
float z = 15 * (M_PI/180);
cv::Matx31f rot_vec(x,y,z);
cv::Matx33f rot_mat;
cv::Rodrigues(rot_vec, rot_mat); //converts to a rotation matrix
cv::Matx33f translation1(1, 0, -image.cols/2,
                         0, 1, -image.rows/2,
                         0, 0, 1);
rot_mat(0,2) = 0;
rot_mat(1,2) = 0;
rot_mat(2,2) = 1;
// 2 and 3
cv::Matx33f trans = rot_mat*translation1;
// 4
trans(2,2) += image.rows;
cv::Matx33f camera_mat(image.rows, 0, image.rows/2,
                       0, image.rows, image.rows/2,
                       0, 0, 1);
// 5
cv::Matx33f transform = camera_mat*trans;
// 6
cv::Mat final;
cv::warpPerspective(image, final, cv::Mat(transform), image.size());
This code gave me this output
I did not see Franco's answer until I posted this. He is completely correct, using FindHomography would save you all these steps. Still I hope this is useful.
Just knowing the rotation is not enough unless your images are taken either using a telecentric lens, or with a telephoto lens with a very long focal length (in which case the images are nearly orthographic, and there is no perspective distortion).
Besides, it's not necessary. True, you can undo the perspective foreshortening of one plane in the image by calibrating the camera (i.e. estimating the intrinsic and extrinsic parameters to form the full camera projection matrix).
But you achieve the same result much more simply if you can identify in the image a quadrangle which is the image of a real-world square (or rectangle with known width/height ratio). If you can do that, you can trivially compute the homography matrix that maps the square (rectangle) to the quadrangle, then warp using its inverse.
The Wikipedia page on rotation matrices shows how it is possible to combine the three basic rotation matrices into one.
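To make the last two points concrete in Python: the three basic rotations combine by matrix multiplication, and the result only becomes a usable image warp once it is wrapped in a camera matrix. The sketch below is a NumPy illustration, not the method from either answer. The multiplication order Rz @ Ry @ Rx is one common convention chosen here as an assumption, the focal length in pixels has to come from calibration or a guess, and the homography K R K^-1 describes a pure rotation of the camera about its optical centre.

```python
import numpy as np


def rotation_from_euler(x, y, z):
    """Combine rotations about the x, y and z axes (radians) as R = Rz @ Ry @ Rx."""
    cx, sx = np.cos(x), np.sin(x)
    cy, sy = np.cos(y), np.sin(y)
    cz, sz = np.cos(z), np.sin(z)
    rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return rz @ ry @ rx


def rotation_homography(R, focal, width, height):
    """3x3 homography K @ R @ inv(K) for a camera rotating about its centre.

    Assumes square pixels and a principal point at the image centre.
    """
    K = np.array([[focal, 0.0, width / 2.0],
                  [0.0, focal, height / 2.0],
                  [0.0, 0.0, 1.0]])
    return K @ R @ np.linalg.inv(K)
```

The resulting 3x3 matrix (or its inverse, depending on whether you want to apply or undo the rotation) is what a perspective warp expects; a 4x4 matrix is not needed. A larger focal length corresponds to a narrower field of view and milder perspective distortion.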
