I am trying to scale a set of images in skimage. I am using the following code, which works well, except that the rescaled image (by a factor of 2) ends up anchored to the top-left (see below). I would like the image to stay centered on the original center. Is there a simple way to achieve this? My aim is for the saved copy of the image (e.g. as a jpg file) to remain centered; my question does not concern the display of the image through imshow. When I save the image per the code below, the content sits in the upper left, which causes issues with subsequent steps in my code.
Part of the code:
import skimage.transform
import matplotlib.pyplot as plt
import scipy.misc

# test is the input image
tform = skimage.transform.SimilarityTransform(scale=2, rotation=0, translation=(0, 0))
rotated = skimage.transform.warp(test, tform)
plt.imshow(rotated)
scipy.misc.imsave('rotated.jpg', rotated)
Scaling by itself is one subset of affine transformations.
The affine transformation matrix for scaling only is defined as
s_x, 0, 0
0, s_y, 0
0, 0, 1
where s_x and s_y are the scaling factors in the respective dimensions (defined relative to the origin at (0, 0)). If you want your image to be scaled not relative to the origin but to another point, you first translate the image so that the center of scaling sits at the origin, then you scale, and then you move the image back. You simply do a matrix multiplication of your transform matrices with the scale matrix. I had a similar problem with rotation, which can be found here. The same principle applies to this problem. The result is
s_x, 0, (-s_x*x)+x
0, s_y, (-s_y*y)+y
0, 0, 1
where x and y are half the size of your image in the respective dimensions.
The resulting matrix can be used with:
skimage.transform.AffineTransform(matrix)
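Concretely, here is a minimal sketch of that matrix applied to the snippet from the question (test is the input image and s the scale factor; nothing else is assumed):

import numpy as np
import skimage.transform

s = 2                                  # scale factor from the question
rows, cols = test.shape[:2]
x, y = cols / 2, rows / 2              # half the image size in each dimension

# translate the center to the origin, scale, then translate back
matrix = np.array([[s, 0, -s * x + x],
                   [0, s, -s * y + y],
                   [0, 0, 1]])

tform = skimage.transform.AffineTransform(matrix=matrix)

# warp() treats the given transform as the inverse map (output -> input),
# exactly as in the original snippet, so the content now stays centered
scaled = skimage.transform.warp(test, tform)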
Suppose that I have an input x of size [H, W], and also a mu_x and mu_y (which may be fractional) representing the number of pixels to shift in the x and y directions. Is there any efficient way in PyTorch, without using C++, to shift the tensor x by mu_x and mu_y units with bilinear interpolation?
To be more precise, let's say we have an image. With mu_x = 5 and mu_y = 3, we may want to shift the image so that it moves rightward 5 pixels and downward 3 pixels, with the pixels that fall outside the [H, W] boundary removed and the new pixels introduced at the other end of the boundary set to 0. However, with fractional mu_x and mu_y, we need to use bilinear interpolation to estimate the resulting image.
Is it possible to implement this with pure PyTorch tensor operations, or do I need to use C++?
I believe you can achieve this by applying grid sampling on your original input, using a grid to guide the sampling process. If you take a coordinate grid of your image and sample using that, the resulting image will be equal to the original image. However, you can apply a shift to this grid and therefore sample with the given shift. Grid sampling works with floating-point grids, of course, which means you can apply an arbitrary non-integer shift to your image and choose a sampling mode (bilinear is the default).
This can be implemented out of the box with F.grid_sample. Given an image tensor img, we first construct a pixel grid of that image using torch.meshgrid. Keep in mind the grid used by the sampler must be normalized to [-1, 1]: pixel x=0, y=0 should be mapped to (-1, -1), pixel x=w-1, y=h-1 to (1, 1), and the center pixel ends up at around (0, 0).
Use two torch.arange with a [0,1]-normalization followed by a remapping to [-1,1]:
>>> c,h,w = img.shape
>>> x, y = torch.arange(h)/(h-1), torch.arange(w)/(w-1)
>>> grid = torch.dstack(torch.meshgrid(x, y))*2-1
The resulting grid has a shape of (h, w, 2); the output image produced by the sampling process will have the (c, h, w) dimensions of the input.
Since we are not working with batched elements, we need to unsqueeze singleton dimensions on both img and grid. Then we can apply F.grid_sample:
>>> sampled = F.grid_sample(img[None], grid[None])
Following this, you can apply your arbitrary mu_x, mu_y shift, and this extends easily to batches of images and shifts. The way you define the sampling is by defining a shifted grid:
>>> x_s, y_s = (torch.arange(h)+mu_y)/(h-1), (torch.arange(w)+mu_x)/(w-1)
where mu_x and mu_y are the values in pixels (floating point) by which the image is shifted on the horizontal and vertical axes respectively. To acquire the sampled image, apply F.grid_sample on a grid made up of x_s and y_s:
>>> grid_shifted = torch.dstack(torch.meshgrid(x_s, y_s))*2-1
>>> sampled = F.grid_sample(img[None], grid_shifted[None])
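Pulling the pieces together, here is a self-contained sketch (not the exact code above): it assumes a (C, H, W) float tensor, is explicit about the fact that grid_sample expects the last grid dimension ordered as (x, y), i.e. width first, then height, and uses align_corners=True to match the /(h-1) normalization used here.

import torch
import torch.nn.functional as F

def shift_bilinear(img, mu_x, mu_y):
    """Shift a (C, H, W) image right by mu_x and down by mu_y pixels
    (possibly fractional) with bilinear interpolation; pixels coming in
    from outside the boundary are filled with zeros."""
    c, h, w = img.shape
    # Output pixel (i, j) samples input location (i - mu_y, j - mu_x),
    # which moves the content down/right by (mu_y, mu_x).
    ys = (torch.arange(h, dtype=img.dtype) - mu_y) / (h - 1)
    xs = (torch.arange(w, dtype=img.dtype) - mu_x) / (w - 1)
    gy, gx = torch.meshgrid(ys, xs, indexing="ij")   # each (H, W); indexing kwarg needs a recent PyTorch
    # grid_sample wants the last grid dimension as (x, y), normalized to [-1, 1]
    grid = torch.stack((gx, gy), dim=-1) * 2 - 1     # (H, W, 2)
    out = F.grid_sample(img[None], grid[None], mode="bilinear",
                        padding_mode="zeros", align_corners=True)
    return out[0]

shifted = shift_bilinear(img, mu_x=5.0, mu_y=3.0)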
We need to detect whether the images produced by our tunable lens are blurred or not.
We want to find a proxy measure for blurriness.
My current thinking is to first apply a Sobel filter along the x direction, because the jumps or stripes are mostly along this direction, then compute the x-direction marginal means, and finally compute the standard deviation of these marginal means.
We expect this standard deviation to be bigger for a clear image and smaller for a blurred one, because a clear image should have larger intensity jumps between pixels.
But we get the opposite results. How could we improve this blurriness measure?
import cv2
import matplotlib.pyplot as plt

def sobel_image_central_std(PATH):
    # use the blue channel
    img = cv2.imread(PATH)[:, :, 0]
    # extract the central part of the image
    hh, ww = img.shape
    hh2 = hh // 2
    ww2 = ww // 2
    hh4 = hh // 4
    ww4 = ww // 4  # was hh // 4, which breaks for non-square images
    img_center = img[hh4:(hh2 + hh4), ww4:(ww2 + ww4)]
    # Sobel operator along x
    sobelx = cv2.Sobel(img_center, cv2.CV_64F, 1, 0, ksize=3)
    x_marginal = sobelx.mean(axis=0)
    plt.plot(x_marginal)
    return x_marginal.std()
[Example plots: Blur #1, Blur #2, Clear #1, Clear #2]
In general:
Is there a way to detect if an image is blurry?
You can combine this calculation with your other question, where you are searching for the central angle.
Once you have the angle (and the center, which may lie outside of the image), you can make an axis transformation to remove the circular component of the cone. Instead you get x (radius) and y (angle), where y runs along the circular arcs.
Maybe you can get the center of the image from the camera set-up.
Then you don't need to calculate it using the intersection of the edges from the central angle. Or just do it manually once if it is fixed for all images.
Look at polar coordinate systems.
Due to the shape of the cone, the image will be denser at the peak, but this should be a fixed factor. It will probably still bias the result when calculating the blurriness along the transformed image.
So what you could do to correct this is create a synthetic cone image with circular lines and apply the same transformation to it. Again, this requires some trial and error.
But it should deliver some mask that you could use to correct the "blurriness bias".
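As a rough illustration of the axis transformation (a sketch only: the file name, center and maximum radius below are placeholders that you would take from your camera set-up or a one-off manual measurement), OpenCV's warpPolar can do the remapping:

import cv2

img = cv2.imread("cone_image.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file name
cx, cy = 640.0, -200.0      # center of the arcs; may lie outside the image
max_radius = 1200.0         # radius that still covers the region of interest

# Remap to polar coordinates around (cx, cy): the output x axis is the radius
# and the y axis is the angle, so the circular arcs become roughly vertical lines.
polar = cv2.warpPolar(img, (800, 800), (cx, cy), max_radius,
                      cv2.INTER_LINEAR + cv2.WARP_POLAR_LINEAR)

The Sobel-along-x marginal measure from above can then be applied to polar, where the intensity jumps run along the radius axis.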
In short, my question is how do I put an image on top of another by specifying specific coordinates for the added image? I would need to extend the "canvas" of the base image as needed so that the added image doesn't get cropped.
Here's the extended version:
My project is to take pictures extracted from a drone video and make a rough map with them, by aligning one photo with the last. I know there is software I can use to do this, like Agisoft Photoscan, but my goal is to create a more lightweight, rough solution.
So here's my plan, which I intend to do with each frame:
Use estimateRigidTransform to generate the transformation matrix to align curr_photo with the last photo, base
Calculate the bounding rectangle needed to enclose the resulting image (using transformations of the four corners)
Modify the transformation matrix so that the top left of the bounding box is at the origin
Apply the transformation to the current photo, using the bounding rectangle's width and height to ensure none of the resulting image gets cropped
Superimpose the current image on the last image (making sure no cropping of either image occurs) by adding curr_photo to base at the proper coordinates. This step is what I am asking about.
Here is the code that does steps one to four.
import numpy as np
import cv2
base = cv2.imread("images/frame_03563.jpg")
curr_photo = cv2.imread("images/frame_03564.jpg")
height, width = curr_photo.shape[:2]
# Step 1
# which transformation is required to go from curr_photo to base?
transformation = cv2.estimateRigidTransform(curr_photo, base, True)
# Step 2
# add a line to the affine transformation matrix so it can be used by
# perspectiveTransform
three_by_three = np.array([
    transformation[0],
    transformation[1],
    [0, 0, 1]], dtype="float32")
# get corners of curr_photo (to be transformed)
corners = np.array([
    [0, 0],
    [width - 1, 0],
    [width - 1, height - 1],
    [0, height - 1]
])
# where do the corners of the image go
trans_corners = cv2.perspectiveTransform(np.float32([corners]), three_by_three)
# get the bounding rectangle for the four corner points (and thus, the transformed image)
bx, by, bwidth, bheight = cv2.boundingRect(trans_corners)
# Step 3
# modify transformation matrix so that the top left of the bounding box is at the origin
transformation[0][2] = transformation[0][2] - bx
transformation[1][2] = transformation[1][2] - by
# Step 4
# transform the image in a window the size of its bounding rectangle (so no cropping)
mod_curr_photo = cv2.warpAffine(curr_photo, transformation, (bwidth, bheight))
# for viewing
cv2.imshow("base", base)
cv2.imshow("current photo", curr_photo)
cv2.imshow("image2 transformed to image 1", mod_curr_photo)
cv2.waitKey()
I've also attached two sample images. I used the first one as the base, but it works either way.
Edit: I have now turned the answer linked below into a Python module, which you can now grab from GitHub here.
I answered this question a few weeks ago. The answer should contain everything needed to accomplish what you're after; the only thing I don't discuss there is alpha blending or other techniques to blend the borders of the images together as you would with a panorama or similar.
In order not to crop the warped photo, you need to calculate the needed padding beforehand, because the warp itself could reference negative indices, in which case it won't draw them. So you need to calculate the warp locations first, pad your image enough to account for the indices that fall outside your image bounds, and then modify your warp matrix to add those translations in, so that they get warped to positive values.
This allows you to create an image like this:
Image from Oxford's VGG.
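For step 5 specifically, here is a minimal sketch of the idea (not the linked module): it reuses the question's variables bx, by, bwidth, bheight, base and mod_curr_photo, and simply overwrites with the warped photo's non-black pixels instead of blending.

import numpy as np

# The warped photo occupies the rectangle starting at (bx, by) in base's
# coordinate frame, so the canvas has to cover both that rectangle and base.
x0, y0 = min(bx, 0), min(by, 0)               # top-left of the new canvas
x1 = max(bx + bwidth, base.shape[1])          # bottom-right extents
y1 = max(by + bheight, base.shape[0])

canvas = np.zeros((y1 - y0, x1 - x0, 3), dtype=base.dtype)

# paste the base image, shifted if the canvas grew to the left/top
canvas[-y0:-y0 + base.shape[0], -x0:-x0 + base.shape[1]] = base

# overlay the warped photo at its bounding-box position; copy only pixels
# that are not black so the empty corners of the warp don't erase the base
region = canvas[by - y0:by - y0 + bheight, bx - x0:bx - x0 + bwidth]
mask = mod_curr_photo.any(axis=2)
region[mask] = mod_curr_photo[mask]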
So I have four points in an array A, denoting the corners of a rectangular object (but not a rectangle when projected onto the image plane). I know the size of the rectangle, so I can calculate the perspective transform with
cv2.getPerspectiveTransform(four_corners, np.array([[0, 0], [0, height], [width, height], [width, 0]], dtype=np.float32))
Then I can transform the image with cv2.warpPerspective.
The problem (which can be seen in another person's question, Counting aspect ratio of Perspective Transform destination image) is that the warped result is cropped: only the region inside the four corners ends up in the final image, while I want the whole original image to be warped into the final result.
How do I achieve that?
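One way to do this follows the same bounding-box idea as the stitching answer above. The following is a sketch, not a tested solution; img, four_corners, width and height are assumed to be the question's variables. Warp the four image corners, see how far they land outside the target rectangle, then prepend a translation and enlarge the output size accordingly.

import cv2
import numpy as np

M = cv2.getPerspectiveTransform(
    four_corners,
    np.array([[0, 0], [0, height], [width, height], [width, 0]], dtype=np.float32))

h, w = img.shape[:2]
img_corners = np.float32([[[0, 0]], [[w, 0]], [[w, h]], [[0, h]]])
warped = cv2.perspectiveTransform(img_corners, M).reshape(-1, 2)

xmin, ymin = np.floor(warped.min(axis=0)).astype(int)
xmax, ymax = np.ceil(warped.max(axis=0)).astype(int)

# shift everything into positive coordinates and make the output big enough
shift = np.array([[1, 0, -xmin], [0, 1, -ymin], [0, 0, 1]], dtype=np.float64)
out = cv2.warpPerspective(img, shift @ M, (xmax - xmin, ymax - ymin))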
I am working on a project which attempts to remove the perspective distortion from an image based on the known orientation of the camera. My thinking is that I can create a rotational matrix based on the known X, Y, and Z orientations of the camera. I can then apply those matrices to the image via the WarpPerspective method.
In my script (written in Python) I have created three rotational matrices, each based on an orientation angle. I have gotten to a point where I am stuck on two issues. First, when I load each individual matrix into the WarpPerspective method, it doesn't seem to be working correctly. Whenever I warp an image on one axis it appears to significantly overwarp the image. The contents of the image are only recognizable if I limit the orientation angle to around 1 degree or less.
Secondly, how do I combine the three rotational matrices into a single matrix to be loaded into the WarpPerspective method? Can I pass a 3x3 rotational matrix into that method, or do I have to create a 4x4 projective matrix? Below is the code that I am working on.
Thank you for your help.
CR
from numpy import *
import cv
#Sets angle of camera and converts to radians
x = -14 * (pi/180)
y = 20 * (pi/180)
z = 15 * (pi/180)
#Creates the Rotational Matrices
rX = array([[1, 0, 0], [0, cos(x), -sin(x)], [0, sin(x), cos(x)]])
rY = array([[cos(y), 0, -sin(y)], [0, 1, 0], [sin(y), 0, cos(y)]])
rZ = array([[cos(z), sin(z), 0], [-sin(z), cos(z), 0], [0, 0, 1]])
#Converts to CVMat format
X = cv.fromarray(rX)
Y = cv.fromarray(rY)
Z = cv.fromarray(rZ)
#Imports image file and creates destination filespace
im = cv.LoadImage("reference_image.jpg")
dst = cv.CreateImage(cv.GetSize(im), cv.IPL_DEPTH_8U, 3)
#Warps Image
cv.WarpPerspective(im, dst, X)
#Display
cv.NamedWindow("distorted")
cv.ShowImage("distorted", im)
cv.NamedWindow("corrected")
cv.ShowImage("corrected", dst)
cv.WaitKey(0)
cv.DestroyWindow("distorted")
cv.DestroyWindow("corrected")
You are doing several things wrong. First, you can't rotate about the x or y axis without a camera model. Imagine a camera with an incredibly wide field of view. You could hold it really close to an object and see the entire thing, but if that object rotated, its edges would seem to fly towards you very quickly with a strong perspective distortion. On the other hand, a small field of view (think telescope) has very little perspective distortion. A nice place to start is setting your image plane at least as far from the camera as it is wide and putting your object right on the image plane. That is what I did in this example (C++, OpenCV).
The steps are
construct a rotation matrix
center the image at the origin
rotate the image
move the image down the z axis
multiply by the camera matrix
warp the perspective
//1
float x = -14 * (M_PI/180);
float y = 20 * (M_PI/180);
float z = 15 * (M_PI/180);
cv::Matx31f rot_vec(x,y,z);
cv::Matx33f rot_mat;
cv::Rodrigues(rot_vec, rot_mat); //converts to a rotation matrix
cv::Matx33f translation1(1, 0, -image.cols/2,
                         0, 1, -image.rows/2,
                         0, 0, 1);
rot_mat(0,2) = 0;
rot_mat(1,2) = 0;
rot_mat(2,2) = 1;
//2 and 3
cv::Matx33f trans = rot_mat*translation1;
//4
trans(2,2) += image.rows;
cv::Matx33f camera_mat(image.rows, 0, image.rows/2,
                       0, image.rows, image.rows/2,
                       0, 0, 1);
//5
cv::Matx33f transform = camera_mat*trans;
//6
cv::Mat final;
cv::warpPerspective(image, final, cv::Mat(transform),image.size());
This code gave me this output
I did not see Franco's answer until I posted this. He is completely correct, using FindHomography would save you all these steps. Still I hope this is useful.
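Since the question uses Python, here is a rough translation of the C++ sketch above to cv2/NumPy. It mirrors the same six steps and is untested against the original output; image is the loaded frame.

import cv2
import numpy as np

image = cv2.imread("reference_image.jpg")
h, w = image.shape[:2]

# 1. rotation matrix from the three angles (in radians)
rvec = np.radians(np.array([-14.0, 20.0, 15.0]))
rot_mat, _ = cv2.Rodrigues(rvec)
rot_mat[0, 2] = 0
rot_mat[1, 2] = 0
rot_mat[2, 2] = 1

# 2 and 3. center the image at the origin, then rotate
translation1 = np.array([[1, 0, -w / 2],
                         [0, 1, -h / 2],
                         [0, 0, 1]], dtype=np.float64)
trans = rot_mat @ translation1

# 4. move the image down the z axis
trans[2, 2] += h

# 5. multiply by the (assumed) camera matrix
camera_mat = np.array([[h, 0, h / 2],
                       [0, h, h / 2],
                       [0, 0, 1]], dtype=np.float64)
transform = camera_mat @ trans

# 6. warp the perspective
final = cv2.warpPerspective(image, transform, (w, h))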
Just knowing the rotation is not enough, unless your images are taken either with a telecentric lens or with a telephoto lens with a very long focal length (in which case the images are nearly orthographic and there is no perspective distortion).
Besides, it's not necessary. True, you can undo the perspective foreshortening of one plane in the image by calibrating the camera (i.e. estimating the intrinsic and extrinsic parameters to form the full camera projection matrix).
But you achieve the same result much more simply if you can identify in the image a quadrangle which is the image of a real-world square (or rectangle with known width/height ratio). If you can do that, you can trivially compute the homography matrix that maps the square (rectangle) to the quadrangle, then warp using its inverse.
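As a hypothetical example of that rectify-by-known-rectangle idea (the file name, corner coordinates and aspect ratio below are made up; in practice the quadrangle would come from manual clicks or a detector):

import cv2
import numpy as np

img = cv2.imread("facade.jpg")

# image corners (clockwise from top-left) of a real-world rectangle
quad = np.float32([[310, 120], [950, 145], [980, 560], [280, 540]])
ratio = 1.5                    # known width / height of the real rectangle
out_h = 400
out_w = int(out_h * ratio)

rect = np.float32([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]])

# Mapping the quadrangle onto the axis-aligned rectangle is equivalent to
# computing the rectangle-to-quadrangle homography and warping with its inverse.
H = cv2.getPerspectiveTransform(quad, rect)
rectified = cv2.warpPerspective(img, H, (out_w, out_h))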
The Wikipedia page on rotation matrices shows how it is possible to combine the three basic rotation matrices into one.
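For the "how do I combine the three matrices" part of the question, a small sketch using the standard right-handed conventions from that page (note that the question's rY and rZ flip the sign of the sine terms):

import numpy as np

x, y, z = np.radians([-14.0, 20.0, 15.0])

Rx = np.array([[1, 0, 0],
               [0, np.cos(x), -np.sin(x)],
               [0, np.sin(x),  np.cos(x)]])
Ry = np.array([[ np.cos(y), 0, np.sin(y)],
               [0, 1, 0],
               [-np.sin(y), 0, np.cos(y)]])
Rz = np.array([[np.cos(z), -np.sin(z), 0],
               [np.sin(z),  np.cos(z), 0],
               [0, 0, 1]])

# One combined rotation; the multiplication order is a convention you must pick.
R = Rz @ Ry @ Rx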