This question already has an answer here:
Displaying stitched images together without cutoff using warpAffine
(1 answer)
Closed 5 years ago.
In short, my question is how do I put an image on top of another by specifying specific coordinates for the added image? I would need to extend the "canvas" of the base image as needed so that the added image doesn't get cropped.
Here's the extended version:
My project is to take pictures extracted from a drone video and make a rough map with them, by aligning one photo with the last. I know there is software I can use to do this, like Agisoft Photoscan, but my goal is to create a more lightweight, rough solution.
So here's my plan, which I intend to do with each frame:
1. Use estimateRigidTransform to generate the transformation matrix that aligns curr_photo with the previous photo, base.
2. Calculate the bounding rectangle needed to enclose the resulting image (by transforming the four corners).
3. Modify the transformation matrix so that the top left of the bounding box is at the origin.
4. Apply the transformation to the current photo, using the bounding rectangle's width and height so that none of the resulting image gets cropped.
5. Superimpose the current image on the last image (making sure neither image is cropped) by adding curr_photo to base at the proper coordinates. This step is what I am asking about.
Here is the code that does steps one to four.
import numpy as np
import cv2
base = cv2.imread("images/frame_03563.jpg")
curr_photo = cv2.imread("images/frame_03564.jpg")
height, width = curr_photo.shape[:2]
# Step 1
# which transformation is required to go from curr_photo to base?
transformation = cv2.estimateRigidTransform(curr_photo, base, True)
# Step 2
# add a line to the affine transformation matrix so it can be used by
# perspectiveTransform
three_by_three = np.array([
transformation[0],
transformation[1],
[0, 0, 1]], dtype="float32")
# get corners of curr_photo (to be transformed)
corners = np.array([
[0, 0],
[width - 1, 0],
[width - 1, height - 1],
[0, height - 1]
])
# where do the corners of the image go
trans_corners = cv2.perspectiveTransform(np.float32([corners]), three_by_three)
# get the bounding rectangle for the four corner points (and thus, the transformed image)
bx, by, bwidth, bheight = cv2.boundingRect(trans_corners)
# Step 3
# modify transformation matrix so that the top left of the bounding box is at the origin
transformation[0][2] = transformation[0][2] - bx
transformation[1][2] = transformation[1][2] - by
# Step 4
# transform the image in a window the size of its bounding rectangle (so no cropping)
mod_curr_photo = cv2.warpAffine(curr_photo, transformation, (bwidth, bheight))
# for viewing
cv2.imshow("base", base)
cv2.imshow("current photo", curr_photo)
cv2.imshow("image2 transformed to image 1", mod_curr_photo)
cv2.waitKey()
I've also attached two sample images. I used the first one as the base, but it works either way.
Edit: I have now turned the answer linked below into a Python module, which you can now grab from GitHub here.
I answered this question a few weeks ago. The answer should contain everything needed to accomplish what you're after; the only thing I don't discuss there is alpha blending or other techniques to blend the borders of the images together as you would with a panorama or similar.
To avoid cropping the warped photo, you need to calculate the required padding beforehand, because the warp itself can map pixels to negative indices, and those pixels are not drawn. So calculate the warped corner locations first, pad your image enough to account for any indices outside its bounds, and then add the corresponding translation to your warp matrix so that everything is warped to positive coordinates.
This allows you to create an image like this:
Image from Oxford's VGG.
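As a rough illustration of the padding calculation described above, here is a minimal numpy-only sketch. The function name padded_warp_params and its return layout are my own choices for illustration, not part of OpenCV or of the linked answer, and it assumes a 2x3 affine matrix that maps the current photo into the base image's coordinate frame.

```python
import numpy as np

def padded_warp_params(affine_2x3, src_shape, base_shape):
    """Hypothetical helper: compute a shifted warp matrix and canvas size
    so that neither the warped photo nor the base image gets cropped.

    Returns (shifted 2x3 matrix, (canvas_w, canvas_h), (base_x, base_y)).
    """
    h, w = src_shape[:2]
    bh, bw = base_shape[:2]
    M = np.asarray(affine_2x3, dtype=np.float64)

    # Corners of the photo to be warped, as homogeneous column vectors
    corners = np.array([[0, 0, 1],
                        [w - 1, 0, 1],
                        [w - 1, h - 1, 1],
                        [0, h - 1, 1]], dtype=np.float64).T
    warped = M @ corners  # shape (2, 4): row 0 holds x, row 1 holds y

    # Extent of the union of the warped photo and the base image,
    # where the base occupies [0, bw-1] x [0, bh-1]
    min_x = min(np.floor(warped[0].min()), 0)
    min_y = min(np.floor(warped[1].min()), 0)
    max_x = max(np.ceil(warped[0].max()), bw - 1)
    max_y = max(np.ceil(warped[1].max()), bh - 1)

    # Translate everything so the most negative coordinate lands at 0
    shift_x, shift_y = int(-min_x), int(-min_y)
    shifted = M.copy()
    shifted[0, 2] += shift_x
    shifted[1, 2] += shift_y

    canvas_size = (int(max_x - min_x) + 1, int(max_y - min_y) + 1)
    return shifted, canvas_size, (shift_x, shift_y)
```

With these values, one would presumably call cv2.warpAffine(curr_photo, shifted, canvas_size) and then copy base into the canvas at row base_y and column base_x. That simple copy overwrites the overlap and does no blending.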
Following my own question from 4 years ago, this time in Python only-
I am looking for a way to perform texture mapping into a small region in a destination image, defined by 4 corners given as (x, y) pixel coordinates. This region is not necessarily rectangular. It is a perspective projection of some rectangle onto the image plane.
I would like to map some (rectangular) texture into the mask defined by those corners.
Mapping directly by forward-mapping the texture will not work properly, as source pixels will be mapped to non-integer locations in the destination.
This problem is usually solved by inverse-warping from the destination to the source, then coloring according to some interpolation.
OpenCV's warpPerspective doesn't work here, as it can't take a mask as input.
Inverse-warping the entire destination and then masking is not acceptable, because the majority of the computation is redundant.
Is there a built-in OpenCV (or other) function that meets the above requirements?
If not, what is a good way to get a list of pixels from my ROI defined by its corners, so that I can pass it to projectPoints?
Example background image:
I want to fill the area outlined by the red lines (defined by its corners) with some other texture, say this one
Mapping between them can be obtained by mapping the texture's corners to the ROI corners with cv2.getPerspectiveTransform
For future generations, here is how to back- and forward-warp only the pixels within the bounding box of the warped corner points, as @Micka suggested.
Here banner is the grass image, and banner_coords_2d are the corners of the red region on image, which is meme-man.
def transform_banner(banner_coords_2d, banner, image):
    # show_points_on_image("banner corners", image, banner_coords_2d)
    banner_height, banner_width, _ = banner.shape
    src_banner_points = np.float32([
        [0, 0],
        [banner_width - 1, 0],
        [0, banner_height - 1],
        [banner_width - 1, banner_height - 1],
    ])

    # only warp to size of bbox of warped corners, not all of the image
    warped_left = np.round(np.min(banner_coords_2d[:, 0])).astype(int)
    warped_right = np.round(np.max(banner_coords_2d[:, 0])).astype(int)
    warped_top = np.round(np.min(banner_coords_2d[:, 1])).astype(int)
    warped_bottom = np.round(np.max(banner_coords_2d[:, 1])).astype(int)
    warped_width = int(warped_right - warped_left)
    warped_height = int(warped_bottom - warped_top)

    # express the destination corners relative to the bbox origin
    dst_banner_points = banner_coords_2d.astype(np.float32)
    dst_banner_points[:, 0] -= warped_left
    dst_banner_points[:, 1] -= warped_top

    tform = cv2.getPerspectiveTransform(src_banner_points, dst_banner_points)
    warped_banner = cv2.warpPerspective(banner, tform, (warped_width, warped_height))
    # cv2.imshow("warped_banner", warped_banner)

    # copy only the non-zero warped pixels into the bbox region of the image
    image_with_banner = image.copy()
    roi = image_with_banner[warped_top:warped_bottom, warped_left:warped_right]
    roi[warped_banner != 0] = warped_banner[warped_banner != 0]
    # cv2.imshow("image_with_banner", image_with_banner)
    return image_with_banner
This can likely be done more neatly; I am open to edits.
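For comparison, the inverse-warping idea from the question can also be sketched with numpy only, touching just the pixels that lie inside the quadrilateral. This is an illustration under assumptions, not a drop-in replacement: the function names are hypothetical, the quad is assumed convex with corners ordered top-left, top-right, bottom-right, bottom-left, and sampling is nearest-neighbour for brevity.

```python
import numpy as np

def homography_from_points(src, dst):
    """Solve for the 3x3 homography mapping 4 src points to 4 dst points
    (with the bottom-right entry fixed to 1)."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=np.float64),
                        np.array(rhs, dtype=np.float64))
    return np.append(h, 1.0).reshape(3, 3)

def fill_quad_with_texture(image, texture, quad):
    """Inverse-map only the pixels inside `quad` back into `texture`."""
    quad = np.asarray(quad, dtype=np.float64)
    th, tw = texture.shape[:2]
    tex_corners = np.array([[0, 0], [tw - 1, 0], [tw - 1, th - 1], [0, th - 1]],
                           dtype=np.float64)
    # Homography from destination (image) coordinates to texture coordinates
    h_inv = homography_from_points(quad, tex_corners)

    # Candidate pixels: integer grid inside the quad's bbox, clipped to the image
    x0 = max(int(np.floor(quad[:, 0].min())), 0)
    x1 = min(int(np.ceil(quad[:, 0].max())), image.shape[1] - 1)
    y0 = max(int(np.floor(quad[:, 1].min())), 0)
    y1 = min(int(np.ceil(quad[:, 1].max())), image.shape[0] - 1)
    xs, ys = np.meshgrid(np.arange(x0, x1 + 1), np.arange(y0, y1 + 1))
    xs, ys = xs.ravel(), ys.ravel()

    # Convex-quad test: a point is inside if it lies on the same side of all edges
    crosses = []
    for i in range(4):
        ax, ay = quad[i]
        bx, by = quad[(i + 1) % 4]
        crosses.append((bx - ax) * (ys - ay) - (by - ay) * (xs - ax))
    crosses = np.array(crosses)
    inside = np.all(crosses >= 0, axis=0) | np.all(crosses <= 0, axis=0)
    xs, ys = xs[inside], ys[inside]

    # Map the surviving pixels into the texture and sample the nearest texel
    pts = np.stack([xs, ys, np.ones_like(xs)]).astype(np.float64)
    u, v, w = h_inv @ pts
    tx = np.clip(np.rint(u / w), 0, tw - 1).astype(int)
    ty = np.clip(np.rint(v / w), 0, th - 1).astype(int)

    out = image.copy()
    out[ys, xs] = texture[ty, tx]
    return out
```

Unlike the `warped_banner != 0` test above, this does not drop texture pixels that happen to be black, at the cost of doing the sampling by hand.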
We need to detect whether the images produced by our tunable lens are blurred or not.
We want to find a proxy measure for blurriness.
My current thinking is to first apply Sobel along the x direction, because the jumps (the stripes) are mostly along this direction, then compute the marginal means along x, and finally compute the standard deviation of these marginal means.
We expect this standard deviation to be larger for a clear image and smaller for a blurred one, because clear images should have larger or more frequent jumps in pixel values.
But we get the opposite result. How could we improve this blurriness measure?
import cv2
import matplotlib.pyplot as plt

def sobel_image_central_std(PATH):
    # use the blue channel
    img = cv2.imread(PATH)[:, :, 0]
    # extract the central part of the image
    hh, ww = img.shape
    hh2 = hh // 2
    ww2 = ww // 2
    hh4 = hh // 4
    ww4 = ww // 4  # was hh // 4, which offsets the crop horizontally by the wrong amount
    img_center = img[hh4:(hh2 + hh4), ww4:(ww2 + ww4)]
    # Sobel operator along x
    sobelx = cv2.Sobel(img_center, cv2.CV_64F, 1, 0, ksize=3)
    # mean of each column, then the spread of those means
    x_marginal = sobelx.mean(axis=0)
    plt.plot(x_marginal)
    return x_marginal.std()
Blur #1
Blur #2
Clear #1
Clear #2
In general:
Is there a way to detect if an image is blurry?
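One commonly used family of sharpness proxies averages the squared gradient, so the score does not depend on the sign of each edge. The sketch below is numpy-only and is not the poster's method: the names gradient_energy_x and box_blur_x are hypothetical, the stripe image is synthetic, and whether this behaves better on the real lens images is untested.

```python
import numpy as np

def gradient_energy_x(img):
    """Mean squared central difference along x; larger usually means sharper."""
    img = np.asarray(img, dtype=np.float64)
    dx = img[:, 2:] - img[:, :-2]
    return float(np.mean(dx ** 2))

def box_blur_x(img, k):
    """Blur each row with a length-k box filter (for the demo only)."""
    kernel = np.ones(k) / k
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"),
        1, np.asarray(img, dtype=np.float64))

# Synthetic test pattern: vertical stripes, 4 pixels wide
stripes = np.tile((np.arange(64) // 4 % 2) * 255.0, (32, 1))
blurred = box_blur_x(stripes, 5)

sharp_score = gradient_energy_x(stripes)
blur_score = gradient_energy_x(blurred)
```

On this synthetic pattern the sharp image scores higher than the blurred one. On real images the score also depends on scene content, so it is only comparable between shots of similar scenes.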
You can combine this calculation with your other question, where you are searching for the central angle.
Once you have the angle (and the center, maybe outside of the image) you can make an axis transformation to remove the circular component of the cone. Instead you get x (radius) and y (angle) where y would run along the circular arcs.
Maybe you can get the center of the image from the camera set-up.
Then you don't need to calculate it using the intersection of the edges from the central angle. Or just do it manually once if it is fixed for all images.
Look at polar coordinate systems.
Due to the shape of the cone, the image will be denser at the peak, but this should be a fixed factor. It will probably bias the result when calculating the blurriness along the transformed image, though.
To correct for this, you could create a synthetic cone image with circular lines and apply the same transformation to it. Again, this requires some trial and error.
But it should yield a mask that you could use to correct the "blurriness bias".
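The axis transformation described above could look roughly like the following numpy-only sketch, with x as the radius and y as the angle. The function to_polar and its parameters are hypothetical, the center is assumed to be known already, and sampling is nearest-neighbour. OpenCV also offers cv2.warpPolar for this kind of remapping.

```python
import numpy as np

def to_polar(img, center, r_max, n_radii, n_angles):
    """Resample `img` on a (angle, radius) grid around `center` = (cx, cy).

    Rows of the result correspond to angles, columns to radii.
    Samples that fall outside the image are left at 0.
    """
    cx, cy = center
    radii = np.linspace(0, r_max, n_radii)
    thetas = np.linspace(0, 2 * np.pi, n_angles, endpoint=False)
    rr, tt = np.meshgrid(radii, thetas)  # both have shape (n_angles, n_radii)

    # Cartesian position of every (angle, radius) sample, rounded to a pixel
    xs = np.rint(cx + rr * np.cos(tt)).astype(int)
    ys = np.rint(cy + rr * np.sin(tt)).astype(int)

    valid = (xs >= 0) & (xs < img.shape[1]) & (ys >= 0) & (ys < img.shape[0])
    out = np.zeros(rr.shape, dtype=img.dtype)
    out[valid] = img[ys[valid], xs[valid]]
    return out
```

Circular arcs around the center become vertical lines in the output, so a gradient measure along x can then be applied to the transformed image.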
[Updated The Question at the End]
I'm trying to detect a design pattern of simple geometrical shapes in a 640x480 image. I have divided the image into 32x32 blocks and check which block each shape's center lies in.
Based on this calculation I created a numpy matrix of (160x120) zeros (float32) with
col=640/4
row=480/4
Each time a shape is found, its center is calculated and I check which block it falls in. The corresponding item, along with its 8 neighbors in the 160x120 numpy array, is set to 1. In the end the 160x120 numpy array is represented as a grayscale image with a black background and white pixels representing the blocks of detected shapes.
As shown in the image below.
The image in top left corner represents the 160x120 numpy array. No issue so far.
As you can see, the newly generated image has a white line on a black background. I want to find rho, theta, x0, y0, x1, y1 for this line, so I decided to use the HoughLines transform.
The code is as follows:
edges = cv2.Canny(np.uint8(g_quadrants), 50, 150, apertureSize=3)
lines = cv2.HoughLines(edges, 1, np.pi / 180, 200)
print(lines)
Here g_quadrants is the 160x120 matrix representing a grayscale image, but the output of cv2.HoughLines is just None.
Please help me with this.
Update:
The small window with a black-and-white image (np.float32, treated as grayscale) displaying a white line is what I actually get when I:
Divide the 640x480 in 32x32 blocks
Find the triangles in the image
Create a 32x32 matrix to map the results for each block
Update the corresponding matrix element by 1 if a triangle is found in a block
Zoomed View:
You can see there are white pixels forming a straight line. There may be some unwanted detections. I need to eliminate unwanted lone pixels and reconstruct a continuous straight line. That may be achieved by dilating and then eroding the image. I need to find x0, y0, x1, y1, rho, theta of this line.
There may be more than one line. In that case I need to find the top 2 lines by length.
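For the single-line case, the parameters can also be estimated directly from the white pixels, without the Hough transform. The sketch below is only an illustration under assumptions: fit_line_to_mask is a hypothetical name, it assumes all non-zero pixels belong to one line (so lone outliers should be removed first), and it does not handle the two-line case. It reports rho >= 0 with theta in (-pi, pi], which differs from the cv2.HoughLines convention. Separately, a vote threshold of 200 may simply be too high for a short line in a 160x120 image.

```python
import numpy as np

def fit_line_to_mask(mask):
    """Fit one line to the non-zero pixels of `mask`.

    Returns (rho, theta, p0, p1), where x*cos(theta) + y*sin(theta) = rho
    and p0, p1 are the (x, y) endpoints of the pixels projected on the line.
    """
    ys, xs = np.nonzero(mask)
    pts = np.column_stack([xs, ys]).astype(np.float64)
    centroid = pts.mean(axis=0)

    # Principal direction of the point cloud = direction of the line
    _, _, vt = np.linalg.svd(pts - centroid, full_matrices=False)
    direction = vt[0]
    normal = np.array([-direction[1], direction[0]])

    # Signed distance of the line from the origin; flip so that rho >= 0
    rho = float(normal @ centroid)
    if rho < 0:
        normal, rho = -normal, -rho
    theta = float(np.arctan2(normal[1], normal[0]))

    # Endpoints: the extreme projections of the pixels onto the line
    t = (pts - centroid) @ direction
    p0 = centroid + t.min() * direction
    p1 = centroid + t.max() * direction
    return rho, theta, p0, p1
```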
I have two images, one with only the background and the other with the background plus a detectable object (in my case it's a car). Below are the images.
I am trying to remove the background so that only the car remains in the resulting image. Following is the code with which I am trying to get the desired result:
import numpy as np
import cv2
original_image = cv2.imread('IMG1.jpg', cv2.IMREAD_COLOR)
gray_original = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)
background_image = cv2.imread('IMG2.jpg', cv2.IMREAD_COLOR)
gray_background = cv2.cvtColor(background_image, cv2.COLOR_BGR2GRAY)
foreground = np.absolute(gray_original - gray_background)
foreground[foreground > 0] = 255
cv2.imshow('Original Image', foreground)
cv2.waitKey(0)
The resulting image by subtracting the two images is
Here is the problem. The expected resulting image should be a car only.
Also, if you look closely at the two images, you'll see that they are not exactly the same; that is, the camera moved a little, so the background is slightly shifted. My question is: with these two images, how can I subtract the background? I do not want to use grabCut or backgroundSubtractorMOG right now, because I do not yet understand what goes on inside those algorithms.
What I am trying to do is to get the following resulting image
Also, if possible, please guide me towards a general way of doing this, not only in this specific case; that is, I have a background in one image and background plus object in the second image. What would be the best way of doing this? Sorry for such a long question.
I solved your problem using the OpenCV's watershed algorithm. You can find the theory and examples of watershed here.
First I selected several points (markers) to indicate where the object I want to keep is, and where the background is. This step is manual, and can vary a lot from image to image. It also requires some repetition until you get the desired result. I suggest using a tool to get the pixel coordinates.
Then I created an integer array of zeros with the size of the car image, and assigned some values (1: background, [255, 192, 128, 64]: car parts) to the pixels at the marker positions.
NOTE: When I downloaded your image I had to crop it to get the one with the car. After cropping, the image has a size of 400x601. This may not be the size of the image you have, in which case the markers will be off.
Afterwards I used the watershed algorithm. The 1st input is your image and 2nd input is the marker image (zero everywhere except at marker positions). The result is shown in the image below.
I set all pixels with value greater than 1 to 255 (the car), and the rest (background) to zero. Then I dilated the obtained image with a 3x3 kernel to avoid losing information on the outline of the car. Finally, I used the dilated image as a mask for the original image, using the cv2.bitwise_and() function, and the result lies in the following image:
Here is my code:
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Load the image
img = cv2.imread("/path/to/image.png", 3)
# Create a blank image of zeros (same dimension as img)
# It should be grayscale (1 color channel)
marker = np.zeros_like(img[:,:,0]).astype(np.int32)
# This step is manual. The goal is to find the points
# which create the result we want. I suggest using a
# tool to get the pixel coordinates.
# Dictate the background and set the markers to 1
marker[204][95] = 1
marker[240][137] = 1
marker[245][444] = 1
marker[260][427] = 1
marker[257][378] = 1
marker[217][466] = 1
# Dictate the area of interest
# I used different values for each part of the car (for visibility)
marker[235][370] = 255 # car body
marker[135][294] = 64 # rooftop
marker[190][454] = 64 # rear light
marker[167][458] = 64 # rear wing
marker[205][103] = 128 # front bumper
# rear bumper
marker[225][456] = 128
marker[224][461] = 128
marker[216][461] = 128
# front wheel
marker[225][189] = 192
marker[240][147] = 192
# rear wheel
marker[258][409] = 192
marker[257][391] = 192
marker[254][421] = 192
# Now we have set the markers, we use the watershed
# algorithm to generate a marked image
marked = cv2.watershed(img, marker)
# Plot this one. If it does what we want, proceed;
# otherwise edit your markers and repeat
plt.imshow(marked, cmap='gray')
plt.show()
# Make the background black, and what we want to keep white
marked[marked == 1] = 0
marked[marked > 1] = 255
# Use a kernel to dilate the image, to not lose any detail on the outline
# I used a kernel of 3x3 pixels
kernel = np.ones((3,3),np.uint8)
dilation = cv2.dilate(marked.astype(np.float32), kernel, iterations = 1)
# Plot again to check whether the dilation is according to our needs
# If not, repeat by using a smaller/bigger kernel, or more/less iterations
plt.imshow(dilation, cmap='gray')
plt.show()
# Now apply the mask we created on the initial image
final_img = cv2.bitwise_and(img, img, mask=dilation.astype(np.uint8))
# cv2.imread reads the image as BGR, but matplotlib uses RGB
# BGR to RGB so we can plot the image with accurate colors
b, g, r = cv2.split(final_img)
final_img = cv2.merge([r, g, b])
# Plot the final result
plt.imshow(final_img)
plt.show()
If you have a lot of images you will probably need to create a tool to annotate the markers graphically, or even an algorithm to find markers automatically.
The problem is that you're subtracting arrays of unsigned 8-bit integers. This operation can wrap around: a result that would be negative becomes a large positive value instead.
To demonstrate:
>>> import numpy as np
>>> a = np.array([[10,10]],dtype=np.uint8)
>>> b = np.array([[11,11]],dtype=np.uint8)
>>> a - b
array([[255, 255]], dtype=uint8)
Since you're using OpenCV, the simplest way to achieve your goal is to use cv2.absdiff().
>>> cv2.absdiff(a,b)
array([[1, 1]], dtype=uint8)
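Because the two photos in the question are not perfectly identical, thresholding the difference at "greater than 0" will likely mark almost everything as foreground even after the wraparound is fixed. A hedged numpy-only sketch with a tolerance follows; foreground_mask is a hypothetical helper, and the default threshold of 30 is an arbitrary starting value that would need tuning.

```python
import numpy as np

def foreground_mask(gray_a, gray_b, threshold=30):
    """Return a uint8 mask: 255 where the images differ by more than `threshold`."""
    # Widen to a signed type first so the subtraction cannot wrap around
    diff = np.abs(gray_a.astype(np.int16) - gray_b.astype(np.int16))
    return np.where(diff > threshold, 255, 0).astype(np.uint8)

a = np.array([[10, 200, 50]], dtype=np.uint8)
b = np.array([[11, 20, 50]], dtype=np.uint8)
mask = foreground_mask(a, b)
```

A tolerance only absorbs small intensity changes. It does not compensate for the camera shift itself, which would need the images to be aligned first.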
I recommend using OpenCV's grabcut algorithm. You first draw a few lines on the foreground and background, and keep doing this until your foreground is sufficiently separated from the background. It is covered here: https://docs.opencv.org/trunk/d8/d83/tutorial_py_grabcut.html
as well as in this video: https://www.youtube.com/watch?v=kAwxLTDDAwU
I am trying to scale a set of images in skimage. I am using the following code, which works well, except that the new image (rescaled by a factor of 2) is now anchored at the top-left (see below). I would like the image to remain at the original centre. Is there a simple way to achieve this? My aim is for the saved copy of the image (e.g. as a jpg file) to remain centered. My question does not concern the display of the image through imshow. For example, when I save the image as below, it is shifted to the upper left, which causes issues with subsequent steps in my code.
###Part of the code
tform=skimage.transform.SimilarityTransform(scale=2, rotation=0,translation=(0, 0))
rotated = skimage.transform.warp(test, tform)
plt.imshow(rotated)
import scipy
scipy.misc.imsave('rotated.jpg', rotated)
Scaling is a special case of an affine transformation.
The affine transformation matrix for scaling only is defined as
s_x, 0, 0
0, s_y, 0
0, 0, 1
where s_x and s_y are the scaling factors in the respective dimensions (defined relative to the origin at (0,0)). If you want your image to be scaled relative to another point rather than the origin, you first translate the image so that the center of scaling is at the origin, then you scale, then you move the image back. You simply multiply the translation matrices with the scale matrix. I had a similar problem with rotation, which can be found here. The same principle applies to this problem. The result is
s_x, 0, (-s_x*x)+x
0, s_y, (-s_y*y)+y
0, 0, 1
where x and y are half the size of your image in the respective dimensions.
The resulting matrix can be used with:
skimage.transform.AffineTransform(matrix)
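To double-check the matrix above, the translate, scale, translate-back composition can be written out with numpy. The function name scale_about_point is just for illustration.

```python
import numpy as np

def scale_about_point(s_x, s_y, x, y):
    """Build the 3x3 matrix that scales by (s_x, s_y) about the point (x, y)."""
    to_origin = np.array([[1, 0, -x],
                          [0, 1, -y],
                          [0, 0, 1]], dtype=float)
    scale = np.diag([s_x, s_y, 1.0])
    back = np.array([[1, 0, x],
                     [0, 1, y],
                     [0, 0, 1]], dtype=float)
    # Applied right to left: move the point to the origin, scale, move back
    return back @ scale @ to_origin

matrix = scale_about_point(2, 2, 50, 40)
```

The translation column comes out as x - s_x*x and y - s_y*y, matching the closed form given above, and the point (x, y) maps to itself. Note that skimage.transform.warp interprets the transform it is given as the inverse map (output coordinates to input coordinates), so depending on the intended direction you may need to pass the transform's inverse.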