Perspective transform an image segment and transform it back - Python

I am currently trying to cut a region of interest out of an image, do some calculations based on the information inside the snippet, and then either transform the snippet back into the original position or transform some coordinates from the calculation done on the snippet back into the original image.
Here are some code snippets:
x, y, w, h = cv2.boundingRect(localized_mask)
p1 = [x, y + h]
p4 = [x, y]
p3 = [x + w, y]
p2 = [x + w, y + h]
w1 = int(np.linalg.norm(np.array(p2) - np.array(p3)))
w2 = int(np.linalg.norm(np.array(p4) - np.array(p1)))
h1 = int(np.linalg.norm(np.array(p1) - np.array(p2)))
h2 = int(np.linalg.norm(np.array(p3) - np.array(p4)))
maxWidth = max(w1, w2)
maxHeight = max(h1, h2)
neighbor_points = [p1, p2, p3, p4]
output_points = np.float32(
    [
        [0, 0],
        [0, maxHeight],
        [maxWidth, maxHeight],
        [maxWidth, 0],
    ]
)
matrix = cv2.getPerspectiveTransform(np.float32(neighbor_points), output_points)
result = cv2.warpPerspective(
    mask, matrix, (maxWidth, maxHeight), flags=cv2.INTER_LINEAR
)
Here are some images to illustrate this problem:
Original with marked RoI:
Transformed snippet with markings:
I tried to transform the snippet back into the original position with the following code snippets:
test2 = cv2.warpPerspective(
    result, matrix, (maxHeight, maxWidth), flags=cv2.WARP_INVERSE_MAP
)
test3 = cv2.warpPerspective(
    result, matrix, (img.shape[1], img.shape[0]), flags=cv2.WARP_INVERSE_MAP
)
Both resulted in a black image, either with the shape of the snippet or with the shape of the original image.
But I am honestly more interested in the white markings inside the snippet, so I tried to transform these by hand with the following code snippet:
inverse_matrix = cv2.invert(matrix)[1]
inverse_left = []
for point in output_dict["left"]["knots"]:
    trans_point = [point.x, point.y] + [1]
    trans_point = np.float32(trans_point)
    x, y, z = np.dot(inverse_matrix, trans_point)
    new_x = np.uint8(x / z)  # note: np.uint8 wraps any value outside 0..255
    new_y = np.uint8(y / z)
    inverse_left.append([new_x, new_y])
But I didn't account for the position of the RoI inside the image and the resulting coordinates (white dots in the upper left half) didn't end up where I wanted them.
Does anybody have an idea what I am doing wrong or know a better solution to this problem?
Thanks.

Finally found a solution, and it was as simple as I thought it would be...
I first inverted the transformation matrix I used to get my image snippet, then looped over every coordinate from my calculation on the snippet and transformed it back.
The code looks something like this:
inv_matrix = cv2.invert(matrix)
for point in points:
    x, y = (cv2.transform(np.array([[[point.x, point.y]]]), inv_matrix[1]).squeeze())[:2]
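One detail worth noting: cv2.transform with a 3x3 matrix does not divide by the third (homogeneous) component. That is harmless here, because mapping one axis-aligned rectangle onto another is an affine transform, but for a general homography cv2.perspectiveTransform is the safer choice, since it maps a whole batch of points in one call and performs the division. A minimal sketch, assuming matrix from above and a hypothetical array of (x, y) coordinates measured inside the snippet:

import numpy as np
import cv2

# snippet_points is an assumed name for the coordinates found inside the warped snippet
snippet_points = np.float32([[10, 20], [35, 40]]).reshape(-1, 1, 2)

# invert the snippet transform and map the points back into the original image
retval, inverse_matrix = cv2.invert(matrix)
original_points = cv2.perspectiveTransform(snippet_points, inverse_matrix).reshape(-1, 2)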

Related

How do I use scipy.interpolate.griddata similarly to griddata in MATLAB?

I am performing 3D linear interpolation on a set of 3D images (96 × 96 × 60) to apply an affine transform. I used the MATLAB griddata function:
% T = [4*4] affine transform matrix
[X_org, Y_org, Z_org] = meshgrid(1:imgWidth, 1:imgHeight, 1:sliceNum);
x = X_org(:);
y = Y_org(:);
z = Z_org(:);
% Fill in v with input image data
for i = 1:length(x)
    v(i) = img_in(y(i), x(i), z(i));
end
v = v';
tempsource = [X_org(:) Y_org(:) Z_org(:) ones(length(Z_org(:)),1)]';
sourceCoor = inv(T) * tempsource; % TargetCoor = T * sourceCoor, padding 1s to the matrix
% interpolation method = 'linear';
xq = sourceCoor(1,:)';
yq = sourceCoor(2,:)';
zq = sourceCoor(3,:)';
vq = griddata(x, y, z, v, xq, yq, zq);
img_out = reshape(vq, imgWidth, imgHeight, sliceNum); % the correct output
However, when I try the same thing in Python, scipy.interpolate.griddata behaves differently. I first create the same inputs, x, y, z, v, xq, yq, zq, for the function:
# T = [4*4] affine transform matrix
Y_org, X_org, Z_org = np.mgrid[1:imgWidth+1, 1:imgHeight+1, 1:sliceNum+1]
x = X_org.flatten(order='F')
y = Y_org.flatten(order='F')
z = Z_org.flatten(order='F')
v = np.zeros_like(x, dtype=np.float32)
for i in range(len(x)):
    v[i] = img_in[x[i]-1, y[i]-1, z[i]-1]
v = v.T
tempSource = np.vstack((x, y, z, np.ones(z.shape[0])))
sourceCoor = np.linalg.inv(T) @ tempSource
xq = sourceCoor[0, :].T
yq = sourceCoor[1, :].T
zq = sourceCoor[2, :].T
# every variable before this line is the same as in MATLAB
# vq = griddata((x, y, z), v, (xq, yq, zq))
# vq = griddata((xq, yq, zq), v, (x, y, z))
img_out = np.reshape(vq, (imgHeight, imgWidth, sliceNum))
The two approaches in the last three lines of the Python snippet do not work: the first takes forever to execute, while the second returns a different and weird result. I know that Python is 0-indexed and MATLAB is 1-indexed, so I made some modifications in the Python version (v[i] = img_in[x[i]-1, y[i]-1, z[i]-1], with the order of the indices I pass in also changed to keep v the same, etc.), but I am worried the griddata function in Python will still assume 0-indexed coordinates, so there would be no solution for my 1-indexed input. Is that the cause of my outcome? And how can I get the correct output image, assuming 0-indexing?
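Since the sample points here lie on a regular grid, a faster route worth knowing about (a sketch under my own assumptions: T maps output voxel coordinates to input voxel coordinates, everything 0-indexed) is scipy.ndimage.map_coordinates, which interpolates directly on a structured grid and avoids the Delaunay triangulation that makes griddata so slow on roughly 550k scattered points:

import numpy as np
from scipy.ndimage import map_coordinates

# img_in: (imgHeight, imgWidth, sliceNum) volume; T: 4x4 affine matrix (assumed 0-indexed here)
yy, xx, zz = np.mgrid[0:imgHeight, 0:imgWidth, 0:sliceNum]
target = np.vstack((xx.ravel(), yy.ravel(), zz.ravel(), np.ones(xx.size)))
source = np.linalg.inv(T) @ target           # where each output voxel samples from
coords = [source[1], source[0], source[2]]   # map_coordinates expects (row, col, slice) order
img_out = map_coordinates(img_in, coords, order=1).reshape(imgHeight, imgWidth, sliceNum)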

What are the inaccuracies of this 'inverse map' function in OpenCV?

I am trying to horizontally stretch an image in a very specific way. Each x prime coordinate should follow a tangent path with respect to the original x coordinate. I believe there are two ways to do this:
Invert the tangent function and map it normally
Map the tangent function and then invert the mapping
Using this answer for map inversion, I'm trying to figure out why the two images are not the same. I know that the first method gives me the correct image that I'm looking for, so why doesn't the second method work? Is it because of the "limited precision" that @ChristophRackwitz commented on in the answer?
import cv2
import glob
import numpy as np
import math

A = -1010
B = -3.931
C = 5.258
D = 978.3
M = -193.8
N = 1740

def get_tan_func_value(x):
    return A * math.tan((((x-N)/M)+B)/C) + D

def get_inverse_tan_func_value(x):
    return M * (C*math.atan((x-D)/A) - B) + N

# answer from linked post
def invert_map(F, shape):
    I = np.zeros_like(F)
    I[:,:,1], I[:,:,0] = np.indices(shape)
    P = np.copy(I)
    for i in range(10):
        P += I - cv2.remap(F, P, None, interpolation=cv2.INTER_LINEAR)
    return P

# import image
images = glob.glob('*.jpg')
img = cv2.imread(images[0])
h, w = img.shape[:2]

map_x_tan = np.zeros((img.shape[0], img.shape[1]), dtype=np.float32)
map_x_inverse_tan = np.zeros((img.shape[0], img.shape[1]), dtype=np.float32)
map_y = np.zeros((img.shape[0], img.shape[1]), dtype=np.float32)

# x tan function map
for i in range(map_x_tan.shape[0]):
    map_x_tan[i,:] = [get_tan_func_value(x) for x in range(map_x_tan.shape[1])]

# x inverse tan function map
for i in range(map_x_inverse_tan.shape[0]):
    map_x_inverse_tan[i,:] = [get_inverse_tan_func_value(x) for x in range(map_x_inverse_tan.shape[1])]

# default y map
for j in range(map_y.shape[1]):
    map_y[:,j] = [y for y in range(map_y.shape[0])]

# convert x tan map to 2 channel (x,y) map
(xymap_tan, _) = cv2.convertMaps(map1=map_x_tan, map2=map_y, dstmap1type=cv2.CV_32FC2)

# invert the 2 channel x tan map
xymap_inverted = invert_map(xymap_tan, (h,w))

# remap and write the target image (inverse tan function with normal map)
target = cv2.remap(img, map_x_inverse_tan, map_y, cv2.INTER_LINEAR)
cv2.imwrite("target.jpg", target)

# remap and write the attempted image (normal tan function with inverted map)
attempt = cv2.remap(img, xymap_inverted, None, cv2.INTER_LINEAR)
cv2.imwrite("attempt.jpg", attempt)
Method 1: Target Image
Method 2: Attempt Image
The results show that the attempt (normal tan function with inverted map) has less stretching near the edges of the image than expected. Everywhere else the two images are virtually identical; only the edges differ. I did not post the original picture to save space.
I've played around with that invert_map procedure; it seems slightly susceptible to oscillation. Use this instead:
def invert_map(F):
    (h, w) = F.shape[:2]  # (h, w, 2), "xymap"
    I = np.zeros_like(F)
    I[:,:,1], I[:,:,0] = np.indices((h,w))  # identity map
    P = np.copy(I)
    for i in range(10):
        correction = I - cv2.remap(F, P, None, interpolation=cv2.INTER_LINEAR)
        P += correction * 0.5
    return P
I simply damped the correction by 0.5, which makes the fixed-point iteration tamer and also converges a lot faster.
In my experiments with your tan map, I've found that 5-10 iterations are already good enough, with no further progress from additional iterations.
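If you want to watch the convergence yourself, one way (a sketch built on the invert_map above, with my own max_iter and tol parameters) is to track the size of the correction per iteration and stop once it falls below a fraction of a pixel:

def invert_map_verbose(F, max_iter=10, tol=1e-3):
    (h, w) = F.shape[:2]
    I = np.zeros_like(F)
    I[:,:,1], I[:,:,0] = np.indices((h, w))
    P = np.copy(I)
    for i in range(max_iter):
        correction = I - cv2.remap(F, P, None, interpolation=cv2.INTER_LINEAR)
        P += correction * 0.5
        err = np.abs(correction).max()
        print(f"iteration {i}: max correction = {err:.4f} px")
        if err < tol:  # converged to sub-tolerance accuracy
            break
    return P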
Entire notebook of my explorations: https://gist.github.com/crackwitz/67f76f8a9eff21476b080c06d20660d0
Feature request: https://github.com/opencv/opencv/issues/22120

Most efficient way to transform a 2D array to a different coordinate system using a function, then interpolate the resultant holes

To start, I'm basically trying to go from this:
To this:
Each coordinate [x, y] corresponds to a given point in the second image after a function is applied to x and y: f(x, y) = coordinates in the second image for the value at [x, y]. The way I'm handling this part as of now is to make "map" arrays for x and y and then look up the new point in those arrays, so mapArrayX[x] gives the new x value and mapArrayY[y] gives the new y value. The issue with this is that I have to iterate over the entire image (256,000 points), and that takes roughly 0.4 seconds. Is there a better way to do this?
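For the per-point loop specifically, NumPy fancy indexing can do the whole scatter in a single call, which is usually far faster than a Python loop over 256,000 points. A rough sketch, assuming integer xMap/yMap arrays with the same shape as the data, as in the code further down:

import numpy as np

# assumed shapes: data is (1024, 256); xMap/yMap hold the integer target coordinates per sample
dewarpedImage = np.zeros((400, 762))
dewarpedImage[yMap.astype(int), xMap.astype(int)] = data
# if several samples land on the same target pixel, one of them wins, just as in the loop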
The second issue is that after transforming the coordinates I get an image with holes in it that looks like this:
which I make look like the image above, without the holes, by doing this:
dewarpedImage[dewarpedImage == 0] = np.nan
x = np.arange(0, dewarpedImage.shape[1])
y = np.arange(0, dewarpedImage.shape[0])
# mask invalid values
dewarpedImage = np.ma.masked_invalid(dewarpedImage)
xx, yy = np.meshgrid(x, y)
# get only the valid values
x1 = xx[~dewarpedImage.mask]
y1 = yy[~dewarpedImage.mask]
newarr = dewarpedImage[~dewarpedImage.mask]
startTime = time.time()
dewarpedImage = interpolate.griddata((x1, y1), newarr.ravel(),
                                     (xx, yy),
                                     method='linear')
This takes roughly 3 seconds to perform. Is there a faster way to do this? Ideally I need to get the whole process down from 3+ seconds to under 1 second.
Here is my conversion function/how I generate my mapping:
RANGE_BIN_SIZE = .39

def rangeBinToRange(rangeBin):
    return rangeBin * RANGE_BIN_SIZE

def azToDegree(azBin):
    degree = math.degrees(math.asin((azBin - 127.5) * 0.3771/(0.19812*255)))
    return degree

def makeWarpMap():
    print("making warp maps")
    xMap = np.zeros((1024, 256))
    yMap = np.zeros((1024, 256))
    for az in range(256):
        for rang in range(1024):
            azDegree = azToDegree(az)
            dist = rangeBinToRange(rang)
            x = round(dist * math.sin(math.radians(azDegree)) + 381)
            y = round(dist * math.cos(math.radians(azDegree)))
            xMap[rang][az] = x
            yMap[rang][az] = y
    np.save("warpmapX", xMap)
    np.save("warpmapY", yMap)

print(azToDegree(0))
if not path.exists("warpmapX.npy") or not path.exists("warpmapY.npy"):
    makeWarpMap()

data = np.load(filename)
xMap = np.load("warpmapX.npy")
yMap = np.load("warpmapY.npy")
dewarpedImage = np.zeros((400, 762))
print(data.shape)
for az in range(256):
    azslice = data[:, az]
    for rang in range(1024):
        intensity = azslice[rang]
        x = xMap[rang][az]
        y = yMap[rang][az]
        dewarpedImage[int(y)][int(x)] = intensity
You have holes in your converted image because your conversion does not span the entire output image. I would recommend doing the reverse conversion instead. In other words, for each pixel (x, y) of the destination image, find the corresponding point in the source image and take that value. That way you won't need to deal with holes at all, and it will give you a full image (it also gets rid of the 3-second interpolation step). If you provide your conversion function, I can help you do the reverse conversion.
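A rough sketch of what that reverse lookup could look like for the conversion above (my own derivation from makeWarpMap, i.e. assuming x = dist·sin(az) + 381, y = dist·cos(az), RANGE_BIN_SIZE = 0.39 and a 1024×256 data array; not code from the answer itself):

import numpy as np

def dewarp_by_reverse_lookup(data, out_shape=(400, 762)):
    # pixel grid of the destination (dewarped) image
    yy, xx = np.mgrid[0:out_shape[0], 0:out_shape[1]].astype(float)
    dx = xx - 381.0                          # undo the x offset used in makeWarpMap
    dist = np.hypot(dx, yy)                  # range, in the same units as rangeBinToRange
    azDeg = np.degrees(np.arctan2(dx, yy))   # azimuth angle, 0 degrees pointing straight up
    # invert rangeBinToRange and azToDegree to get fractional bin indices
    rangBin = dist / 0.39
    azBin = 127.5 + np.sin(np.radians(azDeg)) * (0.19812 * 255) / 0.3771
    # nearest-neighbour lookup, masking everything that falls outside the source array
    r = np.rint(rangBin).astype(int)
    a = np.rint(azBin).astype(int)
    valid = (r >= 0) & (r < data.shape[0]) & (a >= 0) & (a < data.shape[1])
    out = np.zeros(out_shape)
    out[valid] = data[r[valid], a[valid]]
    return out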

Apply rotation defined by Euler angles to 3D image, in python

I'm working with 3D images and have to rotate them according to Euler angles (phi,psi,theta) in 'zxz' convention (these Euler angles are part of a dataset, so I have to use that convention). I found the function scipy.ndimage.rotate that seems useful in that regard.
arrayR = scipy.ndimage.rotate(array , phi, axes=(0,1), reshape=False)
arrayR = scipy.ndimage.rotate(arrayR, psi, axes=(1,2), reshape=False)
arrayR = scipy.ndimage.rotate(arrayR, the, axes=(0,1), reshape=False)
Sadly, this does not do what is intended. Here is why:
Definition:
In the z-x-z convention, the x-y-z frame is rotated three times: first
about the z-axis by an angle phi; then about the new x-axis by an
angle psi; then about the newest z-axis by an angle theta.
However, with the above code the rotations are always performed with respect to the original axes, which is why the obtained rotations are not correct. Does anyone have a suggestion for obtaining the correct rotations, as explained in the definition?
In other words, in the present 'zxz' convention the rotations are intrinsic (rotations about the axes of the rotating coordinate system XYZ, fixed to the moving body, which changes its orientation after each elemental rotation). If I use the above code, the rotations are extrinsic (rotations about the axes xyz of the original coordinate system, which is assumed to remain motionless). I need a way of doing intrinsic rotations in Python.
I found a satisfying solution following this link: https://nbviewer.jupyter.org/gist/lhk/f05ee20b5a826e4c8b9bb3e528348688
This method uses np.meshgrid and scipy.ndimage.map_coordinates. The linked notebook uses a third-party library to generate the rotation matrix, whereas I use scipy.spatial.transform.Rotation. This class lets you define both intrinsic and extrinsic rotations: see the description of scipy.spatial.transform.Rotation.from_euler.
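Concretely, the case of the axis letters selects the convention in from_euler: uppercase 'ZXZ' means intrinsic, lowercase 'zxz' means extrinsic, and an intrinsic rotation is the same as the extrinsic one with the angle order reversed. A quick check, independent of any image data:

import numpy as np
from scipy.spatial.transform import Rotation as R

phi, psi, theta = 30.0, 40.0, 50.0  # example angles in degrees

r_intrinsic = R.from_euler('ZXZ', [phi, psi, theta], degrees=True)
r_extrinsic = R.from_euler('zxz', [theta, psi, phi], degrees=True)

# both describe exactly the same rotation
print(np.allclose(r_intrinsic.as_matrix(), r_extrinsic.as_matrix()))  # True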
Here is my function:
import numpy as np
from scipy.spatial.transform import Rotation as R
from scipy.ndimage import map_coordinates

# Rotates a 3D image around the image center
# INPUTS
#   array: 3D numpy array
#   orient: list of Euler angles (phi, psi, the)
# OUTPUT
#   arrayR: rotated 3D numpy array
# by E. Moebel, 2020
def rotate_array(array, orient):
    phi = orient[0]
    psi = orient[1]
    the = orient[2]

    # create meshgrid
    dim = array.shape
    ax = np.arange(dim[0])
    ay = np.arange(dim[1])
    az = np.arange(dim[2])
    coords = np.meshgrid(ax, ay, az)

    # stack the meshgrid to position vectors, center them around 0 by subtracting dim/2
    xyz = np.vstack([coords[0].reshape(-1) - float(dim[0]) / 2,   # x coordinate, centered
                     coords[1].reshape(-1) - float(dim[1]) / 2,   # y coordinate, centered
                     coords[2].reshape(-1) - float(dim[2]) / 2])  # z coordinate, centered

    # create transformation matrix
    r = R.from_euler('zxz', [phi, psi, the], degrees=True)
    mat = r.as_matrix()

    # apply transformation
    transformed_xyz = np.dot(mat, xyz)

    # extract coordinates
    x = transformed_xyz[0, :] + float(dim[0]) / 2
    y = transformed_xyz[1, :] + float(dim[1]) / 2
    z = transformed_xyz[2, :] + float(dim[2]) / 2
    x = x.reshape((dim[1], dim[0], dim[2]))
    y = y.reshape((dim[1], dim[0], dim[2]))
    z = z.reshape((dim[1], dim[0], dim[2]))  # reason for strange ordering: see next line

    # the coordinate system seems to be strange, it has to be ordered like this
    new_xyz = [y, x, z]

    # sample
    arrayR = map_coordinates(array, new_xyz, order=1)
    return arrayR
Note:
You can also use this function for intrinsic rotations; simply adapt the first argument of from_euler to your Euler convention. In that case you obtain results equivalent to those in my first post (using scipy.ndimage.rotate). However, I noticed that the present code is about 3x faster (0.01 s for a 40^3 volume) than scipy.ndimage.rotate (0.03 s for a 40^3 volume).
Hope this will help someone!
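A minimal usage sketch (my own example, assuming the rotate_array above and a random test volume):

import numpy as np

volume = np.random.rand(40, 40, 40).astype(np.float32)
rotated = rotate_array(volume, [30.0, 45.0, 60.0])  # (phi, psi, the) in degrees
print(rotated.shape)  # (40, 40, 40)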
There seems to be a bit of confusion about the "axes" parameter in your first post. To do a rotation about the x axis, the plane of rotation would be the yz plane, which means your "axes" parameter should be set to (1,2). Also, the first and the third rotations are presumably about the x and z axes, but both of those rotations are in the xy plane. Could these possibly be the reasons behind the discrepancies in your answers? I am not convinced by your explanation about the new and original axes. The independent calls to the "rotate" function do not have access to the old data in any form or shape; each call only sees the new axes and the rotated array.
I checked the code at https://nbviewer.jupyter.org/gist/lhk/f05ee20b5a826e4c8b9bb3e528348688
There is a minor bug: the tested image is square, but with a rectangular image it will run into problems. Below are corrected versions for 2D and 3D rotations (note that the Euler angle sequence used in my example is 'ZYZ'; you should define this before using it):
def rotate_array_2D(array, orient):
    # create a transformation matrix
    angle = orient/180.*np.pi
    c = np.cos(angle)
    s = np.sin(angle)
    mat = np.array([[c, s], [-s, c]])

    # create meshgrid
    dim = array.shape
    ax = np.arange(dim[0])
    ay = np.arange(dim[1])
    coords = np.meshgrid(ax, ay)

    # stack the meshgrid to position vectors, center them around 0 by subtracting dim/2
    xy = np.vstack([coords[0].reshape(-1) - float(dim[0]) / 2,   # x coordinate, centered
                    coords[1].reshape(-1) - float(dim[1]) / 2])  # y coordinate, centered

    # apply transformation
    transformed_xy = np.dot(mat, xy)

    # extract coordinates
    x = transformed_xy[0, :] + float(dim[0]) / 2
    y = transformed_xy[1, :] + float(dim[1]) / 2
    x = x.reshape((dim[1], dim[0]))
    y = y.reshape((dim[1], dim[0]))
    new_xy = [x, y]

    # sample
    arrayR = map_coordinates(array, new_xy, order=1).T
    return arrayR

def rotate_array_3D(array, orient):
    rot = orient[0]
    tilt = orient[1]
    phi = orient[2]

    # create meshgrid
    dim = array.shape
    ax = np.arange(dim[0])
    ay = np.arange(dim[1])
    az = np.arange(dim[2])
    coords = np.meshgrid(ax, ay, az)

    # stack the meshgrid to position vectors, center them around 0 by subtracting dim/2
    xyz = np.vstack([coords[0].reshape(-1) - float(dim[0]) / 2,   # x coordinate, centered
                     coords[1].reshape(-1) - float(dim[1]) / 2,   # y coordinate, centered
                     coords[2].reshape(-1) - float(dim[2]) / 2])  # z coordinate, centered

    # create transformation matrix
    r = R.from_euler('ZYZ', [rot, tilt, phi], degrees=True)
    mat = r.as_matrix()

    # apply transformation
    transformed_xyz = np.dot(mat, xyz)

    # extract coordinates
    x = transformed_xyz[0, :] + float(dim[0]) / 2
    y = transformed_xyz[1, :] + float(dim[1]) / 2
    z = transformed_xyz[2, :] + float(dim[2]) / 2
    x = x.reshape((dim[1], dim[0], dim[2]))
    y = y.reshape((dim[1], dim[0], dim[2]))
    z = z.reshape((dim[1], dim[0], dim[2]))  # I tested the rotation in 2D and this strange ordering can be explained
    new_xyz = [x, y, z]

    arrayR = map_coordinates(array, new_xyz, order=1).T
    return arrayR
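A quick usage sketch for the 2D version (my own example; it assumes numpy and map_coordinates are imported as in the earlier answer, and uses a non-square image, which is the case this rewrite is meant to handle):

import numpy as np

img = np.random.rand(40, 60).astype(np.float32)  # deliberately non-square
rotated = rotate_array_2D(img, 30.0)             # angle in degrees
print(rotated.shape)                             # (40, 60)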

Understanding opencv's decomposeHomographyMat outputs

I'm trying to find the angle required to move my camera so it's directly in front of an object. If my camera is looking at the object at a 30 degree angle from the left, then my script should return 30 degrees. I'm using cv2.decomposeHomographyMat to find a rotation matrix which works fine. There are 4 solutions returned from this function, so in my script I am outputting 4 angles. Of these angles, there are only two unique angles. My problem is I don't know which of these two angles is correct.
I know that decomposeHomographyMat returns four possible solutions, but shouldn't the angles be the same? I also found the coordinates of my points projected onto a 2D plane, but I wasn't sure what to do with this information with regard to finding which angle is correct (here pts3D are the 2D points of the object taken from the camera image, with a 0 added as the z column to make them 3D points):
for i in range(len(Ts)):
    projectedPts = cv2.projectPoints(pts3D, Rs[i], Ts[i], CAM_MATRIX, DIST_COEFFS)[0][:, 0, :]
Here is a snippet from my code. Maybe I am incorrectly determining the angles from the rotation matrix? In my example below, y1 and y2 will be the same angle, and y3 and y4 will be the same angle. Can someone help explain how I determine which angle is the correct angle, and why there are two different angles returned?
def rotationMatrixToEulerAngles(R):
    sy = math.sqrt(R[0][0] * R[0][0] + R[1][0] * R[1][0])
    singular = sy < 1e-6
    if not singular:
        x = math.atan2(R[2][1], R[2][2])
        y = math.atan2(-R[2][0], sy)
        z = math.atan2(R[1][0], R[0][0])
    else:
        x = math.atan2(-R[1][2], R[1][1])
        y = math.atan2(-R[2][0], sy)
        z = 0
    return np.rad2deg(y)
H, status = cv2.findHomography(corners, REFPOINTS)
output = cv2.warpPerspective(frame, H, (800,800))
# Rs returns 4 matrices, we use the first one
_, Rs, Ts, Ns = cv2.decomposeHomographyMat(H, CAM_MATRIX)
y1 = rotationMatrixToEulerAngles(Rs[0])
y2 = rotationMatrixToEulerAngles(Rs[1])
y3 = rotationMatrixToEulerAngles(Rs[2])
y4 = rotationMatrixToEulerAngles(Rs[3])
Thanks!
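A note on narrowing down the four solutions: decomposeHomographyMat also returns the plane normals Ns, and a common filter is to keep only the decompositions for which the observed plane actually faces the camera (newer OpenCV versions also provide cv2.filterHomographyDecompByVisibleRefpoints for this). A simplified sketch of that test, assuming corners, Rs, Ns and CAM_MATRIX as above (the sign convention of the normals is the one I expect from decomposeHomographyMat; flip the test if yours is opposite):

import numpy as np
import cv2

def pick_visible_solutions(Rs, Ns, corners, cam_matrix):
    # express the observed plane points as normalized camera rays (z = 1)
    pts = cv2.undistortPoints(np.float32(corners).reshape(-1, 1, 2), cam_matrix, None)
    rays = np.hstack([pts.reshape(-1, 2), np.ones((len(pts), 1))])  # N x 3

    keep = []
    for i, n in enumerate(Ns):
        # the plane normal should point towards the camera for every observed point
        if np.all(rays @ np.asarray(n).reshape(3) > 0):
            keep.append(i)
    return keep

candidates = pick_visible_solutions(Rs, Ns, corners, CAM_MATRIX)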
