Reading pose using opencv2 - question about angles - python

Let's consider the code given in the following link:
https://towardsdatascience.com/head-pose-estimation-using-python-d165d3541600
I'm trying to analyze and understand everything that is behind it. One of the things that I cannot understand is this part:
# Get angles
angles, mtxR, mtxQ, Qx, Qy, Qz = cv2.RQDecomp3x3(rmat)
# Get the y rotation degree
x = angles[0] * 360
y = angles[1] * 360
I don't understand why we multiply by 360 here. The explanation given is that the output of cv2.RQDecomp3x3(rmat) is in radians, so to obtain angles in degrees we need to multiply it by 360. But does that make any sense? Since 1 radian is 180/np.pi, approximately 57 degrees, why do we multiply by 360 and not by 57?
But also let's consider the following code:
rot_mat = np.array([[1, 0, 0], [0, 0, -1], [0, 1, 0]])
angles, _, _, _, _, _ = cv2.RQDecomp3x3(rot_mat)
angles
(90.0, 0.0, 0.0)
Now, if the output is in radians, why is it 90? Then, according to the code in the link, the angle should be 90 * 360 = 32400, which makes no sense at all.
So my questions are:
Why do we multiply by 360 and not by 57 if it's in radians?
Do we really need to multiply it? Aren't those numbers angles already?
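A quick way to sanity-check the units: OpenCV's documentation describes the Euler angles returned by cv2.RQDecomp3x3 as being in degrees, which agrees with the 90.0 shown above. The sketch below recomputes the x rotation of the same matrix with NumPy alone, assuming rot_mat is a pure rotation about the x axis. The helper name x_angle_deg is made up for illustration and is not part of OpenCV.

```python
import numpy as np

# Same matrix as in the question: a rotation about the x axis.
rot_mat = np.array([[1, 0, 0],
                    [0, 0, -1],
                    [0, 1, 0]], dtype=float)

def x_angle_deg(R):
    """Angle of a pure x-axis rotation, in degrees (hypothetical helper).

    For R = Rx(a): R[2, 1] = sin(a) and R[2, 2] = cos(a).
    """
    return np.degrees(np.arctan2(R[2, 1], R[2, 2]))

angle = x_angle_deg(rot_mat)
print(angle)  # close to 90, matching the first value from RQDecomp3x3
```

If the returned values are already degrees, as this check suggests, no radians-to-degrees factor (neither 57 nor 360) would be needed to read them as angles.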

Related

Why is the angle of a point (X, Y) w.r.t. the origin after rotation different than before rotation in Python?

I have two doubts. I have X and Y coordinates which I have listed below. I also have plotted coordinates as shown in picture below.
x = [0, 1, 1, 0, 0, 1]
y = [1, 1, 2, 2, 3, 3]
Now, I have decided to rotate the geometry clockwise. Therefore, I have rotated all points by 45 degrees (+ve) using the formula below.
x_dash = x[i] * math.cos(theta) + y[i] * math.sin(theta)
y_dash = -x[i] * math.sin(theta) + y[i] * math.cos(theta)
After using the above code (formula), I got the results below, which show the new coordinates after the 45 degree clockwise rotation; after plotting them, I got the plot below.
x_dash = [0.8509035245341184, 1.3762255133518482, 2.2271290378859665, 1.7018070490682369, 2.552710573602355, 3.078032562420085]
y_dash = [0.5253219888177297, -0.3255815357163887, 0.19974045310134103, 1.0506439776354595, 1.575965966453189, 0.7250624419190707]
My questions:
(1) if I take two coordinates (X and Y) of one point and if I find an angle using
theta = np.degrees(np.arctan2(y, x)),
I did not get 45 degree.
For example:
np.degrees(np.arctan2(0.5253219888177297, 0.8509035245341184))
Result: 31.68992191129556
However, when I computed the angle of the 1st point before rotation with np.degrees(np.arctan2(1, 0)), I got 90.0.
I would like to know why there is a difference between the angle of the same point before and after the rotation.
(2) If I have a rotated geometry like in the 2nd picture and I do not know the angle of rotation, what should I do to get that geometry without rotation (like in the first picture)?
Kindly help me with these questions.
By default, math.sin() and math.cos() assume that their arguments are in radians. So the code treats the angle of rotation as 45 radians, not 45 degrees.
You can define theta as:
theta = numpy.radians(45)
Hope this clarifies everything.
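To make the difference concrete, here is a small sketch using the same formula as the question. The function name rotate_clockwise is only for illustration. It rotates the first point (0, 1) once with theta converted to radians and once with the raw value 45, which reproduces the first x_dash and y_dash values listed in the question.

```python
import math

def rotate_clockwise(x, y, theta):
    """Rotate (x, y) clockwise about the origin by theta radians."""
    x_dash = x * math.cos(theta) + y * math.sin(theta)
    y_dash = -x * math.sin(theta) + y * math.cos(theta)
    return x_dash, y_dash

# Correct: convert 45 degrees to radians first.
xr, yr = rotate_clockwise(0, 1, math.radians(45))
angle_after = math.degrees(math.atan2(yr, xr))  # about 45 (it was 90 before)

# What the question's code did: 45 is interpreted as 45 radians.
wrong = rotate_clockwise(0, 1, 45)
print(angle_after, wrong)
```

With the conversion in place, the first point moves from 90 degrees to about 45 degrees, which is a 45 degree clockwise rotation.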

2D image coordinate to 3D space coordinate through camera matrix

I am trying to get a grasp of how to project 2D coordinates into a 3D space through my camera matrix, but I can't for the love of it, understand it.
So I am hoping that someone here can point me to a guide or something that can help me. Here is what I got:
I have read and tried all of these articles to try and understand the material:
Find 3D coordinate with respect to the camera using 2D image coordinates
https://en.wikipedia.org/wiki/Camera_matrix
https://se.mathworks.com/help/vision/ug/camera-calibration.html#bu0nh2_
https://staff.fnwi.uva.nl/r.vandenboomgaard/IPCV20162017/LectureNotes/CV/PinholeCamera/PinholeCamera.html
https://towardsdatascience.com/camera-calibration-fda5beb373c3
So I have a camera that is pointing "straight down" towards a table and it is centered on the table. I am guessing that from this I can create my translation matrix and rotation matrix (I am unsure what angle down is compared to 0degree, 90 or 180?)
T = [0.0, 0.0, 0.0]
R = [[cos(angle), -sin(angle), 0.0],
[sin(angle), cos(angle), 0.0],
[0.0, 0.0, 1.0]]
These are my extrinsic matrices.
My 2D photo is 1280x720 px and my camera's focal length is 1.88 mm, and from this I can create a camera matrix based on this:
fx = 1280 / 1.88
fy = 720 / 1.88
u0 = 1280 / 2
v0 = 720 / 2
K = [[0.00146875, 0.0, 640.0, 0.0],
[0.0, 0.00261111, 360.0, 0.0],
[0.0, 0.0, 1.0, 0.0]]
I know that the distance between my camera and the table is 650mm
As far as I understand I am supposed to use linear algebra or matrix multiplication to take my 2D coordinate (300, 200) and put it into 3D space, but how to actually do it I can't seem to figure out.
It seems like a lot of the material I can find is about matching a 3D coordinate in 2D space.
From this question
How do I reverse-project 2D points into 3D?
I found this formula:
mat = [
[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1],
]
x = mat[0][0] * p.x + mat[0][1] * p.y + mat[0][2] * p.z + mat[0][3] * 1
y = mat[1][0] * p.x + mat[1][1] * p.y + mat[1][2] * p.z + mat[1][3] * 1
w = mat[3][0] * p.x + mat[3][1] * p.y + mat[3][2] * p.z + mat[3][3] * 1
But again I am not sure if this is the way to go since it gives me some weird results.
I am really hoping someone can help me out. Please request if any more information is needed.
Edit:
I noticed that there are two different formulas for the intrinsic camera matrix:
which have u0, v0 and cx, cy in different locations, but they both express the center of the image.
Which one is correct to use and with what units, mm or pixels?
I looked into vector x matrix multiplication and I think I understand that part.
The second matrix formula with cx,cy in the third row will never consider the Z distance because of the way the multiplication works. Again I am not entirely sure how this works, but that does not make sense to me?
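As a rough sketch of the back-projection step, assume an ideal pinhole camera without lens distortion, an optical axis perpendicular to the table (so every table point has depth Z = 650 mm in the camera frame), and a 3x3 intrinsic matrix with focal lengths expressed in pixels. The values fx = fy = 900 below are placeholders: converting 1.88 mm to pixels needs the sensor's pixel size, which the question does not give. The function name backproject is made up for illustration.

```python
import numpy as np

# Hypothetical intrinsics in pixels; fx and fy are placeholders.
fx, fy = 900.0, 900.0
u0, v0 = 1280 / 2, 720 / 2
K = np.array([[fx, 0.0, u0],
              [0.0, fy, v0],
              [0.0, 0.0, 1.0]])

def backproject(u, v, depth, K):
    """Camera-frame (X, Y, Z) of pixel (u, v) at a known depth Z."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # ray scaled so that Z = 1
    return depth * ray

point = backproject(300, 200, 650.0, K)  # result in mm, same unit as depth

# Projecting the 3D point again should give back the original pixel.
reprojected = K @ point / point[2]
print(point, reprojected[:2])
```

A single pixel only defines a ray, so the known camera-to-table distance is what pins down one 3D point on that ray. With a camera that is rotated or shifted relative to the world frame, the result would additionally have to be transformed by the extrinsics.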

Rotating data by angle theta

This might be a simple question. Let's say you have 2D normal data that you want to rotate 90 degrees counterclockwise. To do this you can construct a rotation matrix with theta = np.pi / 2 and multiply the data by it. That works great. However, when I try to rotate the data by 45 degrees (np.pi / 4), it does not work: it appears to have rotated the data clockwise, and flipping the sign of the angle does not change the resulting plot. How can I rotate the data 45 degrees counterclockwise?
cov = np.array([[1, .7], [.7, 1]])
data = np.random.multivariate_normal(np.zeros(2), cov, size=10000)
theta = np.pi / 4
rot_matrix = np.array([[np.cos(theta), -np.sin(theta)],
                       [np.sin(theta), np.cos(theta)]])
data_rot = (rot_matrix @ data.T).T
fig, axes = plt.subplots(2)
axes[0].scatter(data[:, 0], data[:, 1])
axes[1].scatter(data_rot[:, 0], data_rot[:, 1])
fig.show()
yields the image:
(whereas I expected a 45 degree counterclockwise rotation to make the data look like a vertical line), while changing theta to np.pi / 2 yields the correct image:
Your rotation matrix is correct. It's matplotlib's automatic axis scaling that makes it look as though the rotation is wrong. Try adding these before fig.show():
axes[0].set_aspect("equal")
axes[1].set_aspect("equal")
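The rotation can also be checked numerically, independent of any plot. The sketch below uses the covariance from the question rather than random samples: the covariance of rotated data is R @ cov @ R.T, and the long axis of the original distribution points along (1, 1).

```python
import numpy as np

cov = np.array([[1, .7], [.7, 1]])
theta = np.pi / 4
rot_matrix = np.array([[np.cos(theta), -np.sin(theta)],
                       [np.sin(theta), np.cos(theta)]])

# Covariance of the rotated data.
cov_rot = rot_matrix @ cov @ rot_matrix.T

# The long axis of the original data, rotated by the same matrix.
major_axis = rot_matrix @ (np.array([1.0, 1.0]) / np.sqrt(2))

print(cov_rot)     # variance along x is 0.3, along y is 1.7
print(major_axis)  # points along the y axis
```

The larger variance ends up on the y axis, so the rotated data is the expected vertical cloud; only the unequal axis scaling hides it in the figure.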

Find point along line a specified distance from a polygon

Given a 2-D closed polygon defined by a series of points and an infinite line, I would like to find points on that line a specified distance from the polygon. The polygon is known to be closed, not intersecting, and not containing 3 consecutive collinear points. In general there are many possible points along the line. Ideally I would like to find them all, or alternatively the one nearest some initial guess location. I am using python but a solution in any language would be helpful. I believe scipy.spatial kdtree will be one important component, but I cannot see how to do the whole solution. Here is some code to define the problem, which shows at least some of the corner cases involved:
import numpy as np
import matplotlib.pyplot as plt
poly = np.array([[0, 0],
[10, 0],
[10, 3],
[1, 1],
[1, 6],
[0, 6],
[.8, 4],
[0, 0]])
line = np.array([[-2, 4.5],
[12, 3]])
plt.plot(poly[:, 0], poly[:, 1])
plt.plot(line[:, 0], line[:, 1])
plt.xlim([-1, 11])
plt.ylim([-1, 7])
plt.show()
points = find_points_distance_from_polygon(poly, line, distance)
Edit: I am looking for the algorithm to find the points.
Update:
What I have tried so far is an approximate solution using the distance to each point. My thought was that if I refined the polygon by adding additional points along each line, then this approach might be accurate enough. However I would have to add a lot of points if the distance was small. I thought there is probably a better way.
import scipy.spatial as spatial
import scipy.optimize as opt
import math
def find_point_distance_from_polygon_along_line(tree, line, dist, guess_ratio):
    def f(x):
        pt = line[0, :] + x * (line[1, :] - line[0, :])
        d, i = tree.query(pt)
        return math.fabs(d - dist)
    res = opt.minimize(f, [guess_ratio])
    return line[0, :] + res.x * (line[1, :] - line[0, :])
tree = spatial.cKDTree(poly)
pt = find_point_distance_from_polygon_along_line(tree, line, 1, 0)
For the example in the plot and a distance of 0.5, I expect to find 4 points at approximately (.1, 4.2), (1.5, 4.1), (9.1, 3.3), and (10.5, 3.1). My current plan would find more points, particularly points which are some distance from the opposite edge of the polygon. I want the line connecting the point on the line to the polygon to be external to the polygon.
If the number of polygon edges is reasonable, you can use a simple linear algorithm.
Let the parametric equation of the line be
L(u) = L0 + u * dL
where L0 is some base point, dL is the direction vector, and u is the parameter,
and let the parametric equation of the i-th segment be
P = P[i] + t * Dir[i]
where P[i] is the first point of the segment, Dir[i] is the normalized direction vector, and t is a parameter in the range 0..Len[i] (Len[i] being the segment length, because Dir[i] is normalized).
An arbitrary point on the line has its projection onto the given segment at parameter
t = DotProduct(L(u) - P[i], Dir[i]) //equation 1
and the length of the normal to the projection (the needed distance) is
Dist = Abs(CrossProduct(L(u) - P[i], Dir[i]))
Abs((L0x + u * dLx - Px) * Diry - (L0y + u * dLy - Py) * Dirx) = Dist
so
u = (+-Dist - ((L0x - Px) * Diry - (L0y - Py) * Dirx)) / (dLx * Diry - dLy * Dirx)
Substitute both values of u into equation 1 and check whether the parameter t is in the range 0..Len[i] (projection inside the segment). If yes, L(u) is a needed point.
Then check the distance to the vertices: for each vertex P, solve this quadratic equation for u
(L0x + u * dLx - Px)^2 + (L0y + u * dLy - Py)^2 = Dist^2
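The steps above could be sketched in Python roughly as follows. This is one possible reading, not a definitive implementation: since Dir[i] is normalized, t is compared against the segment length, and a final filter (an addition not stated in the answer) keeps only candidates whose true distance to the whole polygon boundary equals dist, because a candidate at the right distance from one edge may be closer to another edge. The helper names are made up for illustration.

```python
import numpy as np

def cross2(a, b):
    """z component of the 2D cross product."""
    return a[0] * b[1] - a[1] * b[0]

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to segment a-b."""
    ab = b - a
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(p - (a + t * ab))

def polygon_distance(p, poly):
    """Distance from p to the boundary of a closed polygon (last == first)."""
    return min(point_segment_distance(p, poly[i], poly[i + 1])
               for i in range(len(poly) - 1))

def find_points_distance_from_polygon(poly, line, dist, tol=1e-7):
    L0 = line[0].astype(float)
    dL = line[1] - line[0]
    us = []
    for i in range(len(poly) - 1):
        P = poly[i]
        seg = poly[i + 1] - P
        seg_len = np.linalg.norm(seg)
        Dir = seg / seg_len
        # Edge case: |cross(L(u) - P, Dir)| = dist, linear in u.
        denom = cross2(dL, Dir)
        if abs(denom) > 1e-12:  # skip edges parallel to the line
            for signed in (dist, -dist):
                u = (signed - cross2(L0 - P, Dir)) / denom
                t = np.dot(L0 + u * dL - P, Dir)  # equation 1
                if 0.0 <= t <= seg_len:
                    us.append(u)
        # Vertex case: |L(u) - P|^2 = dist^2, quadratic in u.
        a = np.dot(dL, dL)
        b = 2.0 * np.dot(dL, L0 - P)
        c = np.dot(L0 - P, L0 - P) - dist ** 2
        disc = b * b - 4.0 * a * c
        if disc >= 0.0:
            root = np.sqrt(disc)
            us.extend([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])
    points = []
    for u in sorted(us):
        pt = L0 + u * dL
        # Added filter: the nearest boundary point must be exactly dist away.
        if abs(polygon_distance(pt, poly) - dist) < tol:
            if not points or np.linalg.norm(pt - points[-1]) > tol:
                points.append(pt)
    return np.array(points)

# Small check on a 2x2 square crossed by a horizontal line through its middle.
square = np.array([[0, 0], [2, 0], [2, 2], [0, 2], [0, 0]], dtype=float)
horizontal = np.array([[-5, 1], [5, 1]], dtype=float)
result = find_points_distance_from_polygon(square, horizontal, 0.5)
print(result)
```

Note that this sketch also returns points inside the polygon (two of the four points in the square example). Keeping only exterior points, as requested in the question, would need an extra point-in-polygon test on each candidate.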

3d rotation on image

I'm trying to get some code that will perform a perspective transformation (in this case a 3d rotation) on an image.
import os.path
import numpy as np
import cv
def rotation(angle, axis):
    return np.eye(3) + np.sin(angle) * skew(axis) \
        + (1 - np.cos(angle)) * skew(axis).dot(skew(axis))

def skew(vec):
    return np.array([[0, -vec[2], vec[1]],
                     [vec[2], 0, -vec[0]],
                     [-vec[1], vec[0], 0]])

def rotate_image(imgname_in, angle, axis, imgname_out=None):
    if imgname_out is None:
        base, ext = os.path.splitext(imgname_in)
        imgname_out = base + '-out' + ext
    img_in = cv.LoadImage(imgname_in)
    img_size = cv.GetSize(img_in)
    img_out = cv.CreateImage(img_size, img_in.depth, img_in.nChannels)
    transform = rotation(angle, axis)
    cv.WarpPerspective(img_in, img_out, cv.fromarray(transform))
    cv.SaveImage(imgname_out, img_out)
When I rotate about the z-axis, everything works as expected, but rotating around the x or y axis seems completely off. I need to rotate by angles as small as pi/200 before I start getting results that seem at all reasonable. Any idea what could be wrong?
First, build the rotation matrix, of the form
[cos(theta) -sin(theta) 0]
R = [sin(theta) cos(theta) 0]
[0 0 1]
Applying this coordinate transform gives you a rotation around the origin.
If, instead, you want to rotate around the image center, you have to first shift the image center
to the origin, then apply the rotation, and then shift everything back. You can do so using a
translation matrix:
[1 0 -image_width/2]
T = [0 1 -image_height/2]
[0 0 1]
The transformation matrix for translation, rotation, and inverse translation then becomes:
H = inv(T) * R * T
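As a quick numeric check of this composition, the sketch below builds T, R and H with NumPy for a hypothetical 640x480 image (the size is an assumption for illustration) and confirms that the image center is left where it is.

```python
import numpy as np

image_width, image_height = 640, 480  # hypothetical image size
theta = np.deg2rad(10)

R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta), np.cos(theta), 0],
              [0, 0, 1]])
T = np.array([[1, 0, -image_width / 2],
              [0, 1, -image_height / 2],
              [0, 0, 1]])

# Shift the center to the origin, rotate, shift back.
H = np.linalg.inv(T) @ R @ T

center = np.array([image_width / 2, image_height / 2, 1.0])
mapped = H @ center
print(mapped)  # the center maps onto itself
```

With R alone, the fixed point would instead be the pixel at (0, 0), which is why the uncentered rotation swings the image around its corner.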
I'll have to think a bit about how to relate the skew matrix to the 3D transformation. I expect the easiest route is to set up a 4D transformation matrix, and then to project that back to 2D homogeneous coordinates. But for now, the general form of the skew matrix:
[x_scale 0 0]
S = [0 y_scale 0]
[x_skew y_skew 1]
The x_skew and y_skew values are typically tiny (1e-3 or less).
Here's the code:
from skimage import data, transform
import numpy as np
import matplotlib.pyplot as plt
img = data.camera()
theta = np.deg2rad(10)
tx = 0
ty = 0
S, C = np.sin(theta), np.cos(theta)
# Rotation matrix, angle theta, translation tx, ty
H = np.array([[C, -S, tx],
[S, C, ty],
[0, 0, 1]])
# Translation matrix to shift the image center to the origin
r, c = img.shape
T = np.array([[1, 0, -c / 2.],
[0, 1, -r / 2.],
[0, 0, 1]])
# Skew, for perspective
S = np.array([[1, 0, 0],
[0, 1.3, 0],
[0, 1e-3, 1]])
img_rot = transform.homography(img, H)
img_rot_center_skew = transform.homography(img, S.dot(np.linalg.inv(T).dot(H).dot(T)))
f, (ax0, ax1, ax2) = plt.subplots(1, 3)
ax0.imshow(img, cmap=plt.cm.gray, interpolation='nearest')
ax1.imshow(img_rot, cmap=plt.cm.gray, interpolation='nearest')
ax2.imshow(img_rot_center_skew, cmap=plt.cm.gray, interpolation='nearest')
plt.show()
And the output:
I do not get the way you build your rotation matrix. It seems rather complicated to me. Usually, it would be built by constructing a zero matrix, putting 1 on the unneeded axis, and the usual cos, -sin, sin, cos into the two used dimensions. Then multiplying all these together.
Where did you get that np.eye(3) + np.sin(angle) * skew(axis) + (1 - np.cos(angle)) * skew(axis).dot(skew(axis)) construct from?
Try building the projection matrix from basic building blocks. Constructing a rotation matrix is fairly easy, and "rotationmatrix dot skewmatrix" should work.
You might need to pay attention to the rotation center though. Your image probably is placed at a virtual position of 1 on the z axis, so by rotating on x or y, it moves around a bit.
So you'd need to use a translation so z becomes 0, then rotate, then translate back. (Translation matrices in affine coordinates are pretty simple, too. See Wikipedia: https://en.wikipedia.org/wiki/Transformation_matrix )
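On the question of where the np.eye(3) + np.sin(angle) * skew(axis) + ... construct comes from: it matches Rodrigues' rotation formula, R = I + sin(angle) * K + (1 - cos(angle)) * K^2, where K is the cross-product (skew-symmetric) matrix of a unit axis. The sketch below, which assumes the axis is normalized first (the question's code does not do this), checks that it reproduces the familiar per-axis matrices.

```python
import numpy as np

def skew(vec):
    """Cross-product matrix K of vec, so that K @ v == np.cross(vec, v)."""
    return np.array([[0, -vec[2], vec[1]],
                     [vec[2], 0, -vec[0]],
                     [-vec[1], vec[0], 0]])

def rotation(angle, axis):
    """Rodrigues' formula; the axis is normalized here as an added safeguard."""
    axis = np.asarray(axis, dtype=float)
    K = skew(axis / np.linalg.norm(axis))
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

theta = np.deg2rad(30)
c, s = np.cos(theta), np.sin(theta)
Rz = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
Rx = np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

print(np.allclose(rotation(theta, [0, 0, 1]), Rz))
print(np.allclose(rotation(theta, [1, 0, 0]), Rx))
```

So the matrix itself appears to be a valid 3D rotation; the odd results for the x and y axes are more likely related to applying it directly to pixel coordinates, as the point about the rotation center suggests.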
