I am using Python with OpenCV 3.4.
I have a system composed of 2 cameras that I want to use to track an object, get its trajectory, and then its speed.
I am currently able to calibrate each of my cameras intrinsically and extrinsically. I can track my object through the video and get its 2D coordinates in the image plane.
My problem now is that I would like to project the points from both 2D image planes into 3D points.
I've tried functions such as cv2.triangulatePoints, but it doesn't seem to work properly.
Here is my current function to get 3D coordinates. It returns coordinates that seem a bit off compared to the actual coordinates:
def get_3d_coord(left_two_d_coords, right_two_d_coords):
    pt1 = left_two_d_coords.reshape((len(left_two_d_coords), 1, 2))
    pt2 = right_two_d_coords.reshape((len(right_two_d_coords), 1, 2))

    extrinsic_left_camera_matrix, left_distortion_coeffs, extrinsic_left_rotation_vector, \
        extrinsic_left_translation_vector = trajectory_utils.get_extrinsic_parameters(1)
    extrinsic_right_camera_matrix, right_distortion_coeffs, extrinsic_right_rotation_vector, \
        extrinsic_right_translation_vector = trajectory_utils.get_extrinsic_parameters(2)

    # returns arrays of the same size
    (pt1, pt2) = correspondingPoints(pt1, pt2)

    projection1 = computeProjMat(extrinsic_left_camera_matrix,
                                 extrinsic_left_rotation_vector, extrinsic_left_translation_vector)
    projection2 = computeProjMat(extrinsic_right_camera_matrix,
                                 extrinsic_right_rotation_vector, extrinsic_right_translation_vector)

    # 4xN array of homogeneous coordinates
    out = cv2.triangulatePoints(projection1, projection2, pt1, pt2)

    # convert from homogeneous to Euclidean coordinates by dividing by W
    point3D = []
    for idx in range(out.shape[1]):
        W = out[3][idx]
        pt3d = [out[0][idx] / W, out[1][idx] / W, out[2][idx] / W]
        point3D.append(pt3d)
    return point3D
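(computeProjMat is not reproduced here; it builds a 3x4 projection matrix roughly as follows. This is a simplified sketch of the idea, P = K·[R|t] with the Rodrigues rotation vector converted by cv2.Rodrigues, not the exact helper.)
import cv2
import numpy as np

def compute_proj_mat_sketch(camera_matrix, rotation_vector, translation_vector):
    # Rodrigues vector -> 3x3 rotation matrix
    rotation_matrix, _ = cv2.Rodrigues(rotation_vector)
    # stack [R | t] into a 3x4 extrinsic matrix, then apply the intrinsics
    extrinsics = np.hstack((rotation_matrix, np.asarray(translation_vector).reshape(3, 1)))
    return camera_matrix @ extrinsics  # P = K [R | t]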
Here are some screenshots of the 2D trajectory that I get from both my cameras:
Here are some screenshots of the 3D trajectory that we get for the same cameras.
As you can see, the 2D trajectory doesn't look like the 3D one, and I am not able to get an accurate distance between two points.
I would just like to get real-world coordinates, that is, to know the (almost) exact distance walked by a person, even along a curved path.
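(For context, this is how I intend to read distance and speed off the 3D trajectory once it is correct; a minimal sketch, where the 25 fps frame rate is just a placeholder value:)
import numpy as np

points_3d = np.asarray(point3D)           # N x 3 trajectory from get_3d_coord
segments = np.diff(points_3d, axis=0)     # vectors between consecutive positions
segment_lengths = np.linalg.norm(segments, axis=1)
total_distance = segment_lengths.sum()    # distance walked along the trajectory
fps = 25                                  # placeholder frame rate
speeds = segment_lengths * fps            # speed per frame step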
EDIT to add reference data and examples
Here are some examples and input data to reproduce the problem.
First, here are the data.
2D points for camera1
546,357
646,351
767,357
879,353
986,360
1079,365
1152,364
corresponding 2D points for camera2
236,305
313,302
414,308
532,308
647,314
752,320
851,323
3D points that we get from triangulatePoints
"[0.15245444, 0.30141047, 0.5444277]"
"[0.33479974, 0.6477136, 0.25396818]"
"[0.6559921, 1.0416716, -0.2717265]"
"[1.1381898, 1.5703914, -0.87318224]"
"[1.7568599, 1.9649554, -1.5008119]"
"[2.406788, 2.302272, -2.0778883]"
"[3.078426, 2.6655817, -2.6113863]"
In the following images, we can see the 2D trajectory (top line) and the 3D projection reprojected into 2D (bottom line). Colors alternate to show which 3D point corresponds to which 2D point.
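(A sketch of the kind of reprojection check used for those images, reusing the names from get_3d_coord above, so this is illustrative rather than the exact code:)
import cv2
import numpy as np

reprojected, _ = cv2.projectPoints(
    np.asarray(point3D, dtype=np.float32),      # triangulated 3D points
    extrinsic_left_rotation_vector,
    extrinsic_left_translation_vector,
    extrinsic_left_camera_matrix,
    left_distortion_coeffs)
reprojected = reprojected.reshape(-1, 2)
# pixel error against the tracked 2D points of camera 1
errors = np.linalg.norm(reprojected - np.asarray(left_two_d_coords, dtype=np.float32).reshape(-1, 2), axis=1)
print(errors.mean())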
And finally, here are the data needed to reproduce the problem.
camera 1 : camera matrix
5.462001610064596662e+02 0.000000000000000000e+00 6.382260289544193483e+02
0.000000000000000000e+00 5.195528638702176067e+02 3.722480290221320161e+02
0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00
camera 2 : camera matrix
4.302353276501239066e+02 0.000000000000000000e+00 6.442674231451971991e+02
0.000000000000000000e+00 4.064124751062329324e+02 3.730721752718034736e+02
0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00
camera 1 : distortion vector
-1.039009381799949928e-02 -6.875769941694849507e-02 5.573643708806085006e-02 -7.298826373638074051e-04 2.195279856716004369e-02
camera 2 : distortion vector
-8.089289768586239993e-02 6.376634681503455396e-04 2.803641672679824115e-02 7.852965318823987989e-03 1.390248981867302919e-03
camera 1 : rotation vector
1.643658457134109296e+00
-9.626823326237364531e-02
1.019865700311696488e-01
camera 2 : rotation vector
1.698451227150894471e+00
-4.734769748661146055e-02
5.868343803315514279e-02
camera 1 : translation vector
-5.004031689969588026e-01
9.358682517577661120e-01
2.317689087311113116e+00
camera 2 : translation vector
-4.225788801112133619e+00
9.519952012307866251e-01
2.419197507326224184e+00
camera 1 : object points
0 0 0
0 3 0
0.5 0 0
0.5 3 0
1 0 0
1 3 0
1.5 0 0
1.5 3 0
2 0 0
2 3 0
camera 2 : object points
4 0 0
4 3 0
4.5 0 0
4.5 3 0
5 0 0
5 3 0
5.5 0 0
5.5 3 0
6 0 0
6 3 0
camera 1 : image points
5.180000000000000000e+02 5.920000000000000000e+02
5.480000000000000000e+02 4.410000000000000000e+02
6.360000000000000000e+02 5.910000000000000000e+02
6.020000000000000000e+02 4.420000000000000000e+02
7.520000000000000000e+02 5.860000000000000000e+02
6.500000000000000000e+02 4.430000000000000000e+02
8.620000000000000000e+02 5.770000000000000000e+02
7.000000000000000000e+02 4.430000000000000000e+02
9.600000000000000000e+02 5.670000000000000000e+02
7.460000000000000000e+02 4.430000000000000000e+02
camera 2 : image points
6.080000000000000000e+02 5.210000000000000000e+02
6.080000000000000000e+02 4.130000000000000000e+02
7.020000000000000000e+02 5.250000000000000000e+02
6.560000000000000000e+02 4.140000000000000000e+02
7.650000000000000000e+02 5.210000000000000000e+02
6.840000000000000000e+02 4.150000000000000000e+02
8.400000000000000000e+02 5.190000000000000000e+02
7.260000000000000000e+02 4.160000000000000000e+02
9.120000000000000000e+02 5.140000000000000000e+02
7.600000000000000000e+02 4.170000000000000000e+02
Assuming both your resolutions are 1280x720, I calculated the left camera's rotation and translation.
import cv2
import numpy as np

left_obj = np.array([[
[0, 0, 0],
[0, 3, 0],
[0.5, 0, 0],
[0.5, 3, 0],
[1, 0, 0],
[1 ,3, 0],
[1.5, 0, 0],
[1.5, 3, 0],
[2, 0, 0],
[2, 3, 0]
]], dtype=np.float32)
left_img = np.array([[
[5.180000000000000000e+02, 5.920000000000000000e+02],
[5.480000000000000000e+02, 4.410000000000000000e+02],
[6.360000000000000000e+02, 5.910000000000000000e+02],
[6.020000000000000000e+02, 4.420000000000000000e+02],
[7.520000000000000000e+02, 5.860000000000000000e+02],
[6.500000000000000000e+02, 4.430000000000000000e+02],
[8.620000000000000000e+02, 5.770000000000000000e+02],
[7.000000000000000000e+02, 4.430000000000000000e+02],
[9.600000000000000000e+02, 5.670000000000000000e+02],
[7.460000000000000000e+02, 4.430000000000000000e+02]
]], dtype=np.float32)
left_camera_matrix = np.array([
[4.777926320579549042e+02, 0.000000000000000000e+00, 5.609694925007885331e+02],
[0.000000000000000000e+00, 2.687583555325996372e+02, 5.712247987054799978e+02],
[0.000000000000000000e+00, 0.000000000000000000e+00, 1.000000000000000000e+00]
])
left_distortion_coeffs = np.array([
-8.332059138465927606e-02,
-1.402986394998156472e+00,
2.843132503678651168e-02,
7.633417606366312003e-02,
1.191317644548635979e+00
])
ret, left_camera_matrix, left_distortion_coeffs, rot, trans = cv2.calibrateCamera(
    left_obj, left_img, (1280, 720), left_camera_matrix, left_distortion_coeffs,
    None, None, cv2.CALIB_USE_INTRINSIC_GUESS)
print(rot[0])
print(trans[0])
I got different results:
[[ 2.7262137 ] [-0.19060341] [-0.30345874]]
[[-0.48068581] [ 0.75257108] [ 1.80413094]]
The same for right camera:
[[ 2.1952522 ] [ 0.20281459] [-0.46649734]]
[[-2.96484428] [-0.0906817 ] [ 3.84203022]]
You can check the rotations approximately this way: compute the relative rotation between the two computed results and compare it against the relative rotation between the real camera poses. For the translations: compute the normalized relative translation between the two computed results and compare it against the normalized relative translation between the real camera positions. The coordinate system OpenCV uses is depicted here.
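A rough sketch of that check, assuming rot1/trans1 and rot2/trans2 are the Rodrigues rotation vectors and translation vectors of the two cameras (either the computed ones or the reference ones):
import cv2
import numpy as np

def relative_pose(rot1, trans1, rot2, trans2):
    R1, _ = cv2.Rodrigues(np.asarray(rot1, dtype=np.float64))
    R2, _ = cv2.Rodrigues(np.asarray(rot2, dtype=np.float64))
    # relative rotation of camera 2 with respect to camera 1
    R_rel = R2 @ R1.T
    # relative translation; normalized because the absolute scale differs between runs
    t1 = np.asarray(trans1, dtype=np.float64).reshape(3)
    t2 = np.asarray(trans2, dtype=np.float64).reshape(3)
    t_rel = t2 - R_rel @ t1
    return cv2.Rodrigues(R_rel)[0].ravel(), t_rel / np.linalg.norm(t_rel)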
Related
I have a set of xyz points and a set of tetrahedrons. Where each node of the tetrahedron points to an index in the points table.
I need to plot the tetrahedrons with a corresponding color based on the tag attribute.
points

Index   x     y     z
0       x_1   y_1   z_1
1       x_2   y_2   z_2
...     ...   ...   ...

tetrahedrons

Index   a        b        c        d        tag
0       a_1.pt   b_1.pt   c_1.pt   d_1.pt   9
1       a_2.pt   b_2.pt   c_2.pt   d_2.pt   0
...     ...      ...      ...      ...      ...
I have tried using the Mesh3d API but it does not allow for a 4th vertex.
I can plot something like the code below, but it does not have all the faces of the tetrahedra.
go.Figure(data=[
    go.Mesh3d(
        x=mesh_pts.x, y=mesh_pts.y, z=mesh_pts.z,
        i=tagged_th.a, j=tagged_th.b, k=tagged_th.c,
    ),
]).show()
I think the Volume or Isosurface plots might work, but I'm not sure how to convert my data into a format consumable by those APIs.
I can't hide the fact that, until a few minutes ago, I wasn't even aware of the i, j, k parameters. Still, I know that Mesh3d draws triangles, not tetrahedra. You need to take advantage of those i, j, k parameters to control which triangles are drawn, but it is still your job to say which triangles need to be drawn so that the result looks like tetrahedra.
Yes, there are 4 triangles per tetrahedron. If you wish to draw all four, you need to explicitly pass i, j, k for all 4. You can't just pass i, j, k plus a nonexistent l and expect plotly to understand that this means 4 triangles.
If a, b, c and d are the 4 vertices of a tetrahedron, then the 4 triangles you need to draw are the 4 combinations of 3 vertices chosen from them, that is bcd, acd, abd and abc.
Let's write this in 4 rows
bcd
acd
abd
abc
^^^
|||
||\------k
|\------ j
\------- i
So if, now, a, b, c and d are lists of n vertices, then i, j, k must be lists 4 times longer:
i=b + a + a + a
j=c + c + b + b
k=d + d + d + c
Application: let's define 2 tetrahedra, one sitting on the tip of the other, using your dataframe format:
import plotly.graph_objects as go
import pandas as pd

mesh_pts = pd.DataFrame({'x': [0, 1, 0, 0, 1, 0, 0],
                         'y': [0, 0, 1, 0, 0, 1, 0],
                         'z': [0, 0, 0, 1, 1, 1, 2]})
tagged_th = pd.DataFrame({'a': [0, 3],
                          'b': [1, 4],
                          'c': [2, 5],
                          'd': [3, 6],
                          'tag': [0, 1]})

# And from there, just create a list of triangles, made of the 4 combinations
# of 3 points taken from the list of tetrahedron vertices
go.Figure(data=[
    go.Mesh3d(
        x=mesh_pts.x,
        y=mesh_pts.y,
        z=mesh_pts.z,
        i=pd.concat([tagged_th.a, tagged_th.a, tagged_th.a, tagged_th.b]),
        j=pd.concat([tagged_th.b, tagged_th.b, tagged_th.c, tagged_th.c]),
        k=pd.concat([tagged_th.c, tagged_th.d, tagged_th.d, tagged_th.d]),
        intensitymode='cell',
        intensity=pd.concat([tagged_th.tag, tagged_th.tag, tagged_th.tag, tagged_th.tag])
    )
]).show()
I don't see what you mean by "does not allow for a 4th vertex". Here is an example with two tetrahedra:
import plotly.graph_objects as go
import plotly.io as pio
import numpy as np

i = np.array([0, 0, 0, 1])
j = np.array([1, 2, 3, 2])
k = np.array([2, 3, 1, 3])

fig = go.Figure(data=[
    go.Mesh3d(
        x=[0, 1, 2, 0,  4, 5, 6, 4],
        y=[0, 0, 1, 2,  0, 0, 1, 2],
        z=[0, 2, 2, 3,  4, 2, 4, 1],
        i=np.concatenate((i, i + 4)),
        j=np.concatenate((j, j + 4)),
        k=np.concatenate((k, k + 4)),
        facecolor=["red", "red", "red", "red", "green", "green", "green", "green"]
    )
])
pio.write_html(fig, file="tetrahedra.html", auto_open=True)
I am working on a transmission line following algorithm using quadcopters. To do so, I need to calculate the line position in the image I receive from the UAV in order to determine a pitch velocity so the lines can be kept at the center of the image. The problem is that when I apply a velocity on the x-axis to move the UAV to the desired setpoint (left/right movement), the image plane tilts along with the UAV, which incorrectly increases the positional error. The images below illustrate the issue.
I tried something similar to the post below, since the UAV's Euler angles are known. This approach reduced the distortion caused by the frame tilting, but I couldn't eliminate it.
Transform a frame to be as if it was taken from above using OpenCV
The code:
f = 692.81 # Focal Length
# Frame Shape
cx = width
cy = height
# Euler angles
roll_ = 0
pitch_ = pitch
yaw_ = 0
dx = 0
dy = 0
dz = 1
A2 = np.array([[f,0,cx,0],[0,f, cy,0],[0,0,1,0]])
A1 = np.array([[1/f,0,-cx/f],[0,1/f,-cy/f],[0,0,0],[0,0,dz]])
RX = np.array([[1,0,0,0],[0,np.cos(roll_),-(np.sin(roll_)),0],[0, np.sin(roll_), np.cos(roll_),0],[0,0,0,1]])
RY = np.array([[np.cos(pitch_), 0, -np.sin(pitch_),0],[0,1,0,0],[(np.sin(pitch_)), 0, np.cos(pitch_),0],[0,0,0,1]])
RZ = np.array([[np.cos(yaw_), -(np.sin(yaw_)), 0,0],[np.sin(yaw_), np.cos(yaw_), 0,0],[0,0,1,0],[0,0,0,1]])
T = np.array([[1, 0, 0, dx],[0, 1, 0, dy],[0, 0, 1, dz],[0, 0, 0, 1]])
R = np.dot(np.dot(RX, RY), RZ)
H = np.dot(A2, np.dot(T, np.dot(R, A1)))
#The output frame
linha_bw = cv2.warpPerspective(linha_bw, H,(frame.shape[1],frame.shape[0]),None,cv2.INTER_LINEAR)
The results from this transformation can be seen in the graph below. The blue curve is the controller without the image rectification, while the red one is the controller with the code above.
I'm not sure whether there are mistakes in my code or whether there is a better approach to solve my problem through image processing techniques. Any help is highly appreciated!
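(For reference, the tilt compensation I'm aiming for can also be written as a single homography H = K·R·K⁻¹ for a pure camera rotation; a minimal sketch, where the intrinsic matrix K, built from the focal length above with a centred principal point, is an assumption:)
import cv2
import numpy as np

def rectify_tilt(frame, K, pitch_rad):
    # rotation that undoes the camera pitch; the axis must match the camera frame convention
    c, s = np.cos(-pitch_rad), np.sin(-pitch_rad)
    R = np.array([[1, 0, 0],
                  [0, c, -s],
                  [0, s, c]], dtype=np.float64)
    # for a rotation about the camera centre, the induced image warp is H = K R K^-1
    H = K @ R @ np.linalg.inv(K)
    h, w = frame.shape[:2]
    return cv2.warpPerspective(frame, H, (w, h), flags=cv2.INTER_LINEAR)

# assumed intrinsics: focal length from above, principal point at the image centre
# K = np.array([[692.81, 0, width / 2], [0, 692.81, height / 2], [0, 0, 1]])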
schematic diagram picture
I used OpenCV's solvePnP function, with four points in world coordinates as input, to get R and T:
R = np.array([[-0.0813445156856268], [-2.5478950926311636], [1.7376856594745234]], dtype=np.float32)
T = np.array([[10.838262901867047], [-6.506593974297687], [60.308121310607724]], dtype=np.float32)
The following corresponding data have been measured
World coordinate system point -> camera coordinate system point
0, 0 ,0 -> [11.052, -6.596, 60.9]
13, 0, 0 -> [-2.142, -5.628, 61.3]
13, 0, 13.5 -> [-2.668, -17.547, 56.2] (possible error of ~2 cm)
How do I convert the camera coordinate system to the world coordinate system using the R and T vectors?
I have found the conversion formula
import cv2
import numpy as np

R = np.array([[-0.0813445156856268], [-2.5478950926311636], [1.7376856594745234]], dtype=np.float32)
T = np.array([[10.838262901867047], [-6.506593974297687], [60.308121310607724]], dtype=np.float32)
world_point = [13, 0, 0]

rvec_matrix = cv2.Rodrigues(R)[0]  # 3x3 rotation matrix from the Rodrigues vector
rmat = np.matrix(rvec_matrix)
tmat = np.matrix(T)
pmat = np.matrix(np.array([[world_point[0]], [world_point[1]], [world_point[2]]], dtype=np.float32))

# world coordinate to camera coordinate: X_cam = R * X_world + T
cam_point = rmat * pmat + tmat
print(cam_point)

# camera coordinate to world coordinate: X_world = R^-1 * (X_cam - T)
world_point = rmat ** -1 * (cam_point - tmat)
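As a quick sanity check, the first measured correspondence can be round-tripped with the same R and T (a sketch; the measured values carry a couple of centimetres of error):
# measured: world (0, 0, 0) was observed at roughly [11.052, -6.596, 60.9] in camera coordinates
cam_measured = np.matrix([[11.052], [-6.596], [60.9]])
world_estimate = rmat ** -1 * (cam_measured - tmat)
print(world_estimate)  # should come out close to (0, 0, 0)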
I am creating a lattice graph using graph_tool. I am trying to create a Property Map that represents X Y coordinates in the graph. For instance, if I create a lattice graph with a height of 5 and a width of 10, I want the value of the property map for vertex 0 to be [0, 0], vertex 1 to be [1, 0], vertex 10 to be [0, 1], etc
I generated the image using the code below:
from graph_tool.all import lattice, sfdp_layout, graph_draw

g = lattice([10, 5])
pos = sfdp_layout(g)
graph_draw(g, pos=pos, output_size=(500, 500), vertex_text=g.vertex_index, output="lattice.png")
In the code above the value of pos[0] is array([-16.4148811 , -11.80299953])
Am I going in the right direction by using sfdp_layout?
The vertices are numbered according to row-major ordering, so you can just compute the coordinates from their indices:
from graph_tool.all import lattice, group_vector_property, graph_draw
from numpy import arange

g = lattice([10, 5])
x = g.new_vp("double", arange(g.num_vertices()) % 10)   # column index
y = g.new_vp("double", arange(g.num_vertices()) // 10)  # row index
pos = group_vector_property([x, y])
graph_draw(g, pos, output="lattice.png")
I have a set of 3d coordinates that was generated using meshgrid(). I want to be able to rotate these about the 3 axes.
I tried unraveling the meshgrid and doing a rotation on each point but the meshgrid is large and I run out of memory.
This question addresses this in 2d with einsum(), but I can't figure out the string format when extending it to 3d.
I have read several other pages about einsum() and its format string but haven't been able to figure it out.
EDIT:
I call my meshgrid axes X, Y, and Z, each is of shape (213, 48, 37). Also, the actual memory error came when I tried to put the results back into a meshgrid.
When I attempted to 'unravel' it to do point by point rotation I used the following function:
def mg2coords(X, Y, Z):
    return np.vstack([X.ravel(), Y.ravel(), Z.ravel()]).T
I looped over the result with the following:
def rotz(angle, point):
    rad = np.radians(angle)
    sin = np.sin(rad)
    cos = np.cos(rad)
    rot = [[cos, -sin, 0],
           [sin, cos, 0],
           [0, 0, 1]]
    return np.dot(rot, point)
After the rotation I will be using the points to interpolate onto.
Working with your definitions:
In [840]: def mg2coords(X, Y, Z):
              return np.vstack([X.ravel(), Y.ravel(), Z.ravel()]).T
In [841]: def rotz(angle):
              rad = np.radians(angle)
              sin = np.sin(rad)
              cos = np.cos(rad)
              rot = [[cos, -sin, 0],
                     [sin, cos, 0],
                     [0, 0, 1]]
              return np.array(rot)  # just return the rotation matrix
define a sample grid:
In [842]: X,Y,Z=np.meshgrid([0,1,2],[0,1,2,3],[0,1,2],indexing='ij')
In [843]: xyz=mg2coords(X,Y,Z)
rotate it row by row (with rot the matrix returned by rotz for some test angle):
In [844]: xyz1=np.array([np.dot(rot,i) for i in xyz])
equivalent einsum row by row calculation:
In [845]: xyz2=np.einsum('ij,kj->ki',rot,xyz)
They match:
In [846]: np.allclose(xyz2,xyz1)
Out[846]: True
Alternatively, I could collect the 3 arrays into one 4D array and rotate that with einsum. Here np.array adds a dimension at the start, so the dot-sum j dimension comes first and the 3 dimensions of the arrays follow:
In [871]: XYZ=np.array((X,Y,Z))
In [872]: XYZ2=np.einsum('ij,jabc->iabc',rot,XYZ)
In [873]: np.allclose(xyz2[:,0], XYZ2[0,...].ravel())
Out[873]: True
Similarly for components 1 and 2.
Alternatively I could split XYZ2 into 3 component arrays:
In [882]: X2,Y2,Z2 = XYZ2
In [883]: np.allclose(X2,xyz2[:,0].reshape(X.shape))
Out[883]: True
Use ji instead of ij if you want to rotate in the other direction, i.e. use rot.T.
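A quick check of that last point, reusing rot, xyz and xyz2 from above:
xyz_back = np.einsum('ji,kj->ki', rot, xyz2)  # applies rot.T to each row
np.allclose(xyz_back, xyz)                    # expected to be True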