Transformation Matrices in TensorFlow - python

I want to implement the forward kinematics of a robot with TensorFlow, mainly to gain automatic differentiation and to plug this module into larger network architectures.
In general I have a set of 4x4 transformation matrices, defined by the DH parameters (d, theta, a, alpha) and the joint angle q:
[[cos(theta+q),            -sin(theta+q),            0,           a            ],
 [sin(theta+q)*cos(alpha),  cos(theta+q)*cos(alpha), -sin(alpha), -sin(alpha)*d],
 [sin(theta+q)*sin(alpha),  cos(theta+q)*sin(alpha),  cos(alpha),  cos(alpha)*d],
 [0,                        0,                        0,           1           ]]
My robot has 10 different joints, all connected sequentially.
I thought it would be smart to precompute sine and cosine.
q = tf.keras.layers.Input((10,))
sin_q = tf.sin(q)
cos_q = tf.cos(q)
Let's look at the transformation of the first joint with the specific set of DH parameters (d=0.1055, theta=0, a=0, alpha=0):
m0 = [[cos(q0), -sin(q0), 0, 0      ],
      [sin(q0),  cos(q0), 0, 0      ],
      [0,        0,       1, 0.10550],
      [0,        0,       0, 1      ]]
My first problem is how to build something like this with TensorFlow?
In numpy I would initialize the matrix and fill in the nonzero values.
m_shape = tf.TensorShape((batch_size, 4, 4))
m0 = tf.zeros(m_shape)
m0[..., 0, 0] = cos_q[..., 0]
m0[..., 0, 1] = -sin_q[..., 0]
m0[..., 1, 0] = sin_q[..., 0]
m0[..., 1, 1] = cos_q[..., 0]
m0[..., 2, 2] = 1
m0[..., 2, 3] = 0.10550
m0[..., 3, 3] = 1
Error -> 'Tensor' object does not support item assignment
But TensorFlow doesn't allow item assignment on tensors.
It seems the way to go is via tf.stack(): I need to create a vector of ones matching my unspecified batch size, stack all the entries, and reshape.
(Note: in the general case there are fewer zero entries.)
e = tf.ones_like(q[..., 0])
m0 = tf.stack([cos_q[..., 0], -sin_q[..., 0], 0*e, 0*e,
               sin_q[..., 0],  cos_q[..., 0], 0*e, 0*e,
               0*e,            0*e,           1*e, 0.10550*e,
               0*e,            0*e,           0*e, 1*e], axis=-1)
m0 = tf.keras.layers.Reshape((4, 4))(m0)
Is this correct or is there a smarter way to build such general transformations in TensorFlow?
As the final result I am interested in the transformation at the end of the kinematic chain: I want to feed in an array of joint configurations of shape (?, 10) and get the end-effector transformations of shape (?, 4, 4).
m_end = m0 @ m1 @ m2 @ ... @ m10
forward_net = tf.keras.Model(inputs=[q], outputs=[m_end])
result = forward_net.predict(np.random.random((100, 10)))
This works, but it's neither elegant nor fast.
The speed is my bigger problem; the same implementation in numpy is 150x faster.
How can I improve the speed? I thought TensorFlow should excel at tasks like this.
Should I build it as a Model and use predict to calculate the results? There is nothing to learn here, so I am not sure what to use.

If you want to build 4x4 rotation matrices from an angle, or from the sine and cosine of an angle, you can do it like this:
import tensorflow as tf

def make_rotation(alpha, axis):
    return make_rotation_sincos(tf.math.sin(alpha), tf.math.cos(alpha), axis)

def make_rotation_sincos(sin, cos, axis):
    axis = axis.strip().lower()
    zeros = tf.zeros_like(sin)
    ones = tf.ones_like(sin)
    if axis == 'x':
        rot = tf.stack([
            tf.stack([ ones, zeros, zeros], -1),
            tf.stack([zeros,   cos,  -sin], -1),
            tf.stack([zeros,   sin,   cos], -1),
        ], -2)
    elif axis == 'y':
        rot = tf.stack([
            tf.stack([  cos, zeros,   sin], -1),
            tf.stack([zeros,  ones, zeros], -1),
            tf.stack([ -sin, zeros,   cos], -1),
        ], -2)
    elif axis == 'z':
        rot = tf.stack([
            tf.stack([  cos,  -sin, zeros], -1),
            tf.stack([  sin,   cos, zeros], -1),
            tf.stack([zeros, zeros,  ones], -1),
        ], -2)
    else:
        raise ValueError('Invalid axis {!r}.'.format(axis))
    last_row = tf.expand_dims(tf.stack([zeros, zeros, zeros], -1), -2)
    last_col = tf.expand_dims(tf.stack([zeros, zeros, zeros, ones], -1), -1)
    return tf.concat([tf.concat([rot, last_row], -2), last_col], -1)
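For instance, a quick usage sketch with a small batch of example angles (arbitrary values):

q = tf.constant([[0.1, 0.2, 0.3]])   # (batch, joints) example angles
rot0 = make_rotation(q[:, 0], 'z')   # homogeneous rotation, shape (1, 4, 4)
print(rot0.shape)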
About computing the forward kinematic chain, you can do that with tf.scan. For example, assuming the initial shape (?, 10):
# Make rotation matrices
rots = make_rotation(...)
rots_t = tf.transpose(rots, (1, 0, 2, 3))
out = tf.scan(tf.matmul, rots_t)[-1]
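The same pattern extends to the full DH transform from the question, and the whole chain can then be evaluated in graph mode without building a Keras Model just to call predict. Below is a sketch; the dh_params values are placeholders for illustration, not the real robot's parameters:

import numpy as np
import tensorflow as tf

def make_dh_transform(q, d, theta, a, alpha):
    # Full DH transform of shape (?, 4, 4), matching the matrix in the question.
    # d, theta, a, alpha are Python floats; q is a batch of joint angles, shape (?,).
    zeros, ones = tf.zeros_like(q), tf.ones_like(q)
    ct, st = tf.cos(theta + q), tf.sin(theta + q)
    ca, sa = float(np.cos(alpha)), float(np.sin(alpha))
    return tf.stack([
        tf.stack([ct,      -st,      zeros,       a * ones      ], -1),
        tf.stack([st * ca,  ct * ca, -sa * ones, -sa * d * ones], -1),
        tf.stack([st * sa,  ct * sa,  ca * ones,  ca * d * ones], -1),
        tf.stack([zeros,    zeros,    zeros,       ones          ], -1),
    ], -2)

# Placeholder DH parameters for the 10 joints (assumed values, for illustration only).
dh_params = [dict(d=0.1055, theta=0.0, a=0.0, alpha=0.0)] + \
            [dict(d=0.0, theta=0.0, a=0.1, alpha=0.0)] * 9

@tf.function  # running in graph mode usually helps with the speed gap described above
def forward_kinematics(q):                                     # q: (batch, 10)
    mats = tf.stack([make_dh_transform(q[:, i], **dh_params[i]) for i in range(10)])
    return tf.scan(tf.matmul, mats)[-1]                        # (batch, 4, 4)

result = forward_kinematics(tf.constant(np.random.random((100, 10)), tf.float32))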

Related

Compute Homography Direct - known camera location(s) - Image is... aliased?

I'm following the code presented here:
Compute homography for a virtual camera with opencv
As a note, I made a tiny modification to the code: in the translation matrix, I'm left-multiplying the first 3 rows of the last column by -R to get the translation in the global frame. I also changed the translation matrix definition to use "-dist" because in the global frame, movement toward the camera would be in the negative z direction.
When I turn the X rotation to 0, I get a weird... aliased version of the loaded image that appears ABOVE the horizon line, where there should be nothing.
My question:
Why? Is this just a weird artifact of how the homography is calculated? How can I get rid of it? I know that for x=0 (in the presented code) I can just ignore/erase anything above the horizon line, but in my use case the x rotation might be -10 to 10 degrees or so. How can I calculate where the horizon line would be in those cases (so I can ignore image data above it), or is there a mathematical approach to computing the homography that bypasses this problem altogether?
Thanks!
EDIT: Adding in code/image in question:
import cv2
import numpy as np

rotXdeg = 90
rotYdeg = 90
rotZdeg = 90
f = 500
dist = 500

def onRotXChange(val):
    global rotXdeg
    rotXdeg = val

def onRotYChange(val):
    global rotYdeg
    rotYdeg = val

def onRotZChange(val):
    global rotZdeg
    rotZdeg = val

def onFchange(val):
    global f
    f = val

def onDistChange(val):
    global dist
    dist = val
if __name__ == '__main__':

    # Read input image, and create output image
    src = cv2.imread('/path/to/image.jpg')
    dst = np.ndarray(shape=src.shape, dtype=src.dtype)

    # Create user interface with trackbars that allow modifying the parameters of the transformation
    wndname1 = "Source:"
    wndname2 = "WarpPerspective: "
    cv2.namedWindow(wndname1, 1)
    cv2.namedWindow(wndname2, 1)
    cv2.createTrackbar("Rotation X", wndname2, rotXdeg, 180, onRotXChange)
    cv2.createTrackbar("Rotation Y", wndname2, rotYdeg, 180, onRotYChange)
    cv2.createTrackbar("Rotation Z", wndname2, rotZdeg, 180, onRotZChange)
    cv2.createTrackbar("f", wndname2, f, 2000, onFchange)
    cv2.createTrackbar("Distance", wndname2, dist, 2000, onDistChange)

    # Show original image
    cv2.imshow(wndname1, src)

    h, w = src.shape[:2]

    while True:
        rotX = (rotXdeg - 90) * np.pi / 180
        rotY = (rotYdeg - 90) * np.pi / 180
        rotZ = (rotZdeg - 90) * np.pi / 180

        # Projection 2D -> 3D matrix
        A1 = np.matrix([[1, 0, -w/2],
                        [0, 1, -h/2],
                        [0, 0, 0   ],
                        [0, 0, 1   ]])

        # Rotation matrices around the X, Y, Z axes
        RX = np.matrix([[1, 0,             0,             0],
                        [0, np.cos(rotX), -np.sin(rotX),  0],
                        [0, np.sin(rotX),  np.cos(rotX),  0],
                        [0, 0,             0,             1]])
        RY = np.matrix([[ np.cos(rotY), 0, np.sin(rotY), 0],
                        [ 0,            1, 0,            0],
                        [-np.sin(rotY), 0, np.cos(rotY), 0],
                        [ 0,            0, 0,            1]])
        RZ = np.matrix([[ np.cos(rotZ), -np.sin(rotZ), 0, 0],
                        [ np.sin(rotZ),  np.cos(rotZ), 0, 0],
                        [ 0,             0,            1, 0],
                        [ 0,             0,            0, 1]])

        # Composed rotation matrix with (RX, RY, RZ)
        R = RX * RY * RZ

        # Translation matrix on the Z axis; changing dist changes the height
        T = np.matrix([[1, 0, 0, 0],
                       [0, 1, 0, 0],
                       [0, 0, 1, -dist],
                       [0, 0, 0, 1]])
        extractT = T[:3, 3:4]
        solveT = -R[:3, :3] @ extractT
        T[:3, 3:4] = solveT

        # Camera intrinsics matrix 3D -> 2D
        A2 = np.matrix([[f, 0, w/2, 0],
                        [0, f, h/2, 0],
                        [0, 0, 1,   0]])

        # Final and overall transformation matrix
        H = A2 * (T * (R * A1))

        # Apply matrix transformation
        cv2.warpPerspective(src, H, (w, h), dst, cv2.INTER_CUBIC)

        # Show the image
        cv2.imshow(wndname2, dst)

        if cv2.waitKey(1) == ord('q'):
            break
Image: [warped output showing the aliased copy above the horizon line]
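One way to locate the horizon for arbitrary small X rotations (a sketch based on the code above, not part of the original post) is to take the image under H of the source plane's line at infinity; pixels on the "sky" side of that line can then be masked out:

def horizon_line(H):
    """Return the horizon as a homogeneous line (a, b, c) with a*x + b*y + c = 0
    in the warped image: the image under H of the source plane's line at infinity."""
    H = np.asarray(H)
    p1 = H @ np.array([1.0, 0.0, 0.0])   # vanishing point of the source u-direction
    p2 = H @ np.array([0.0, 1.0, 0.0])   # vanishing point of the source v-direction
    return np.cross(p1, p2)

To decide which side of the line to keep, evaluate a*x + b*y + c at a pixel known to lie on the ground and keep only pixels with the same sign.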

Creating polygons using rasterio and shapely based on a lon, lat and an array of 0's and 1's

I have a numpy array of 0s and 1s. I am trying to convert the 1s into polygons. I have managed to do so using rasterio and shapely, as seen in the code below:
import numpy as np
import rasterio.features
import shapely.geometry

im = np.array([[0, 0, 0, 0, 0],
               [0, 1, 1, 1, 0],
               [0, 0, 1, 0, 0],
               [0, 1, 1, 1, 0],
               [0, 0, 0, 0, 0]], dtype=np.uint8)  # shapes() needs a supported dtype such as uint8/int16/int32/float32
shapes = rasterio.features.shapes(im)
polygons = [shapely.geometry.Polygon(shape[0]["coordinates"][0]) for shape in shapes if shape[1] == 1]
print(polygons[0])
However, each row and column refers to longitude and latitude coordinates stored in separate arrays, for example:
lon = np.array([125. , 125.25, 125.5 , 125.75, 126. ])
lat = np.array([-35. , -35.25, -35.5 , -35.75, -36. ])
Does anyone know how to create the polygons associated with the correct coordinates? I think I have to use the transform parameter of the rasterio.features.shapes function, but I haven't been able to figure it out yet.
Found the solution. The transform parameter is indeed needed. This parameter is an affine transformation, which can be defined as:
from affine import Affine

xres = (max(lon) - min(lon)) / len(lon)
yres = (lat[-1] - lat[0]) / len(lat)
transform1 = Affine.translation(min(lon) - xres / 2, lat[0] - yres / 2) * Affine.scale(xres, yres)
It can then be used to create the polygons with rasterio and shapely:
shapes = rasterio.features.shapes(im, transform = transform1)
polygons = [shapely.geometry.Polygon(shape[0]["coordinates"][0]) for shape in shapes if shape[1] == 1]
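As a quick sanity check (a sketch, assuming the arrays above), the resulting polygon coordinates should now lie within the lon/lat extents rather than in row/column space:

print(polygons[0].bounds)  # expected to fall roughly within (125, -36) to (126, -35)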

H (observation) matrix in Kalman Filter when only measuring some of the state-space variables

I'm implementing a Kalman filter for a 2D tracked object. I'm measuring the position and the velocity of the object. For the moment, I assume I have all the data from the sensors at the same time, so my observation matrix H is
H = eye(4,4), a 4x4 identity matrix. (See code below)
However, in my final implementation I will have the data from the sensors at different times. So in some update loops I will have the velocity, and in others I will have the position. How would I write the H matrix in those cases?
Is it okay to write
[position loop]
[1, 0, 0, 0 ]
[0, 1, 0, 0 ]
[0, 0, 0, 0 ]
[0, 0, 0, 0 ]
[velocity loop]
[0, 0, 0, 0 ]
[0, 0, 0, 0 ]
[0, 0, 1, 0 ]
[0, 0, 0, 1 ]
Note that my state space variables are [x, y, vx, vy]
I wonder whether using those matrices implies that my observations are zero, or something like that.
Can I leave the covariance matrices untouched? I guess not.
#Implementation of 2D filter with FilterPy.
import numpy as np
from filterpy.kalman import KalmanFilter
from filterpy.common import Q_discrete_white_noise
import matplotlib.pyplot as plt
# --------- PARAM -----------
dt = 0.1
v_dev = 0.3
pos_dev = 0.8
duration = 50
acceleration_noise = 0.3
# --------- MODEL ------------
transition_matrix = [[1,0,dt,0],[0,1,0,dt],[0,0,1,0],[0,0,0,1]]
transition_covariance = np.array([
    [0.25*pow(dt, 4), 0,               0.5*pow(dt, 3), 0             ],
    [0,               0.25*pow(dt, 4), 0,              0.5*pow(dt, 3)],
    [0.5*pow(dt, 3),  0,               dt*dt,          0             ],
    [0,               0.5*pow(dt, 3),  0,              dt*dt         ]]) * acceleration_noise * acceleration_noise  # A large process noise favors the measurements.
# Transition matrix with acceleration component
observation_matrix = np.eye(4, 4)
initial_state = [0, 0, 0.5, 0.5]
initial_state_covariance = [[ pos_dev*pos_dev, 0, 0 ,0],[0, pos_dev*pos_dev, 0, 0],[0, 0, v_dev * v_dev, 0 ],[0, 0, 0, v_dev * v_dev ]]
observation_covariance = [[pos_dev * pos_dev , 0, 0 ,0],[0, pos_dev * pos_dev, 0, 0],[0, 0, v_dev * v_dev, 0 ],[0, 0, 0, v_dev * v_dev ]]
#-----------------------------
#---------- FAKE DATA ---------
ind = np.array( range( round(duration/dt) ) )
time = ind * dt
position = np.zeros( (2, len(ind)) )
position[0,:] = time
position[1,:] = 3 * np.sin(time)
noise = pos_dev * np.random.randn(2, len(ind))
noisy_pos = position + noise
vel = position[:,1:len(ind)] - position[:,0:len(ind)-1]
vel = vel / dt
vel_ind = np.zeros( (2, len(ind) -1 ) )
vel_ind[0,:] = position[0,0:len(ind)-1]
vel_ind[1,:] = position[1,0:len(ind)-1]
vel_noise = v_dev * np.random.randn(2, len(ind) - 1 )
noisy_vel = vel + vel_noise
observations = np.zeros((len(ind), 4))
observations[:,[0,1]] = np.transpose(noisy_pos)
observations[1:len(ind),[2,3]] = np.transpose(noisy_vel)
observations[0,[2,3]] = np.transpose(noisy_vel[[0,1],0] )
# KALMAN!
filtered_state_means = np.zeros((len(time), 4))
filtered_state_covariances = np.zeros( ( len(time), 4, 4) )
kf = KalmanFilter( dim_x = 4, dim_z = 4) # state space: x, y, vx, vy, measuring all
kf.x = np.array( initial_state )
kf.F = np.array( transition_matrix )
kf.H = np.array( observation_matrix )
kf.P = np.array( initial_state_covariance )
kf.Q = np.array( transition_covariance )
kf.R =np.array( observation_covariance ) #measurement covariance
# The GPS-gap interval and its positions are not defined in the original snippet;
# example values are assumed here so the script runs end to end.
no_gps_start, no_gps_end = 200, 300
no_gps_x = position[0, no_gps_start:no_gps_end]
no_gps_y = position[1, no_gps_start:no_gps_end]

for i in range(0, len(time)):
    # Omitting some data points
    if i > no_gps_start and i < no_gps_end:
        # No data from gps
        kf.H = np.array(([0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]))
    else:
        kf.H = observation_matrix
    kf.predict()
    kf.update(observations[i])
    filtered_state_means[i] = kf.x
    filtered_state_covariances[i] = kf.P
# Plotting everything
xmin = -2
xmax = 22
ymin = -4.3
ymax = 4.3
axisLimits = [xmin, xmax, ymin, ymax]
plt.figure(1)
plt.plot( position[0,:], position[1,:], linewidth=1 , color= '0.7')
plt.plot( noisy_pos[0,:], noisy_pos[1,:], '.')
plt.axis( axisLimits )
plt.figure(2)
plt.plot( position[0,:], position[1,:], linewidth=1 , color= '0.7')
plt.quiver( vel_ind[0,:], vel_ind[1,:], noisy_vel[0,:], noisy_vel[1,:], angles='xy', scale_units='xy', scale=10)
plt.axis( axisLimits )
plt.figure(3)
plt.plot( position[0,:], position[1,:], linewidth=1 , color= '0.7', zorder= 1)
plt.plot( filtered_state_means[:,0], filtered_state_means[:,1], linewidth = 1, zorder= 2)
plt.plot( noisy_pos[0,:], noisy_pos[1,:], '.', color = "#fd92f8", zorder= 0)
plt.plot( no_gps_x, no_gps_y, 'ro')
plt.show()
You are right, you are not allowed to modify the observation matrix in this way.
In your case the best solution would be a sequential Kalman filter, which was developed exactly for handling missing measurements. The measurement vector is replaced by a sequence of separate scalar measurements. The filter can process them independently and is not corrupted if one or more measurements do not exist at some point in time.
Have a look at Dan Simon's "Optimal State Estimation", Chapter 6.1 (you can try to find the book online). He derives alternative equations for the Kalman filter, which are pretty easy to implement. The prediction step stays the same; you only need to modify the update step (a sketch follows the list below).
Pros:
you don't need to compute the inverse matrix at all (nice for embedded systems)
if your H matrix has a lot of zeros the equivalent sequential expressions are very short and computationally efficient
Cons:
the R matrix (measurement covariance) has to be diagonal
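A minimal sketch of such a sequential update (assuming the state vector x and covariance P conventions from the code above, and a diagonal R); measurement rows that are missing in a given loop can simply be skipped:

import numpy as np

def sequential_update(x, P, z, H, R):
    """Process each scalar measurement z[i] on its own (sequential Kalman update).
    x: state (n,), P: covariance (n, n), z: measurements (m,),
    H: observation matrix (m, n), R: diagonal measurement covariance (m, m)."""
    for i in range(len(z)):
        h = H[i].reshape(1, -1)            # i-th measurement row, shape (1, n)
        r = R[i, i]                        # scalar measurement variance
        s = float(h @ P @ h.T) + r         # innovation variance
        k = (P @ h.T) / s                  # Kalman gain, shape (n, 1)
        x = x + (k * (z[i] - float(h @ x))).ravel()
        P = P - k @ h @ P
    return x, P

In the position-only loops you would pass just the first two rows of H (and the matching entries of z and R); in the velocity-only loops, the last two, without ever inserting zero rows.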

How can I use scipy's affine_transform to do an arbitrary affine transformation on a color image?

My goal is to transform an image in such a way that three source points are mapped to three target points in an empty array. I have solved finding the correct affine matrix; however, I cannot apply an affine transformation to a color image.
More specifically, I am struggling with the correct use of the scipy.ndimage.interpolation.affine_transform method. As this question and its answers point out, the affine_transform method can be somewhat unintuitive (especially regarding offset calculation). However, user timday shows how to apply a rotation and a shearing to an image and position it in another array, while user geodata gives more background information.
My problem is to generalize the approach shown there (1) to color images and (2) to an arbitrary transformation which I calculated myself.
This is my code (which should run as is on your computer):
import numpy as np
from scipy import ndimage
import matplotlib.pyplot as plt
def calcAffineMatrix(sourcePoints, targetPoints):
    # For three source and three target points, find the affine transformation
    # Function works correctly, not part of the question
    A = []
    b = []
    for sp, trg in zip(sourcePoints, targetPoints):
        A.append([sp[0], 0, sp[1], 0, 1, 0])
        A.append([0, sp[0], 0, sp[1], 0, 1])
        b.append(trg[0])
        b.append(trg[1])
    result, resids, rank, s = np.linalg.lstsq(np.array(A), np.array(b))
    a0, a1, a2, a3, a4, a5 = result
    # Ignoring offset here, later use timday's suggested offset calculation
    affineTrafo = np.array([[a0, a1, 0], [a2, a3, 0], [0, 0, 1]], 'd')

    # Testing the correctness of the transformation matrix
    for i, _ in enumerate(sourcePoints):
        src = sourcePoints[i]
        src.append(1.)
        trg = targetPoints[i]
        trg.append(1.)
        at = affineTrafo.copy()
        at[2, 0:2] = [a4, a5]
        assert(np.array_equal(np.round(np.array(src).dot(at)), np.array(trg)))
    return affineTrafo
# Prepare source image
sourcePoints = [[162., 112.], [130., 112.], [162., 240.]]
targetPoints = [[180., 102.], [101., 101.], [190., 200.]]
image = np.empty((300, 300, 3), dtype='uint8')
image[:] = 255
# Mark border for better visibility
image[0:2, :] = 0
image[-3:-1, :] = 0
image[:, 0:2] = 0
image[:, -3:-1] = 0
# Mark source points in red
for sp in sourcePoints:
    sp = [int(u) for u in sp]
    image[sp[1] - 5:sp[1] + 5, sp[0] - 5:sp[0] + 5, :] = np.array([255, 0, 0])
# Show image
plt.subplot(3, 1, 1)
plt.imshow(image)
# Prepare array in which the image is placed
array = np.empty((400, 300, 3), dtype='uint8')
array[:] = 255
a2 = array.copy()
# Mark target points in blue
for tp in targetPoints:
    tp = [int(u) for u in tp]
    a2[tp[1] - 2:tp[1] + 2, tp[0] - 2:tp[0] + 2] = [0, 0, 255]
# Show array
plt.subplot(3, 1, 2)
plt.imshow(a2)
# Next 5 program lines are actually relevant for question:
# Calculate affine matrix
affineTrafo = calcAffineMatrix(sourcePoints, targetPoints)
# This follows the c_in-c_out method proposed in linked stackoverflow issue
# extended for color channel (no translation here)
c_in = np.array([sourcePoints[0][0], sourcePoints[0][1], 0])
c_out = np.array([targetPoints[0][0], targetPoints[0][1], 0])
offset = (c_in - np.dot(c_out, affineTrafo))
# Affine transform!
ndimage.interpolation.affine_transform(image, affineTrafo, order=2, offset=offset,
                                       output=array, output_shape=array.shape,
                                       cval=255)
# Mark blue target points in array, expected to be above red source points
for tp in targetPoints:
    tp = [int(u) for u in tp]
    array[tp[1] - 2:tp[1] + 2, tp[0] - 2:tp[0] + 2] = [0, 0, 255]
plt.subplot(3, 1, 3)
plt.imshow(array)
plt.show()
Other approaches I tried include working with the inverse, transpose or both of affineTrafo:
affineTrafo = np.linalg.inv(affineTrafo)
affineTrafo = affineTrafo.T
affineTrafo = np.linalg.inv(affineTrafo.T)
affineTrafo = np.linalg.inv(affineTrafo).T
In his answer, geodata shows how to calculate the matrix that affine_transform needs to do a scaling and rotation:
If one wants a scaling S first and then a rotation R it holds that T=R*S and therefore T.inv=S.inv*R.inv (note the reversed order).
Which I tried to copy using matrix decomposition (decomposing my affine transformation into a rotation, a shearing and another rotation):
u, s, v = np.linalg.svd(affineTrafo[:2,:2])
uInv = np.linalg.inv(u)
sInv = np.linalg.inv(np.diag((s)))
vInv = np.linalg.inv(v)
affineTrafo[:2, :2] = uInv.dot(sInv).dot(vInv)
Again, without success.
For all of my results, it's not (only) an offset problem. It is clearly visible from the pictures that the relative positions of source and target points do not correspond.
I searched the web and stackoverflow and did not find an answer for my problem. Please help me! :)
I finally got it working thanks to AlexanderReynolds' hint to use another library. This is of course a workaround; I could not get it working using scipy's affine_transform, so I used OpenCV's cv2.warpAffine instead. In case this is helpful to anyone else, here is my code:
import numpy as np
import matplotlib.pyplot as plt
import cv2
# Prepare source image
sourcePoints = [[162., 112.], [130., 112.], [162., 240.]]
targetPoints = [[180., 102.], [101., 101.], [190., 200.]]
image = np.empty((300, 300, 3), dtype='uint8')
image[:] = 255
# Mark border for better visibility
image[0:2, :] = 0
image[-3:-1, :] = 0
image[:, 0:2] = 0
image[:, -3:-1] = 0
# Mark source points in red
for sp in sourcePoints:
    sp = [int(u) for u in sp]
    image[sp[1] - 5:sp[1] + 5, sp[0] - 5:sp[0] + 5, :] = np.array([255, 0, 0])
# Show image
plt.subplot(3, 1, 1)
plt.imshow(image)
# Prepare array in which the image is placed
array = np.empty((400, 300, 3), dtype='uint8')
array[:] = 255
a2 = array.copy()
# Mark target points in blue
for tp in targetPoints:
    tp = [int(u) for u in tp]
    a2[tp[1] - 2:tp[1] + 2, tp[0] - 2:tp[0] + 2] = [0, 0, 255]
# Show array
plt.subplot(3, 1, 2)
plt.imshow(a2)
# Calculate affine matrix and transform image
M = cv2.getAffineTransform(np.float32(sourcePoints), np.float32(targetPoints))
array = cv2.warpAffine(image, M, array.shape[:2], borderValue=[255, 255, 255])
# Mark blue target points in array, expected to be above red source points
for tp in targetPoints:
    tp = [int(u) for u in tp]
    array[tp[1] - 2:tp[1] + 2, tp[0] - 2:tp[0] + 2] = [0, 0, 255]
plt.subplot(3, 1, 3)
plt.imshow(array)
plt.show()
Comments:
Interesting how it worked almost immediately after changing the library. After having spent more than a day trying to get it to work with scipy, this is a lesson to myself to change libraries sooner.
In case someone wants to find a (least squares) approximation of an affine transformation based on more than three points, this is how you get the matrix that works with cv2.warpAffine:
Code:
def calcAffineMatrix(sourcePoints, targetPoints):
    # For three or more source and target points, find the affine transformation
    A = []
    b = []
    for sp, trg in zip(sourcePoints, targetPoints):
        A.append([sp[0], 0, sp[1], 0, 1, 0])
        A.append([0, sp[0], 0, sp[1], 0, 1])
        b.append(trg[0])
        b.append(trg[1])
    result, resids, rank, s = np.linalg.lstsq(np.array(A), np.array(b))
    a0, a1, a2, a3, a4, a5 = result
    affineTrafo = np.float32([[a0, a2, a4], [a1, a3, a5]])
    return affineTrafo
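A short usage sketch (with the sourcePoints, targetPoints, image and array from the listing above); note that cv2.warpAffine expects dsize as (width, height):

M = calcAffineMatrix(sourcePoints, targetPoints)
warped = cv2.warpAffine(image, M, (array.shape[1], array.shape[0]),
                        borderValue=[255, 255, 255])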

camera calibration change orientation of axis

I want to solve for the extrinsics by using direct linear transformation on corresponding 3D LIDAR points and 2D camera points. I already have the intrinsics.
The problem is, points behind the camera get re-projected as well (see picture below).
So I constrain to only points "in front of the camera", i.e. z > 0. The problem is that on different trials, where different sets of points are used, the produced extrinsic matrix yields differing axes. Sometimes constraining z > 0 gives the right results (centre part of the image), whereas other times I need z < 0, which I believe corresponds to the z-axis going into the camera. So the question is: how do I constrain the Z axis of the camera to be sticking out of the camera?
def with_intrinsic(points2d, points3d, intrinsic):
    cam1_K_inverse = np.linalg.inv(intrinsic)

    # direct linear transformation calibration, assumes no intrinsic matrix
    assert points2d.shape[0] >= 3
    assert points3d.shape[0] == points2d.shape[0]

    A = []
    points2d_homo = []
    for u, v in points2d:
        points2d_homo.append([u, v, 1])
    points2d_homo = np.array(points2d_homo).T  # columns to be data points

    points2d_inv = np.dot(cam1_K_inverse, points2d_homo).T
    assert points2d_inv.shape == (points2d.shape[0], 3)
    assert points2d_inv[0, 2] == 1

    for idx in range(points2d.shape[0]):
        x3d, y3d, z3d = points3d[idx]
        u, v, _ = points2d_inv[idx]
        A.append([x3d, y3d, z3d, 1, 0, 0, 0, 0, -u * x3d, -u * y3d, -u * z3d, -u])
        A.append([0, 0, 0, 0, x3d, y3d, z3d, 1, -v * x3d, -v * y3d, -v * z3d, -v])
    A = np.array(A)

    U, D, VT = np.linalg.svd(A)
    M = VT.T[:, -1].reshape((3, 4))

    error = get_reprojection_error(points2d, points3d, intrinsic, M)
    logging.debug("error with_intrinsic: %s", error)
    return M
Update: I tried to check whether re-projecting one of the "training" points yields z < 0. If so, I do np.dot(R, extrinsic) to rotate the point by pi radians around one of the axes. I've tried all 3 axes, but that still doesn't seem to yield the correct result.
pi = np.pi

R1 = np.array([
    [1, 0, 0],
    [0, np.cos(pi), -np.sin(pi)],
    [0, np.sin(pi), np.cos(pi)],
])
R2 = np.array([
    [np.cos(pi), 0, np.sin(pi)],
    [0, 1, 0],
    [-np.sin(pi), 0, np.cos(pi)],
])
R3 = np.array([
    [np.cos(pi), -np.sin(pi), 0],
    [np.sin(pi), np.cos(pi), 0],
    [0, 0, 1],
])
