How to get a good cv2.stereoCalibrate after successful cv2.calibrateCamera - python

Hi everyone. I've been digging into computer vision with Python and OpenCV, and I'm trying to calibrate two cameras I bought in order to do some 3D stereo reconstruction, but I'm running into problems.
I've mostly followed this tutorial to calibrate the cameras separately (I apply it to each of them), and then I intend to use cv2.stereoCalibrate to get the relative calibration.
With the single-camera calibration everything seems to work correctly: I get a very low reprojection error, and as far as I can tell the matrices look OK. Here are the results of the single-camera calibration.
cameraMatrix1 and distCoeffs1:
[[ 951.3607329 0. 298.74117671]
[ 0. 954.23088299 219.20548594]
[ 0. 0. 1. ]]
[[ -1.07320015e-01 -5.56147908e-01 -1.13339913e-03 1.85969704e-03
2.24131322e+00]]
cameraMatrix2 and distCoeffs2:
[[ 963.41078117 0. 362.85971342]
[ 0. 965.66793023 175.63216871]
[ 0. 0. 1. ]]
[[ -3.31491728e-01 2.26020466e+00 3.86190151e-03 -2.32988011e-03
-9.82275646e+00]]
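For context, those numbers came from the standard per-camera calibration step; a minimal sketch of it (the exact variable names here are assumptions, chosen to match the stereo call below):
# objpoints / imgpoints_* come from chessboard corner detection, as in the tutorial.
ret1, cameraMatrix1, distCoeffs1, rvecs1, tvecs1 = cv2.calibrateCamera(
    objpoints, imgpoints_left, gray_left.shape[::-1], None, None)
ret2, cameraMatrix2, distCoeffs2, rvecs2, tvecs2 = cv2.calibrateCamera(
    objpoints, imgpoints_right, gray_right.shape[::-1], None, None)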
So after having those I do the following (I fix the intrinsics as I already know them from the previous calibrations):
stereocalibration_criteria = (cv2.TERM_CRITERIA_MAX_ITER + cv2.TERM_CRITERIA_EPS, 100, 1e-5)
stereocalibration_flags = cv2.CALIB_FIX_INTRINSIC
stereocalibration_retval, cameraMatrix1, distCoeffs1, cameraMatrix2, distCoeffs2, R, T, E, F = cv2.stereoCalibrate(
    objpoints, imgpoints_left, imgpoints_right,
    cameraMatrix1, distCoeffs1, cameraMatrix2, distCoeffs2,
    gray_left.shape[::-1],
    criteria=stereocalibration_criteria, flags=stereocalibration_flags)
I've tried changing the stereoCalibrate flags and swapping the matrices several times, in case I had the order wrong and that mattered, but I'm still stuck: I get a retval of around 30 (and when I then try to rectify the images, the result is, of course, a disaster).
I've also tried some calibration images from the internet and get the same result, so I assume the problem is not with the images I've taken. If anyone can point me in the right direction or knows what it could be, it will be very welcome.

Turns out the order of the images I was using was not the same for the right and left cameras... I was using
images_left = glob.glob('Calibration/images/set1/left*' + images_format)
images_right = glob.glob('Calibration/images/set1/right*' + images_format)
When I should have been using something more like:
images_left = sorted(glob.glob('Calibration/images/set1/left*' + images_format))
images_right = sorted(glob.glob('Calibration/images/set1/right*' + images_format))
This is because glob returns the images in an arbitrary order, so I was trying to match the wrong image pairs. Now I finally get a retval of about 0.4, which is not that bad.
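A cheap way to guard against this in the future (a hedged sketch that assumes filenames like left01.png / right01.png, so the shared numeric suffix identifies a pair):
import os
for left, right in zip(images_left, images_right):
    # Strip the 'left'/'right' prefix; what remains should identify the pair.
    left_id = os.path.basename(left).replace('left', '')
    right_id = os.path.basename(right).replace('right', '')
    assert left_id == right_id, 'Mismatched pair: %s vs %s' % (left, right)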

Related

SciPy Rotation only gives correct answer when I use inv()

What am I doing wrong here?
I have a function that rotates Euler-angle inputs using rotation matrices and NumPy. It is proven correct: it is currently running in real time in the flight-control loop of drones.
I'm trying to implement the same thing using SciPy's Rotation. When I do, the result is gibberish, but if I stick an inv() in there for seemingly no reason, it gives the correct answer. Where did I stuff up?
VF, BUL and VTOL are just different reference frames. I load the rotation matrices from a YAML file. My input (vehicle_rph) is a time-series array of vectors measured in the VF frame, but I need the answer in the VTOL frame. Everything in the YAML is expressed relative to BUL, so to get from VF to VTOL I have to go through VF -> BUL -> VTOL.
Below I do everything twice: once in NumPy and once in SciPy. As you can see, the NumPy result (proven correct) matches the SciPy result only when I put a seemingly random inv() in there.
Bonus points if you can tell me why the Z axis differs slightly, but it's close enough for my needs as is.
Thanks in advance!!
INPUT:
#using numpy arrays
R_VF_from_BUL = frames['frames']['VF']['R_vf_from_bul']
R_VTOL_from_BUL = frames['frames']['VTOL-B']['R_vtolb_from_bul']
R_BUL_from_VF = np.transpose(R_VF_from_BUL)
R_VTOL_from_VF = np.matmul(R_VTOL_from_BUL, R_BUL_from_VF)
#using scipy rotations
r_vf_from_bul = R.from_matrix(frames['frames']['VF']['R_vf_from_bul'])
r_vtol_from_bul = R.from_matrix(frames['frames']['VTOL-B']['R_vtolb_from_bul'])
r_bul_from_vf = r_vf_from_bul.inv()
r_vtol_from_vf = r_vtol_from_bul*r_bul_from_vf
print(f'Using numpy:\n{R_VTOL_from_VF}')
print('')
print(f'Using scipy:\n{r_vtol_from_vf.as_matrix()}')
vehicle_rph_vtol_sp = (r_vtol_from_vf*R.from_euler('XYZ', vehicle_rph)).as_euler('XYZ')
vehicle_rph_vtol_sp_inv = (r_vtol_from_vf.inv()*R.from_euler('XYZ', vehicle_rph)).as_euler('XYZ')
vehicle_rph_vtol_np = transform_euler_angles_signal_to_vtol_from_vf_frame(vehicle_rph)
print('')
print(f'sp reg: {vehicle_rph_vtol_sp[0]}')
print(f'sp inv: {vehicle_rph_vtol_sp_inv[0]}')
print(f'np reg: {vehicle_rph_vtol_np[0]}')
OUTPUT:
Rotation matrix using NumPy:
[[ 0.52205926 0. 0.85290921]
[ 0. 1. 0. ]
[-0.85290921 0. 0.52205926]]
Rotation matrix using SciPy:
[[ 0.52205926 0. 0.85290921]
[-0. 1. 0. ]
[-0.85290921 -0. 0.52205926]]
sp reg: [-3.11319763 1.13894171 -1.90741245]
sp inv: [-0.01189337 -0.04056495 1.25948718]
np reg: [-0.01189337 -0.04056495 1.29595719]
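For what it's worth, the two inverse conventions used above should agree exactly; a minimal sketch with a made-up stand-in for the YAML matrix (the angle is hypothetical, not the real frame data):
import numpy as np
from scipy.spatial.transform import Rotation as R
M = R.from_euler('Y', 58.5, degrees=True).as_matrix()  # hypothetical R_vf_from_bul
# SciPy's inv() on a rotation built with from_matrix() matches the NumPy transpose.
print(np.allclose(R.from_matrix(M).inv().as_matrix(), M.T))  # True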

Scikit FDA - Landmark_registration Problem

After a smoothing procedure, I have a problem with the landmark registration in this line:
skfda.preprocessing.registration.landmark_registration_warping(fd, land)
It returns the following error:
ValueError: `x` must be strictly increasing sequence.
fd is an FDataGrid (the typical data type used to represent the functions) with 5 samples, while land is an array of the landmarks that I want to align; each row is an increasing sequence of points (see below).
land <- array([[[0.1 , 0.134, 0.258, 0.292, 0.328, 0.558, 0.602],
[0.1 , 0.126, 0.23 , 0.256, 0.292, 0.454, 0.474],
[0.1 , 0.148, 0.25 , 0.278, 0.34 , 0.514, 0.568],
[0.1 , 0.116, 0.25 , 0.276, 0.298, 0.508, 0.612],
[0.1 , 0.132, 0.258, 0.286, 0.376, 0.59 , 0.648]]])
fd <-
Can somebody help me? I'm using the scikit-fda package to perform this kind of analysis. This is the link to the function I'm using:
https://fda.readthedocs.io/en/latest/modules/preprocessing/autosummary/skfda.preprocessing.registration.landmark_registration.html#skfda.preprocessing.registration.landmark_registration
I had this error when finding my own landmarks: I forgot to pass in the actual domain value at each landmark (in my case, the peak(s) I wanted to align). Once I did that, my error changed to ValueError: Sample points must be within the domain range. Which brings me to my next point:
Manually specifying the end result landmark locations allowed the code to run, and from what I can tell "work." I'm not sure if this is a bug, or if I am doing something wrong myself. However, the examples they provide do explicitly state that the end result landmark locations shouldn't have to be specified.
Additionally, the registered landmarks do not seem to end up exactly at the specified points; they end up at the closest point in the grid_points array. This may not be obvious (or a problem) for high-sample-rate data, but the demo GAIT data scikit-fda provides has only 20 sample points, so it is clearly visible that the landmarks do not land exactly where specified. The same happens when converting to a basis representation. One could experiment with the interpolation options and see if that helps.
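For what it's worth, here is a minimal sketch of the pattern that eventually worked for me, on toy data (hedged: keyword names such as grid_points vary between scikit-fda versions; the key point is only that landmarks are domain values inside the domain range, not array indices):
import numpy as np
import skfda
grid = np.linspace(0, 1, 101)
data = np.array([np.sin(2 * np.pi * grid),
                 np.sin(2 * np.pi * (grid - 0.05))])
fd = skfda.FDataGrid(data_matrix=data, grid_points=grid)
# One landmark per sample: the *domain value* of each peak, not its index.
landmarks = np.array([[grid[np.argmax(data[0])]],
                      [grid[np.argmax(data[1])]]])
warping = skfda.preprocessing.registration.landmark_registration_warping(
    fd, landmarks)
fd_registered = fd.compose(warping)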

I can't seem to grasp how to use a radial basis function kernel for a classification task in python

I'm tasked with using Parzen windows with the radial basis function kernel to determine which label to give to a given point.
My training data set has 4 dimensions (4 features per point).
My training label set contains the labels (which can be 0,1,2,... depending on how many classes we have) for all the points in my training set (It's a 1D-array).
My test data set contains a couple of points, each with 4 dimensions but no label, so it's an n x 4 array.
We're interested in giving labels for each of the points in my test data set.
I first compute the RBF kernel $k(x_i, x) = \exp(-\lVert x - x_i \rVert^2 / (2\sigma^2))$ (using Python and NumPy):
for (i, ex) in enumerate(test_data):
    squared_distances = (np.sum((np.abs(ex - self.train_inputs)) ** 2, axis=1)) ** (1.0 / 2)
    k = np.exp(- squared_distances/2*(np.square(self.sigma)))
Let's assume that test_data looks like this :
[[ 0.40614 1.3492 -1.4501 -0.55949]
[ -1.3887 -4.8773 6.4774 0.34179]
[ -3.7503 -13.4586 17.5932 -2.7771 ]
[ -3.5637 -8.3827 12.393 -1.2823 ]
[ -2.5419 -0.65804 2.6842 1.1952 ]]
ex is a point from the test data set. here as an example :
[ 0.40614 1.3492 -1.4501 -0.55949]
self.train_inputs is the training data set and it looks like this
[[ 3.6216 8.6661 -2.8073 -0.44699]
[ 4.5459 8.1674 -2.4586 -1.4621 ]
[ 3.866 -2.6383 1.9242 0.10645]
...
[-1.1667 -1.4237 2.9241 0.66119]
[-2.8391 -6.63 10.4849 -0.42113]
[-4.5046 -5.8126 10.8867 -0.52846]]
k is an array containing the kernel value k(x_i, x) for every x_i in self.train_inputs against our current test point x (which is ex in the code).
k = [0.99837982 0.9983832 0.99874063 ... 0.9988909 0.99706044 0.99698724]
It has the same length as the number of points in self.train_inputs.
My understanding of the radial basis function is that the closer a training point is to the test point, the greater the value of k(training point, test point); k can never exceed 1 or go below 0.
So the goal is to select the training point closest to the test point. We do this by finding the greatest value in k, taking its index, and using that same index in the array containing the labels. That gives us the label we want our test point to take.
In code it translates to this (the additional code goes below the first snippet above, inside the loop):
    best_arg = np.argmax(k)  # index of the greatest value in k
    classes_pred[i] = self.train_labels[best_arg]  # use that index to select the label
Here self.train_labels looks like :
[0. 0. 0. ... 1. 1. 1.]
For ex = [ 0.40614 1.3492 -1.4501 -0.55949] and k = [0.99837982 0.9983832 0.99874063 ... 0.9988909 0.99706044 0.99698724], this approach gives 818 as the index of the greatest value in k, and 1.0 as the label, since self.train_labels[818] = 1.
However, it seems I'm doing this wrong: compared with a reference implementation from my teacher, I get some of the labels wrong (especially when there are more than two classes). Am I doing this wrong, and if so, where? I'm new to machine learning, btw.
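For comparison, the classical Parzen-window rule does not pick a single nearest neighbour: it sums the kernel weight of every training point per class and predicts the class with the largest total. A minimal self-contained sketch (function and variable names are illustrative), using the usual kernel exp(-||x - x_i||^2 / (2 * sigma^2)):
import numpy as np
def parzen_rbf_predict(test_data, train_inputs, train_labels, sigma):
    n_classes = int(train_labels.max()) + 1
    preds = np.empty(len(test_data), dtype=int)
    for i, ex in enumerate(test_data):
        # Squared Euclidean distance from ex to every training point.
        sq_dists = np.sum((ex - train_inputs) ** 2, axis=1)
        # RBF kernel values: exp(-d^2 / (2 * sigma^2)).
        k = np.exp(-sq_dists / (2.0 * sigma ** 2))
        # Total kernel mass per class; predict the heaviest class.
        scores = np.bincount(train_labels.astype(int), weights=k,
                             minlength=n_classes)
        preds[i] = np.argmax(scores)
    return preds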

Tensorflow vs Numpy math functions

Is there any real difference between the math functions provided by NumPy and TensorFlow, for example the exponential function or the max function?
The only difference I've noticed is that TensorFlow takes tensors as input rather than NumPy arrays.
Is that the only difference, with no difference in the values the functions return?
As has been mentioned, there is the performance difference. TensorFlow has the advantage that it has been designed to work on both CPUs and GPUs, so if you have a CUDA-enabled GPU, chances are TensorFlow is going to be much faster. You can find several benchmarks on the web with different comparisons, and also against other packages such as Numba or Theano.
However, I think that you are talking about whether NumPy and TensorFlow operations are exactly equivalent. The answer is basically yes, that is, the meaning of the operations is the same. However, since they are completely separate libraries with different implementations for everything, you will find small differences in the results. Take this code, for example (TensorFlow 1.2.0, NumPy 1.13.1):
# Force TensorFlow to run on CPU only
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
import numpy as np
import tensorflow as tf
# float32 NumPy array
a = np.arange(100, dtype=np.float32)
# The same array with the same dtype in TensorFlow
a_tf = tf.constant(a, dtype=tf.float32)
# Square root with NumPy
sqrt = np.sqrt(a)
# Square root with TensorFlow
with tf.Session() as sess:
    sqrt_tf = sess.run(tf.sqrt(a_tf))
You would expect pretty much the same output from both; a square root doesn't sound like an extremely complex operation, after all. However, printing these arrays on my computer I get:
print(sqrt)
>>> array([ 0. , 1. , 1.41421354, 1.73205078, 2. ,
2.23606801, 2.44948983, 2.64575124, 2.82842708, 3. ,
3.1622777 , 3.31662488, 3.46410155, 3.60555124, 3.7416575 ,
3.87298346, 4. , 4.12310553, 4.2426405 , 4.35889912,
4.47213602, 4.5825758 , 4.69041586, 4.79583168, 4.89897966,
5. , 5.09901953, 5.19615221, 5.29150248, 5.38516474,
5.47722578, 5.56776428, 5.65685415, 5.74456263, 5.83095169,
5.91608 , 6. , 6.08276272, 6.16441393, 6.24499798,
6.3245554 , 6.40312433, 6.48074055, 6.55743837, 6.63324976,
6.70820379, 6.78233004, 6.85565472, 6.92820311, 7. ,
7.07106781, 7.14142847, 7.21110249, 7.28010988, 7.34846926,
7.41619825, 7.48331499, 7.54983425, 7.6157732 , 7.68114567,
7.74596691, 7.81024981, 7.8740077 , 7.93725395, 8. ,
8.06225777, 8.1240387 , 8.18535233, 8.24621105, 8.30662346,
8.36660004, 8.42614937, 8.48528099, 8.54400349, 8.60232544,
8.66025448, 8.71779823, 8.77496433, 8.83176041, 8.88819408,
8.94427204, 9. , 9.05538559, 9.11043358, 9.1651516 ,
9.21954441, 9.2736187 , 9.32737923, 9.38083172, 9.43398094,
9.48683262, 9.53939247, 9.59166336, 9.64365101, 9.69536018,
9.7467947 , 9.79795933, 9.84885788, 9.89949512, 9.94987392], dtype=float32)
print(sqrt_tf)
>>> array([ 0. , 0.99999994, 1.41421342, 1.73205078, 1.99999988,
2.23606801, 2.44948959, 2.64575124, 2.82842684, 2.99999976,
3.1622777 , 3.31662488, 3.46410155, 3.60555077, 3.74165726,
3.87298322, 3.99999976, 4.12310553, 4.2426405 , 4.35889864,
4.47213602, 4.58257532, 4.69041538, 4.79583073, 4.89897919,
5. , 5.09901857, 5.19615221, 5.29150248, 5.38516474,
5.47722483, 5.56776428, 5.65685368, 5.74456215, 5.83095121,
5.91607952, 5.99999952, 6.08276224, 6.16441393, 6.24499846,
6.3245554 , 6.40312433, 6.48074055, 6.5574379 , 6.63324976,
6.70820427, 6.78233004, 6.85565472, 6.92820311, 6.99999952,
7.07106733, 7.14142799, 7.21110153, 7.28010893, 7.34846973,
7.41619825, 7.48331451, 7.54983425, 7.61577368, 7.68114567,
7.74596643, 7.81025028, 7.8740077 , 7.93725395, 7.99999952,
8.06225681, 8.12403774, 8.18535233, 8.24621105, 8.30662346,
8.36660004, 8.42614937, 8.48528099, 8.54400253, 8.60232449,
8.66025352, 8.71779728, 8.77496433, 8.83176041, 8.88819408,
8.94427204, 8.99999905, 9.05538464, 9.11043262, 9.16515064,
9.21954441, 9.27361774, 9.32737923, 9.38083076, 9.43398094,
9.48683357, 9.53939152, 9.59166145, 9.64365005, 9.69535923,
9.7467947 , 9.79795837, 9.84885788, 9.89949417, 9.94987392], dtype=float32)
So, okay, it's similar, but there are obvious differences; TensorFlow couldn't even get the square roots of 1, 4 or 9 exactly right, for example. And you would probably get yet another result if you run it on a GPU (the GPU kernels differ from the CPU kernels, and there is the dependence on CUDA routines implemented by NVIDIA, another player in the field).
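To put a number on those differences: continuing from the code above, the two arrays disagree only at the level of float32 rounding, which a quick check makes clear:
diff = np.abs(sqrt - sqrt_tf)
print(diff.max())  # on the order of 1e-6, i.e. last-bit float32 noise
print(np.allclose(sqrt, sqrt_tf, rtol=1e-5))  # True: equal within tolerance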
My impression (although I may be wrong) is that TensorFlow is more willing to sacrifice a bit of precision in exchange for performance (which makes sense considering its typical use case). I have even seen more complicated operations produce (very slightly) different results across two runs on the same hardware, probably because an unspecified order in aggregation and averaging operations causes rounding errors (I generally use float32, so that's a factor too, I guess).
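That order-dependence is easy to demonstrate with NumPy alone, since float32 addition is not associative; a small sketch:
import numpy as np
x = np.random.RandomState(0).rand(10000).astype(np.float32)
acc = np.float32(0.0)
for v in x:
    acc += v  # naive left-to-right accumulation
# np.sum uses pairwise summation, a different order, so the low-order
# digits typically differ from the sequential result.
print(acc, x.sum())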
Of course there is a real difference. NumPy works on arrays with highly optimized vectorized computations and does very well on the CPU, whereas TensorFlow's math functions are optimized for the GPU, where heavy matrix multiplication matters much more. So the question is where you want to use what: for the CPU I would just go with NumPy, whereas for the GPU it makes sense to use TF operations.

Replace numbers below threshold # in numpy array with zeroes

So I have a very big NumPy array (2560x1920). It's actually from a grayscale picture, where every pixel is given a number from 0 to 1 indicating its brightness.
I'm trying to replace all values below a threshold, say 0.5, with zeroes.
This is probably a simple task, but I'm a beginner with NumPy, and I've searched around and still can't figure it out.
This is what I've attempted, and I know it's wrong...
for x in np.nditer(Image):
    if x < .5:
        x == 0
plt.imshow(Image, cmap=plt.cm.gray)
plt.show()
It just outputs the normal image without changing anything.
Also the array looks something like this (abbreviated obviously):
[[ 0.24565263 0.24565263 0.24902149 ..., 0.27528678 0.27265316
0.27606536]
[ 0.24565263 0.24565263 0.24902149 ..., 0.27870309 0.27606536
0.27948296]
[ 0.24228902 0.24228902 0.24565263 ..., 0.28212482 0.27948296
0.282906 ]
...,
[ 0.29706944 0.29706944 0.29706944 ..., 0.17470162 0.17144636
0.17144636]
[ 0.29362457 0.29362457 0.29362457 ..., 0.17144636 0.16495056
0.16170998]
[ 0.2901852 0.2901852 0.2901852 ..., 0.16819602 0.16170998
0.15847427]]
You can use NumPy's built-in boolean indexing to replace elements. This can be done as:
Image[Image<0.5] = 0
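If you would rather not modify the array in place, np.where expresses the same idea without mutation:
result = np.where(Image < 0.5, 0, Image)  # 0 where below threshold, original value elsewhere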
I have returned from the future with suggestions.
The above approaches work great for a simple global threshold.
I'm posting this answer to warn that non-adaptive thresholding, like this, may be too naïve depending on your application.
Without adaptation to an image's average brightness or other qualities, your output won't be consistent if you are analyzing multiple pictures taken in different conditions.
There are much more accurate approaches for this, but they are a tiny bit more complicated. Scikit-Image makes this easy.
One of the most popular approaches is Otsu's (I can't say which is the most accurate for each situation, I haven't researched enough).
https://en.wikipedia.org/wiki/Otsu%27s_method
Scikit-Image has this and a few other algorithms built into their modules.
The answer to the question above, using this approach, is as simple as this:
import matplotlib.pyplot as plt
from skimage.filters import threshold_otsu
thresh = threshold_otsu(Image)
Image[Image < thresh] = 0  # zero out everything below the Otsu threshold
plt.imshow(Image, cmap=plt.cm.gray)
plt.show()
Read an example here:
http://scikit-image.org/docs/dev/auto_examples/plot_otsu.html
And about usage here:
http://scikit-image.org/docs/dev/api/skimage.filters.html?highlight=local%20otsu
