How to use optical flow tracking in opencv to segment an image?

How to use optical flow tracking in opencv to segment an image? - python

Using calcopticalflowpyrlk from opencv2 to track the motion flow, of objects I picked on the first frame (green dots):
I draw line between the old points fed to calcopticalflowpyrlk and the ones outputed by calcopticalflowpyrlk
At the end I get this nice track
Quoting #rotating_image answer to a similar question:
You can measure the direction and the magnitude of the displacement
each pixel of interest undergoes in two successive frames to get an
idea of their movement pattern
Indeed, using previous and current spot of the tracked object, I can find the flow vector angle and magnitude.
But I still can't see how does it help me segment the image?
Should I compute the vectors of all the pixels, and those that have ~"the same" angel and magnitude found previously are the object and everything else is the background?
Or am I missing something?

Assuming you have flow images and you want to auto track blob of flows that going to the same direction.
So what you get is sparse flow which would look like sth below
You can use opencv partition for this. The partition is like distance based clustering algorithm which is better than kmean because you dont have to enter the number k. Problem is it is subject to noise and false associations. So I`ll prefer to use it on set of flow vector which is larger than a threshold.
you can find a sample below
int th_distance = 18; // radius tolerance
int th2 = th_distance * th_distance; // squared radius tolerance
vector<int> labels;
int n_labels = partition(pts, labels, [th2](const Point& lhs, const Point& rhs) {
return ((lhs.x - rhs.x)*(lhs.x - rhs.x) + (lhs.y - rhs.y)*(lhs.y - rhs.y)) < th2;
});
->
where each color means a segmentes. You can adjust the parameter for your video
Then based on the initial clustering, use convex hull to get proper shape of each car out.
here is the sample https://docs.opencv.org/2.4/doc/tutorials/imgproc/shapedescriptors/hull/hull.html
At last, aggregate the motion vector into final vector K and denote on the final vector K on the center of the hull.
Then concatenate the final vector K of each image to form a trajectory.

Related

Taking the coordinates of an object and creating a formula to drag an arrow

I am using OpenCV to triangulate the position of an object, and am trying to create some kind of formula to pass the coordinates that I obtain through to drag a pull arrow, casting a fishing rod. I tried using polynomial regression to a very high degree, but it is still inaccurate due to the regression not being able to take into account an (x,y) input to an (x,y) output, rather just an x input to x output etc. I have attached screenshots below for clarity, alongside my obtained formulas from the regression. Any help/ideas/suggestions would be appreciated, thanks.
Edit:
The xy coordinates are organized from the landing position to the position where the arrow was pulled to for the bobber to land there. This is because the fishing blob is the input, and the arrow pull end location comes from the blob location. I am using OpenCV to obtain the x,y coordinates, which I believe is just an x,y coordinate system of the 2d screen.
The avatar position is locked, and the button to cast the rod is located at an absolute position of (957,748).
The camera position is locked with no rotation or movement.
I believe that the angle the rod is cast at is likely a 1:1 opposite of where it is pulled to. Ex: if the rod was pulled to 225 degrees it would cast at 45 degrees. I am not 100% sure, but I think that the strength is linear. I used linear regression partially because I was not sure about this. There is no altitude difference/slope/wind that affects the cast. The only affecting factor of landing position is where the arrow is dragged to. The arrow will not drag past the 180/360 degree position sideways (relative to cast button) and will simply lock the cast angle in the x direction if it is held there.
The x-y data was collected with a simple program to move the mouse to the same position (957,748) and drag the arrow to cast the rod with different drag strengths/positions to create some kind of line of best fit for a general formula for casting the rod. The triang_x and y functions included are what the x and y coordinates were run through respectively to triangulate the ending drag coordinate for the arrow. This does not work very well because matching the x-to-x and y-to-y doesn't account for x and y data in each formula, just x-to-x etc.
Left column is fishing spot coordinates, right column is where arrow is dragged to to hit the fish spot.
(1133,359) to (890,890)
(858,334) to (886, 900)
(755,579) to (1012,811)
(1013,255) to (933,934)
(1166,469) to (885,855)
(1344,654) to (855,794)
(804,260) to (1024,939)
(1288,287) to (822,918)
(624,422) to (1075,869)
(981,460) to (949,851)
(944,203) to (963,957)
(829,367) to (1005,887)
(1129,259) to (885,932)
(773,219) to (1036,949)
(1052,314) to (919,908)
(958,662) to (955,782)
(1448,361) to (775,906)
(1566,492) to (751,837)
(1275,703) to (859,764)
(1210,280) to (852,926)
(668,513) to (1050,836)
(830,243) to (1011,939)
(688,654) to (1022,792)
(635,437) to (1072,864)
(911,252) to (976,935)
(1499,542) to (785,825)
(793,452) to (1017,860)
(1309,354) to (824,891)
(1383,522) to (817,838)
(1262,712) to (867,758)
(927,225) to (980,983)
(644,360) to (1097,919)
(1307,648) to (862,798)
(1321,296) to (812,913)
(798,212) to (1026,952)
(1315,460) to (836,854)
(700,597) to (1028,809)
(868,573) to (981,811)
(1561,497) to (758,838)
(1172,588) to (896,816)
Shows bot actions taken within function and how formula is used.
coeffs_x = np.float64([
-7.9517089428836911e+005,
4.1678460255861210e+003,
-7.5075555590709371e+000,
4.2001528427460097e-003,
2.3767929866943760e-006,
-4.7841176483548307e-009,
6.1781765539212100e-012,
-5.2769581174002655e-015,
-4.3548777375857698e-019,
2.5342561455214514e-021,
-1.4853535063513160e-024,
1.5268121610772846e-027,
-2.9667978919426497e-031,
-9.5670287721717018e-035,
-2.0270490020866057e-037,
-2.8248895597371365e-040,
-4.6436110892973750e-044,
6.7719507722602512e-047,
7.1944028726480678e-050,
1.2976299392064562e-052,
7.3188205383162127e-056,
-6.3972284918241943e-059,
-4.1991571617797430e-062,
2.5577340340980386e-066,
-4.3382682133956009e-068,
1.5534384486024757e-071,
5.1736875087411699e-075,
7.8137258396620031e-078,
2.6423817496804479e-081,
2.5418438527686641e-084,
-2.8489136942892384e-087,
-2.3969101111450846e-091,
-3.3499890707855620e-094,
-1.4462592756075361e-096,
6.8375394909274851e-100,
-2.4083095685910846e-103,
7.0453288171977301e-106,
-2.8342463921987051e-109
])
triang_x = np.polynomial.Polynomial(coeffs_x)
coeffs_y = np.float64([
2.6215449742035207e+005,
-5.7778572049616614e+003,
5.1995066291482431e+001,
-2.3696608508824663e-001,
5.2377319234985116e-004,
-2.5063316505492962e-007,
-9.2022083686040928e-010,
3.8639053124052189e-013,
2.7895763914453325e-015,
7.3703786336356152e-019,
-1.3411964395287408e-020,
1.5532055573746500e-023,
-6.9719956967963252e-027,
1.9573598517734802e-029,
-3.3847482160483597e-032,
-5.5368209294319872e-035,
7.1463648457003723e-038,
4.6713369979545088e-040,
-7.5070219026265008e-043,
-4.5089676791698693e-047,
-3.2970870269153785e-049,
1.6283636917056585e-051,
-1.4312555782661719e-054,
7.8463441723355399e-058,
1.9439588820918080e-060,
2.1292310369635749e-063,
-1.4191866473449773e-065,
-2.1353539347524828e-070,
2.5876946863828411e-071,
-1.6182477348921458e-074
])
triang_y = np.polynomial.Polynomial(coeffs_y)

First you need to clarify few things:
the xy data
Is position of object you want to hit or position what you hit when used specific input data (which is missing in that case)?In what coordinate system?
what position is your avatar?
how is the view defined?
is it fully 3D with 6DOF or just fixed (no rotation or movement) relative to avatar?
what is the physics/logic of your rod casting
is it angle (one or two), strength?Is the strength linear to distance?Does throwing acount for altitude difference between avatar and target?does ground elevation (slope) play a role?Are there any other factors like wind, tupe of rod etc?
You shared the xy data but what against you want to correlate or make formula for it? it does not make sense you obviously forget to add something like each position was taken for what conditions?
I would solve this by (no further details before you clarify stuff above):
transform targets xy to player relative coordinate system aligned to ground
compute azimut angle (geometricaly)
simple atan2(y,x) will do but you need to take into account your coordinate system notations.
compute elevation angle and strength (geometricaly)
simple balistic physics should apply however depends on the physics the game or whatever you write this for uses.
adjust for additional stuff
You know for example wind can slightly change your angle and strength
In case you have real physics and data you can do #3,#4 at the same time. See similar:
C++ intersection time of 2 bullets
[Edit1] puting your data into your image
OK your coordinates obviously do not match your screenshot as the image taken is scaled after some intuition I rescaled it and draw into image in C++ to match again so here the result:
I converted your Cartesian points:
int ava_x=957,ava_y=748; // avatar
int data[]= // target(x1,y1) , drag(x0,y0)
{
1133,359,890,890,
858,334,886, 900,
755,579,1012,811,
1013,255,933,934,
1166,469,885,855,
1344,654,855,794,
804,260,1024,939,
1288,287,822,918,
624,422,1075,869,
981,460,949,851,
944,203,963,957,
829,367,1005,887,
1129,259,885,932,
773,219,1036,949,
1052,314,919,908,
958,662,955,782,
1448,361,775,906,
1566,492,751,837,
1275,703,859,764,
1210,280,852,926,
668,513,1050,836,
830,243,1011,939,
688,654,1022,792,
635,437,1072,864,
911,252,976,935,
1499,542,785,825,
793,452,1017,860,
1309,354,824,891,
1383,522,817,838,
1262,712,867,758,
927,225,980,983,
644,360,1097,919,
1307,648,862,798,
1321,296,812,913,
798,212,1026,952,
1315,460,836,854,
700,597,1028,809,
868,573,981,811,
1561,497,758,838,
1172,588,896,816,
};
Into polar relative to ava_x,ava_y using atan2 and 2D distance formula and simply print the angular difference +180deg and ratio between line sizes (that is the yellow texts in left of the screenshot) first is ordinal number then angle difference [deg] and then ratio between line lengths...
as you can see the angle difference is +/-10.6deg and length ratio is <2.5,3.6> probably because of inaccuracy of OpenCV findings and some randomness for fishing rod castings from the game logic itself.
As you can see polar coordinates are best for this. For starters you could do simply this:
// wanted target in polar (obtained by CV)
x = target_x-ava_x;
y = target_y-ava_y;
a = atan2(y,x);
l = sqrt((x*x)+(y*y));
// aiming drag in polar
a += 3.1415926535897932384626433832795; // +=180 deg
l /= 3.0; // "avg" ratio between line sizes
// aiming drag in cartesian
aim_x = ava_x + l*cos(a);
aim_y = ava_y + l*sin(a);
You can optimize it to:
aim_x = ava_x - ((target_x-ava_x)/3);
aim_y = ava_y - ((target_y-ava_y)/3);
Now to improve precision you could measure the dependency or line ratio and line size (it might be not linear) , also the angular difference might be bigger for bigger lines ...
Also note that second cast (ordinal 2) is probably a bug (wrongly detected x,y by CV) if you render the 2 lines you will see they do not match so you should not account that and throw them away from dataset.
Also note that I code in C++ so my goniometrics use radians (not sure if true for python if not you need to convert to degrees) also equations might need some additional tweaking for your coordinate systems (negate y?)

reconstruction: why undistort image AND normalize coordinates?

From the Tutorial: https://programtalk.com/vs2/?source=python/8176/opencv-python-blueprints/chapter4/scene3D.py
I don't understand why they first undistort the images
# undistort the images
self.img1 = cv2.undistort(self.img1, self.K, self.d)
self.img2 = cv2.undistort(self.img2, self.K, self.d)
and: Compute the Essential Matrix
def _find_fundamental_matrix(self):
self.F, self.Fmask = cv2.findFundamentalMat(self.match_pts1,
self.match_pts2,
cv2.FM_RANSAC, 0.1,0.99)
def _find_essential_matrix(self):
self.E = self.K.T.dot(self.F).dot(self.K)
and also Normalize the coordinates:
first_inliers = []
second_inliers = []
for i in range(len(self.Fmask)):
if self.Fmask[i]:
# normalize and homogenize the image coordinates
first_inliers.append(self.K_inv.dot([self.match_pts1[i][0],
self.match_pts1[i][1], 1.0]))
second_inliers.append(self.K_inv.dot([self.match_pts2[i][0],
self.match_pts2[i][1], 1.0]))
Shouldn't it be either or? Or do I have some wrong understanding here?
Can please somone help me on that?

The first step, undistort, does a number of things to reverse the typical warping caused by small camera lenses. See the Wikipedia article on distortion (optics) for more background.
The last step, homogenizing the coordinates, is a completely different thing. The Wikipedia article on homogenous coordinates explains it, but the basic idea is that you add in an extra fake axis that lets you do all affine and projective transformations with chained simple matrix multiplication and then just project back to 3D at the end. Normalizing is just a step you do to make that math easier—basically, you want your extra coordinate to start off as 1.0 (multiply by the inverse of the projective norm).

The requirement for normalization is explained at page-107 of Multi-View Geometry (Hartley and Zisserman). The normalization is required in addition to the un-distortion.
If are using raw pixel values in homogeneous coordinates, the Z-coordinate which is 1 will be small compared to the X and Y co-coordinates. Eg: (X=320, Y=220, Z=1).
But if the homogenized coordinates are the image pixel positions normalized to a standard range, ie -1.0 to 1.0, then we are talking about coordinate values all of whom are kind of in the same range , Eg: (0.75, -0.89, 1.0).
If the image coordinates are of dramatically different ranges(as in the unnormalized case), then the DLT matrix produced will have a bad condition number, and consequently small variations in input image pixel positions, could produce wide variations in the result.
Please see page 107 for a very good explanation.

Affine transformation between contours in OpenCV

I have a historical time sequence of seafloor images scanned from film that need registration.
from pylab import *
import cv2
import urllib
urllib.urlretrieve('http://geoport.whoi.edu/images/frame014.png','frame014.png');
urllib.urlretrieve('http://geoport.whoi.edu/images/frame015.png','frame015.png');
gray1=cv2.imread('frame014.png',0)
gray2=cv2.imread('frame015.png',0)
figure(figsize=(14,6))
subplot(121);imshow(gray1,cmap=cm.gray);
subplot(122);imshow(gray2,cmap=cm.gray);
I want to use the black region on the left of each image to do the registration, since that region was inside the camera and should be fixed in time. So I just need to compute the affine transformation between the black regions.
I determined these regions by thresholding and finding the largest contour:
def find_biggest_contour(gray,threshold=40):
# threshold a grayscale image
ret,thresh = cv2.threshold(gray,threshold,255,1)
# find the contours
contours,h = cv2.findContours(thresh,mode=cv2.RETR_LIST,method=cv2.CHAIN_APPROX_NONE)
# measure the perimeter
perim = [cv2.arcLength(cnt,True) for cnt in contours]
# find contour with largest perimeter
i=perim.index(max(perim))
return contours[i]
c1=find_biggest_contour(gray1)
c2=find_biggest_contour(gray2)
x1=c1[:,0,0];y1=c1[:,0,1]
x2=c2[:,0,0];y2=c2[:,0,1]
figure(figsize=(8,8))
imshow(gray1,cmap=cm.gray, alpha=0.5);plot(x1,y1,'b-')
imshow(gray2,cmap=cm.gray, alpha=0.5);plot(x2,y2,'g-')
axis([0,1500,1000,0]);
The blue is the longest contour from the 1st frame, the green is the longest contour from the 2nd frame.
What is the best way to determine the rotation and offset between the blue and green contours?
I only want to use the right side of the contours in some region surrounding the step, something like the region between the arrows.
Of course, if there is a better way to register these images, I'd love to hear it. I already tried a standard feature matching approach on the raw images, and it didn't work well enough.

Following Shambool's suggested approach, here's what I've come up with. I used a Ramer-Douglas-Peucker algorithm to simplify the contour in the region of interest and identified the two turning points. I was going to use the two turning points to get my three unknowns (xoffset, yoffset and angle of rotation), but the 2nd turning point is a bit too far toward the right because RDP simplified away the smoother curve in this region. So instead I used the angle of the line segment leading up to the 1st turning point. Differencing this angle between image1 and image2 gives me the rotation angle. I'm still not completely happy with this solution. It worked well enough for these two images, but I'm not sure it will work well on the entire image sequence. We'll see.
It would really be better to fit the contour to the known shape of the black border.
# select region of interest from largest contour
ind1=where((x1>190.) & (y1>200.) & (y1<900.))[0]
ind2=where((x2>190.) & (y2>200.) & (y2<900.))[0]
figure(figsize=(10,10))
imshow(gray1,cmap=cm.gray, alpha=0.5);plot(x1[ind1],y1[ind1],'b-')
imshow(gray2,cmap=cm.gray, alpha=0.5);plot(x2[ind2],y2[ind2],'g-')
axis([0,1500,1000,0])
def angle(x1,y1):
# Returns angle of each segment along an (x,y) track
return array([math.atan2(y,x) for (y,x) in zip(diff(y1),diff(x1))])
def simplify(x,y, tolerance=40, min_angle = 60.*pi/180.):
"""
Use the Ramer-Douglas-Peucker algorithm to simplify the path
http://en.wikipedia.org/wiki/Ramer-Douglas-Peucker_algorithm
Python implementation: https://github.com/sebleier/RDP/
"""
from RDP import rdp
points=vstack((x,y)).T
simplified = array(rdp(points.tolist(), tolerance))
sx, sy = simplified.T
theta=abs(diff(angle(sx,sy)))
# Select the index of the points with the greatest theta
# Large theta is associated with greatest change in direction.
idx = where(theta>min_angle)[0]+1
return sx,sy,idx
sx1,sy1,i1 = simplify(x1[ind1],y1[ind1])
sx2,sy2,i2 = simplify(x2[ind2],y2[ind2])
fig = plt.figure(figsize=(10,6))
ax =fig.add_subplot(111)
ax.plot(x1, y1, 'b-', x2, y2, 'g-',label='original path')
ax.plot(sx1, sy1, 'ko-', sx2, sy2, 'ko-',lw=2, label='simplified path')
ax.plot(sx1[i1], sy1[i1], 'ro', sx2[i2], sy2[i2], 'ro',
markersize = 10, label='turning points')
ax.invert_yaxis()
plt.legend(loc='best')
# determine x,y offset between 1st turning points, and
# angle from difference in slopes of line segments approaching 1st turning point
xoff = sx2[i2[0]] - sx1[i1[0]]
yoff = sy2[i2[0]] - sy1[i1[0]]
iseg1 = [i1[0]-1, i1[0]]
iseg2 = [i2[0]-1, i2[0]]
ang1 = angle(sx1[iseg1], sy1[iseg1])
ang2 = angle(sx2[iseg2], sy2[iseg2])
ang = -(ang2[0] - ang1[0])
print xoff, yoff, ang*180.*pi
-28 14 5.07775871644
# 2x3 affine matrix M
M=array([cos(ang),sin(ang),xoff,-sin(ang),cos(ang),yoff]).reshape(2,3)
print M
[[ 9.99959685e-01 8.97932821e-03 -2.80000000e+01]
[ -8.97932821e-03 9.99959685e-01 1.40000000e+01]]
# warp 2nd image into coordinate frame of 1st
Minv = cv2.invertAffineTransform(M)
gray2b = cv2.warpAffine(gray2,Minv,shape(gray2.T))
figure(figsize=(10,10))
imshow(gray1,cmap=cm.gray, alpha=0.5);plot(x1[ind1],y1[ind1],'b-')
imshow(gray2b,cmap=cm.gray, alpha=0.5);
axis([0,1500,1000,0]);
title('image1 and transformed image2 overlain with 50% transparency');

Good question.
One approach is to represent contours as 2d point clouds and then do registration.
More simple and clear code in Matlab that can give you affine transform.
And more complex C++ code(using VXL lib) with python and matlab wrapper included.
Or you can use some modificated ICP(iterative closest point) algorithm that is robust to noise and can handle affine transform.
Also your contours seems to be not very accurate so it can be a problem.
Another approach is to use some kind of registration that use pixel values.
Matlab code (I think it's using some kind of minimizer+ crosscorrelation metric)
Also maybe there is some kind of optical flow registration(or some other kind) that is used in medical imaging.
Also you can use point features as SIFT(SURF).
You can try it quick in FIJI(ImageJ)
also this link.
Open 2 images
Plugins->feature extraction-> sift (or other)
Set expected transformation to affine
Look at estimated transformation model [3,3] homography matrix in ImageJ log.
If it works good then you can implement it in python using OpenCV or maybe using Jython with ImageJ.
And it will be better if you post original images and describe all conditions (it seems that image is changing between frames)

You can represent these contours with their respective ellipses. These ellipses are centered on the centroid of the contour and they are oriented towards the main density axis. You can compare the centroids and the orientation angle.
1) Fill the contours => drawContours with thickness=CV_FILLED
2) Find moments => cvMoments()
3) And use them.
Centroid: { x, y } = {M10/M00, M01/M00 }
Orientation (theta):
EDIT: I customized the sample code from legacy (enteringblobdetection.cpp) for your case.
/* Image moments */
double M00,X,Y,XX,YY,XY;
CvMoments m;
CvRect r = ((CvContour*)cnt)->rect;
CvMat mat;
cvMoments( cvGetSubRect(pImgFG,&mat,r), &m, 0 );
M00 = cvGetSpatialMoment( &m, 0, 0 );
X = cvGetSpatialMoment( &m, 1, 0 )/M00;
Y = cvGetSpatialMoment( &m, 0, 1 )/M00;
XX = (cvGetSpatialMoment( &m, 2, 0 )/M00) - X*X;
YY = (cvGetSpatialMoment( &m, 0, 2 )/M00) - Y*Y;
XY = (cvGetSpatialMoment( &m, 1, 1 )/M00) - X*Y;
/* Contour description */
CvPoint myCentroid(r.x+(float)X,r.y+(float)Y);
double myTheta = atan( 2*XY/(XX-YY) );
Also, check this with OpenCV 2.0 examples.

If you don't want to find the homography between the two images and want to find the affine transformation you have three unknowns, rotation angle (R), and the displacement in x and y (X,Y). Therefore minimum of two points (with two known values for each) are needed to find the unknowns. Two points should be matched between the two images or two lines, each has two known values, the intercept and slope. If you go with the point matching approach, the further the points are from each other the more robust is the found transform to noise (this is very simple if you remember error propagation rules).
In the two point matching method:
find two points (A and B) in the first image I1 and their corresponding points (A',B') in the second image I2
find the middle point between A and B: C, and the middle point between A' and B': C'
the difference C and C' (C-C') gives the translation between the images (X and Y)
using the dot product of C-A and C'-A' you can find the rotation angle (R)
To detect robust points, I would find the the points along the side of counter you have found with highest absolute value of the second derivative (Hessian) and then try to match them. Since you mentioned this is a video footage you can easily make the assumption the transformation between each two frames is small to reject the outliers.

Find speed of vehicle from images

I am doing a project to find the speed of a vehicle from images. We are taking these images from within the vehicle. We will be marking some object from the 1st image as a reference. Using the properties of the same object in the next image, we must calculate the speed of the moving vehicle. Can anyone help me here??? I am using python opencv. I have succeeded till finding the marked pixel in the 2nd image using Optical flow method. Can anyone help me with the rest?

Knowing the acquisition frequency, you must now find the distance between the successive positions of the marker.
To find this distance, I suggest you estimate the pose of the marker for each image. Loosely speaking, the "pose" is the transformation matrix expressing the coordinates of an object relative to a camera. Once you have those successive coordinates, you can compute the distance, and then the speed.
Pose estimation is the process of computing the position and orientation of a known 3D object relative to a 2D camera. The resulting pose is the transformation matrix describing the object's referential in the camera's referential.
OpenCV implements a pose estimation algorithm: Posit. The doc says:
Given some 3D points (in object
coordinate system) of the object, at
least four non-coplanar points, their
corresponding 2D projections in the
image, and the focal length of the
camera, the algorithm is able to
estimate the object's pose.
This means:
You must know the focal length of your camera
You must know the geometry of your marker
You must be able to match four know points of your marker in the 2D image
You may have to compute the focal length of the camera using the calibration routines provided by OpenCV. I think you have the two other required data.
Edit:
// Algorithm example
MarkerCoords = {Four coordinates of know 3D points}
I1 = take 1st image
F1 = focal(I1)
MarkerPixels1 = {Matching pixels in I1}
Pose1 = posit(MarkerCoords, MarkerPixels1, F1)
I2 = take 2nd image
F2 = focal(I2)
MarkerPixels2 = {Matching pixels in I2 by optical flow}
Pose2 = posit(MarkerCoords, MarkerPixels2, F2)
o1 = origin_of_camera * Pose1 // Origin of camera is
o2 = origin_of_camera * Pose2 // typically [0,0,0]
dist = euclidean_distance(o1, o2)
speed = dist/frequency
Edit 2: (Answers to comments)
"What is the acquisition frequency?"
Computing the speed of your vehicle is equivalent to computing the speed of the marker. (In the first case, the referential is the marker attached to the earth, in the second case, the referential is the camera attached to the vehicle.) This is expressed by the following equation:
speed = D/(t2-t1)
With:
D the distance [o1 o2]
o1 the position of the marker at time t1
o2 the position of the marker at time t2
You can retrieve the elapsed time either by extracting t1 and t2 from the metadata of your photos, or from the acquisition frequency of your imaging device: t2-t1 = T = 1/F.
"Won't it be better to mark simple things like posters? And if doing so can't we consider it as a 2d object?"
This is not possible with the Posit algorithm (or with any other pose estimation algorithm as far as I know): it requires four non-coplanar points. This means you cannot chose a 2D object embedded in a 3D space, you have to chose an object with some depth.
On the other hand, you can use a really simple shape, as far as it is a volume. (A cube for example.)

How do I calculate a 3D centroid?

Is there even such a thing as a 3D centroid? Let me be perfectly clear—I've been reading and reading about centroids for the last 2 days both on this site and across the web, so I'm perfectly aware at the existing posts on the topic, including Wikipedia.
That said, let me explain what I'm trying to do. Basically, I want to take a selection of edges and/or vertices, but NOT faces. Then, I want to place an object at the 3D centroid position.
I'll tell you what I don't want:
The vertices average, which would pull too far in any direction that has a more high-detailed mesh.
The bounding box center, because I already have something working for this scenario.
I'm open to suggestions about center of mass, but I don't see how this would work, because vertices or edges alone don't define any sort of mass, especially when I just have an edge loop selected.
For kicks, I'll show you some PyMEL that I worked up, using #Emile's code as reference, but I don't think it's working the way it should:
from pymel.core import ls, spaceLocator
from pymel.core.datatypes import Vector
from pymel.core.nodetypes import NurbsCurve
def get_centroid(node):
if not isinstance(node, NurbsCurve):
raise TypeError("Requires NurbsCurve.")
centroid = Vector(0, 0, 0)
signed_area = 0.0
cvs = node.getCVs(space='world')
v0 = cvs[len(cvs) - 1]
for i, cv in enumerate(cvs[:-1]):
v1 = cv
a = v0.x * v1.y - v1.x * v0.y
signed_area += a
centroid += sum([v0, v1]) * a
v0 = v1
signed_area *= 0.5
centroid /= 6 * signed_area
return centroid
texas = ls(selection=True)[0]
centroid = get_centroid(texas)
print(centroid)
spaceLocator(position=centroid)

In theory centroid = SUM(pos*volume)/SUM(volume) when you split the part into finite volumes each with a location pos and volume value volume.
This is precisely the calculation done for finding the center of gravity of a composite part.

There is not just a 3D centroid, there is an n-dimensional centroid, and the formula for it is given in the "By integral formula" section of the Wikipedia article you cite.
Perhaps you are having trouble setting up this integral? You have not defined your shape.
[Edit] I'll beef up this answer in response to your comment. Since you have described your shape in terms of edges and vertices, then I'll assume it is a polyhedron. You can partition a polyedron into pyramids, find the centroids of the pyramids, and then the centroid of your shape is the centroid of the centroids (this last calculation is done using ja72's formula).
I'll assume your shape is convex (no hollow parts---if this is not the case then break it into convex chunks). You can partition it into pyramids (triangulate it) by picking a point in the interior and drawing edges to the vertices. Then each face of your shape is the base of a pyramid. There are formulas for the centroid of a pyramid (you can look this up, it's 1/4 the way from the centroid of the face to your interior point). Then as was said, the centroid of your shape is the centroid of the centroids---ja72's finite calculation, not an integral---as given in the other answer.
This is the same algorithm as in Hugh Bothwell's answer, however I believe that 1/4 is correct instead of 1/3. Perhaps you can find some code for it lurking around somewhere using the search terms in this description.

I like the question. Centre of mass sounds right, but the question then becomes, what mass for each vertex?
Why not use the average length of each edge that includes the vertex? This should compensate nicely areas with a dense mesh.

You will have to recreate face information from the vertices (essentially a Delauney triangulation).
If your vertices define a convex hull, you can pick any arbitrary point A inside the object. Treat your object as a collection of pyramidal prisms having apex A and each face as a base.
For each face, find the area Fa and the 2d centroid Fc; then the prism's mass is proportional to the volume (== 1/3 base * height (component of Fc-A perpendicular to the face)) and you can disregard the constant of proportionality so long as you do the same for all prisms; the center of mass is (2/3 A + 1/3 Fc), or a third of the way from the apex to the 2d centroid of the base.
You can then do a mass-weighted average of the center-of-mass points to find the 3d centroid of the object as a whole.
The same process should work for non-convex hulls - or even for A outside the hull - but the face-calculation may be a problem; you will need to be careful about the handedness of your faces.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

How to use optical flow tracking in opencv to segment an image? - python

Related

Taking the coordinates of an object and creating a formula to drag an arrow

reconstruction: why undistort image AND normalize coordinates?

Affine transformation between contours in OpenCV

Find speed of vehicle from images

How do I calculate a 3D centroid?

Categories

Resources