Calculating distance travelled from gps track points using python [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 8 years ago.
I have this very silly question to ask. I have GPS track points for a journey like this:
863.3,2013-10-05T01:21:07Z,0,13.348841,77.686539
863.3,2013-10-05T01:21:08Z,1,13.348841,77.686539
863.3,2013-10-05T01:21:23Z,2,13.348708,77.686248
861.1,2013-10-05T01:21:28Z,3,13.348647,77.686088
867.0,2013-10-05T01:29:03Z,4,13.34732,77.682364
All I want is to find the distance traveled: should I only consider the first track point and the last track point? Or do I need to find the distance traveled between every track point?

Once you parse your GPS points, you need to extract the lat/lon values for each. You could use the following function (adapted from an existing haversine implementation) to get the distance between each pair of consecutive points, and then sum those distances for your total distance.
import math

def getDistance(lat1, lon1, lat2, lon2):
    # This uses the haversine formula, which remains numerically well-behaved
    # even at small distances, unlike the spherical law of cosines.
    # This method has ~0.3% error built in.
    R = 6371  # Radius of Earth in km
    dLat = math.radians(float(lat2) - float(lat1))
    dLon = math.radians(float(lon2) - float(lon1))
    lat1 = math.radians(float(lat1))
    lat2 = math.radians(float(lat2))
    a = math.sin(dLat/2) * math.sin(dLat/2) + \
        math.cos(lat1) * math.cos(lat2) * math.sin(dLon/2) * math.sin(dLon/2)
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1-a))
    d = R * c * 0.621371  # Converting km to miles with "* 0.621371"
    return d
Note that this function returns your distances in miles, but you can keep things metric (km) by removing the "* 0.621371" from the end.
Of course these are assuming great circle lines. You're probably traveling some sort of network, so this will certainly not be real world accurate.

In order to get an estimate of the distance travelled between the GPS track points you have, you definitely need to consider the distances between all consecutive points. More precisely, if you have N positions, you need to iterate over all of them and sum up the distance between each point P_i and P_(i+1) (ordered by the time at which they were recorded).
If you would only calculate the distance between the first and the last point, the result would not be of any meaning at all. Imagine a set of N points that have been recorded while moving a track that represents a large circle. The first and the last point would be almost the same, hence resulting in a very small distance, even though the total distance you travelled while moving in the circle is significantly larger.
However, be aware that summing up the distance between consecutive points will still only be an estimate of the total distance travelled. Depending on the resolution of your track (i.e., the frequency at which the positions of your track have been recorded), the accuracy compared to the real distance may vary significantly.
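The summation over consecutive points can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the columns of the sample rows are elevation, timestamp, index, latitude and longitude (the question does not label them), and the helper names `haversine_km`, `parse_track` and `track_length_km` are made up for this example.

```python
import math

# Sample rows from the question.
# Assumed column order: elevation, timestamp, index, latitude, longitude.
TRACK = """\
863.3,2013-10-05T01:21:07Z,0,13.348841,77.686539
863.3,2013-10-05T01:21:08Z,1,13.348841,77.686539
863.3,2013-10-05T01:21:23Z,2,13.348708,77.686248
861.1,2013-10-05T01:21:28Z,3,13.348647,77.686088
867.0,2013-10-05T01:29:03Z,4,13.34732,77.682364
"""

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical-Earth assumption


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def parse_track(text):
    """Return (timestamp, lat, lon) tuples ordered by timestamp."""
    rows = []
    for line in text.strip().splitlines():
        _elev, stamp, _idx, lat, lon = line.split(",")
        rows.append((stamp, float(lat), float(lon)))
    # ISO 8601 UTC timestamps in this format sort correctly as strings.
    return sorted(rows)


def track_length_km(points):
    """Sum the distances between each point and its successor."""
    return sum(
        haversine_km(a[1], a[2], b[1], b[2])
        for a, b in zip(points, points[1:])
    )


points = parse_track(TRACK)
total_km = track_length_km(points)
direct_km = haversine_km(points[0][1], points[0][2],
                         points[-1][1], points[-1][2])
print(f"summed: {total_km:.3f} km, first-to-last: {direct_km:.3f} km")
```

For this short, nearly straight sample the two numbers are close; on a winding or looping track the summed value is the one that approximates the distance actually travelled.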

Related

Is there an algorithm to calculate the area of a Lissajous figure?

Suppose I have measurements of two signals
V = V(t) and U = U(t)
that are periodic in time with a phase difference between them. When plotted against each other in a graph V vs U they form a Lissajous figure, and I want to calculate the area inside it.
Is there an algorithm for such calculation?
I would like to solve this problem using Python. But a response in any language or an algorithm to do it will be very appreciated.
Examples of V and U signals can be generated using expressions like:
V(t) = V0*sin(2*pi*t) ; U(t) = U0*sin(2*pi*t + delta)
Figure 1 shows a graph of V,U vs t for V0=10, U0=5, t=np.arange(0.0,2.0,0.01) and delta = pi/5.
And Figure 2 shows the corresponding Lissajous figure V vs U.
This is a specific case of a more general question: how to calculate a closed path integral from a discrete (x_i,y_i) data set?

To find the area of a (closed) parametric curve in Cartesian coordinates, you can use Green's theorem:
A = 1/2 * Abs(Integral[t=0..t=period] {(V(t) * U'(t) - V'(t) * U(t))dt})
But remember that the interpretation of what the real area is for self-intersecting curves is ambiguous, as @algrid noticed in the comments.
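For a discrete (x_i, y_i) data set, the Green's theorem integral reduces to the shoelace sum over the closed polyline. Below is a small sketch under the assumption that the samples cover exactly one period; the function name `closed_curve_area` is hypothetical, and the signal parameters are the ones quoted in the question.

```python
import math


def closed_curve_area(xs, ys):
    """Area enclosed by a closed polyline (discrete Green's theorem)."""
    n = len(xs)
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n  # wrap around so the last point joins the first
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]
    return abs(twice_area) / 2.0


# Signals from the question: V0=10, U0=5, delta=pi/5, one full period.
V0, U0, delta = 10.0, 5.0, math.pi / 5
N = 1000
ts = [k / N for k in range(N)]  # t in [0, 1), period T = 1
V = [V0 * math.sin(2 * math.pi * t) for t in ts]
U = [U0 * math.sin(2 * math.pi * t + delta) for t in ts]

area = closed_curve_area(V, U)
# Closed form for two equal-frequency sinusoids: pi * V0 * U0 * |sin(delta)|
exact = math.pi * V0 * U0 * abs(math.sin(delta))
print(area, exact)
```

For a self-intersecting curve this sum returns the net signed area of the lobes, so it is only unambiguous for simple closed curves such as this ellipse.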

For the area enclosed by the outermost curve of a usual Lissajous shape, I would try this:
1. Find the period of the signal, i.e. find T such that:
   U(t) = U(t+T)
   V(t) = V(t+T)
2. Sample the data on t=<0,T>.
   I would use a polar coordinate system centered on the average U,V coordinate over the interval t=<0,T>; call it U0,V0. Convert and store the data in polar coordinates:
   a(t)=atan2( V(t)-V0 , U(t)-U0 )
   r(t)=sqrt( (U(t)-U0)^2 + (V(t)-V0)^2 )
   and remember only the points with the maximum radius for each angular position. That can be done either with arrays (limiting the precision in angle) or geometrically, by computing the polyline intersections of overlapping segments and removing the inside parts.
3. Compute the area from the sampled data by summing the pie triangles for each angular position, covering the whole circle.
This may not work for exotic shapes.

Both solutions above, by @MBo and by @Spektre (and @meowgoesthedog in the comments), work fine. Thank you, guys.
But I found another way to calculate the area A of an elliptical Lissajous curve: use the A = Pi*a*b formula (a and b are, respectively, the major and minor semi axis of the ellipse).
Steps:
1 - Find the period T of the V (or U) signal;
2 - In the time interval 0<t<T:
2.a - calculate the average values of V and U (V0 and U0), in order to determine the center of the ellipse;
2.b - calculate the distance r(t) from the point (V0,U0) using:
r(t)=sqrt( (U(t)-U0)^2 + (V(t)-V0)^2 )
3 - Find a and b values using:
a = max(r(t)); b = min(r(t))
4 - calculate A: A = Pi*a*b
The Lissajous curves will always be elliptical if the U,V signals are sinusoidal-like and have the same frequency.
Seizing the opportunity, I will propose a solution for the case where the V,U signals are triangular and have the same frequency. In this case, the Lissajous curve will be a parallelogram, then one can calculate its area A using A = 2*|D|*|d|*sin(q), where |D| and |d| are, respectively, the length of major and minor semi diagonals of the parallelogram and q is the angle between the vectors D and d.
Repeat steps 1 and 2 for the elliptical case.
In step 3 we will have:
|D| = max(r(t)) = r(t1); |d| = min(r(t)) = r(t2)
4' - Obtain t1 and t2 and use them to get the coordinates (V(t1)=V1,U(t1)=U1) and (V(t2)=V2,U(t2)=U2). Then the vectors D and d can be written as:
D=(V1,U1)-(V0,U0); d=(V2,U2)-(V0,U0)
5' - Calculate the angle q between D and d;
6' - Perform the calculation of A: A = 2*|D|*|d|*sin(q)
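The elliptical case (steps 1 to 4) can be sketched in a few lines. This is only an illustration under the assumptions that the samples cover exactly one period and that the curve really is an ellipse; the function name `ellipse_area_from_radii` is made up for this example.

```python
import math


def ellipse_area_from_radii(vs, us):
    """Estimate an ellipse's area as pi * max(r) * min(r) around its centre."""
    # Step 2.a: the centre is the average of the samples over one period.
    v0 = sum(vs) / len(vs)
    u0 = sum(us) / len(us)
    # Step 2.b: distance of every sample from the centre.
    radii = [math.hypot(v - v0, u - u0) for v, u in zip(vs, us)]
    # Steps 3 and 4: semi-axes are the extreme radii.
    return math.pi * max(radii) * min(radii)


# Signals from the question: V0=10, U0=5, delta=pi/5, one full period.
V0, U0, delta = 10.0, 5.0, math.pi / 5
N = 1000
ts = [k / N for k in range(N)]
V = [V0 * math.sin(2 * math.pi * t) for t in ts]
U = [U0 * math.sin(2 * math.pi * t + delta) for t in ts]

estimate = ellipse_area_from_radii(V, U)
exact = math.pi * V0 * U0 * abs(math.sin(delta))
print(estimate, exact)
```

Because the extreme radii are picked from sampled points, the estimate depends on how finely the curve is sampled near the ends of its axes, and noisy measurements would bias max(r) upward and min(r) downward.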

Python nearest neighbour - coordinates

I wanted to check I was using scipy's KD tree correctly because it appears slower than a simple bruteforce.
I had three questions regarding this:
Q1.
If I create the following test data:
nplen = 1000000
# WGS84 lat/long
point = [51.349,-0.19]
# This contains WGS84 lat/long
points = np.ndarray.tolist(np.column_stack(
[np.round(np.random.randn(nplen)+51,5),
np.round(np.random.randn(nplen),5)]))
And create three functions:
def kd_test(points,point):
    """ KD Tree"""
    return points[spatial.KDTree(points).query(point)[1]]

def ckd_test(points,point):
    """ C implementation of KD Tree"""
    return points[spatial.cKDTree(points).query(point)[1]]

def closest_math(points,point):
    """ Simple angle"""
    return (min((hypot(x2-point[1],y2-point[0]),y2,x2) for y2,x2 in points))[1:3]
I would expect the cKD tree to be the fastest, however - running this:
print("Co-ordinate: ", f(points,point))
print("Index: ", points.index(list(f(points,point))))
%timeit f(points,point)
Result times - the simple bruteforce method is faster:
closest_math: 1 loops, best of 3: 3.59 s per loop
ckd_test: 1 loops, best of 3: 13.5 s per loop
kd_test: 1 loops, best of 3: 30.9 s per loop
Is this because I am using it wrong - somehow?
Q2.
I would assume that even to get the ranking (rather than the distance) of the closest points, one still needs to project the data. However, it seems that the projected and un-projected points give me the same nearest neighbour:
def proj_list(points,
              inproj = Proj(init='epsg:4326'),
              outproj = Proj(init='epsg:27700')):
    """ Projected geo coordinates"""
    return [list(transform(inproj,outproj,x,y)) for y,x in points]

proj_points = proj_list(points)
proj_point = proj_list([point])[0]
Is this just because my spread of points is not big enough to introduce distortion? I re-ran a few times and still got the same index out of the projected and un-projected lists being returned.
Q3.
Is it generally faster to project the points (like above) and calculate the hypotenuse distance compared to calculating the haversine or vincenty distance on (un-projected) latitude/longitudes? Also which option would be more accurate? I ran a small test:
from math import *

def haversine(origin,
              destination):
    """
    Find distance between a pair of lat/lng coordinates
    """
    lat1, lon1, lat2, lon2 = map(radians, [origin[0],origin[1],destination[0],destination[1]])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    c = 2 * asin(sqrt(a))
    r = 6371000 # Metres
    return (c * r)

def closest_math_unproj(points,point):
    """ Haversine on unprojected """
    return (min((haversine(point,pt),pt[0],pt[1]) for pt in points))

def closest_math_proj(points,point):
    """ Simple angle since projected"""
    return (min((hypot(x2-point[1],y2-point[0]),y2,x2) for y2,x2 in points))
Results:
So this seems to say that projecting and then doing distance is faster than not - however, I am not sure which method will bring more accurate results.
Testing this against an online Vincenty calculator, it seems the projected co-ordinates are the way to go:

Q1.
The reason for the apparent inefficiency of the k-d tree is quite simple: you are measuring both the construction and querying of the k-d tree at once. This is not how you would or should use a k-d tree: you should construct it only once. If you measure only the querying, the time taken reduces to mere tens of milliseconds (vs seconds using the brute-force approach).
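A sketch of the intended usage pattern, building the tree once and timing only the query, might look like this. It assumes NumPy and SciPy are installed, uses a smaller point count than the question to stay quick, and the variable names are illustrative only.

```python
import time

import numpy as np
from scipy import spatial

rng = np.random.default_rng(0)
n = 100_000  # smaller than the question's 1,000,000 to keep the sketch quick
points = np.column_stack([rng.standard_normal(n) + 51,
                          rng.standard_normal(n)])
point = np.array([51.349, -0.19])

# Build the tree once: this is the expensive part.
start = time.perf_counter()
tree = spatial.cKDTree(points)
build_seconds = time.perf_counter() - start

# Query as often as needed: each lookup is cheap.
start = time.perf_counter()
dist, idx = tree.query(point)
query_seconds = time.perf_counter() - start

# Brute-force reference for the same nearest neighbour.
brute_idx = int(np.argmin(np.hypot(points[:, 0] - point[0],
                                   points[:, 1] - point[1])))

print(f"build: {build_seconds:.4f} s, query: {query_seconds:.6f} s")
print("nearest:", points[idx], "same as brute force:", idx == brute_idx)
```

The tree pays off when many queries are run against the same point set; for a single lookup, the build cost dominates and brute force can win, which matches the timings in the question.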
Q2.
This will depend on the spatial distribution of the actual data being used and the projection being used. There might be slight differences based on how efficient the implementation of the k-d tree is at balancing the constructed tree. If you are querying only a single point, then the result will be deterministic and unaffected by the distribution of points anyway.
With the sample data that you are using, which has strong central symmetry, and with your map projection (Transverse Mercator), the difference should be negligible.
Q3.
Technically, the answer to your question is trivial: using the Haversine formula for geographic distance measurement is both more accurate and slower. Whether the tradeoff between accuracy and speed is warranted depends heavily on your use case and the spatial distribution of your data (mostly on the spatial extent, obviously).
If the spatial extent of your points is on the small, regional side, then using a suitable projection and the simple Euclidean distance measure might be accurate enough for your use case and faster than using the Haversine formula.
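To get a feel for that trade-off, the sketch below compares the haversine distance with a planar distance on a simple local equirectangular projection. Note that this projection is a stand-in chosen to keep the example dependency-free; it is not the EPSG:27700 projection used in the question, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # spherical-Earth assumption


def haversine_m(p, q):
    """Great-circle distance in metres between (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def equirect_m(p, q):
    """Planar distance after a local equirectangular projection."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    x = (lon2 - lon1) * math.cos((lat1 + lat2) / 2)  # shrink longitude
    y = lat2 - lat1
    return EARTH_RADIUS_M * math.hypot(x, y)


def relative_difference(p, q):
    h = haversine_m(p, q)
    return abs(h - equirect_m(p, q)) / h


origin = (51.349, -0.19)   # the query point from the question
nearby = (51.40, -0.10)    # a few kilometres away
distant = (41.0, 20.0)     # roughly two thousand kilometres away

rel_near = relative_difference(origin, nearby)
rel_far = relative_difference(origin, distant)
print(f"nearby: {rel_near:.2e}, distant: {rel_far:.2e}")
```

Over a small regional extent the planar result is nearly indistinguishable from haversine, while over continental distances the gap grows noticeably, which is the spatial-extent dependence described above.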

pyEphem - Calculating Positions of non-Earthy Moons

I'm trying to get the Earth distance and the right ascension (relative to my observer point in Earth) of a satellite not orbiting the Earth, but pyEphem isn't returning the same properties as other solar bodies.
With Ganymede (the largest moon of Jupiter), for instance:
import math, ephem
Observer = ephem.city('London')
Observer.date = '2013-04-23'
Observer.pressure, Observer.elevation = 0, 100
moonGanymede = ephem.Ganymede(Observer)
print math.cos(moonGanymede.ra) # right ascension
print moonGanymede.earth_distance * ephem.meters_per_au # distance
I get this error:
AttributeError: 'Ganymede' object has no attribute 'earth_distance'
The ra attribute exists, but is it relative to my Observer or to Jupiter?
Seems to be relative to the Observer, since if I change the location, the value changes too.
I've read the documentation and I know that these properties are not defined for moons, but I have no idea how to compute those relative to the Earth given the additional defined properties of moon bodies:
On planetary moons, also sets:
Position of moon relative to planet (measured in planet radii)
x — offset +east or –west
y — offset +south or –north
z — offset +front or –behind
Doing:
print moonGanymede.x, moonGanymede.y, moonGanymede.z
Outputs:
-14.8928060532 1.52614057064 -0.37974858284
Since Jupiter has an average radius of 69173 kilometers, those values translate to:
moonGanymede.x = 1030200 kilometers (west)
moonGanymede.y = 105570 kilometers (south)
moonGanymede.z = 26268 kilometers (behind)
Given that I know the distance and right ascension of Jupiter relative to the Observer, how can I calculate the distance and right ascension of moonGanymede (also relative to the Observer)?
I'm using pyEphem 3.7.5.1 (with Python 2.7).

Just some thoughts: you probably need to do it in steps.
1. Get the location of the satellite relative to its parent planet.
2. Get the location of the planet relative to the observer.
3. Trigonometry calculation: get the location of the satellite relative to the observer.
You already did 1, and can easily do 2. Convert all values to x,y,z, add them, and then convert back to angular coordinates. Or I'm sure you / ephem can do this for you directly.
HTH

I'm still trying to figure it out (if anyone spots something, please do tell), but it seems that if I do:
sqrt((-14.8928060532)^2 + (1.52614057064)^2 + (-0.37974858284)^2) = 14.9756130481
I'll always get a value that always falls within the min/max distance from orbit center (14.95 - 14.99).
Since that's specified in orbit center radii, I'll need to multiply it by 69173 * 1000 to get the SI unit:
14.9756130481 * 69173 * 1000 = 1.0359080813762213 * 10^9 meters
Since pyEphem deals in distances with AU:
print (1.0359080813762213 * 10**9) / ephem.meters_per_au # 0.00692461785302
At the same time, the Earth-Jupiter distance was 5.79160547256 AU.
Now, to get the distance, I should either add or subtract depending on the sign of the z coordinate:
5.79160547256 - 0.00692461785302 = 5.78468085470698 AU
Running the same code for today (now) returns 6.03799937821, which seems to be very close to the value of 6.031 that WolframAlpha is returning at the present time. It doesn't match 100%, but perhaps that could be accounted for by a different underlying ephemeris library or data source. Not sure...
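An alternative first-order sketch uses only the z offset, since the documentation quoted in the question describes z as the offset in front of (+) or behind (-) the planet. It assumes "front" means towards the observer and that the sideways x and y offsets change the distance only negligibly; the function name `moon_distance_au` is made up, and the numbers are the ones quoted in this thread, so this differs from the arithmetic above, which subtracts the full 3D offset.

```python
KM_PER_AU = 149597870.7              # IAU definition of the astronomical unit
JUPITER_RADIUS_KM = 69173.0          # radius value used in the question
JUPITER_DISTANCE_AU = 5.79160547256  # Earth-Jupiter distance quoted above
GANYMEDE_Z_RADII = -0.37974858284    # z offset: +front, -behind the planet


def moon_distance_au(planet_distance_au, z_offset_radii,
                     planet_radius_km=JUPITER_RADIUS_KM):
    """Shift the planet's distance along the line of sight by the z offset.

    Assumption: a positive z (front) brings the moon closer to the observer,
    a negative z (behind) puts it farther away.
    """
    z_au = z_offset_radii * planet_radius_km / KM_PER_AU
    return planet_distance_au - z_au


ganymede_distance_au = moon_distance_au(JUPITER_DISTANCE_AU, GANYMEDE_Z_RADII)
print(ganymede_distance_au)
```

With a negative z the moon lies behind Jupiter, so the estimate comes out slightly larger than the Earth-Jupiter distance rather than smaller.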

Looks like right ascension, declination, azimuth, etc are computed correctly:
In [31]: g = ephem.Ganymede(Observer)
In [32]: j = ephem.Jupiter(Observer)
In [33]: g.ra, g.az, g.dec
Out[33]: (1.3024204969406128, 5.586287021636963, 0.38997682929039)
In [34]: j.ra, j.az, j.dec
Out[34]: (1.303646765055829, 5.5853118896484375, 0.39010250333236757)
The values for Ganymede and Jupiter are close enough; it looks like you get correct results for everything except the distance to the object.

What is wrong with my gravity simulation?

As per advice given to me in this answer, I have implemented a Runge-Kutta integrator in my gravity simulator.
However, after I simulate one year of the solar system, the positions are still off by about 110 000 kilometers, which isn't acceptable.
My initial data was provided by NASA's HORIZONS system. Through it, I obtained position and velocity vectors of the planets, Pluto, the Moon, Deimos and Phobos at a specific point in time.
These vectors were 3D; however, some people told me that I could ignore the third dimension, as the planets are aligned in a plane around the Sun, and so I did. I merely copied the x-y coordinates into my files.
This is the code of my improved update method:
"""
Measurement units:
[time] = s
[distance] = m
[mass] = kg
[velocity] = ms^-1
[acceleration] = ms^-2
"""
class Uni:
    def Fg(self, b1, b2):
        """Returns the gravitational force acting between two bodies as a Vector2."""
        a = abs(b1.position.x - b2.position.x) #Distance on the x axis
        b = abs(b1.position.y - b2.position.y) #Distance on the y axis
        r = math.sqrt(a*a + b*b)
        fg = (self.G * b1.m * b2.m) / pow(r, 2)
        return Vector2(a/r * fg, b/r * fg)

    #After this is ran, all bodies have the correct accelerations:
    def updateAccel(self):
        #For every combination of two bodies (b1 and b2) out of all bodies:
        for b1, b2 in combinations(self.bodies.values(), 2):
            fg = self.Fg(b1, b2) #Calculate the gravitational force between them
            #Add this force to the current force vector of the body:
            if b1.position.x > b2.position.x:
                b1.force.x -= fg.x
                b2.force.x += fg.x
            else:
                b1.force.x += fg.x
                b2.force.x -= fg.x
            if b1.position.y > b2.position.y:
                b1.force.y -= fg.y
                b2.force.y += fg.y
            else:
                b1.force.y += fg.y
                b2.force.y -= fg.y
        #For body (b) in all bodies (self.bodies.itervalues()):
        for b in self.bodies.itervalues():
            b.acceleration.x = b.force.x/b.m
            b.acceleration.y = b.force.y/b.m
            b.force.null() #Reset the force as it's not needed anymore.

    def RK4(self, dt, stage):
        #For body (b) in all bodies (self.bodies.itervalues()):
        for b in self.bodies.itervalues():
            rd = b.rk4data #rk4data is an object where the integrator stores its intermediate data
            if stage == 1:
                rd.px[0] = b.position.x
                rd.py[0] = b.position.y
                rd.vx[0] = b.velocity.x
                rd.vy[0] = b.velocity.y
                rd.ax[0] = b.acceleration.x
                rd.ay[0] = b.acceleration.y
            if stage == 2:
                rd.px[1] = rd.px[0] + 0.5*rd.vx[0]*dt
                rd.py[1] = rd.py[0] + 0.5*rd.vy[0]*dt
                rd.vx[1] = rd.vx[0] + 0.5*rd.ax[0]*dt
                rd.vy[1] = rd.vy[0] + 0.5*rd.ay[0]*dt
                rd.ax[1] = b.acceleration.x
                rd.ay[1] = b.acceleration.y
            if stage == 3:
                rd.px[2] = rd.px[0] + 0.5*rd.vx[1]*dt
                rd.py[2] = rd.py[0] + 0.5*rd.vy[1]*dt
                rd.vx[2] = rd.vx[0] + 0.5*rd.ax[1]*dt
                rd.vy[2] = rd.vy[0] + 0.5*rd.ay[1]*dt
                rd.ax[2] = b.acceleration.x
                rd.ay[2] = b.acceleration.y
            if stage == 4:
                rd.px[3] = rd.px[0] + rd.vx[2]*dt
                rd.py[3] = rd.py[0] + rd.vy[2]*dt
                rd.vx[3] = rd.vx[0] + rd.ax[2]*dt
                rd.vy[3] = rd.vy[0] + rd.ay[2]*dt
                rd.ax[3] = b.acceleration.x
                rd.ay[3] = b.acceleration.y
            b.position.x = rd.px[stage-1]
            b.position.y = rd.py[stage-1]

    def update (self, dt):
        """Pushes the uni 'dt' seconds forward in time."""
        #Repeat four times:
        for i in range(1, 5, 1):
            self.updateAccel() #Calculate the current acceleration of all bodies
            self.RK4(dt, i) #ith Runge-Kutta step
        #Set the results of the Runge-Kutta algorithm to the bodies:
        for b in self.bodies.itervalues():
            rd = b.rk4data
            b.position.x = b.rk4data.px[0] + (dt/6.0)*(rd.vx[0] + 2*rd.vx[1] + 2*rd.vx[2] + rd.vx[3]) #original_x + delta_x
            b.position.y = b.rk4data.py[0] + (dt/6.0)*(rd.vy[0] + 2*rd.vy[1] + 2*rd.vy[2] + rd.vy[3])
            b.velocity.x = b.rk4data.vx[0] + (dt/6.0)*(rd.ax[0] + 2*rd.ax[1] + 2*rd.ax[2] + rd.ax[3])
            b.velocity.y = b.rk4data.vy[0] + (dt/6.0)*(rd.ay[0] + 2*rd.ay[1] + 2*rd.ay[2] + rd.ay[3])
        self.time += dt #Internal time variable
The algorithm is as follows:
1. Update the accelerations of all bodies in the system
2. RK4 (first step)
3. goto 1
4. RK4 (second)
5. goto 1
6. RK4 (third)
7. goto 1
8. RK4 (fourth)
Did I mess something up with my RK4 implementation? Or did I just start with corrupted data (too few important bodies and ignoring the 3rd dimension)?
How can this be fixed?
Explanation of my data etc...
All of my coordinates are relative to the Sun (i.e. the Sun is at (0, 0)).
./my_simulator 1yr
Earth position: (-1.47589927462e+11, 18668756050.4)
HORIZONS (NASA):
Earth position: (-1.474760457316177E+11, 1.900200786726017E+10)
I got the 110 000 km error by subtracting the Earth's x coordinate given by NASA from the one predicted by my simulator.
relative error = (my_x_coordinate - nasa_x_coordinate) / nasa_x_coordinate * 100
= (-1.47589927462e+11 + 1.474760457316177E+11) / -1.474760457316177E+11 * 100
= 0.077%
The relative error seems minuscule, but that's simply because Earth is really far away from the Sun both in my simulation and in NASA's. The distance is still huge and renders my simulator useless.

110 000 km absolute error means what relative error?
I got the 110 000 km value by subtracting my predicted Earth's x
coordinate with NASA's Earth x coordinate.
I'm not sure what you're calculating here or what you mean by "NASA's Earth x coordinate". That's a distance from what origin, in what coordinate system, at what time? (As far as I know, the earth moves in orbit around the sun, so its x-coordinate w.r.t. a coordinate system centered at the sun is changing all the time.)
In any case, you calculated an absolute error of 110,000 km by subtracting your calculated value from "NASA's Earth x coordinate". You seem to think this is a bad answer. What's your expectation? To hit it spot on? To be within a meter? One km? What's acceptable to you and why?
You get a relative error by dividing your error difference by "NASA's Earth x coordinate". Think of it as a percentage. What value do you get? If it's 1% or less, congratulate yourself. That would be quite good.
You should know that floating point numbers aren't exact on computers. (You can't represent 0.1 exactly in binary any more than you can represent 1/3 exactly in decimal.) There are going to be errors. Your job as a simulator is to understand the errors and minimize them as best you can.
You could have a stepsize problem. Try reducing your time step size by half and see if you do better. If you do, it says that your results have not converged. Reduce by half again until you achieve acceptable error.
Your equations might be poorly conditioned. Small initial errors will be amplified over time if that's true.
I'd suggest that you non-dimensionalize your equations and calculate the stability limit step size. Your intuition about what a "small enough" step size should be might surprise you.
I'd also read a bit more about the many body problem. It's subtle.
You might also try a numerical integration library instead of your integration scheme. You'll program your equations and give them to an industrial strength integrator. It could give some insight into whether or not it's your implementation or the physics that causes the problem.
Personally, I don't like your implementation. It'd be a better solution if you'd done it with mathematical vectors in mind. The "if" test for the relative positions leaves me cold. Vector mechanics would make the signs come out naturally.
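As a rough illustration of the vector approach, the sketch below computes the force from the signed separation vector, so no "if" tests on relative positions are needed. It uses plain tuples instead of the question's Vector2 class, and the names `gravity_on` and `net_forces` are hypothetical.

```python
import math
from itertools import combinations

G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2


def gravity_on(pos1, m1, pos2, m2):
    """Force vector (fx, fy) acting on body 1 due to body 2."""
    dx = pos2[0] - pos1[0]  # signed separation: points from body 1 to body 2
    dy = pos2[1] - pos1[1]
    r2 = dx * dx + dy * dy
    r = math.sqrt(r2)
    magnitude = G * m1 * m2 / r2
    # Scaling the unit vector (dx/r, dy/r) gives the correct signs directly.
    return (magnitude * dx / r, magnitude * dy / r)


def net_forces(bodies):
    """bodies: list of (position, mass). Returns one [fx, fy] per body."""
    forces = [[0.0, 0.0] for _ in bodies]
    for (i, (p1, m1)), (j, (p2, m2)) in combinations(enumerate(bodies), 2):
        fx, fy = gravity_on(p1, m1, p2, m2)
        forces[i][0] += fx
        forces[i][1] += fy
        forces[j][0] -= fx  # Newton's third law: equal and opposite
        forces[j][1] -= fy
    return forces


# Approximate Sun and Earth values, for illustration only.
sun = ((0.0, 0.0), 1.989e30)
earth = ((1.496e11, 0.0), 5.972e24)
forces = net_forces([sun, earth])
print(forces)
```

The force on the Earth comes out pointing towards the Sun (negative x here) without any branching, and the same function works unchanged for any relative placement of the two bodies.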
UPDATE:
OK, your relative errors are pretty small.
Of course the absolute error does matter - depending on your requirements. If you're landing a vehicle on a planet you don't want to be off by that much.
So you need to stop making assumptions about what constitutes too small a step size and do what you must to drive the errors to an acceptable level.
Are all the quantities in your calculation 64-bit IEEE floating point numbers? If not, you'll never get there.
A 64 bit floating point number has about 16 digits of accuracy. If you need more than that, you'll have to use an infinite precision object like Java's BigDecimal or - wait for it - rescale your equations to use a length unit other than kilometers. If you scale all your distances by something meaningful for your problem (e.g., the diameter of the earth or the length of the major/minor axis of the earth's orbit) you might do better.

To do a RK4 integration of the solar system you need a very good precision or your solution will diverge quickly. Assuming you have implemented everything correctly, you may be seeing the drawbacks with RK for this sort of simulation.
To verify if this is the case: try a different integration algorithm. I found that using Verlet integration a solar system simulation will be much less chaotic. Verlet is much simpler to implement than RK4 as well.
The reason Verlet (and derived methods) are often better than RK4 for long-term prediction (like full orbits) is that they are symplectic: they preserve the phase-space structure of the system, so the energy error stays bounded instead of drifting, which RK4 does not guarantee. Thus Verlet will give better behavior even after it diverges (a realistic simulation, but with an error in it), whereas RK will give non-physical behavior once it diverges.
Also: make sure you are using floating point as good as you can. Don't use distances in meters in the solar system, since the precision of floating point numbers is much better in the 0..1 interval. Using AU or some other normalized scale is much better than using meters. As suggested on the other topic, ensure you use an epoch for the time to avoid accumulating errors when adding time steps.
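A minimal velocity Verlet sketch for a single planet around a fixed Sun is shown below, in units of AU and years so that GM = 4*pi^2. It is a simplified stand-in (one body, Sun pinned at the origin, circular starting orbit), not a replacement for the question's many-body code, and all names are illustrative.

```python
import math

GM = 4 * math.pi ** 2  # Sun's gravitational parameter in AU^3 / yr^2


def accel(x, y):
    """Acceleration towards a Sun fixed at the origin."""
    r3 = (x * x + y * y) ** 1.5
    return -GM * x / r3, -GM * y / r3


def velocity_verlet(x, y, vx, vy, dt, steps):
    """Advance the state by `steps` steps of size `dt` (kick-drift-kick)."""
    ax, ay = accel(x, y)
    for _ in range(steps):
        vx += 0.5 * dt * ax  # half kick with the old acceleration
        vy += 0.5 * dt * ay
        x += dt * vx         # full drift with the half-step velocity
        y += dt * vy
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax  # half kick with the new acceleration
        vy += 0.5 * dt * ay
    return x, y, vx, vy


def energy(x, y, vx, vy):
    """Specific orbital energy (kinetic plus potential, per unit mass)."""
    return 0.5 * (vx * vx + vy * vy) - GM / math.hypot(x, y)


# Circular orbit at 1 AU: speed 2*pi AU/yr, period one year.
start = (1.0, 0.0, 0.0, 2 * math.pi)
end = velocity_verlet(*start, dt=0.001, steps=1000)  # integrate one year

drift = abs(energy(*end) - energy(*start)) / abs(energy(*start))
print("position after one year:", end[0], end[1])
print("relative energy drift:", drift)
```

Only one acceleration evaluation is needed per step, and the energy check gives a quick way to compare step sizes or integrators on the same problem.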

Such simulations are notoriously unreliable. Rounding errors accumulate and introduce instability. Increasing precision doesn't help much; the problem is that you (have to) use a finite step size and nature uses a zero step size.
You can reduce the problem by reducing the step size, so it takes longer for the errors to become apparent. If you are not doing this in real time, you can use a dynamic step size which reduces if two or more bodies are very close.
One thing I do with these kinds of simulations is "re-normalise" after each step to make the total energy the same. The sum of gravitational plus kinetic energy for the system as a whole should be a constant (conservation of energy). Work out the total energy after each step, and then scale all the object speeds by a constant amount to keep the total energy a constant. This at least keeps the output looking more plausible. Without this scaling, a tiny amount of energy is either added to or removed from the system after each step, and orbits tend to blow up to infinity or spiral into the sun.

Very simple changes that will improve things (proper usage of floating point values):
Change the unit system to use as many mantissa bits as possible. Using meters, you're doing it wrong... Use AU, as suggested above. Even better, scale things so that the solar system fits in a 1x1x1 box.
As already said in another post: compute your time as time = epoch_count * time_step, not by repeated addition! This way, you avoid accumulating errors.
When summing several values, use a high-precision summation algorithm, like Kahan summation. In Python, math.fsum does accurate summation for you.
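The last two points can be seen in a tiny experiment. This sketch uses only the standard library; the step size 0.1 is an arbitrary illustrative value that cannot be represented exactly in binary.

```python
import math

dt = 0.1     # time step (not exactly representable in binary)
steps = 10

# Accumulating the time step by step lets rounding errors pile up.
t_accumulated = 0.0
for _ in range(steps):
    t_accumulated += dt

# Computing the time from the step count involves a single rounding.
t_computed = steps * dt

# math.fsum tracks the lost low-order bits while summing.
t_fsum = math.fsum([dt] * steps)

print(repr(t_accumulated))  # slightly below 1.0
print(repr(t_computed))     # 1.0
print(repr(t_fsum))         # 1.0
```

Over millions of steps the accumulated variant keeps drifting, while the multiplied and fsum variants stay within one rounding of the exact value.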

Shouldn't the force decomposition be
th = atan2(b, a)
return Vector2(cos(th) * fg, sin(th) * fg)
(see http://www.physicsclassroom.com/class/vectors/Lesson-3/Resolution-of-Forces or https://fiftyexamples.readthedocs.org/en/latest/gravity.html#solution)
BTW: you take the square root to calculate the distance, but you actually need the squared distance...
Why not simplify
r = math.sqrt(a*a + b*b)
fg = (self.G * b1.m * b2.m) / pow(r, 2)
to
r2 = a*a + b*b
fg = (self.G * b1.m * b2.m) / r2
I am not sure about python, but in some cases you get more precise calculations for intermediate results (Intel CPUs support 80 bit floats, but when assigning to variables, they get truncated to 64 bit):
Floating point computation changes if stored in intermediate "double" variable

It is not quite clear in what frame of reference you have your planet coordinates and velocities. If it is in heliocentric frame (the frame of reference is tied to the Sun), then you have two options: (i) perform calculations in non-inertial frame of reference (the Sun is not motionless), (ii) convert positions and velocities into the inertial barycentric frame of reference. If your coordinates and velocities are in barycentric frame of reference, you must have coordinates and velocities for the Sun as well.

Sorting function for a list of points based on distance

[Question has been rewritten for clarification]
I'm trying to come up with a sorting function. What is being sorted is a list of points.
The sorting function takes in 3 points. One from the list of points to be sorted, and two others that are used for comparison. The goal is to determine the relative euclidean distance the point to be sorted is from the other two points. The lowest value of the function should be given when the point lies directly between the two points. The function should make use of the euclidean distance between both points.
So far it seems like the formula should either be the sum of the squares of the distances, or create a point midway between the two given points and use the euclidean distance to that point. Below I've included the two possible functions so far.
p is the point to be sorted
p1,p2 are the given points
def f(p,p1,p2): #Midpoint distance
midPoint = midpoint(p1,p2)
return distance(p,midPoint)
def f(p,p1,p2): #Sum of squares
return distance(p,p1) ** 2 + distance(p,p2) ** 2
def distance(pointA,pointB): #Pseudocode
dx = pointA.x - pointB.x
dy = pointA.y - pointB.y
return sqrt(dx ** 2 + dy ** 2)
Below is an example:
The two points being considered here are the ones with the line drawn between them. The circled points should be the three lowest points in the sorting algorithm. The close point to the left is penalized for being close to one of the two points, but far from the other.

Maybe the Least Squares method would help? So you sum the square of the distances. This way the left node would be penalized for being too far from the right node in the line.
Another option is to take the distance to the halfway point on the line made by the two base nodes. This would also prefer the three nodes to the one on the left.
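Both options can be tried side by side as sort keys. The sketch below assumes points are plain (x, y) tuples rather than objects with .x and .y, and the sample coordinates are invented for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def midpoint(a, b):
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)


def midpoint_key(p, p1, p2):
    """Distance from p to the halfway point between p1 and p2."""
    return distance(p, midpoint(p1, p2))


def sum_of_squares_key(p, p1, p2):
    """Sum of the squared distances from p to p1 and to p2."""
    return distance(p, p1) ** 2 + distance(p, p2) ** 2


p1, p2 = (0.0, 0.0), (10.0, 0.0)
candidates = [(1.0, 0.5), (5.0, 0.0), (4.0, 3.0), (6.0, -1.0), (12.0, 4.0)]

by_midpoint = sorted(candidates, key=lambda p: midpoint_key(p, p1, p2))
by_squares = sorted(candidates, key=lambda p: sum_of_squares_key(p, p1, p2))

print(by_midpoint)
print(by_squares)
```

The two keys produce the same ordering, because the sum of the squared distances equals twice the squared distance to the midpoint plus a constant that depends only on p1 and p2. A point close to p1 but far from p2, such as (1.0, 0.5), is ranked behind points nearer the middle under either key.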

Well using the average seems like the intuitive way to do it (by the way this will be the same as using the sum). One other thing you could do would be to use the 'weighted' average. For instance, if a is the shorter distance, you could give it a higher priority by using (2*a + b) / 3, for instance (or in general (m*a + b*n) / (m + n) where m > n).
