I need help calculating the distance between two points, where each point is a longitude/latitude pair. I have a .txt file that contains longitude and latitude in columns like this:
-116.148000 32.585000
-116.154000 32.587000
-116.159000 32.584000
The columns do not have headers. I have many more latitudes and longitudes.
So far, I have come up with this code:
from math import sin, cos, sqrt, atan2, radians
R = 6370
lat1 = radians() #insert value
lon1 = radians()
lat2 = radians()
lon2 = radians()
dlon = lon2 - lon1
dlat = lat2- lat1
a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
c = 2 * atan2(sqrt(a), sqrt(1-a))
distance = R * c
print (distance)
Many of the answers I've seen on Stack Overflow for calculating the distance between two longitude/latitude points assign the coordinates as specific hard-coded values.
I would like the longitude and latitude to be read from the columns they are in, and for the calculation to run over all of the longitudes and latitudes.
I have not been able to come up with something to do this. Any help would be appreciated.
Based on the question, it sounds like you would like to calculate the distance between all pairs of points. SciPy has built-in functionality to do this.
My suggestion is to first write a function that calculates the distance, or use an existing one such as the one in geopy mentioned in another answer.
from math import sin, cos, sqrt, atan2, radians

def get_distance(point1, point2):
    # point1 and point2 are (latitude, longitude) pairs in decimal degrees
    R = 6370  # Earth radius in km
    lat1 = radians(point1[0])
    lon1 = radians(point1[1])
    lat2 = radians(point2[0])
    lon2 = radians(point2[1])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
    c = 2 * atan2(sqrt(a), sqrt(1-a))
    distance = R * c
    return distance
Then you can pass this function into scipy.spatial.distance.cdist
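import scipy.spatial.distance

# latitude_column and longitude_column are the names of your DataFrame columns
all_points = df[[latitude_column, longitude_column]].values
dm = scipy.spatial.distance.cdist(all_points,
                                  all_points,
                                  get_distance)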
As a bonus you can convert the distance matrix to a data frame if you wish to add the index to each point:
pd.DataFrame(dm, index=df.index, columns=df.index)
NOTE: I realized I am assuming, possibly incorrectly, that you are using pandas.
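If pandas and SciPy are not available, the same all-pairs distance matrix can be sketched with the standard library alone. The block below is only an illustration: the helper names parse_points and distance_matrix are hypothetical, the sample text stands in for the question's file, and it assumes each line holds longitude first and latitude second, as in the sample rows.

```python
from math import sin, cos, sqrt, atan2, radians

def get_distance(point1, point2, R=6370):
    """Haversine distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = radians(point1[0]), radians(point1[1])
    lat2, lon2 = radians(point2[0]), radians(point2[1])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    return R * c

def parse_points(text):
    """Turn 'lon lat' lines into a list of (lat, lon) tuples (hypothetical helper)."""
    points = []
    for line in text.splitlines():
        if line.strip():
            lon, lat = map(float, line.split())
            points.append((lat, lon))
    return points

def distance_matrix(points):
    """Nested list where entry [i][j] is the distance from point i to point j."""
    return [[get_distance(p, q) for q in points] for p in points]

# Sample rows copied from the question; a real script would read the file instead.
sample = """-116.148000 32.585000
-116.154000 32.587000
-116.159000 32.584000"""

points = parse_points(sample)
dm = distance_matrix(points)
for row in dm:
    print(["%.3f" % d for d in row])
```

The nested loops are quadratic in the number of points, just like cdist, so this is only practical for modest file sizes.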
If you don't mind importing a few libraries, this can be done very simply.
With pandas, you can read your text file into a dataframe, which makes working with tabular data like this super easy.
import pandas as pd
# the sample file lists longitude first, then latitude
df = pd.read_csv('YOURFILENAME.txt', delimiter=' ', header=None, names=('longitude', 'latitude'))
Then you can use the geopy library to calculate the distance.
Another solution is using the haversine equation with numpy to read in the data and calculate the distances. Any of the previous answers will also work using the other libraries.
import numpy as np
# read in the file (fname is the path to your text file); check the array dimensions with data.shape
data = np.genfromtxt(fname)
# Create the function of the haversine equation with numpy
def haversine(Olat, Olon, Dlat, Dlon):
    radius = 6371.  # km
    d_lat = np.radians(Dlat - Olat)
    d_lon = np.radians(Dlon - Olon)
    a = (np.sin(d_lat / 2.) * np.sin(d_lat / 2.) +
         np.cos(np.radians(Olat)) * np.cos(np.radians(Dlat)) *
         np.sin(d_lon / 2.) * np.sin(d_lon / 2.))
    c = 2. * np.arctan2(np.sqrt(a), np.sqrt(1. - a))
    d = radius * c
    return d
#Call function with your data from some specified point
haversine(10,-110,data[:,1],data[:,0])
In this case, you can pass a single number for Olat,Olon and an array for Dlat,Dlon or vice versa and it will return an array of distances.
For example:
haversine(20,-110,np.arange(0,50,5),-120)
OUTPUT
array([2476.17141062, 1988.11393057, 1544.75756103, 1196.89168113,
1044.73497113, 1167.50120561, 1499.09922502, 1934.97816445,
2419.40097936, 2928.35437829])
I have high-frequency GPS data that I want to downsample to one point every 50 meters, i.e. keep the GPS latitude and longitude every 50 meters and discard the points in between. I found Python code on the internet that calculates the distance between two points, but I am not sure how to read the lat and long values from a CSV, feed them into the function, and calculate the distance. Once the distance reaches 50 meters, I simply save those GPS coordinates. So far, I have the following Python code:
from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    r = 6371 # Radius of earth in kilometers. Use 3956 for miles
    return c * r

# assuming the 52.x values are latitudes and the -1.48x values are longitudes
x1 = 52.19421607
x2 = 52.20000327
y1 = -1.484984011
y2 = -1.48533465
# haversine() expects (lon1, lat1, lon2, lat2), so the longitudes go first
result = haversine(y1, x1, y2, x2) #need to give input from a csv
#if result is greater than 50 m (0.05, since result is in km), save the coordinates
print(result)
How can I solve this problem? Any direction would be appreciated.
Here is an outline and a working code example, in which I made some assumptions about which points to keep and which to drop. I assume the dataframe is sorted.
First, calculate the distance to the next point, using haversine for the lat/long pairs. This part is not fast in my implementation; faster approaches exist.
Use cumsum() of the distances to create distance groups, where group 1 is all cumulative distances below 50, group 2 is between 50 and 100, and so on.
Within each group, keep for instance only the first() row.
Note that this keeps approximately one point per 50 units based on the group, which is different from taking a point, jumping to the next point that is closest to 50 units away, and repeating. For data reduction purposes it should be fine.
Generate some random data around London.
import numpy as np
import sklearn
import pandas as pd
LONDON = (51.509865, -0.118092)
random_gps = np.random.random( (10000,2) ) / 25
random_gps[:,0] += np.arange(random_gps.shape[0]) / 25
random_gps[:,0] += LONDON[0]
random_gps[:,1] += LONDON[1]
gps_data = pd.DataFrame( random_gps, columns=["lat","long"] )
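Shift the data so that each row also carries the lat/long of its neighbouring point. Note that shift(1) brings in the values of the previous row, so despite the column names these are the coordinates of the preceding point; the distance between consecutive points is the same either way.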
gps_data['next_lat'] = gps_data.lat.shift(1)
gps_data['next_long'] = gps_data.long.shift(1)
gps_data.head()
Define the distance metric. This part can be improved in terms of speed by using vector expressions with numpy, so if speed is important change this part.
# In newer scikit-learn releases, DistanceMetric is imported from sklearn.metrics instead
from sklearn.neighbors import DistanceMetric
dist = DistanceMetric.get_metric('haversine')
EARTH_RADIUS = 6371.009

def haversine_distance(row):
    point_a = np.array([[row.lat, row.long]])
    point_b = np.array([[row.next_lat, row.next_long]])
    return EARTH_RADIUS * dist.pairwise(np.radians(point_a), np.radians(point_b) )[0][0]
Then apply our distance function (this is the slow part, which can be improved):
gps_data["distance_to_next"] = gps_data.apply( haversine_distance, axis=1)
gps_data["distance_cumsum"] = gps_data.distance_to_next.cumsum()
Finally, create the groups and drop the extra points. (!) The haversine metric returns the distance in km, so the example below groups by 50 km rather than 50 meters; divide by 0.05 instead of 50 to get 50-meter groups.
gps_data["distance_group"] = gps_data.distance_cumsum // 50
filtered = gps_data.groupby(['distance_group']).first()
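The alternative mentioned in the outline (take a point, jump to the first later point that is at least the threshold distance away, and repeat) can be sketched in plain Python. This is a minimal illustration rather than a tuned implementation: downsample_by_distance is a hypothetical name, distances are measured in a straight line from the last kept point rather than along the track, the sample track is synthetic, and the 50 m threshold is written as 0.05 because haversine() returns km.

```python
from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
    return 2 * asin(sqrt(a)) * 6371

def downsample_by_distance(points, min_km=0.05):
    """Keep the first point, then every point at least min_km from the last kept one.

    points is a sequence of (lat, lon) tuples in track order (hypothetical helper).
    """
    kept = []
    for lat, lon in points:
        if not kept:
            kept.append((lat, lon))
            continue
        last_lat, last_lon = kept[-1]
        if haversine(last_lon, last_lat, lon, lat) >= min_km:
            kept.append((lat, lon))
    return kept

# Synthetic track: 21 points heading north in steps of 0.0001 degrees (about 11 m)
track = [(52.19 + i * 0.0001, -1.485) for i in range(21)]
kept = downsample_by_distance(track)
print(len(track), "->", len(kept))
```

Rows read with csv.reader could be converted to float tuples and passed in the same way.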
I want to compute the "MANHATTAN DISTANCE", also called "CITY BLOCK DISTANCE", between pairs of coordinates given as LAT, LNG.
Following the post "Manhattan Distance for two geolocations", I computed the distance using the haversine formula:
source = (45.070060, 7.663708)
target = (45.068250, 7.663492)
This is my computation:
from math import radians, sin, asin, sqrt, atan2
# convert decimal degrees to radians
lat1, lon1, lat2, lon2 = map(radians, [source[0], source[1], target[0], target[1]])
#haversine formula for delta_lat
dlat = lat2 - lat1
a = sin(dlat / 2) ** 2
c = 2 * atan2(sqrt(a), sqrt(1-a))
r = 6371
lat_d = c * r
# haversine formula for delta_lon
dlon = lon2 - lon1
a = sin(dlon / 2) ** 2
c = 2 * atan2(sqrt(a), sqrt(1-a))
r = 6371
lon_d = c * r
print(lat_d + lon_d)
The problem is that my result is 225m, while Google Maps says 270m.
Trying again to compute the distance between
source = (45.070060, 7.663708)
target = (45.072800, 7.665540)
the result I obtained is 508m while Google Maps says 350m.
I would appreciate it if someone could help me understand what is wrong here and how to improve this solution, which is far from acceptable.
Thank you!
I think I found the answer myself. In the pictures from the original question, the haversine method I implemented measures the distance along the RED LINE in the image. That is why in the first case I obtain 225m instead of 270m (lower, because the red line is the hypotenuse of the triangle), while in the second case I obtain 508m instead of 350m (higher, because it is the sum of the legs of the triangle). Hence the way to solve this problem should be to ROTATE THE CITY MAP COUNTERCLOCKWISE so that the BLUE DOTTED line is aligned with the Y-AXIS of the Cartesian reference frame.
Any suggestions will be appreciated. Thank you.
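The rotation idea can be sketched as follows. This is only an illustration under stated assumptions: the function name manhattan_rotated and the grid_bearing_deg parameter (the compass bearing of one street direction, which would have to be measured from the map) are hypothetical, and the offsets use a local flat-Earth approximation that is reasonable only over a few kilometres. Unlike the snippet in the question, the east-west offset is scaled by the cosine of the latitude, since meridians converge away from the equator.

```python
from math import radians, sin, cos

EARTH_RADIUS_M = 6371000.0

def manhattan_rotated(source, target, grid_bearing_deg=0.0):
    """City-block distance in metres along a street grid rotated from north.

    source and target are (lat, lon) tuples in decimal degrees.
    grid_bearing_deg is the assumed clockwise bearing of one street axis.
    """
    lat1, lon1 = map(radians, source)
    lat2, lon2 = map(radians, target)
    # Local north/east offsets in metres (equirectangular approximation)
    north = (lat2 - lat1) * EARTH_RADIUS_M
    east = (lon2 - lon1) * cos((lat1 + lat2) / 2) * EARTH_RADIUS_M
    # Project the offset onto the two perpendicular street axes
    theta = radians(grid_bearing_deg)
    along = north * cos(theta) + east * sin(theta)
    across = -north * sin(theta) + east * cos(theta)
    return abs(along) + abs(across)

source = (45.070060, 7.663708)
target = (45.068250, 7.663492)
# The bearings below are arbitrary values chosen for illustration
for bearing in (0.0, 15.0, 30.0):
    print(bearing, round(manhattan_rotated(source, target, bearing), 1))
```

With a bearing of 0 the street axes coincide with north and east, which corresponds to the unrotated computation; other bearings give a longer or shorter city-block distance depending on how the grid is oriented relative to the straight line between the points.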
I am working on a Python project where I have two lat/long pairs and I want to calculate the distance between them. In other projects I have calculated distance in Postgres using ST_Distance_Sphere(a.loc_point, b.loc_point), but I would like to avoid having to load all of my data into Postgres just so that I can calculate distance differences. I have searched, but have not been able to find what I would like, which is a purely Python implementation of this so that I don't have to load my data into Postgres.
I know there are other distance calculations that treat the earth as a perfect sphere, but those aren't good enough due to poor accuracy, which is why I would like to use the PostGIS ST_Distance_Sphere() function (or an equivalent).
Here are a couple of sample Lat/Longs that I would like to calculate the distance of:
Lat, Long 1: (49.8755, 6.07594)
Lat, Long 2: (49.87257, 6.0784)
I can't imagine I am the first person to ask this, but does anyone know of a way to use ST_Distance_Sphere() for lat/long distance calculations purely from within a Python script?
I would recommend the geopy package - see section Measuring Distance in the documentation...
For your particular case:
from geopy.distance import great_circle
p1 = (49.8755, 6.07594)
p2 = (49.87257, 6.0784)
print(great_circle(p1, p2).kilometers)
This is a rudimentary function used to calculate distance between two coordinates on a perfect sphere with Radius = Radius of Earth
from math import pi, acos, sin, cos

def calcd(y1, x1, y2, x2):
    #
    y1 = float(y1)
    x1 = float(x1)
    y2 = float(y2)
    x2 = float(x2)
    #
    R = 3958.76 # miles
    #
    y1 *= pi/180.0
    x1 *= pi/180.0
    y2 *= pi/180.0
    x2 *= pi/180.0
    #
    # approximate great circle distance with law of cosines
    #
    x = sin(y1)*sin(y2) + cos(y1)*cos(y2)*cos(x2-x1)
    if x > 1:
        x = 1
    return acos( x ) * R
Hope this helps!
See this question: How can I quickly estimate the distance between two (latitude, longitude) points?
from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    km = 6367 * c
    return km
By Aaron D
You can modify it to return miles by adding miles = km * 0.621371
I have since found another way in addition to the answers provided here: using the Python haversine module.
from haversine import haversine as h
# a and b are (latitude, longitude) tuples
# Return results in meters (*1000)
print('{0:30}{1:12}'.format("haversine module:", h(a, b)*1000))
I tested all three answers plus the haversine module against what I got using ST_Distance_Sphere(a, b) in Postgres. All answers were excellent (thank you), but the all-math answer (calcd) from Sishaar Rao was the closest. Here are the results, in meters:
# Short Distance Test
ST_Distance_Sphere(a, b): 370.43790478
vincenty: 370.778186438
great_circle: 370.541763803
calcd: 370.437386736
haversine function: 370.20481753
haversine module: 370.437394767
#Long Distance test:
ST_Distance_Sphere(a, b): 1011734.50495159
vincenty: 1013450.40832
great_circle: 1012018.16318
calcd: 1011733.11203
haversine function: 1011097.90053
haversine module: 1011733.11203
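Much of the spread in these figures seems to come from the Earth radius each method assumes rather than from the formula itself. As a rough check (a sketch, assuming a and b are the sample points from the question, with haversine_m as a hypothetical helper), the same haversine formula can be evaluated with the 6367 km radius hard-coded in the haversine function above and with a 6371 km mean radius:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(p1, p2, radius_km):
    """Haversine distance in metres between two (lat, lon) tuples in degrees."""
    lat1, lon1 = map(radians, p1)
    lat2, lon2 = map(radians, p2)
    a = sin((lat2 - lat1) / 2)**2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2)**2
    return 2 * asin(sqrt(a)) * radius_km * 1000

a = (49.8755, 6.07594)
b = (49.87257, 6.0784)

d_6367 = haversine_m(a, b, 6367)  # radius hard-coded in the haversine function
d_6371 = haversine_m(a, b, 6371)  # a commonly used mean Earth radius
print(d_6367, d_6371, d_6367 / d_6371)
```

The distance scales linearly with the radius, so the two results differ by the factor 6367/6371, which is close to the ratio between the "haversine function" and "haversine module" rows in the table.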
I want to find a user location within 500 meters from given lat and long in Python.
Given lat & long = 19.114315,72.911174
I want to check whether a new lat and long is within 500 meters of the given lat and long.
new lat and long = 19.112398,72.912743
I am using this formula in Python:
math.acos(math.sin(19.114315) * math.sin(19.112398) + math.cos(19.114315) * math.cos(19.112398) * math.cos(72.912743 - (72.911174))) * 6371 <= 0.500
But it's not giving me the expected results. Am I missing something?
Please help.
You can use the Haversine formula to get the great-circle distance (along a sphere) between two points. There are some problems with treating the earth as a sphere over great distances, but for 500 meters, you're probably fine (assuming that you're not trying to drop medical packages on a boat or something).
from math import radians, sin, cos, asin, sqrt

def haversine(lat1, long1, lat2, long2, EARTH_RADIUS_KM=6372.8):
    # get distance between the points
    phi_Lat = radians(lat2 - lat1)
    phi_Long = radians(long2 - long1)
    lat1 = radians(lat1)
    lat2 = radians(lat2)
    a = sin(phi_Lat/2)**2 + \
        cos(lat1) * cos(lat2) * \
        sin(phi_Long/2)**2
    c = 2 * asin(sqrt(a))
    return EARTH_RADIUS_KM * c
If the distance between the two points is less than your threshold, it's within the range:
points_1 = (19.114315,72.911174)
points_2 = (19.112398,72.912743)
threshold_km = 0.5
distance_km = haversine(points_1[0], points_1[1], points_2[0], points_2[1])
if distance_km < threshold_km:
    print('within range')
else:
    print('outside range')
Hint: write reusable code, and clean up your math.
It seems you have an error right near the end of that formula:
math.cos(longRad - (longRad)) == math.cos(0) == 1
I'm not good with geo stuff, so I won't correct it.
import math

def inRangeRad(latRad, longRad):
    sinLatSqrd = math.sin(latRad) * math.sin(latRad)
    cosLatSqrd = math.cos(latRad) * math.cos(latRad)
    inner = sinLatSqrd + cosLatSqrd * math.cos(longRad - (longRad))
    return math.acos( inner ) * 6371 <= 0.500

def inRangeDeg(latDeg, longDeg):
    latRad = 0.0174532925 * latDeg
    longRad = 0.0174532925 * longDeg
    return inRangeRad(latRad, longRad)

print("test")
print("19.114315, 72.911174")
print(inRangeDeg(19.114315, 72.911174))
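For comparison, here is a sketch of a range check that takes both points as arguments. It keeps the spherical law of cosines from the question but converts degrees to radians first, since math.sin and math.cos expect radians; the formula in the question passes degree values to them directly. The function name within_range and its parameters are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371

def within_range(lat1, lon1, lat2, lon2, limit_km=0.5):
    """Return True if the two points (decimal degrees) are within limit_km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    inner = (math.sin(lat1) * math.sin(lat2) +
             math.cos(lat1) * math.cos(lat2) * math.cos(lon2 - lon1))
    # Rounding can push inner slightly outside [-1, 1], which acos rejects
    inner = max(-1.0, min(1.0, inner))
    return math.acos(inner) * EARTH_RADIUS_KM <= limit_km

print(within_range(19.114315, 72.911174, 19.112398, 72.912743))
```

Passing both points explicitly avoids the longRad - longRad cancellation, and the clamp guards against a domain error when the two points coincide.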
Be very careful with this! You cannot just use cos and sin for distances using GPS coordinates, because the distance will be incorrect!
https://en.wikipedia.org/wiki/Geodesy#Geodetic_problems
In the case of plane geometry (valid for small areas on the Earth's
surface) the solutions to both problems reduce to simple trigonometry.
On the sphere, the solution is significantly more complex, e.g., in
the inverse problem the azimuths will differ between the two end
points of the connecting great circle, arc, i.e. the geodesic.
Look into GeoPy for these calculations; you really do not want to implement that yourself.
I have been having trouble finding a solution to my problem.
I have a very large csv file containing multiple points. I have created two different functions that compute both distance and speed.
What I need is a way to carry out these functions between the first and second point, then the second and third point, and so on. I have been toying with using arrays and numpy, but I cannot seem to figure it out.
Here are my functions:
import math

# distance in meters (despite the name, this uses the spherical law of cosines)
def haversine_distance(lat1, long1, lat2, long2):
    degrees_to_radians = math.pi/180.0
    phi1 = (90.0 - lat1)*degrees_to_radians
    phi2 = (90.0 - lat2)*degrees_to_radians
    theta1 = long1 * degrees_to_radians
    theta2 = long2 * degrees_to_radians
    cos = (math.sin(phi1)*math.sin(phi2)*math.cos(theta1 - theta2) +
           math.cos(phi1)*math.cos(phi2))
    arc = math.acos(cos) * 6371
    arc = arc * 1000
    return arc

# speed (km/h if the times are in seconds)
def speed(lat1, long1, time1, lat2, long2, time2):
    distance = haversine_distance(lat1, long1, lat2, long2)
    delta_time = time2 - time1
    speed = (distance / delta_time)
    speed = speed * 3.6
    return speed
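I assume you can read your csv data into numpy arrays. Let's call them Lat, Long and Time (capitalized). Then all you need to do is call your functions on the appropriately offset slices. Note that for the functions to accept whole arrays, they must use numpy's element-wise functions (np.sin, np.cos, np.arccos) in place of the math versions, which only accept scalars: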
# initialize correct vectors
l1=Lat[:-1] # all points but last
l2=Lat[1:] # all points but first
lg1=Long[:-1]
lg2=Long[1:]
t1=Time[:-1]
t2=Time[1:]
speed(l1,lg1,t1,l2,lg2,t2) # call the function which will run over your arrays
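If the functions are left as written with the math module, they only accept scalars, so an alternative is to walk over consecutive rows in plain Python. A minimal sketch, assuming the rows are (lat, long, time) tuples with time in seconds; pairwise_speeds is a hypothetical name and the sample rows are made up for illustration:

```python
import math

def haversine_distance(lat1, long1, lat2, long2):
    """Distance in metres via the spherical law of cosines (as in the question)."""
    degrees_to_radians = math.pi / 180.0
    phi1 = (90.0 - lat1) * degrees_to_radians
    phi2 = (90.0 - lat2) * degrees_to_radians
    theta1 = long1 * degrees_to_radians
    theta2 = long2 * degrees_to_radians
    cos_arc = (math.sin(phi1) * math.sin(phi2) * math.cos(theta1 - theta2) +
               math.cos(phi1) * math.cos(phi2))
    # Clamp to guard against rounding just outside acos's domain
    cos_arc = max(-1.0, min(1.0, cos_arc))
    return math.acos(cos_arc) * 6371 * 1000

def speed(lat1, long1, time1, lat2, long2, time2):
    """Speed in km/h, assuming times are in seconds."""
    distance = haversine_distance(lat1, long1, lat2, long2)
    return distance / (time2 - time1) * 3.6

def pairwise_speeds(rows):
    """Apply speed() to rows 0-1, 1-2, 2-3, and so on."""
    return [speed(*a, *b) for a, b in zip(rows, rows[1:])]

# Made-up sample rows: (lat, long, time in seconds)
rows = [(52.0000, -1.5000, 0.0),
        (52.0010, -1.5000, 10.0),
        (52.0030, -1.5000, 20.0)]
print(pairwise_speeds(rows))
```

zip(rows, rows[1:]) pairs each row with its successor, which mirrors the [:-1] and [1:] slices used for the numpy arrays. For a very large file the numpy route will be considerably faster.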