Finding the distance (Haversine) between all elements in a single dataframe


I currently have a dataframe which includes five columns, as seen below. I group the elements of the original dataframe such that they fall within a 100 km x 100 km grid. For each grid element, I need to determine whether there is at least one pair of points within 100 m of each other. To do this, I am using the Haversine formula and calculating the distance between all points within a grid element using a for loop. This is rather slow, as my parent data structure can have billions of points and each grid element millions. Is there a quicker way to do this?
Here is a view into a group in the dataframe. "approx_LatSp" & "approx_LonSp" are what I use for groupBy in a previous function.
print(group.head())
Time Lat Lon approx_LatSp approx_LonSp
197825 1.144823 -69.552576 -177.213646 -70.0 -177.234835
197826 1.144829 -69.579416 -177.213370 -70.0 -177.234835
197827 1.144834 -69.606256 -177.213102 -70.0 -177.234835
197828 1.144840 -69.633091 -177.212856 -70.0 -177.234835
197829 1.144846 -69.659925 -177.212619 -70.0 -177.234835
This group is equivalent to one grid element. This group gets passed to the following function which seems to be the crux of my issue (from a performance perspective):
def get_pass_in_grid(group):
    '''
    Checks if there are two points within 100m
    '''
    check_100m = 0
    check_1km = 0
    row_mins = []
    for index, row in group.iterrows():
        # Get distance
        distance_from_row = get_distance_lla(row['Lat'], row['Lon'], group['Lat'].drop(index), group['Lon'].drop(index))
        minimum = np.amin(distance_from_row)
        row_mins = row_mins + [minimum]
    array = np.array(row_mins)
    m_100 = array[array < 0.1]
    km_1 = array[array < 1.0]
    if m_100.size > 0:
        check_100m = 1
    if km_1.size > 0:
        check_1km = 1
    return check_100m, check_1km
And the Haversine formula is calculated as follows
def get_distance_lla(row_lat, row_long, group_lat, group_long):
    def radians(degrees):
        return degrees * np.pi / 180.0
    global EARTH_RADIUS
    lon1 = radians(group_long)
    lon2 = radians(row_long)
    lat1 = radians(group_lat)
    lat2 = radians(row_lat)
    # Haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat / 2)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2)**2
    c = 2 * np.arcsin(np.sqrt(a))
    # calculate the result
    return(c * EARTH_RADIUS)
One way in which I know I can improve this code is to stop the for loop if the 100m is met for any two points. If this is the only way to improve the speed then I will apply this. But I am hoping there is a better way to resolve my problem. Any thoughts are greatly appreciated! Let me know if I can help to clear something up.

1. Convert all points to Cartesian coordinates to make the task much easier (a distance of 100 m is small enough that you can disregard the fact that the Earth is not flat).
2. Divide each grid into NxN subgrids (20x20, 100x100? check which is faster) and determine, for each point, which subgrid it is located in. Compute distances within the smaller subgrids (and their neighbours) instead of searching the whole grid.
3. Use numpy to vectorize the calculations (doing point 1 will definitely help you).
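The suggestions above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the answerer's code: it uses an equirectangular projection (reasonable for extents of about 100 km, poor near the poles and across the ±180° meridian), a hypothetical helper name `has_pair_within`, an assumed Earth radius of 6371 km, and a cell size equal to the search radius, so that only the 3x3 neighbourhood of each cell has to be checked. It is written in plain Python for clarity rather than vectorized.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def to_local_xy(lat, lon, lat0, lon0):
    """Project (lat, lon) in degrees to km offsets from (lat0, lon0).

    Equirectangular approximation: adequate for small extents only.
    """
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_KM
    y = math.radians(lat - lat0) * EARTH_RADIUS_KM
    return x, y


def has_pair_within(points, radius_km):
    """Return True if any two (lat, lon) points are closer than radius_km."""
    if len(points) < 2:
        return False
    lat0, lon0 = points[0]
    cells = defaultdict(list)  # (cell_x, cell_y) -> projected points seen so far
    for lat, lon in points:
        x, y = to_local_xy(lat, lon, lat0, lon0)
        cx, cy = math.floor(x / radius_km), math.floor(y / radius_km)
        # Only points in the 3x3 block of cells around (cx, cy) can be close enough.
        for nx in (cx - 1, cx, cx + 1):
            for ny in (cy - 1, cy, cy + 1):
                for ox, oy in cells.get((nx, ny), ()):
                    if math.hypot(x - ox, y - oy) < radius_km:
                        return True  # stop at the first qualifying pair
        cells[(cx, cy)].append((x, y))
    return False
```

For a group from the question, a call might look like `has_pair_within(list(zip(group['Lat'], group['Lon'])), 0.1)`. Because each point is compared only with points in neighbouring cells, the work grows roughly linearly with the number of points when they are spread out.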

Thanks to @Corralien for the advice. I was able to use a BallTree to quickly find the closest elements. The improvement is something like 100x over my original code from a performance standpoint. Here is the new get_pass_in_grid:
import numpy as np
from sklearn.neighbors import BallTree

def get_pass_in_grid(group):
    '''
    Checks if there is a pass within 100m to meet SWE L-Band requirement
    '''
    check_100m = 0
    check_1km = 0
    if len(group) < 2:
        return check_100m, check_1km
    row_mins = []
    # Note: this converts the Lat/Lon columns of the group to radians in place
    group['Lat'] = np.deg2rad(group['Lat'])
    group['Lon'] = np.deg2rad(group['Lon'])
    temp = np.array([group['Lat'],group['Lon']]).T
    tree = BallTree(temp, leaf_size=2, metric='haversine')
    for _, row in group.iterrows():
        # Get distance
        row_arr = np.array([row['Lat'], row['Lon']]).reshape((-1,2))
        closest_elem_lst, _ = tree.query(row_arr, k=2)
        # First element is always just this one (since d=0)
        closest_elem = closest_elem_lst[0,1] * EARTH_RADIUS
        row_mins = row_mins + [closest_elem]
        if closest_elem < 0.1:
            break
    array = np.array(row_mins)
    m_100 = array[array < 0.1]
    km_1 = array[array < 1.0]
    if m_100.size > 0:
        check_100m = 1
    if km_1.size > 0:
        check_1km = 1
    return check_100m, check_1km
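As a possible further step, the per-row loop can be dropped by querying the tree with all points at once. The sketch below is an assumption-laden variant, not the poster's code: the helper name `nearest_neighbour_km` and the radius constant `EARTH_RADIUS_KM = 6371.0` are made up for illustration, and the sample coordinates are the first three rows shown in the question.

```python
import numpy as np
from sklearn.neighbors import BallTree

EARTH_RADIUS_KM = 6371.0  # assumed; the original relies on a global EARTH_RADIUS


def nearest_neighbour_km(lat_deg, lon_deg):
    """Distance in km from each point to its nearest other point."""
    # The haversine metric expects (lat, lon) pairs in radians.
    coords = np.deg2rad(np.column_stack([lat_deg, lon_deg]))
    tree = BallTree(coords, metric="haversine")
    # k=2: column 0 is the zero distance to the point itself,
    # column 1 is the distance to the nearest other point.
    dist, _ = tree.query(coords, k=2)
    return dist[:, 1] * EARTH_RADIUS_KM


lat = np.array([-69.552576, -69.579416, -69.606256])
lon = np.array([-177.213646, -177.213370, -177.213102])
nearest = nearest_neighbour_km(lat, lon)
check_100m = int((nearest < 0.1).any())
check_1km = int((nearest < 1.0).any())
```

This gives up the early `break`, but a single batched `query` call avoids the Python-level loop over `iterrows()`, which is usually the larger cost.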

Related

GPS data outliers

I have a set of latitude and longitude points, but I am having a hard time building an algorithm that removes outlier points (shown in orange circles). What I tried is to compute the vector angle between consecutive points and remove a point if the angle exceeds a certain threshold (i.e. a large angle). Is there a better way? Also, when I use the dataframe in pandas, I have a problem with the looping: when the angle is large, the loop should go back to the base point (red circle) and keep removing points until the next point is roughly in a straight line. Thanks in advance.
Here is my sample code:
df_4 = df_3.copy()
def angle(vector1, vector2):
    unit_vector_1 = vector1 / np.linalg.norm(vector1)
    unit_vector_2 = vector2 / np.linalg.norm(vector2)
    dot_product = np.dot(unit_vector_1, unit_vector_2)
    angle = np.arccos(dot_product)
    return np.degrees(angle)
df_4.reset_index(inplace=True)
th = 20
angle_vector = 0
idx=0
a=0
for x in range(len(df_4)):
    print('for:',str(idx))
    vector1 = [(df_4.loc[idx+1,'Longitude'] - df_4.loc[idx,'Longitude']),(df_4.loc[idx+1,'Latitude']-df_4.loc[idx,'Latitude'])]
    vector2 = [(df_4.loc[idx+2,'Longitude'] - df_4.loc[idx+1,'Longitude']),(df_4.loc[idx+2,'Latitude']-df_4.loc[idx+1,'Latitude'])]
    angle_vector = angle(vector1,vector2)
    df_4.at[idx+1,'angle_vector'] = angle_vector
    #print(angle_vector)
    print('before:',len(df_4))
    i = 0
    while(angle_vector > th & (angle_vector not in [a for a in range(86,96)])):
        df_4.drop(df_4.index[idx+1], inplace=True )
        df_4.reset_index(inplace=True,drop=True)
        print('after:',len(df_4))
        print('while:',str(i))
        vector3 = [(df_4.loc[idx+i,'Longitude'] - df_4.loc[idx+i-a,'Longitude']), (df_4.loc[idx+i,'Latitude']-df_4.loc[idx+i-a,'Latitude'])]
        angle_vector = angle(vector1,vector3)
        i += 1
    idx+=1

Calculate distance from one point to all others

I am working with a list of ID, X, and Y data for fire hydrant locations. I am trying to find the three closest fire hydrants for each fire hydrant in the list.
a = [[ID, X, Y],[ID, X, Y]]
I have tried implementing this using a for loop but I am having trouble because I cannot keep the original point data the same while iterating through the list of points.
Is there a straightforward way to calculate the distance from one point to each of the other points and repeat this for each point in the list? I am very new to Python and have not seen anything about how to do this online.
Any help would be greatly appreciated.

You do not have to calculate all distances of all points to all others to get the three nearest neighbours for all points.
A kd-tree search will be much more efficient: each query takes roughly O(log n) on average, so finding the neighbours of all points costs about O(n log n), instead of the O(n**2) time of the brute-force method (calculating all distances).
Example
import numpy as np
from scipy import spatial
#Create some coordinates and indices
#It is assumed that the coordinates are unique (only one entry per hydrant)
Coords=np.random.rand(1000*2).reshape(1000,2)
Coords*=100
Indices=np.arange(1000) #Indices
def get_indices_of_nearest_neighbours(Coords,Indices):
    tree=spatial.cKDTree(Coords)
    #k=4 because the first entry is the nearest neighbour
    # of a point with itself
    res=tree.query(Coords, k=4)[1][:,1:]
    return Indices[res]

Here you go. Let's say you have an input list with this format [[ID, X, Y],[ID, X, Y]].
You can simply use two nested loops over the hydrants and calculate the distance between each pair. You just need variables to store the minimum distance found so far for each hydrant and the ID of the closest hydrant.
import math # for sqrt calculation
def distance(p0, p1):
    """ Calculate the distance between two hydrants """
    return math.sqrt((p0[1] - p1[1])**2 + (p0[2] - p1[2])**2)
hydrants = [[0, 1, 2], [1, 2, -3], [2, -3, 5]] # your input list of hydrants
for current_hydrant in hydrants: # loop through each hydrant
    min_distance = float('inf')
    closest_hydrant = 0
    for other_hydrant in hydrants: # loop through each other hydrant
        if current_hydrant != other_hydrant:
            curr_distance = distance(current_hydrant, other_hydrant) # call the distance function
            if curr_distance < min_distance: # find the closest hydrant
                min_distance = curr_distance
                closest_hydrant = other_hydrant[0]
    print("Closest fire hydrant to hydrant", current_hydrant[0], "is hydrant",
          closest_hydrant, "with a distance of", min_distance) # print the closest hydrant
Since the distance function is not very complicated, I wrote it myself; you could also use a function from the SciPy or NumPy libraries to get the distance.
Hope this can help ;)
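The loop above reports only the single closest hydrant, while the question asks for the three closest. One way to extend the same brute-force idea is sketched below; it assumes the `[ID, X, Y]` layout from the question and unique IDs, and the function name `three_nearest` and the sample coordinates are made up for illustration.

```python
import heapq
import math


def three_nearest(hydrants, k=3):
    """Map each hydrant ID to its k nearest neighbours as (distance, other_id) pairs."""
    result = {}
    for hid, x, y in hydrants:
        # Distances from this hydrant to every other hydrant (generated lazily).
        candidates = (
            (math.hypot(x - ox, y - oy), oid)
            for oid, ox, oy in hydrants
            if oid != hid
        )
        # nsmallest keeps only the k best candidates instead of sorting them all.
        result[hid] = heapq.nsmallest(k, candidates)
    return result


hydrants = [[0, 0, 0], [1, 1, 0], [2, 3, 0], [3, 6, 0], [4, 10, 0]]
nearest = three_nearest(hydrants)
print(nearest[0])  # [(1.0, 1), (3.0, 2), (6.0, 3)]
```

This is still O(n**2) overall, so it is only reasonable for modest list sizes; for large inputs a tree-based search scales better.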

If you have geolocation data, you can use the Haversine formula (https://en.m.wikipedia.org/wiki/Haversine_formula) to get the distance in kilometers between two locations. This code is NOT meant to be efficient; if efficiency is what you want, NumPy can be used to speed it up:
import math
def distance(lat,lon, lat2,lon2):
    R = 6372.8 # Earth radius in kilometers
    # change lat and lon to radians to find diff
    rlat = math.radians(lat)
    rlat2 = math.radians(lat2)
    rlon = math.radians(lon)
    rlon2 = math.radians(lon2)
    dlat = math.radians(lat2 - lat)
    dlon = math.radians(lon2 - lon)
    m = math.sin(dlat/2)**2 + \
        math.cos(rlat)*math.cos(rlat2)*math.sin(dlon/2)**2
    return 2 * R * math.atan2(math.sqrt(m),
                              math.sqrt(1 - m))
a = [['ID1', 52.5170365, 13.3888599],
     ['ID2', 54.5890365, 12.5865499],
     ['ID3', 50.5170365, 10.3888599],
    ]
b = []
for id, lat, lon in a:
    for id2, lat2, lon2 in a:
        if id != id2:
            d = distance(lat,lon,lat2,lon2)
            b.append([id,id2,d])
print(b)
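The NumPy speed-up mentioned in this answer could look something like the sketch below. It is an illustration rather than the answerer's code: the function name `pairwise_haversine_km` is hypothetical, the radius default reuses the 6372.8 km from above, and the coordinates are the three sample points from the list `a`.

```python
import numpy as np


def pairwise_haversine_km(lat_deg, lon_deg, radius_km=6372.8):
    """Return an (n, n) matrix of great-circle distances in km."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    # Broadcasting a column vector against a row vector gives all pairwise differences.
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    m = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    # Clip guards against rounding pushing m slightly outside [0, 1].
    return 2 * radius_km * np.arcsin(np.sqrt(np.clip(m, 0.0, 1.0)))


lats = [52.5170365, 54.5890365, 50.5170365]
lons = [13.3888599, 12.5865499, 10.3888599]
d = pairwise_haversine_km(lats, lons)
```

Here `d[i, j]` is the distance between points `i` and `j`. The full matrix needs O(n**2) memory, so this suits thousands of points, not millions.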

How to speedup this for loop in python

I'm dealing with two sets of three large lists of the same size containing longitude, latitude and altitude coordinates in UTM format (see lists below). The arrays contain overlapping coordinates (i.e. longitude and latitude values are equal). If the values in Lon are equal to Lon2 and the values in Lat are equal to Lat2 then I want to calculate the mean altitude at those indexes. However, if they're not equal then the longitude, latitude and altitude values will remain. I only want to replace the overlapping data to one set of longitude and latitude coordinates and calculate the mean at those coordinates.
This is my attempt so far
import numpy as np
Lon = [450000.50,459000.50,460000,470000]
Lat = [5800000.50,459000.50,500000,470000]
Alt = [-1,-9,-2,1]
Lon2 = [450000.50,459000.50,460000,470000]
Lat2 = [5800000.50,459000.50,800000,470000]
Alt2= [-3,-1,-20,2]
MeanAlt = []
appendAlt = MeanAlt.append
LonOverlap = []
appendLon = LonOverlap.append
LatOverlap = []
appendLat = LatOverlap.append
for i, a in enumerate(Lon and Lat and Alt):
    for j, b in enumerate(Lon2 and Lat2 and Alt2):
        if Lon[i]==Lon2[j] and Lat[i]==Lat2[j]:
            MeanAltData = (Alt[i]+Alt2[j])/2
            appendAlt(MeanAltData)
            LonOverlapData = Lon[i]
            appendLon(LonOverlapData)
            LatOverlapData = Lat[i]
            appendLat(LatOverlapData)
print(MeanAlt) # correct ans should be MeanAlt = [-2.0,-5,1.5]
print(LonOverlap)
print(LatOverlap)
I'm working in a jupyter notebook and my laptop is rather slow so I need to make this code much more efficient. I would appreciate any help on this. Thank you :)

I believe your code can be improved in 2 ways:
Firstly, the usage of tuples instead of lists, as iterating over a tuple is generally faster than iterating over a list.
Secondly, your for loops can be reduced to only one loop that iterates over the indices of the tuples you are going to read. Of course, this assumption holds if and only if all your tuples contain the same amount of items (i.e.: len(Lat) == len(Lon) == len(Alt) == len(Lat2) == len(Lon2) == len(Alt2)).
Here is the improved code (I took the liberty of removing the import numpy statement as it was not being used in the piece of code you provided):
# use of tuples
Lon = (450000.50, 459000.50, 460000, 470000)
Lat = (5800000.50, 459000.50, 500000, 470000)
Alt = (-1, -9, -2, 1)
Lon2 = (450000.50, 459000.50, 460000, 470000)
Lat2 = (5800000.50, 459000.50, 800000, 470000)
Alt2 = (-3, -1, -20, 2)
MeanAlt = []
appendAlt = MeanAlt.append
LonOverlap = []
appendLon = LonOverlap.append
LatOverlap = []
appendLat = LatOverlap.append
# only one loop
for i in range(len(Lon)):
    if (Lon[i] == Lon2[i]) and (Lat[i] == Lat2[i]):
        MeanAltData = (Alt[i] + Alt2[i]) / 2
        appendAlt(MeanAltData)
        LonOverlapData = Lon[i]
        appendLon(LonOverlapData)
        LatOverlapData = Lat[i]
        appendLat(LatOverlapData)
print(MeanAlt) # correct ans should be MeanAlt = [-2.0,-5,1.5]
print(LonOverlap)
print(LatOverlap)
I executed this program 1 million times on my laptop. Following my code, the amount of time required for all executions is: 1.41 seconds. On the other hand, with your approach the amount of time it takes is: 4.01 seconds.

This is not 100% functionally equivalent, but I am guessing it is closer to what you actually want:
Lon = [450000.50,459000.50,460000,470000]
Lat = [5800000.50,459000.50,500000,470000]
Alt = [-1,-9,-2,1]
Lon2 = [450000.50,459000.50,460000,470000]
Lat2 = [5800000.50,459000.50,800000,470000]
Alt2= [-3,-1,-20,2]
MeanAlt = []
appendAlt = MeanAlt.append
LonOverlap = []
appendLon = LonOverlap.append
LatOverlap = []
appendLat = LatOverlap.append
# Look-up table keyed by "lat/lon" for the first data set
ll = dict((str(la)+'/'+str(lo), al) for (la, lo, al) in zip(Lat, Lon, Alt))
for la, lo, al in zip(Lat2, Lon2, Alt2):
    al2 = ll.get(str(la)+'/'+str(lo))
    if al2 is not None: # an altitude of 0 is still a valid match
        MeanAltData = (al+al2)/2
        appendAlt(MeanAltData)
        LonOverlapData = lo
        appendLon(LonOverlapData)
        LatOverlapData = la
        appendLat(LatOverlapData)
print(MeanAlt) # correct ans should be MeanAlt = [-2.0,-5,1.5]
print(LonOverlap)
print(LatOverlap)
Or simpler:
Lon = [450000.50,459000.50,460000,470000]
Lat = [5800000.50,459000.50,500000,470000]
Alt = [-1,-9,-2,1]
Lon2 = [450000.50,459000.50,460000,470000]
Lat2 = [5800000.50,459000.50,800000,470000]
Alt2= [-3,-1,-20,2]
ll = dict((str(la)+'/'+str(lo), al) for (la, lo, al) in zip(Lat, Lon, Alt))
result = []
for la, lo, al in zip(Lat2, Lon2, Alt2):
    al2 = ll.get(str(la)+'/'+str(lo))
    if al2 is not None: # an altitude of 0 is still a valid match
        result.append((la, lo, (al+al2)/2))
print(result)
In practice, I would try to start with better structured input data to begin with, making the conversion to dict, or at the very least the "zip()" unnecessary.

Use numpy to vectorize the computations. For arrays with 1,000,000 elements, the execution time should be on the order of 15-25 ms if the inputs are already numpy.ndarrays and ~140 ms if the inputs are Python lists.
import numpy as np
def mean_alt(lon, lon2, lat, lat2, alt, alt2):
    lon = np.asarray(lon)
    lon2 = np.asarray(lon2)
    lat = np.asarray(lat)
    lat2 = np.asarray(lat2)
    alt = np.asarray(alt)
    alt2 = np.asarray(alt2)
    ind = np.where((lon == lon2) & (lat == lat2))
    mean_alt = (0.5 * (alt[ind] + alt2[ind])).tolist()
    return (lon[ind].tolist(), lat[ind].tolist(), mean_alt)

Python: Connected components on a sphere

I have been banging my head against this for some time now. My problem is very simple to explain:
I have data containing longitudes and latitudes. For simplicity, let us assume these are coordinates of cities. What I want is to separate these city coordinates into groups, so that every city within a group lies within a given 'maximum distance' of its nearest neighbour. All cities within a group must have at least one neighbour within this distance limit. The minimum distance between these separated groups is therefore greater than the 'maximum distance' mentioned above.
My understanding is that this is a clustering problem (e.g. minimum spanning tree). The distance on the sphere can be calculated with the haversine distance, but I can't wrap my head around how to implement this...my restriction are that I can only use numpy, scipy, and scikit-learn.
I hope someone can help
thanks

Ok, so I have implemented a brute force approach to solve this. I am not 100% sure if the results are correct in all cases, though...if some of you have time to check this, it would be greatly appreciated.
import numpy as np
import matplotlib.pyplot as plt
# -------------------------------------------------------------------
def distance_sphere(lon1, lat1, lon2, lat2):
    # Calculate distance on sphere
    return np.degrees(np.arccos(np.sin(np.radians(lat1)) * np.sin(np.radians(lat2)) +
                                np.cos(np.radians(lat1)) * np.cos(np.radians(lat2)) *
                                np.cos(np.radians(lon1 - lon2))))
# -------------------------------------------------------------------
def distance_euclid(lon1, lat1, lon2, lat2):
    # Calculate distance
    return np.sqrt((lon1 - lon2)**2 + (lat1 - lat2)**2)
# -------------------------------------------------------------------
# Maximum allowed distance in degrees
max_distance = 10
# Generate city coordinates
lon_all = np.random.random(100) * 100
lat_all = np.random.random(100) * 100
# Start with as many groups as cities
group = np.arange(len(lon_all))
# Loop over all city coordinates
for lon, lat in zip(lon_all, lat_all):
    # Calculate distance to all other cities
    dis = distance_euclid(lon1=lon, lat1=lat, lon2=lon_all, lat2=lat_all)
    # Get index of those which are within the given limits
    idx = np.where(dis <= max_distance)[0]
    # If there is no other city, we continue
    if len(idx) == 0:
        continue
    # Set common group for all cities within the limits
    for i in idx:
        group[group == group[i]] = min(group[idx])
# Rewrite labels starting with 0
for old, new in zip(set(group), range(len(set(group)))):
    idx = [i for i, j in enumerate(group) if j == old]
    group[idx] = new
# -------------------------------------------------------------------
# Plot results
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=[10, 10])
for g, lon, lat in zip(group, lon_all, lat_all):
    ax.annotate(str(g), xy=(lon, lat), xycoords="data", size=12, ha="center", va="center")
    circ = plt.Circle((lon, lat), radius=max_distance/2, lw=0, color="gray")
    ax.add_patch(circ)
ax.set_xlim(-10, 110)
ax.set_ylim(-10, 110)
plt.show()

From the graphical output as it stands in your answer, I believe that your clusters are being terminated prematurely. This is my approach to the problem; the code is ugly because really I just wanted to demonstrate the concept and I don't have time to think about the most elegant way to illustrate this. Also, it's not in numpy because then I could steal my old distance calculation function to save me some time. Hopefully the concept though is clear enough and you'll see how it could be made faster and cleaner e.g. not repeatedly rebuilding available_locations and maybe not re-scanning items in the cluster from previous iteration.
Edit: Illustrated behaviour:
1) Always converges on same solution for each DISTANCE_CAP regardless of all the randomisation in the initialisation and progression of the solution
2) Modifying DISTANCE_CAP can result in single-location clusters or a giant blob
import math
from random import choice, shuffle
DISTANCE_CAP = 20
def crow_flies(lat1, lon1, lat2, lon2):
    dx1,dy1 = (lat1/180)*3.141593,(lon1/180)*3.141593
    dx2,dy2 = (lat2/180)*3.141593,(lon2/180)*3.141593
    dlat,dlon = abs(dx2-dx1),abs(dy2-dy1)
    a = (math.sin(dlat/2))**2 + (math.cos(dx1) * math.cos(dx2)
        * (math.sin(dlon/2))**2)
    c = 2*(math.atan2(math.sqrt(a),math.sqrt(1-a)))
    km = 6373 * c
    return km
# Aim: separate these back out
manchester = [[53.486286, -2.251476, 1],
              [53.483586, -2.254534, 2],
              [53.475158, -2.248011, 3],
              [53.397161, -2.509189, 4]]
stoke = [[53.037375, -2.262903, 5],
         [53.031031, -2.199587, 6]]
birmingham = [[52.443368, -1.975714, 7],
              [52.429641, -1.902849, 8],
              [52.483326, -1.817483, 9]]
# Mix them all together
combined_list = [item for item in manchester]
for item in stoke:
    combined_list.append(item)
for item in birmingham:
    combined_list.append(item)
shuffle(combined_list)
# Build a matrix:
matrix = {}
for item in combined_list:
    for pair_item in combined_list:
        if item[2] != pair_item[2]:
            distance = crow_flies(item[0], item[1], pair_item[0], pair_item[1])
            matrix[(item[2], pair_item[2])] = distance
# pick a random starting location
available_locations = [combined_list[x][2] for x in range(len(combined_list))]
start_loc = choice(available_locations)
available_locations = [a for a in available_locations if a != start_loc]
all_clusters = []
single_cluster = []
single_cluster.append(start_loc)
# REPEATEDLY add items to our cluster until it cannot get larger, then start a
# new one
cluster_got_bigger = True
while available_locations:
    if cluster_got_bigger == True:
        cluster_got_bigger = False
        for loc in single_cluster:
            for item in available_locations:
                distance = matrix[(loc, item)]
                if distance < DISTANCE_CAP:
                    single_cluster.append(item)
                    available_locations = [a for a in available_locations if a != item]
                    cluster_got_bigger = True
    if cluster_got_bigger == False:
        all_clusters.append(single_cluster)
        single_cluster = []
        new_seed = choice(available_locations)
        single_cluster.append(new_seed)
        available_locations = [a for a in available_locations if a != new_seed]
        cluster_got_bigger = True
    if not available_locations:
        all_clusters.append(single_cluster)
print(all_clusters)

Maybe my answer is too late, but a quick solution is to construct a network data structure from your cities and get the connected components of the graph:
Each city is a node
There is an edge between two cities if the distance between them is lower than some threshold
Finally, use a Python network module (e.g. NetworkX).
The code will be something like this:
import networkx as nx
graph = nx.Graph()
# Add all nodes (cities) to the graph
for i, city in enumerate(cities):
    graph.add_node(i)
# Add edges between cities that lie under a distance threshold
for i, city_one in enumerate(cities):
    for j, city_two in enumerate(cities):
        if j > i:
            link_exists = calculate_distance(city_one, city_two) < threshold
            if link_exists:
                graph.add_edge(i,j)
# A list of sets, each set has the indices of cities
components = list(nx.connected_components(graph))
The calculate_distance and threshold are supposed to be known, the first is a function and the second is the distance threshold.
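Since the question rules out libraries beyond numpy, scipy and scikit-learn, the same connected-components idea can also be sketched with a small union-find in plain Python. This is an illustrative variant, not part of the answer above: the names `haversine_km` and `connected_groups` and the 6371 km radius are assumptions, and the sample data are the `[lat, lon, id]` points and the 20 km cap from the earlier answer.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def connected_groups(cities, max_km):
    """Group [lat, lon, id] entries; cities closer than max_km end up linked."""
    parent = list(range(len(cities)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(cities)):
        for j in range(i + 1, len(cities)):
            if haversine_km(*cities[i][:2], *cities[j][:2]) < max_km:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i, city in enumerate(cities):
        groups.setdefault(find(i), []).append(city[2])
    return sorted(sorted(g) for g in groups.values())


cities = [
    [53.486286, -2.251476, 1], [53.483586, -2.254534, 2],
    [53.475158, -2.248011, 3], [53.397161, -2.509189, 4],
    [53.037375, -2.262903, 5], [53.031031, -2.199587, 6],
    [52.443368, -1.975714, 7], [52.429641, -1.902849, 8],
    [52.483326, -1.817483, 9],
]
groups = connected_groups(cities, max_km=20)
```

With these inputs the Manchester, Stoke and Birmingham points should come back as three separate groups. The pairwise loop is still O(n**2); for large inputs the candidate pairs could come from a spatial index instead.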

how to find user location within 500 meters from given lat and long in python

I want to find a user location within 500 meters from given lat and long in Python.
Given lat & long = 19.114315,72.911174
And I want to check whether new lat and long is in the range of 500 meters from given lat and long..
new lat and long = 19.112398,72.912743
I am using this formula in python..
math.acos(math.sin(19.114315) * math.sin(19.112398) + math.cos(19.114315) * math.cos(19.112398) * math.cos(72.912743 - (72.911174))) * 6371 <= 0.500
But it's not giving me the expected results. Am I missing something?
please help..

You can use the Haversine formula to get the great-circle distance (along a sphere) between two points. There are some problems with treating the Earth as a sphere over large distances, but for 500 meters you're probably fine (assuming that you're not trying to drop medical packages on a boat or something).
from math import radians, sin, cos, asin, sqrt
def haversine(lat1, long1, lat2, long2, EARTH_RADIUS_KM=6372.8):
    # get distance between the points
    phi_Lat = radians(lat2 - lat1)
    phi_Long = radians(long2 - long1)
    lat1 = radians(lat1)
    lat2 = radians(lat2)
    a = sin(phi_Lat/2)**2 + \
        cos(lat1) * cos(lat2) * \
        sin(phi_Long/2)**2
    c = 2 * asin(sqrt(a))
    return EARTH_RADIUS_KM * c
If the distance between the two points is less than your threshold, it's within the range:
points_1 = (19.114315,72.911174)
points_2 = (19.112398,72.912743)
threshold_km = 0.5
distance_km = haversine(points_1[0], points_1[1], points_2[0], points_2[1])
if distance_km < threshold_km:
    print('within range')
else:
    print('outside range')

Hint: write reusable code, and clean up your math.
It seems you have an error right near the end of that formula:
math.cos(longRad - (longRad)) == math.cos(0) == 1
I'm not good with geo stuff, so I won't correct it.
import math
def inRangeRad(latRad, longRad):
    sinLatSqrd = math.sin(latRad) * math.sin(latRad)
    cosLatSqrd = math.cos(latRad) * math.cos(latRad)
    inner = sinLatSqrd + cosLatSqrd * math.cos(longRad - (longRad))
    return math.acos( inner ) * 6371 <= 0.500
def inRangeDeg(latDeg, longDeg):
    latRad = 0.0174532925 * latDeg
    longRad = 0.0174532925 * longDeg
    return inRangeRad(latRad, longRad)
print("test")
print("19.114315, 72.911174")
print(inRangeDeg(19.114315, 72.911174))

Be very careful with this! You cannot just use cos and sin for distances using GPS coordinates, because the distance will be incorrect!
https://en.wikipedia.org/wiki/Geodesy#Geodetic_problems
In the case of plane geometry (valid for small areas on the Earth's
surface) the solutions to both problems reduce to simple trigonometry.
On the sphere, the solution is significantly more complex, e.g., in
the inverse problem the azimuths will differ between the two end
points of the connecting great circle, arc, i.e. the geodesic.
Look into GeoPy for these calculations, you really do not want to implement that yourself.
