query = db.GqlQuery("SELECT * FROM Place "
"WHERE location >= :1 AND "
"location <= :2",
db.GeoPt(lat=minLat, lon=minLon),
db.GeoPt(lat=maxLat, lon=maxLon) )
From what I understand, GAE ignores the longitude in this case.
Is this true?
Short answer: Yes.
Long answer: GeoPt properties are sorted first by latitude, then by longitude. This query will find entities that fall between the two latitudes, only considering the longitudes if the latitudes are identical.
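As a rough analogy (this is plain Python tuple comparison, not the datastore's actual implementation), the "latitude first, then longitude" ordering behaves like comparing (lat, lon) tuples. The helper name in_range and the sample coordinates are made up for illustration:

```python
def in_range(point, low, high):
    # Tuples compare element by element: latitude first, and longitude
    # only when the latitudes are equal.
    return low <= point <= high

low = (10.0, 20.0)    # (min latitude, min longitude)
high = (30.0, 40.0)   # (max latitude, max longitude)

# Latitude is strictly inside the range, so the longitude is never looked at.
lat_inside_lon_outside = (20.0, 170.0)

# Latitude equals the lower bound, so the longitude decides the comparison.
lat_on_boundary = (10.0, 5.0)

print(in_range(lat_inside_lon_outside, low, high))
print(in_range(lat_on_boundary, low, high))
```

The first point is accepted even though its longitude is far outside the intended box, which is why a single range filter on a GeoPt cannot express a bounding box.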
Say I have a location A, identified by an IP address, and a geolocator API that gives me its latitude and longitude. Now I want to find all instances that are within a 25-mile radius of location A. How can I compute this in the fewest steps?
Solution A: I can compute the distance between location A and every instance in the database, and display the instances within the 25-mile radius. (Way too slow, especially if I want a dynamic location with a large database of locations.)
Solution B: I can group all instances by zip code in addition to IP and (lat, long), so that fewer distances between location A and the instances need to be computed. (Better, but if the IP address is at the border of another zip code, this adds to the amount of computation needed.)
Solution C: I can use trigonometry. Using the latitude and longitude of location A, I can find each instance within the 25-mile radius.
Can someone please describe a better way of comparing distances? Ideas and suggestions are much appreciated (if further explanation is needed, please ask). Thanks.
I'd use a combination of your proposed solutions A and C. You can query your database directly using a filter that selects only the locations within a 25-mile radius (or any other radius). Calculating the longitudinal difference in miles is a little tricky because the length of one degree of longitude varies with latitude. Kudos go out to this explanation: https://gis.stackexchange.com/questions/142326/calculating-longitude-length-in-miles#142327
Assuming you have the following DB schema with existing locations (only latitude and longitude as columns):
CREATE TABLE location (
lat REAL,
lon REAL
);
You're able to filter only the locations within a 25 mile radius using this query:
query = """
SELECT (lat - ?) AS difflat,
(lon - ?) AS difflon
FROM location
WHERE POWER(POWER(difflat * 69.172, 2) + POWER(difflon * COS(lat*3.14/180.) * 69.172, 2), 0.5) < ?;
"""
Then use the query like this:
radius = 25 #miles
cursor.execute(query, (querylocation['lat'], querylocation['lon'], radius))
SQLite builds without the built-in math functions (they were added in version 3.35 and are optional at compile time) don't provide COS and POWER, but they can easily be registered from Python:
import math
import sqlite3

con = sqlite3.connect(db_path)
# Register Python implementations under the SQL names used in the query
con.create_function('POWER', 2, math.pow)
con.create_function('COS', 1, math.cos)
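For reference, the distance expression in the WHERE clause above can be sketched as a plain Python function. This is only an illustration of the same flat-earth approximation, reasonable for short distances such as 25 miles; the function name approx_distance_miles and the argument names are made up here, and the 69.172 constant is taken from the query:

```python
import math

MILES_PER_DEGREE = 69.172  # miles per degree, the constant used in the SQL query

def approx_distance_miles(lat, lon, query_lat, query_lon):
    """Approximate distance in miles between a stored row and the query point."""
    difflat = lat - query_lat
    difflon = lon - query_lon
    north_south = difflat * MILES_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    east_west = difflon * math.cos(math.radians(lat)) * MILES_PER_DEGREE
    return math.hypot(north_south, east_west)

# One degree of latitude, and one degree of longitude at 60 degrees north.
print(approx_distance_miles(1.0, 0.0, 0.0, 0.0))
print(approx_distance_miles(60.0, 1.0, 60.0, 0.0))
```

A function like this is handy for spot-checking a few rows returned by the SQL filter.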
I am working with an SQLite database that stores a hexagon index along with other information; the hexagon index is the primary key. The database is generated by code written in Python, and other programs written in C use the hexagon index to access the information stored in the database.
for res_hex in [12, 11, 10, 9, 8]:
    index_hex = h3.geo_to_h3(sonde[1], sonde[0], res_hex)
Here sonde[1] is the latitude, sonde[0] is the longitude, and res_hex is the resolution.
I have a list of objects, given by their latitude and longitude in a text file. I calculate the hexagon indexes for them at different resolutions (8 to 12) and insert these into the database.
My problem is that when I calculate the hexagon in C from the latitude, longitude and resolution, I do not find it in the database, even though the calculation is based on the same file.
GeoCoord geo = {latitude, longitude};
H3Index currentIndex = geoToH3(&geo, resolution);
Thanks for your help
I have found a solution: in the C API, latitude and longitude must be given in radians (the C library provides degsToRads for the conversion), whereas the Python binding takes degrees.
I have a table in a Postgres DB containing places and their corresponding latitude and longitude values.
Places (id, name, lat, lng) # primary-key(id)
I will get an input of a pair of latitudes and longitudes forming a rectangle.
{
"long_ne": 12.34,
"lat_ne": 34.45,
"long_sw": 15.34,
"lat_sw": 35.56
}
I want to get all the rows that fall inside the rectangle.
The rows can't be sorted based on their lat-lng values as that will cause trouble while inserting new values.
What would be the best way to go about solving this to optimize queries to get the result?
I can obviously do it using a WHERE clause, but would that be the most optimized solution? The table will contain a massive number of rows; is there a way to optimize this query to speed up the result?
Is this what you want?
select *
from places
where
lat between :lat_sw and :lat_ne
and lng between :long_sw and :long_ne
Where :lat_sw, :lat_ne, :long_sw and :long_ne are the input parameters to the query. The south-west values come first because between expects the smaller bound first.
This gives you all rows whose latitude and longitude fall within the rectangle's boundaries. Note that between treats the interval as inclusive of its bounds on both ends. You can change this as needed with inequality conditions; for example, this makes the match inclusive on the lower bound and exclusive on the upper bound:
where
lat >= :lat_sw and lat < :lat_ne
and lng >= :long_sw and lng < :long_ne
Won't a simple SQL query suffice here?
SELECT *
FROM Places
WHERE (lat > :lat_sw)
AND (lng > :long_sw)
AND (lat < :lat_ne)
AND (lng < :long_ne)
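On the optimization part of the question, one common approach is to leave the rows unsorted and add an index on the coordinate columns, so the table order does not matter for inserts. Below is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Postgres (an assumption made only to keep the example self-contained). The index name places_lat_lng, the helper places_in_box and the sample rows are hypothetical, and the sketch does not handle boxes that cross the 180-degree meridian:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE places (id INTEGER PRIMARY KEY, name TEXT, lat REAL, lng REAL)"
)
# Composite index so the range filter does not have to scan every row.
con.execute("CREATE INDEX places_lat_lng ON places (lat, lng)")
con.executemany(
    "INSERT INTO places (name, lat, lng) VALUES (?, ?, ?)",
    [
        ("inside", 35.0, 13.0),
        ("too far north", 40.0, 13.0),
        ("too far east", 35.0, 20.0),
    ],
)

def places_in_box(con, lat_sw, lng_sw, lat_ne, lng_ne):
    """Return the names of all places inside the south-west/north-east box."""
    sql = (
        "SELECT name FROM places "
        "WHERE lat BETWEEN :lat_sw AND :lat_ne "
        "AND lng BETWEEN :lng_sw AND :lng_ne"
    )
    params = {"lat_sw": lat_sw, "lng_sw": lng_sw, "lat_ne": lat_ne, "lng_ne": lng_ne}
    return [row[0] for row in con.execute(sql, params)]

names = places_in_box(con, 34.0, 12.0, 36.0, 15.0)
print(names)
```

Whether a plain index like this is fast enough depends on the data; checking the query plan on the real table is the safest way to find out.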
I am new to pandas. I have a csv file which has latitude and longitude columns and also a tile ID column; the file has around 1 million rows. I have a list of around a hundred tile IDs and want to get the latitude and longitude coordinates for these tile IDs. Currently I have:
good_tiles_str = [str(q) for q in good_tiles]  # convert list elements to strings
file['tile'] = file.tile.astype(str)  # convert the tile column to strings
for x in good_tiles_str:
    lat = file.loc[file['tile'].str.contains(x), 'BL_Latitude']  # latitude coordinates
    long = file.loc[file['tile'].str.contains(x), 'BL_Longitude']  # longitude coordinates
    print(lat)
    print(long)
This method is very slow, and I know it is not the correct way, as I have heard you should not use for loops like this with pandas. It also does not work: it doesn't find all the latitude and longitude points for the tile IDs.
Any help would be greatly appreciated.
As far as I understand your question, there is no need to iterate over the rows explicitly.
If you want to assign a value where a condition holds, you can do so directly. Here's one way using numpy.where; ~ negates a boolean mask.
import numpy as np

rule1= file['tile'].str.contains(x)
rule2= file['tile'].str.contains(x)
file['flag'] = np.where(rule1 , 'BL_Latitude', " " )
file['flag'] = np.where(rule2 & ~rule1, 'BL_Longitude', file['flag'])
Try this:
search_for = '|'.join(good_tiles_str)
good = file[file.tile.str.contains(search_for)]
good = good[['BL_Latitude', 'BL_Longitude']].drop_duplicates()
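If the tile IDs should match exactly rather than as substrings, a variation using Series.isin may be a better fit. str.contains treats the joined string as a regular expression and also matches partial IDs, so tile "101" would match "1012". A minimal sketch with made-up sample data, assuming the column names from the question:

```python
import pandas as pd

# Hypothetical sample standing in for the 1-million-row csv file.
file = pd.DataFrame(
    {
        "tile": [101, 1012, 205, 300],
        "BL_Latitude": [51.1, 51.2, 51.3, 51.4],
        "BL_Longitude": [-0.1, -0.2, -0.3, -0.4],
    }
)
good_tiles = [101, 205]

# Compare as strings on both sides so the dtypes cannot silently mismatch.
good_tiles_str = [str(q) for q in good_tiles]
mask = file["tile"].astype(str).isin(good_tiles_str)

good = file.loc[mask, ["tile", "BL_Latitude", "BL_Longitude"]]
print(good)
```

isin does a set-membership test over the whole column at once, so no Python-level loop is needed.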
I've got some data (NOAA-provided weather forecasts) I'm trying to work with. There are various data series (temperature, humidity, etc.), each of which contains a series of data points and indexes into an array of datetimes, on various time scales (some series are hourly, others 3-hourly, some daily). Is there any sort of library for dealing with data like this and accessing it in a user-friendly way?
Ideal usage would be something like:
db = TimeData()
db.set_val('2010-12-01 12:00','temp',34)
db.set_val('2010-12-01 15:00','temp',37)
db.set_val('2010-12-01 12:00','wind',5)
db.set_val('2010-12-01 13:00','wind',6)
db.query('2010-12-01 13:00') # {'wind':6, 'temp':34}
Basically the query would return the most recent value of each series.
I looked at scikits.timeseries, but it isn't very amenable to this use case, due to the amount of pre-computation involved (it expects all the data in one shot, no random-access setting).
If your data is sorted you can use the bisect module to quickly get the entry with the greatest time less than or equal to the specified time.
Something like:
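from bisect import bisect_right

i = bisect_right(times, time)
# times[j] <= time for j < i
# times[j] > time for j >= i
if times[i-1] == time:
    # exact match
    value = values[i-1]
else:
    # interpolate (here: midpoint of the two neighbouring values;
    # assumes 0 < i < len(times))
    value = (values[i-1] + values[i]) / 2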
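The bisect lookup above can be wrapped into the interface sketched in the question. This is only a minimal sketch: the class layout, the FMT timestamp format and the choice to return the most recent value instead of interpolating are assumptions based on the question's example, not an existing library:

```python
from bisect import bisect_right
from datetime import datetime

class TimeData:
    """Keeps one sorted list of timestamps per series for bisect lookups."""

    FMT = "%Y-%m-%d %H:%M"  # assumed timestamp format, as in the question

    def __init__(self):
        self._times = {}   # series name -> sorted list of datetimes
        self._values = {}  # series name -> values aligned with _times

    def set_val(self, when, series, value):
        t = datetime.strptime(when, self.FMT)
        times = self._times.setdefault(series, [])
        values = self._values.setdefault(series, [])
        i = bisect_right(times, t)
        if i > 0 and times[i - 1] == t:
            values[i - 1] = value      # overwrite an existing data point
        else:
            times.insert(i, t)         # insert while keeping the list sorted
            values.insert(i, value)

    def query(self, when):
        """Return the most recent value of each series at or before `when`."""
        t = datetime.strptime(when, self.FMT)
        result = {}
        for series, times in self._times.items():
            i = bisect_right(times, t)
            if i > 0:                  # skip series with no data yet
                result[series] = self._values[series][i - 1]
        return result

db = TimeData()
db.set_val('2010-12-01 12:00', 'temp', 34)
db.set_val('2010-12-01 15:00', 'temp', 37)
db.set_val('2010-12-01 12:00', 'wind', 5)
db.set_val('2010-12-01 13:00', 'wind', 6)
print(db.query('2010-12-01 13:00'))
```

Inserting into a Python list is linear in the list length, which should be fine for forecast-sized series but would need a different structure for very large ones.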
SQLite has built-in date and time functions (it stores the values as text, real or integer). You can also convert all the times to seconds since the epoch (for example with calendar.timegm() or time.mktime()), which makes comparisons trivial.
This is a classic row-to-column problem; in a good SQL DBMS you can use unions:
SELECT MAX(d_t) AS d_t, SUM(temp) AS temp, SUM(wind) AS wind, ... FROM (
    (SELECT d_t, 0 AS temp, value AS wind FROM table
     WHERE type='wind' AND d_t <= some_date
     ORDER BY d_t DESC LIMIT 1)
    UNION
    (SELECT d_t, value, 0 FROM table
     WHERE type='temp' AND d_t <= some_date
     ORDER BY d_t DESC LIMIT 1)
    UNION
    ...
) q1;
The trick is to make a subquery for each dimension while providing placeholder columns for the other dimensions. In Python you can use SQLAlchemy to dynamically generate a query like this.
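A minimal sketch of generating such a query dynamically, using the standard sqlite3 module instead of SQLAlchemy to keep it self-contained. The table name readings, the helper latest_values and the c0, c1, ... column aliases are made up for illustration. The filter d_t <= as_of assumes "most recent value at or before the query time" is what is wanted, and each branch is wrapped in its own sub-select because SQLite does not accept ORDER BY and LIMIT directly inside a UNION branch:

```python
import sqlite3

def latest_values(con, series, as_of):
    """Return {series name: most recent value at or before as_of}."""
    branches, params = [], []
    for i, name in enumerate(series):
        # This branch fills column c<i> and pads every other column with 0.
        cols = ", ".join(
            f"{'value' if j == i else '0'} AS c{j}" for j in range(len(series))
        )
        branches.append(
            f"SELECT * FROM (SELECT d_t, {cols} FROM readings "
            "WHERE type = ? AND d_t <= ? ORDER BY d_t DESC LIMIT 1)"
        )
        params += [name, as_of]
    sums = ", ".join(f"SUM(c{j})" for j in range(len(series)))
    sql = f"SELECT {sums} FROM ({' UNION ALL '.join(branches)})"
    row = con.execute(sql, params).fetchone()
    return dict(zip(series, row))

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE readings (d_t TEXT, type TEXT, value REAL)")
con.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("2010-12-01 12:00", "temp", 34),
        ("2010-12-01 15:00", "temp", 37),
        ("2010-12-01 12:00", "wind", 5),
        ("2010-12-01 13:00", "wind", 6),
    ],
)
print(latest_values(con, ["temp", "wind"], "2010-12-01 13:00"))
```

Because the placeholder columns are zeros, a series with no data at all is indistinguishable from a series whose latest value is 0 whenever another series does have data; using NULL placeholders with MAX instead of SUM would be one way around that.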