I am plotting some maps using folium.
Works pretty smoothly.
However, I could not figure out how to pre-calculate the right zoom level.
I can let folium set it automatically:
import folium
m = folium.Map([point_of_interest.iloc[0].Lat, point_of_interest.iloc[0].Long])
but in my use case I would need to pre-calculate zoom_start such that:
all (Lat, Long) pairs from my pandas dataframe point_of_interest are within the map
the zoom level is the minimum possible
folium's fit_bounds method should work for you
Some random sample data
import folium
import numpy as np
import pandas as pd
center_point = [40, -90]
data = (
np.random.normal(size=(100, 2)) *
np.array([[.5, .5]]) +
np.array([center_point])
)
df = pd.DataFrame(data, columns=['Lat', 'Long'])
Creating a map with some markers
m = folium.Map(df[['Lat', 'Long']].mean().values.tolist())
for lat, lon in zip(df['Lat'], df['Long']):
    folium.Marker([lat, lon]).add_to(m)
fit_bounds requires the bounds of our data in the form of the southwest and northeast corners. There are also padding parameters (padding, padding_top_left, padding_bottom_right) you can use.
sw = df[['Lat', 'Long']].min().values.tolist()
ne = df[['Lat', 'Long']].max().values.tolist()
m.fit_bounds([sw, ne])
m
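If you need the number itself (for example to pass it as zoom_start) rather than letting the browser work it out, the zoom level can be estimated from the bounds with Web Mercator arithmetic. This is a rough sketch and not something folium provides: estimate_zoom is a hypothetical helper, the 256 px tile size is the usual slippy-map convention, and the viewport size in pixels is an assumption you have to supply yourself. It expects latitudes strictly between -90 and 90 and does not handle boxes that cross the antimeridian.

```python
import math

TILE_SIZE = 256  # assumption: standard 256 px Web Mercator tiles


def mercator_y(lat_deg):
    """Map a latitude to its Web Mercator position in [0, 1] (0 = north)."""
    s = math.sin(math.radians(lat_deg))
    return 0.5 - math.log((1 + s) / (1 - s)) / (4 * math.pi)


def estimate_zoom(sw, ne, width_px=800, height_px=600, max_zoom=18):
    """Largest integer zoom at which the box sw..ne fits in the viewport.

    sw and ne are [lat, lon] pairs, in the same order fit_bounds takes them.
    """
    lon_fraction = (ne[1] - sw[1]) / 360.0
    lat_fraction = abs(mercator_y(sw[0]) - mercator_y(ne[0]))

    zoom = max_zoom
    for fraction, pixels in ((lon_fraction, width_px), (lat_fraction, height_px)):
        if fraction > 0:
            # At zoom z the whole world spans TILE_SIZE * 2**z pixels.
            fit = math.floor(math.log2(pixels / (TILE_SIZE * fraction)))
            zoom = min(zoom, fit)
    return max(zoom, 0)
```

With the sw and ne lists computed above, the result of estimate_zoom(sw, ne) could be passed as zoom_start when creating the map. A single point (zero extent) simply returns max_zoom.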
Related
I am trying to plot the intersection between a buffer circle and the mesh blocks (or boundaries) within that circle of some radius (in this case, 80 km).
I got the intersection using sjoin() as follows:
intersection_MeshBlock = gpd.sjoin(buffer_df, rest_VIC, how='inner', predicate='intersects')
My buffer variable looks like this:
buffer_df
And the intersection looks like this:
intersection
The problem is I am not able to plot the intersection polygons.
Here is the plot I get after I plot it using the polygon plotting in folium:
for _, r in intersection_MeshBlock.iterrows():
    # Without simplifying the representation of each borough,
    # the map might not be displayed
    sim_geo = gpd.GeoSeries(r['geometry']).simplify(tolerance=0.00001)
    geo_j = sim_geo.to_json()
    geo_j = folium.GeoJson(data=geo_j,
                           style_function=lambda x: {'fillColor': 'orange'})
    folium.Popup(r['SA1_CODE21']).add_to(geo_j)
    geo_j.add_to(m)
m
Plot:
color filled maps
What am I doing wrong?
EDIT:
I might have solved the issue partially. Now I am able to plot the polygons inside some buffer radius. This is what my plot looks like:
If you see the image, you will realise that there are certain meshblocks that cross the circular boundary region. How do I get rid of everything which is outside that circular region?
have located some geometry for Melbourne to demonstrate
fundamentally, you want to use overlay() not sjoin()
generation of folium map is much simpler using GeoPandas 0.10 capability explore()
import geopandas as gpd
import numpy as np
import shapely.geometry
import folium
rest_VIC = gpd.read_file(
"https://raw.githubusercontent.com/codeforgermany/click_that_hood/main/public/data/melbourne.geojson"
)
# select a point randomly from total bounds of geometry
buffer_df = gpd.GeoDataFrame(
geometry=[
shapely.geometry.Point(
np.random.uniform(*rest_VIC.total_bounds[[0, 2]], size=1)[0],
np.random.uniform(*rest_VIC.total_bounds[[1, 3]], size=1)[0],
)
],
crs=rest_VIC.crs,
)
buffer_df = gpd.GeoDataFrame(
geometry=buffer_df.to_crs(buffer_df.estimate_utm_crs())
.buffer(8 * 10**3)
.to_crs(buffer_df.crs)
)
# need overlay not sjoin
intersection_MeshBlock = gpd.overlay(buffer_df, rest_VIC, how="intersection")
m = rest_VIC.explore(name="base", style_kwds={"fill":False}, width=400, height=300)
m = buffer_df.explore(m=m, name="buffer", style_kwds={"fill":False})
m = intersection_MeshBlock.explore(m=m, name="intersection", style_kwds={"fillColor":"orange"})
folium.LayerControl().add_to(m)
m
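The difference between the two operations can be seen without any GIS library. The sketch below is a toy model, not GeoPandas code: axis-aligned rectangles written as (minx, miny, maxx, maxy) tuples stand in for the mesh blocks and for the buffer, and the helper names are made up for illustration. A join on the intersects predicate keeps each matching block whole, while an intersection overlay returns only the part that lies inside the clipping shape.

```python
def intersects(a, b):
    """True if rectangles a and b share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def intersection(a, b):
    """The rectangle covered by both a and b, or None if they do not overlap."""
    if not intersects(a, b):
        return None
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def area(r):
    """Area of a (minx, miny, maxx, maxy) rectangle."""
    return (r[2] - r[0]) * (r[3] - r[1])


blocks = [(0, 0, 2, 2), (1, 1, 5, 5), (6, 6, 8, 8)]  # stand-ins for mesh blocks
clip = (0, 0, 3, 3)                                   # stand-in for the buffer

# sjoin-like: whole blocks that overlap the clip area, parts outside included
joined = [b for b in blocks if intersects(b, clip)]

# overlay-like: only the pieces that lie inside the clip area
clipped = [intersection(b, clip) for b in blocks if intersects(b, clip)]

print(joined)   # [(0, 0, 2, 2), (1, 1, 5, 5)]
print(clipped)  # [(0, 0, 2, 2), (1, 1, 3, 3)]
```

The second block is returned whole by the join but trimmed to (1, 1, 3, 3) by the intersection, which is the behaviour you want for polygons that cross the circular boundary.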
I want to do a spatial binning (using median as aggregation function)
starting from a CSV file containing pollutant values measured at positions long and lat.
The resulting map should look something like this:
But for data covering a city's extent.
In this regard I found this tutorial, which is close to what I want to do, but I was not able to get the desired result.
I think that I'm missing something on how to correctly use dissolve and plot the resulting data (better using Folium)
Any useful example code?
you have not provided sample data. So I have used global earthquakes as set of points and geometry of California for scope / extent
it's simple to create grid using shapely.geometry.box()
I have shown use of median and also another aggfunc to demonstrate multiple metrics can be calculated
have used folium to plot. This feature is new in geopandas 0.10.0 https://geopandas.org/en/stable/docs/user_guide/interactive_mapping.html
import geopandas as gpd
import shapely.geometry
import numpy as np
# equivalent of CSV, all earthquake points globally
gdf_e = gpd.read_file(
"https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.geojson"
)
# get geometry of bounding area. Have selected a state rather than a city
gdf_CA = gpd.read_file(
"https://raw.githubusercontent.com/glynnbird/usstatesgeojson/master/california.geojson"
).loc[:, ["geometry"]]
BOXES = 50
a, b, c, d = gdf_CA.total_bounds
# create a grid for California, could be a city
gdf_grid = gpd.GeoDataFrame(
geometry=[
shapely.geometry.box(minx, miny, maxx, maxy)
for minx, maxx in zip(np.linspace(a, c, BOXES), np.linspace(a, c, BOXES)[1:])
for miny, maxy in zip(np.linspace(b, d, BOXES), np.linspace(b, d, BOXES)[1:])
],
crs="epsg:4326",
)
# remove grid boxes created outside actual geometry
gdf_grid = gdf_grid.sjoin(gdf_CA).drop(columns="index_right")
# get earthquakes that have occurred within one of the grid geometries
gdf_e_CA = gdf_e.loc[:, ["geometry", "mag"]].sjoin(gdf_grid)
# get median magnitude of earthquakes in grid
gdf_grid = gdf_grid.join(
gdf_e_CA.dissolve(by="index_right", aggfunc="median").drop(columns="geometry")
)
# how many earthquakes in the grid
gdf_grid = gdf_grid.join(
gdf_e_CA.dissolve(by="index_right", aggfunc=lambda d: len(d))
.drop(columns="geometry")
.rename(columns={"mag": "number"})
)
# drop grids geometries that have no measures and create folium map
m = gdf_grid.dropna().explore(column="mag")
# for good measure - boundary on map too
gdf_CA["geometry"].apply(lambda g: shapely.geometry.MultiLineString([p.exterior for p in g.geoms])).explore(m=m)
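When the grid is a plain rectangular one, the binning step can also be sketched without a spatial join. This is a swapped-in technique, not what the code above does: each point's cell index is computed arithmetically and the median is taken with a pandas groupby. The column names (long, lat, value), the helper name bin_median and the sample numbers are all made up for illustration, and the data extent is assumed to be non-degenerate in both directions.

```python
import pandas as pd


def bin_median(df, n_cells, x="long", y="lat", value="value"):
    """Median and count of `value` per cell of an n_cells x n_cells grid."""
    xmin, xmax = df[x].min(), df[x].max()
    ymin, ymax = df[y].min(), df[y].max()
    # Cell index along each axis; clip so points on the max edge
    # land in the last cell instead of one past it.
    ix = ((df[x] - xmin) / (xmax - xmin) * n_cells).astype(int).clip(upper=n_cells - 1)
    iy = ((df[y] - ymin) / (ymax - ymin) * n_cells).astype(int).clip(upper=n_cells - 1)
    return (
        df.assign(ix=ix, iy=iy)
        .groupby(["ix", "iy"])[value]
        .agg(["median", "count"])
        .reset_index()
    )


# Hypothetical sample: five measurements binned on a 2 x 2 grid
sample = pd.DataFrame(
    {
        "long": [0.0, 0.1, 0.2, 1.0, 0.9],
        "lat": [0.0, 0.1, 0.2, 1.0, 0.9],
        "value": [1.0, 2.0, 9.0, 4.0, 6.0],
    }
)
result = bin_median(sample, n_cells=2)
print(result)
```

Only cells that contain at least one point appear in the result. The cell polygons needed for a map can be rebuilt from ix and iy, so this approach trades the boundary clipping of the sjoin version for fewer dependencies.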
Thanks to #Rob Raymond,
finally solved with the following code:
import pandas as pd
import geopandas as gpd
import pyproj
import matplotlib.pyplot as plt
import numpy as np
import shapely
from folium import plugins
df=pd.read_csv('../Desktop/test_esri.csv')
gdf_monica = gpd.GeoDataFrame(
df, geometry=gpd.points_from_xy(df.long, df.lat))
gdf_monica=gdf_monica.set_crs('epsg:4326')
gdf_area = gpd.read_file('https://raw.githubusercontent.com/openpolis/geojson-italy/master/geojson/limits_IT_municipalities.geojson')#.loc[:, ["geometry"]]
gdf_area =gdf_area[gdf_area['name']=='Portici'].loc[:,['geometry']]
BOXES = 50
a, b, c, d = gdf_area.total_bounds
gdf_grid = gpd.GeoDataFrame(
geometry=[
shapely.geometry.box(minx, miny, maxx, maxy)
for minx, maxx in zip(np.linspace(a, c, BOXES), np.linspace(a, c, BOXES)[1:])
for miny, maxy in zip(np.linspace(b, d, BOXES), np.linspace(b, d, BOXES)[1:])
],
crs="epsg:4326",
)
# remove grid boxes created outside actual geometry
gdf_grid = gdf_grid.sjoin(gdf_area).drop(columns="index_right")
gdf_monica_binned = gdf_monica.loc[:, ["geometry", "CO"]].sjoin(gdf_grid)
# get median magnitude of CO pollutant
gdf_grid = gdf_grid.join(
gdf_monica_binned.dissolve(by="index_right", aggfunc="median").drop(columns="geometry")
)
# how many CO measurements in each grid cell
gdf_grid = gdf_grid.join(
gdf_monica_binned.dissolve(by="index_right", aggfunc=lambda d: len(d))
.drop(columns="geometry")
.rename(columns={"CO": "number"})
)
# drop grids geometries that have no measures and create folium map
m = gdf_grid.dropna().explore(column="CO")
# for good measure - boundary on map too
gdf_area["geometry"].apply(lambda g: shapely.geometry.MultiLineString([p.exterior for p in g.geoms])).explore(m=m)
which produces:
As you can understand, I have little or no knowledge regarding spatial analysis. I was not able to get correct results without using geojson data that describe a geometry within which the points of interest fall.
If anyone could add more insights... thanks!
I want to convert a pandas DataFrame to a spatially enabled GeoPandas one:
import pandas as pd
import geopandas
df=pd.read_csv('../Desktop/test_esri.csv')
df.head()
Then converted using:
gdf = geopandas.GeoDataFrame(
df, geometry=geopandas.points_from_xy(df.long, df.lat))
from pyproj import crs
crs_epsg = crs.CRS.from_epsg(4326)
gdf=gdf.set_crs('epsg:4326')
Then I want to superimpose a spatial grid:
import numpy as np
import shapely
from pyproj import crs
# total area for the grid
xmin, ymin, xmax, ymax= gdf.total_bounds
# how many cells across and down
n_cells=30
cell_size = (xmax-xmin)/n_cells
# projection of the grid
# crs = "+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs"
# create the cells in a loop
grid_cells = []
for x0 in np.arange(xmin, xmax+cell_size, cell_size ):
    for y0 in np.arange(ymin, ymax+cell_size, cell_size):
        # bounds
        x1 = x0-cell_size
        y1 = y0+cell_size
        grid_cells.append( shapely.geometry.box(x0, y0, x1, y1) )
cell = geopandas.GeoDataFrame(grid_cells, columns=['geometry'],
                              crs=crs.CRS('epsg:4326'))
Then merge the grid with geopandas dataframe:
merged = geopandas.sjoin(gdf, cell, how='left', predicate='within')
To finally compute the desired metric inside "dissolve":
# Compute stats per grid cell -- aggregate fires to grid cells with dissolve
dissolve = merged.dissolve(by="index_right", aggfunc="median")
But I think I did something wrong with the "cell" grid and I can't figure it out!!
An extract of the CSV file used can be found here.
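One thing that stands out in the loop above is x1 = x0-cell_size, which makes every box extend to the left of x0 while y1 extends upward; whether that is the actual cause of the problem is not established here. A minimal sketch of building cell bounds with both corners moving in the positive direction follows. It uses plain tuples rather than shapely boxes, and grid_bounds is a hypothetical helper name.

```python
import math


def grid_bounds(xmin, ymin, xmax, ymax, n_cells):
    """Return (minx, miny, maxx, maxy) tuples for square cells covering the extent.

    The cell size is derived from the x extent, as in the question.
    """
    cell_size = (xmax - xmin) / n_cells
    n_rows = math.ceil((ymax - ymin) / cell_size)
    cells = []
    for i in range(n_cells):
        for j in range(n_rows):
            x0 = xmin + i * cell_size
            y0 = ymin + j * cell_size
            cells.append((x0, y0, x0 + cell_size, y0 + cell_size))
    return cells


cells = grid_bounds(0.0, 0.0, 3.0, 2.0, n_cells=3)
print(len(cells))  # 6
print(cells[0])    # (0.0, 0.0, 1.0, 1.0)
```

Each tuple is in the (minx, miny, maxx, maxy) order that shapely.geometry.box expects, so the geometries could be created with box(*cell).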
I have plotted a heatmap with the following data.
I have thousands of rows. It's just a sample. I also wanted to see the Google Maps view of each coordinate, so I did something like this.
import folium
from folium.plugins import HeatMap
from folium.plugins import FastMarkerCluster
default_location=[11.1657, 45.4515]
m = folium.Map(location=default_location, zoom_start=13)
heat_data = [[row['lat'],row['lon']] for index, row in test.iterrows()]
# Plot it on the map
HeatMap(heat_data).add_to(m)
callback = ('function (row) {'
'var marker = L.marker(new L.LatLng(row[0], row[1]), {color: "red"});'
'var icon = L.AwesomeMarkers.icon({'
"icon: 'info-sign',"
"iconColor: 'white',"
"markerColor: 'green',"
"prefix: 'glyphicon',"
"extraClasses: 'fa-rotate-0'"
'});'
'marker.setIcon(icon);'
"var popup = L.popup({maxWidth: '300'});"
"const display_text = {text1: row[0], text2: row[1]};"
"var mytext = $(`<div id='mytext' class='display_text' style='width: 100.0%; height: 100.0%;'>\
<a href=https://www.google.com/maps?ll=${display_text.text1},${display_text.text2} target='_blank'>Open Google Maps</a></div>`)[0];"
"popup.setContent(mytext);"
"marker.bindPopup(popup);"
'return marker};')
m.add_child(FastMarkerCluster(heat_data, callback=callback))
# Display the map
m
Now for every gps coordinate I want to plot a small arrow or few small arrows in the angle of heading_direction and if possible show the distance_of_item in that angle from the gps coordinate. The expected outcome may be something like this.
In the above image, the location pointer is the gps coordinate, the direction and angle would be according to heading direction angle and there is a little star plotted which is the object. The object should be placed at a distance(in meters) mentioned in the dataset. I am not sure how to achieve that. Any lead or suggestions are most welcome. Thanks!
given your sample data is an image, have used alternate GPS data (UK hospitals) then added distance and direction columns as random values
given requirement is to plot a marker at location defined by distance and direction, first step is to calculate GPS co-ordinates of this.
use UTM CRS so that distance is meaningful
use high school maths to calculate x and y in UTM CRS
convert CRS back to WGS 84 to get GPS co-ordinates
you have tagged question as plotly so I have used mapbox line and scatter traces to demonstrate building a tiled map
sample data is 1200+ hospitals, performance is decent
geopandas data frame could also be used to build folium tiles / markers. Key step is calculating the GPS co-ordinates
import geopandas as gpd
import pandas as pd
import numpy as np
import shapely
import math
import plotly.express as px
import plotly.graph_objects as go
import io, requests
# get some public addresses - hospitals. data that has GPS lat / lon
dfhos = pd.read_csv(io.StringIO(requests.get("http://media.nhschoices.nhs.uk/data/foi/Hospital.csv").text),
sep="¬",engine="python",).loc[:, ["OrganisationName", "Latitude", "Longitude"]]
# debug with fewer records
# df = dfhos.loc[0:500]
df = dfhos
# to use CRS transformations use geopandas, initial data is WGS 84, transform to UTM geometry
# directions and distances are random
gdf = gpd.GeoDataFrame(
data=df.assign(
heading_direction=lambda d: np.random.randint(0, 360, len(d)),
distance_of_item=lambda d: np.random.randint(10 ** 3, 10 ** 4, len(d)),
),
geometry=df.loc[:, ["Longitude", "Latitude"]].apply(
lambda r: shapely.geometry.Point(r["Longitude"], r["Latitude"]), axis=1
),
crs="EPSG:4326",
).pipe(lambda d: d.to_crs(d.estimate_utm_crs()))
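# standard high school geometry...
# note: alpha is treated as a mathematical angle, measured
# counter-clockwise from the x axis (east), not as a compass bearing
def new_point(point, d, alpha):
    alpha = math.radians(alpha)
    return shapely.geometry.Point(
        point.x + (d * math.cos(alpha)),
        point.y + (d * math.sin(alpha)),
    )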
# calculate points based on direction and distance in UTM CRS. Then convert back to WGS 84 CRS
gdf["geometry2"] = gpd.GeoSeries(
gdf.apply(
lambda r: new_point(
r["geometry"], r["distance_of_item"], r["heading_direction"]
),
axis=1,
),
crs=gdf.geometry.crs,
).to_crs("EPSG:4326")
gdf = gdf.to_crs("EPSG:4326")
# plot lines to show start point and direct. plot markers of destinations for text of distance, etc
fig = px.line_mapbox(
lon=np.stack(
[gdf.geometry.x.values, gdf.geometry2.x.values, np.full(len(gdf), np.nan)],
axis=1,
).reshape([1, len(gdf) * 3])[0],
lat=np.stack(
[gdf.geometry.y.values, gdf.geometry2.y.values, np.full(len(gdf), np.nan)],
axis=1,
).reshape([1, len(gdf) * 3])[0],
).add_traces(
px.scatter_mapbox(
gdf,
lat=gdf.geometry2.y,
lon=gdf.geometry2.x,
hover_data=["distance_of_item", "OrganisationName"],
).data
)
# c = gdf.loc[]
fig.update_layout(mapbox={"style": "open-street-map", "zoom": 8, 'center': {'lat': 52.2316838387109, 'lon': -1.4577750831062155}}, margin={"l":0,"r":0,"t":0,"b":0})
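If heading_direction is a compass bearing (degrees clockwise from north) rather than a mathematical angle, the destination can also be computed directly in latitude and longitude. The following is a sketch of the spherical destination-point formula, which is a different technique from the UTM approach above. The function name and the mean Earth radius constant are assumptions, and treating the Earth as a sphere introduces a small error compared with an ellipsoidal calculation.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumption: mean Earth radius, spherical model


def destination(lat, lon, bearing_deg, distance_m):
    """Point reached from (lat, lon) after distance_m along a compass bearing."""
    phi1 = math.radians(lat)
    lam1 = math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance in radians

    phi2 = math.asin(
        math.sin(phi1) * math.cos(delta)
        + math.cos(phi1) * math.sin(delta) * math.cos(theta)
    )
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    # Longitude is not normalised to [-180, 180] here.
    return math.degrees(phi2), math.degrees(lam2)
```

Applied row by row to the Latitude, Longitude, heading_direction and distance_of_item columns, this would give the star position for each GPS point without a CRS round trip.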
I'm trying to create a choropleth in Python 3 using shapely, fiona & bokeh for display.
I have a file with about 7000 lines that have the location of a town and a counter.
Example:
54.7604;9.55827;208
54.4004;9.95918;207
53.8434;9.95271;203
53.5979;10.0013;201
53.728;10.2526;197
53.646;10.0403;196
54.3977;10.1054;193
52.4385;9.39217;193
53.815;10.3476;192
...
I want to show these in a 12.5 km grid, for which a shapefile is available on
https://opendata-esri-de.opendata.arcgis.com/datasets/3c1f46241cbb4b669e18b002e4893711_0
The code I have works.
It's very slow, because it's a brute force algorithm that checks each of the 7127 grid points against all of the 7000 points.
import pandas as pd
import fiona
from shapely.geometry import Polygon, Point, MultiPoint, MultiPolygon
from shapely.prepared import prep
sf = r'c:\Temp\geo_de\Hexagone_125_km\Hexagone_125_km.shp'
shp = fiona.open(sf)
district_xy = [ [ xy for xy in feat["geometry"]["coordinates"][0]] for feat in shp]
district_poly = [ Polygon(xy) for xy in district_xy] # coords to Polygon
df_p = pd.read_csv('points_file.csv', sep=';', header=None)
df_p.columns = ('lat', 'lon', 'count')
map_points = [Point(x,y) for x,y in zip(df_p.lon, df_p.lat)] # Convert Points to Shapely Points
all_points = MultiPoint(map_points) # all points
def calc_points_per_poly(poly, points, values): # Returns total for poly
    poly = prep(poly)
    return sum([v for p, v in zip(points, values) if poly.contains(p)])
# this is the slow part
# for each shape this sums up the points
sum_hex = [calc_points_per_poly(x, all_points, df_p['count']) for x in district_poly]
Since this is extremely slow, I'm wondering if there is a faster way to get the sum_hex value, especially since the real-world list of points may be a lot larger, and a smaller grid with more shapes would deliver a better result.
I would recommend using 'geopandas' and its built-in rtree spatial index. It allows you to do the check only if there is a possibility that point lies within polygon.
import pandas as pd
import geopandas as gpd
from shapely.geometry import Polygon, Point
sf = 'Hexagone_125_km.shp'
shp = gpd.read_file(sf)
df_p = pd.read_csv('points_file.csv', sep=';', header=None)
df_p.columns = ('lat', 'lon', 'count')
gdf_p = gpd.GeoDataFrame(df_p, geometry=[Point(x,y) for x,y in zip(df_p.lon, df_p.lat)])
sum_hex = []
spatial_index = gdf_p.sindex
for index, row in shp.iterrows():
    polygon = row.geometry
    possible_matches_index = list(spatial_index.intersection(polygon.bounds))
    possible_matches = gdf_p.iloc[possible_matches_index]
    precise_matches = possible_matches[possible_matches.within(polygon)]
    sum_hex.append(sum(precise_matches['count']))
shp['sum'] = sum_hex
This solution should be faster than yours. You can then plot your GeoDataFrame via Bokeh. If you want more details on spatial indexing I recommend this article by Geoff Boeing: https://geoffboeing.com/2016/10/r-tree-spatial-index-python/
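The idea behind the index is a two-stage test: a cheap bounding-box check discards most points, and the exact point-in-polygon test runs only on the survivors. The sketch below shows that pattern in plain Python with a ray-casting test. It is a toy stand-in and not an R-tree (it still looks at every point for each polygon), and all names are made up for illustration.

```python
def bounds(polygon):
    """Bounding box (minx, miny, maxx, maxy) of a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def contains(polygon, px, py):
    """Ray-casting point-in-polygon test (points exactly on an edge are not handled)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def sum_in_polygon(polygon, points):
    """Sum the counts of (x, y, count) points that fall inside the polygon."""
    minx, miny, maxx, maxy = bounds(polygon)
    total = 0
    for x, y, count in points:
        # Stage 1: cheap rejection by bounding box
        if not (minx <= x <= maxx and miny <= y <= maxy):
            continue
        # Stage 2: exact test only for the remaining candidates
        if contains(polygon, x, y):
            total += count
    return total


triangle = [(0, 0), (4, 0), (0, 4)]
points = [(1, 1, 10), (3, 3, 20), (5, 5, 30)]
print(sum_in_polygon(triangle, points))  # 10
```

Here (5, 5) is rejected by the bounding box alone, (3, 3) passes the box but fails the exact test, and only (1, 1) contributes. A real spatial index goes further by organising the boxes in a tree so that most candidates are never visited at all.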
I want to create a visualization on a map using folium. In the map I want to observe how many items are related to a particular geographical point building a heatmap. Below is the code I'm using.
import pandas as pd
import folium
from folium import plugins
data = [[41.895278,12.482222,2873494.0,20.243001,20414,7.104243],
[41.883850,12.333330,3916.0,0.835251,4,1.021450],
[41.854241,12.567000,22263.0,1.132390,35,1.572115],
[41.902147,12.590388,19505.0,0.839181,37,1.896950],
[41.994240,12.48520,16239.0,1.383981,25,1.539504]]
df = pd.DataFrame(columns=['latitude','longitude','population','radius','count','normalized'],data=data)
middle_lat = df['latitude'].median()
middle_lon = df['longitude'].median()
m = folium.Map(location=[middle_lat, middle_lon],tiles = "Stamen Terrain",zoom_start=11)
# convert to (n, 3) nd-array format for heatmap: latitude, longitude, weight
points = df[['latitude', 'longitude', 'normalized']].dropna().values
# plot heatmap
plugins.HeatMap(points, radius=15).add_to(m)
m.save(outfile='map.html')
Here the result
In this map, each point has the same radius. Instead, I want to create a heatmap in which each point's radius is proportional to the radius of the city it belongs to. I already tried passing the radii as a list, and also passing the values in a for loop, but neither worked.
Any idea?
You need to add one point after another. So you can specify the radius for each point. Like this:
import random
import numpy
pointArrays = numpy.split(points, len(points))
radii = [5, 10, 15, 20, 25]
for point, radius in zip(pointArrays, radii):
    plugins.HeatMap(point, radius=radius).add_to(m)
m.save(outfile='map.html')
Here you can see, each point has a different size.
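Rather than hard-coding the radii, they could be derived from the radius column of the dataframe. A minimal sketch, assuming a linear rescale to a pixel range of 5 to 25 (both limits are arbitrary choices, and scale_radii is a hypothetical helper):

```python
import pandas as pd


def scale_radii(values, min_px=5, max_px=25):
    """Linearly rescale values to integer pixel radii between min_px and max_px."""
    lo, hi = values.min(), values.max()
    if hi == lo:
        # All values equal: fall back to the smallest radius for every point.
        return [min_px] * len(values)
    scaled = min_px + (values - lo) / (hi - lo) * (max_px - min_px)
    return scaled.round().astype(int).tolist()


# The radius column from the question's sample data
radius = pd.Series([20.243001, 0.835251, 1.132390, 0.839181, 1.383981])
radii = scale_radii(radius)
print(radii)
```

The resulting list can take the place of the fixed radii in the loop above. Because one city is far larger than the rest, a linear scale leaves the small ones almost identical; a logarithmic rescale may separate them better.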