I am working on a shapefile in Python using geopandas and GDAL.
I want to create a mesh grid of points at a regular 1000 m interval inside the polygon shapefile. I have reprojected the file so that the units are meters, but I could not find a direct way to do this.
Can anyone offer guidance?
Here is the code I have tried so far:
from osgeo import gdal, ogr
import numpy as np
import matplotlib.pyplot as plt
import os
import sys
import pandas as pd
import geopandas as gpd
from shapely.geometry import Polygon
source_ds = ogr.Open(r"E:\review paper\sample tb data for recon\descend\tiffbt\alaska_bound.shp")
boundFile =gpd.read_file(r"E:\review paper\sample tb data for recon\descend\tiffbt\alaska_bound.shp")
bound_project = boundFile.to_crs({'init': 'EPSG:3572'})
print(bound_project.crs)
print(bound_project.total_bounds)
The coordinate system and bounding box coordinates are as below (output of above code):
+init=epsg:3572 +type=crs
[-2477342.73003557 -3852592.48050272 1305143.81797914 -2054961.64359753]
It's not clear if you are trying to create a grid of boxes or a grid of points. To change to points use:
# create a grid for geometry
gdf_grid = gpd.GeoDataFrame(
geometry=[
shapely.geometry.Point(x, y)
for x in np.arange(a, c, STEP)
for y in np.arange(b, d, STEP)
],
crs=crs,
).to_crs(gdf.crs)
I have used 50 km instead of 1000 m for demonstration purposes.
With Alaska, it is necessary to take the antimeridian into account for polygons. Without this you will have polygons that span in excess of 350 degrees when re-projected to EPSG:4326.
The approach is simple:
1. obtain the Alaska geometry shapefile
2. project to a CRS in meters (I have used UTM)
3. get total_bounds
4. construct a grid of geometry objects using the bounds from step 3
5. restrict the grid to the geometries that intersect with the Alaska geometry
You will observe that at such latitudes there is distortion between UTM and EPSG:4326, as expected (the nature of projections).
full code
import geopandas as gpd
import numpy as np
import shapely.geometry
gdf = gpd.read_file("https://www2.census.gov/geo/tiger/TIGER2018/ANRC/tl_2018_02_anrc.zip")
STEP = 50000
crs = gdf.estimate_utm_crs()
# crs = "EPSG:3338"
a, b, c, d = gdf.to_crs(crs).total_bounds
# create a grid for geometry
gdf_grid = gpd.GeoDataFrame(
geometry=[
shapely.geometry.box(minx, miny, maxx, maxy)
for minx, maxx in zip(np.arange(a, c, STEP), np.arange(a, c, STEP)[1:])
for miny, maxy in zip(np.arange(b, d, STEP), np.arange(b, d, STEP)[1:])
],
crs=crs,
).to_crs(gdf.crs)
# exclude geometries that cross antimeridian
gdf_grid = gdf_grid.loc[~gdf_grid["geometry"].bounds.pipe(lambda d: d["maxx"] - d["minx"]).ge(350)]
# restrict grid to only squares that intersect with geometry
gdf_grid = (
gdf_grid.sjoin(gdf.dissolve().loc[:,["geometry"]])
.pipe(lambda d: d.groupby(d.index).first())
.set_crs(gdf.crs)
.drop(columns=["index_right"])
)
m = gdf.explore(color="red", style_kwds={"fillOpacity":0})
gdf_grid.explore(m=m)
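For the point variant the question asks about, the same steps reduce to a little arithmetic. The following is a dependency-free sketch rather than the geopandas code above: `grid_points` and `point_in_polygon` are hypothetical helper names, the triangle is made-up data in metre units, and the ray-casting test stands in for the spatial join. Points that fall exactly on an edge are ambiguous for ray casting, so treat the edge behaviour as unspecified.

```python
import math


def grid_points(minx, miny, maxx, maxy, step):
    """Yield (x, y) points on a regular grid covering the bounding box."""
    nx = math.floor((maxx - minx) / step) + 1
    ny = math.floor((maxy - miny) / step) + 1
    for i in range(nx):
        for j in range(ny):
            yield (minx + i * step, miny + j * step)


def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices."""
    inside = False
    n = len(ring)
    for k in range(n):
        x1, y1 = ring[k]
        x2, y2 = ring[(k + 1) % n]
        # only edges that straddle the horizontal line through y can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# made-up polygon in a metre-based CRS
triangle = [(-500, -500), (4000, -500), (-500, 4000)]
xs = [x for x, _ in triangle]
ys = [y for _, y in triangle]

# equivalent of total_bounds, then a 1000 m grid, then the containment filter
candidates = list(grid_points(min(xs), min(ys), max(xs), max(ys), step=1000))
inside = [p for p in candidates if point_in_polygon(p[0], p[1], triangle)]
print(len(candidates), len(inside))
```

With geopandas the containment filter would normally be left to a spatial join, as in the full code above; the sketch only shows what that join is deciding for each candidate point.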
I am trying to plot the intersection between a buffer circle and the mesh blocks (or boundaries) within that circle of some radius (in this case, 80 km).
I got the intersection using sjoin() as follows:
intersection_MeshBlock = gpd.sjoin(buffer_df, rest_VIC, how='inner', predicate='intersects')
My buffer variable looks like this:
buffer_df
And the intersection looks like this:
intersection
The problem is I am not able to plot the intersection polygons.
Here is the plot I get after I plot it using the polygon plotting in folium:
for _, r in intersection_MeshBlock.iterrows():
# Without simplifying the representation of each borough,
# the map might not be displayed
sim_geo = gpd.GeoSeries(r['geometry']).simplify(tolerance=0.00001)
geo_j = sim_geo.to_json()
geo_j = folium.GeoJson(data=geo_j,
style_function=lambda x: {'fillColor': 'orange'} )
folium.Popup(r['SA1_CODE21']).add_to(geo_j)
geo_j.add_to(m)
m
Plot:
color filled maps
What am I doing wrong?
EDIT:
I might have solved the issue partially. Now I am able to plot the polygons inside some buffer radius. This is what my plot looks like:
If you look at the image, you will see that certain mesh blocks cross the circular boundary. How do I get rid of everything that is outside that circular region?
I have located some geometry for Melbourne to demonstrate.
Fundamentally, you want to use overlay(), not sjoin().
Generating the folium map is much simpler with explore(), available from GeoPandas 0.10.
import geopandas as gpd
import numpy as np
import shapely.geometry
import folium
rest_VIC = gpd.read_file(
"https://raw.githubusercontent.com/codeforgermany/click_that_hood/main/public/data/melbourne.geojson"
)
# select a point randomly from total bounds of geometry
buffer_df = gpd.GeoDataFrame(
geometry=[
shapely.geometry.Point(
np.random.uniform(*rest_VIC.total_bounds[[0, 2]], size=1)[0],
np.random.uniform(*rest_VIC.total_bounds[[1, 3]], size=1)[0],
)
],
crs=rest_VIC.crs,
)
buffer_df = gpd.GeoDataFrame(
geometry=buffer_df.to_crs(buffer_df.estimate_utm_crs())
.buffer(8 * 10**3)
.to_crs(buffer_df.crs)
)
# need overlay not sjoin
intersection_MeshBlock = gpd.overlay(buffer_df, rest_VIC, how="intersection")
m = rest_VIC.explore(name="base", style_kwds={"fill":False}, width=400, height=300)
m = buffer_df.explore(m=m, name="buffer", style_kwds={"fill":False})
m = intersection_MeshBlock.explore(m=m, name="intersection", style_kwds={"fillColor":"orange"})
folium.LayerControl().add_to(m)
m
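The difference between the two operations can be shown without any GIS library. This is a minimal sketch using axis-aligned boxes as stand-ins for the buffer and the mesh blocks; `intersects` and `intersection` here are hypothetical helper functions, not the geopandas API. A join only decides which rows match and leaves geometry unchanged, while an overlay computes new, clipped geometry.

```python
def intersects(a, b):
    """True if boxes a and b, given as (minx, miny, maxx, maxy), share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def intersection(a, b):
    """Return the overlapping box of a and b, or None if they do not overlap."""
    if not intersects(a, b):
        return None
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


region = (0, 0, 10, 10)  # stand-in for the buffer
blocks = [(-5, 2, 5, 8), (3, 3, 6, 6), (20, 20, 25, 25)]  # stand-ins for mesh blocks

# join-style result: matching blocks are kept whole, so one still sticks out of the region
joined = [b for b in blocks if intersects(b, region)]

# overlay-style result: each matching block is cut down to the part inside the region
clipped = [intersection(b, region) for b in blocks if intersects(b, region)]

print(joined)
print(clipped)
```

The first block extends to x = -5 in the join-style result but starts at x = 0 after clipping, which is the behaviour wanted for blocks that cross the circular boundary.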
ERROR:
TypeError: 'LineString' object is not iterable
I am trying to find the bottom-right corner of the polygon in this .shp file. This .shp file is a square, but other files may contain a rectangle or triangle.
I want to use geopandas, and apparently I can read the file with the read_file() method. I am quite new to shapefiles. I have the accompanying .shx and .dbf files; however, when I pass just the .shp to this method, I am not able to loop through the polygon coordinates.
Here is my code. I want to capture the bottom-right corner in a variable. Currently all_coords is meant to capture all of the coordinates, so I need a way to get past this error and then pick out the bottom-right corner:
import geopandas as gpd
import numpy as np
shapepath = r"FieldAlyticsCanada_WesternSales_AlexOlson_CouttsAgro2022_CouttsAgro_7-29-23.shp"
df = gpd.read_file(shapepath)
g = [i for i in df.geometry]
all_coords = []
for b in g[0].boundary: # error happens here
coords = np.dstack(b.coords.xy).tolist()
all_coords.append(*coords)
print(all_coords)
You can use much simpler concepts:
bounds provides the tuple (minx, miny, maxx, maxy)
shapely.ops.nearest_points() can be used to find the point on the geometry nearest to the (maxx, miny) corner
import geopandas as gpd
import numpy as np
import shapely.geometry
import shapely.ops
import pandas as pd
shapepath = (
r"FieldAlyticsCanada_WesternSales_AlexOlson_CouttsAgro2022_CouttsAgro_7-29-23.shp"
)
# for demonstration, override the path with a sample dataset bundled with geopandas
shapepath = gpd.datasets.get_path("naturalearth_lowres")
df = gpd.read_file(shapepath)
# find point closest to bottom right corner of geometry (maxx, miny)
df["nearest"] = df["geometry"].apply(
lambda g: shapely.ops.nearest_points(
g,
shapely.geometry.Point(
g.bounds[2], g.bounds[1]
),
)[0]
)
# visualize outcome...
gdfn = gpd.GeoDataFrame(df["name"], geometry=df["nearest"], crs=df.crs)
m = df.drop(columns=["nearest"]).explore()
gdfn.explore(m=m, color="red")
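If a vertex of the polygon is wanted rather than the nearest point on its outline, the same idea works on a plain list of coordinates. This is a small sketch under that assumption; `bottom_right_vertex` is a hypothetical helper and the triangle is made-up data. With shapely, the coordinate list would typically come from `list(polygon.exterior.coords)`; the TypeError in the question arises because the boundary of a polygon without holes is a single LineString, which cannot be looped over like a collection.

```python
import math


def bottom_right_vertex(coords):
    """Return the vertex closest to the (maxx, miny) corner of the bounding box."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    corner = (max(xs), min(ys))  # bottom right corner of the bounds
    return min(coords, key=lambda p: math.dist(p, corner))


triangle = [(0.0, 0.0), (4.0, 1.0), (2.0, 5.0)]
corner_vertex = bottom_right_vertex(triangle)
print(corner_vertex)
```

For a square or axis-aligned rectangle the result coincides with the bounding-box corner itself; for other shapes it is simply the closest vertex, which may or may not match what "bottom right" should mean for a given file.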
I want to do spatial binning (using the median as the aggregation function),
starting from a CSV file containing pollutant values measured at positions long and lat.
The resulting map should look something like this:
but for data covering a city's extent.
In this regard, I found this tutorial, which is close to what I want to do, but I was not able to get the desired result.
I think I'm missing something about how to correctly use dissolve and plot the resulting data (preferably using Folium).
Is there any useful example code?
You have not provided sample data, so I have used global earthquakes as the set of points and the geometry of California as the scope / extent.
It's simple to create a grid using shapely.geometry.box().
I have shown the use of median, plus another aggfunc to demonstrate that multiple metrics can be calculated.
I have used folium to plot. This feature is new in geopandas 0.10.0: https://geopandas.org/en/stable/docs/user_guide/interactive_mapping.html
import geopandas as gpd
import shapely.geometry
import numpy as np
# equivalent of CSV, all earthquake points globally
gdf_e = gpd.read_file(
"https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.geojson"
)
# get geometry of bounding area. Have selected a state rather than a city
gdf_CA = gpd.read_file(
"https://raw.githubusercontent.com/glynnbird/usstatesgeojson/master/california.geojson"
).loc[:, ["geometry"]]
BOXES = 50
a, b, c, d = gdf_CA.total_bounds
# create a grid for California, could be a city
gdf_grid = gpd.GeoDataFrame(
geometry=[
shapely.geometry.box(minx, miny, maxx, maxy)
for minx, maxx in zip(np.linspace(a, c, BOXES), np.linspace(a, c, BOXES)[1:])
for miny, maxy in zip(np.linspace(b, d, BOXES), np.linspace(b, d, BOXES)[1:])
],
crs="epsg:4326",
)
# remove grid boxes created outside actual geometry
gdf_grid = gdf_grid.sjoin(gdf_CA).drop(columns="index_right")
# get earthquakes that have occurred within one of the grid geometries
gdf_e_CA = gdf_e.loc[:, ["geometry", "mag"]].sjoin(gdf_grid)
# get median magnitude of earthquakes in grid
gdf_grid = gdf_grid.join(
gdf_e_CA.dissolve(by="index_right", aggfunc="median").drop(columns="geometry")
)
# how many earthquakes in the grid
gdf_grid = gdf_grid.join(
gdf_e_CA.dissolve(by="index_right", aggfunc=lambda d: len(d))
.drop(columns="geometry")
.rename(columns={"mag": "number"})
)
# drop grids geometries that have no measures and create folium map
m = gdf_grid.dropna().explore(column="mag")
# for good measure - boundary on map too
gdf_CA["geometry"].apply(lambda g: shapely.geometry.MultiLineString([p.exterior for p in g.geoms])).explore(m=m)
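What the spatial join followed by dissolve computes can be seen in a library-free form. This is a minimal sketch that assumes square cells and plain (x, y, value) tuples; `bin_median` is a hypothetical helper and the sample values are made up. Each point is assigned to a cell by integer division of its offset from the grid origin, and the median is taken per cell.

```python
import math
from collections import defaultdict
from statistics import median


def bin_median(points, minx, miny, cell_size):
    """Group (x, y, value) points into square cells and return the median per cell."""
    cells = defaultdict(list)
    for x, y, value in points:
        col = math.floor((x - minx) / cell_size)
        row = math.floor((y - miny) / cell_size)
        cells[(col, row)].append(value)
    # cells that received no points are simply absent from the result
    return {cell: median(values) for cell, values in cells.items()}


samples = [
    (0.2, 0.3, 1.0),
    (0.8, 0.1, 3.0),
    (0.5, 0.9, 8.0),
    (1.5, 0.5, 4.0),
]
result = bin_median(samples, minx=0.0, miny=0.0, cell_size=1.0)
print(result)
```

Cells without measurements never appear in the dictionary, which mirrors the dropna() step before plotting in the geopandas version.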
Thanks to @Rob Raymond, I finally solved it with the following code:
import pandas as pd
import geopandas as gpd
import pyproj
import matplotlib.pyplot as plt
import numpy as np
import shapely
from folium import plugins
df=pd.read_csv('../Desktop/test_esri.csv')
gdf_monica = gpd.GeoDataFrame(
df, geometry=gpd.points_from_xy(df.long, df.lat))
gdf_monica=gdf_monica.set_crs('epsg:4326')
gdf_area = gpd.read_file('https://raw.githubusercontent.com/openpolis/geojson-italy/master/geojson/limits_IT_municipalities.geojson')#.loc[:, ["geometry"]]
gdf_area =gdf_area[gdf_area['name']=='Portici'].loc[:,['geometry']]
BOXES = 50
a, b, c, d = gdf_area.total_bounds
gdf_grid = gpd.GeoDataFrame(
geometry=[
shapely.geometry.box(minx, miny, maxx, maxy)
for minx, maxx in zip(np.linspace(a, c, BOXES), np.linspace(a, c, BOXES)[1:])
for miny, maxy in zip(np.linspace(b, d, BOXES), np.linspace(b, d, BOXES)[1:])
],
crs="epsg:4326",
)
# remove grid boxes created outside actual geometry
gdf_grid = gdf_grid.sjoin(gdf_area).drop(columns="index_right")
gdf_monica_binned = gdf_monica.loc[:, ["geometry", "CO"]].sjoin(gdf_grid)
# get median value of the CO pollutant
gdf_grid = gdf_grid.join(
gdf_monica_binned.dissolve(by="index_right", aggfunc="median").drop(columns="geometry")
)
# how many CO measurements in each grid cell
gdf_grid = gdf_grid.join(
gdf_monica_binned.dissolve(by="index_right", aggfunc=lambda d: len(d))
.drop(columns="geometry")
.rename(columns={"CO": "number"})
)
# drop grids geometries that have no measures and create folium map
m = gdf_grid.dropna().explore(column="CO")
# for good measure - boundary on map too
gdf_area["geometry"].apply(lambda g: shapely.geometry.MultiLineString([p.exterior for p in g.geoms])).explore(m=m)
which produces:
As you can understand, I have little or no knowledge regarding spatial analysis. I was not able to get correct results without using geojson data that describe a geometry within which the points of interest fall.
If anyone could add more insights... thanks!
I want to convert a pandas DataFrame to a spatially enabled geopandas one:
df=pd.read_csv('../Desktop/test_esri.csv')
df.head()
Then converted using:
gdf = geopandas.GeoDataFrame(
df, geometry=geopandas.points_from_xy(df.long, df.lat))
from pyproj import crs
crs_epsg = crs.CRS.from_epsg(4326)
gdf=gdf.set_crs('epsg:4326')
Then I want to superimpose a spatial grid:
import numpy as np
import shapely
from pyproj import crs
# total area for the grid
xmin, ymin, xmax, ymax= gdf.total_bounds
# how many cells across and down
n_cells=30
cell_size = (xmax-xmin)/n_cells
# projection of the grid
# crs = "+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs"
# create the cells in a loop
grid_cells = []
for x0 in np.arange(xmin, xmax+cell_size, cell_size ):
for y0 in np.arange(ymin, ymax+cell_size, cell_size):
# bounds
x1 = x0-cell_size
y1 = y0+cell_size
grid_cells.append( shapely.geometry.box(x0, y0, x1, y1) )
cell = geopandas.GeoDataFrame(grid_cells, columns=['geometry'],
crs=crs.CRS('epsg:4326'))
Then merge the grid with geopandas dataframe:
merged = geopandas.sjoin(gdf, cell, how='left', predicate='within')
To finally compute the desired metric inside "dissolve":
# Compute stats per grid cell -- aggregate points to grid cells with dissolve
dissolve = merged.dissolve(by="index_right", aggfunc="median")
But I think I did something wrong with the "cell" grid and I can't figure it out!
An extract of the CSV file used can be found here.
I have plotted a heatmap with the following data.
I have thousands of rows; this is just a sample. I also wanted to see the Google Maps view of each coordinate, so I did something like this.
import folium
from folium.plugins import HeatMap
from folium.plugins import FastMarkerCluster
default_location=[11.1657, 45.4515]
m = folium.Map(location=default_location, zoom_start=13)
heat_data = [[row['lat'],row['lon']] for index, row in test.iterrows()]
# Plot it on the map
HeatMap(heat_data).add_to(m)
callback = ('function (row) {'
'var marker = L.marker(new L.LatLng(row[0], row[1]), {color: "red"});'
'var icon = L.AwesomeMarkers.icon({'
"icon: 'info-sign',"
"iconColor: 'white',"
"markerColor: 'green',"
"prefix: 'glyphicon',"
"extraClasses: 'fa-rotate-0'"
'});'
'marker.setIcon(icon);'
"var popup = L.popup({maxWidth: '300'});"
"const display_text = {text1: row[0], text2: row[1]};"
"var mytext = $(`<div id='mytext' class='display_text' style='width: 100.0%; height: 100.0%;'>\
<a href=https://www.google.com/maps?ll=${display_text.text1},${display_text.text2} target='_blank'>Open Google Maps</a></div>`)[0];"
"popup.setContent(mytext);"
"marker.bindPopup(popup);"
'return marker};')
m.add_child(FastMarkerCluster(heat_data, callback=callback))
# Display the map
m
Now for every gps coordinate I want to plot a small arrow or few small arrows in the angle of heading_direction and if possible show the distance_of_item in that angle from the gps coordinate. The expected outcome may be something like this.
In the above image, the location pointer is the gps coordinate, the direction and angle would be according to heading direction angle and there is a little star plotted which is the object. The object should be placed at a distance(in meters) mentioned in the dataset. I am not sure how to achieve that. Any lead or suggestions are most welcome. Thanks!
Given your sample data is an image, I have used alternate GPS data (UK hospitals) and added distance and direction columns as random values.
Given the requirement is to plot a marker at a location defined by distance and direction, the first step is to calculate the GPS coordinates of that location:
use a UTM CRS so that distance is meaningful
use high school maths to calculate x and y in the UTM CRS
convert the CRS back to WGS 84 to obtain GPS coordinates
You have tagged the question as plotly, so I have used mapbox line and scatter traces to demonstrate building a tiled map.
The sample data is 1200+ hospitals, and performance is decent.
A geopandas data frame could also be used to build folium tiles / markers. The key step is calculating the GPS coordinates.
import geopandas as gpd
import pandas as pd
import numpy as np
import shapely
import math
import plotly.express as px
import plotly.graph_objects as go
import io, requests
# get some public addresses - hospitals. data that has GPS lat / lon
dfhos = pd.read_csv(io.StringIO(requests.get("http://media.nhschoices.nhs.uk/data/foi/Hospital.csv").text),
sep="¬",engine="python",).loc[:, ["OrganisationName", "Latitude", "Longitude"]]
# debug with fewer records
# df = dfhos.loc[0:500]
df = dfhos
# to use CRS transformations use geopandas, initial data is WGS 84, transform to UTM geometry
# directions and distances are random
gdf = gpd.GeoDataFrame(
data=df.assign(
heading_direction=lambda d: np.random.randint(0, 360, len(d)),
distance_of_item=lambda d: np.random.randint(10 ** 3, 10 ** 4, len(d)),
),
geometry=df.loc[:, ["Longitude", "Latitude"]].apply(
lambda r: shapely.geometry.Point(r["Longitude"], r["Latitude"]), axis=1
),
crs="EPSG:4326",
).pipe(lambda d: d.to_crs(d.estimate_utm_crs()))
# standard high school geometry...
def new_point(point, d, alpha):
alpha = math.radians(alpha)
return shapely.geometry.Point(
point.x + (d * math.cos(alpha)),
point.y + (d * math.sin(alpha)),
)
# calculate points based on direction and distance in UTM CRS. Then convert back to WGS 84 CRS
gdf["geometry2"] = gpd.GeoSeries(
gdf.apply(
lambda r: new_point(
r["geometry"], r["distance_of_item"], r["heading_direction"]
),
axis=1,
),
crs=gdf.geometry.crs,
).to_crs("EPSG:4326")
gdf = gdf.to_crs("EPSG:4326")
# plot lines to show start point and direction. plot markers of destinations for text of distance, etc
fig = px.line_mapbox(
lon=np.stack(
[gdf.geometry.x.values, gdf.geometry2.x.values, np.full(len(gdf), np.nan)],
axis=1,
).reshape([1, len(gdf) * 3])[0],
lat=np.stack(
[gdf.geometry.y.values, gdf.geometry2.y.values, np.full(len(gdf), np.nan)],
axis=1,
).reshape([1, len(gdf) * 3])[0],
).add_traces(
px.scatter_mapbox(
gdf,
lat=gdf.geometry2.y,
lon=gdf.geometry2.x,
hover_data=["distance_of_item", "OrganisationName"],
).data
)
# c = gdf.loc[]
fig.update_layout(mapbox={"style": "open-street-map", "zoom": 8, 'center': {'lat': 52.2316838387109, 'lon': -1.4577750831062155}}, margin={"l":0,"r":0,"t":0,"b":0})
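One detail worth checking in new_point above: it measures alpha counter-clockwise from the x axis (east). If heading_direction is a compass heading, measured clockwise from north, then sine and cosine swap roles. That is an assumption, since the question does not say which convention the data uses. A minimal sketch of the compass variant follows; `offset_by_bearing` is a hypothetical name.

```python
import math


def offset_by_bearing(x, y, distance, bearing_deg):
    """Move `distance` metres from (x, y) along a compass bearing.

    The bearing is measured clockwise from north (0 = north, 90 = east),
    and x / y are easting / northing in a projected CRS in metres.
    """
    theta = math.radians(bearing_deg)
    return (x + distance * math.sin(theta), y + distance * math.cos(theta))


east = offset_by_bearing(0.0, 0.0, 100.0, 90.0)   # due east of the origin
north = offset_by_bearing(0.0, 0.0, 100.0, 0.0)   # due north of the origin
print(east, north)
```

The function could replace new_point in the apply() call, with the Point's x and y passed in and the returned tuple wrapped in shapely.geometry.Point.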
I am learning how to work with spatial data in python at the moment, and have run into a problem: I'm trying to overlay some gridded data onto a basemap instance, but the projections don't seem to match up. Could anyone please tell me what I've done wrong?
I have been assuming that lon/lat positions are absolute, and the projection should be handled by the basemap instance - is this not the case? Basically, is my error in the grid coordinate conversion stage, or the plotting stage?
Thanks very much in advance!
My example:
First, download the gridded data.
import numpy as np
import pyproj
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
# Import the gridded data I want to plot
# NB: the header describes the grid layout: LL corner is -200000 eastings, -200000 northings, in the British National Grid (BNG, EPSG:27700) system. The 'no data' value is -9999
grid = np.loadtxt('grid_data.txt', skiprows = 6, dtype = 'float')
grid = np.ma.masked_where(grid<0,grid)
# Take a look at the grid: it's a map of the UK!
plt.imshow(grid)
# Define a BNG coordinate grid for the data, based on information in the header:
# LLcorner eastings = -200000
# LLcorner northings = -200000
# grid box size = 5000
# All units in metres
lle = -200000 + 2500 # because we want to plot the box point at the centre of each box, not its lower left corner.
lln = -200000 + 2500
delta = 5000
nn,ne = grid.shape
ns = np.arange(0.,nn,1)*delta + lln
es = np.arange(0.,ne,1)*delta + lle
# convert to mesh of eastings/northings.
eastings,northings = np.meshgrid(es,ns)
# convert mesh to lon/lat (wgs84) using pyproj:
bng = pyproj.Proj(init='epsg:27700')
wgs84 = pyproj.Proj(init='epsg:4326')
lons,lats = pyproj.transform(bng,wgs84,eastings,northings)
lats = lats[::-1] # because otherwise the map is upside down (differences between imshow and pcolormesh)
# Get a basemap of the UK
m = Basemap(projection='merc',llcrnrlon=-15 ,llcrnrlat=47.8,urcrnrlon=7.2 ,urcrnrlat=62.5,resolution='h')
# project the lon/lat grids onto map coordinates
x,y = m(lons,lats)
# Draw the map!
m.drawcoastlines()
m.pcolormesh(x,y,grid,cmap=plt.cm.jet)
plt.show()
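One thing worth checking here, offered as an assumption rather than a confirmed diagnosis, is the row order of the coordinate grids. The BNG to lon/lat conversion is not separable: the longitude of a cell depends on its northing as well as its easting, so reversing only lats pairs each row of latitudes with longitudes computed for a different row. A sketch of the alternative is to build the northings so that row 0 is the northernmost row before any conversion, assuming the file stores the northernmost row first (which the upright imshow plot suggests). `cell_centres` is a hypothetical helper written without numpy.

```python
def cell_centres(ll_east, ll_north, delta, n_rows, n_cols):
    """Return (eastings, northings) of cell centres; row 0 is the northernmost row."""
    es = [ll_east + delta / 2 + c * delta for c in range(n_cols)]
    # count rows down from the top so the coordinates follow the data's row order
    ns = [ll_north + delta / 2 + (n_rows - 1 - r) * delta for r in range(n_rows)]
    eastings = [list(es) for _ in range(n_rows)]
    northings = [[n] * n_cols for n in ns]
    return eastings, northings


# header values from the question, with a tiny 3 x 2 grid for illustration
eastings, northings = cell_centres(-200000, -200000, 5000, n_rows=3, n_cols=2)
print(eastings[0], northings[0])
```

In the numpy code above, the equivalent would be to reverse ns before calling np.meshgrid and to drop the later lats[::-1] line, so that lons, lats and grid all share the same row order.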